# Course Schedule

The course will meet on Monday and Wednesday from 3:30 to 5:20 PM EST. 

Here is the current week-by-week schedule 📅. We may adjust it as we go along.

To get started, let's create the course schedule using the pandas module in Python. **Click "show" to see the underlying code!**

In [1]:
# import modules
import pandas as pd
import re
import numpy as np


# tell python to display output and print multiple objects
from IPython.display import display, HTML
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# create range b/t start and end date of course
start_date = pd.to_datetime("2022-09-12")
end_date = pd.to_datetime("2022-11-15")
st_alldates = pd.date_range(start_date, end_date)

# subset to days in that range equal to Monday or Wednesday
st_mw = st_alldates[st_alldates.day_name().isin(['Monday', 'Wednesday'])]

# create DataFrame with that information
st_dates = [re.sub("2022\\-", "", str(day.date())) for day in st_mw] 
course_sched = pd.DataFrame({'dow': st_mw.day_name(),
                             'date': st_dates})
course_sched['day_date'] = course_sched.dow.astype(str) + " " + \
            course_sched.date.astype(str) 

# display the resulting date sequence
display(course_sched.day_date)

0        Monday 09-12
1     Wednesday 09-14
2        Monday 09-19
3     Wednesday 09-21
4        Monday 09-26
5     Wednesday 09-28
6        Monday 10-03
7     Wednesday 10-05
8        Monday 10-10
9     Wednesday 10-12
10       Monday 10-17
11    Wednesday 10-19
12       Monday 10-24
13    Wednesday 10-26
14       Monday 10-31
15    Wednesday 11-02
16       Monday 11-07
17    Wednesday 11-09
18       Monday 11-14
Name: day_date, dtype: object
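
As a side note, the same labels can be built without a regular expression by formatting each date directly. The block below is a minimal sketch of that alternative, not the code used for the schedule above; the names `mw` and `labels` are illustrative only.

```python
import pandas as pd

# all calendar days between the first and last day of the course
all_dates = pd.date_range("2022-09-12", "2022-11-15")

# keep only Mondays and Wednesdays
mw = all_dates[all_dates.day_name().isin(["Monday", "Wednesday"])]

# build "Weekday MM-DD" labels; strftime drops the year for us
labels = [day.day_name() + " " + day.strftime("%m-%d") for day in mw]

print(labels[:3])
```

Formatting with `strftime` avoids hard-coding the year in a pattern, so the same lines should keep working if the course dates move to another year.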

The next few blocks of code create the actual schedule content by joining the above list of dates with course concepts.

In [2]:
# create basic schedule content

# list of concepts
concepts = ["Course intro. & software setup",
            "Pandas: aggregation, joins, lambda",
            "Pandas wrap-up & user-defined functions",
            "Workflow basics: command line & Github",
            "Catchup and LaTeX",
            "Introduction to merging",
            "Regular expressions (Regex)",
            "Probabilistic matching",
            "Text as data: part one",
            "Text as data: part two",
            "APIs: part one (NAEP & Yelp)",
            "APIs: part two (Twitter)",
            "Supervised machine learning: part 1",
            "Supervised machine learning: part 2",
            "Web-scraping: part one",
            "Web-scraping: part two (Scrapy)",
            "SQL",
            "Final project work session",
            "Final presentations"]

# check that the number of concepts matches the number of class sessions
assert len(course_sched.day_date) == len(concepts)

# combine dates with concepts
course_sched_concepts = pd.DataFrame({'Date': course_sched.day_date,
                                     'Concepts': concepts})

df = course_sched_concepts.copy()

print(df)

               Date                                 Concepts
0      Monday 09-12           Course intro. & software setup
1   Wednesday 09-14       Pandas: aggregation, joins, lambda
2      Monday 09-19  Pandas wrap-up & user-defined functions
3   Wednesday 09-21   Workflow basics: command line & Github
4      Monday 09-26                        Catchup and LaTeX
5   Wednesday 09-28                  Introduction to merging
6      Monday 10-03              Regular expressions (Regex)
7   Wednesday 10-05                   Probabilistic matching
8      Monday 10-10                   Text as data: part one
9   Wednesday 10-12                   Text as data: part two
10     Monday 10-17             APIs: part one (NAEP & Yelp)
11  Wednesday 10-19                 APIs: part two (Twitter)
12     Monday 10-24      Supervised machine learning: part 1
13  Wednesday 10-26      Supervised machine learning: part 2
14     Monday 10-31                   Web-scraping: part one
15  Wednesday 11-02     

In [3]:
# add DataCamp modules to schedule, matching to concepts conditionally
match_col = "Concepts" # concepts column to match on

tomatch = [df[match_col] == "Pandas wrap-up & user-defined functions",
           df[match_col] == "Workflow basics: command line & Github",
           df[match_col] == "Introduction to merging",
           df[match_col] == "Probabilistic matching",
           df[match_col] == "Supervised machine learning: part 1"]

# define DataCamp modules
modules = ["Data Manipulation with pandas (course)",
           "Manipulating files and directories (ch. in Intro to Shell)",
           "Joining Data with pandas (chs. 1-2)",
           "Regular Expressions for Pattern Matching (chapter)",
           "Supervised Learning with scikit-learn (course)"]

'''
**Optional DataCamp courses & chapters for further learning:**

- Intermediate python: loops
- Python data science toolbox (Part 1)
- Object-Oriented Programming in Python: OOP Fundamentals
- Regular expressions in Python (first three chapters)
- Introduction to natural language processing in Python
- Introduction to databases in Python
- Intermediate importing data in python
- Intermediate SQL queries
- Web scraping in python
- Introduction to data visualization with MatPlotLib
- Introduction to data visualization with ggplot2
'''

df["DataCamp module(s) (if any)"] = np.select(tomatch, 
                                              modules, 
                                              default = "")


In [4]:
# add assignments to schedule, matching to dates/concepts conditionally
date_col = "Date" # date column to match on

due_dates = [df[date_col] == "Wednesday 09-21",
             df[date_col] == "Wednesday 10-05",
             df[date_col] == "Wednesday 10-19",
             df[date_col] == "Wednesday 11-02",
             df[date_col] == "Monday 11-14"]

# define assignments
assignments = ["Problem set one (due Sunday 09-25)",
               "Problem set two (due Sunday 10-09)",
               "Final project milestone 1 (due Friday 10-21);<br>Problem set three (due Sunday 10-23)",
               "Problem set four (due Friday 11-04);<br>Final project milestone 2 (due Sunday 11-06)",
               "Problem set five (due Friday 11-18);<br>Final project presentation (paper due Tuesday 11-22)"]

df["Due (11:59 PM EST unless otherwise specified)"] = np.select(due_dates,
                                                                assignments,
                                                                default = "")
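
If `np.select` is new to you, here is a small sketch of how it behaves on a toy table. It takes a list of boolean conditions and a list of choices of the same length; each row gets the choice for the first condition that is true, and `default` otherwise. The DataFrame `toy` and the labels "Module A" and "Module B" are made up for illustration and are not part of the course schedule.

```python
import numpy as np
import pandas as pd

# toy table with one column to match on (hypothetical data)
toy = pd.DataFrame({"Concepts": ["Course intro",
                                 "Probabilistic matching",
                                 "SQL"]})

# one boolean Series per rule, in the same order as the choices
conditions = [toy["Concepts"] == "Probabilistic matching",
              toy["Concepts"] == "SQL"]
choices = ["Module A", "Module B"]

# rows matching no condition get the default (an empty string here)
toy["Module"] = np.select(conditions, choices, default="")

print(toy)
```

Because unmatched rows fall back to `default`, a typo in one of the matched strings fails silently: the row simply ends up blank. It is worth eyeballing the final table for that reason.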

In [5]:
HTML(df.to_html(index=False, escape = False))

| Date | Concepts | DataCamp module(s) (if any) | Due (11:59 PM EST unless otherwise specified) |
|---|---|---|---|
| Monday 09-12 | Course intro. & software setup | | |
| Wednesday 09-14 | Pandas: aggregation, joins, lambda | | |
| Monday 09-19 | Pandas wrap-up & user-defined functions | Data Manipulation with pandas (course) | |
| Wednesday 09-21 | Workflow basics: command line & Github | Manipulating files and directories (ch. in Intro to Shell) | Problem set one (due Sunday 09-25) |
| Monday 09-26 | Catchup and LaTeX | | |
| Wednesday 09-28 | Introduction to merging | Joining Data with pandas (chs. 1-2) | |
| Monday 10-03 | Regular expressions (Regex) | | |
| Wednesday 10-05 | Probabilistic matching | Regular Expressions for Pattern Matching (chapter) | Problem set two (due Sunday 10-09) |
| Monday 10-10 | Text as data: part one | | |
| Wednesday 10-12 | Text as data: part two | | |
