# Jupyter Notebook for Further "Automating Our Syllabus Updating" Each Semester

### Introductory Stuff:

In [1]:
from canvasapi import Canvas
from datetime import datetime, timedelta
import json
import numpy as np
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_colwidth', None)
pd.set_option('display.width', 00)
from pandas.io.clipboards import to_clipboard
from tabulate import tabulate


API_URL = ""  # Placeholder: this is where your institution's Instructure address goes ...
API_KEY = ""  # Placeholder: your key for accessing the Canvas API ...
course_number = 0  # Placeholder: the unique id for your Canvas course ...

canvas = Canvas(API_URL, API_KEY)
course = canvas.get_course(course_number)

### Getting the Data We Need for Altering the Links to Each Weekly Assignment:

One of the things I need is the id number of every weekly discussion post assignment. Since I've got three sections and the assignments are specific to each section (i.e., all the Section D students respond only to the "Section D" discussion board each week), I need all of those id numbers so that we can link to them properly in the final "Reading Schedule" table. The ```canvasapi``` library has a method for this: ```get_discussion_topics()``` (documentation is [here](https://canvasapi.readthedocs.io/en/stable/course-ref.html?highlight=get_discussion_topics#canvasapi.course.Course.get_discussion_topics)).

In [2]:
discussion_topics = course.get_discussion_topics()
dt_list = []
for discussion in discussion_topics:
    print(discussion)
    dt_list.append(discussion)
print(len(dt_list))


Week 0 Discussion Board Post (Section E) (3771490)
Week 0 Discussion Board Post (Section F) (3771488)
Week 1 Discussion Board Post (Section E) (3771486)
Week 1 Discussion Board Post (Section F) (3771484)
Week 2 Discussion Board Post (Section E) (3771482)
Week 2 Discussion Board Post (Section F) (3771480)
Week 3 Discussion Board Post (Section E) (3771478)
Week 3 Discussion Board Post (Section F) (3771476)
Week 4 Discussion Board Post (Section E) (3771474)
Week 4 Discussion Board Post (Section F) (3771472)
Week 5 Discussion Board Post (Section E) (3771470)
Week 5 Discussion Board Post (Section F) (3771468)
Week 6 Discussion Board Post (Section E) (3771466)
Week 6 Discussion Board Post (Section F) (3771464)
Week 7 Discussion Board Post (Section E) (3771462)
Week 7 Discussion Board Post (Section F) (3771460)
Week 8 Discussion Board Post (Section E) (3771458)
Week 8 Discussion Board Post (Section F) (3771456)
Week 10 Discussion Board Post (Section E) (3771454)
Week 10 Discussion Board Post 

If you append each discussion to a list and print the list itself, you get all of the metadata for each post. I'm not taking a path here where I need all of that information, so the output from the previous cell is all that I need. (A screenshot of the full ```DiscussionTopic``` object follows below.)

![full_dt_object_printout](discussion_topic_object.png)
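Since each topic prints as `title (id)`, the same two pieces of information can also be read straight off the object. Here is a minimal sketch, assuming the `title` and `id` attributes that ```canvasapi``` fills in from the API response. The `FakeTopic` stand-in is hypothetical and only exists so the snippet runs without a Canvas connection:

```python
from collections import namedtuple

# Hypothetical stand-in for canvasapi's DiscussionTopic, so no API call is needed.
FakeTopic = namedtuple("FakeTopic", ["title", "id"])

topics = [
    FakeTopic("Week 0 Discussion Board Post (Section E)", 3771490),
    FakeTopic("Week 0 Discussion Board Post (Section F)", 3771488),
]

# Read the title and id directly instead of parsing the printed string.
topic_rows = [(topic.title, topic.id) for topic in topics]
print(topic_rows)
```

With the real objects, the same list comprehension should work on `dt_list`. I'm sticking with the string-parsing route below because it shows off the regular expressions.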

In [3]:
string_list = []
for item in dt_list:
    string_list.append(str(item))
print(string_list)

['Week 0 Discussion Board Post (Section E) (3771490)', 'Week 0 Discussion Board Post (Section F) (3771488)', 'Week 1 Discussion Board Post (Section E) (3771486)', 'Week 1 Discussion Board Post (Section F) (3771484)', 'Week 2 Discussion Board Post (Section E) (3771482)', 'Week 2 Discussion Board Post (Section F) (3771480)', 'Week 3 Discussion Board Post (Section E) (3771478)', 'Week 3 Discussion Board Post (Section F) (3771476)', 'Week 4 Discussion Board Post (Section E) (3771474)', 'Week 4 Discussion Board Post (Section F) (3771472)', 'Week 5 Discussion Board Post (Section E) (3771470)', 'Week 5 Discussion Board Post (Section F) (3771468)', 'Week 6 Discussion Board Post (Section E) (3771466)', 'Week 6 Discussion Board Post (Section F) (3771464)', 'Week 7 Discussion Board Post (Section E) (3771462)', 'Week 7 Discussion Board Post (Section F) (3771460)', 'Week 8 Discussion Board Post (Section E) (3771458)', 'Week 8 Discussion Board Post (Section F) (3771456)', 'Week 10 Discussion Board P

So, now I have all the assignment ids, which are the seven-digit numbers in parentheses at the end of every string. I also have the week number for each post, which I will need for the weekly schedule. So let's write a simple little function to pull out the week number for each post:

In [4]:
import re

def extract_week_number(data):
    data = str(data)
    week_number_pattern = re.compile(r"\D([0-9]{1,2}\b)")
    week_result = week_number_pattern.findall(data)
    return week_result

week_number_list = list(map(extract_week_number, string_list))
week_number_list

[['0'],
 ['0'],
 ['1'],
 ['1'],
 ['2'],
 ['2'],
 ['3'],
 ['3'],
 ['4'],
 ['4'],
 ['5'],
 ['5'],
 ['6'],
 ['6'],
 ['7'],
 ['7'],
 ['8'],
 ['8'],
 ['10'],
 ['10'],
 ['11'],
 ['11'],
 ['12'],
 ['12'],
 ['13'],
 ['13'],
 ['14'],
 ['14'],
 ['15'],
 ['15'],
 ['1'],
 ['0'],
 ['2'],
 ['3'],
 ['4'],
 ['5'],
 ['6'],
 ['7'],
 ['8'],
 ['11'],
 ['12'],
 ['13'],
 ['14'],
 ['15'],
 ['10']]
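It may not be obvious why that pattern picks up the week number but leaves the seven-digit id alone. Here is a small sketch of the same pattern run against one sample string (the variable names `sample` and `weeks` are just for illustration):

```python
import re

week_number_pattern = re.compile(r"\D([0-9]{1,2}\b)")

sample = "Week 10 Discussion Board Post (Section E) (3771454)"

# "10" matches: a non-digit (the space), one or two digits, then a word boundary.
# "3771454" does not: after one or two digits the next character is another
# digit, so there is no word boundary and that match attempt fails.
weeks = week_number_pattern.findall(sample)
print(weeks)
```

The word boundary does the filtering here, so the pattern relies on the week number being the only one- or two-digit number in the title.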

I'm also going to want to pull out the section letter (D, E, or F) for each individual discussion topic:

In [5]:
def extract_section_number(data):
    data = str(data)
    section_pattern = re.compile(r"\(\D*\)")
    section_result = section_pattern.findall(data)
    return section_result

section_list = list(map(extract_section_number, string_list))

Now I need the unique id for each individual discussion topic:

In [6]:
def extract_dt_id_pattern(data):
    data = str(data)
    dt_id_pattern = re.compile(r"(\d{7})")
    dt_id_result = dt_id_pattern.findall(data)
    return dt_id_result

dt_id_list = list(map(extract_dt_id_pattern, string_list))

Now I have all the information I need. I'm doing this the long way here, zipping together all three lists; one could skip the next cell and pass the three lists into the dataframe directly.

In [7]:
zipped = zip(week_number_list, section_list, dt_id_list)
zipped_list = list(zipped)
len(zipped_list)

45
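For anyone who would rather not zip three lists together, the week, section, and id can also be pulled out in one pass. This is only a sketch, assuming every title follows the `Week N ... (Section X) (id)` shape seen in the output above. The names `topic_pattern` and `parse_topic` are made up for this example:

```python
import re

# One pattern with named groups for all three pieces of information.
topic_pattern = re.compile(
    r"Week (?P<week>\d{1,2}) .*\((?P<section>Section [A-Z])\) \((?P<td_id>\d+)\)"
)

def parse_topic(text):
    """Return a dict with week, section, and td_id, or None if the text doesn't fit."""
    match = topic_pattern.search(text)
    if match is None:
        return None
    row = match.groupdict()
    row["week"] = int(row["week"])
    return row

sample_strings = [
    "Week 0 Discussion Board Post (Section E) (3771490)",
    "Week 10 Discussion Board Post (Section E) (3771454)",
]
rows = [parse_topic(s) for s in sample_strings]
print(rows)
```

A list of dicts like `rows` can be handed to `pd.DataFrame()` as-is, which would make the `explode()` calls and the parenthesis clean-up in the next cell unnecessary.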

In [8]:
#df = pd.DataFrame(string_list)
df = pd.DataFrame(zipped_list)
df.columns = ['week', 'section', 'td_id']
df = df.explode('week')
df = df.explode('section')
df = df.explode('td_id')
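# regex=False treats the parentheses as literal characters (and avoids the pandas warning)
df['section'] = df['section'].str.replace('(', '', regex=False)
df['section'] = df['section'].str.replace(')', '', regex=False)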
df['week'] = df['week'].astype('int')



Once again, just for illustrative purposes here, I'm going the "long way 'round," so to speak. 

In [9]:
grouped = df.groupby("week", as_index=False).agg({'section': ', '.join, 'td_id' : ', '.join})
#grouped = grouped.reset_index(drop=True)
grouped

|    | week | section                         | td_id                     |
|----|------|---------------------------------|---------------------------|
| 0  | 0    | Section E, Section F, Section D | 3771490, 3771488, 3771428 |
| 1  | 1    | Section E, Section F, Section D | 3771486, 3771484, 3771430 |
| 2  | 2    | Section E, Section F, Section D | 3771482, 3771480, 3771426 |
| 3  | 3    | Section E, Section F, Section D | 3771478, 3771476, 3771424 |
| 4  | 4    | Section E, Section F, Section D | 3771474, 3771472, 3771422 |
| 5  | 5    | Section E, Section F, Section D | 3771470, 3771468, 3771420 |
| 6  | 6    | Section E, Section F, Section D | 3771466, 3771464, 3771418 |
| 7  | 7    | Section E, Section F, Section D | 3771462, 3771460, 3771416 |
| 8  | 8    | Section E, Section F, Section D | 3771458, 3771456, 3771414 |
| 9  | 10   | Section E, Section F, Section D | 3771454, 3771452, 3771402 |


In [10]:
for index, row in grouped.iterrows():
    td_id_string_to_split = row['td_id']
    td_id_split_string = td_id_string_to_split.split(',')
    item_e = td_id_split_string[0]
    item_f = td_id_split_string[1]
    item_d = td_id_split_string[2]
    grouped.loc[index, "td_id_e"] = item_e
    grouped.loc[index, "td_id_f"] = item_f
    grouped.loc[index, "td_id_d"] = item_d

    section_string_to_split = row['section']
    section_split_string = section_string_to_split.split(',')
    item_se = section_split_string[0]
    item_sf = section_split_string[1]
    item_sd = section_split_string[2]
    grouped.loc[index, "section_e"] = item_se
    grouped.loc[index, "section_f"] = item_sf
    grouped.loc[index, "section_d"] = item_sd
   
grouped.drop(['section', 'td_id'], axis=1, inplace=True)
grouped.head()

|    | week | td_id_e | td_id_f | td_id_d | section_e | section_f | section_d |
|----|------|---------|---------|---------|-----------|-----------|-----------|
| 0  | 0    | 3771490 | 3771488 | 3771428 | Section E | Section F | Section D |
| 1  | 1    | 3771486 | 3771484 | 3771430 | Section E | Section F | Section D |
| 2  | 2    | 3771482 | 3771480 | 3771426 | Section E | Section F | Section D |
| 3  | 3    | 3771478 | 3771476 | 3771424 | Section E | Section F | Section D |
| 4  | 4    | 3771474 | 3771472 | 3771422 | Section E | Section F | Section D |
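The join-then-split loop above gets the job done, but the same reshaping can be written as a pivot. This is a sketch rather than a drop-in replacement. It assumes a long-format frame with one row per week and section, and `long_df` and `wide` are names invented for this example:

```python
import pandas as pd

# One row per (week, section) pair, as in the exploded dataframe.
long_df = pd.DataFrame({
    "week": [0, 0, 0, 1, 1, 1],
    "section": ["Section E", "Section F", "Section D"] * 2,
    "td_id": ["3771490", "3771488", "3771428", "3771486", "3771484", "3771430"],
})

# Spread the sections out into columns, keyed by week.
wide = long_df.pivot(index="week", columns="section", values="td_id")

# Rename "Section D" -> "td_id_d", and so on, using the last letter of the label.
wide.columns = ["td_id_" + name[-1].lower() for name in wide.columns]
wide = wide.reset_index()
print(wide)
```

Because the pivot matches ids to sections by label rather than by position, it does not depend on Section E always being listed first.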


In [11]:
grouped.drop(['section_d', 'section_e', 'section_f'], axis=1, inplace=True)
#grouped.set_index('week')
grouped['td_id_d'] = grouped['td_id_d'].str.strip()
grouped['td_id_f'] = grouped['td_id_f'].str.strip()

Now I want to take all this information, wrangled into the shape I want, and use it to build the cell in the final column of the reading table, with the correct links to each section's discussion board assignment for each week:

In [12]:
for index, row in grouped.iterrows():
    section_d_hyperlink = f"https://stfrancis.instructure.com/courses/1174984/discussion_topics/{row['td_id_d']}"
    section_e_hyperlink = f"https://stfrancis.instructure.com/courses/1174984/discussion_topics/{row['td_id_e']}"
    section_f_hyperlink = f"https://stfrancis.instructure.com/courses/1174984/discussion_topics/{row['td_id_f']}"
    grouped.loc[index, "Reading Assignment Linked"] = f"Week {row['week']} Discussion Board Post—[Section D]({section_d_hyperlink}), [Section E]({section_e_hyperlink}), [Section F]({section_f_hyperlink})"

grouped = grouped.set_index('week')
#print(grouped)

# Just a place to keep this data we've now gotten wrangled all together
grouped.to_json('one_more_test.json', orient='index')
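The three near-identical f-strings in the loop can be folded into a small helper. What follows is a sketch under a few assumptions: `BASE_URL` is copied from the cell above, while `build_linked_cell` and its `ids_by_section` argument are hypothetical names for this example only:

```python
# Course-specific address copied from the cell above; change it for your own course.
BASE_URL = "https://stfrancis.instructure.com/courses/1174984/discussion_topics"
SEPARATOR = "\u2014"  # the dash used between the label and the links

def build_linked_cell(week, ids_by_section):
    """Build the Markdown text for one week from a {section letter: topic id} dict."""
    links = ", ".join(
        f"[Section {letter}]({BASE_URL}/{ids_by_section[letter]})"
        for letter in sorted(ids_by_section)
    )
    return f"Week {week} Discussion Board Post{SEPARATOR}{links}"

cell_text = build_linked_cell(0, {"E": "3771490", "F": "3771488", "D": "3771428"})
print(cell_text)
```

Sorting the section letters keeps the links in D, E, F order no matter how the dictionary was built.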

Here we want to do a little "sanity check" to see that things are looking the way we want them to once we output everything into the Markdown table using [```pandas.DataFrame.to_markdown()```](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_markdown.html):

In [13]:
print(grouped.to_markdown(tablefmt="github"))

|   week |   td_id_e |   td_id_f |   td_id_d | Reading Assignment Linked                                                                                                                                                                                                                                                                                  |
|--------|-----------|-----------|-----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|      0 |   3771490 |   3771488 |   3771428 | Week 0 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771428), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771490), [Section F](https://stfrancis.instructure.com/co

Looking good!

### Code to Alter the Dates in the Reading Schedule:

In [14]:
week_list = [i for i in range(1, 18) for j in range(4)]

In [15]:
start_date = datetime(2023, 1, 9)  # Start date
end_date = datetime(2023, 5, 5)    # End date

# Create a list of dates between the start and end dates
dates = []
while start_date <= end_date:
    # If the day of the week is Monday, Wednesday, Friday, or Sunday, append it to the list
    if start_date.weekday() in [0, 2, 4, 6]:
        dates.append(start_date.strftime('%A, %B %d'))
    start_date += timedelta(days=1)
print(len(week_list))
print(len(dates))

68
67


In [16]:
# We're removing the last item in the week_list because that week doesn't have a Sunday meeting time
week_list.pop()

# Sanity Check Once More!
print(len(week_list))
print(len(dates))

67
67
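Instead of building the week list separately and popping the extra entry, the week number can be computed from each date. Here is a sketch, assuming weeks run Monday through Sunday starting from the first day of the semester. The names `first_day`, `last_day`, and `schedule_days` are made up here:

```python
from datetime import date, timedelta

first_day = date(2023, 1, 9)   # a Monday
last_day = date(2023, 5, 5)

meeting_weekdays = {0, 2, 4, 6}  # Monday, Wednesday, Friday, Sunday

schedule_days = []
current = first_day
while current <= last_day:
    if current.weekday() in meeting_weekdays:
        # Whole weeks elapsed since the first Monday, counted from 1.
        week_number = (current - first_day).days // 7 + 1
        schedule_days.append((week_number, current.strftime("%A, %B %d")))
    current += timedelta(days=1)

print(len(schedule_days))
print(schedule_days[-1][0])
```

This gives one (week, date) pair per meeting day, so the two lists can never drift out of step by one.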


Here I'm reading in the ```.csv``` file that contains all of the reading for each day of the week:

In [17]:
table_df = pd.read_csv('course_schedule.csv', encoding='utf-8')
reading_assignments = table_df["Reading Assignment"].values.tolist()
assignments = table_df["Assignment"].values.tolist()

In [18]:
# Opening JSON file
with open('one_more_test.json') as json_file:
    data = json.load(json_file)

data["0"]

{'td_id_e': '3771490',
 'td_id_f': '3771488',
 'td_id_d': '3771428',
 'Reading Assignment Linked': 'Week 0 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771428), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771490), [Section F](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771488)'}
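Notice that the lookup key is the string `"0"` and not the integer `0`. With `orient='index'`, the dataframe's index values become JSON object keys, and JSON object keys are always strings. The standard library shows the same behavior in a small sketch (the `by_week` dictionary is a cut-down, hypothetical version of the real data):

```python
import json

# Integer keys going in ...
by_week = {0: {"td_id_e": "3771490"}, 1: {"td_id_e": "3771486"}}

# ... come back as strings after a round trip through JSON.
round_tripped = json.loads(json.dumps(by_week))
print(list(round_tripped))
```

That is why the week number gets converted with `str()` before it is used as a key later on.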

In [19]:
schedule_data = {
    'Week': week_list,
    'Day/Date': dates,
    'Reading Assignment': reading_assignments,
    'Assignment': assignments
}

df_1 = pd.DataFrame(schedule_data)
df_1 = df_1.replace(np.nan, '', regex=True)
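extracted_column = grouped["Reading Assignment Linked"]
# Note: join() aligns on the row index here, not on the 'Week' column,
# so row 0 gets the Week 0 link, row 1 gets the Week 1 link, and so on.
df_2 = df_1.join(extracted_column)
df_2.head()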

|    | Week | Day/Date | Reading Assignment | Assignment | Reading Assignment Linked |
|----|------|----------|--------------------|------------|---------------------------|
| 0 | 1 | Monday, January 09 | Read Michael Moorcock's "Foreweird" (in _WC_) and the "Introduction," by the VanderMeers (_WC_). | | Week 0 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771428), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771490), [Section F](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771488) |
| 1 | 1 | Wednesday, January 11 | Choose one of the following readings that will give us some theoretical terms and terminology to use going forward: 1.) "Section I: Introduction" of H. P. Lovecraft's essay, "Supernatural Horror in Literature" available [here](https://www.hplovecraft.com/writings/texts/essays/shil.aspx) and his "Notes on Writing Weird Fiction" [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481303); 2.) Sigmund Freud's essay, "The Uncanny" available [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481313) and Chapter 2, "Psychoanalytic Criticism," in Lois Tyson's _Critical Theory Today: A User-Friendly Guide_ available [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481281); 3.) Mark Fisher's _The Weird and the Eerie_ [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481319). | Week 0 Discussion Board Post (Section D, Section E, Section F) | Week 1 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771430), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771486), [Section F](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771484) |
| 2 | 1 | Friday, January 13 | Choose another one of the texts listed above for Wednesday and have a quick read-through. | | Week 2 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771426), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771482), [Section F](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771480) |
| 3 | 1 | Sunday, January 15 | | Week 1 Discussion Board Post (Section D, Section E, Section F) | Week 3 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771424), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771478), [Section F](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771476) |
| 4 | 2 | Monday, January 16 | Read Kubin, Blackwood, Saki, James, & Dunsany (_WC_, pp. 1-70)—for those interested in a graphic novel version of Saki's story, see Laura Neato's seven-page version [here](https://www.deviantart.com/lauraneato/art/Sredni-Vashtar-COVER-52751877). | | Week 4 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771422), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771474), [Section F](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771472) |
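In the output above, the first row belongs to Week 1 but carries the Week 0 link. That happens because `join()` lines things up by row index rather than by week number. Here is a tiny sketch of the difference, using made-up placeholder links; `links`, `schedule`, `by_index`, and `by_week` are hypothetical names:

```python
import pandas as pd

# Placeholder links keyed by week number (the real ones are the Markdown strings).
links = pd.Series(
    {0: "link-0", 1: "link-1", 2: "link-2"},
    name="Reading Assignment Linked",
)

# Five schedule rows; the row index runs 0..4 regardless of the week.
schedule = pd.DataFrame({"Week": [1, 0, 1, 1, 2]})

# join() matches the row index against the week-number index.
by_index = schedule.join(links)

# map() looks each row's Week value up in the series instead.
by_week = schedule["Week"].map(links)

print(by_index)
print(by_week.tolist())
```

The notebook takes a different route below, looking the links up in the JSON data by week number, which sidesteps the same alignment problem.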


In [20]:
# Ugh, I hardcoded this as I do have a "Week 0" assignment, but not a "0" in the "Week" column ... 

df_1.iloc[1, 0] = 0
df_1.head()

|    | Week | Day/Date | Reading Assignment | Assignment |
|----|------|----------|--------------------|------------|
| 0 | 1 | Monday, January 09 | Read Michael Moorcock's "Foreweird" (in _WC_) and the "Introduction," by the VanderMeers (_WC_). | |
| 1 | 0 | Wednesday, January 11 | Choose one of the following readings that will give us some theoretical terms and terminology to use going forward: 1.) "Section I: Introduction" of H. P. Lovecraft's essay, "Supernatural Horror in Literature" available [here](https://www.hplovecraft.com/writings/texts/essays/shil.aspx) and his "Notes on Writing Weird Fiction" [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481303); 2.) Sigmund Freud's essay, "The Uncanny" available [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481313) and Chapter 2, "Psychoanalytic Criticism," in Lois Tyson's _Critical Theory Today: A User-Friendly Guide_ available [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481281); 3.) Mark Fisher's _The Weird and the Eerie_ [here](https://learn.stfrancis.edu/api/v1/courses/1174984/files/62481319). | Week 0 Discussion Board Post (Section D, Section E, Section F) |
| 2 | 1 | Friday, January 13 | Choose another one of the texts listed above for Wednesday and have a quick read-through. | |
| 3 | 1 | Sunday, January 15 | | Week 1 Discussion Board Post (Section D, Section E, Section F) |
| 4 | 2 | Monday, January 16 | Read Kubin, Blackwood, Saki, James, & Dunsany (_WC_, pp. 1-70)—for those interested in a graphic novel version of Saki's story, see Laura Neato's seven-page version [here](https://www.deviantart.com/lauraneato/art/Sredni-Vashtar-COVER-52751877). | |


In [21]:
data["0"]

{'td_id_e': '3771490',
 'td_id_f': '3771488',
 'td_id_d': '3771428',
 'Reading Assignment Linked': 'Week 0 Discussion Board Post—[Section D](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771428), [Section E](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771490), [Section F](https://stfrancis.instructure.com/courses/1174984/discussion_topics/3771488)'}

Now we want to substitute the linked versions of the discussion topics for the plain-text assignment names. The ```course_schedule.csv``` file doesn't contain links to any of the assignments, since those are what we want to change each semester ...

In [22]:
for index, row in df_1.iterrows():
    assignment_string = row['Assignment']
    week_number = row['Week']
    # Skip rows with no assignment; weeks 9 and 17 have no entry in the JSON data
    if assignment_string == "" or week_number in (9, 17):
        continue
    week_number_string = str(week_number)
    # Loose check: the week number only has to appear somewhere in the assignment text
    if week_number_string in assignment_string:
        df_1.loc[index, "Assignment"] = data[week_number_string]["Reading Assignment Linked"]
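One thing to keep in mind with the `in` check: it is a plain substring test, so `"1"` is also found inside `"Week 10"`. If that ever becomes a problem, a stricter test is easy to sketch. The helper name `mentions_week` is hypothetical, and it assumes the assignment text always spells the week as `Week N`:

```python
import re

def mentions_week(assignment_text, week_number):
    """True only if the text contains 'Week N' with N as a whole number."""
    pattern = rf"\bWeek {week_number}\b"
    return re.search(pattern, assignment_text) is not None

# The plain substring test finds "1" inside "Week 10" ...
print("1" in "Week 10 Discussion Board Post")

# ... while the stricter check does not.
print(mentions_week("Week 10 Discussion Board Post", 1))
print(mentions_week("Week 1 Discussion Board Post (Section D, Section E, Section F)", 1))
```

The trailing `\b` is what separates Week 1 from Week 10, since there is no word boundary between the `1` and the `0`.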

Okay, one more sanity check here to see how we're doing!

In [23]:
print(df_1.to_markdown(index=False, tablefmt="github"))

|   Week | Day/Date               | Reading Assignment                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | Assignment                                                                                                                 

Of course, if you're not in the mood to select all of the output in the previous cell and then copy and paste it into another markdown file, pandas has a nice little [```to_clipboard()```](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_clipboard.html) function that copies the output to the clipboard. All my thanks for this little shortcut go to [Joel McCune](https://joelmccune.com/pandas-dataframe-to-markdown/)!

In [24]:
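# tablefmt belongs to to_markdown(), not to_clipboard(); it had no effect when passed
# to to_clipboard() along with a string, so it is dropped here and the default format is kept
to_clipboard(df_1.to_markdown(index=False), excel=False)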