# Courses Demo
This Jupyter notebook is for exploring the data set courses20-21.json
which consists of all Brandeis courses in the 20-21 academic year (Fall20, Spr21, Sum21) 
which had at least 1 student enrolled.

First we need to read the JSON file into a list of Python dictionaries.

In [None]:
import json
from schedule import *
from course_search import *

In [None]:
with open("courses20-21.json","r",encoding='utf-8') as jsonfile:
    courses = json.load(jsonfile)

## Structure of a course
Next we look at the fields of each course dictionary and their values.

In [None]:
print('there are',len(courses),'courses in the dataset')
print('here is the data for course 1246')
courses[1246]

## Cleaning the data
If we want to sort courses by instructor or by code, or collect those values in sets, it helps to replace the lists with tuples, which are immutable and therefore hashable.

In [None]:
for course in courses:
    course['instructor'] = tuple(course['instructor'])
    course['coinstructors'] = tuple([tuple(f) for f in course['coinstructors']])
    course['code'] = tuple(course['code'])

In [None]:
print('notice that the instructor and code are tuples now')
courses[1246]
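
The conversion matters mostly for hashing: lists sort fine, but they cannot be set elements or dictionary keys. The sketch below shows this on a made-up record; the field names mirror the ones used in this notebook, and the values are invented for illustration.

```python
# Hypothetical record: field names follow the notebook, values are made up.
sample = {'instructor': ['Ada', 'Lovelace', 'ada@example.edu'], 'code': ['COSI', '10A']}

try:
    {sample['instructor']}            # a list cannot be a set element
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

as_tuple = tuple(sample['instructor'])
unique = {as_tuple, tuple(sample['instructor'])}   # equal tuples collapse to one entry

print(list_is_hashable)   # False
print(len(unique))        # 1
```

This is why the set comprehensions later in the notebook, such as the one that collects instructors, need tuples rather than lists.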

# Which terms are represented?

In [None]:
terms = {c['term'] for c in courses}
print(terms)

# What are all the subjects?

In [None]:
subjects = {c['subject'] for c in courses}
print("There are " + str(len(subjects)) + " subjects. They are:")
print(subjects)

# 5.a: How many instructors taught at Brandeis last year?

In [None]:
# I'm not sure which terms are considered to be last year, so this includes all terms
instructors = {c['instructor'] for c in courses}
print("There are " + str(len(instructors)) + " instructors.")

In [None]:
instructors = {c['instructor'] for c in courses if c['enrolled']>=10}
print("There are " + str(len(instructors)) + " instructors that taught a class with at least 10 students.")

# What are the 5 largest course sections?

In [None]:
largest_courses = sorted(courses, key = lambda course: course['enrolled'], reverse=True)
[(course['enrolled'], course['name']) for course in largest_courses[:5]]

# 5.b: What was the total number of students taking COSI courses last year?

In [None]:
cosi_students=[course['enrolled'] for course in courses if course['subject']=="COSI"]
print("There were "+str(sum(cosi_students))+" students who took COSI courses.")

# 5.c: What was the median size of a COSI course last year (counting only those courses with at least 10 students)?

In [None]:
import statistics
# Enrollment counts for COSI courses with at least 10 students
cosi_enrolled = sorted([course['enrolled'] for course in courses if course['subject']=="COSI" and course['enrolled']>=10])
# statistics.median returns the middle value, or the mean of the two middle values for an even count
median = statistics.median(cosi_enrolled)
print("The median size of a COSI course last year was " + str(median) + ".")
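
For reference, the median rule used here (the middle value for an odd count, the mean of the two middle values for an even count) can be checked on small made-up lists. The function name `median_by_hand` and the sample sizes below are illustrative only, not part of the data set.

```python
import statistics

def median_by_hand(values):
    """Median of a non-empty list of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]                       # odd count: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two middle values

odd_sizes = [12, 30, 18]        # made-up enrollment counts
even_sizes = [12, 30, 18, 40]

print(median_by_hand(odd_sizes), statistics.median(odd_sizes))    # 18 18
print(median_by_hand(even_sizes), statistics.median(even_sizes))  # 24.0 24.0
```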

# 5.d: Create a list of tuples (E,S) where S is a subject and E is the number of students enrolled in courses in that subject, sort it and print the top 10. This shows the top 10 subjects in terms of number of students taught.


In [None]:
def num_enrollments(subject):
    enroll = [course['enrolled'] for course in courses if course['subject']==subject]
    return sum(enroll)

subject_enrollments = sorted([(num_enrollments(subject),subject) for subject in subjects],reverse=True)[:10]
print("Top 10 subjects in terms of number of students taught:\n",[sub[1] for sub in subject_enrollments],"\n---------\n","More info (# of students,subject):\n", subject_enrollments)
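
The cell above rescans the whole course list once per subject. As an alternative sketch, the totals can be accumulated in a single pass with `collections.Counter` and then turned into the same `(E,S)` tuples. The sample records and the helper name `enrollment_by_subject` are made up for illustration.

```python
from collections import Counter

# Made-up records with only the two fields this computation needs.
sample_courses = [
    {'subject': 'COSI', 'enrolled': 40},
    {'subject': 'MATH', 'enrolled': 25},
    {'subject': 'COSI', 'enrolled': 15},
    {'subject': 'BIOL', 'enrolled': 60},
]

def enrollment_by_subject(course_list):
    """Total enrollment per subject, computed in one pass."""
    totals = Counter()
    for course in course_list:
        totals[course['subject']] += course['enrolled']
    return totals

totals = enrollment_by_subject(sample_courses)
# Build (E, S) tuples so that sorting orders by enrollment first.
top = sorted(((count, subject) for subject, count in totals.items()), reverse=True)[:2]
print(top)  # [(60, 'BIOL'), (55, 'COSI')]
```

On a data set of this size either approach is fast enough; the single pass mainly pays off when the number of subjects or courses grows.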

# 5.e: Do the same as in (d) but print the top 10 subjects in terms of number of courses offered

In [None]:
def courses_offered(subject):
    return len([c for c in courses if c['subject']==subject])
subject_courses = sorted([(courses_offered(subject),subject) for subject in subjects],reverse=True)[:10]
print("Top 10 subjects in terms of number of courses offered:\n",[sub[1] for sub in subject_courses],"\n---------\n","More info (# of courses,subject):\n", subject_courses)


# 5.f: Do the same as (d) but print the top 10 subjects in terms of number of faculty teaching courses in that subject

In [None]:
def num_faculty(subject):
    return len({course['instructor'] for course in courses if course['subject']==subject})
faculty_subject = sorted([(num_faculty(subject),subject) for subject in subjects],reverse=True)[:10]
print("Top 10 subjects in terms of number of faculty teaching courses:\n",[sub[1] for sub in faculty_subject],"\n---------\n","More info (# of faculty,subject):\n", faculty_subject)
# print(sorted(faculty_subject,reverse=True)[:10])
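
Counting distinct faculty per subject can also be done in one pass by grouping instructors into sets. This is a sketch on made-up records, assuming instructors are already tuples (as after the cleaning step); `faculty_by_subject` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up records; instructors are tuples so they can be stored in sets.
sample_courses = [
    {'subject': 'COSI', 'instructor': ('Ada', 'Lovelace')},
    {'subject': 'COSI', 'instructor': ('Ada', 'Lovelace')},
    {'subject': 'COSI', 'instructor': ('Alan', 'Turing')},
    {'subject': 'MATH', 'instructor': ('Emmy', 'Noether')},
]

def faculty_by_subject(course_list):
    """Map each subject to the set of distinct instructors teaching it."""
    groups = defaultdict(set)
    for course in course_list:
        groups[course['subject']].add(course['instructor'])
    return groups

groups = faculty_by_subject(sample_courses)
faculty_counts = sorted(((len(people), subject) for subject, people in groups.items()), reverse=True)
print(faculty_counts)  # [(2, 'COSI'), (1, 'MATH')]
```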

# 5.g: List the top 20 faculty in terms of number of students they taught

In [None]:
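def students_taught(instructor):
    # Sum over a generator, not a set: a set would drop sections that share the same enrollment count
    return sum(course['enrolled'] for course in courses if course['instructor']==instructor)
# Rebuild the full instructor set, since `instructors` was narrowed to classes of 10+ students above
all_instructors = {course['instructor'] for course in courses}
top_facu=sorted([(students_taught(instructor),instructor) for instructor in all_instructors],reverse=True)[:20]
print("Top 20 faculty in terms of number of students they taught:\n",[sub[1] for sub in top_facu],"\n---------\n","More info (# of enrollment,faculty):\n",top_facu)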

# 5.h: List the top 20 courses in terms of number of students taking that course across semesters and sections

In [None]:
tops = {}
for c in courses:
    # Accumulate enrollment per course name across terms and sections
    tops[c['name']] = tops.get(c['name'], 0) + c['enrolled']
sorted_tops = sorted(tops.items(), key=lambda x: x[1], reverse=True)
for count, (name, total) in enumerate(sorted_tops[:20], start=1):
    print(str(count) + ". " + name + " - " + str(total))

# 5.i Creative Questions

Angelo's question - What is the average number of students in COSI courses that have at least 1 student?

In [None]:
cosi_enrollments = [c['enrolled'] for c in courses if c['subject'] == "COSI" and c['enrolled'] > 0]
# Use true division; floor division (//) would round the average down to a whole number
avg = sum(cosi_enrollments) / len(cosi_enrollments)
print("The average number of students in a COSI course that has students is {:.2f}!".format(avg))


Su Lei's question - What percentage of available courses required the instructor's signature to enroll?

In [None]:
sign_required=[course for course in courses if "Instructor's Signature Required".lower() in course['details'].lower()]
# Multiply by 100 to turn the fraction into a percentage
percentage = 100 * len(sign_required)/len(courses)
print("{:.2f}% of the courses required the instructor's signature.".format(percentage))
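
A ratio of two counts is a fraction between 0 and 1, so it has to be scaled by 100 before it is labeled as a percentage. The sketch below shows two equivalent ways to format it, using a made-up list of flags in place of the real course data.

```python
# Made-up flags: True means the course required a signature.
flags = [True, False, False, True, False, False, False, False]

fraction = sum(flags) / len(flags)                      # 2 / 8 = 0.25
as_percent_manual = "{:.2f}%".format(fraction * 100)    # scale by 100 explicitly
as_percent_spec = "{:.2%}".format(fraction)             # the % format type scales for you

print(as_percent_manual)  # 25.00%
print(as_percent_spec)    # 25.00%
```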

Josh Liu's question - Which classes have descriptions whose words appear in alphabetical order (by first letter)?

In [None]:
alphabetical_classes = []
for course in courses:
    isAlphabetical = True
    letter = "a"
    word_array = course['description'].split()
    for word in word_array:
        word = word.lower()
        if word[0] < letter:
            isAlphabetical = False
        else:
            letter = word[0]
    if isAlphabetical and len(word_array) > 1:
        alphabetical_classes.append(course)
print("There are " + str(len(alphabetical_classes)) + " classes with alphabetically ordered descriptions!")
print("... many of them independent studies with a very short description. Oh well, it technically fits.")
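
The check in the loop above compares only the first character of each word. As a sketch, the same rule can be packaged as a small function and tried on made-up descriptions; `first_letters_in_order` is a hypothetical name, and the sample strings are not from the data set.

```python
def first_letters_in_order(text):
    """True if the text has 2+ words whose first characters never decrease.

    Mirrors the loop above: comparison is case-insensitive and the first
    word must start with a character that is not below 'a'.
    """
    words = text.lower().split()
    if len(words) < 2:
        return False
    initials = [word[0] for word in words]
    return initials[0] >= 'a' and initials == sorted(initials)

print(first_letters_in_order("Advanced biology course"))  # True
print(first_letters_in_order("Topics in art"))            # False
print(first_letters_in_order("Seminar"))                  # False
```

Because the rule looks at first letters only, a two-word description such as "Independent study" passes, which matches the observation that many hits are independent studies.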

# 6.a,b,c: Showing the title, description and custom filter

In [None]:
s = Schedule(courses)
#print(s.courses[:3])
titles = s.title("computer")
descriptions = s.description("human")
ind_studies = s.independent_study_filter(True)
print(titles.courses[0]['name'])
print("-----")
print(descriptions.courses[0]['description'])
print("-----")
print(ind_studies.courses[0]['name'] + " - " + str(ind_studies.courses[0]['independent_study']))
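
The `schedule` module itself is not shown in this notebook. To illustrate how filters that each return an object with a `.courses` list could be built and chained, here is a minimal sketch. `MiniSchedule` is a hypothetical stand-in, and the case-insensitive substring matching is an assumption, not a description of the real `Schedule` class.

```python
class MiniSchedule:
    """Hypothetical stand-in for the notebook's Schedule class (not the real one)."""

    def __init__(self, courses):
        self.courses = list(courses)

    def _where(self, predicate):
        # Each filter returns a new object, so calls can be chained.
        return MiniSchedule(c for c in self.courses if predicate(c))

    def title(self, phrase):
        # Assumption: case-insensitive substring match on the course name.
        return self._where(lambda c: phrase.lower() in c['name'].lower())

    def description(self, phrase):
        # Assumption: case-insensitive substring match on the description.
        return self._where(lambda c: phrase.lower() in c['description'].lower())

    def independent_study_filter(self, flag):
        return self._where(lambda c: c['independent_study'] == flag)


# Made-up records using the field names seen in this notebook.
sample_courses = [
    {'name': 'Computer Vision', 'description': 'Human and machine perception.', 'independent_study': False},
    {'name': 'Independent Study', 'description': 'Supervised reading.', 'independent_study': True},
    {'name': 'Intro to Computers', 'description': 'Basics of hardware.', 'independent_study': False},
]

s = MiniSchedule(sample_courses)
chained = s.title("computer").description("human")
print([c['name'] for c in chained.courses])  # ['Computer Vision']
```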