# Grade boxplots
To compare results across sections quickly, this notebook interactively gathers all the grades for the class from a Canvas CSV export, splits them by section, reports missing grades (for example, when a grader isn't finished yet), and draws a boxplot for each section.

To preserve privacy, the grades are stored in a folder on my computer that is (a) separate from this git repository and (b) regularly erased with `rm -Prf`, so that student data does not persist on my computer.

For the same reason, this notebook is saved without the boxplots embedded, since they would reveal some student data. Even though that data is aggregated, sharing it seems inappropriate.

However, the `get_grades` function is generally useful for reading CSVs where we want to compare groups (here, the sections named in column `sectionCol=4`) using the data in column `myCol`, which can be set interactively. The other parameters are specific to my setting and could easily be adapted to other gradebook formats.
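As a rough illustration of the general idea (group one CSV column by the labels in another, counting cells that are not yet numeric), here is a minimal sketch. It is not the notebook's `get_grades`: the name `group_column`, its parameters, and the sample data are all made up for the example, and it reads from a string rather than a file so it can be run anywhere.

```python
import csv
import io
from collections import defaultdict


def group_column(csv_text, group_col, value_col, skip_rows=(1, 2)):
    """Group the numeric values in value_col by the label in group_col.

    Hypothetical helper for illustration only. Returns (values, missing, header):
    values maps each label to a list of floats, missing counts the cells per
    label that could not be parsed as numbers, and header is the name of
    value_col taken from row 0.
    """
    values = defaultdict(list)
    missing = defaultdict(int)
    header = ""
    for row_idx, row in enumerate(csv.reader(io.StringIO(csv_text))):
        if row_idx == 0:
            header = row[value_col]
            continue
        # Skip metadata rows and rows too short to hold both columns.
        if row_idx in skip_rows or len(row) <= max(group_col, value_col):
            continue
        try:
            values[row[group_col]].append(float(row[value_col]))
        except ValueError:
            # Blank or non-numeric cell: count it as a grade not entered yet.
            missing[row[group_col]] += 1
    return dict(values), dict(missing), header


# Made-up data shaped like a gradebook export: header, two metadata rows, then students.
sample = """Student,Section,HW 1
meta,,
Points Possible,,10
A,S1,9
B,S1,
C,S2,7.5
D,S2,8
"""
values, missing, header = group_column(sample, group_col=1, value_col=2)
print(header, values, missing)
```

Unlike `get_grades`, this sketch discovers the groups from the data instead of taking a fixed list, which is convenient for exploration but gives up the check that only the expected sections appear.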

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import csv

In [None]:
def get_grades(myCol=-1, groups=('SDS302(55875)', 'SDS302(55880)', 'SDS302(55885)'), sectionCol=4,
               hwStartCol=5, nHomework=10, skipRows=(1, 2), sectionHeader="Section"):
    '''
    Return the grades in a dictionary of group to numpy array, along with the assignment name from the header row (row 0).
    
    Params:
        myCol: column for the grades, -1 for interactive mode
        groups: list of groups to compare (helps to refine the list of possible groups)
        sectionCol: column in the CSV with the group names
        hwStartCol: specific to my class, the homework is the first column of assignments (after student data) and
            all clustered together in order, so interactive mode for homework is straightforward
        nHomework: how many homework assignments, again for interactive mode to skip ahead to other assignments
        skipRows: list of row indices to not include (in my setting, a metadata row and a points possible row)
        sectionHeader: a checksum to ensure the header of the group column (in column sectionCol) matches what is expected,
            in case the CSV is generated in an unexpected order
    '''
    with open('/Users/Evan/Downloads/temp/grades.csv', 'r') as csvfile:
        reader = csv.reader(csvfile)
        grades = {group: [] for group in groups}
        missing = {group: 0 for group in groups}

        name = ""
        for (rowIdx, row) in enumerate(reader):
            if rowIdx in skipRows:
                continue
            if rowIdx == 0 and row[sectionCol] != sectionHeader:
                print("Expected header %r in column %d but found %r" % (sectionHeader, sectionCol, row[sectionCol]))
                break
            if rowIdx == 0 and myCol >= 0:
                name = row[myCol]
                continue
            if rowIdx == 0 and myCol < 0:
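                # input() replaces Python 2's raw_input(), matching the print() calls used elsewhere
                choice = input("homework? ")
                if choice == "y":
                    hw = input("number? ")
                    myCol = hwStartCol + int(hw) - 1
                    name = row[myCol]
                    continue
                for (idx, assignment) in enumerate(row[hwStartCol + nHomework:]):
                    print(idx, assignment)
                choice = input("one of these? ")
                try:
                    myCol = hwStartCol + nHomework + int(choice)
                    name = row[myCol]
                    continue
                except (ValueError, IndexError):
                    # Non-numeric or out-of-range choice: reset so the check below exits cleanly
                    myCol = -1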
                if myCol == -1:
                    print("no assignment chosen")
                    break
            for section in grades:
                if row[sectionCol] == section:
                    try:
                        grades[section].append(float(row[myCol]))
                    except (ValueError, IndexError):  # blank, non-numeric, or absent cell
                        missing[section] += 1
        del reader
        for section in missing:
            if missing[section] > 0:
                print("Missing %d assignment(s) from %s" % (missing[section], section))
            grades[section] = np.array(grades[section])
    return (grades, name)

## Interactive

In [None]:
grades, assignment = get_grades()
labels = [section for section in grades]
labels.sort()
boxgrades = [grades[section] for section in labels]
plt.boxplot(boxgrades, labels=labels)
m = max(np.max(g) for g in boxgrades)  # highest grade across all sections, however many there are
plt.ylim((-1, m * 1.1))
plt.title(assignment)
plt.show()

## Specific index

In [None]:
grades, assignment = get_grades(5)
labels = [section for section in grades]
labels.sort()
boxgrades = [grades[section] for section in labels]
plt.boxplot(boxgrades, labels=labels)
m = max(np.max(g) for g in boxgrades)  # highest grade across all sections, however many there are
plt.ylim((-1, m * 1.1))
plt.title(assignment)
plt.show()
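
The plotting cells above scale the y-axis from the highest grade. `np.max` raises a `ValueError` on an empty array, which would happen if a section has no grades entered yet. A minimal sketch of a helper that tolerates empty sections follows; `y_upper`, `pad`, and `fallback` are hypothetical names, not part of the notebook.

```python
def y_upper(grades_by_section, pad=1.1, fallback=1.0):
    """Return the top of the y-axis: pad times the highest grade in any section.

    Hypothetical helper for illustration only. Sections with no grades are
    ignored; if every section is empty, fallback is returned instead.
    """
    highest = max(
        (max(scores) for scores in grades_by_section.values() if len(scores) > 0),
        default=None,
    )
    if highest is None:
        return fallback
    return highest * pad
```

With this in place, the limit could be set as `plt.ylim((-1, y_upper(grades)))`, which works for plain lists as well as the NumPy arrays returned by `get_grades`.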