# Grading curve

Per [SIPA policy](https://bulletin.columbia.edu/sipa/academic-policies/grading-system-academic-progress/):

> Grades submitted for SIPA core courses must have an average GPA between 3.2 and 3.4, with the goal being 3.3. Courses with enrollments over 35 are also recommended to follow this rule.

In [1]:
MIN_AVG_GPA = 3.2
MAX_AVG_GPA = 3.4

This notebook shows how the grade cutoffs are computed. **This methodology (and thus your estimated course grade) is subject to change.**

## Getting your estimated course grade

1. [Open {{lms_name}}.]({{lms_url}})
1. Go to `Grades`.
1. Get your `Total` percentage.
1. Jump to the [new cutoffs](#new-cutoffs).
1. Find the letter grade with the highest `min_score` that is at or below your `Total` percentage.
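
The lookup in the last step can be sketched in plain Python. This is only an illustration: `CUTOFFS` and `estimate_letter_grade` are hypothetical names, and the numbers are a snapshot copied from the [new cutoffs](#new-cutoffs) table (with A+ left at 100.0, since the adjustment never moves it), so they will be stale once the notebook is re-run.

```python
import bisect

# Snapshot of (min_score, letter) pairs, sorted ascending by min_score.
# Assumption: copied by hand from the "New cutoffs" table; not authoritative.
CUTOFFS = [
    (0.0, "F"),
    (60.4, "D"),
    (70.4, "C-"),
    (74.121805, "C"),
    (77.843609, "C+"),
    (81.678195, "B-"),
    (85.4, "B"),
    (89.121805, "B+"),
    (92.956391, "A-"),
    (96.678195, "A"),
    (100.0, "A+"),
]


def estimate_letter_grade(total_percent: float) -> str:
    """Return the letter with the highest min_score that does not exceed the total."""
    min_scores = [min_score for min_score, _ in CUTOFFS]
    # bisect_right counts how many cutoffs are less than or equal to the total
    position = bisect.bisect_right(min_scores, total_percent)
    if position == 0:
        raise ValueError("total_percent must be at least 0")
    return CUTOFFS[position - 1][1]


print(estimate_letter_grade(88.0))  # below the B+ cutoff, so this prints B
print(estimate_letter_grade(85.4))  # a total exactly on a cutoff earns that grade: B
```

A total that lands exactly on a `min_score` counts toward that grade, which matches the default behavior of the backward `merge_asof()` used later in this notebook.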

## Load current scores

The scores below are the total scores across both sections, based on every grade released so far. The timestamp in the filename below shows when the data was last updated. The grade data is anonymized for [privacy reasons](https://www.registrar.columbia.edu/content/privacy-rights-ferpa).

In [2]:
import pandas as pd

path = "/Users/afeld/Downloads/2023-09-28T0111_Grades-INAFU6504_ALL_2023_3_-_Python_for_Public_Policy.csv"
grades = pd.read_csv(path, skiprows=[1, 2])

# exclude the test student built into CourseWorks
grades = grades[grades["Student"] != "Student, Test"]

# obfuscate whose score is whose
grades = grades[["Current Score"]]
grades = grades.sort_values("Current Score").reset_index(drop=True)

grades

|  | Current Score |
| --- | --- |
| 0 | 2.22 |
| 1 | 6.67 |
| 2 | 50.16 |
| 3 | 53.33 |
| 4 | 53.57 |
| ... | ... |
| 62 | 100.00 |
| 63 | 100.00 |
| 64 | 100.00 |
| 65 | 100.00 |


### Distribution

In [3]:
import plotly.io as pio

# hack to remove Plotly MIME type that JupyterBook complains about
pio.renderers.default = "notebook_connected+pdf"

In [4]:
import plotly.express as px

fig = px.histogram(
    grades,
    x="Current Score",
    title="Distribution of the overall grades as a percentage, computed by CourseWorks",
    labels={"Current Score": "Current Score (percent)"},
)
fig.update_layout(yaxis_title_text="Number of students")
fig.show()

## Match to letter grades / GPAs

Creating the [grading notation table](https://bulletin.columbia.edu/sipa/academic-policies/grading-system-academic-progress/) in Pandas:

In [5]:
letter_grade_equivalents = pd.DataFrame(
    index=["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "F"],
    data={"gpa": [4.33, 4.00, 3.67, 3.33, 3.00, 2.67, 2.33, 2.00, 1.67, 1.00, 0.00]},
)
letter_grade_equivalents

|  | gpa |
| --- | --- |
| A+ | 4.33 |
| A | 4.0 |
| A- | 3.67 |
| B+ | 3.33 |
| B | 3.0 |
| B- | 2.67 |
| C+ | 2.33 |
| C | 2.0 |
| C- | 1.67 |
| D | 1.0 |
| F | 0.0 |


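Assign starting minimum scores by linearly mapping the GPA range from C- (1.67) to A+ (4.33) onto the score range from 70 to 100. D and F are set by hand: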

In [6]:
# based on https://stackoverflow.com/a/48109733/358804

desired_lower = 70.0  # typical C-
desired_upper = 100.0
actual_lower = letter_grade_equivalents.at["C-", "gpa"]
actual_upper = letter_grade_equivalents.at["A+", "gpa"]

desired_diff = desired_upper - desired_lower
actual_diff = actual_upper - actual_lower

letter_grade_equivalents["min_score"] = (
    desired_lower + (letter_grade_equivalents["gpa"] - actual_lower) * desired_diff / actual_diff
)
# manually set the lower ones
letter_grade_equivalents.at["D", "min_score"] = 60.0
letter_grade_equivalents.at["F", "min_score"] = 0.0

letter_grade_equivalents

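|  | gpa | min_score |
| --- | --- | --- |
| A+ | 4.33 | 100.0 |
| A | 4.0 | 96.278195 |
| A- | 3.67 | 92.556391 |
| B+ | 3.33 | 88.721805 |
| B | 3.0 | 85.0 |
| B- | 2.67 | 81.278195 |
| C+ | 2.33 | 77.443609 |
| C | 2.0 | 73.721805 |
| C- | 1.67 | 70.0 |
| D | 1.0 | 60.0 |
| F | 0.0 | 0.0 |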
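
The starting scores above come from a linear rescale of grade points onto percentages. As a rough, standalone sketch of that arithmetic (the function name `gpa_to_min_score` and its parameter names are made up for illustration and do not appear in the notebook):

```python
import math


def gpa_to_min_score(
    gpa: float,
    gpa_low: float = 1.67,  # C- grade points
    gpa_high: float = 4.33,  # A+ grade points
    score_low: float = 70.0,  # starting minimum score for C-
    score_high: float = 100.0,  # starting minimum score for A+
) -> float:
    """Linearly rescale a GPA from [gpa_low, gpa_high] to [score_low, score_high]."""
    fraction = (gpa - gpa_low) / (gpa_high - gpa_low)
    return score_low + fraction * (score_high - score_low)


# B (3.00) is halfway between C- and A+, so it lands halfway between 70 and 100.
assert math.isclose(gpa_to_min_score(3.00), 85.0)
# The endpoints map to the ends of the score range.
assert math.isclose(gpa_to_min_score(1.67), 70.0)
assert math.isclose(gpa_to_min_score(4.33), 100.0)
```

D and F fall outside the rescaled range, which is why the notebook assigns their minimum scores (60.0 and 0.0) manually.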


## Adjust cutoffs

Shift the minimum scores for each grade (not including A+ and F) up or down in steps of 0.1 until the average GPA is in the acceptable range. Raising the cutoffs lowers the average, and lowering them raises it.

In [7]:
# merge_asof() needs both key columns sorted ascending
grade_cutoffs = letter_grade_equivalents.sort_values(by="min_score")

grades_to_adjust = ~grade_cutoffs.index.isin(["A+", "F"])

adjustment = 0
STEP_SIZE = 0.1

while True:
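    # select by label: the boolean mask follows grade_cutoffs' row order, not letter_grade_equivalents'
    grade_cutoffs.loc[grades_to_adjust, "min_score"] = (
        letter_grade_equivalents.loc[grade_cutoffs.index[grades_to_adjust], "min_score"]
        + adjustment
    )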

    # make the letter grades a column so they show up in the merged DataFrame
    grade_cutoffs_with_letters = grade_cutoffs.reset_index().rename(
        columns={"index": "letter_grade"}
    )

    # find the letter grade / GPA for each student
    adjusted_grades = pd.merge_asof(
        grades,
        grade_cutoffs_with_letters,
        left_on="Current Score",
        right_on="min_score",
        direction="backward",
    )

    new_mean = adjusted_grades["gpa"].mean()
    print(f"Adjustment: {adjustment:+.1f}, Average: {new_mean:.3f}")

    # check if we've hit the target range
    if MIN_AVG_GPA <= new_mean <= MAX_AVG_GPA:
        # success
        break
    elif new_mean > MAX_AVG_GPA:
        # average is too high: raise the cutoffs
        adjustment += STEP_SIZE
    else:  # new_mean < MIN_AVG_GPA
        # average is too low: lower the cutoffs
        adjustment -= STEP_SIZE

Adjustment: +0.0, Average: 3.417
Adjustment: +0.1, Average: 3.417
Adjustment: +0.2, Average: 3.417
Adjustment: +0.3, Average: 3.407
Adjustment: +0.4, Average: 3.363
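
One thing to be aware of with a search like this: if a single step ever moved the average from above the range to below it (or the reverse), the loop would bounce back and forth forever. Below is a minimal sketch of the same idea with a guard against that, written in plain Python rather than pandas. The scores, the simplified five-letter scale, and the names `mean_gpa` and `find_adjustment` are all hypothetical.

```python
import bisect

GPA_BY_LETTER = {"F": 0.0, "D": 1.0, "C": 2.0, "B": 3.0, "A": 4.0}
# Hypothetical starting cutoffs, sorted ascending by minimum score
BASE_CUTOFFS = [(0.0, "F"), (60.0, "D"), (70.0, "C"), (80.0, "B"), (90.0, "A")]
FIXED = {"F"}  # letters whose cutoff is never shifted


def mean_gpa(scores, cutoffs):
    """Average the GPA of the letter each score falls into."""
    min_scores = [min_score for min_score, _ in cutoffs]
    total = 0.0
    for score in scores:
        # last cutoff that is less than or equal to the score
        letter = cutoffs[bisect.bisect_right(min_scores, score) - 1][1]
        total += GPA_BY_LETTER[letter]
    return total / len(scores)


def find_adjustment(scores, low, high, step=0.1, max_iterations=1000):
    """Shift the non-fixed cutoffs in `step` increments until the mean is in [low, high]."""
    steps = 0  # integer step count avoids accumulating floating-point error
    last_direction = 0
    for _ in range(max_iterations):
        adjustment = steps * step
        shifted = [
            (min_score if letter in FIXED else min_score + adjustment, letter)
            for min_score, letter in BASE_CUTOFFS
        ]
        mean = mean_gpa(scores, shifted)
        if low <= mean <= high:
            return adjustment, mean
        direction = 1 if mean > high else -1
        if direction == -last_direction:
            # the previous step overshot the target range
            raise RuntimeError("step size jumps over the target range")
        last_direction = direction
        steps += direction
    raise RuntimeError("no adjustment found within max_iterations")


adjustment, mean = find_adjustment([95, 93, 91, 90.5, 88, 85], low=3.2, high=3.4)
print(f"Adjustment: {adjustment:+.1f}, Average: {mean:.3f}")
```

With these made-up scores the average starts above the range, so the cutoffs are raised until enough scores drop from A to B. The sketch does not handle cutoffs being lowered past the fixed F cutoff, which would break the sort order.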


Confirm the A cutoff hasn't gone beyond the A+ cutoff:

In [8]:
assert grade_cutoffs.at["A", "min_score"] < grade_cutoffs.at["A+", "min_score"]

### New cutoffs

In [9]:
grade_cutoffs

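|  | gpa | min_score |
| --- | --- | --- |
| F | 0.0 | 0.0 |
| D | 1.0 | 60.4 |
| C- | 1.67 | 70.4 |
| C | 2.0 | 74.121805 |
| C+ | 2.33 | 77.843609 |
| B- | 2.67 | 81.678195 |
| B | 3.0 | 85.4 |
| B+ | 3.33 | 89.121805 |
| A- | 3.67 | 92.956391 |
| A | 4.0 | 96.678195 |
| A+ | 4.33 | 100.0 |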


## Check results

Double-check the new average is in line with policy:

In [10]:
assert MIN_AVG_GPA <= new_mean <= MAX_AVG_GPA, f"{new_mean} not in acceptable range"

new_mean

3.3628358208955222

In [11]:
fig = px.histogram(adjusted_grades, x="letter_grade", title="Distribution of grades")
fig.update_layout(yaxis_title_text="Number of students")
fig.show()