In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("project.ipynb")

# Project 1 – The Other Side of Gradescope 💯

## DSC 80, Fall 2022

### Checkpoint Due Date: Thursday, October 6th (Questions 1-4)
### Due Date: Thursday, October 13th

## Instructions

Welcome to Project 1! Be sure to read the instructions below carefully to understand how Projects differ from Labs.

### Working on the Project

This Jupyter Notebook contains the statements of the problems and provides code and Markdown cells to display your answers to the problems.


* Like the lab, your coding work will be developed in the accompanying `project.py` file, which will be imported into the current notebook. This code will be autograded.
* Note that there is no manually-graded component to Project 1, so the only thing you will ever submit is `project.py`.
* **For the checkpoint, you only need to turn in a `project.py` containing solutions for Questions 1-4!**
    - The "Project 1 Checkpoint" autograder on Gradescope does not thoroughly check your code -- it only runs the doctests/public tests on Questions 1-4 to make sure that you have completed them. When you submit the final version of the project, we will use hidden tests to check your answers more thoroughly.
    - Note that the checkpoint is required!
* Note that this means you will ultimately have to submit the project twice – once to the "Project 1 Checkpoint" autograder (Questions 1-4 only), and once to the "Project 1" autograder (once you're fully done).


**Do not change the function names in the `project.py` file!**
- The functions in the `project.py` file are how your assignment is graded, and they are graded by their name.
- If you changed something you weren't supposed to, just use git to revert! Ask us if you need help with this, or google around for `git revert`.

**Tips for developing in the `project.py` file**:
- Do not change the function names in the starter code; grading is done using these function names.
- Do not change the docstrings in the functions. These are there to tell you if your work is on the right track!
- You are **encouraged to write your own additional functions** to solve the questions! 
- Always document your code!

### Working with a Partner

You are allowed to work with a partner on projects in DSC 80. If you do work with a partner, you must follow the [Pair Programming Guidelines](https://dsc10.com/pair-programming/) (the link is for DSC 10, but we'll use the same guidelines in DSC 80). Specifically, you must be actively working on the project at the same time on one computer. Splitting up the project and working on it separately **is not** pair programming.

You can use [this sheet](https://docs.google.com/spreadsheets/d/1gXkpCtwQv1AXA6keWo1nWeFOfsrK8_8DzyN1bXPIrao/edit?usp=sharing) to find a partner.

Note that if you do work with a partner, you and your partner must submit the Checkpoint together and the whole project together.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import pandas as pd
import numpy as np
import os

In [3]:
from project import *

## About the Assignment

The file `data/grades.csv` contains the gradebook from a fictional data science course with 535 students.

***Note: this dataset is synthetically generated; it does not contain real student grades. The course syllabus below is similar to, but not quite the same as, the course syllabus for this class!***

In this project, you will:
1. Clean and process the data to compute total course grades according to a fictional syllabus (below).
2. Qualitatively understand how students did in the course.
3. Understand how student grades vary with small changes in performance on each assignment.

---

### Navigating the Project

Click on the links below to navigate to different parts of the project. Note that Questions 1, 2, 3, and 4 constitute your Checkpoint submission.

- [Part 1: Enumerating the Assignments  🔢](#part1)
    - [Question 1 (Checkpoint Question)](#Question-1-(Checkpoint-Question))
- [Part 2: Computing Project Grades 🧮](#part2)
    - [Question 2 (Checkpoint Question)](#Question-2-(Checkpoint-Question))
- [Part 3: Computing Lab Grades 🧪](#part3)
    - [Question 3 (Checkpoint Question)](#Question-3-(Checkpoint-Question))
    - [Question 4 (Checkpoint Question)](#Question-4-(Checkpoint-Question))
    - [Question 5](#Question-5)
    - [Question 6](#Question-6)
- [Part 4: Putting It All Together 🧩](#part4)
    - [Question 7](#Question-7)
    - [Question 8](#Question-8)
- [Part 5: Do Seniors Get Worse Grades? 👵](#part5)
    - [Question 9](#Question-9)
- [Part 6: What is the True Distribution of Grades? 🧐](#part6)
    - [Question 10](#Question-10)
    - [Question 11](#Question-11)

---

### The Syllabus

The course syllabus for this fictional class is as follows:

* **Lab assignments (20% total)**
    - Each lab is worth the same amount, regardless of each lab's raw point total.
    - The lowest lab is dropped.
    - Each lab may be revised for up to (and including) one week after the deadline for a 10% penalty, for up to (and including) two weeks after the deadline for a 30% penalty, and beyond that for a 60% penalty. Such revisions are reflected in the `'Lateness'` columns in the gradebook.
* **Projects (30% total)** 
    - Each project consists of an autograded portion, and **possibly** a free response portion.
    - The total points for a single project are the sum of the raw scores of the two portions.
    - Each project is worth the same amount, regardless of each project's raw point total.
* **Checkpoints (2.5% total)**
    - Each project checkpoint is worth the same amount, regardless of each project checkpoint's raw point total.
* **Discussions (2.5% total)**
    - Each discussion is worth the same amount, regardless of each discussion's raw point total.
* **Midterm Exam (15%)**
* **Final Exam (30%)**

You will need to refer to this syllabus repeatedly throughout the project, and several questions will link you back to it.

---

### Generalization

You may assume that your code will only need to work on a gradebook for a class with the syllabus given above. That is, you may assume that the DataFrame `grades` looks **like** the given one in `data/grades.csv`.

However, such a class:
1. May have different numbers of labs, projects, discussions, and project checkpoints.
2. May have a different number of students.

You may assume the course components and the naming conventions are as given in the data file, and you may assume that the course has no more than 99 of any type of assignment.

---

### Putting Everything Together

Here are a few remarks and tips for approaching Project 1, and projects more generally:

1. If you are having trouble figuring out what a question is asking you to do, look at the big picture and try to understand what the current step is doing to contribute to this big picture. This may clarify what's being asked!
1. These questions intentionally build off of each other and the final result matters! In fact, you can "get a question correct", but only receive partial credit for it because a previous answer was wrong.
    - You will typically receive partial credit for a question based on *how close* your answer is to correct (as well as some credit for a solution in the correct form).
    - You should try to assess your answer to each question based on what you understand of the data. This might involve writing extensive code (that isn't turned in) just to check your work! Suggestions on checking your work are given in the assignment, but you should also think of your own ways of checking your work.
    - As you do this project, think about the data from the perspective of the student (which should be easy to do, since you've used Gradescope before!)
1. To test the correctness of your answers:
    - Once you have implemented a particular function in `project.py`, you should test out your function in the notebook. In particular, you should inspect/analyze the output to assess its correctness.
    - Run your functions on the main dataset (`grades`) and ask yourself if the output *looks correct.*
    - Run your functions on very small datasets (e.g. 1-5 row DataFrames that you construct by hand), calculate the expected output by hand, and see if the function output matches (this *is* unit-testing your code with data).
    - Run your functions on (large and small) samples of the dataset `grades`. Does your code break, or does it still run as expected?
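
As a rough illustration of the last two tips, the sketch below checks a helper against values worked out by hand and then reruns it on samples of different sizes. The helper `normalized_score` and the DataFrame `tiny` are hypothetical; they are not part of the starter code.

```python
import pandas as pd

def normalized_score(df, name):
    """Hypothetical helper: raw score / max points, with NaN treated as 0."""
    return df[name].fillna(0) / df[name + ' - Max Points']

# A three-row gradebook built by hand, small enough to verify on paper.
tiny = pd.DataFrame({
    'lab01': [50.0, None, 100.0],
    'lab01 - Max Points': [100.0, 100.0, 100.0],
})
expected = [0.5, 0.0, 1.0]  # computed by hand
result = normalized_score(tiny, 'lab01').tolist()

# The helper should also run on samples of any size without breaking.
sample_lengths = [
    len(normalized_score(tiny.sample(n, random_state=0), 'lab01'))
    for n in (1, 2, 3)
]
```

If `result` and `expected` disagree, the bug is much easier to find on three rows than on 535.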

Run the cell below to load in the aforementioned `grades` dataset.

In [4]:
grades_fp = os.path.join('data', 'grades.csv')
grades = pd.read_csv(grades_fp)
grades.head()

Unnamed: 0,PID,College,Level,lab01,lab01 - Max Points,lab01 - Lateness (H:M:S),lab02,lab02 - Max Points,lab02 - Lateness (H:M:S),project01,...,discussion07 - Lateness (H:M:S),discussion08,discussion08 - Max Points,discussion08 - Lateness (H:M:S),discussion09,discussion09 - Max Points,discussion09 - Lateness (H:M:S),discussion10,discussion10 - Max Points,discussion10 - Lateness (H:M:S)
0,A14721419,SI,JR,99.735279,100.0,00:00:00,84.990171,100.0,00:00:00,75.282632,...,00:00:00,8.895294,10,00:00:00,10.0,10,780:01:28,10.0,10,00:00:00
1,A14883274,TH,JR,98.829476,100.0,00:00:00,50.784231,100.0,00:00:00,52.929482,...,669:12:21,9.022407,10,00:00:00,9.020283,10,00:00:00,9.437368,10,00:00:00
2,A14164800,SI,SR,86.513369,100.0,00:00:00,47.80282,100.0,00:00:00,46.122801,...,00:00:00,3.030538,10,00:04:51,7.613698,10,00:00:00,9.624617,10,00:00:00
3,A14847419,TH,JR,100.0,100.0,00:00:00,100.0,100.0,00:00:00,79.121806,...,00:00:00,10.0,10,00:00:00,9.249126,10,00:00:00,10.0,10,00:00:00
4,A14162943,SI,JR,66.506974,100.0,00:00:00,33.422412,100.0,00:00:00,41.823703,...,00:00:00,4.439606,10,00:00:00,4.485291,10,00:00:00,6.282712,10,00:00:00


**Tip:** The `grades` DataFrame has 100 columns, and you can't see them all right now. To get a feel for what all of the columns represent, you might consider opening `grades.csv` with a spreadsheet application, like Google Sheets or Excel.

<a name='part1'></a>

## Part 1: Enumerating the Assignments 🔢

To start, you'll list out the names of each assignment in the course.

### Question 1 (Checkpoint Question)

Complete the implementation of the function `get_assignment_names`, which takes in a DataFrame like `grades` and returns a dictionary with the following structure:
- The keys are the general areas of [the syllabus](#The-Syllabus): `'lab'`, `'project'`, `'midterm'`, `'final'`, `'disc'`, and `'checkpoint'`.
- The values are **lists** that contain all the assignment names of that type. For example, the lab assignments all have names of the form `'labXX'` where `XX` is a zero-padded two digit number. If the class has 5 labs, the returned dictionary's value for the `'lab'` key should be `['lab01', 'lab02', 'lab03', 'lab04', 'lab05']`.

See the doctests for more details.
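
To get a feel for the naming convention before touching the real data, you could experiment with a handful of made-up column names. The sketch below is only an illustration: `toy_columns` and `names_by_pattern` are hypothetical names, and matching with regular expressions is just one possible approach.

```python
import re

# Hypothetical column names that mimic the gradebook's naming convention.
toy_columns = [
    'PID', 'lab01', 'lab01 - Max Points', 'lab01 - Lateness (H:M:S)',
    'lab02', 'lab02 - Max Points', 'lab02 - Lateness (H:M:S)',
    'project01', 'project01 - Max Points',
    'project01_free_response', 'project01_free_response - Max Points',
    'discussion01', 'discussion01 - Max Points',
]

def names_by_pattern(columns, pattern):
    """Return the sorted column names that fully match a regular expression."""
    return sorted(c for c in columns if re.fullmatch(pattern, c))

# '\d{2}' matches the zero-padded two-digit assignment number.
toy_labs = names_by_pattern(toy_columns, r'lab\d{2}')
toy_projects = names_by_pattern(toy_columns, r'project\d{2}')
toy_discussions = names_by_pattern(toy_columns, r'discussion\d{2}')
```

Because `re.fullmatch` must match the entire name, columns such as `'lab01 - Max Points'` and `'project01_free_response'` are excluded automatically.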

In [119]:
grades = pd.read_csv(os.path.join('data', 'grades.csv'))

# Assignment columns are the ones without a ' - Max Points' or
# ' - Lateness (H:M:S)' suffix, e.g. 'lab01' or 'discussion07'.
base_cols = [c for c in grades.columns if ' - ' not in c]

out_dict = {
    'lab': sorted(c for c in base_cols if c.startswith('lab')),
    # Plain projects only: skip free-response and checkpoint columns.
    'project': sorted(c for c in base_cols
                      if c.startswith('project')
                      and 'free_response' not in c
                      and 'checkpoint' not in c),
    'midterm': sorted(c for c in base_cols if 'Midterm' in c),
    'final': sorted(c for c in base_cols if 'Final' in c),
    'disc': sorted(c for c in base_cols if c.startswith('disc')),
    # Keep checkpoint column names verbatim (underscores included).
    'checkpoint': sorted(c for c in base_cols if 'checkpoint' in c),
}


In [None]:
grader.check("q1")

<a name='part2'></a>

## Part 2: Computing Project Grades 🧮

Now you're ready to compute each student's overall grade on the first type of assignment – projects.

### Question 2 (Checkpoint Question)

Complete the implementation of the function `projects_total`, which takes in a DataFrame `grades` and returns a Series containing the total project grade for each student for the entire quarter, according to [the syllabus](#The-Syllabus). The output Series should contain values between 0 and 1.

***Note***: Don't forget to properly handle students who didn't turn in assignments. Also, don't forget the fact that some projects have free response components that you need to account for.

***Hints:*** To check your work, try:
1. Calculating the total project scores for a few types of students by hand.
2. Calculating summary statistics for the whole class' performance on a few projects in particular and ensuring the results seem reasonable.
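
For the first hint, a by-hand check might look like the sketch below. The two-student DataFrame `toy` is made up; only the column naming follows the gradebook. It is meant as a sanity check of the arithmetic, not as the required implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical gradebook: project01 has a free-response portion,
# project02 does not. NaN means the student did not submit.
toy = pd.DataFrame({
    'project01': [80.0, np.nan],
    'project01 - Max Points': [100.0, 100.0],
    'project01_free_response': [10.0, 5.0],
    'project01_free_response - Max Points': [20.0, 20.0],
    'project02': [45.0, 30.0],
    'project02 - Max Points': [50.0, 50.0],
})

# Project 1: (autograded + free response) / (sum of both max points).
p1_earned = toy['project01'].fillna(0) + toy['project01_free_response'].fillna(0)
p1_possible = (toy['project01 - Max Points']
               + toy['project01_free_response - Max Points'])
p1 = p1_earned / p1_possible

# Project 2: autograded portion only.
p2 = toy['project02'].fillna(0) / toy['project02 - Max Points']

# Each project is worth the same amount, so average the proportions.
toy_project_total = (p1 + p2) / 2
```

By hand, the first student earns 90/120 = 0.75 on Project 1 and 45/50 = 0.9 on Project 2, for a total of 0.825.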

In [127]:
grades = pd.read_csv(os.path.join('data', 'grades.csv'))
columns = grades.columns.to_list()

# Autograded project names, e.g. 'project01' (first 9 characters).
projectlist = sorted({c[:9] for c in columns if c.startswith('project')})
# Two-digit numbers of the projects with a free-response portion, e.g. '01'.
projectfrlist = sorted({c[7:9] for c in columns if 'free_response' in c})

# Scratch work: normalized score for each project that has a free-response
# portion, i.e. (autograded + free response) / (sum of both max points).
for num in projectfrlist:
    name = 'project' + num
    fr = name + '_free_response'
    # Missing submissions count as 0 points.
    earned = grades[name].fillna(0) + grades[fr].fillna(0)
    possible = grades[name + ' - Max Points'] + grades[fr + ' - Max Points']
    grades['project' + str(int(num))] = earned.div(possible)

grades['project1']


0      0.902826
1      0.629166
2      0.611228
3      0.924086
4      0.568237
         ...   
530    0.926306
531    0.756583
532    0.812733
533    0.767939
534    0.925721
Name: project1, Length: 535, dtype: float64

In [141]:
[x[-2:] for x in projectfrlist]

['01', '02', '05']

In [None]:
grader.check("q2")

<a name='part3'></a>

## Part 3: Computing Lab Grades 🧪

Now, you will clean and process the lab grades, which involves a bit more work than was necessary for projects. To do this, you will develop functions that:
- identify late submissions (Questions 3 and 4), 
- compute normalized scores for each lab assignment, factoring in late penalties (Question 5), and 
- drop the lowest lab grade and compute a total lab score for each student (Question 6).

### Question 3 (Checkpoint Question)

Unfortunately, Gradescope sometimes experiences a delay in registering when an assignment is submitted during "periods of heavy usage" (i.e. near a submission deadline). For instance, let's say that 15 students submit their assignment at 11:59 PM, right before the deadline. In this case, Gradescope has trouble registering the 15 submissions at the same time. As a result, Gradescope registers these submissions one-by-one **after** the deadline over some period of time, and these submissions are **marked late despite being submitted on-time**. 

Your job is to identify when a student's lab assignment was actually submitted on time, even if Gradescope did not process it in time and marked it as late. To do this, it is helpful to know that in our fictional class:
* Late submissions are turned off, so the students cannot submit their assignment on their own after the deadline.
* The only way that a student can make a late submission is by attending office hours and having a tutor submit for them.
* However, there are no office hours "just after" the deadline, since deadlines are at 11:59 PM and tutors are asleep by then.
* As a result, **truly late submissions are not submitted within a few hours of a deadline, but are instead submitted later**.

Complete the implementation of the function `last_minute_submissions`, which takes in a DataFrame `grades` and outputs a **Series indexed by lab assignment that contains the number of submissions that were turned in on time by students that were marked "late" by Gradescope**. For instance, if the value for the `'lab01'` index in your returned Series is 15, that should mean that 15 students submitted Lab 1 on time but were marked as late by Gradescope.

***Notes:***

- You have to figure out what a truly late submission is by looking at the data and understanding the data generating process described above. This question is about "cleaning" a messy "data recording process". There is some ambiguity in finding which submissions are truly late; you will make a best guess for **a threshold** by looking at this dataset.  All the submissions that occur before this **threshold time** are on-time submissions that are incorrectly marked as "late".

- There is no one correct value for the threshold; a range of threshold values will return the correct answer.

- This function only involves **labs**; do not look at any other assignment categories.

- If you're curious, this is not how Gradescope actually works.

***Hints:*** 

- At some point, you'll need to convert times of the form `'00:11:08'` into a more usable numeric form.
- Plot the distribution of number of submissions over time and use that to determine the threshold.
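
One possible way to do the conversion in the first hint is sketched below. The values in `toy_lateness` are made up, `to_hours` is a hypothetical helper, and the 2-hour threshold is purely illustrative; you still need to choose your own threshold from the data.

```python
import pandas as pd

# Hypothetical lateness values in the gradebook's 'H:M:S' format.
# Note that the hours part can have more than two digits.
toy_lateness = pd.Series(['00:00:00', '00:11:08', '01:30:00', '780:01:28'])

def to_hours(lateness):
    """Convert a Series of 'H:M:S' strings to a Series of hours (floats)."""
    parts = lateness.str.split(':', expand=True).astype(float)
    return parts[0] + parts[1] / 60 + parts[2] / 3600

toy_hours = to_hours(toy_lateness)

# ASSUMPTION: a 2-hour threshold, used only for illustration.
toy_threshold = 2
# Marked late (lateness > 0) but within the threshold.
toy_last_minute = ((toy_hours > 0) & (toy_hours <= toy_threshold)).sum()
```

Once lateness is numeric, a histogram of the positive values (for example with `toy_hours[toy_hours > 0].plot(kind='hist')`) can help reveal where last-minute submissions end and truly late ones begin.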

In [None]:
grader.check("q3")

### Question 4 (Checkpoint Question)

Now you need to adjust lab grades for truly late submissions. However, you need to take into account your investigation in the previous question, since students shouldn't be penalized by a bug in Gradescope!

Complete the implementation of the function `lateness_penalty`, which takes in a Series (like `grades['lab01 - Lateness (H:M:S)']`) containing data on how late a student turned in an assignment and returns a Series of penalties (represented by the values `1.0`, `0.9`, `0.7`, and `0.4` according to [the syllabus](#The-Syllabus)). **Only truly late submissions should be counted as late**.

***Note***: For the purpose of this project, we will only be calculating lateness for labs. There is no penalty for lateness for projects, discussions, or checkpoints (unlike in real DSC 80 😢).
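
The mapping from lateness to multiplier can be sketched with `np.select`, as below. This assumes lateness has already been converted to hours; `toy_hours_late` is made-up data, and the 2-hour on-time threshold is an illustrative assumption rather than the expected answer.

```python
import numpy as np
import pandas as pd

# Hypothetical lateness values, already converted to hours.
toy_hours_late = pd.Series([0.0, 1.5, 100.0, 168.0, 200.0, 336.0, 500.0])

# ASSUMPTION: submissions within 2 hours count as on time (illustrative only).
toy_threshold = 2
one_week, two_weeks = 7 * 24, 14 * 24

# np.select uses the value of the first condition that is True,
# so the conditions are ordered from least late to most late.
conditions = [
    toy_hours_late <= toy_threshold,  # on time (or mislabeled as late)
    toy_hours_late <= one_week,       # up to and including one week
    toy_hours_late <= two_weeks,      # up to and including two weeks
]
toy_penalties = pd.Series(
    np.select(conditions, [1.0, 0.9, 0.7], default=0.4),
    index=toy_hours_late.index,
)
```

The `<=` comparisons reflect the "up to (and including)" wording in the syllabus, so a submission exactly 168 hours late still receives `0.9`.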

In [None]:
grader.check("q4")

### Question 5

Complete the implementation of the function `process_labs`, which takes in a DataFrame like `grades` and returns a DataFrame of processed lab scores. The returned DataFrame should:
* have the same index as `grades`,
* have one column for each lab assignment (e.g. `'lab01'`, `'lab02'`,..., `'lab09'`),
* have values representing the final score for each lab assignment, adjusted for lateness and **normalized** to a score between 0 and 1.

***Note:*** If a student does not turn in a lab, their score for that lab is a 0.
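
For a single lab, the computation described above amounts to the arithmetic in the sketch below. All three Series are made-up values for three hypothetical students; in your own solution the multipliers would come from your lateness function.

```python
import numpy as np
import pandas as pd

# Hypothetical raw scores, max points, and lateness multipliers for one lab.
toy_raw = pd.Series([90.0, np.nan, 50.0])
toy_max = pd.Series([100.0, 100.0, 100.0])
toy_multiplier = pd.Series([1.0, 1.0, 0.7])

# Missing submissions count as 0; then normalize and apply the multiplier.
toy_lab_score = toy_raw.fillna(0) / toy_max * toy_multiplier
```

Here the three students end up with 0.9, 0.0, and 0.35 respectively.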

In [None]:
grader.check("q5")

### Question 6

Complete the implementation of the function `lab_total`, which takes in a DataFrame of processed assignments (like the output of Question 5) and returns a Series containing the total lab grade for each student according to [the syllabus](#The-Syllabus). Your answers should be proportions between 0 and 1. 

For example, if the class only has 3 labs, and a student received scores of 20%, 90%, and 100%, then your output should be `0.95`. This is because we drop the lowest score, and thus in effect we only compute the average of 90% and 100%, which is 95%, or 0.95 as a proportion.
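
The example above can be reproduced with the sketch below. The DataFrame `toy_processed` is hypothetical (its second row is an extra made-up student), and subtracting the row minimum is just one way to drop the lowest score.

```python
import pandas as pd

# First row: the example from the text (20%, 90%, and 100%).
# Second row: a hypothetical second student.
toy_processed = pd.DataFrame({
    'lab01': [0.2, 1.0],
    'lab02': [0.9, 0.5],
    'lab03': [1.0, 0.0],
})

# Dropping the lowest lab is the same as subtracting the row minimum
# from the row sum and dividing by one fewer lab.
n_labs = toy_processed.shape[1]
toy_lab_total = (
    toy_processed.sum(axis=1) - toy_processed.min(axis=1)
) / (n_labs - 1)
```

The first student's total is (2.1 - 0.2) / 2 = 0.95, matching the example.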

In [None]:
grader.check("q6")

<a name='part4'></a>

## Part 4: Putting It All Together 🧩

It's time to compute the letter grade of each student.

### Question 7

First, you need to compute each student's course grade, which results from adding their total grades in each course component according to the weights given in [the syllabus](#The-Syllabus).

Complete the implementation of the function `total_points`, which takes in a DataFrame `grades` and returns a Series containing each student's course grade. **Course grades should be proportions between 0 and 1.**

***Notes***: 

- Don't repeat yourself when computing the checkpoint and discussion portions of the course.
- Remember, only the lab portion of the course accounts for late assignments; you may assume all assignments in other portions are turned in without penalty.
- Do the work by hand for a few students to check your code!
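
To check the weighting by hand, it may help to lay out the syllabus weights next to some made-up component totals, as in this sketch. The DataFrame `toy_components` is hypothetical; only the weights come from the syllabus.

```python
import pandas as pd

# Syllabus weights for each course component (they sum to 1).
weights = pd.Series({
    'lab': 0.20, 'project': 0.30, 'checkpoint': 0.025,
    'disc': 0.025, 'midterm': 0.15, 'final': 0.30,
})

# Hypothetical component totals (proportions) for two students.
toy_components = pd.DataFrame({
    'lab': [0.95, 0.60], 'project': [0.80, 0.70],
    'checkpoint': [1.00, 0.50], 'disc': [1.00, 0.90],
    'midterm': [0.70, 0.65], 'final': [0.90, 0.55],
})

# Multiplying a DataFrame by a Series aligns the Series' index with the
# DataFrame's columns, so each column is scaled by its own weight.
toy_course_grade = (toy_components * weights).sum(axis=1)
```

For the first student, this gives 0.19 + 0.24 + 0.025 + 0.025 + 0.105 + 0.27 = 0.855.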

In [None]:
grader.check("q7")

### Question 8

Two more functions to go in this part!

#### `final_grades`

Complete the implementation of the function `final_grades`, which takes in final course grades (as computed in Question 7) and returns a Series of letter grades as determined by the standard cutoffs (without pluses or minuses):

| Letter Grade | Cutoff |
|:--- | --- |
| A | grade >= 0.9 |
| B | 0.8 <= grade < 0.9 |
| C | 0.7 <= grade < 0.8 |
| D | 0.6 <= grade < 0.7 |
| F | grade < 0.6 |

***Note:*** Do not round anyone's course grade when determining their letter grade.
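
One way to express the cutoff table in code is with `pd.cut`, sketched below on made-up grades. This is only one option; the key detail is `right=False`, which makes each bin include its lower cutoff and exclude its upper one, as in the table.

```python
import numpy as np
import pandas as pd

# Hypothetical course grades, including values right at and just below a cutoff.
toy_grades = pd.Series([0.95, 0.9, 0.899999, 0.7, 0.65, 0.2])

# Bins are [-inf, 0.6), [0.6, 0.7), [0.7, 0.8), [0.8, 0.9), [0.9, inf).
toy_letters = pd.cut(
    toy_grades,
    bins=[-np.inf, 0.6, 0.7, 0.8, 0.9, np.inf],
    labels=['F', 'D', 'C', 'B', 'A'],
    right=False,
).astype(str)
```

Note that 0.899999 maps to a B, since no rounding takes place.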

#### `letter_proportions`

Complete the implementation of the function `letter_proportions`, which takes in final course grades (as computed in Question 7) and returns a Series containing the proportion of the class that received each letter grade. For instance, this Series might tell us that the proportion of the class receiving B's was 0.45, A's was 0.33, C's was 0.16, D's was 0.05, and F's was 0.01 (though these are made up numbers). The index of this Series should be letters, and the **values should be sorted in decreasing order**.

***Notes***: 

- The values in your returned Series should add up to exactly 1.0. If you are getting something close such as 0.99999, that means there is an issue with your code in a function you implemented earlier. 

- To check your work, verify the course grade distribution and relevant statistics! 
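
Given a Series of letter grades, the proportions can be sketched with `value_counts`, as below. The Series `toy_letter_series` is made up for illustration.

```python
import pandas as pd

# Hypothetical letter grades for a class of eight students.
toy_letter_series = pd.Series(['B', 'A', 'B', 'C', 'B', 'A', 'F', 'B'])

# normalize=True returns proportions instead of counts; value_counts
# sorts its result in decreasing order by default.
toy_props = toy_letter_series.value_counts(normalize=True)
```

In this toy class, half of the students earned a B, so `toy_props['B']` is 0.5. Letters that no student earned (here, D) do not appear in the result.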

In [None]:
grader.check("q8")

<a name='part5'></a>

## Part 5: Do Seniors Get Worse Grades? 👵

Now that you've computed the overall course grades and letter grades for all students in our fictional class, it's time to perform some analyses of these grades.

### Question 9

You notice that, on average, seniors did worse in the class (if you can't verify this, you should go back and check your work!). Is this difference significant, or just due to noise?

Perform a hypothesis test, assessing the likelihood of the above statement under the null hypothesis: 
> Seniors earn grades that are roughly equal on average to the rest of the class.

To do this, complete the implementation of the function `simulate_pval`, which takes in a DataFrame `grades` and a number of simulations `N` and returns the **probability that the mean grade (between 0 and 1) earned by a random subset of students* is less than or equal to the average grade of seniors** (i.e. calculate and return the p-value).

*_The size of each random subset must be the same as the number of seniors in the class._

***Note:*** To check your work, plot the sampling distribution of your test statistic along with the observed test statistic. Do these values look reasonable?
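
The simulation described above can be sketched on synthetic data as follows. Everything here is made up for illustration: the 200 uniform "course grades", the choice of 40 seniors, and the fixed seed. The structure (draw subsets of the same size as the senior group, then compare means) is the part that carries over.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(80)  # fixed seed so the sketch is reproducible

# Hypothetical class: 200 course grades, the first 40 belong to seniors.
toy_totals = pd.Series(rng.uniform(0.5, 1.0, size=200))
toy_is_senior = pd.Series([True] * 40 + [False] * 160)

observed = toy_totals[toy_is_senior].mean()
n_seniors = int(toy_is_senior.sum())

# Draw N random subsets (without replacement) of the same size as the
# senior group and record the mean grade of each subset.
N = 1000
all_grades = toy_totals.to_numpy()
simulated = np.array([
    rng.choice(all_grades, size=n_seniors, replace=False).mean()
    for _ in range(N)
])

# p-value: proportion of simulated means at or below the observed mean.
toy_pval = (simulated <= observed).mean()
```

Plotting a histogram of `simulated` with a vertical line at `observed` is a quick way to see whether the p-value is plausible.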

In [None]:
grader.check("q9")

<a name='part6'></a>

## Part 6: What is the True Distribution of Grades? 🧐

The gradebook for this class only reflects one particular instance of each student's performance, subject to the effects of all the little events and hiccups that occurred throughout the quarter. Would you have done better on the midterm if your roommate hadn't kept you up all night with their coughing? Wasn't it lucky that the example you were studying just before the final happened to appear on the exam?

### Question 10

This question will simulate these "(un)lucky, random events" by **adding or subtracting random amounts to each assignment right before calculating the final grades for that assignment**. These "random amounts" will be drawn from a Gaussian distribution with mean 0 and standard deviation 0.02:
```
np.random.normal(0, 0.02, size=(num_rows, num_cols))
```
Intuitively, such a model says that random events may bump up or down a particular grade, given as a proportion, in a way that:
- on average has no effect on the class as a whole (mean 0), but
- could perturb an individual grade by 0.02 or more (standard deviation 0.02).

Create a function `total_points_with_noise` that takes in a DataFrame like `grades`, adds noise to the assignments as described above, and returns the final scores using **the same procedure** as Questions 1-8.

***Notes:*** 

- **For any given assignment, the noise should be applied after calculating the normalized scores (i.e. proportions) for that assignment. For example, the noise should be added to `'lab01'` once the column contains scores that are less than or equal to 1.**

- Your function should finish in a *reasonable amount of time* (in this case, about 5 seconds). If your code is slow, make sure you’re not looping over the rows of a large dataframe, since that is very inefficient!

- After adding the noise to the assignment scores, use the `np.clip` function to be sure each assignment retains a score between 0 and 1.

- You should be able to reuse (or slightly modify) the code from previous problems. Try to practice DRY (don't repeat yourself)!

- To check your work, ask yourself: what would you expect the difference between the actual scores and noisy scores to be, on average?
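
The noise-then-clip step can be sketched on a small made-up table of normalized scores, as below. The sketch uses a seeded `np.random.default_rng` generator so that it is reproducible, which is a substitution for the `np.random.normal` call shown above; `toy_scores` is hypothetical data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(80)  # seeded only to make the sketch reproducible

# Hypothetical normalized scores (proportions) for two labs.
toy_scores = pd.DataFrame({
    'lab01': [0.99, 0.50, 0.01],
    'lab02': [1.00, 0.75, 0.00],
})

# One noise value per cell, drawn from a Gaussian with mean 0 and SD 0.02.
noise = rng.normal(0, 0.02, size=toy_scores.shape)

# Add the noise to all cells at once (no loop over rows), then clip so
# that every score stays between 0 and 1.
toy_noisy = pd.DataFrame(
    np.clip(toy_scores.to_numpy() + noise, 0, 1),
    index=toy_scores.index,
    columns=toy_scores.columns,
)
```

Scores near the boundaries (such as 1.00 or 0.00) are the ones most affected by clipping, since noise can push them out of range in one direction.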

In [None]:
grader.check("q10")

### Question 11

To conclude, you will answer the following five questions about the class described in `grades`. The function you are required to implement in this question, `short_answer`, should return a **hard-coded list of length 5** containing your answers to the following questions, in order:

0. What is the **mean difference** between students' scores (`total_points`) and their scores with noise (`total_points_with_noise`), amongst all students in `grades`? (***Hint:*** Plot the distribution of differences.)
1. What **proportion** of the class sees their grade change by less than $0.01$ in either direction (that is, the change is strictly between $-0.01$ and $0.01$)? (Your answer should be a number between 0 and 1.)
2. What is the 95% confidence interval for the statistic above, as a **tuple**? (***Hint:*** See the [DSC 10 course notes](https://notes.dsc10.com/05-hypothesis_testing/1_hypothesis_tests.html) and use `np.percentile`).
3. What **proportion** of the class sees a change in their letter grade when moving from `total_points` to `total_points_with_noise`?
4. Answer the following two true-or-false questions with a tuple, like `(True, True)`.
    - True or False: The model in Question 10 assumes that the observed gradebook (i.e. `grades`) is a good representation of the grades of students in DSC 80.
    - True or False: The Series that `total_points_with_noise(grades)` evaluates to **cannot** be interpreted as a Series of grades drawn from the population of students in DSC 80.

For example, `[0.0, 0.0, (0.0, 1.0), 0.0, (True, True)]` is an answer in valid form.
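
As a rough illustration of how `np.percentile` yields an interval like the one in item 2, the sketch below bootstraps a proportion on synthetic data. Bootstrapping is an assumed approach here, and `toy_diffs` is made up; your own differences should come from `total_points` and `total_points_with_noise`.

```python
import numpy as np

rng = np.random.default_rng(80)  # fixed seed so the sketch is reproducible

# Hypothetical differences between noisy and original course grades.
toy_diffs = rng.normal(0, 0.005, size=500)

def prop_small_change(diffs):
    """Proportion of differences strictly within +/- 0.01."""
    return (np.abs(diffs) < 0.01).mean()

# Resample the differences with replacement and recompute the proportion.
boot_props = np.array([
    prop_small_change(rng.choice(toy_diffs, size=len(toy_diffs), replace=True))
    for _ in range(1000)
])

# The middle 95% of the bootstrapped proportions.
toy_ci = (np.percentile(boot_props, 2.5), np.percentile(boot_props, 97.5))
```

The resulting tuple has the same form as the third entry of the example answer above.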

In [None]:
grader.check("q11")

## Congratulations, you've finished Project 1! 🎉

Submit your `.py` file to Gradescope. Note that you only need to submit the `.py` file; this notebook should not be uploaded because there are no manually-graded questions in this project.

Before submitting, you should ensure that all of your work is in the `.py` file. You can do this by running the doctests below, which will verify that your work passes the public tests **and** that your work is in the `.py` file. Run the cell below; you should see no output.

In [108]:
!python -m doctest project.py

In addition, `grader.check_all()` will verify that your work passes the public tests. Ultimately, the Gradescope autograder is also going to run `grader.check_all()`, so you should ensure these pass as well (which they should if the doctests above passed).

---

To double-check your work, the cell below will rerun all of the autograder tests.

In [None]:
grader.check_all()