### Project description:

One of the jobs that all teachers have in common is evaluating students. Whether you use exams, homework assignments, quizzes, or projects, you usually have to turn students’ scores into a letter grade at the end of the term. This often involves a bunch of calculations that you might do in a spreadsheet. Instead, you can consider using Python and pandas.

One problem with using a spreadsheet is that it can be hard to see when you make a mistake in a formula. Maybe you selected the wrong column and put quizzes where exams should go. Maybe you found the maximum of two incorrect values. To solve this problem, you can use Python and pandas to do all your calculations and find and fix those mistakes much faster.

In this project, we will learn how to:

- Load and explore data from multiple sources with pandas
- Clean a DataFrame using NumPy and pandas
- Merge data in a pandas DataFrame
- Calculate grades, then filter and group in a pandas DataFrame
- Plot summary statistics

Once we complete these steps, we’ll have the grades in a format that we should be able to upload to the school’s student administration system.

Click the link below for the project resource:
https://realpython.com/pandas-project-gradebook/

## Code and steps for the project

In [143]:
# import all packages
import numpy as np
import pandas as pd
import re
# import the missed libraries
from functools import reduce
from pathlib import Path

# Save the path in DATA_FOLDER variable to be reused
HERE = Path().resolve()
DATA_FOLDER = HERE / "materials-pandas-gradebook-project/data"

### Loading and exploring the data

In [144]:
# hw_exam_grades: export from the homework and exam grading service.
# Each student has an SID, a first name, and a last name. For every assignment and exam
# there are three columns: the score the student received, the maximum score,
# and the time the student submitted it.
hw_exam_grades = pd.read_csv(DATA_FOLDER / 'hw_exam_grades.csv')
hw_exam_grades.name = 'hw_exam_grades' # to be used later in data cleaning
# roster: class roster from the student administration system.
# It holds each student's ID number, name, NetID, and email address, plus the section
# of the class they belong to. This term you taught one class that met at different times,
# and each class time has a different section number.
roster = pd.read_csv(DATA_FOLDER / 'roster.csv')
roster.name = 'roster' # to be used later in data cleaning
# quiz_grades: each student has a last name, first name, email, and quiz grade.
# The maximum possible quiz score isn't stored in these files;
# it is supplied later, when the quiz scores are calculated.

# Use pathlib to load all the quiz files in one loop

quizzes = []
for file_path in sorted(DATA_FOLDER.glob('quiz_*_grades.csv')):
    # stem is the file name without its suffix ('quiz_1_grades'),
    # title() capitalises each word, so quiz_name becomes 'Quiz 1'
    quiz_name = ' '.join(file_path.stem.title().split('_')[:2])
    quiz = pd.read_csv(file_path,
                       usecols=['Email', 'Grade'],
                       index_col=['Email']) # index by email to prevent repeated columns
    quizzes.append(quiz.rename(columns={'Grade':quiz_name}))
# one concat over all quizzes, aligned on the Email index
quiz_grades = pd.concat(quizzes, axis='columns')
# Lastly, reset the index because the Email column is needed later.
quiz_grades.reset_index(inplace=True)
quiz_grades.name = 'quiz_grades' # to be used later in data cleaning

#     Old way before fix

# quiz_1_grades = pd.read_csv('materials-pandas-gradebook-project/data/quiz_1_grades.csv')
# quiz_2_grades = pd.read_csv('materials-pandas-gradebook-project/data/quiz_2_grades.csv')
# quiz_3_grades = pd.read_csv('materials-pandas-gradebook-project/data/quiz_3_grades.csv')
# quiz_4_grades = pd.read_csv('materials-pandas-gradebook-project/data/quiz_4_grades.csv')
# quiz_5_grades = pd.read_csv('materials-pandas-gradebook-project/data/quiz_5_grades.csv')

hw_exam_grades[['Homework 1 - Max Points','Homework 2 - Max Points',
               'Homework 3 - Max Points','Homework 4 - Max Points',
               'Homework 5 - Max Points','Homework 6 - Max Points',
               'Homework 7 - Max Points','Homework 8 - Max Points',
               'Homework 9 - Max Points','Homework 10 - Max Points']]
# roster
# quiz_1_grades
quiz_grades
# hw_exam_grades

Unnamed: 0,Email,Quiz 1,Quiz 2,Quiz 3,Quiz 4,Quiz 5
0,richard.bennett@univ.edu,10,6,9,8,10
1,timothy.parker@univ.edu,9,14,13,14,10
2,carol.reyes@univ.edu,5,15,8,14,6
3,brooke.powers@univ.edu,6,10,17,10,8
4,michael.taylor@univ.edu,5,15,13,12,5
...,...,...,...,...,...,...
145,jeffrey.perez@univ.edu,4,7,12,12,9
146,angela.dunlap@univ.edu,6,11,11,11,6
147,richard.elliott@univ.edu,6,13,17,11,12
148,donna.nguyen@univ.edu,7,12,14,9,4
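
The column names `Quiz 1` to `Quiz 5` in the table above come from the file names. The sketch below isolates that step; `quiz_column_name` and the `data/` paths are hypothetical names used only for illustration, and no files are read.

```python
from pathlib import Path

def quiz_column_name(file_path):
    """Turn a file name such as 'quiz_1_grades.csv' into 'Quiz 1'."""
    # .stem drops the folder and the '.csv' suffix -> 'quiz_1_grades'
    # .title() capitalises each word               -> 'Quiz_1_Grades'
    # split('_')[:2] keeps the first two parts     -> ['Quiz', '1']
    return ' '.join(Path(file_path).stem.title().split('_')[:2])

# Hypothetical paths, used only to show the transformation
names = [quiz_column_name(f'data/quiz_{n}_grades.csv') for n in range(1, 4)]
print(names)
```

`Path.glob` does not promise a particular order, so wrapping the glob in `sorted()` is one way to keep the quiz columns in a stable order.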


### Clean a DataFrame using NumPy and pandas


### Data cleaning
##### 1- Trim/Strip space within string values
##### 2- Treat Empty/Null values and report nulls
##### 3- Treat duplicated values
##### 4- Concatenate or split columns, and unify the data types if needed
##### 5- Unify the unique ID (student ID)
##### 6- Sort data
##### 7- Check the total number of students
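
As a rough illustration of steps 1 to 3, here is a minimal sketch on a made-up table. The `toy` DataFrame and its values are hypothetical and are not taken from the project files; the column names only mimic the real ones.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from the project files
toy = pd.DataFrame({
    'NetID': [' abc123', 'def456 ', 'abc123'],
    'Section': ['1', '', '2'],
    'Quiz 1': [7.0, np.nan, 9.0],
})

# 1- strip leading/trailing spaces from a string column
toy['NetID'] = toy['NetID'].str.strip()
# 2- turn empty strings into NaN, then list the columns that contain nulls
toy['Section'] = toy['Section'].replace('', np.nan)
null_columns = toy.columns[toy.isna().any()].tolist()
# 2- replace NaN with zero in a numeric column
toy['Quiz 1'] = toy['Quiz 1'].fillna(0)
# 3- rows whose ID already appeared earlier in the table
duplicates = toy[toy['NetID'].duplicated()]

print(null_columns)
print(duplicates)
```

Each result is assigned back to the column rather than modified with `inplace=True`, which keeps the change visible on the original DataFrame.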



In [145]:
# functions initiation

#     -------------------------------------------------
# 1- Trim/Strip space within string values
#     -------------------------------------------------
def remove_col_str_space(df, col):
    df[col] = df[col].str.strip()
    return df

#     -------------------------------------------------
# 2- Treat Empty/Null values and report nulls
#     -------------------------------------------------
def replace_col_empty_value(df,col):
    # assign back instead of using inplace=True on a column selection,
    # which may operate on a temporary copy
    df[col] = df[col].replace('',np.nan)
    return df
def replace_col_nan_value(df,col):
    # Note: blanks in numeric fields are already read in as NaN
    df[col] = df[col].fillna(0)
    return df
# Report null values in dataframe columns
def checknull(col, mask):
    return mask[col].unique()
def reportnulls(df):
    # masking (True/False)
    mask = df.isnull()
    counter=0
    for col in mask.columns:
        if True in checknull(col, mask):
            print('Column: ',col,' contains Null value')
            counter = counter+1
    if counter == 0:
        print('There is no Nulls in: ',df.name)
        
#     -------------------------------------------------
# 3- Treat duplicated values
#     -------------------------------------------------
def retrieve_col_duplicated_value(df,col):
    return df[df[col].duplicated()]

# functions calling to clean & prepare the datasets

# Trim/Strip space within string values
# String columns in dataframes
string_col_hw_exam_grades = hw_exam_grades[['First Name','Last Name','SID']]
string_col_roster = roster[['Name','NetID','Email Address']]
string_col_quiz_grades = quiz_grades[['Email']]

#     Old way before fix 

# string_col_quiz_1_grades = quiz_1_grades[['Last Name','First Name','Email']]
# string_col_quiz_2_grades = quiz_2_grades[['Last Name','First Name','Email']]
# string_col_quiz_3_grades = quiz_3_grades[['Last Name','First Name','Email']]
# string_col_quiz_4_grades = quiz_4_grades[['Last Name','First Name','Email']]
# string_col_quiz_5_grades = quiz_5_grades[['Last Name','First Name','Email']]
[remove_col_str_space(hw_exam_grades,i) for i in string_col_hw_exam_grades.columns]
[remove_col_str_space(roster,i) for i in string_col_roster.columns]
remove_col_str_space(quiz_grades, string_col_quiz_grades.columns[0])

#     Old way before fix

# [remove_col_str_space(quiz_1_grades,i) for i in string_col_quiz_1_grades.columns]
# [remove_col_str_space(quiz_2_grades,i) for i in string_col_quiz_2_grades.columns]
# [remove_col_str_space(quiz_3_grades,i) for i in string_col_quiz_3_grades.columns]
# [remove_col_str_space(quiz_4_grades,i) for i in string_col_quiz_4_grades.columns]
# [remove_col_str_space(quiz_5_grades,i) for i in string_col_quiz_5_grades.columns]

# Treat Empty/Null values and report nulls
# targeted df to report nulls
reportnulls(hw_exam_grades)
reportnulls(roster)
reportnulls(quiz_grades)

#     Old way before fix

# reportnulls(quiz_1_grades)
# reportnulls(quiz_2_grades)
# reportnulls(quiz_3_grades)
# reportnulls(quiz_4_grades)
# reportnulls(quiz_5_grades)

    # Check if df has nan values {EXAMPLE} to validate reportnulls function
    # df = pd.DataFrame(np.random.rand(10,10), index= [1,2,3,4,5,6,7,8,9,10], columns=np.arange(10))
    # df = df[df > 0.6]
    # # arbitrary cols
    # df['No nulls'] = [1,1,1,1,1,1,1,1,11,1]
    # df['Np2'] = [2,2,2,2,2,2,2,2,2,2]
    # # one way to move a column (when you have massive number of columns)
    # colM = df.pop('No nulls')
    # df.insert(4,'Np',colM)
    # # another way be like
    # # df = df[[0,1,2,3,'No nulls',4,5,6,7,8,9]]
    # # report the nulls in columns using the function
    # print(reportnulls(mask2))
    # # columns modification
    # df[2] = np.arange(2,21,2)
    # df[1] = 0
    # # report the nulls in columns using the function after the modifications
    # print(reportnulls(df))
    # df
# replace the '' values with nan in a certain columns
[replace_col_empty_value(hw_exam_grades,i) for i in hw_exam_grades.columns]
[replace_col_empty_value(roster,i) for i in roster.columns]
[replace_col_empty_value(quiz_grades,i) for i in quiz_grades.columns]

#     Old way before fix

# [replace_col_empty_value(quiz_1_grades,i) for i in quiz_1_grades.columns]
# [replace_col_empty_value(quiz_2_grades,i) for i in quiz_2_grades.columns]
# [replace_col_empty_value(quiz_3_grades,i) for i in quiz_3_grades.columns]
# [replace_col_empty_value(quiz_4_grades,i) for i in quiz_4_grades.columns]
# [replace_col_empty_value(quiz_5_grades,i) for i in quiz_5_grades.columns]

# replace the NaN values with zero in the numeric columns:
# every 'Homework n' / 'Exam n' score and its '- Max Points' column
numeric_col_hw_exam_grades = [
    name
    for kind, count in (('Homework', 10), ('Exam', 3))
    for n in range(1, count + 1)
    for name in (f'{kind} {n}', f'{kind} {n} - Max Points')
]
numeric_col_quiz_grades = ['Quiz 1', 'Quiz 2', 'Quiz 3', 'Quiz 4', 'Quiz 5']
[replace_col_nan_value(hw_exam_grades,i) for i in numeric_col_hw_exam_grades]
print(reportnulls(hw_exam_grades),', so, nan value has been changed to zero') # just to do a quick check
replace_col_nan_value(roster,'Section')
[replace_col_nan_value(quiz_grades,i) for i in numeric_col_quiz_grades]

#     Old way before fix

# [replace_col_nan_value(quiz_1_grades,'Grade')]
# [replace_col_nan_value(quiz_2_grades,'Grade')]
# [replace_col_nan_value(quiz_3_grades,'Grade')]
# [replace_col_nan_value(quiz_4_grades,'Grade')]
# [replace_col_nan_value(quiz_5_grades,'Grade')]

# Treat duplicated values
duplicated_col = retrieve_col_duplicated_value(hw_exam_grades,'SID')
duplicated_col2 = retrieve_col_duplicated_value(roster,'NetID')
# test each length separately: `len(a) & len(b) == 0` is a bitwise AND
# and can be 0 even when both frames contain rows (for example 1 & 2)
if len(duplicated_col) == 0 and len(duplicated_col2) == 0:
    print('there is no duplicated unique ID\'s')
else:
    print('Oops, you have got some duplicated IDs:')
    print(duplicated_col)
    print(duplicated_col2)
    # The quiz grades are skipped because they have no unique ID column
    
#     -------------------------------------------------
# 4- Concatenate or split columns, and unify the data type if needed
#     -------------------------------------------------
# first make sure each name part is a single word (no spaces or symbols),
# so the combined format is "LastName, FirstName"
hw_exam_grades[hw_exam_grades['First Name'].str.fullmatch(r'\w+')]
hw_exam_grades[hw_exam_grades['Last Name'].str.fullmatch(r'\w+')]
    # each result should have the same length as the original df
# Concatenate
hw_exam_grades['Name'] = hw_exam_grades['Last Name']+', '+hw_exam_grades['First Name']
# Unify the data type if needed
datetime_columns = ['Homework 1 - Submission Time',
                'Homework 2 - Submission Time',
                'Homework 3 - Submission Time',
                'Homework 4 - Submission Time',
                'Homework 5 - Submission Time',
                'Homework 6 - Submission Time',
                'Homework 7 - Submission Time',
                'Homework 8 - Submission Time',
                'Homework 9 - Submission Time',
                'Homework 10 - Submission Time',
                'Exam 1 - Submission Time',
                'Exam 2 - Submission Time',
                'Exam 3 - Submission Time']
# get rid of the time of day; regex=True is needed because pandas 2.0 and later
# treat the pattern as a literal string by default
hw_exam_grades[datetime_columns] = hw_exam_grades[datetime_columns].apply(lambda x: x.str.replace(r'\s.*', '', regex=True))
# convert to datetime
hw_exam_grades[datetime_columns] = hw_exam_grades[datetime_columns].apply(lambda x: pd.to_datetime(x, format='%Y-%m-%d'))

#     -------------------------------------------------
# # 5- Unify the unique ID (student ID and email)
#     -------------------------------------------------
# unify the identifier names and emails
roster.rename(columns={'NetID':'Identifier'}, inplace=True)
hw_exam_grades.rename(columns={'SID':'Identifier'}, inplace=True)
roster.rename(columns={'Email Address':'Email'}, inplace=True)
# lower case all identifiers and emails
roster['Identifier'] = roster['Identifier'].str.lower()
hw_exam_grades['Identifier'] = hw_exam_grades['Identifier'].str.lower()
roster['Email'] = roster['Email'].str.lower()

#     -------------------------------------------------
# 6- Sort data
#     -------------------------------------------------
# Not done at this stage

#     -------------------------------------------------
# 7- Check the total number of students
#     -------------------------------------------------
# they should be all 150
print(len(hw_exam_grades))
print(len(roster))
print(len(quiz_grades))

Column:  Homework 1  contains Null value
There is no Nulls in:  roster
There is no Nulls in:  quiz_grades
There is no Nulls in:  hw_exam_grades
None , so, nan value has been changed to zero
there is no duplicated unique ID's
150
150
150
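
The submission-time step in the cell above strips everything after the date and then parses what is left. A minimal sketch of that idea follows; the two timestamps and their layout are assumptions made for illustration, not values from the project data.

```python
import pandas as pd

# Hypothetical timestamps; the 'YYYY-MM-DD HH:MM:SS+offset' layout is an assumption
times = pd.Series(['2019-08-29 08:56:02-07:00', '2019-09-01 23:10:45-07:00'])

# With regex=True, r'\s.*' matches the first whitespace and everything after it
dates_only = times.str.replace(r'\s.*', '', regex=True)
# With regex=False the pattern is a literal string, so nothing is removed
unchanged = times.str.replace(r'\s.*', '', regex=False)

dates = pd.to_datetime(dates_only, format='%Y-%m-%d')
print(dates_only.tolist())
print(dates.dt.day.tolist())
```

Passing `regex` explicitly makes the call behave the same way across pandas versions, since the default changed from `True` to `False` in pandas 2.0.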




### Merge data in a pandas DataFrame

In [146]:
# Rename and merge based on the new dataframes (quiz_grades)

joined_hw_exam_grades_roster = pd.merge(hw_exam_grades, roster, how='inner', on='Identifier')
joined_hw_exam_grades_roster.drop(['Name_y', 'ID'], axis='columns', inplace=True)
joined_hw_exam_grades_roster = joined_hw_exam_grades_roster.rename(columns={'Name_x':'Name'})

joined_hw_exam_grades_roster

joined_final = pd.merge(joined_hw_exam_grades_roster, quiz_grades, how='inner', on='Email')

joined_final.columns

#     Old way before fix

# quiz_12_grades = pd.merge(quiz_1_grades, quiz_2_grades, how='inner', 
#                           on=['Last Name', 'First Name', 'Email'])
# quiz_12_grades = pd.DataFrame(quiz_12_grades)
# quiz_123_grades = pd.merge(quiz_12_grades, quiz_3_grades, how='inner', 
#                           on=['Last Name', 'First Name', 'Email'])
# quiz_123_grades.rename(columns={'Grade':'Grade_z'}, inplace=True)
# quiz_1234_grades = pd.merge(quiz_123_grades, quiz_4_grades, how='inner', 
#                           on=['Last Name', 'First Name', 'Email'])
# quiz_1234_grades.rename(columns={'Grade':'Grade_w'}, inplace=True)
# quiz_12345_grades = pd.merge(quiz_1234_grades, quiz_5_grades, how='inner', 
#                           on=['Last Name', 'First Name', 'Email'])
# # joined_hw_exam_grades_roster_quiz_n_grades = reduce(left)
# quiz_12345_grades['Quizzes'] = (quiz_12345_grades['Grade']+
#                               quiz_12345_grades['Grade_x']+
#                               quiz_12345_grades['Grade_y']+
#                               quiz_12345_grades['Grade_z']+
#                               quiz_12345_grades['Grade_w'])/5
# quiz_12345_grades.drop(['Grade_x','Grade_y','Grade_z','Grade_w', 'Grade'] ,axis='columns', inplace=True)
# quiz_12345_grades




Index(['First Name', 'Last Name', 'Identifier', 'Homework 1',
       'Homework 1 - Max Points', 'Homework 1 - Submission Time', 'Homework 2',
       'Homework 2 - Max Points', 'Homework 2 - Submission Time', 'Homework 3',
       'Homework 3 - Max Points', 'Homework 3 - Submission Time', 'Homework 4',
       'Homework 4 - Max Points', 'Homework 4 - Submission Time', 'Homework 5',
       'Homework 5 - Max Points', 'Homework 5 - Submission Time', 'Homework 6',
       'Homework 6 - Max Points', 'Homework 6 - Submission Time', 'Homework 7',
       'Homework 7 - Max Points', 'Homework 7 - Submission Time', 'Homework 8',
       'Homework 8 - Max Points', 'Homework 8 - Submission Time', 'Homework 9',
       'Homework 9 - Max Points', 'Homework 9 - Submission Time',
       'Homework 10', 'Homework 10 - Max Points',
       'Homework 10 - Submission Time', 'Exam 1', 'Exam 1 - Max Points',
       'Exam 1 - Submission Time', 'Exam 2', 'Exam 2 - Max Points',
       'Exam 2 - Submission Time', 'Exam 
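
The `Name_x` / `Name_y` columns handled in the merge cell appear because both tables carry a `Name` column that is not part of the join key. The sketch below reproduces this on two made-up tables; `toy_grades`, `toy_roster`, and their values are hypothetical, and the `validate` argument is an optional extra check that the original code does not use.

```python
import pandas as pd

# Hypothetical toy tables, not taken from the project files
toy_grades = pd.DataFrame({'Identifier': ['abc123', 'def456'],
                           'Name': ['Lee, Ann', 'Cruz, Bo'],
                           'Exam 1': [80, 65]})
toy_roster = pd.DataFrame({'Identifier': ['def456', 'abc123'],
                           'Name': ['Bo Cruz', 'Ann Lee'],
                           'Email': ['bo@univ.edu', 'ann@univ.edu']})

# validate='one_to_one' raises an error if an Identifier repeats on either side
merged_raw = pd.merge(toy_grades, toy_roster, how='inner',
                      on='Identifier', validate='one_to_one')
print(list(merged_raw.columns))

# Keep the "Last, First" name from the left table and drop the roster copy
merged = (merged_raw.drop(columns=['Name_y'])
                    .rename(columns={'Name_x': 'Name'}))
print(merged)
```

An inner join keeps only identifiers present in both tables, so comparing `len(merged)` with the length of each input is a quick way to spot students who were dropped.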

In [150]:
"""
Three mistakes were made in "commit 3". They are fixed here, and the fixed code will be uploaded as "commit 3 fixed".

1- Use the pathlib library to import the quiz datasets in bulk
2- Rename and merge based on the new dataframes
3- Fix the scoring calculation, because homework, exams, and quizzes do not share a single max point value
"""


### Calculate grades, filter and group in a pandas DataFrame

The final grade is built from three categories:

- Exams
- Homework
- Quizzes

Each of these categories is assigned a weight toward the students’ final score. For your class this term, you assigned the following weights:

| Category | Weight |
|---|---|
| Exam 1 Score | 0.05 |
| Exam 2 Score | 0.10 |
| Exam 3 Score | 0.15 |
| Quiz Score | 0.30 |
| Homework Score | 0.40 |
| Total | 1.00 (100%) |
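
To make the weighting concrete, here is a small worked example. The weights are the ones listed above; the single student row is hypothetical, with each score expressed as a fraction between 0 and 1.

```python
import pandas as pd

weightings = pd.Series({
    'Exam 1 Score': 0.05,
    'Exam 2 Score': 0.10,
    'Exam 3 Score': 0.15,
    'Quiz Score': 0.30,
    'Homework Score': 0.40,
})

# Hypothetical student, not taken from the project data
student = pd.DataFrame([{
    'Exam 1 Score': 0.80,
    'Exam 2 Score': 0.70,
    'Exam 3 Score': 0.90,
    'Quiz Score': 0.60,
    'Homework Score': 1.00,
}])

# Multiplying a DataFrame by a Series aligns on column names,
# so each score is scaled by its own weight before the row is summed
final_score = (student[weightings.index] * weightings).sum(axis=1)
print(round(final_score.iloc[0], 3))
```

The sum is 0.04 + 0.07 + 0.135 + 0.18 + 0.40 = 0.825, i.e. 82.5% before any rounding.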

In [151]:
# Fix the scoring calculation as it is not one max point for homeworks, exams, or quizzes

#     Old way before fix

# hw_exam_grades['Homework'] = (hw_exam_grades['Homework 1']+
#                               hw_exam_grades['Homework 2']+
#                               hw_exam_grades['Homework 3']+
#                              hw_exam_grades['Homework 4']+
#                               hw_exam_grades['Homework 5']+
#                              hw_exam_grades['Homework 6']+
#                              hw_exam_grades['Homework 7']+
#                              hw_exam_grades['Homework 8']+
#                              hw_exam_grades['Homework 9']+
#                              hw_exam_grades['Homework 10'])/10
# hw_exam_grades['Homework'] = hw_exam_grades['Homework'].astype(float)
# hw_exam_grades['Exams'] = (hw_exam_grades['Exam 1']+hw_exam_grades['Exam 2']+hw_exam_grades['Exam 3'])/3
# hw_exam_grades['Exams'] = hw_exam_grades['Exams'].astype(float)

# hw_exam_grades = hw_exam_grades[['Identifier', 'Name', 'Homework', 'Exams']]

# ================================
# Calculating the Exam Total Score
# ================================
n_exams = 3

for n in range(1, n_exams+1):
    joined_final[f'Exam {n} Score'] = (joined_final[f'Exam {n}']/joined_final[f'Exam {n} - Max Points'])

# ================================
# Calculating the Homework Scores
# ================================
# First approach by total as below,
homework_scores = joined_final.filter(regex=r'^Homework \d\d?$', axis=1)
homework_max_points = joined_final.filter(regex=r'^Homework \d\d? - M', axis=1)
sum_homework_scores = homework_scores.sum(axis=1)
sum_homework_max_points = homework_max_points.sum(axis=1)
joined_final['Total Homework'] = (sum_homework_scores/sum_homework_max_points)
# Second approach by average as below,
hw_max_renamed = homework_max_points.set_axis(homework_scores.columns, axis=1)
#     give the max-points columns the same names as the columns in homework_scores,
#     so the division below lines up column by column
average_hw_scores = (homework_scores / hw_max_renamed).sum(axis=1)
#     divide each homework score by its own maximum points and sum the fractions
joined_final["Average Homework"] = average_hw_scores / (homework_scores.shape[1])
    # DataFrame.shape returns a tuple of (n_rows, n_columns), like a NumPy array,
    # so shape[1] is the number of assignments
joined_final['Homework Score'] = joined_final[['Average Homework','Total Homework']].max(axis=1)
    # keep the higher of the two approaches for each student

joined_final


# ================================
# Calculating the Quiz Scores
# ================================
# First approach by total as below,
Quiz_scores = joined_final.filter(regex=r'^Quiz \d\d?$', axis=1)
Quiz_max_point = pd.Series({"Quiz 1": 11, "Quiz 2": 15, "Quiz 3": 17, "Quiz 4": 14, "Quiz 5": 12})
sum_quiz_scores = Quiz_scores.sum(axis=1)
sum_quiz_max_point = Quiz_max_point.sum()
joined_final['Total quizzes'] = sum_quiz_scores/sum_quiz_max_point
# Second approach by average as below,
avg_quiz_scores = (Quiz_scores/Quiz_max_point).sum(axis=1)
joined_final['Average quizzes'] = avg_quiz_scores/Quiz_scores.shape[1]
joined_final['Quiz Score'] = joined_final[['Average quizzes','Total quizzes']].max(axis=1)
# ================================
# Calculating the letter grade
# ================================
weightings = pd.Series({
    'Exam 1 Score':0.05,
    'Exam 2 Score':0.10,
    'Exam 3 Score':0.15,
    'Quiz Score':0.30,
    'Homework Score':0.40
})
joined_final['Final score'] = (joined_final[weightings.index]*weightings).sum(axis=1)
joined_final['Ceiling score'] = np.ceil(joined_final['Final score']*100) # round up to the next whole number
def final_grade(x):
    if x>=90:
        return 'A'
    elif x>=80:
        return 'B'
    elif x>=70:
        return 'C'
    elif x>=60:
        return 'D'
    else:
        return 'F'
    
joined_final['Final Grade'] = joined_final['Ceiling score'].apply(final_grade)

# ================================
# Group by section, sort by last name then first name, and save the students' grades to files
# ================================
for section, table in joined_final.groupby('Section'):
    section_file = DATA_FOLDER/f'Section {section} Grades.csv'
    number_students = table.shape[0]
    print(f'there are {number_students} in section {section} saved to "{section_file}" file.')
    # sort_values returns a new DataFrame, so chain it into to_csv
    table.sort_values(by=['Last Name', 'First Name']).to_csv(section_file)
name = joined_final.pop('Name')
joined_final.insert(0,'Name', name)
joined_final.columns

#     Old way before fix

# joined_final = joined_final[['Identifier', 'Name', 'Homework', 'Quizzes', 'Exams']]
# total = 15+80+100
# joined_final['Final Score'] = ((joined_final['Exams']+joined_final['Quizzes']+joined_final['Homework'])/total*100).round(2)
# def final_grade(x):
#     if 100>=x>=95:
#         return 'A'
#     elif 90>=x>=80:
#         return 'B'
#     elif 80>=x>=70:
#         return 'C'
#     elif 70>=x>=60:
#         return 'D'
#     else:
#         return 'F'
    
# joined_final['Final Grade'] = joined_final['Final Score'].apply(lambda x: final_grade(x))

there are 56 in section 1 saved to "C:\Users\sahal\Workspace\Gradebook\materials-pandas-gradebook-project\data\Section 1 Grades.csv" file.
there are 51 in section 2 saved to "C:\Users\sahal\Workspace\Gradebook\materials-pandas-gradebook-project\data\Section 2 Grades.csv" file.
there are 43 in section 3 saved to "C:\Users\sahal\Workspace\Gradebook\materials-pandas-gradebook-project\data\Section 3 Grades.csv" file.


Index(['Name', 'First Name', 'Last Name', 'Identifier', 'Homework 1',
       'Homework 1 - Max Points', 'Homework 1 - Submission Time', 'Homework 2',
       'Homework 2 - Max Points', 'Homework 2 - Submission Time', 'Homework 3',
       'Homework 3 - Max Points', 'Homework 3 - Submission Time', 'Homework 4',
       'Homework 4 - Max Points', 'Homework 4 - Submission Time', 'Homework 5',
       'Homework 5 - Max Points', 'Homework 5 - Submission Time', 'Homework 6',
       'Homework 6 - Max Points', 'Homework 6 - Submission Time', 'Homework 7',
       'Homework 7 - Max Points', 'Homework 7 - Submission Time', 'Homework 8',
       'Homework 8 - Max Points', 'Homework 8 - Submission Time', 'Homework 9',
       'Homework 9 - Max Points', 'Homework 9 - Submission Time',
       'Homework 10', 'Homework 10 - Max Points',
       'Homework 10 - Submission Time', 'Exam 1', 'Exam 1 - Max Points',
       'Exam 1 - Submission Time', 'Exam 2', 'Exam 2 - Max Points',
       'Exam 2 - Submission Time'
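
The homework and quiz scores in the cell above are each computed two ways, by total points and by average of per-assignment fractions, and the higher value is kept. The sketch below shows how the two can differ; the scores and maximum points are hypothetical numbers chosen so that each approach wins for one student.

```python
import pandas as pd

# Hypothetical scores for two students on two assignments
scores = pd.DataFrame({'Homework 1': [10, 2], 'Homework 2': [40, 90]})
max_points = pd.DataFrame({'Homework 1': [10, 10], 'Homework 2': [100, 100]})

# Approach 1: total points earned divided by total points available
total = scores.sum(axis=1) / max_points.sum(axis=1)
# Approach 2: mean of the per-assignment fractions
average = (scores / max_points).mean(axis=1)
# Keep whichever approach is better for each student
best = pd.concat([total, average], axis=1).max(axis=1)

print(total.round(3).tolist())
print(average.round(3).tolist())
print(best.round(3).tolist())
```

The first student does better by average (1.0 and 0.4 average to 0.7, against 50/110 by total), while the second does better by total (92/110, against an average of 0.55), because the total approach gives more weight to assignments with more points.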

### Plotting Summary Statistics