# Professor Recommendations and the Relationship between Expected and Received Grades

Video Link:

# Permissions

Place an `X` in the appropriate bracket below to specify if you would like your group's project to be made available to the public. (Note that student names will be included, but PIDs will be scraped from any groups who include their PIDs.)

* [ X ] YES - make available
* [  ] NO - keep private

# Names

- Pallavi Prabhu
- Hana Ton-Nu
- Justin Gamm
- Raquel Sanchez
- Ria Singh

# Abstract

Our team was interested in learning whether a student's opinion of a professor would result in a higher expectation of their class grade. To answer this question, we looked into our own school's evaluation system, CAPEs (Course and Professor Evaluations). The data was relatively clean; however, we wished to include only necessary information. To do this, we dropped all rows with NaN values, created functions to convert the necessary information into floats, and created a new column that would help with our analysis later. We also dropped the entire Instructor column, both out of respect for the professors' privacy and because we deemed it unnecessary for our final analysis.

We then began our Data Analysis and produced some interesting results. We found that most students were fairly accurate in estimating their final grades. We also found that most evaluations, whether it be for the class or professor, are fairly positive. We decided to look at a small subset in order to see if there was any correlation between expected grades and professor recommendation. From that we weren't able to get any viable information. We also noticed a large positive jump in final grades for the year 2020, likely due to COVID and more lax class setups. 

After running all of our analyses, we saw a slight increase in grades with more positive reviews of a professor; however, the increase is not large. We also noted that simply being in a STEM class has a larger effect on a student's expected and received grades than their recommendation of the professor.

We concluded that our hypothesis was incorrect, but not in the way we expected. We found that the more positive ratings a professor had, the higher the average _received_ grade. This does not coincide with our hypothesis, which stated that the expected grade would be higher. However, we do have evidence for a higher grade expected _and_ received from professors with positive ratings.

# Research Question

Do students at UCSD expect higher grades than they receive if they recommend the professor, as measured through CAPEs? Namely, do students that recommend a professor generally expect to do better (expected grade) than they actually did (grade received) and vice versa?

## Background and Prior Work

CAPEs stands for “Course And Professor Evaluations”; the evaluations are offered to students at the end of every quarter so they can give feedback on their courses and the professors who taught them. As stated on the website, CAPEs<a name="cite_ref-1"></a>[<sup>1</sup>](#cite_note-1) is:

    “a student-run organization that administers a standardized evaluation of UCSD's undergraduate courses and professors. Student feedback gauges the caliber of both the University's curriculum and its faculty. We provide students with the opinions of their peers on any particular course or professor.” (UCSD)
    
After students submit their CAPEs, the data is aggregated for each quarter of evaluations into features such as the percentage of students who recommended the class, the percentage of students who recommended the instructor, the number of hours per week spent on the class, the average grade expected, and the average grade received (which is calculated from the final submitted grades). CAPEs play an important role in helping students choose their professors and even the courses they enroll in, as they generally offer more statistics and more comprehensive data than other rating sites.

In addition, other rating sites often show signs of significant skew, with a majority of ratings falling toward either extreme, since people are more likely to rate something they have a strong reaction to. This creates a bimodal distribution that lacks data between the two strong opinions. CAPEs, on the other hand, while not entirely protected, likely have some resistance to such skew, since students are often incentivized to complete them for extra credit, which gives equal opportunity for various kinds of responses.

In winter of 2021, a COGS108 group also asked a question regarding CAPEs: “How has remote learning during the COVID-19 pandemic affected UCSD students’ grades and learning experience?” (Group084_Wi21<a name="cite_ref-2"></a>[<sup>2</sup>](#cite_note-2)). While the general topic is the same, they focus on the change in grades and learning as a result of the pandemic, whereas we want to explore the potential association between how much students like their professors (as measured by the professor recommendation rate) and their potential overestimation or underestimation of their performance in class. While it seems obvious that students who do better in a course likely recommend a professor at a higher rate than those who don’t, we are more interested in their confidence and expectation.

Furthermore, prior studies of student evaluations of professors have found evidence of gender bias, with female professors tending to rank lower than their male counterparts<a name="cite_ref-3"></a>[<sup>3</sup>](#cite_note-3). While this is not directly related to our question, the general idea is the same, and it piqued our curiosity about other features we could include. Gender could be an interesting feature to consider in our analysis: how are a professor's gender, their likeability, and the gap between students' expected and received grades related? Do students overestimate their ability and expect a higher grade when the professor is female? How does the gender of a professor play into their recommendation in relation to the grades received in the class? Another feature of interest is how the difference between expected and received grades varies by department or type of class (for example, STEM vs. liberal arts). In our experience, STEM classes are widely perceived as more difficult than liberal arts classes. Thus, it may be interesting to explore how this perceived difficulty plays into the relationship between the expected-received grade difference and the overall likeability/recommendation of a professor.


1. <a name="cite_note-1"></a> [^](#cite_ref-1) University of California, San Diego. (n.d.). Course And Professor Evaluations. COURSE AND PROFESSOR EVALUATIONS (CAPE). Retrieved November 1, 2023, from https://cape.ucsd.edu/
2. <a name="cite_note-2"></a> [^](#cite_ref-2) FinalProject_group084. (2021, March 29). GitHub. Retrieved November 1, 2023, from https://github.com/COGS108/FinalProjects-Wi21/blob/main/FinalProject_group084.ipynb
3. <a name="cite_note-3"></a> [^](#cite_ref-3) Mitchell, K., & Martin, J. (2018). Gender Bias in Student Evaluations. PS: Political Science & Politics, 51(3), 648-652. doi:10.1017/S104909651800001X from https://www.cambridge.org/core/journals/ps-political-science-and-politics/article/gender-bias-in-student-evaluations/1224BE475C0AE75A2C2D8553210C4E27


# Hypothesis


After thinking back on our experiences in classes at UCSD and talking to our peers, we hypothesize that students in classes of well-liked professors will overestimate their grade and those in classes of professors who are not as well liked will underestimate their grade.

We expect the correlation between a professor's recommendation percentage and the difference between the expected and received grades (expected minus received) to be positive. Additionally, we expect the difference to become negative for professors who have low recommendation percentages.
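
As a rough sketch of how this hypothesis could be checked, the snippet below computes a Pearson correlation between the recommendation percentage and an overestimate column (expected minus received). The rows are hypothetical toy values, not CAPEs records, and the `Overestimate` column name is our own illustration. Note that the `Difference` column created during cleaning is received minus expected, so its sign is flipped relative to this sketch.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy rows for illustration only; these are NOT values from the CAPEs dataset.
toy = pd.DataFrame({
    "Percentage Recommended Professor": [55.0, 70.0, 85.0, 95.0],
    "Average Grade Expected": [3.0, 3.2, 3.5, 3.8],
    "Average Grade Received": [3.1, 3.2, 3.3, 3.4],
})

# Overestimate = expected minus received: positive when students expected
# more than they earned, negative when they underestimated.
toy["Overestimate"] = toy["Average Grade Expected"] - toy["Average Grade Received"]

# Pearson correlation between recommendation rate and overestimation.
r, p_value = stats.pearsonr(toy["Percentage Recommended Professor"], toy["Overestimate"])
print(f"Pearson r = {r:.3f}, p = {p_value:.3f}")
```

Under the hypothesis, `r` would be positive and the overestimate would dip below zero at low recommendation percentages.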

# Data

## Data overview

- Dataset #1
  - Dataset Name: CAPES data from UCSD
  - Link to the dataset: https://www.kaggle.com/datasets/sanbornpnguyen/ucsdcapes/
  - Number of observations: 63363
  - Number of variables: 11


This dataset includes CAPEs data from UCSD. The important variables that drive our research are the professor recommendation percentage, the expected grade, and the received grade. The datatypes in this dataset are object, int, and float. Our metric is the difference between the received grade and the expected grade, which we store in a new column. These variables can act as a proxy for teaching effectiveness. To clean the data, we dropped unnecessary columns, standardized the grade and percentage columns, and dropped rows with missing values.

### Cleaning Data Process
  - The dataset is relatively clean except for the 'Average Grade Received' column, which contains some NaN values. We determined that the data was in a usable format because each measured variable had its own column, each observation of the measured variable was in a different row, and the column headers are descriptive of their variables.
  - We first dropped the evaluation URL column, since we did not deem it relevant to our question, and the Instructor column, for privacy.
  - We then checked for and dropped all rows with null information, since we felt rows with missing values would not be helpful and would make extracting data more difficult.
  - We created standardization functions: one converts the values in the 'Average Grade Expected' and 'Average Grade Received' columns to floats, and the other removes the '%' symbol from the two percentage-recommended columns and converts them to floats.
  - A new column called 'Difference' was created to store the difference between the 'Average Grade Received' and the 'Average Grade Expected'. This will help later to determine whether students over- or underestimated their grades.
  - We created columns for both the year and the term, with integers representing the terms (1 = Winter, 2 = Spring, 3 = S1, 4 = S2, 5 = S3, 6 = Fall, 7 = SU) and the full four-digit year.
  - Lastly, we also added the categories of 'liberal' and 'stem' for the top 100 course codes for which information could be found, in order to use these two groups in our analysis.
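
The steps above can also be sketched in vectorized form. This is only an illustrative alternative to the cell-by-cell functions in this notebook: the toy rows are hypothetical, and the raw formats (grades such as `B+ (3.67)`, percentages such as `86.7%`, quarters such as `SP23`) are assumptions inferred from how the cleaning functions parse their inputs.

```python
import pandas as pd

# Hypothetical raw rows in the format the cleaning functions appear to expect;
# these are illustrative values, not records from the dataset.
raw = pd.DataFrame({
    "Quarter": ["SP23", "FA22", "WI21"],
    "Percentage Recommended Professor": ["86.7%", "100.0%", "72.5%"],
    "Average Grade Expected": ["B+ (3.67)", "A- (3.80)", "B (3.10)"],
    "Average Grade Received": ["B (3.20)", None, "B (3.25)"],
})

# Drop rows with any missing value.
clean = raw.dropna().copy()

# Keep only the grade point inside the parentheses and convert it to float.
for col in ["Average Grade Expected", "Average Grade Received"]:
    clean[col] = clean[col].str.extract(r"\(([\d.]+)\)", expand=False).astype(float)

# Strip the percent sign and convert to float.
clean["Percentage Recommended Professor"] = (
    clean["Percentage Recommended Professor"].str.rstrip("%").astype(float)
)

# Received minus expected: negative means students overestimated their grade.
clean["Difference"] = clean["Average Grade Received"] - clean["Average Grade Expected"]

# Split a quarter code such as "SP23" into a term number and a four-digit year.
TERMS = {"WI": 1, "SP": 2, "S1": 3, "S2": 4, "S3": 5, "FA": 6, "SU": 7}
clean["Term"] = clean["Quarter"].str[:2].map(TERMS)
clean["Year"] = 2000 + clean["Quarter"].str[2:].astype(int)

print(clean)
```

Working on whole columns avoids the per-value type checks, at the cost of assuming every remaining value follows the same text format.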

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

import patsy
import statsmodels.api as sm
import scipy.stats as stats
from scipy.stats import ttest_ind, chisquare, normaltest
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, classification_report
from sklearn.svm import SVC 
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, r2_score

In [None]:
capes = pd.read_csv("capes_data.csv")

In [None]:
#determine number of variables and observations
capes.shape

In [None]:
capes.head()

In [None]:
capes.dtypes

In [None]:
#Dropping columns we do not need for the analysis
capes = capes.drop(columns = ["Evalulation URL", "Instructor"])

In [None]:
#checking for NA 
capes.isna().sum()

In [None]:
#Investigating if NA in grades expected and received are from a particular quarter (covid?)
capes_na = capes[capes.isnull().any(axis=1)]
capes_na.nunique()

In [None]:
capes_na.head()

In [None]:
capes_na["Quarter"].unique()

In [None]:
#dropping all rows with NA as remaining dataset is still large enough and the NA rows are random
capes = capes.dropna()
capes.shape

In [None]:
#Function to convert Average Grade Expected and Average Grade Received into floats holding just the
#grade point number
def grade_standardize(grade):
    
    #skip values that have already been standardized
    if isinstance(grade, float):
        return grade
    
    #retain only part after open parenthesis
    grade = grade.split("(")[-1]
    
    #remove close parenthesis
    grade = grade.replace(")", "")
    
    #change to float
    grade = float(grade)
    
    return grade
   

In [None]:
capes["Average Grade Expected"] = capes["Average Grade Expected"].apply(grade_standardize)
capes["Average Grade Received"] = capes["Average Grade Received"].apply(grade_standardize)

In [None]:
#function to remove % symbol from Percentage Recommended Class and Percentage Recommended Professor and make
#into float

def percent_standardize(percent):
    #skip values that have already been standardized
    if isinstance(percent, float):
        return percent
    
    #remove % sign
    percent = percent.replace("%", "")
    
    #convert to float
    percent = float(percent)
    
    return percent
      

In [None]:
capes["Percentage Recommended Class"] = capes["Percentage Recommended Class"].apply(percent_standardize)
capes["Percentage Recommended Professor"] = capes["Percentage Recommended Professor"].apply(percent_standardize)

In [None]:
capes.dtypes

In [None]:
#create new column called Difference: the average grade received minus the average grade expected
#negative numbers in Difference signify students overestimated their grade, positive numbers signify students
#underestimated their grade
capes = capes.assign(Difference = capes["Average Grade Received"] - capes["Average Grade Expected"])

In [None]:
#create columns for year and terms where each year will be an int (eg. 07, 08, 23, etc) and terms will be
#denoted by numbers(see code)
#this code is taken from https://github.com/COGS108/FinalProjects-Wi21/blob/main/FinalProject_group084.ipynb

In [None]:
def extract_term(term):
    TERMS = { "WI": 1, "SP": 2, "S1": 3, "S2": 4, "S3": 5, "FA": 6, "SU" : 7}
    
    term = term[:2]
    term = TERMS[term]
    
    return term

In [None]:
def extract_year(qtr):
    
    year = qtr[2:]
    year = int("20" + year)
    
    return year




In [None]:
capes["Term"] = capes["Quarter"].apply(extract_term)
capes["Year"] = capes["Quarter"].apply(extract_year)

In [None]:
capes.head()

In [None]:
capes.dtypes

In [None]:
capes.describe()

In [None]:
capes.isna().any()

In [None]:
#getting the types of courses in the dataset for potential grouping based on STEM vs liberal
#value_counts sorts in descending order, so the first 100 entries are the top 100 codes
capes['Course'].apply(lambda x: x.split(' ')[0]).value_counts()

In [None]:
#categories given for the top 100 course codes
#note some classes can be considered both since they are multidisciplinary like COGS
#classes offered as both BS and CA: COGS, ESYS, HDS, SYN, ICAM
#unsure due to no information found COSF, COGN, CONT, SOCC
#every code is followed by a comma: adjacent string literals without one are silently joined into one string
stem_courses = ['MATH', 'CSE', 'CHEM', 'PHYS', 'MAE', 'ECE', 'BILD', 'SE', 'BENG', 'BIPN', 'BIEB', 'NANO',
                'DSC', 'ENG', 'BISP', 'FPMU']
liberal_courses = ['ECON', 'POLI', 'MGT', 'PSYC', 'VIS', 'SOCI',
                   'COGS', 'COMM', 'MUS', 'PHIL', 'SIO', 'TMDV', 'JAPN', 'CHIN', 'EDS', 'MMW', 'ETHN', 'USP',
                   'LIGN', 'TDGE', 'ANTH', 'HIUS', 'AWP', 'LTWL', 'DOC', 'FMPH', 'HILD', 'HDP', 'INTL',
                   'LSTP', 'HIEA', 'GLBH', 'CGS', 'WCWP', 'TDHT', 'HILA', 'ESYS', 'ANBI', 'HINE', 'COCU',
                   'LTKO', 'LIHL', 'RELI', 'COHI', 'ENVR', 'LTCS', 'HDS', 'ANAR', 'LTLA', 'TDTR', 'HITO',
                   'LATI', 'TDDR', 'HIAF', 'LTEU', 'TWS', 'DSGN', 'LTRU', 'TDDE', 'ELWR', 'SOCB', 'LAWS',
                   'LTCH', 'LTIT']

In [None]:
#defining a function to add this label based on the class code
def course_cat(code):
    if code in stem_courses:
        return 'stem'
    elif code in liberal_courses:
        return 'liberal'
    else:
        #course code is not in either list
        return None
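
One way the labeling function might be applied is sketched below. The shortened code sets, the `categorize` helper, and the `Department` and `Category` column names are hypothetical stand-ins for illustration. The course strings are toy values, and the department code is assumed to be the text before the first space, mirroring the split used when counting course codes.

```python
import pandas as pd

# Hypothetical, shortened department sets for illustration only.
STEM_CODES = {"MATH", "CSE", "CHEM"}
LIBERAL_CODES = {"ECON", "POLI", "PHIL"}

def categorize(code):
    """Return 'stem', 'liberal', or None for an uncategorized department code."""
    if code in STEM_CODES:
        return "stem"
    if code in LIBERAL_CODES:
        return "liberal"
    return None

# Toy course strings, not rows from the dataset.
toy = pd.DataFrame({"Course": ["MATH 20A", "PHIL 10", "XYZ 1"]})

# The department code is the text before the first space.
toy["Department"] = toy["Course"].str.split(" ").str[0]

# Label each row; departments outside both sets are left missing.
toy["Category"] = toy["Department"].map(categorize)
print(toy[["Department", "Category"]])
```

Rows left without a category can then be excluded with `dropna(subset=["Category"])` before comparing the two groups.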

# Results

## Exploratory Data Analysis

Carry out whatever EDA you need to for your project.  Because every project will be different we can't really give you much of a template at this point. But please make sure you describe the what and why in text here as well as providing interpretation of results and context.

## First Analysis You Did - Give it a better title

Some more words and stuff.  Remember notebooks work best if you interleave the code that generates a result with properly annotated figures and text that puts these results into context.

In [None]:
## YOUR CODE HERE
## FEEL FREE TO ADD MULTIPLE CELLS PER SECTION

# Ethics & Privacy

- Thoughtful discussion of ethical concerns included
- Ethical concerns consider the whole data science process (question asked, data collected, data being used, the bias in data, analysis, post-analysis, etc.)
- How your group handled bias/ethical concerns clearly described

Acknowledge and address any ethics & privacy related issues of your question(s), proposed dataset(s), and/or analyses. Use the information provided in lecture to guide your group discussion and thinking. If you need further guidance, check out [Deon's Ethics Checklist](http://deon.drivendata.org/#data-science-ethics-checklist). In particular:

- Are there any biases/privacy/terms of use issues with the data you proposed?
- Are there potential biases in your dataset(s), in terms of who it composes, and how it was collected, that may be problematic in terms of it allowing for equitable analysis? (For example, does your data exclude particular populations, or is it likely to reflect particular human biases in a way that could be a problem?)
- How will you set out to detect these specific biases before, during, and after/when communicating your analysis?
- Are there any other issues related to your topic area, data, and/or analyses that are potentially problematic in terms of data privacy and equitable impact?
- How will you handle issues you identified?

# Discussion and Conclusion

Wrap it all up here.  Somewhere between 3 and 10 paragraphs roughly.  A good time to refer back to your Background section and review how this work extended the previous stuff. 


# Team Contributions

Specify who did what.  This should be pretty granular, perhaps bullet points, no more than a few sentences per person.