Is there a gender difference in the tags awarded by students? Make sure to test each of the 20 tags for a potential gender difference and report which of them exhibit a statistically significant difference. Comment on the 3 most gendered (lowest p-value) and the 3 least gendered (highest p-value) tags.

We need the rmpCapstoneNum and rmpCapstoneTags datasets to answer this question.

Column 1: “Tough grader”
Column 2: “Good feedback”
Column 3: “Respected”
Column 4: “Lots to read”
Column 5: “Participation matters”
Column 6: “Don’t skip class or you will not pass”
Column 7: “Lots of homework”
Column 8: “Inspirational”
Column 9: “Pop quizzes!”
Column 10: “Accessible”
Column 11: “So many papers”
Column 12: “Clear grading”
Column 13: “Hilarious”
Column 14: “Test heavy”
Column 15: “Graded by few things”
Column 16: “Amazing lectures”
Column 17: “Caring”
Column 18: “Extra credit”
Column 19: “Group projects”
Column 20: “Lecture heavy”

In [127]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [128]:
# Seed value for random number generators to obtain reproducible results
RANDOM_SEED = 10676128

# Apply the random seed to numpy.
np.random.seed(RANDOM_SEED)

In [129]:
'''
Columns are:
1. Average Rating (the arithmetic mean of all individual quality ratings of this professor)
2. Average Difficulty (the arithmetic mean of all individual difficulty ratings of this professor)
3. Number of ratings (simply the total number of ratings these averages are based on)
4. Received a “pepper”? (Boolean - was this professor judged as “hot” by the students?)
5. The proportion of students that said they would take the class again
6. The number of ratings coming from online classes
7. Male gender (Boolean – 1: determined with high confidence that professor is male)
8. Female (Boolean – 1: determined with high confidence that professor is female)
'''
df_capstone = pd.read_csv('./rmpCapstoneNum.csv', header=None)
df_capstone.columns = ['Average Rating', 'Average Difficulty', 'Number of ratings', 'Received a pepper', 
                       'Proportion of students that said they would take the class again', 
                       'Number of ratings coming from online classes', 'Male Professor', 'Female Professor']


In [130]:
df_capstone_tags = pd.read_csv('./rmpCapstoneTags.csv', header=None)
df_capstone_tags.columns = ['Tough grader', 'Good feedback', 'Respected', 'Lots to read', 'Participation matters', 
                            'Don’t skip class or you will not pass', 'Lots of homework', 'Inspirational', 'Pop quizzes!', 
                            'Accessible', 'So many papers', 'Clear grading', 'Hilarious', 'Test heavy', 'Graded by few things', 
                            'Amazing lectures', 'Caring', 'Extra credit', 'Group projects', 'Lecture heavy']

In [131]:
# Merge the two dataframes
df_merged = pd.concat([df_capstone[['Average Rating', 'Number of ratings', 'Received a pepper', 'Male Professor', 'Female Professor']], df_capstone_tags], axis=1)

df_merged.head()

Unnamed: 0,Average Rating,Number of ratings,Received a pepper,Male Professor,Female Professor,Tough grader,Good feedback,Respected,Lots to read,Participation matters,...,So many papers,Clear grading,Hilarious,Test heavy,Graded by few things,Amazing lectures,Caring,Extra credit,Group projects,Lecture heavy
0,5.0,2.0,0.0,0,1,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
1,,,,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,3.2,4.0,0.0,1,0,2,1,2,1,0,...,0,0,0,0,0,0,0,0,0,0
3,3.6,10.0,1.0,0,0,6,3,0,0,2,...,0,2,1,0,0,0,0,0,1,0
4,1.0,1.0,0.0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [132]:
df_merged.shape

(89893, 25)

In [133]:
# Drop rows with average rating of NaN
df_merged = df_merged.dropna(subset=['Average Rating'])

In [134]:
df_merged.shape

(70004, 25)

In [135]:
# Subset to professors with 10 or more ratings and exactly one gender flag set
# (drop rows flagged as both male and female, or as neither)
df_merged_min_10 = df_merged[
    (df_merged['Number of ratings'] >= 10)
    & ~((df_merged['Male Professor'] == 1) & (df_merged['Female Professor'] == 1))
    & ~((df_merged['Male Professor'] == 0) & (df_merged['Female Professor'] == 0))
].copy()  # copy so later column assignments act on an independent frame
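
The filter above keeps a professor only when the rating count is at least 10 and exactly one of the two gender flags is set. A minimal sketch of the same logic on a toy frame, assuming the flags only ever hold 0 or 1 (the frame `toy` and its values are invented for illustration and are not part of the dataset):

```python
import pandas as pd

# Toy stand-in for df_merged; the values are invented for illustration only.
toy = pd.DataFrame({
    'Number of ratings': [12, 25, 3, 40, 10],
    'Male Professor':    [1, 1, 0, 0, 1],
    'Female Professor':  [0, 1, 1, 0, 0],
})

# With 0/1 flags, "exactly one flag set" is the same as "the flags differ".
one_gender = toy['Male Professor'] != toy['Female Professor']
enough_ratings = toy['Number of ratings'] >= 10

# .copy() returns an independent frame, so later column assignments
# do not write into a slice of the original.
toy_min_10 = toy[enough_ratings & one_gender].copy()
print(toy_min_10.index.tolist())
```

Only rows 0 and 4 survive: row 1 has both flags set, row 2 has too few ratings, and row 3 has neither flag set.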


In [136]:
df_merged_min_10.isna().sum()

Average Rating                           0
Number of ratings                        0
Received a pepper                        0
Male Professor                           0
Female Professor                         0
Tough grader                             0
Good feedback                            0
Respected                                0
Lots to read                             0
Participation matters                    0
Don’t skip class or you will not pass    0
Lots of homework                         0
Inspirational                            0
Pop quizzes!                             0
Accessible                               0
So many papers                           0
Clear grading                            0
Hilarious                                0
Test heavy                               0
Graded by few things                     0
Amazing lectures                         0
Caring                                   0
Extra credit                             0
Group proje

In [137]:
# Convert the tag columns to float by rebinding the frame with an explicit cast
# (assigning floats into int64 columns through .iloc raises a dtype-incompatibility warning)
tag_columns = df_merged_min_10.columns[5:]
df_merged_min_10 = df_merged_min_10.astype({col: float for col in tag_columns})

In [138]:
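# Replace tag counts with each tag's share of the tags awarded to that professor.
# Note: the denominator sums iloc[:, 3:], which also includes the two gender flag columns
# (they add 1 per row after the filter above); use iloc[:, 5:] to divide by the tag total alone.
df_merged_min_10.iloc[:, 5:] = df_merged_min_10.iloc[:, 5:].div(df_merged_min_10.iloc[:, 3:].sum(axis=1), axis=0)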

df_merged_min_10.head()

Unnamed: 0,Average Rating,Number of ratings,Received a pepper,Male Professor,Female Professor,Tough grader,Good feedback,Respected,Lots to read,Participation matters,...,So many papers,Clear grading,Hilarious,Test heavy,Graded by few things,Amazing lectures,Caring,Extra credit,Group projects,Lecture heavy
5,3.5,22.0,0.0,1,0,0.148148,0.240741,0.018519,0.055556,0.037037,...,0.0,0.12963,0.055556,0.0,0.0,0.0,0.055556,0.0,0.018519,0.055556
21,2.6,10.0,0.0,1,0,0.2,0.15,0.0,0.0,0.1,...,0.0,0.0,0.0,0.0,0.0,0.05,0.05,0.0,0.25,0.05
25,4.3,16.0,1.0,0,1,0.139535,0.116279,0.093023,0.023256,0.023256,...,0.0,0.0,0.023256,0.0,0.0,0.046512,0.232558,0.0,0.0,0.023256
39,3.5,20.0,1.0,1,0,0.0,0.03125,0.0625,0.0625,0.0625,...,0.03125,0.0,0.0,0.0,0.0,0.0,0.15625,0.1875,0.0,0.125
40,1.8,15.0,0.0,0,1,0.137931,0.0,0.0,0.206897,0.034483,...,0.0,0.034483,0.0,0.068966,0.034483,0.068966,0.034483,0.0,0.0,0.0
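
The normalization step turns raw tag counts into per-professor shares, so professors with many ratings do not dominate the comparison. A minimal sketch of that idea, dividing by the sum of the tag columns only (the cell above sums from column index 3, which also picks up the two gender flags, so its numbers will differ slightly from this variant). The frame `toy` and the list `tag_cols` are invented for illustration:

```python
import pandas as pd

# Toy stand-in with two gender flags and three tag-count columns (invented values).
toy = pd.DataFrame({
    'Male Professor':   [1, 0],
    'Female Professor': [0, 1],
    'Tough grader':     [2, 0],
    'Caring':           [1, 3],
    'Hilarious':        [1, 1],
})
tag_cols = ['Tough grader', 'Caring', 'Hilarious']

# Row-wise total over the tag columns only.
tag_totals = toy[tag_cols].sum(axis=1)

# Professors with no tags at all get NaN shares instead of a division by zero.
tag_totals = tag_totals.where(tag_totals > 0)

# Divide every tag count by its row total (axis=0 aligns on the row index).
shares = toy[tag_cols].div(tag_totals, axis=0)
print(shares)
```

Each row of `shares` sums to 1 whenever the professor received at least one tag.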


In [139]:
df_merged.isna().sum()

Average Rating                           0
Number of ratings                        0
Received a pepper                        0
Male Professor                           0
Female Professor                         0
Tough grader                             0
Good feedback                            0
Respected                                0
Lots to read                             0
Participation matters                    0
Don’t skip class or you will not pass    0
Lots of homework                         0
Inspirational                            0
Pop quizzes!                             0
Accessible                               0
So many papers                           0
Clear grading                            0
Hilarious                                0
Test heavy                               0
Graded by few things                     0
Amazing lectures                         0
Caring                                   0
Extra credit                             0
Group proje

In [140]:
from scipy.stats import mannwhitneyu, ks_2samp

# For each tag, test for a gender difference in the raw tag counts of all rated professors (df_merged)
# using the Mann-Whitney U test and the two-sample KS test, and collect the p-values in a dataframe
# Initialize an empty list to store the results
results = []

# Iterate over each tag column
for tag in df_merged.columns[5:]:
    male_values = df_merged[df_merged['Male Professor'] == 1][tag].dropna()
    female_values = df_merged[df_merged['Female Professor'] == 1][tag].dropna()
    
    # Perform Mann-Whitney U test
    u_stat, p_value_u = mannwhitneyu(male_values, female_values, alternative='two-sided')
    
    # Perform KS test
    ks_stat, p_value_ks = ks_2samp(male_values, female_values)
    
    # Append the results to the list
    results.append({'Tag': tag, 'Mann-Whitney U p-value': p_value_u, 'KS test p-value': p_value_ks})

# Convert the results list to a DataFrame
p_values_df = pd.DataFrame(results)

p_values_df


Unnamed: 0,Tag,Mann-Whitney U p-value,KS test p-value
0,Tough grader,0.1968379,0.6182416
1,Good feedback,3.832195e-08,4.489109e-09
2,Respected,8.910723e-52,5.323298e-38
3,Lots to read,0.03468374,0.1127673
4,Participation matters,3.519467e-18,2.814118e-18
5,Don’t skip class or you will not pass,0.8496794,0.8699988
6,Lots of homework,7.446895e-09,1.581436e-07
7,Inspirational,1.436928e-12,1.000357e-07
8,Pop quizzes!,0.0008633165,0.3583098
9,Accessible,4.946528e-10,8.504067e-05
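
The per-tag loop is repeated several times in this notebook with a different frame each time. One way to avoid the repetition is a small helper; the sketch below is an assumption about how that could look, not the code that produced the outputs shown here. The function name `compare_groups` and the synthetic frame `toy` are hypothetical:

```python
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu, ks_2samp

def compare_groups(df, value_cols, male_col='Male Professor', female_col='Female Professor'):
    """Run a two-sided Mann-Whitney U test and a two-sample KS test per column."""
    male = df[df[male_col] == 1]
    female = df[df[female_col] == 1]
    rows = []
    for col in value_cols:
        m = male[col].dropna()
        f = female[col].dropna()
        _, p_u = mannwhitneyu(m, f, alternative='two-sided')
        _, p_ks = ks_2samp(m, f)
        rows.append({'Tag': col, 'Mann-Whitney U p-value': p_u, 'KS test p-value': p_ks})
    return pd.DataFrame(rows)

# Synthetic data: one column differs strongly between groups, one does not.
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    'Male Professor':   np.repeat([1, 0], n),
    'Female Professor': np.repeat([0, 1], n),
    'Shifted tag':      np.concatenate([rng.normal(0, 1, n), rng.normal(2, 1, n)]),
    'Unshifted tag':    rng.normal(0, 1, 2 * n),
})
out = compare_groups(toy, ['Shifted tag', 'Unshifted tag'])
print(out)
```

With such a helper, each later cell would reduce to a call like `compare_groups(df_merged_min_10, df_merged_min_10.columns[5:])`.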


In [141]:
# sort the dataframe by the Mann-Whitney U p-value
p_values_df.sort_values(by='Mann-Whitney U p-value', ascending=True, inplace=True)

# Keep only tags that are significant at alpha = 0.005 on both tests
significant_results = p_values_df[(p_values_df['Mann-Whitney U p-value'] < 0.005) & (p_values_df['KS test p-value'] < 0.005)]

# Most gendered: the 3 significant tags with the lowest KS test p-values
significant_results_smallest = significant_results.nsmallest(3, 'KS test p-value')

# Least gendered: the 3 tags (out of all 20) with the highest Mann-Whitney U p-values
significant_results_biggest = p_values_df.nlargest(3, 'Mann-Whitney U p-value')

print(significant_results)
print(significant_results_smallest)
print(significant_results_biggest)


                      Tag  Mann-Whitney U p-value  KS test p-value
12              Hilarious           5.684646e-251    3.536302e-185
15       Amazing lectures            2.057400e-69     2.380596e-49
19          Lecture heavy            6.855117e-52     3.978301e-35
2               Respected            8.910723e-52     5.323298e-38
14   Graded by few things            6.880139e-38     1.476642e-13
18         Group projects            1.056607e-20     2.039988e-09
4   Participation matters            3.519467e-18     2.814118e-18
13             Test heavy            1.165288e-15     7.524322e-05
10         So many papers            2.777445e-13     1.045154e-04
16                 Caring            5.350485e-13     1.578991e-12
7           Inspirational            1.436928e-12     1.000357e-07
9              Accessible            4.946528e-10     8.504067e-05
6        Lots of homework            7.446895e-09     1.581436e-07
17           Extra credit            1.104491e-08     1.953956
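
The selection cell ranks the most gendered tags by the KS p-value but the least gendered tags by the Mann-Whitney U p-value. A minimal sketch of the same selection using a single column for both rankings, which keeps the two lists directly comparable. The frame `toy_p`, its tag names, and its p-values are invented for illustration:

```python
import pandas as pd

# Invented p-values for seven hypothetical tags.
toy_p = pd.DataFrame({
    'Tag': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
    'Mann-Whitney U p-value': [1e-20, 0.8, 3e-4, 0.04, 1e-9, 0.6, 0.2],
})
alpha = 0.005  # same threshold as in the cell above

# Tags that pass the significance threshold.
significant = toy_p[toy_p['Mann-Whitney U p-value'] < alpha]

# Most gendered: lowest p-values. Least gendered: highest p-values.
most_gendered = toy_p.nsmallest(3, 'Mann-Whitney U p-value')
least_gendered = toy_p.nlargest(3, 'Mann-Whitney U p-value')

print(significant['Tag'].tolist())
print(most_gendered['Tag'].tolist())
print(least_gendered['Tag'].tolist())
```

Both `nsmallest` and `nlargest` return rows already ordered by the ranking column, so no separate sort is needed.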

In [142]:
from scipy.stats import mannwhitneyu, ks_2samp

# For each tag, test for a gender difference in the normalized tag values of professors with at least
# 10 ratings (df_merged_min_10) using the Mann-Whitney U test and the two-sample KS test
# Initialize an empty list to store the results
results = []

# Iterate over each tag column
for tag in df_merged_min_10.columns[5:]:
    male_values = df_merged_min_10[df_merged_min_10['Male Professor'] == 1][tag].dropna()
    female_values = df_merged_min_10[df_merged_min_10['Female Professor'] == 1][tag].dropna()
    
    # Perform Mann-Whitney U test
    u_stat, p_value_u = mannwhitneyu(male_values, female_values, alternative='two-sided')
    
    # Perform KS test
    ks_stat, p_value_ks = ks_2samp(male_values, female_values)
    
    # Append the results to the list
    results.append({'Tag': tag, 'Mann-Whitney U p-value': p_value_u, 'KS test p-value': p_value_ks})

# Convert the results list to a DataFrame
p_values_df = pd.DataFrame(results)

p_values_df


Unnamed: 0,Tag,Mann-Whitney U p-value,KS test p-value
0,Tough grader,0.02591416,0.03058014
1,Good feedback,6.137211e-09,4.542519e-07
2,Respected,2.4779570000000003e-17,2.501638e-13
3,Lots to read,2.312016e-06,1.551928e-07
4,Participation matters,2.130257e-15,3.382434e-13
5,Don’t skip class or you will not pass,0.001599722,0.0198554
6,Lots of homework,7.013001e-05,9.114387e-06
7,Inspirational,1.113329e-05,0.0005313693
8,Pop quizzes!,0.7077547,0.9999723
9,Accessible,0.0257937,0.05498972


In [143]:
# sort the dataframe by the Mann-Whitney U p-value
p_values_df.sort_values(by='Mann-Whitney U p-value', ascending=True, inplace=True)

# Keep only tags that are significant at alpha = 0.005 on both tests
significant_results = p_values_df[(p_values_df['Mann-Whitney U p-value'] < 0.005) & (p_values_df['KS test p-value'] < 0.005)]

# Most gendered: the 3 significant tags with the lowest KS test p-values
significant_results_smallest = significant_results.nsmallest(3, 'KS test p-value')

# Least gendered: the 3 tags (out of all 20) with the highest Mann-Whitney U p-values
significant_results_biggest = p_values_df.nlargest(3, 'Mann-Whitney U p-value')

print(significant_results)
print(significant_results_smallest)
print(significant_results_biggest)


                      Tag  Mann-Whitney U p-value  KS test p-value
12              Hilarious            4.785031e-82     8.393299e-64
15       Amazing lectures            1.037135e-18     2.849352e-13
16                 Caring            7.703272e-18     1.528908e-14
2               Respected            2.477957e-17     2.501638e-13
4   Participation matters            2.130257e-15     3.382434e-13
18         Group projects            1.178424e-12     1.745627e-09
17           Extra credit            4.133627e-12     1.364045e-12
14   Graded by few things            9.021652e-12     2.114797e-08
1           Good feedback            6.137211e-09     4.542519e-07
10         So many papers            9.521631e-09     1.850942e-05
3            Lots to read            2.312016e-06     1.551928e-07
13             Test heavy            2.860739e-06     6.095948e-05
19          Lecture heavy            3.271754e-06     5.415648e-05
7           Inspirational            1.113329e-05     5.313693

In [144]:
# Repeat the same for those who did not receive a pepper and with 19 or more ratings
df_merged_min_19_no_pepper = df_merged_min_10[(df_merged_min_10['Received a pepper'] == 0) & (df_merged_min_10['Number of ratings'] >= 19)]

# Initialize an empty list to store the results
results = []

# Iterate over each tag column
for tag in df_merged_min_19_no_pepper.columns[5:]:
    male_values = df_merged_min_19_no_pepper[df_merged_min_19_no_pepper['Male Professor'] == 1][tag].dropna()
    female_values = df_merged_min_19_no_pepper[df_merged_min_19_no_pepper['Female Professor'] == 1][tag].dropna()
    
    # Perform Mann-Whitney U test
    u_stat, p_value_u = mannwhitneyu(male_values, female_values, alternative='two-sided')
    
    # Perform KS test
    ks_stat, p_value_ks = ks_2samp(male_values, female_values)
    
    # Append the results to the list
    results.append({'Tag': tag, 'Mann-Whitney U p-value': p_value_u, 'KS test p-value': p_value_ks})

# Convert the results list to a DataFrame
p_values_df = pd.DataFrame(results)

p_values_df

Unnamed: 0,Tag,Mann-Whitney U p-value,KS test p-value
0,Tough grader,0.042482,0.04861631
1,Good feedback,0.8156274,0.9960229
2,Respected,8.679785e-07,1.786488e-05
3,Lots to read,0.1509157,0.03277275
4,Participation matters,0.08939194,0.1299905
5,Don’t skip class or you will not pass,0.01482165,0.05224913
6,Lots of homework,0.02363202,0.01330433
7,Inspirational,2.07821e-05,1.488763e-05
8,Pop quizzes!,0.6773414,0.9991341
9,Accessible,0.1356561,0.2921728


In [145]:
# sort the dataframe by the Mann-Whitney U p-value
p_values_df.sort_values(by='Mann-Whitney U p-value', ascending=True, inplace=True)

# Keep only tags that are significant at alpha = 0.005 on both tests
significant_results = p_values_df[(p_values_df['Mann-Whitney U p-value'] < 0.005) & (p_values_df['KS test p-value'] < 0.005)]

# Most gendered: the 3 significant tags with the lowest KS test p-values
significant_results_smallest = significant_results.nsmallest(3, 'KS test p-value')

# Least gendered: the 3 tags (out of all 20) with the highest Mann-Whitney U p-values
significant_results_biggest = p_values_df.nlargest(3, 'Mann-Whitney U p-value')

print(significant_results)
print(significant_results_smallest)
print(significant_results_biggest)


                 Tag  Mann-Whitney U p-value  KS test p-value
12         Hilarious            1.674606e-14     3.309649e-11
15  Amazing lectures            9.316475e-08     1.892189e-06
2          Respected            8.679785e-07     1.786488e-05
7      Inspirational            2.078210e-05     1.488763e-05
17      Extra credit            7.359923e-05     1.589638e-04
                 Tag  Mann-Whitney U p-value  KS test p-value
12         Hilarious            1.674606e-14     3.309649e-11
15  Amazing lectures            9.316475e-08     1.892189e-06
7      Inspirational            2.078210e-05     1.488763e-05
              Tag  Mann-Whitney U p-value  KS test p-value
19  Lecture heavy                0.984728         0.731513
16         Caring                0.860974         0.808828
1   Good feedback                0.815627         0.996023
