# Simpson's Paradox
Use this Jupyter notebook to analyze `admission_data.csv` and find the following values:

- Proportion and admission rate for each gender
- Proportion and admission rate for physics majors of each gender
- Proportion and admission rate for chemistry majors of each gender
- Admission rate for each major

In [18]:
# Load and view first few lines of dataset
import pandas as pd
import numpy as np

df = pd.read_csv('admission_data.csv')
df.head()

| Unnamed: 0 | student_id | gender | major | admitted |
|---|---|---|---|---|
| 0 | 35377 | female | Chemistry | False |
| 1 | 56105 | male | Physics | True |
| 2 | 31441 | female | Chemistry | False |
| 3 | 51765 | male | Physics | True |
| 4 | 53714 | female | Physics | True |


### Proportion and admission rate for each gender

In [19]:
# Proportion of students that are female
len(df[df['gender'] == 'female'])/df.shape[0]

0.514

In [20]:
# Proportion of students that are male
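(df['gender'] == 'male').mean()  # computed directly instead of relying on IPython's `_` (previous output)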

0.486

In [21]:
# Admission rate for females
df[df['gender'] == 'female']['admitted'].mean() 

0.28793774319066145

In [22]:
# Admission rate for males
df[df['gender'] == 'male']['admitted'].mean() #admission rates for females appear to be lower

0.48559670781893005

> Looking only at gender and admission rates, males appear to be favored in the admissions process. Males were admitted at a rate of 48.6%, while females were admitted at a rate of 28.8%.

### Proportion and admission rate for physics majors of each gender

In [23]:
# What proportion of female students are majoring in physics?

# P(physics | female) = P(female and physics) / P(female).
# Both probabilities share the same denominator (the total number of students),
# so the ratio of the raw counts gives the same result.

len(df.query('gender == "female" and major == "Physics"'))/len(df[df['gender'] == 'female'])

0.12062256809338522

In [24]:
# What proportion of male students are majoring in physics?

len(df.query('gender == "male" and major == "Physics"'))/len(df[df['gender'] == 'male']) # a far larger share of males than females apply to physics

0.9259259259259259

In [25]:
# Admission rate for female physics majors

# That is, what proportion of females who apply to physics are admitted?
fem_adm_phys = len(df.query('gender == "female" and major == "Physics" and admitted == True'))
fem_phys = len(df.query('gender == "female" and major == "Physics"'))

fem_adm_phys/fem_phys

0.7419354838709677

In [26]:
# Admission rate for male physics majors

# That is, what proportion of males who apply to physics are admitted?
male_adm_phys = len(df.query('gender == "male" and major == "Physics" and admitted == True'))
male_phys = len(df.query('gender == "male" and major == "Physics"'))

male_adm_phys/male_phys # the female admission rate in physics is higher

0.5155555555555555

> - Of the students applying as physics majors, females appear to be favored in the admissions process. Female physics majors were admitted at a rate of 74.2%, while male physics majors were admitted at a rate of 51.6%.
> - Males are far more likely to be physics majors than chemistry majors: 92.6% of males are physics majors.

### Proportion and admission rate for chemistry majors of each gender

In [27]:
# What proportion of female students are majoring in chemistry?
len(df.query('gender == "female" and major == "Chemistry"'))/len(df[df['gender'] == 'female'])

0.8793774319066148

In [28]:
# What proportion of male students are majoring in chemistry?
len(df.query('gender == "male" and major == "Chemistry"'))/len(df[df['gender'] == 'male']) # a far smaller share of males apply to chemistry

0.07407407407407407

In [29]:
# Admission rate for female chemistry majors
fem_adm_chem = len(df.query('gender == "female" and major == "Chemistry" and admitted == True'))
fem_chem = len(df.query('gender == "female" and major == "Chemistry"'))

fem_adm_chem/fem_chem

0.22566371681415928

In [30]:
# Admission rate for male chemistry majors
male_adm_chem = len(df.query('gender == "male" and major == "Chemistry" and admitted == True'))
male_chem = len(df.query('gender == "male" and major == "Chemistry"'))

male_adm_chem/male_chem # males are admitted at a lower rate in chemistry as well as in physics

0.1111111111111111

> - Of the students applying as chemistry majors, females appear to be favored in the admissions process. Female chemistry majors were admitted at a rate of 22.6%, while male chemistry majors were admitted at a rate of 11.1%.
> - Females are far more likely to be chemistry majors than physics majors: 87.9% of females are chemistry majors.
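
The per-gender, per-major figures computed cell by cell above could also be gathered in a single pass with a `groupby`. The following is a minimal sketch, not part of the original notebook: the helper name `admission_summary` and the toy frame are hypothetical, and the only assumption about the real data is that it has the `gender`, `major`, and `admitted` columns shown in `df.head()`.

```python
import pandas as pd

def admission_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Admission rate and applicant count for every (gender, major) pair."""
    summary = (
        frame.groupby(["gender", "major"])["admitted"]
             # mean of a boolean column = admission rate;
             # count = number of non-null rows in the group
             .agg(rate="mean", applicants="count")
    )
    # Share of each gender's applicants found in each major
    gender_totals = summary.groupby(level="gender")["applicants"].transform("sum")
    summary["share_of_gender"] = summary["applicants"] / gender_totals
    return summary

# Toy frame with made-up rows (NOT the contents of admission_data.csv)
toy = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male", "male"],
    "major": ["Physics", "Chemistry", "Chemistry", "Physics", "Physics", "Chemistry"],
    "admitted": [True, False, True, True, False, False],
})
toy_summary = admission_summary(toy)
print(toy_summary)
```

Calling `admission_summary(df)` on the notebook's own DataFrame should give one row per gender and major, with values matching the individual cells above.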

### Admission rate for each major

In [31]:
# Admission rate for physics majors
df[df['major'] == "Physics"]['admitted'].mean()

0.54296875

In [33]:
# Admission rate for chemistry majors
df[df['major'] == "Chemistry"]['admitted'].mean()

0.21721311475409835

> Chemistry has a much lower admission rate than physics: 21.7% versus 54.3%.

Many more females applied to chemistry, which had the lower admission rate, so their overall admission rate was lower than that of males. Within each major, however, females were admitted at a higher rate than males, in physics and in chemistry alike. This reversal of a trend when groups are combined is known as **Simpson's Paradox**.
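
The arithmetic behind the reversal can be checked directly. Each gender's overall admission rate is the average of its per-major rates, weighted by the share of that gender applying to each major. The sketch below uses only the figures printed in the cells above; the dictionary and function names are illustrative and do not appear in the notebook.

```python
# Share of each gender applying to each major (outputs of cells 23, 24, 27, 28)
share = {
    ("female", "Physics"): 0.12062256809338522,
    ("female", "Chemistry"): 0.8793774319066148,
    ("male", "Physics"): 0.9259259259259259,
    ("male", "Chemistry"): 0.07407407407407407,
}

# Admission rate within each gender and major (outputs of cells 25, 26, 29, 30)
rate = {
    ("female", "Physics"): 0.7419354838709677,
    ("female", "Chemistry"): 0.22566371681415928,
    ("male", "Physics"): 0.5155555555555555,
    ("male", "Chemistry"): 0.1111111111111111,
}

def overall_rate(gender: str) -> float:
    """Weighted average of per-major rates: sum of share * rate over majors."""
    return sum(share[(gender, major)] * rate[(gender, major)]
               for major in ("Physics", "Chemistry"))

female_overall = overall_rate("female")
male_overall = overall_rate("male")

# Females lead within every major...
assert all(rate[("female", m)] > rate[("male", m)] for m in ("Physics", "Chemistry"))
# ...yet trail overall, because most of their weight sits on the low-rate major.
assert female_overall < male_overall

print(f"female overall: {female_overall:.3f}, male overall: {male_overall:.3f}")
```

The two weighted averages reproduce the overall rates from cells 21 and 22 (about 0.288 for females and 0.486 for males), which shows that the gap comes from the weights rather than from the per-major rates.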