# Simpson's Paradox

Simpson's paradox, which goes by several names, is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined. It is often encountered in social-science and medical-science statistics, and it is particularly problematic when frequency data are given unwarranted causal interpretations. The paradox can be resolved when causal relations are appropriately addressed in the statistical modeling.

In [1]:
# Load and view first few lines of dataset
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv('admission_data.csv')
df.head()

Unnamed: 0,student_id,gender,major,admitted
0,35377,female,Chemistry,False
1,56105,male,Physics,True
2,31441,female,Chemistry,False
3,51765,male,Physics,True
4,53714,female,Physics,True


## Admission rate

In [2]:
# total number of students

total_students = df.shape[0]
total_students

500

> __Total number of students who applied for admission is 500.__

In [3]:
admit = len(df[df['admitted']])
admit

192

> __Number of students who got admission: 192__ 

In [4]:
# Admission rate

admit_prop = admit / total_students
admit_prop

0.384

> __Admission rate: 0.384__ 

The admissions dataset contains __500 applicants__, of whom __192__ were admitted, so the overall admission rate is __38.4%__.
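Because `admitted` is a boolean column, its mean is the admission rate, so the count-then-divide steps above can be collapsed into a single call. A minimal sketch, assuming a stand-in Series (the name `admitted` here is illustrative) rebuilt from the counts reported above rather than read from the CSV:

```python
import pandas as pd

# Stand-in for df['admitted'], rebuilt from the counts reported above
# (192 admitted, 308 not admitted). The notebook reads these from the CSV.
admitted = pd.Series([True] * 192 + [False] * 308, name="admitted")

# The mean of a boolean Series is the fraction of True values
admit_rate = admitted.mean()
print(round(admit_rate, 3))
```

On the real data, the equivalent call would be `df['admitted'].mean()`.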

## Proportion and admission rate for each gender

In [5]:
# number of female students

female_students = len(df[df['gender'] == 'female'])
female_students

257

> __Number of female students who applied for admission: 257.__

In [6]:
# Proportion of students that are female

female_prop = female_students/total_students
female_prop

0.514

> __Proportion of students that are female: 0.514__ 

In [7]:
# number of male students

male_students = len(df[df['gender'] == 'male'])
male_students

243

> __Number of male students who applied for admission: 243.__

In [8]:
# Proportion of students that are male

male_prop = male_students/total_students
male_prop

0.486

> __Proportion of students that are male: 0.486__ 

In [9]:
# Admission rate for females

female_admits = len(df[(df['gender'] == 'female') & (df['admitted'])]) / female_students
female_admits

0.28793774319066145

> __Admission rate for female students: 0.288__ 

In [10]:
# Admission rate for males

male_admits = len(df[(df['gender'] == 'male') & (df['admitted'])]) / male_students
male_admits

0.48559670781893005

> __Admission rate for male students: 0.486__ 

- The number of female applicants is greater than the number of male applicants __(female : male = 0.514 : 0.486)__.
- The admission rate for female students __(28.8%)__ is less than the admission rate of male students __(48.6%)__.
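The per-gender shares and rates can also be computed in one pass with `value_counts` and `groupby`. The sketch below is illustrative only: `df_demo` is a hypothetical stand-in built from the counts above, and the admitted counts (74 females, 118 males) are inferred from the reported rates (0.288 × 257 and 0.486 × 243), not read from the CSV.

```python
import pandas as pd

# Stand-in data: 257 female applicants (74 admitted) and 243 male applicants
# (118 admitted). The admitted counts are inferred from the rates above.
rows = (
    [("female", True)] * 74 + [("female", False)] * 183
    + [("male", True)] * 118 + [("male", False)] * 125
)
df_demo = pd.DataFrame(rows, columns=["gender", "admitted"])

# Share of applicants by gender
gender_share = df_demo["gender"].value_counts(normalize=True)

# Admission rate by gender: mean of the boolean column within each group
rate_by_gender = df_demo.groupby("gender")["admitted"].mean()

print(gender_share.round(3))
print(rate_by_gender.round(3))
```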

## Proportion and admission rate for physics majors of each gender

In [11]:
# total number of students applying for physics major

total_physics = len(df[df['major'] == 'Physics'])
total_physics

256

> __Number of students applying for Physics major: 256__ 

In [12]:
#proportion of students applying for physics major 

phy_prop = total_physics/ total_students
phy_prop

0.512

> __Percentage of students applying for physics major: 51.2%__ 

In [13]:
# number of female students applying for physics major

female_physics = len(df[(df['major'] == 'Physics') & (df['gender'] == 'female')])
female_physics

31

> __Number of female students applying for Physics major: 31__ 

In [14]:
# What proportion of female students are majoring in physics?

female_physics_prop = female_physics / female_students
female_physics_prop

0.12062256809338522

> __Percentage of female students applying for Physics major: 12.06%__ 

In [15]:
# number of male students majoring in physics

male_physics = len(df[(df['gender'] == 'male') & (df['major'] == 'Physics')])
male_physics

225

> __Number of male students applying for Physics major: 225__ 

In [16]:
# What proportion of male students are majoring in physics?
male_physics_prop = male_physics/ male_students
male_physics_prop

0.9259259259259259

> __Percentage of male students applying for Physics major: 92.6%__ 

In [17]:
# Number of female physics major students who got admission 

len(df[(df['gender'] == 'female') & (df['major'] == 'Physics') & (df['admitted'])])

23

> __Number of Female Physics major students who got admission is: 23__ 

In [18]:
# Admission rate for female physics majors
female_physics_admits = len(df[(df['gender'] == 'female') & (df['major'] == 'Physics') & (df['admitted'])]) / female_physics
female_physics_admits

0.7419354838709677

> __Admission percentage for female students applying for Physics major: 74.19%__ 

In [19]:
# Number of male physics major students who got admission 

len(df[(df['gender'] == 'male') & (df['major'] == 'Physics') & (df['admitted'])])

116

> __Number of Male Physics major students who got admission is: 116__ 

In [20]:
# Admission rate for male physics majors
male_physics_admits = len(df[(df['gender'] == 'male') & (df['major'] == 'Physics') & (df['admitted'])]) / male_physics
male_physics_admits

0.5155555555555555

> __Admission percentage for male students applying for Physics major: 51.6%__ 

- Percentage of students applying for physics major is __51.2%.__
- Number of female students applying for physics major is __31__, out of which __23__ students got admission. Hence the admission rate for female students with Physics major is __74.19%.__
- Number of male students applying for physics major is __225__, out of which __116__ students got admission. Hence the admission rate for male students with Physics major is __51.56%.__
- From the above we can observe that for __Physics major the admission rate for females is greater than that for males__.

## Proportion and admission rate for chemistry majors of each gender

In [21]:
# total number of students applying for chemistry major

total_chem = len(df[df['major'] == 'Chemistry'])
total_chem

244

> __Number of students applying for Chemistry major: 244__ 

In [22]:
# Proportion of students applying for Chemistry major

chem_prop = total_chem / total_students
chem_prop

0.488

> __Percentage of students applying for Chemistry major: 48.8%__ 

In [23]:
# number of female students applying for chemistry major

female_chem = len(df[(df['major'] == 'Chemistry') & (df['gender'] == 'female')])
female_chem

226

> __Number of female students applying for Chemistry major: 226__ 

In [24]:
# What proportion of female students are majoring in chemistry?

female_chem_prop = female_chem / female_students
female_chem_prop

0.8793774319066148

> __Percentage of female students applying for Chemistry major: 87.9%__ 

In [25]:
# number of male students applying for chemistry major

male_chem = len(df[(df['major'] == 'Chemistry') & (df['gender'] == 'male')])
male_chem

18

> __Number of male students applying for Chemistry major: 18__ 

In [26]:
# What proportion of male students are majoring in chemistry?

male_chem_prop = male_chem / male_students
male_chem_prop

0.07407407407407407

> __Percentage of male students applying for Chemistry major: 7.4%__ 

In [27]:
# Number of female chemistry major students who got admission

len(df[(df['gender'] == 'female') & (df['major'] == 'Chemistry') & (df['admitted'])])

51

> __Number of Female Chemistry major students who got admission is: 51__ 

In [28]:
# Admission rate for female chemistry majors

female_chem_admits = len(df[(df['gender'] == 'female') & (df['major'] == 'Chemistry') & (df['admitted'])]) / female_chem
female_chem_admits

0.22566371681415928

> __Admission percentage for female students applying for Chemistry major: 22.57%__ 

In [29]:
# Number of male chemistry major students who got admission

len(df[(df['gender'] == 'male') & (df['major'] == 'Chemistry') & (df['admitted'])])

2

> __Number of Male Chemistry major students who got admission is: 2__ 

In [30]:
# Admission rate for male chemistry majors

male_chem_admits = len(df[(df['gender'] == 'male') & (df['major'] == 'Chemistry') & (df['admitted'])]) / male_chem

male_chem_admits

0.1111111111111111

> __Admission percentage for male students applying for Chemistry major: 11.11%__ 

- Percentage of students applying for chemistry major is __48.8%.__
- Number of female students applying for chemistry major is __226__, out of which __51__ students got admission. Hence the admission rate for female students with chemistry major is __22.57%.__
- Number of male students applying for chemistry major is __18__, out of which __2__ students got admission. Hence the admission rate for male students with chemistry major is __11.11%.__
- From the above we can observe that for __Chemistry major the admission rate for females is greater than that for males__.
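The four per-cell rates computed one at a time above can be laid out side by side as a major-by-gender table. This is a sketch under assumptions: `df_demo` is a hypothetical stand-in rebuilt from the admitted and applicant counts reported in the cells above, not the original CSV.

```python
import pandas as pd

# (gender, major, admitted count, total applicants), taken from the cells above
cells = [
    ("female", "Physics", 23, 31),
    ("male", "Physics", 116, 225),
    ("female", "Chemistry", 51, 226),
    ("male", "Chemistry", 2, 18),
]

# Expand the counts into one row per applicant
rows = []
for gender, major, n_admit, n_total in cells:
    rows += [(gender, major, True)] * n_admit
    rows += [(gender, major, False)] * (n_total - n_admit)
df_demo = pd.DataFrame(rows, columns=["gender", "major", "admitted"])

# Rows: major, columns: gender, values: admission rate within each cell
rate_table = df_demo.groupby(["major", "gender"])["admitted"].mean().unstack()
print(rate_table.round(3))
```

Reading across each row of `rate_table`, the female rate is higher than the male rate for both majors.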

## Admission rate for each major

In [31]:
# Admission rate for physics majors

admit_physics = len(df[(df['major'] == 'Physics') & (df['admitted'])]) / total_physics
admit_physics

0.54296875

> __Admission percentage for Physics major: 54.3%__ 

In [32]:
# Admission rate for chemistry majors

admit_chem = len(df[(df['major'] == 'Chemistry') & (df['admitted'])]) / total_chem
admit_chem

0.21721311475409835

> __Admission percentage for Chemistry major: 21.7%__ 

### Observations: 

- The number of students applying for Physics major (256) is slightly greater than the number applying for Chemistry major (244), and the admission rate of Physics major students (54.3%) is much greater than the admission rate of Chemistry major students (21.7%).

- Looking only at gender and admission rates, males appear to be favored in the admissions process. Males were admitted at a rate of 48.6%, while females were admitted at a rate of 28.8%.

- Restricting to physics applicants gives the opposite picture: females appear to be favored. Female physics majors were admitted at a rate of 74.2%, while male physics majors were admitted at a rate of 51.6%. However, physics applicants are overwhelmingly male: 92.6% of males in this dataset have physics majors, compared with 12.1% of females.

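- Looking at the chemistry data, we find that most chemistry majors in this dataset are female: 87.9% of females have chemistry majors. Females were again admitted at a higher rate than males (22.6% vs. 11.1%), but chemistry has a much lower admission rate, 21.7%, than physics at 54.3%.

- Taken together, this is Simpson's paradox: females have the higher admission rate within each major, yet the lower rate overall, because most females applied to the major that admits far fewer of its applicants.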
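Each gender's overall admission rate is a weighted average of its per-major rates, where the weights are the shares of that gender applying to each major. The sketch below checks this arithmetic using only the counts reported in the cells above; the helper `overall_rate` and the `counts` dictionary are illustrative names, not part of the original notebook.

```python
# (admitted, applicants) per gender and major, as counted in the cells above
counts = {
    "female": {"Physics": (23, 31), "Chemistry": (51, 226)},
    "male": {"Physics": (116, 225), "Chemistry": (2, 18)},
}


def overall_rate(by_major):
    """Overall rate as a weighted average of per-major admission rates."""
    total = sum(n for _, n in by_major.values())
    # weight = share of this gender applying to the major; rate = admitted / applicants
    return sum((n / total) * (a / n) for a, n in by_major.values())


overall = {gender: overall_rate(by_major) for gender, by_major in counts.items()}
for gender, rate in overall.items():
    print(gender, round(rate, 3))
```

This reproduces the overall rates of about 0.288 for females and 0.486 for males. The female average is dominated by the low chemistry rate (weight 226/257), while the male average is dominated by the higher physics rate (weight 225/243).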


__________________