In [1]:
#import the libraries
import numpy as np
import pandas as pd
%matplotlib inline 
import matplotlib as mpl
import matplotlib.pyplot as plt
from scipy import stats

<a id="ref1"></a>
## Compare EMPA Program and Other Fee-based Programs using Hypothesis Testing

<a id="ref1"></a>
## Statistical Background

**A. What is hypothesis testing, and why do we use it?**

When we want to ask questions of data and interpret the answers, we use statistical methods that quantify how confident we can be in those answers. This class of methods is called statistical hypothesis testing, or significance testing. A hypothesis test computes a test statistic under a given assumption (the null hypothesis); the result tells us whether the data are consistent with that assumption or give evidence against it. <br>

We use hypothesis testing because we usually cannot draw conclusions about a whole population from a single sample: a pattern seen in the sample may have arisen by chance. Calling a finding statistically significant means that a hypothesis test found it unlikely to arise by chance alone if the null hypothesis were true.

**B. A few terms used in the following analysis:**

1. Null hypothesis (H0): the assumption that there is no effect or difference in the population being studied.
2. Alternative hypothesis (H1): the claim that contradicts H0, for example that one group's mean is lower than the other's.
3. P-value: the probability of obtaining a result at least as extreme as the one observed, assuming H0 is true.
4. Alpha (level of significance): the pre-defined probability of rejecting H0 when H0 is actually true (a type I error). If the p-value is lower than alpha, data this extreme would be unlikely under H0, so we reject H0 in favor of H1. A common value for alpha is 5%, or 0.05.
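
To make the role of alpha concrete, the following sketch simulates many experiments in which H0 is true by construction (both groups are drawn from the same rating distribution) and counts how often a two-sample t-test still rejects it. The group sizes, the uniform 1-5 rating distribution, and the seed are illustrative assumptions, not values from the survey.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
alpha = 0.05
n_trials = 2000

false_positives = 0
for _ in range(n_trials):
    # Both groups come from the same 1-5 rating distribution, so H0 is true.
    group_a = rng.integers(1, 6, size=40)
    group_b = rng.integers(1, 6, size=200)
    _, p_value = stats.ttest_ind(group_a, group_b)
    if p_value < alpha:
        false_positives += 1

type_i_rate = false_positives / n_trials
print(f"share of false rejections: {type_i_rate:.3f}")
```

The printed share should land close to alpha, which is what "probability of a type I error" means in practice.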

**C. When do we reject the null hypothesis H0?**

**If the p-value is lower than the predefined significance level (alpha), we reject H0 in favor of H1, because the data provide enough evidence against H0. Otherwise we fail to reject H0, which is not the same as showing that H0 is true.**

**D. Why do we use Student’s t-test(2-sample t-test) and 2-proportion z-tests in this analysis?**

*This analysis is divided into three sections, which analyze Question 3, Question 6, and Question 12's data from Master's Exit Surveys respectively with 2-sample t-tests and 2-proportion z-tests.*

Student's t-test and the 2-proportion z-test are both hypothesis tests. Student's t-test determines whether the means of two groups differ significantly. The 2-proportion z-test compares proportions from two independent samples.<br>

In our case, we want to know whether the findings from our earlier sample data visualization are statistically significant. To compare the average ratings that EMPA students and other fee-based program students give to each item, we use Student's t-test; to compare the proportions of poor and fair ratings given by the two groups, we use the 2-proportion z-test.
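
As a small illustration of how one rating column feeds both tests, the sketch below computes the two quantities being compared: the group means (t-test) and the share of low ratings (z-test). The data are made up, and the numeric coding (1 = Poor, 2 = Fair, 3 = Good, and so on) is an assumption about how the survey labels map to numbers.

```python
import pandas as pd

# Hypothetical ratings on a 1-5 scale (assumed coding: 1 = Poor, 2 = Fair, 3 = Good, ...).
ratings = pd.DataFrame({
    "major": ["EMPA", "EMPA", "EMPA", "EMPA", "OTHER", "OTHER", "OTHER", "OTHER"],
    "rating": [2, 3, 1, 4, 4, 5, 3, 2],
})

# Quantity compared by the t-test: the mean rating in each group.
group_means = ratings.groupby("major")["rating"].mean()

# Quantity compared by the z-test: the share of poor/fair ratings (<= 2) in each group.
low_share = (ratings["rating"] <= 2).groupby(ratings["major"]).mean()

print(group_means)
print(low_share)
```

The two quantities can disagree: a group may have a lower mean without having a noticeably larger share of poor and fair ratings, which is why both tests are run.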

<a id="ref1"></a>
## Section 1

From our previous data visualization, the EMPA program needs the most improvement in the following categories:
1. Inclusion of diverse perspectives (political, religious, racial/ethnic, gender, sexual orientation, etc.) in course discussions or assignments(Q3_6)
2. Extent to which the program has kept pace with recent trends and developments in your field(Q3_4)
3. Benefit versus cost of the program(Q3_7)
4. Overall quality(Q3_1)

Let's test the statistical significance!

**Data Import**

In [2]:
df= pd.read_csv("C:\\Users\\gtang\\Desktop\\MES\\MES_N_3.csv")

In [3]:
# .copy() makes an independent DataFrame, so the in-place dropna below
# does not raise SettingWithCopyWarning.
df_empa=df[(df.major=='EMPA')].copy()

In [4]:
df_empa.dropna(inplace=True)

In [5]:
df_others=df[(df.major!='EMPA')].copy()
df_others.dropna(inplace=True)

In [7]:
df_others

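| Unnamed: 0 | Q3_1 | Q3_2 | Q3_3 | Q3_4 | Q3_5 | Q3_6 | Q3_7 | major |
|---|---|---|---|---|---|---|---|---|
| 0 | 4.0 | 4.0 | 4.0 | 4.0 | 3.0 | 3.0 | 3.0 | NURSX |
| 2 | 4.0 | 5.0 | 4.0 | 5.0 | 3.0 | 5.0 | 4.0 | SOCWEH |
| 3 | 5.0 | 5.0 | 5.0 | 5.0 | 5.0 | 5.0 | 2.0 | BIOMRX |
| 4 | 4.0 | 4.0 | 5.0 | 4.0 | 3.0 | 4.0 | 3.0 | SOCWEH |
| 5 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | 3.0 | HCDEE |
| 6 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 | MEDXM |
| 7 | 4.0 | 4.0 | 3.0 | 4.0 | 3.0 | 1.0 | 1.0 | HA GX |
| 8 | 5.0 | 4.0 | 5.0 | 5.0 | 5.0 | 5.0 | 4.0 | LIS X |
| 9 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 | 4.0 | COM E |
| 10 | 4.0 | 4.0 | 3.0 | 3.0 | 5.0 | 5.0 | 4.0 | MBA EX |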


**2-sample t-test**

*Null Hypothesis H0: u_EMPA=u_Others<br>
Alternative H1: u_EMPA<u_Others*<br>
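where u_EMPA is the mean rating from EMPA students and u_Others is the mean rating from other fee-based program students<br>
We set alpha to 0.05. Because H1 is one-sided, the two-sided p-value returned by `stats.ttest_ind` is halved in the cells below; halving is valid only when the sample mean of the other programs is higher than the EMPA sample mean.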

In [6]:
t2, p2 = stats.ttest_ind(df_others[['Q3_6']].values,df_empa[['Q3_6']].values)

print("p of Inclusion of diverse perspectives= " + str(p2/2))

p of Inclusion of diverse perspectives= [0.01478659]


In [125]:
t2, p2 = stats.ttest_ind(df_others[['Q3_4']].values,df_empa[['Q3_4']].values)

print("p of Extent to which the program has kept pace with recent trends and developments in your field= " + str(p2/2))

p of Extent to which the program has kept pace with recent trends and developments in your field= [0.01815814]


In [126]:
t2, p2 = stats.ttest_ind(df_others[['Q3_7']].values,df_empa[['Q3_7']].values)

print("p of Benefit versus cost of the program= " + str(p2/2))

p of Benefit versus cost of the program= [0.01361567]


In [127]:
t2, p2 = stats.ttest_ind(df_others[['Q3_1']].values,df_empa[['Q3_1']].values)

print("p of Overall Quality = " + str(p2/2))

p of Overall Quality = [0.29391385]
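
The halving used in the cells above can also be expressed directly. The sketch below, on made-up ratings, compares the halved two-sided p-value with the one-sided p-value that SciPy computes when given `alternative="greater"`; that keyword requires SciPy 1.6 or newer, and the arrays are hypothetical stand-ins for columns such as `df_others['Q3_6']` and `df_empa['Q3_6']`.

```python
import numpy as np
from scipy import stats

# Hypothetical ratings standing in for one survey item in each group.
others = np.array([4, 5, 4, 3, 5, 4, 4, 5, 3, 4])
empa = np.array([3, 4, 2, 3, 4, 3])

# Two-sided test, as in the cells above.
t_stat, p_two_sided = stats.ttest_ind(others, empa)

# One-sided test of H1: mean(others) > mean(empa), i.e. u_EMPA < u_Others.
# The `alternative` keyword requires SciPy 1.6 or newer.
_, p_one_sided = stats.ttest_ind(others, empa, alternative="greater")

print(t_stat > 0, p_two_sided / 2, p_one_sided)
```

When the t statistic is positive the two numbers agree; when it is negative, the halved value understates the one-sided p-value, so checking the sign of `t2` is a cheap safeguard.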


At alpha = 0.05 we can reject the null hypothesis for the first three categories (Q3_6, Q3_4, Q3_7), but there is no significant evidence that EMPA students rate the overall quality of their program lower than students in other fee-based programs do.<br>
Next, we run the 2-proportion z-test on the proportions of EMPA students and other students who gave poor or fair ratings in these four categories.

**Data Preprocessing**

In [128]:
df_t= pd.read_csv("C:\\Users\\gtang\\Desktop\\MES\\MES_TT.csv")
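# Count how often each rating label occurs in each of the first seven columns (EMPA only)
df_empa1=df_t[(df_t.major=='EMPA')].iloc[:,0:7].apply(pd.Series.value_counts)
df_empa1 = df_empa1.replace(np.nan, 0)
# Total number of responses per item
totale=df_empa1.sum()
df_empa1=df_empa1.transpose()
# Drop the positive labels so only the poor and fair counts remain, then add them up per item
df_empa1.drop(['Good','Very Good','Excellent'],axis=1,inplace=True)
empa=df_empa1.sum(axis=1)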

In [129]:
df_others1=df_t[(df_t.major!='EMPA')].iloc[:,0:7].apply(pd.Series.value_counts)
df_others1 = df_others1.replace(np.nan, 0)
totalo=df_others1.sum()
df_others1=df_others1.transpose()
df_others1.drop(['Good','Very Good','Excellent'],axis=1,inplace=True)
others=df_others1.sum(axis=1)
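
The preprocessing above boils down to two numbers per item: how many poor or fair answers there are, and how many answers there are in total. A minimal sketch of the same idea on made-up data is shown below; the column contents are hypothetical, and `isin` is swapped in for the `value_counts`/`drop` combination used in the notebook.

```python
import pandas as pd

# Hypothetical text ratings for two survey items.
toy = pd.DataFrame({
    "Q3_1": ["Good", "Poor", "Excellent", "Fair", None],
    "Q3_2": ["Very Good", "Good", "Good", "Excellent", "Good"],
})

low_labels = ["Poor", "Fair"]

# Number of poor/fair answers per item; missing values never match, so they are ignored.
low_counts = toy.isin(low_labels).sum()

# Number of non-missing answers per item, the denominator of the proportion.
totals = toy.notna().sum()

print(low_counts / totals)
```

One possible advantage of this form is that it still works when a label never occurs in the data, a case in which dropping columns by name would fail.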

**2-proportion z-test**

*Null Hypothesis H0: p_EMPA=p_Others*<br> 
*Alternative H1: p_EMPA>p_Others*<br>
where p_EMPA is the proportion of EMPA students who give poor and fair ratings and p_Others is the proportion of other fee-based program students who give poor and fair ratings<br>
We define our alpha as 0.05

In [130]:
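def two_proportion_z(y1,n1,y2,n2):
    # y1, y2: counts of low ratings; n1, n2: total responses in each group
    p1=y1/n1
    p2=y2/n2
    # Pooled proportion under H0: p1 == p2
    p = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = np.sqrt((p*(1-p))*((1/n1)+(1/n2)))
    z = (p1-p2)/se
    # One-sided (upper-tail) p-value for H1: p1 > p2
    print('p value:',1-stats.norm.cdf(z))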

In [131]:
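# Positional order of the calls: Q3_6, Q3_4, Q3_7, Q3_1 (.iloc makes the integer positions explicit)
two_proportion_z(empa.iloc[5],totale.iloc[5],others.iloc[5],totalo.iloc[5])
two_proportion_z(empa.iloc[3],totale.iloc[3],others.iloc[3],totalo.iloc[3])
two_proportion_z(empa.iloc[6],totale.iloc[6],others.iloc[6],totalo.iloc[6])
two_proportion_z(empa.iloc[0],totale.iloc[0],others.iloc[0],totalo.iloc[0])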

p value: 0.019651491705923285
p value: 0.022498004041894726
p value: 0.12904113656541516
p value: 0.8065013632861502
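
The function above prints its result, which makes the p-values hard to reuse. A variant that returns the z statistic and the one-sided p-value might look like the sketch below; the function name and the example counts are hypothetical, and `norm.sf` is used in place of `1 - norm.cdf`.

```python
import numpy as np
from scipy import stats

def two_proportion_z_values(y1, n1, y2, n2):
    """Return (z, one-sided p-value) for H1: p1 > p2, using the pooled estimate."""
    p1 = y1 / n1
    p2 = y2 / n2
    pooled = (y1 + y2) / (n1 + n2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # norm.sf(z) equals 1 - norm.cdf(z) but keeps precision for large z.
    return z, stats.norm.sf(z)

# Hypothetical counts: 30 of 100 low ratings in group 1, 20 of 100 in group 2.
z, p_value = two_proportion_z_values(30, 100, 20, 100)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

Returning the values lets the results be collected in a table or compared against alpha in code rather than by eye.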


**Conclusion**

Based on the above p-values, we can conclude at the 0.05 level that EMPA students give poor or fair ratings on the following items more often than students in other programs:
1. Inclusion of diverse perspectives (political, religious, racial/ethnic, gender, sexual orientation, etc.) in course discussions or assignments (Q3_6)
2. Extent to which the program has kept pace with recent trends and developments in your field (Q3_4)

Note also that, according to the t-tests, the mean EMPA ratings on these two items and on Benefit versus cost of the program (Q3_7) are significantly lower than those from other fee-based programs.

<a id="ref1"></a>
## Section 2

Our data visualization shows that the EMPA program received its most negative ratings on internship options (12).
Other aspects that need close attention are: academic advising (5), accessibility of UW resources (6), research resources (9), availability of financial support (11), student networking opportunities (13), and admission services (14).


**Data Import**

In [132]:
df= pd.read_csv("C:\\Users\\gtang\\Desktop\\MES\\MES_N_6.csv")

In [133]:
# .copy() avoids SettingWithCopyWarning on the in-place dropna below
df_empa=df[(df.major=='EMPA')].copy()

In [134]:
df_empa.dropna(inplace=True)

In [135]:
df_others=df[(df.major!='EMPA')].copy()
df_others.dropna(inplace=True)

**2-sample t-test**

*Null Hypothesis H0: u_EMPA=u_Others<br>
Alternative H1: u_EMPA<u_Others*<br>
where u_EMPA is the average ratings from EMPA students and u_Others is the average ratings from other fee-based program students<br>
We define our alpha as 0.05

In [136]:
t2, p2 = stats.ttest_ind(df_others[['Q6_12']].values,df_empa[['Q6_12']].values)

print("p of internship options " + str(p2/2))

p of internship options [0.24003867]


In [137]:
t2, p2 = stats.ttest_ind(df_others[['Q6_5']].values,df_empa[['Q6_5']].values)

print("p of academic advising " + str(p2/2))

p of academic advising [0.22283072]


In [138]:
t2, p2 = stats.ttest_ind(df_others[['Q6_6']].values,df_empa[['Q6_6']].values)

print("p of accessibility of UW resources " + str(p2/2))

p of accessibility of UW resources [0.05176785]


In [139]:
t2, p2 = stats.ttest_ind(df_others[['Q6_9']].values,df_empa[['Q6_9']].values)

print("p of research resources " + str(p2/2))

p of research resources [0.01723138]


In [140]:
t2, p2 = stats.ttest_ind(df_others[['Q6_11']].values,df_empa[['Q6_11']].values)

print("p of availability of financial support  " + str(p2/2))

p of availability of financial support  [0.45537123]


In [141]:
t2, p2 = stats.ttest_ind(df_others[['Q6_13']].values,df_empa[['Q6_13']].values)

print("p of student networking opportunities " + str(p2/2))

p of student networking opportunities [0.06678924]


In [142]:
t2, p2 = stats.ttest_ind(df_others[['Q6_14']].values,df_empa[['Q6_14']].values)

print("p of admission services " + str(p2/2))

p of admission services [0.07699615]
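
The seven cells above repeat the same test with a different column each time. One way to express that pattern is a small helper that loops over the columns, sketched below; the helper name and the demo frames are hypothetical, and the sign check on the t statistic is an addition that the cells above do not perform.

```python
import pandas as pd
from scipy import stats

def one_sided_t_tests(df_others, df_empa, columns):
    """Test H1: mean(others) > mean(EMPA) for each column; return one-sided p-values."""
    rows = []
    for col in columns:
        t_stat, p_two_sided = stats.ttest_ind(df_others[col], df_empa[col])
        # Halving is valid only when t has the sign predicted by H1.
        p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
        rows.append({"item": col, "t": t_stat, "p_one_sided": p_one_sided})
    return pd.DataFrame(rows).set_index("item")

# Hypothetical data standing in for the survey frames.
demo_others = pd.DataFrame({"Q6_9": [4, 5, 4, 3, 5, 4], "Q6_12": [3, 3, 4, 2, 3, 4]})
demo_empa = pd.DataFrame({"Q6_9": [2, 3, 3, 2], "Q6_12": [3, 4, 3, 4]})

print(one_sided_t_tests(demo_others, demo_empa, ["Q6_9", "Q6_12"]))
```

Passing single columns (`df[col]`) rather than one-column frames also yields scalar p-values instead of one-element arrays.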


**Data Preprocessing**

In [143]:
df_t= pd.read_csv("C:\\Users\\gtang\\Desktop\\MES\\MES_TT.csv")
df_empa1=df_t[(df_t.major=='EMPA')].iloc[:,7:22].apply(pd.Series.value_counts)
df_empa1 = df_empa1.replace(np.nan, 0)
totale=df_empa1.sum()
df_empa1=df_empa1.transpose()
df_empa1.drop(['Good','Very Good','Excellent'],axis=1,inplace=True)
empa=df_empa1.sum(axis=1)

In [144]:
df_others1=df_t[(df_t.major!='EMPA')].iloc[:,7:22].apply(pd.Series.value_counts)
df_others1 = df_others1.replace(np.nan, 0)
totalo=df_others1.sum()
df_others1=df_others1.transpose()
df_others1.drop(['Good','Very Good','Excellent'],axis=1,inplace=True)
others=df_others1.sum(axis=1)

**2-proportion z-test**

*Null Hypothesis H0: p_EMPA=p_Others*<br> 
*Alternative H1: p_EMPA>p_Others*<br>
where p_EMPA is the proportion of EMPA students who give poor and fair ratings and p_Others is the proportion of other fee-based program students who give poor and fair ratings<br>
We define our alpha as 0.05

In [146]:
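# Positional order of the calls: Q6_12, Q6_5, Q6_6, Q6_9, Q6_11, Q6_13, Q6_14
two_proportion_z(empa.iloc[11],totale.iloc[11],others.iloc[11],totalo.iloc[11])
two_proportion_z(empa.iloc[4],totale.iloc[4],others.iloc[4],totalo.iloc[4])
two_proportion_z(empa.iloc[5],totale.iloc[5],others.iloc[5],totalo.iloc[5])
two_proportion_z(empa.iloc[8],totale.iloc[8],others.iloc[8],totalo.iloc[8])
two_proportion_z(empa.iloc[10],totale.iloc[10],others.iloc[10],totalo.iloc[10])
two_proportion_z(empa.iloc[12],totale.iloc[12],others.iloc[12],totalo.iloc[12])
two_proportion_z(empa.iloc[13],totale.iloc[13],others.iloc[13],totalo.iloc[13])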

p value: 0.00844281750111986
p value: 0.1782971655707445
p value: 0.058936312733272045
p value: 0.0843314539393214
p value: 0.237775002152391
p value: 0.14065184742637893
p value: 0.19881029169401543


**Conclusion**

Combining the above tests: the t-test shows that the mean EMPA rating on research resources is significantly lower than the mean from other fee-based programs (p ≈ 0.017), although the proportion z-test for that item is not significant (p ≈ 0.084, greater than alpha), so we cannot conclude that EMPA students give poor or fair ratings to research resources more often. For internship options the pattern is reversed: the z-test is significant at the 0.05 level (p ≈ 0.008), so EMPA students give poor or fair ratings to the program's internship options more often than other fee-based program students do, while the difference in mean ratings is not significant (p ≈ 0.24).
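
To see both tests side by side, the p-values printed in this section can be collected in one table. The sketch below copies the values from the outputs above (z-test values rounded to eight decimals) and assumes that the z-test calls follow the same item order as the t-test cells.

```python
import pandas as pd

alpha = 0.05

# One-sided p-values copied from the Section 2 outputs.
results = pd.DataFrame(
    {
        "t_test_p": [0.24003867, 0.22283072, 0.05176785, 0.01723138,
                     0.45537123, 0.06678924, 0.07699615],
        "z_test_p": [0.00844282, 0.17829717, 0.05893631, 0.08433145,
                     0.23777500, 0.14065185, 0.19881029],
    },
    index=["internship options", "academic advising",
           "accessibility of UW resources", "research resources",
           "availability of financial support",
           "student networking opportunities", "admission services"],
)

# Flag which results fall below the significance level for each test.
results["t_significant"] = results["t_test_p"] < alpha
results["z_significant"] = results["z_test_p"] < alpha
print(results)
```

Laid out this way, it is easy to see that no item in this section is significant under both tests at once.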

<a id="ref1"></a>
## Section 3

From our previous descriptive statistical analysis, we know that EMPA students tend to disagree more with the following statements
(ordered from the largest to the smallest difference in disagreement percentage between EMPA students and other students):
1. I saw myself and people of my background in course materials and examples (9)
2. I received positive mentorship in my program (10)
3. My program reflects an openness to diverse perspectives (political, religious, racial/ethnic, gender, sexual orientation, etc.) (8)
4. I felt encouraged and supported in my program (5)
5. I felt encouraged and supported in my school/college (6)

**Data Import**

In [147]:
df= pd.read_csv("C:\\Users\\gtang\\Desktop\\MES\\MES_N_12.csv")

In [148]:
# .copy() avoids SettingWithCopyWarning on the in-place dropna below
df_empa=df[(df.major=='EMPA')].copy()

In [149]:
df_others=df[(df.major!='EMPA')].copy()
df_others.dropna(inplace=True)

In [150]:
df_empa.dropna(inplace=True)

**2-sample t-test**

*Null Hypothesis H0: u_EMPA=u_Others<br>
Alternative H1: u_EMPA<u_Others*<br>
where u_EMPA is the average ratings from EMPA students and u_Others is the average ratings from other fee-based program students<br>
We define our alpha as 0.05

In [151]:
t2, p2 = stats.ttest_ind(df_others[['Q12_9']].values,df_empa[['Q12_9']].values)

print("p of I saw myself and people of my background in course materials and examples " + str(p2/2))

p of I saw myself and people of my background in course materials and examples [1.64152295e-06]


Our test is statistically significant because our p value is smaller than alpha, and we can reject the null hypothesis.

In [152]:
t2, p2 = stats.ttest_ind(df_others[['Q12_10']].values,df_empa[['Q12_10']].values)

print("p of I received positive mentorship in my program " + str(p2/2))

p of I received positive mentorship in my program [2.20475584e-05]


Our test is statistically significant because our p value is smaller than alpha, and we can reject the null hypothesis.

In [153]:
t2, p2 = stats.ttest_ind(df_others[['Q12_8']].values,df_empa[['Q12_8']].values)

print("p of My program reflects an openness to diverse perspectives " + str(p2/2))

p of My program reflects an openness to diverse perspectives [0.00380323]


Our test is statistically significant because our p value is smaller than alpha, and we can reject the null hypothesis.

In [154]:
t2, p2 = stats.ttest_ind(df_others[['Q12_5']].values,df_empa[['Q12_5']].values)

print("p of I felt encouraged and supported in my program " + str(p2/2))

p of I felt encouraged and supported in my program [0.25620213]


Our test is not statistically significant because our p-value is greater than alpha, so we cannot reject the null hypothesis.

In [155]:
t2, p2 = stats.ttest_ind(df_others[['Q12_6']].values,df_empa[['Q12_6']].values)

print("p of I felt encouraged and supported in my school/college " + str(p2/2))

p of I felt encouraged and supported in my school/college [0.22285065]


Let's conduct a two-proportion z-test on these five statements to decide whether the difference in "strongly disagree" and "disagree" ratings between EMPA students and other fee-based program students could simply be due to chance.

**Data Preprocessing**

In [156]:
df_t= pd.read_csv("C:\\Users\\gtang\\Desktop\\MES\\MES_TT.csv")
df_empa1=df_t[(df_t.major=='EMPA')].iloc[:,22:32].apply(pd.Series.value_counts)
df_empa1 = df_empa1.replace(np.nan, 0)
totale=df_empa1.sum()
df_empa1=df_empa1.transpose()
df_empa1.drop(['Agree','Neither Agree nor Disagree','Strongly Agree'],axis=1,inplace=True)
empa=df_empa1.sum(axis=1)

In [157]:
df_others1=df_t[(df_t.major!='EMPA')].iloc[:,22:32].apply(pd.Series.value_counts)
df_others1 = df_others1.replace(np.nan, 0)
totalo=df_others1.sum()
df_others1=df_others1.transpose()
df_others1.drop(['Agree','Neither Agree nor Disagree','Strongly Agree'],axis=1,inplace=True)
others=df_others1.sum(axis=1)

**2-proportion z-test**

*Null Hypothesis H0: p_EMPA=p_Others*<br> 
*Alternative H1: p_EMPA>p_Others*<br>
where p_EMPA is the proportion of EMPA students who answer "disagree" or "strongly disagree" and p_Others is the proportion of other fee-based program students who answer "disagree" or "strongly disagree"<br>
We define our alpha as 0.05

In [159]:
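# Positional order of the calls: Q12_9, Q12_10, Q12_8, Q12_5, Q12_6
two_proportion_z(empa.iloc[8],totale.iloc[8],others.iloc[8],totalo.iloc[8])
two_proportion_z(empa.iloc[9],totale.iloc[9],others.iloc[9],totalo.iloc[9])
two_proportion_z(empa.iloc[7],totale.iloc[7],others.iloc[7],totalo.iloc[7])
two_proportion_z(empa.iloc[4],totale.iloc[4],others.iloc[4],totalo.iloc[4])
two_proportion_z(empa.iloc[5],totale.iloc[5],others.iloc[5],totalo.iloc[5])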

p value: 1.3935985698765307e-10
p value: 5.714918780763689e-05
p value: 0.014492595760150384
p value: 0.20577374780946767
p value: 0.2608932521049081


**Conclusion**

Based on the above p-values and the previous t-test results, we can conclude at the 0.05 level that EMPA students disagree with the following statements more often than students in other programs:
1. I saw myself and people of my background in course materials and examples (9)
2. I received positive mentorship in my program (10)
3. My program reflects an openness to diverse perspectives (political, religious, racial/ethnic, gender, sexual orientation, etc.) (8)

The average ratings on these items from EMPA students are also significantly lower than those from other fee-based program students.

<a id="ref1"></a>
## Thanks for reading!