# Case Study

A small survey of health-care providers (physicians in private practice) classifies respondents as being either Primary-Care Providers (PCP) or Specialists (Spec). Of the 26 PCP physicians responding to the survey, 19 say health care reform is needed, and the rest say it is not. Of the 29 Specialists, 10 say reform is needed, and the rest say it is not.
Use the most appropriate test (at the .05 level) to test if type of specialization (PCP vs. Specialist) is independent of support for health care reform. Report the results in APA format. Interpret the results in terms of the proportion of each type of physician supporting health care reform (citing relevant descriptive statistics).


    
# Problem 1 
Use the most appropriate test (at the .05 level) to test if type of specialization (PCP vs. Specialist) is independent of support for health care reform.

In [1]:
#### Observed frequencies:
####                      PCP   Spec   Sums
#### Need Reform           19     10     29
#### Don’t need reform      7     19     26
#### Sums                  26     29     55

In [13]:
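from scipy import stats
# Observed counts for each row of the table, ordered [PCP, Spec]
Need_Reform = [19, 10]
Dont_need_reform = [7, 19]
# Caution: chisquare is a goodness-of-fit test that treats its second argument
# as expected counts, so it is not a test of independence. Recent SciPy
# versions may also reject this call because the two lists sum differently.
stats.chisquare(Need_Reform, Dont_need_reform)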

Power_divergenceResult(statistic=24.834586466165415, pvalue=6.246653302320535e-07)

In [14]:
import numpy as np
# 2x2 table of observed counts: rows = opinion on reform, columns = [PCP, Spec]
observed = np.array([Need_Reform, Dont_need_reform])
# For a 2x2 table, chi2_contingency applies Yates' continuity correction by default
stats.chi2_contingency(observed)

(6.717091075712909, 0.00954932563048075, 1, array([[13.70909091, 15.29090909],
        [12.29090909, 13.70909091]]))

In [15]:
## a more readable printout of the result
chi2_stat, p_val, dof, ex = stats.chi2_contingency(observed)
print("===Chi2 Stat===")
print(chi2_stat)
print("\n")
print("===Degrees of Freedom===")
print(dof)
print("\n")
print("===P-Value===")
print(p_val)
print("\n")
print("===Expected Frequencies===")
print(ex)

===Chi2 Stat===
6.717091075712909


===Degrees of Freedom===
1


===P-Value===
0.00954932563048075


===Expected Frequencies===
[[13.70909091 15.29090909]
 [12.29090909 13.70909091]]
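
For a 2x2 table, `stats.chi2_contingency` applies Yates' continuity correction unless `correction=False` is passed, which is why the statistic above is 6.72 rather than the uncorrected value. A minimal sketch comparing the two, re-entering the observed counts so that it runs on its own (the variable names are illustrative, not part of the original notebook):

```python
import numpy as np
from scipy import stats

# Observed counts: rows = [need reform, don't need reform], columns = [PCP, Spec]
observed = np.array([[19, 10],
                     [7, 19]])

# Default behaviour for a 2x2 table: Yates' continuity correction is applied
chi2_yates, p_yates, dof, expected = stats.chi2_contingency(observed)

# Same test without the continuity correction (plain Pearson chi-square)
chi2_plain, p_plain, _, _ = stats.chi2_contingency(observed, correction=False)

print(f"With Yates correction:    chi2 = {chi2_yates:.3f}, p = {p_yates:.4f}")
print(f"Without Yates correction: chi2 = {chi2_plain:.3f}, p = {p_plain:.4f}")
```

Both versions lead to the same decision at the .05 level for these data; the corrected statistic is the more conservative of the two.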


### Chi2observed = 6.72 (with Yates' continuity correction), Chi2crit = 3.84 (df = 1, alpha = .05)
### Because 6.72 > 3.84, the observed frequencies are significantly different from the frequencies expected due to chance alone.
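
The critical value and the p-value can be reproduced from the chi-square distribution itself. A short sketch, assuming the statistic and degrees of freedom reported by `chi2_contingency` in the output earlier (the variable names are illustrative):

```python
from scipy import stats

alpha = 0.05                        # significance level given in the problem
dof = 1                             # (rows - 1) * (columns - 1) for a 2x2 table
chi2_observed = 6.717091075712909   # statistic copied from the chi2_contingency output

# Critical value: the point that cuts off the upper 5% of the chi-square(1) distribution
chi2_crit = stats.chi2.ppf(1 - alpha, dof)

# p-value: upper-tail probability of a statistic at least this large
p_val = stats.chi2.sf(chi2_observed, dof)

reject_null = chi2_observed > chi2_crit
print(f"critical value = {chi2_crit:.2f}, p = {p_val:.4f}, reject H0: {reject_null}")
```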

### Conclusion: Specialization is not independent of support for health care reform, Chi2(1, N = 55) = 6.72, p = .010. A significantly larger proportion of PCPs (19 of 26, 73.1%) than Specialists (10 of 29, 34.5%) say health care reform is needed.
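
The descriptive statistics behind this conclusion come straight from the observed table: the share of each physician type saying reform is needed. A small sketch using the counts from the case study (the dictionary names are illustrative):

```python
# Counts taken from the case study
need_reform = {"PCP": 19, "Spec": 10}   # physicians saying reform is needed
totals = {"PCP": 26, "Spec": 29}        # physicians responding, by type

# Proportion of each physician type supporting health care reform
proportions = {group: need_reform[group] / totals[group] for group in totals}

for group, prop in proportions.items():
    print(f"{group}: {need_reform[group]}/{totals[group]} = {prop:.1%}")
```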

### Contingency Table = Expected frequencies
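The array at the end of the output holds the expected frequencies: the counts each cell would contain if specialization and support for reform were independent, computed as (row total × column total) / N. The observed counts differ noticeably from these values (for example, 19 PCPs say reform is needed, against about 13.7 expected), and that discrepancy is what the chi-square statistic measures. All four expected counts are above 5, so the chi-square approximation is reasonable for a sample of this size.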
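
The expected frequencies can be reproduced by hand from the row and column totals of the observed table: each cell is (row total × column total) / N. A minimal NumPy sketch, re-entering the observed counts so that it runs on its own (the variable names are illustrative):

```python
import numpy as np

# Observed counts: rows = [need reform, don't need reform], columns = [PCP, Spec]
observed = np.array([[19, 10],
                     [7, 19]])

row_totals = observed.sum(axis=1)   # [29, 26]
col_totals = observed.sum(axis=0)   # [26, 29]
n = observed.sum()                  # 55

# Expected count in each cell under independence: row total * column total / N
expected = np.outer(row_totals, col_totals) / n
print(expected)
```

The result should match the expected-frequency array returned by `stats.chi2_contingency`.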

# Concept: Chi-Square Test
The Chi-Square test of independence is used to determine if there is a significant relationship between two categorical variables.  

The frequency of each category for one nominal variable is compared across the categories of the second nominal variable.  

The data can be displayed in a contingency table where each row represents a category for one variable and each column represents a category for the other variable.  

For example, a researcher wants to examine the relationship between gender (male vs. female) and empathy (high vs. low).  

The chi-square test of independence can be used to examine this relationship.  

The null hypothesis for this test is that there is no relationship between gender and empathy.  

The alternative hypothesis is that there is a relationship between gender and empathy (e.g. there are more high-empathy females than high-empathy males).
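
For reference, the statistic behind the test can be written out as below. The symbols are introduced here for illustration and do not come from the source: $O_{ij}$ and $E_{ij}$ are the observed and expected counts in row $i$, column $j$; $R_i$ and $C_j$ are the row and column totals; $N$ is the grand total; $r$ and $c$ are the numbers of rows and columns. This is the uncorrected Pearson form, without the continuity correction sometimes applied to 2x2 tables.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
df = (r - 1)(c - 1)
```

A large value of $\chi^2$ relative to the chi-square distribution with $df$ degrees of freedom is evidence against the null hypothesis of no relationship.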

Reference: http://www.statisticssolutions.com/non-parametric-analysis-chi-square/