# Statistical Analysis

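A significance level of 0.05 means accepting a 5% risk of concluding that a difference exists when there is no actual difference. If the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude that not all of the population means are equal. Each test below compares two groups (one year's column from each of two tables), and every reported p-value is below 0.05, so the null hypothesis of equal means is rejected in each case.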
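
The decision rule described above can be written out as a small helper. This is only a sketch: `ALPHA` and `reject_null` are illustrative names that do not appear in the notebook, and the p-values are copied from the 2010 results listed in this section.

```python
ALPHA = 0.05  # significance level used in this analysis


def reject_null(p_value, alpha=ALPHA):
    """Return True when the p-value is at or below the significance level."""
    return p_value <= alpha


# p-values copied from the 2010 ANOVA results reported in this notebook
p_values_2010 = {
    "Uninsured vs. Mortality": 0.0026973985600535423,
    "Uninsured vs. ER": 0.019483926426570968,
    "Uninsured vs. Expenditure": 0.0028346121764053025,
}

decisions = {name: reject_null(p) for name, p in p_values_2010.items()}

for name, rejected in decisions.items():
    verdict = "reject" if rejected else "fail to reject"
    print(f"{name}: {verdict} the null hypothesis")
```

The comparison uses `<=` so that a p-value exactly equal to the significance level still leads to rejection, matching the wording of the rule.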

### 2010 ANOVA
* **Uninsured vs. Mortality:** statistic = 11.423921956549762 // pvalue = 0.0026973985600535423
* **Uninsured vs. ER:** statistic = 6.351761254768116 // pvalue = 0.019483926426570968
* **Uninsured vs. Expenditure:** statistic = 11.283565271277174 // pvalue = 0.0028346121764053025

### 2011 ANOVA
* **Uninsured vs. Mortality:** statistic = 11.21136058639101 // pvalue = 0.0029081766915106714
* **Uninsured vs. ER:** statistic = 7.033348884222337 // pvalue = 0.014559805084624079
* **Uninsured vs. Expenditure:** statistic = 11.06714126387808 // pvalue = 0.003061481311349054

### 2012 ANOVA
* **Uninsured vs. Mortality:** statistic = 11.323001957300848 // pvalue = 0.002795298916617509
* **Uninsured vs. ER:** statistic = 6.7526620230312 // pvalue = 0.016396720805420435
* **Uninsured vs. Expenditure:** statistic = 11.170360018700608 // pvalue = 0.0029508825831652606

### 2013 ANOVA
* **Uninsured vs. Mortality:** statistic = 11.120012688857067 // pvalue = 0.003004270448895195
* **Uninsured vs. ER:** statistic = 6.46450451340912 // pvalue = 0.018554893700544528
* **Uninsured vs. Expenditure:** statistic = 10.965546725568366 // pvalue = 0.003174809371025765

### 2014 ANOVA
* **Uninsured vs. Mortality:** statistic = 10.186823873344807 // pvalue = 0.004213841256332821
* **Uninsured vs. ER:** statistic = 8.017535743002378 // pvalue = 0.009713419730182178
* **Uninsured vs. Expenditure:** statistic = 10.00437409628404 // pvalue = 0.004508216649936408
* **ER vs. Grouped ER:** statistic = 16.09991899027026 // pvalue = 0.0005850848184336672
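
Each `statistic` listed above is a one-way ANOVA F ratio: the variance between the group means divided by the variance within the groups. The sketch below computes that ratio by hand to show where the number comes from. The function name `f_statistic` and the two toy groups are hypothetical and are not data from this analysis.

```python
from statistics import mean


def f_statistic(*groups):
    """Compute the one-way ANOVA F ratio for two or more groups of numbers."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)            # between-group mean square
    ms_within = ss_within / (n_total - k)        # within-group mean square
    return ms_between / ms_within


# Toy data, for illustration only
toy_a = [1.0, 2.0, 3.0]
toy_b = [2.0, 3.0, 4.0]

f_value = f_statistic(toy_a, toy_b)
print(f_value)  # 1.5
```

A larger F ratio means the group means are far apart relative to the spread inside each group, which is what drives the small p-values reported above.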

In [2]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import scipy
from scipy import stats

In [2]:
# Load the uninsured counts by state, 2010-2014
uninsured_path = "Resources/dataframe_uninsured.csv"
uninsured_df = pd.read_csv(uninsured_path)  # read_csv already returns a DataFrame
uninsured_df

Unnamed: 0,State,2010,2011,2012,2013,2014
0,ARIZONA,1065000,1095000,1131000,1118000,903000
1,FLORIDA,3941000,3911000,3816000,3853000,3245000
2,ILLINOIS,1746000,1659000,1622000,1618000,1238000
3,IOWA,280000,269000,254000,248000,189000
4,KENTUCKY,647000,618000,595000,616000,366000
5,MARYLAND,641000,598000,598000,593000,463000
6,MINNESOTA,476000,467000,425000,440000,317000
7,NEBRASKA,208000,207000,206000,209000,179000
8,NORTH CAROLINA,1570000,1545000,1582000,1509000,1276000
9,SOUTH CAROLINA,795000,766000,778000,739000,642000
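
The `Unnamed: 0` column in the output above usually means the CSV was written with its row index and then read back as ordinary data. One way to avoid the extra column is to pass `index_col=0` to `pd.read_csv`. The sketch below assumes the file's first header field is empty, which is how `DataFrame.to_csv` writes an unnamed index by default. The inline CSV text is a two-row excerpt of the table above rather than the real file.

```python
import io

import pandas as pd

# Assumed file layout: the first header field is empty because the index
# was saved without a name. Rows are an excerpt of the table shown above.
csv_text = """,State,2010,2011
0,ARIZONA,1065000,1095000
1,FLORIDA,3941000,3911000
"""

# Default read: the saved index comes back as a data column named "Unnamed: 0".
with_extra = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that first column as the row index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_extra.columns))
print(list(clean.columns))
```

The same argument would apply to the other files loaded in this notebook if they share this layout.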


In [3]:
# Load crude mortality by state, 2010-2014
mortality_path = "Resources/crudemortality.csv"
mortality_df = pd.read_csv(mortality_path)  # read_csv already returns a DataFrame
mortality_df

Unnamed: 0,STATE,2010,2011,2012,2013,2014
0,Arizona,731.6,747.4,755.8,761.6,765.6
1,Florida,924.4,910.5,916.0,924.0,934.8
2,Illinois,778.8,792.5,795.7,802.1,817.5
3,Iowa,910.8,919.6,922.9,936.1,939.5
4,Kentucky,967.5,975.4,996.1,994.6,1015.9
5,Maryland,750.4,748.6,754.9,769.3,767.5
6,Minnesota,734.8,744.6,743.7,755.9,759.5
7,Nebraska,830.7,840.1,843.9,842.9,849.2
8,North Carolina,826.1,827.7,840.4,846.1,858.5
9,South Carolina,899.7,900.3,914.7,934.3,940.6


In [5]:
# Load total ER visits by state, 2010-2014
er_path = "Resources/total_er_visits.csv"
er_df = pd.read_csv(er_path)  # read_csv already returns a DataFrame
er_df

Unnamed: 0,State,2010,2011,2012,2013,2014
0,Arizona,2305413,2416553,2476208,2432305,2569082
1,Florida,8173500,8507584,9041333,9194744,9764626
2,Illinois,4955198,5140486,5305026,5092925,5245177
3,Iowa,1140898,1171011,1188770,1159998,1194712
4,Kentucky,2404944,2433143,2430001,2321513,2436880
5,Maryland,2408759,2498178,2619227,2530131,2527936
6,Minnesota,1801852,1876179,1789410,1749949,1865994
7,Nebraska,536858,538245,546455,551549,563255
8,North Carolina,4125701,5633259,4499568,4585990,4672977
9,South Carolina,2169044,2231509,2283494,2322938,2391435


In [5]:
# Load expenditure by state, 2010-2014
expenditure_path = "Resources/expenditure.csv"
expenditure_df = pd.read_csv(expenditure_path)  # read_csv already returns a DataFrame
expenditure_df

Unnamed: 0.1,Unnamed: 0,State,2010,2011,2012,2013,2014
0,0,Arizona,6027,6076,6183,6262,6452
1,1,Florida,7301,7408,7635,7688,8076
2,2,Illinois,7253,7429,7665,7911,8262
3,3,Iowa,7177,7416,7648,7806,8200
4,4,Kentucky,6898,7142,7289,7543,8004
5,5,Maryland,7748,7937,8115,8250,8602
6,6,Minnesota,7782,7968,8177,8465,8871
7,7,Nebraska,7524,7715,7979,8133,8412
8,8,North Carolina,6615,6808,7073,7027,7264
9,9,South Carolina,6554,6707,6853,7020,7311


In [3]:
# Load ER visit rates by state, 2010-2014
grouped_er_path = "Resources/ER_visits_rate_clean.csv"
grouped_df = pd.read_csv(grouped_er_path)  # read_csv already returns a DataFrame
grouped_df

Unnamed: 0.1,Unnamed: 0,2010,2011,2012,2013,2014
0,Arizona,35981.75607,37334.872323,37775.992536,36671.062019,38171.238526
1,Florida,43371.011396,44651.646332,46851.572162,47042.475652,49202.205936
2,Illinois,38590.372978,39949.519151,41180.065065,39494.951931,40709.223095
3,Iowa,37397.357039,38189.259103,38644.231988,37504.013098,38423.207423
4,Kentucky,55309.19711,55680.610258,55399.209274,52705.850782,55203.609864
5,Maryland,41611.793434,42781.276699,44491.77101,42715.696345,42434.378222
6,Minnesota,33927.892223,35094.066881,33281.175633,32325.774239,34231.644781
7,Nebraska,29343.846711,29241.766051,29485.4646,29569.249426,29971.197044
8,North Carolina,43091.307866,58329.850754,46151.895753,46589.794354,47045.506508
9,South Carolina,46790.514122,47763.524525,48406.246383,48759.424695,49577.630231


In [None]:
#2010 Uninsured vs Mortality ANOVA Test
mortality_anova_10 = scipy.stats.f_oneway(uninsured_df["2010"],
                                          mortality_df["2010"])

print(mortality_anova_10)

In [7]:
#2010 Uninsured vs ER ANOVA Test
er_anova_10 = scipy.stats.f_oneway(uninsured_df["2010"],
                                   er_df["2010"])

print(er_anova_10)

F_onewayResult(statistic=6.351761254768116, pvalue=0.019483926426570968)


In [8]:
#2010 Uninsured vs Expenditure ANOVA Test
expenditure_anova_10 = scipy.stats.f_oneway(uninsured_df["2010"],
                                            expenditure_df["2010"])

print(expenditure_anova_10)

F_onewayResult(statistic=11.283565271277174, pvalue=0.0028346121764053025)


In [9]:
#2011 Uninsured vs Mortality ANOVA Test
mortality_anova_11 = scipy.stats.f_oneway(uninsured_df["2011"],
                                          mortality_df["2011"])

print(mortality_anova_11)

F_onewayResult(statistic=11.21136058639101, pvalue=0.0029081766915106714)


In [21]:
#2011 Uninsured vs ER ANOVA Test
er_anova_11 = scipy.stats.f_oneway(uninsured_df["2011"],
                                   er_df["2011"])

print(er_anova_11)

F_onewayResult(statistic=7.033348884222337, pvalue=0.014559805084624079)


In [11]:
#2011 Uninsured vs Expenditure ANOVA Test
expenditure_anova_11 = scipy.stats.f_oneway(uninsured_df["2011"],
                                            expenditure_df["2011"])

print(expenditure_anova_11)

F_onewayResult(statistic=11.06714126387808, pvalue=0.003061481311349054)


In [12]:
#2012 Uninsured vs Mortality ANOVA Test
mortality_anova_12 = scipy.stats.f_oneway(uninsured_df["2012"],
                                          mortality_df["2012"])

print(mortality_anova_12)

F_onewayResult(statistic=11.323001957300848, pvalue=0.002795298916617509)


In [13]:
#2012 Uninsured vs ER ANOVA Test
er_anova_12 = scipy.stats.f_oneway(uninsured_df["2012"],
                                   er_df["2012"])

print(er_anova_12)

F_onewayResult(statistic=6.7526620230312, pvalue=0.016396720805420435)


In [14]:
#2012 Uninsured vs Expenditure ANOVA Test
expenditure_anova_12 = scipy.stats.f_oneway(uninsured_df["2012"],
                                            expenditure_df["2012"])

print(expenditure_anova_12)

F_onewayResult(statistic=11.170360018700608, pvalue=0.0029508825831652606)


In [15]:
#2013 Uninsured vs Mortality ANOVA Test
mortality_anova_13 = scipy.stats.f_oneway(uninsured_df["2013"],
                                          mortality_df["2013"])

print(mortality_anova_13)

F_onewayResult(statistic=11.120012688857067, pvalue=0.003004270448895195)


In [16]:
#2013 Uninsured vs ER ANOVA Test
er_anova_13 = scipy.stats.f_oneway(uninsured_df["2013"],
                                   er_df["2013"])

print(er_anova_13)

F_onewayResult(statistic=6.46450451340912, pvalue=0.018554893700544528)


In [17]:
#2013 Uninsured vs Expenditure ANOVA Test
expenditure_anova_13 = scipy.stats.f_oneway(uninsured_df["2013"],
                                            expenditure_df["2013"])

print(expenditure_anova_13)

F_onewayResult(statistic=10.965546725568366, pvalue=0.003174809371025765)


In [18]:
#2014 Uninsured vs Mortality ANOVA Test
mortality_anova_14 = scipy.stats.f_oneway(uninsured_df["2014"],
                                          mortality_df["2014"])

print(mortality_anova_14)

F_onewayResult(statistic=10.186823873344807, pvalue=0.004213841256332821)


In [19]:
#2014 Uninsured vs ER ANOVA Test
er_anova_14 = scipy.stats.f_oneway(uninsured_df["2014"],
                                   er_df["2014"])

print(er_anova_14)

F_onewayResult(statistic=8.017535743002378, pvalue=0.009713419730182178)


In [20]:
#2014 Uninsured vs Expenditure ANOVA Test
expenditure_anova_14 = scipy.stats.f_oneway(uninsured_df["2014"],
                                            expenditure_df["2014"])

print(expenditure_anova_14)

F_onewayResult(statistic=10.00437409628404, pvalue=0.004508216649936408)


In [6]:
#2014 ER vs 2014 Grouped ER
er_grouped = scipy.stats.f_oneway(er_df["2014"],
                                  grouped_df["2014"])

print(er_grouped)

F_onewayResult(statistic=16.09991899027026, pvalue=0.0005850848184336672)
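
The per-year cells above repeat the same `f_oneway` call with a different column name each time. One possible way to collapse them is a loop over the year labels, sketched below. The dictionaries `uninsured_by_year` and `er_by_year` are hypothetical stand-ins for `uninsured_df` and `er_df`, filled with the 2010 and 2014 columns transcribed from the tables displayed earlier. Rerunning on the tables as displayed may not reproduce the recorded statistics exactly if the CSV files changed between runs (the cell execution counts are out of order), so no expected output is shown.

```python
from scipy import stats

# Columns transcribed from the uninsured and ER tables displayed in this notebook
uninsured_by_year = {
    "2010": [1065000, 3941000, 1746000, 280000, 647000,
             641000, 476000, 208000, 1570000, 795000],
    "2014": [903000, 3245000, 1238000, 189000, 366000,
             463000, 317000, 179000, 1276000, 642000],
}
er_by_year = {
    "2010": [2305413, 8173500, 4955198, 1140898, 2404944,
             2408759, 1801852, 536858, 4125701, 2169044],
    "2014": [2569082, 9764626, 5245177, 1194712, 2436880,
             2527936, 1865994, 563255, 4672977, 2391435],
}

# Run the same one-way ANOVA for every year and keep the results by year
er_results = {
    year: stats.f_oneway(uninsured_by_year[year], er_by_year[year])
    for year in uninsured_by_year
}

for year, result in er_results.items():
    print(f"{year} Uninsured vs. ER: "
          f"statistic = {result.statistic} // pvalue = {result.pvalue}")
```

With the notebook's DataFrames, the same pattern would pass `uninsured_df[year]` and `er_df[year]` to `f_oneway` for each year from 2010 to 2014.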
