## Statistical Analysis

In [56]:
[df_clean.fttrump1.quantile(.75),
df_clean.fttrump1.quantile(.25),
df_clean.fttrump1.quantile(.75) - df_clean.fttrump1.quantile(.25)]

[80.0, 0.0, 80.0]

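The table below shows a clear relationship between self-reported ideology and the candidates' thermometer ratings: the Biden-minus-Trump gap is positive for liberals and strongly negative for conservatives at every level of worry about the economy.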

In [58]:
df_clean['partisanship'] = df_clean['ftbiden1'] - df_clean['fttrump1']
pd.crosstab(df_clean['worry_covid_economy'], df_clean['self_ideology'], values=df_clean['partisanship'], aggfunc='mean').round(2)

| worry_covid_economy / self_ideology | Liberal | Moderate | Conservative |
|---|---|---|---|
| A little worried | 36.57 | -4.63 | -53.56 |
| Extremely worried | 40.57 | 22.21 | -33.80 |
| Moderately worried | 36.31 | 3.53 | -51.55 |
| Not at all worried | 25.91 | -6.38 | -60.71 |
| Very worried | 37.09 | 11.12 | -40.72 |


### Independent samples

In [59]:
from scipy import stats

fttrump_men = df_clean.query("sex=='Male'").fttrump1.dropna()
fttrump_women = df_clean.query("sex=='Female'").fttrump1.dropna()

stats.ttest_ind(fttrump_men, fttrump_women, equal_var=False) ## independent samples

Ttest_indResult(statistic=5.133795136669474, pvalue=3.0172390917880847e-07)

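The p-value (about 3.0e-07) is far below .05, so we reject the null hypothesis and conclude that there is a statistically significant difference between men and women in how highly they rate Trump, on average. Because the statistic is positive and men are the first sample, men give Trump the higher average rating.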
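Passing `equal_var=False` makes `stats.ttest_ind` run Welch's t-test, which does not assume the two groups share a variance. The sketch below shows how that statistic and its approximate degrees of freedom could be computed by hand. The helper name `welch_t` and the two small samples are hypothetical and are not taken from the survey data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    n_a, n_b = len(a), len(b)
    # Squared standard error contributed by each group: sample variance / n.
    se2_a = variance(a) / n_a
    se2_b = variance(b) / n_b
    # Difference in means divided by the combined standard error.
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical ratings for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
print(t_stat, dof)
```

The sign of `t_stat` follows the order of the arguments, so swapping the groups flips the sign without changing the p-value of a two-sided test.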

In [60]:
ftbiden_men = df_clean.query("sex=='Male'").ftbiden1.dropna()
ftbiden_women = df_clean.query("sex=='Female'").ftbiden1.dropna()

stats.ttest_ind(ftbiden_men, ftbiden_women, equal_var=False) ## independent samples

Ttest_indResult(statistic=1.6038731186457855, pvalue=0.10884548070401018)

We fail to reject the null hypothesis that women and men give Biden the same average thermometer rating.

#### Compare the average rating of Trump to the average rating of Biden

In [61]:
{'Trump': df_clean['fttrump1'].mean(),
 'Biden': df_clean['ftbiden1'].mean(),
 'difference': df_clean['fttrump1'].mean() - df_clean['ftbiden1'].mean()}

{'Trump': 42.41913439635535,
 'Biden': 45.16868257600523,
 'difference': -2.749548179649878}

### Paired samples

In [1]:
df_ttest = df_clean[['fttrump1', 'ftbiden1']].dropna()
stats.ttest_rel(df_ttest['fttrump1'], df_ttest['ftbiden1']) ## paired samples

NameError: name 'df_clean' is not defined

We reject the null hypothesis and conclude that there is a statistically significant difference between how voters rate Trump and how they rate Biden, on average.
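A paired test works on the per-respondent differences, which is why rows with a missing rating for either candidate are dropped first. As a minimal sketch, the paired t statistic is the mean difference divided by its standard error. The helper `paired_t` and the ratings below are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for a paired-samples t-test on equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    # One difference per respondent.
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    # Mean difference divided by the standard error of the differences.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical thermometer ratings from four respondents.
rating_first = [50, 60, 70, 80]
rating_second = [45, 62, 65, 70]
t_paired, df_paired = paired_t(rating_first, rating_second)
print(t_paired, df_paired)
```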

### Test of multiple comparisons: one-way ANOVA

We test whether the average age differs significantly among Democratic, independent, and Republican voters.

In [63]:
df_clean.loc[~(df_clean['partyID']=='something else')].groupby('partyID').agg({'age': 'mean'})

| partyID | age |
|---|---|
| Democrat | 49.941731 |
| Republican | 54.021494 |
| independent | 49.20341 |


In [64]:
stats.f_oneway(df_clean.query("partyID=='Democrat'").age.dropna(),
               df_clean.query("partyID=='Republican'").age.dropna(),
               df_clean.query("partyID=='independent'").age.dropna()) ## one-way ANOVA

F_onewayResult(statistic=23.18228867227828, pvalue=1.0233980844224508e-10)

The p-value is much smaller than .05, so we reject the null hypothesis and conclude that there is a statistically significant difference between these three parties in terms of voters' average age.
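The F statistic compares the variation between group means with the variation inside each group. A minimal sketch of that ratio follows. The helper `one_way_f` and the three small groups are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def one_way_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ages for three groups.
f_value, df_b, df_w = one_way_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f_value, df_b, df_w)
```

A significant F only says that at least one group mean differs. It does not say which one.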

### Tests of association: chi-square

Conservatives overwhelmingly decline to support both universal basic income and free college: 79% of conservatives do not favor both policies, compared with only 44% of liberals.

We test whether these differences are strong enough to conclude that support for both policies differs by ideology.

In [65]:
round(pd.crosstab(df_clean['favor_both'], df_clean['self_ideology'], normalize='columns')*100,2)

| favor_both / self_ideology | Liberal | Moderate | Conservative |
|---|---|---|---|
| False | 44.19 | 71.95 | 79.22 |
| True | 55.81 | 28.05 | 20.78 |


In [66]:
stats.chi2_contingency(pd.crosstab(df_clean['favor_both'], df_clean['self_ideology']).values) ## chi-square

(333.6356352497014,
 3.56403797143625e-73,
 2,
 array([[758.31428571, 507.45779221, 700.22792208],
        [429.68571429, 287.54220779, 396.77207792]]))

The p-value at 3.56e-73 is much less than .05. Therefore, we reject the null hypothesis and conclude that there is a statistically significant relationship between ideology and support for both universal basic income & free college.
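The pieces of the `chi2_contingency` output can be checked by hand. Under independence, each expected count is row total × column total / N, and the degrees of freedom are (rows − 1) × (columns − 1). The sketch below assumes the row and column totals obtained by summing the expected-frequency array printed above. It also uses the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability reduces to exp(−χ²/2).

```python
import math

# Margins inferred by summing the expected-frequency array in the output
# above (an assumption; the observed table itself is not printed).
row_totals = [1966, 1114]        # favor_both: False, True
col_totals = [1188, 795, 1097]   # Liberal, Moderate, Conservative
n = sum(row_totals)

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Degrees of freedom for an r x c table.
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

# Statistic reported by chi2_contingency above.
chi2_stat = 333.6356352497014

# Closed-form upper-tail probability; valid only when dof == 2.
p_value = math.exp(-chi2_stat / 2)

print(expected[0])
print(dof, p_value)
```

For other degrees of freedom the closed form no longer applies, and the p-value should come from the library.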