## Challenge: What test to use

Using selected questions from the 2012 and 2014 editions of the European Social Survey, address the following questions.

Keep track of your code and results in a Jupyter notebook or other source that you can share with your mentor. For each question, explain why you chose the approach you did.

Here is the data file. And here is the codebook, with information about the variable coding and content.

In this dataset, the same participants answered questions in 2012 and again 2014.

Did people become less trusting from 2012 to 2014? Compute results for each country in the sample.

Did people become happier from 2012 to 2014? Compute results for each country in the sample.

Who reported watching more TV in 2012, men or women?

Who was more likely to believe people were fair in 2012, people living with a partner or people living alone?


In [1]:
import pandas as pd
import numpy as np
df = pd.read_csv('https://raw.githubusercontent.com/Thinkful-Ed/data-201-resources/master/ESS_practice_data/ESSdata_Thinkful.csv')

In [2]:
df.head()
df.isna()
df.describe()

Unnamed: 0,idno,year,tvtot,ppltrst,pplfair,pplhlp,happy,sclmeet,sclact,gndr,agea,partner
count,8594.0,8594.0,8586.0,8580.0,8555.0,8569.0,8563.0,8579.0,8500.0,8584.0,8355.0,8577.0
mean,39549.38,6.5,3.861985,5.559907,6.005143,5.319874,7.694616,5.192563,2.748941,1.497204,47.470736,1.384867
std,626725.9,0.500029,2.019689,2.2337,2.129866,2.173449,1.735904,1.457643,0.905477,0.500021,18.397369,0.486592
min,1.0,6.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,15.0,1.0
25%,1052.0,6.0,2.0,4.0,5.0,4.0,7.0,4.0,2.0,1.0,33.0,1.0
50%,1714.0,6.5,4.0,6.0,6.0,5.0,8.0,6.0,3.0,1.0,47.0,1.0
75%,2745.0,7.0,5.0,7.0,8.0,7.0,9.0,6.0,3.0,2.0,62.0,2.0
max,11001430.0,7.0,7.0,10.0,10.0,10.0,10.0,7.0,5.0,2.0,114.0,2.0


In [3]:
gdf = df.dropna().groupby(['cntry','year'],as_index=False)['ppltrst','happy'].mean()
gdf['moretrust'] = gdf['ppltrst'].diff()[gdf['year']==7]
gdf['happier'] = gdf['happy'].diff()[gdf['year']==7]

In [4]:
# Did people become more trusting?
gdf.sort_values(by='moretrust',ascending=False)[gdf['year']==7]

  


Unnamed: 0,cntry,year,ppltrst,happy,moretrust,happier
5,DE,7,5.357143,7.857143,0.28022,0.549451
11,SE,7,6.239908,7.93887,0.196834,0.037822
1,CH,7,5.764468,8.142665,0.078676,0.059332
3,CZ,7,4.356436,6.922442,-0.046227,0.132093
9,NO,7,6.599719,7.919944,-0.048586,-0.332881
7,ES,7,4.940035,7.450617,-0.187771,-0.107241


In [5]:
# In this case, the higher the trust, the higher the value in moretrust
# Negative change indicates trust decrease
# On average, trust slightly increased for Germany, Sweden and Switzerland
# and slightly decreased for Czech Republic, Norway and Spain

# Did people get happier?
gdf.sort_values(by='happier',ascending=False)[gdf['year']==7]

  import sys


Unnamed: 0,cntry,year,ppltrst,happy,moretrust,happier
5,DE,7,5.357143,7.857143,0.28022,0.549451
3,CZ,7,4.356436,6.922442,-0.046227,0.132093
1,CH,7,5.764468,8.142665,0.078676,0.059332
11,SE,7,6.239908,7.93887,0.196834,0.037822
7,ES,7,4.940035,7.450617,-0.187771,-0.107241
9,NO,7,6.599719,7.919944,-0.048586,-0.332881


In [6]:
# The higher the value, the higher the happiness, and happiness increase the most
# in Germany and increased slightly in Czech Republic, Switzerland, and Sweden
# It decreased slighly in Spain and more so in Norway

tdf = df.dropna().groupby(['gndr'],as_index=False)['tvtot'].mean()[df['year']==6]
tdf

  """


Unnamed: 0,gndr,tvtot
0,1.0,3.782842
1,2.0,3.854847


In [7]:
#  Surprising to me, females report watching more TV than males
fdf = df.dropna().groupby(['partner'],as_index=False)['pplfair'].mean()[df['year']==6]
fdf

  


Unnamed: 0,partner,pplfair
0,1.0,6.062238
1,2.0,5.913848


In [8]:
# People who live with a partner, on average, believe people are more fair than those who live alone
