First, we import the `zipfile` module so that we can extract the Stata `dta` file from the downloaded archive.

In [1]:
import zipfile

We used the [ESS Cumulative Data Wizard](http://www.europeansocialsurvey.org/downloadwizard) to obtain a file containing all of the variables within the following categories:
* Subjective well-being, social exclusion; religion; perceived discrimination; national and ethnic identity (only the ESS standard variables)
* Gender, age and household composition
* Socio-demographic profile, including: type of area, education and occupation, union membership, income, marital status (only the ESS standard variables)

We did **not** use the country-specific variables because we relied on the recoding done by the ESS staff. For example, we did not need the raw responses to a question asked in German, because those answers are already incorporated into the ESS standard variables in a uniform manner.

In [2]:
with zipfile.ZipFile("output4909048052568705413.zip","r") as zip_ref:
    zip_ref.extractall("ESS_Cumulative")
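
The download wizard names the archive automatically, so it can be useful to see which files it contains while extracting. The following is a minimal sketch, not part of the original analysis: the helper `extract_with_listing` and the member name `example.dta` are hypothetical, and the demonstration runs on a throwaway archive so that it does not need the real ESS download.

```python
import os
import tempfile
import zipfile


def extract_with_listing(zip_path, target_dir):
    """Extract zip_path into target_dir and return the archive's member names."""
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        names = zip_ref.namelist()
        zip_ref.extractall(target_dir)
    return names


# Build a throwaway archive (hypothetical member name) to demonstrate the helper.
demo_dir = tempfile.mkdtemp()
demo_zip = os.path.join(demo_dir, "demo.zip")
with zipfile.ZipFile(demo_zip, "w") as zf:
    zf.writestr("example.dta", b"placeholder bytes, not real Stata data")

members = extract_with_listing(demo_zip, os.path.join(demo_dir, "extracted"))
print(members)
```

Called on the ESS archive, the returned list would show the exact `dta` file name to pass to `pd.read_stata`.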

In [3]:
import pandas as pd

In [4]:
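# Keep the raw numeric codes instead of converting them to labelled categoricals,
# and read Stata missing values as NaN.
ESS_data_frame = pd.read_stata("ESS_Cumulative/ESS1-6e01_1_F1.dta",
                               convert_categoricals=False,
                               convert_missing=False)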
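
With `convert_categoricals=False`, the columns hold numeric codes rather than the text labels stored in the `dta` file. If the labels are needed later, they can be read separately through the reader object that `pd.read_stata` returns when `iterator=True`. The sketch below illustrates this on a tiny stand-in file; the column name `happy` and its labels are made up for the example and are not ESS variables.

```python
import os
import tempfile

import pandas as pd

# Write a tiny stand-in .dta file with one labelled (categorical) column.
demo = pd.DataFrame(
    {"happy": pd.Categorical(["low", "high", "low"], categories=["low", "high"])}
)
path = os.path.join(tempfile.mkdtemp(), "demo.dta")
demo.to_stata(path, write_index=False)

# Reading without categorical conversion yields the underlying numeric codes.
codes = pd.read_stata(path, convert_categoricals=False)

# The value labels are still available from the reader object.
with pd.read_stata(path, iterator=True) as reader:
    value_labels = reader.value_labels()

print(codes["happy"].tolist())
print(value_labels)
```

The dictionary returned by `value_labels()` maps each label set to its code-to-text mapping, which can be applied to selected columns with `Series.map` when readable categories are wanted.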

In [5]:
ESS_data_frame.head()

Unnamed: 0,cntry,cname,cedition,cproddat,cseqno,name,essround,edition,idno,dweight,...,emprm14,emplnom,jbspvm,occm14,occm14a,occm14b,atncrse,fxltph,mbltph,inttph
0,BE,ESS1-6e01_1,1.1,09.03.2016,12394,ESS4e04_3,4,4.3,10202.0,1.0074,...,3.0,,,,,,1.0,1.0,1.0,2.0
1,BE,ESS1-6e01_1,1.1,09.03.2016,12395,ESS4e04_3,4,4.3,10203.0,1.0074,...,1.0,,2.0,,,5.0,1.0,2.0,1.0,2.0
2,BE,ESS1-6e01_1,1.1,09.03.2016,12396,ESS4e04_3,4,4.3,10207.0,1.0074,...,2.0,3.0,,,,4.0,1.0,1.0,1.0,1.0
3,BE,ESS1-6e01_1,1.1,09.03.2016,12397,ESS4e04_3,4,4.3,10208.0,1.0074,...,3.0,,,,,,1.0,1.0,1.0,2.0
4,BE,ESS1-6e01_1,1.1,09.03.2016,12398,ESS4e04_3,4,4.3,10302.0,1.0074,...,3.0,,,,,,1.0,1.0,1.0,1.0


We used the [IPUMS website for the CPS](https://cps.ipums.org/) to obtain data from the ASEC (the Annual Social and Economic Supplement of the Current Population Survey) for 2010, 2012, and 2014. We then used Stata to apply the data definitions provided by IPUMS, which produced a `dta` file.


In [6]:
with zipfile.ZipFile("CPS_data_even_years.zip","r") as zip_ref:
    zip_ref.extractall("CPS_data_even_years")

In [7]:
# Same reader settings as for the ESS file: raw numeric codes, missing values as NaN.
CPS_data_frame = pd.read_stata("CPS_data_even_years/CPS_data.dta",
                               convert_categoricals=False,
                               convert_missing=False)
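
Because the extract is meant to cover only 2010, 2012, and 2014, a quick check of the `year` column can confirm that no other survey years slipped in. This is a hedged sketch: the helper `unexpected_years` is hypothetical, and the demonstration frame contains made-up rows rather than CPS data.

```python
import pandas as pd


def unexpected_years(frame, expected_years, column="year"):
    """Return the sorted values of `column` that are not in `expected_years`."""
    found = set(frame[column].unique().tolist())
    return sorted(found - set(expected_years))


# Stand-in frame with made-up rows; 2013 is included on purpose to trigger the check.
demo = pd.DataFrame({"year": [2010, 2010, 2012, 2014, 2013]})
extra = unexpected_years(demo, [2010, 2012, 2014])
print(extra)
```

Applied to the real data, `unexpected_years(CPS_data_frame, [2010, 2012, 2014])` should return an empty list if the extract matches the description.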

In [8]:
CPS_data_frame.head()

Unnamed: 0,year,serial,numprec,hwtsupp,hhtenure,hhintype,region,statefip,statecensus,asecflag,...,educ99_mom,educ99_mom2,educ99_pop,educ99_pop2,educ99_sp,schlcoll_mom,schlcoll_mom2,schlcoll_pop,schlcoll_pop2,schlcoll_sp
0,2010,1,1,485.98999,2,1,11,23,11,1,...,,,,,,,,,,
1,2010,2,1,531.710022,1,1,11,23,11,1,...,,,,,,,,,,
2,2010,3,2,474.399994,1,1,11,23,11,1,...,,,,,7.0,,,,,0.0
3,2010,3,2,474.399994,1,1,11,23,11,1,...,,,,,9.0,,,,,0.0
4,2010,4,2,486.649994,1,1,11,23,11,1,...,,,,,10.0,,,,,0.0
