# Live Assignment 2: Roper Center
## DS 6001: Practice and Application of Data Science
### Drew Haynes (rbc6wr)

The [Roper Center for Public Opinion Research](https://ropercenter.cornell.edu/) describes itself as "the World’s Largest Public Opinion Archive." It is an online archive of the datasets behind public opinion polling results dating back to 1935. Rather than offering only aggregate measures (such as the overall presidential approval rating over time), it provides the individual responses that make up each poll: if a poll interviewed 1000 people, we get a spreadsheet with 1000 rows. That lets us correlate, visualize, and regress the responses to each poll question in any way we'd like.

1. Gallup Poll # 1936-0053: Teachers' Oath/Government Loans for Farmers/Employers Insurance Contributions/Presidential Candidates [Roper #31087039](https://ropercenter.cornell.edu/ipoll/study/31087039)
2. Kaiser Family Foundation: December 2021 COVID-19 Vaccine Monitor: Early Omicron Update [Roper #31119104](https://ropercenter.cornell.edu/ipoll/study/31119104)
3. USIA Poll # 2000-I20068: Economic Conditions/Government Approval/Security/Civilian Rule/International Relations/US [Roper #31086002](https://ropercenter.cornell.edu/ipoll/study/31086002)

In [1]:
import numpy as np
import pandas as pd
import os
import csv

In [2]:
url = "https://uvadatascienceproject.s3.us-east-2.amazonaws.com/USAIPO1936-0053.csv"
gallup1936 = pd.read_csv(url)

In [3]:
# Show up to 50 columns before pandas elides the middle ones with "..."
pd.set_option('display.max_columns', 50)
gallup1936

Unnamed: 0,form,state,region,female,age,class,OCCUPATION1,OCCUPATION2,OCCUPATION3,black,size,education,AGE_3WAY,AGE40,OCC8,prof,REGION4,EDU_RECODE,VOTE_PRO,VOTE_RETRO,PHONE_RECODE,CAR_RECODE,ballot,Q1,Q2,Q3,Q4A,Q4B,Q4C,Q5A,Q5B,farm,SIZE3,urban,StPOAbrv,SOUTH11,SOUTH11xBLACK,SOUTH12,SOUTH12xBLACK,south,SOUTHxBLACK,year,WtPubFeas,WtVotFeas
0,,Indiana,East Central,Male,,Av+,Skilled workers,,,,Urban,,,,Labor,Not Professional,Midwest,,Landon,Hoover,,,53,Yes,Yes,Yes,Roosevelt,Roosevelt,Landon,Landon,"Yes, voted for Hoover",Non-Farm,Urban,Urban,in,Non-South,,Non-South,,Non-South,,1936,,
1,,Illinois,East Central,Male,,Av+,Skilled workers,,,,Urban,,,,Labor,Not Professional,Midwest,,Landon,Hoover,,,53,Yes,,Yes,Landon,Landon,Landon,Landon,"Yes, voted for Hoover",Non-Farm,Urban,Urban,il,Non-South,,Non-South,,Non-South,,1936,,
2,,Michigan,East Central,Male,,Av,Business,,,,Urban,,,,Professional,Professional,Midwest,,Landon,Hoover,,,53,No,No,Yes,Landon,Landon,Landon,Landon,"Yes, voted for Hoover",Non-Farm,Urban,Urban,mi,Non-South,,Non-South,,Non-South,,1936,,
3,,Virginia,South and Southwest,Male,55 yrs and over,P or P+,Skilled workers,,,,Urban,,,,Labor,Not Professional,South,,fdr,fdr,,,53,Yes,Yes,Yes,Roosevelt,Roosevelt,Roosevelt,Roosevelt,"Yes, voted for Roosevelt",Non-Farm,Urban,Urban,va,South,,South,,South,,1936,,
4,,Florida,South and Southwest,Male,55 yrs and over,Av+,Skilled workers,,,,Small town,,,,Labor,Not Professional,South,,fdr,fdr,,,53,Yes,Yes,Yes,Roosevelt,Roosevelt,Roosevelt,Roosevelt,"Yes, voted for Roosevelt",Non-Farm,Rural Non-Farm,Non-Urban,fl,South,,South,,South,,1936,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5510,,Mississippi,South and Southwest,Male,55 yrs and over,Av+,Skilled workers,,,,Farm,,,,Labor,Not Professional,South,,fdr,fdr,,,53,Yes,Yes,Yes,Roosevelt,Roosevelt,Roosevelt,Roosevelt,"Yes, voted for Roosevelt",Farm,Farm,Non-Urban,ms,South,,South,,South,,1936,,
5511,,Indiana,East Central,Male,25-34 yrs,Av+,Skilled workers,,,,Farm,,,,Labor,Not Professional,Midwest,,Landon,Hoover,,,53,No,Yes,Yes,Landon,Landon,Roosevelt,Landon,"Yes, voted for Hoover",Farm,Farm,Non-Urban,in,Non-South,,Non-South,,Non-South,,1936,,
5512,,Illinois,East Central,Male,25-34 yrs,Av,Professiol,,,,Urban,,,,Professional,Professional,Midwest,,Landon,Hoover,,,53,No,No,Yes,Roosevelt,Roosevelt,Landon,Landon,"Yes, voted for Hoover",Non-Farm,Urban,Urban,il,Non-South,,Non-South,,Non-South,,1936,,
5513,,Michigan,East Central,Female,35-44 yrs,Av,Professiol,,,,Urban,,,,Professional,Professional,Midwest,,Landon,Hoover,,,53,Yes,Yes,Yes,Roosevelt,Landon,Landon,Landon,"Yes, voted for Hoover",Non-Farm,Urban,Urban,mi,Non-South,,Non-South,,Non-South,,1936,,
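Because each row is one respondent, individual answers can be cross-tabulated directly. As a minimal sketch, the snippet below rebuilds by hand the first five rows shown in the output above (only the `REGION4` and `VOTE_PRO` columns) and counts vote preference by region with `pd.crosstab`. The name `sample` is hypothetical; on the full data the same call would presumably take `gallup1936['REGION4']` and `gallup1936['VOTE_PRO']`.

```python
import pandas as pd

# Hand-copied from the first five rows displayed above; not the full dataset.
sample = pd.DataFrame({
    "REGION4": ["Midwest", "Midwest", "Midwest", "South", "South"],
    "VOTE_PRO": ["Landon", "Landon", "Landon", "fdr", "fdr"],
})

# Count respondents in each (region, vote preference) cell.
table = pd.crosstab(sample["REGION4"], sample["VOTE_PRO"])
print(table)
```

Note that the labels are not uniformly cased in the data (`Landon` versus `fdr`), so any grouping has to match the stored strings exactly.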


In [4]:
gallup1936.head(1).T

Unnamed: 0,0
form,
state,Indiana
region,East Central
female,Male
age,
class,Av+
OCCUPATION1,Skilled workers
OCCUPATION2,
OCCUPATION3,
black,


In [5]:
gallup1936.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5515 entries, 0 to 5514
Data columns (total 44 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   form           0 non-null      float64
 1   state          5515 non-null   object 
 2   region         5515 non-null   object 
 3   female         5514 non-null   object 
 4   age            5169 non-null   object 
 5   class          4619 non-null   object 
 6   OCCUPATION1    5513 non-null   object 
 7   OCCUPATION2    0 non-null      float64
 8   OCCUPATION3    0 non-null      float64
 9   black          2058 non-null   object 
 10  size           5514 non-null   object 
 11  education      0 non-null      float64
 12  AGE_3WAY       0 non-null      float64
 13  AGE40          0 non-null      float64
 14  OCC8           5513 non-null   object 
 15  prof           5513 non-null   object 
 16  REGION4        5515 non-null   object 
 17  EDU_RECODE     0 non-null      float64
 18  VOTE_PRO
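The `info()` summary lists several columns with 0 non-null values (for example `form`, `OCCUPATION2`, and `education`). One possible way to find and drop such columns, sketched here on a toy frame rather than the real data, is `isna().all()` together with `dropna(axis=1, how='all')`. The frame `toy` and its contents are stand-ins chosen for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in: two entirely empty columns and two populated ones.
toy = pd.DataFrame({
    "form": [np.nan, np.nan, np.nan],
    "state": ["Indiana", "Illinois", "Michigan"],
    "OCCUPATION2": [np.nan, np.nan, np.nan],
    "VOTE_PRO": ["Landon", "Landon", "Landon"],
})

# Columns in which every value is missing.
empty_cols = toy.columns[toy.isna().all()].tolist()

# Drop only the columns that are entirely missing; partially filled ones stay.
trimmed = toy.dropna(axis=1, how="all")

print(empty_cols)
print(trimmed.columns.tolist())
```

Applied to `gallup1936`, the same two calls would leave partially filled columns such as `age` and `class` in place, since `how="all"` removes a column only when every entry is missing.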

In [6]:
kaiser2021 = pd.read_stata("31119104.DTA")