### Exploring `pums_2017`
Column descriptions are taken from: https://data.census.gov/mdat/?#/search?ds=ACSPUMS1Y2016&vv=POWPUMA,PAP,NRC,RNTP,WAGP,OIP,POVPIP,JWMNP,GRPIP,AGEP,undefined&cv=SEX,ACR,FPARC,FES&rv=ucgid&nv=ACCESS,OCCP,WRK,WKW,SCHG,SCHL,SCH,WKL,WIF,NWLK,NWLA,NWAV,HUPAC,HHL,HHT,JWDP,JWAP&wt=PWGTP&g=7950000US5311612,5311613,5311614,5311615

In [3]:
# Imports:
import psycopg2
import pandas as pd

# Establish DB:
DBNAME = "opportunity_youth"

# Create a connection to db
conn = psycopg2.connect(dbname=DBNAME)

In [15]:
# nwav = 'Available to work'
# Values:
# 0 = NA (less than 16)
# 1 = Yes
# 2 = No, temporarily ill
# 3 = No, other reasons
# 4 = No, unspecified
# 5 = Did not report 

nwav_pums_2017 = pd.read_sql("SELECT nwav FROM pums_2017;", conn)
nwav_pums_2017.head()

Unnamed: 0,nwav
0,5
1,5
2,5
3,5
4,5
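
Because the first five rows all carry code 5, `head()` says little about how the codes are distributed. A minimal sketch of tallying them with readable labels might look like the following. The sample values are made up for illustration, `NWAV_LABELS` is a hypothetical helper name, and the codes are assumed to arrive as strings; adjust the keys if they turn out to be integers.

```python
import pandas as pd

# Labels copied from the comment block in the cell above.
NWAV_LABELS = {
    "0": "NA (less than 16)",
    "1": "Yes",
    "2": "No, temporarily ill",
    "3": "No, other reasons",
    "4": "No, unspecified",
    "5": "Did not report",
}

# Made-up sample standing in for nwav_pums_2017["nwav"].
sample = pd.Series(["5", "5", "1", "3", "5", "1"], name="nwav")

# Translate codes to labels, then count how often each label occurs.
nwav_counts = sample.map(NWAV_LABELS).value_counts()
print(nwav_counts)
```

Against the real data, the same `map(...).value_counts()` chain would be applied to `nwav_pums_2017["nwav"]` instead of `sample`.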


In [23]:
# agep = 'Age'
# Keep respondents aged 16 through 24 (BETWEEN is inclusive on both ends).
age_pums_2017 = pd.read_sql("SELECT agep FROM pums_2017 WHERE agep BETWEEN 16 AND 24;", conn)
age_pums_2017.head()

Unnamed: 0,agep
0,18.0
1,17.0
2,17.0
3,21.0
4,23.0
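
The 16 to 24 age window in the query above could be passed as query parameters rather than hard-coded into the SQL string. The sketch below illustrates the idea against an in-memory SQLite table with made-up ages, since the Postgres database is not available here; `MIN_AGE` and `MAX_AGE` are hypothetical names. Note that `sqlite3` uses `?` placeholders while `psycopg2` uses `%s`.

```python
import sqlite3

# In-memory SQLite table standing in for pums_2017 (made-up ages).
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE pums_2017 (agep REAL)")
demo_conn.executemany(
    "INSERT INTO pums_2017 (agep) VALUES (?)",
    [(15.0,), (16.0,), (21.0,), (24.0,), (25.0,)],
)

MIN_AGE, MAX_AGE = 16, 24  # hypothetical names for the age bounds

# BETWEEN is inclusive on both ends, so it keeps ages 16 through 24.
rows = demo_conn.execute(
    "SELECT agep FROM pums_2017 WHERE agep BETWEEN ? AND ? ORDER BY agep",
    (MIN_AGE, MAX_AGE),
).fetchall()
ages = [row[0] for row in rows]
print(ages)
demo_conn.close()
```

With pandas, the same bounds can be supplied through the `params` argument of `pd.read_sql`.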


In [21]:
# wkl = 'When last worked'
# Values:
# 0 = NA (less than 16)
# 1 = Within the past 12 months
# 2 = 1-5 years ago
# 3 = Over 5 years ago or never worked
wkl_pums_2017 = pd.read_sql("SELECT wkl FROM pums_2017;", conn)
wkl_pums_2017.head(10)

Unnamed: 0,wkl
0,3
1,3
2,3
3,3
4,1
5,1
6,1
7,1
8,1
9,1


Next, we need a single DataFrame that holds the variables relevant to our analysis:
- Age (agep)
- Sex (sex)
- Available to work (nwav)
- When last worked (?) (wkl)
- English ability (eng)

In [24]:
# Pull the selected variables for respondents aged 16 through 24 (inclusive).
first_df = pd.read_sql("SELECT agep, sex, nwav, wkl, eng FROM pums_2017 WHERE agep BETWEEN 16 AND 24;", conn)
first_df.head()

Unnamed: 0,agep,sex,nwav,wkl,eng
0,18.0,1,5,1,
1,17.0,2,5,3,
2,17.0,2,1,1,
3,21.0,1,5,1,
4,23.0,1,5,1,


In [25]:
first_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38170 entries, 0 to 38169
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   agep    38170 non-null  float64
 1   sex     38170 non-null  object 
 2   nwav    38170 non-null  object 
 3   wkl     38170 non-null  object 
 4   eng     7242 non-null   object 
dtypes: float64(1), object(4)
memory usage: 1.5+ MB
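
The `info()` output shows `sex`, `nwav`, and `wkl` stored as `object` rather than as numbers, and `eng` populated for only 7242 of the 38170 rows. One possible way to convert the coded columns to a nullable integer dtype is sketched below. The sample frame is made up, and the sketch assumes the codes are numeric strings, which has not been verified against the database.

```python
import pandas as pd

# Made-up rows shaped like first_df; None marks a missing eng value.
sample_df = pd.DataFrame({
    "agep": [18.0, 17.0, 21.0],
    "sex": ["1", "2", "1"],
    "nwav": ["5", "1", "5"],
    "wkl": ["1", "3", "1"],
    "eng": [None, "2", None],
})

code_cols = ["sex", "nwav", "wkl", "eng"]
for col in code_cols:
    # errors="coerce" turns unparseable values into NaN; "Int64" keeps
    # whole-number codes while still allowing missing values (<NA>).
    sample_df[col] = pd.to_numeric(sample_df[col], errors="coerce").astype("Int64")

# Fraction of rows with no eng value.
missing_share = sample_df["eng"].isna().mean()
print(sample_df.dtypes)
print(missing_share)
```

The nullable `Int64` dtype is used here because a plain `int64` column cannot hold the missing `eng` values.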
