### This project provides an updated estimate of the number of Opportunity Youth in South King County, compared with the 2016 report used by The Seattle Times.
##### Updated by: Luluva Lakdawala, Jacob Prebys, Jason Wong

#### Data Source: American Community Survey 2017 5-year [(ACS)](https://www.census.gov/programs-surveys/acs/about.html) Public Use Microdata Sample [(PUMS)](https://www.census.gov/programs-surveys/acs/technical-documentation/pums.html).
##### The ACS provides vital information on an annual basis about America's people and places. PUMS provides untabulated records of individual people and housing units. Opportunity Youth are individuals aged 16-24 within the Road Map Project region who are neither enrolled in school nor working.

###### This code lets the notebook automatically re-import the source code located in src whenever it is edited

In [1]:
%load_ext autoreload
%autoreload 2

##### The project layout below shows where this notebook sits relative to the src module it imports from
```
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering)
│   │                     followed by the topic of the notebook, e.g.
│   │                     01_data_collection_exploration.ipynb
│   ├── exploratory    <- Raw, flow-of-consciousness, work-in-progress notebooks
│   └── report         <- Final summary notebook(s)
│
├── src                <- Source code for use in this project
│   ├── data           <- Scripts to download and query data
│   │   ├── sql        <- SQL scripts. Naming convention is a number (for ordering)
│   │   │                 followed by the topic of the script, e.g.
│   │   │                 03_create_pums_2017_table.sql
│   │   ├── data_collection.py
│   │   └── sql_utils.py
```

##### Make the src code importable by adding the directory two levels up to the Python path.

In [2]:
import os
import sys
module_path = os.path.abspath(os.path.join(os.pardir, os.pardir))
if module_path not in sys.path:
    sys.path.append(module_path)

##### This code downloads the required data and loads it into a SQL database.

In [3]:
from src.data import data_collection

In [None]:
data_collection.download_data_and_load_into_sql()

##### We import psycopg2, a PostgreSQL database adapter for Python, along with pandas and matplotlib for data manipulation and visualization.

In [4]:
import psycopg2
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [5]:
DBNAME = 'opportunity_youth'

In [6]:
conn = psycopg2.connect(dbname=DBNAME)

### EVALUATION:
##### Our analysis shows a 41% drop in the number of Opportunity Youth (OY). The 2017 5-year ACS PUMS dataset shows 11,115 OY, compared with 18,817 in the 2016 report.
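##### As a quick check on the 41% figure, the drop can be recomputed from the two totals quoted above. This is a minimal sketch; the variable names are illustrative and not taken from the project's src code.

```python
# Totals quoted in the evaluation above.
oy_2016_report = 18_817  # Opportunity Youth in the 2016 report
oy_2017_acs = 11_115     # Opportunity Youth in the 2017 5-year ACS PUMS

# Absolute and relative change, using the 2016 report as the baseline.
absolute_drop = oy_2016_report - oy_2017_acs
percent_drop = absolute_drop / oy_2016_report * 100

print(f"{absolute_drop:,} fewer Opportunity Youth ({percent_drop:.1f}% drop)")
```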

In [10]:
import matplotlib.image as mpimg
image = mpimg.imread("OY_table.png")
plt.imshow(image)
plt.show()


### Exploratory Data Analysis explained
##### Our analysis included exploring the data and looking for any missing values or objects that needed to be replaced or removed completely. The PUMAs classified as being in the Road Map Project region are PUMA codes 11610-11615. After querying the tables for each PUMA, we queried for specific age groups within each individual PUMA. Once we had the updated values from the database, we updated the table 'Opportunity Youth Status by Age'. The variables we used to collect the appropriate data were:
#### AGEP - Age
#### ESR - Employment Status Code
#### COW - Class of Worker
#### SCH - School Enrollment Status
#### SCHL - Educational Attainment
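
##### The filtering described above can be sketched as a single SQL query. This is a hedged illustration, not the project's actual query. It uses an in-memory sqlite3 database as a stand-in for the PostgreSQL database, a hypothetical table `pums_toy` with made-up rows, and the PUMS person weight `PWGTP` to turn sample records into population estimates. It assumes the usual PUMS codings (SCH = 1 for not enrolled, ESR = 3 for unemployed, ESR = 6 for not in the labor force) and assumes age bands of 16-18, 19-21, and 22-24. COW and SCHL are not needed for this basic count and are left out.

```python
import sqlite3

# In-memory stand-in for the project's PostgreSQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pums_toy "
    "(puma INTEGER, agep INTEGER, sch INTEGER, esr INTEGER, pwgtp INTEGER)"
)

# Made-up person records: (PUMA, age, school status, employment status, weight).
rows = [
    (11610, 17, 1, 6, 20),  # not enrolled, not in labor force -> counted
    (11612, 20, 1, 3, 15),  # not enrolled, unemployed         -> counted
    (11615, 24, 1, 6, 5),   # not enrolled, not in labor force -> counted
    (11613, 23, 1, 1, 30),  # employed                         -> excluded
    (11614, 19, 2, 6, 25),  # enrolled in school               -> excluded
    (11615, 30, 1, 6, 10),  # outside the 16-24 age range      -> excluded
    (11601, 18, 1, 6, 40),  # outside PUMAs 11610-11615        -> excluded
]
conn.executemany("INSERT INTO pums_toy VALUES (?, ?, ?, ?, ?)", rows)

# Weighted count of Opportunity Youth per age band within the region.
QUERY = """
SELECT CASE WHEN agep BETWEEN 16 AND 18 THEN '16-18'
            WHEN agep BETWEEN 19 AND 21 THEN '19-21'
            ELSE '22-24' END AS age_group,
       SUM(pwgtp) AS weighted_count
FROM pums_toy
WHERE puma BETWEEN 11610 AND 11615
  AND agep BETWEEN 16 AND 24
  AND sch = 1
  AND esr IN (3, 6)
GROUP BY age_group
ORDER BY age_group
"""

oy_by_age = dict(conn.execute(QUERY).fetchall())
oy_total = sum(oy_by_age.values())
print(oy_by_age, oy_total)
```

##### Summing the person weights rather than counting rows matters because each PUMS record represents many people. A plain row count would give the sample size, not the population estimate.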