# Immigrant Project 

### Cross-Generational Differences in Hispanic Outcomes

1. Cross-Generational Differences in Hispanic Outcomes, 1994-2016: use main_national again, and plot the values for **White-All; Hispanic-All; Hispanic, 1st; Hispanic, 2nd; and Hispanic, 3rd** for each outcome listed below other than gen1-gen3.
2. Cross-Generational Differences in Hispanic Outcomes in Top Immigration States, 1994-2016: similar to (1) but broken down by state; side-by-side plots would be great. Again, skip gen1-gen3.

  * LTHS: “% less than HS diploma”
  * College: “% with college degree”
  * Hinsured: “% with health insurance”
  * rincp_all: “Average individual real income”
  * employed: “% employed”
  * married2: “% married”
  * children: “% with children”
  * poverty: “% of families under the poverty line”
  * age: “Average age”
  * rinch_all: “Median household income”
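
As a rough illustration of item 1, the five series can be picked out by pairing the race/ethnicity column with the generation column. The sketch below assumes the `wbhao` and `igen` columns found in the data files used in this notebook; the helper name `select_plot_groups` and the toy rows are hypothetical.

```python
import pandas as pd

# (wbhao, igen) pairs requested in item 1: White-All plus every Hispanic series.
PLOT_GROUPS = {
    ("White", "All"),
    ("Hispanic", "All"),
    ("Hispanic", "1st Generation"),
    ("Hispanic", "2nd Generation"),
    ("Hispanic", "3rd Generation"),
}


def select_plot_groups(df):
    """Keep only rows whose (wbhao, igen) pair is one of PLOT_GROUPS."""
    mask = pd.Series(
        [pair in PLOT_GROUPS for pair in zip(df["wbhao"], df["igen"])],
        index=df.index,
    )
    return df[mask]


# Toy rows with made-up values, only to show the filter.
toy = pd.DataFrame({
    "wbhao": ["White", "White", "Hispanic", "Hispanic"],
    "igen": ["All", "1st Generation", "All", "2nd Generation"],
    "lths": [0.08, 0.12, 0.40, 0.27],
})
selected = select_plot_groups(toy)
print(selected)
```

The White generation rows are filtered out, while every Hispanic row is kept.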

## Clean Data

In [1]:
import pandas as pd
import os
import plotly.graph_objs as go
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output, State

In [2]:
os.chdir(r'H:\CALDER\CALDER Data Visualizations\Data\Immigrant Project')

In [61]:
state = pd.read_csv('main_topstates.csv')

In [62]:
state = state.sort_values(by=['year', 'state', 'wbhao'])
state.head(9)

Unnamed: 0,year,wbhao,state,igen,lths,college,hinsured,employed,married2,children,gen1,gen2,gen3,rincp_all
23,1994,Hispanic,California,1st Generation,0.612092,0.046513,0.539978,0.644767,0.679131,0.782559,,,,16157.772
35,1994,Hispanic,California,2nd Generation,0.236203,0.074119,0.811544,0.704071,0.60013,0.574653,,,,25254.598
47,1994,Hispanic,California,3rd Generation,0.121056,0.053164,0.831612,0.767576,0.558505,0.703352,,,,29083.99
59,1994,Hispanic,California,All,0.462181,0.051881,0.636176,0.677115,0.644281,0.736546,0.659941,0.14824,0.191819,19324.695
22,1994,White,California,1st Generation,0.111483,0.225295,0.813284,0.713643,0.745518,0.583275,,,,32396.334
34,1994,White,California,2nd Generation,0.056032,0.253155,0.868273,0.784288,0.633254,0.520214,,,,41577.18
46,1994,White,California,3rd Generation,0.042512,0.215277,0.839792,0.794372,0.620122,0.485173,,,,40636.797
58,1994,White,California,All,0.049636,0.219552,0.840092,0.786551,0.632042,0.496733,0.085607,0.090224,0.824168,40408.973
19,1994,Hispanic,Florida,1st Generation,0.293482,0.091414,0.56535,0.710784,0.631509,0.587935,,,,18726.859


In [63]:
# Rename Columns
pre = ['lths', 'college', 'hinsured', 'employed', 'married2', 'children', 'gen1', 'gen2', 'gen3','rincp_all']
post = ["% less than High School diploma", "% with College Degree", "% with Health Insurance", "% Employed", "% Married", 
        "% with Children", "Share of 1st Generation", "Share of 2nd Generation", "Share of 3rd Generation", 
        "Median Individual Real Income"]
state = state.rename(columns=dict(zip(pre, post)))

state.columns

Index(['year', 'wbhao', 'state', 'igen', '% less than High School diploma',
       '% with College Degree', '% with Health Insurance', '% Employed',
       '% Married', '% with Children', 'Share of 1st Generation',
       'Share of 2nd Generation', 'Share of 3rd Generation',
       'Median Individual Real Income'],
      dtype='object')

In [64]:
state['wbhao_igen'] = state['wbhao'] + ' ' + state['igen']

In [66]:
state = state.drop(state[(state.wbhao_igen == 'White 1st Generation') | (state.wbhao_igen == 'White 2nd Generation')
                       | (state.wbhao_igen == 'White 3rd Generation')].index)

In [67]:
state = state.drop(['Share of 1st Generation','Share of 2nd Generation', 'Share of 3rd Generation'], axis=1)
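
The rename, label, and drop steps above could be gathered into one helper, which would also suit the national file since it has the same columns. This is only a sketch: the function name `clean_outcomes` is hypothetical, the column names and labels are copied from the cells above, and the toy frame holds made-up values.

```python
import pandas as pd

# Raw column name -> display label, copied from the rename cell.
LABELS = {
    "lths": "% less than High School diploma",
    "college": "% with College Degree",
    "hinsured": "% with Health Insurance",
    "employed": "% Employed",
    "married2": "% Married",
    "children": "% with Children",
    "gen1": "Share of 1st Generation",
    "gen2": "Share of 2nd Generation",
    "gen3": "Share of 3rd Generation",
    "rincp_all": "Median Individual Real Income",
}
SHARE_COLS = ["Share of 1st Generation", "Share of 2nd Generation",
              "Share of 3rd Generation"]
WHITE_GENERATIONS = ["White 1st Generation", "White 2nd Generation",
                     "White 3rd Generation"]


def clean_outcomes(df):
    """Rename columns, add the group label, drop unused rows and columns."""
    out = df.rename(columns=LABELS)
    out["wbhao_igen"] = out["wbhao"] + " " + out["igen"]
    # White is kept only as the overall comparison group.
    out = out[~out["wbhao_igen"].isin(WHITE_GENERATIONS)]
    # Drop the generation-share columns that are present.
    return out.drop(columns=[c for c in SHARE_COLS if c in out.columns])


# Toy frame with made-up values.
toy = pd.DataFrame({
    "year": [1994, 1994, 1994],
    "wbhao": ["Hispanic", "White", "White"],
    "igen": ["1st Generation", "1st Generation", "All"],
    "lths": [0.60, 0.10, 0.05],
    "gen1": [float("nan"), float("nan"), 0.09],
})
cleaned = clean_outcomes(toy)
print(cleaned)
```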

## Append "main_national" data set

The national data are tagged with the value "National" in the "state" column, so they can be selected alongside the individual states.

### Import and Clean "main_national" data

In [99]:
nat = pd.read_csv('main_national.csv')

In [100]:
nat = nat.sort_values(by=['year', 'wbhao', 'igen'])
nat.head(9)

Unnamed: 0,year,wbhao,igen,lths,college,hinsured,employed,married2,children,gen1,gen2,gen3,rincp_all
3,1994,Hispanic,1st Generation,0.529125,0.060532,0.544256,0.670069,0.673386,0.726153,,,,16157.772
5,1994,Hispanic,2nd Generation,0.27347,0.087691,0.752589,0.640134,0.551705,0.584468,,,,21515.689
7,1994,Hispanic,3rd Generation,0.208407,0.068863,0.763343,0.756557,0.587008,0.63558,,,,25852.436
9,1994,Hispanic,All,0.397849,0.068677,0.641707,0.682653,0.62584,0.673044,0.543867,0.230767,0.225366,19389.326
2,1994,White,1st Generation,0.118049,0.222853,0.815063,0.731381,0.746446,0.526193,,,,30699.768
4,1994,White,2nd Generation,0.054359,0.221735,0.867673,0.800871,0.666478,0.479842,,,,38809.355
6,1994,White,3rd Generation,0.080448,0.186397,0.861411,0.806955,0.691202,0.526112,,,,35270.801
8,1994,White,All,0.080278,0.189811,0.860092,0.803842,0.691756,0.523384,0.036445,0.059018,0.904537,35275.648
11,1995,Hispanic,1st Generation,0.534055,0.059078,0.526185,0.690245,0.669891,0.713,,,,16618.637


In [101]:
# Rename Columns
pre = ['lths', 'college', 'hinsured', 'employed', 'married2', 'children', 'gen1', 'gen2', 'gen3','rincp_all']
post = ["% less than High School diploma", "% with College Degree", "% with Health Insurance", "% Employed", "% Married", 
        "% with Children", "Share of 1st Generation", "Share of 2nd Generation", "Share of 3rd Generation", 
        "Median Individual Real Income"]
nat = nat.rename(columns=dict(zip(pre, post)))

nat.columns

Index(['year', 'wbhao', 'igen', '% less than High School diploma',
       '% with College Degree', '% with Health Insurance', '% Employed',
       '% Married', '% with Children', 'Share of 1st Generation',
       'Share of 2nd Generation', 'Share of 3rd Generation',
       'Median Individual Real Income'],
      dtype='object')

In [102]:
nat['wbhao_igen'] = nat['wbhao'] + ' ' + nat['igen']

In [104]:
nat = nat.drop(nat[(nat.wbhao_igen == 'White 1st Generation') | (nat.wbhao_igen == 'White 2nd Generation')
                       | (nat.wbhao_igen == 'White 3rd Generation')].index)

In [105]:
nat = nat.drop(['Share of 1st Generation','Share of 2nd Generation', 'Share of 3rd Generation'], axis=1)

In [106]:
nat['state'] = 'National'

### Append

In [107]:
# DataFrame.append was removed in pandas 2.0; pd.concat does the same stacking.
append = pd.concat([nat, state], sort=True)

In [108]:
append = append.sort_values(by=['year', 'wbhao', 'igen'])

In [109]:
append.head(8)

Unnamed: 0,% Employed,% Married,% less than High School diploma,% with Children,% with College Degree,% with Health Insurance,Median Individual Real Income,igen,state,wbhao,wbhao_igen,year
3,0.670069,0.673386,0.529125,0.726153,0.060532,0.544256,16157.772,1st Generation,National,Hispanic,Hispanic 1st Generation,1994
23,0.644767,0.679131,0.612092,0.782559,0.046513,0.539978,16157.772,1st Generation,California,Hispanic,Hispanic 1st Generation,1994
19,0.710784,0.631509,0.293482,0.587935,0.091414,0.56535,18726.859,1st Generation,Florida,Hispanic,Hispanic 1st Generation,1994
17,0.728158,0.68104,0.578963,0.716461,0.020023,0.54378,17773.551,1st Generation,Illinois,Hispanic,Hispanic 1st Generation,1994
15,0.696297,0.623001,0.325496,0.56939,0.069878,0.605457,20027.559,1st Generation,New Jersey,Hispanic,Hispanic 1st Generation,1994
13,0.593195,0.521073,0.353142,0.635117,0.064846,0.609515,14703.573,1st Generation,New York,Hispanic,Hispanic 1st Generation,1994
21,0.708456,0.734023,0.636191,0.741376,0.067662,0.452359,14541.995,1st Generation,Texas,Hispanic,Hispanic 1st Generation,1994
5,0.640134,0.551705,0.27347,0.584468,0.087691,0.752589,21515.689,2nd Generation,National,Hispanic,Hispanic 2nd Generation,1994
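
To illustrate how the combined frame is meant to be used, the sketch below tags toy national rows with `state = 'National'`, stacks them on toy state rows with `pd.concat`, and then filters on the `state` column. All values and the `_toy` names are made up for the example.

```python
import pandas as pd

# Toy national rows; they have no state column until one is assigned.
nat_toy = pd.DataFrame({
    "year": [1994, 1995],
    "wbhao_igen": ["Hispanic All", "Hispanic All"],
    "% Employed": [0.68, 0.69],
})
nat_toy["state"] = "National"

# Toy state rows.
state_toy = pd.DataFrame({
    "year": [1994, 1994],
    "state": ["California", "Texas"],
    "wbhao_igen": ["Hispanic All", "Hispanic All"],
    "% Employed": [0.67, 0.70],
})

# Stack the two frames; columns are matched by name.
combined = pd.concat([nat_toy, state_toy], ignore_index=True)

# "National" now behaves like any other value of the state column.
national_only = combined[combined["state"] == "National"]
print(sorted(combined["state"].unique()))
print(national_only)
```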


## Graph
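
Each trace in the dashboard takes its line style from the group label: dashed colored lines for the three Hispanic generations, a solid dark line for White All, and a dotted dark line for Hispanic All. One way to express that mapping is a dictionary lookup. The sketch below assumes labels of the form `'<group> <generation>'`; the names `LINE_STYLES` and `line_style_for` are hypothetical, and the colors and widths are copied from the callbacks in this section.

```python
# Substring of the group label -> line style.
LINE_STYLES = {
    "1st": {"color": "#6b6ecf", "width": 2, "dash": "dash"},
    "2nd": {"color": "#80b1d3", "width": 2, "dash": "dash"},
    "3rd": {"color": "#fdb462", "width": 2, "dash": "dash"},
    "White All": {"color": "#333333", "width": 3},
    "Hispanic All": {"color": "#333333", "width": 3, "dash": "dot"},
}


def line_style_for(label):
    """Return the style whose key occurs in the label, or {} if none does."""
    for key, style in LINE_STYLES.items():
        if key in label:
            return dict(style)  # copy so callers can modify it safely
    return {}


print(line_style_for("Hispanic 2nd Generation"))
print(line_style_for("White All"))
```

Returning an empty dictionary for an unknown label lets Plotly fall back to its default line style.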

In [113]:
app = dash.Dash()

app.css.append_css({"external_url": "https://codepen.io/chriddyp/pen/bWLwgP.css"}) 

df = append

states = list(df['state'].unique())
# Only outcomes that exist as columns in df; the poverty rate is not in these files.
outcomes = ["% less than High School diploma", "% with College Degree", "% with Health Insurance", "% Employed", 
            "% Married", "% with Children", "Median Individual Real Income"]

# Organize where items will be on the page
app.layout = html.Div([
        html.H3(
            children='Cross-Generational Differences in Hispanic Outcomes, 1994-2016',
            style={
                'textAlign': 'center', 'fontFamily' : 'Georgia'
            }
        ),
        html.Div([
            html.Div([
                    html.Center([          
                        html.Div([
                            html.Div([html.P('Select State',id='state-title1')],
                                style={'textAlign': 'center', 'fontFamily': 'Georgia'}),
                            dcc.Dropdown(
                                id='state-id1',
                                options=[{'label': i, 'value': i} for i in states],
                                value='California')
                            ],style={'width': '40%','textAlign': 'center', 'fontFamily': 'Georgia', 'display': 'inline-block'}),        
                        html.Div([
                            html.Div([html.P('Select Outcome',id='outcome-title1')],
                                style={'textAlign': 'center', 'fontFamily': 'Georgia'}),
                            dcc.Dropdown(
                                id='outcome-id1',
                                options=[{'label': i, 'value': i} for i in outcomes],
                                value='% less than High School diploma')
                            ],style={'width': '40%','textAlign': 'center', 'fontFamily': 'Georgia', 'display': 'inline-block'}),
                        ]),

                dcc.Graph(id='indicator-graphic1',
                          config={'modeBarButtonsToRemove': ['sendDataToCloud', 'lasso2d', 'zoomIn2d', 'zoomOut2d', 'pan2d', 
                                                             'zoom2d','resetScale2d'], 
                                'displaylogo': False})
                ], style={'width': '50%', 'display': 'inline-block'}),  
            html.Div([
                    html.Center([          
                        html.Div([
                            html.Div([html.P('Select State',id='state-title2')],
                                style={'textAlign': 'center', 'fontFamily': 'Georgia'}),
                            dcc.Dropdown(
                                id='state-id2',
                                options=[{'label': i, 'value': i} for i in states],
                                value='Texas')
                            ],style={'width': '40%','textAlign': 'center', 'fontFamily': 'Georgia', 'display': 'inline-block'}),        
                        html.Div([
                            html.Div([html.P('Select Outcome',id='outcome-title2')],
                                style={'textAlign': 'center', 'fontFamily': 'Georgia'}),
                            dcc.Dropdown(
                                id='outcome-id2',
                                options=[{'label': i, 'value': i} for i in outcomes],
                                value='% less than High School diploma')
                            ],style={'width': '40%','textAlign': 'center', 'fontFamily': 'Georgia', 'display': 'inline-block'}),
                        ]),

                dcc.Graph(id='indicator-graphic2',
                          config={'modeBarButtonsToRemove': ['sendDataToCloud', 'lasso2d', 'zoomIn2d', 'zoomOut2d', 'pan2d', 
                                                             'zoom2d','resetScale2d'], 
                                'displaylogo': False})
                ], style={'width': '50%', 'display': 'inline-block'}),             
        ]),
    ])
@app.callback(

    dash.dependencies.Output('indicator-graphic1', 'figure'),
    [dash.dependencies.Input('outcome-id1', 'value'),
     dash.dependencies.Input('state-id1', 'value'),
     dash.dependencies.Input('outcome-id2', 'value'),
     dash.dependencies.Input('state-id2', 'value')])
def outcome_time_series1(outcome_id, state_id, outcome_id2, state_id2):
    dff = df[['year', 'wbhao_igen', 'state',outcome_id]]
    dff = dff[dff['state'] == state_id]
    
    lines = {}
    data = []
    y_axis = {}
    legends={'orientation': 'h', 'xanchor': 'center', 'x': 0.5, 'y': -0.22}
    
    # Sets the range in each graph contingent on the other graphs options.
    if outcome_id==outcome_id2:
        graph2 = df[['year', 'wbhao_igen', 'state', outcome_id2]]
        graph2 = graph2[graph2['state'] == state_id2]

        # Use the combined min and max so both graphs share one y-axis range.
        dff_min = min(dff[outcome_id].min(), graph2[outcome_id2].min())
        dff_max = max(dff[outcome_id].max(), graph2[outcome_id2].max())

        if dff_min<.05 or graph2[outcome_id2].min()<.05:
            dff_min = 0

        ranges = [dff_min, dff_max]
    elif outcome_id!=outcome_id2:
        ranges = []
    
    # One line per group: the three Hispanic generations, Hispanic All, and White All
    generation = dff['wbhao_igen'].unique()
    for gen in generation:
        if '1st' in gen:
             lines = dict(
                 color = ("#6b6ecf"),
                 width = 2,
                 dash = 'dash')
        if '2nd' in gen:
             lines = dict(
                 color = ("#80b1d3"),
                 width = 2,
                 dash = 'dash')              
        if '3rd' in gen:
             lines = dict(
                 color = ("#fdb462"),
                 width = 2,
                 dash = 'dash')
        if 'White All' in gen:
              lines = dict(
                 color = ("#333333"),
                 width = 3)
        if 'Hispanic All' in gen:
               lines = dict(
                 color = ("#333333"),
                 width = 3,
                 dash = 'dot')
        trace = go.Scatter(
            x = dff[dff['wbhao_igen']==gen]['year'],
            y = dff[dff['wbhao_igen']==gen][outcome_id],
            mode='lines',
            name = gen,
            line = lines,
            opacity = 0.8
            )
        
        data.append(trace)
    if '%' in outcome_id:
        y_axis = {'title': '{0}'.format(outcome_id), 
                  'hoverformat': ',.2f',
                  'range' : ranges}
    else:
         y_axis = {'title': '{0}'.format(outcome_id), 
                  'hoverformat': ',.2f'}    
    return {
        'data' : data,
        'layout' : go.Layout(
            xaxis={'title': 'Year'},
            yaxis=y_axis,
            legend=legends,
        )
    }

@app.callback(

    dash.dependencies.Output('indicator-graphic2', 'figure'),
    [dash.dependencies.Input('outcome-id2', 'value'),
     dash.dependencies.Input('state-id2', 'value'),
     dash.dependencies.Input('outcome-id1', 'value'),
     dash.dependencies.Input('state-id1', 'value')])
def outcome_time_series2(outcome_id, state_id, outcome_id1, state_id1):
    dff = df[['year', 'wbhao_igen', 'state',outcome_id]]
    dff = dff[dff['state'] == state_id]
    lines = {}
    data = []
    y_axis = {}
    legends={'orientation': 'h', 'xanchor': 'center', 'x': 0.5, 'y': -0.22}

    # Sets the range in each graph contingent on the other graphs options.
    if outcome_id==outcome_id1:
        graph1 = df[['year', 'wbhao_igen', 'state', outcome_id1]]
        graph1 = graph1[graph1['state'] == state_id1]
        # Use the combined min and max so both graphs share one y-axis range.
        dff_min = min(dff[outcome_id].min(), graph1[outcome_id1].min())
        dff_max = max(dff[outcome_id].max(), graph1[outcome_id1].max())
        if dff_min<.05 or graph1[outcome_id1].min()<.05:
            dff_min = 0

        ranges = [dff_min, dff_max]  
    elif outcome_id!=outcome_id1:
        ranges = []
        
    y_axis = {'title': '{0}'.format(outcome_id), 
              'hoverformat': ',.2f',
              'range': ranges
            }
    
    # One line per group: the three Hispanic generations, Hispanic All, and White All
    generation = dff['wbhao_igen'].unique()
    for gen in generation:
        if '1st' in gen:
             lines = dict(
                 color = ("#6b6ecf"),
                 width = 2,
                 dash = 'dash')
        if '2nd' in gen:
             lines = dict(
                 color = ("#80b1d3"),
                 width = 2,
                 dash = 'dash')              
        if '3rd' in gen:
             lines = dict(
                 color = ("#fdb462"),
                 width = 2,
                 dash = 'dash')
        if 'White All' in gen:
              lines = dict(
                 color = ("#333333"),
                 width = 3)
        if 'Hispanic All' in gen:
               lines = dict(
                 color = ("#333333"),
                 width = 3,
                 dash = 'dot')
        trace = go.Scatter(
            x = dff[dff['wbhao_igen']==gen]['year'],
            y = dff[dff['wbhao_igen']==gen][outcome_id],
            mode='lines',
            name = gen,
            line = lines,
            opacity = 0.8
            )
        
        data.append(trace)
    

    return {
        'data' : data,
        'layout' : go.Layout(
            xaxis={'title': 'Year'},
            yaxis=y_axis,
            legend=legends
        )
    }
    
if __name__ == '__main__':
    app.run_server()
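
When both panels show the same outcome, the callbacks are meant to give them a common y-axis range so the two states can be compared directly. A minimal sketch of that rule as a standalone function follows; the name `shared_range` and the `floor_below` parameter are hypothetical, the 0.05 threshold mirrors the check in the callbacks, and the series values are made up.

```python
import pandas as pd


def shared_range(left, right, floor_below=0.05):
    """Return [low, high] covering both series.

    The lower bound snaps to 0 when the smallest value is under
    floor_below, mirroring the check in the callbacks.
    """
    low = float(min(left.min(), right.min()))
    high = float(max(left.max(), right.max()))
    if low < floor_below:
        low = 0
    return [low, high]


# Made-up outcome values for two states.
california = pd.Series([0.46, 0.40, 0.35])
texas = pd.Series([0.52, 0.47, 0.04])
print(shared_range(california, texas))
```

Computing the range in one place means both callbacks would apply the same rule instead of each repeating the comparison.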

 * Running on http://127.0.0.1:8050/ (Press CTRL+C to quit)
127.0.0.1 - - [04/Apr/2018 14:40:59] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:00] "GET /_dash-layout HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:00] "GET /_dash-dependencies HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:00] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:00] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:00] "GET /favicon.ico HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:03] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:03] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:07] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:07] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:11] "POST /_dash-update-component HTTP/1.1" 200 -
127.0.0.1 - - [04/Apr/2018 14:41:11] "POST /_dash-update-component HTTP/1.1" 2