## Did Louisville's unemployment rate recover from the recession more quickly than similar cities?

Importing libraries and setting the working directory

The altair package can be installed this way:
conda install altair --channel conda-forge

In [1]:
import sqlite3
import pandas as pd
import altair as alt
import os
os.chdir('C:/Users/natek/Documents/GitHub/code_lou_python')

Reading in an Excel sheet of jobs-related data for Louisville and its peer cities (source: Greater Louisville Project)

In [2]:
jobs_df = pd.read_excel('GLP-Codebook.xlsx', 'Jobs County', index_col=None, na_values=['NA'])

Creating a SQLite database and then querying back just the unemployment data by year and city, keeping only the cities that are currently peers

In [3]:
jobs_df.to_sql("jobs_table", sqlite3.connect("jobs.db"), if_exists = "replace")

In [4]:
jobs_df = pd.read_sql_query("SELECT year, city, unemployment FROM jobs_table WHERE current = 1", sqlite3.connect("jobs.db"))
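
The round trip above opens a new connection for each call. A minimal sketch of the same write-then-query pattern with a single connection that is closed afterwards is shown below. It assumes an in-memory database and made-up toy rows (`toy_df`, "Peer A", "Peer B" and the values are placeholders, not Greater Louisville Project data).

```python
import sqlite3
import pandas as pd

# Made-up rows standing in for the 'Jobs County' sheet (not real data).
toy_df = pd.DataFrame({
    "year": [2008, 2008, 2008],
    "city": ["Louisville", "Peer A", "Peer B"],
    "unemployment": [6.0, 5.0, 7.0],
    "current": [1, 1, 0],
})

# One in-memory connection shared by the write and the read.
conn = sqlite3.connect(":memory:")
try:
    # index=False keeps the DataFrame index out of the table.
    toy_df.to_sql("jobs_table", conn, if_exists="replace", index=False)
    result = pd.read_sql_query(
        "SELECT year, city, unemployment FROM jobs_table WHERE current = 1",
        conn,
    )
finally:
    conn.close()

print(result)
```

Only the rows with `current = 1` come back, so "Peer B" is dropped in this toy case.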

Converting year to a datetime, splitting the data into Louisville and its peers, and finding the yearly mean unemployment rate across the peer cities

In [5]:
jobs_df['year'] = pd.to_datetime(jobs_df['year'], format = "%Y")
lou_df = jobs_df[(jobs_df.city == "Louisville")]
peer_df = jobs_df[(jobs_df.city != "Louisville")]
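# Average only the numeric column; pandas 2.0+ raises an error if asked to average the text column 'city'
mean_df = peer_df.groupby('year', as_index = False)['unemployment'].mean()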
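
To make the peer average concrete, here is a small sketch on made-up numbers (the `toy_*` names, "Peer A", "Peer B" and the rates are placeholders). It selects the `unemployment` column explicitly before averaging, because pandas 2.0 and later no longer silently drop text columns such as `city` when computing a group mean.

```python
import pandas as pd

# Made-up rates for Louisville and two placeholder peers over two years.
toy_df = pd.DataFrame({
    "year": pd.to_datetime(["2008", "2008", "2008", "2009", "2009", "2009"], format="%Y"),
    "city": ["Louisville", "Peer A", "Peer B"] * 2,
    "unemployment": [6.0, 5.0, 7.0, 10.0, 9.0, 11.0],
})

# Drop Louisville, then average the remaining cities within each year.
toy_peers = toy_df[toy_df.city != "Louisville"]
toy_mean = toy_peers.groupby("year", as_index=False)["unemployment"].mean()

print(toy_mean)
```

Each year ends up with a single row holding the mean of the peer cities for that year.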

Renaming columns and merging the Louisville and peer data. Also melting the data to long format for graphing.

In [6]:
lou_df = lou_df.filter(items = ['year', 'unemployment'])
lou_df = lou_df.rename(columns = {"unemployment":"Louisville"})
mean_df = mean_df.rename(columns = {"unemployment":"Peers"})
df = pd.merge(lou_df, mean_df, how = 'outer', left_on = ['year'], right_on = ['year'])
df_graph = df.melt(id_vars = ['year'], value_vars = ['Louisville', 'Peers'], var_name = "City", value_name = "Unemployment")
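
The wide-to-long reshape can be easier to follow on a tiny example. The sketch below uses made-up values (`toy_lou`, `toy_peers` and the rates are placeholders) and shows that the merge gives one row per year, while the melt gives one row per year and series, which is the layout the chart's `color` encoding needs.

```python
import pandas as pd

years = pd.to_datetime(["2008", "2009"], format="%Y")

# Made-up values: one column per series, one row per year.
toy_lou = pd.DataFrame({"year": years, "Louisville": [6.0, 10.0]})
toy_peers = pd.DataFrame({"year": years, "Peers": [5.5, 9.5]})

# Wide format: columns year, Louisville, Peers.
toy_wide = pd.merge(toy_lou, toy_peers, how="outer", on="year")

# Long format: columns year, City, Unemployment.
toy_long = toy_wide.melt(
    id_vars=["year"],
    value_vars=["Louisville", "Peers"],
    var_name="City",
    value_name="Unemployment",
)

print(toy_wide.shape)
print(toy_long.shape)
```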

Graphing and saving the output as an .html file

In [7]:
chart = alt.Chart(df_graph).mark_line().encode(
    x='year',
    y='Unemployment',
    color='City'
)

In [8]:
# Altair 2 and later use Chart.save; the older savechart method is no longer available
chart.save('unemp.html')