## Deployment Code

The code below duplicates the parts of notebook 30 that are used for deployment in a web app.

In [3]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import ElasticNetCV

df = pd.read_csv('./data/clean_data.csv')
hdi = pd.read_csv('./data/country_hdi.csv')

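# Build the lookup key, e.g. "Wayne County" and "MI" -> "Wayne County, MI".
county = input('Input County').strip()
state = input('Input Two Letter State Code').strip().upper()
county_input = f'{county}, {state}'

# Fail early if the county is not in the data instead of erroring later on empty frames.
is_county = df['Description'] == county_input
if not is_county.any():
    raise ValueError(f'{county_input!r} not found in the Description column')

# HDI is the target; it and the related index columns are excluded from the features.
hdi_columns = ['LE', 'II', 'MYS', 'EI', 'HDI']
county_x = df[is_county].select_dtypes(include=[np.number]).drop(columns=hdi_columns)
county_y = df[is_county]['HDI']

# Every other county is used to fit the model.
X = df[~is_county].select_dtypes(include=[np.number]).drop(columns=hdi_columns)
y = df[~is_county]['HDI']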


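def colinearity_remover(df, max_corr, list_of_keepers=None):
    """Return a list of columns to drop because they are highly correlated with another column.

    The last column of df is treated as the reference column: it is never
    compared or dropped, and for each correlated pair the column with the
    lower correlation to it is dropped.
    """
    corr = df.corr()
    n_rows, n_cols = corr.shape
    target = df.columns[-1]
    target_corr = corr[target]
    drop_list = []

    # The last row and column (the reference column) are skipped.
    for i in range(n_rows - 1):
        for j in range(n_cols - 1):
            if i != j and abs(corr.iloc[i, j]) > max_corr:
                # Use positional access; integer keys on a label-indexed Series are deprecated.
                if target_corr.iloc[i] > target_corr.iloc[j]:
                    drop_list.append(target_corr.index[j])
                else:
                    drop_list.append(target_corr.index[i])

    drop_list = list(set(drop_list))

    # Never drop columns the caller asked to keep.
    for keeper in list_of_keepers or []:
        if keeper in drop_list:
            drop_list.remove(keeper)

    return drop_list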

drop_list = colinearity_remover(X, 0.67, list_of_keepers = [])
X.drop(columns = drop_list, inplace = True)
county_x.drop(columns = drop_list, inplace = True)


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 48228)
ss = StandardScaler()
Z_train = ss.fit_transform(X_train)
Z_test = ss.transform(X_test)
county_z = ss.transform(county_x)


alphas = [0.0009545454545454546]
enet_ratios = [0.2197979797979798]

enet = ElasticNetCV(alphas = alphas, l1_ratio=enet_ratios, cv = 5, max_iter = 3000, n_jobs = -1)

enet.fit(Z_train, y_train)

# Build a per-feature report comparing the selected county with all other counties.
report = pd.DataFrame(X.mean(), columns=['Mean Value'])
report['County Value'] = county_x.iloc[0]
Z = np.concatenate((Z_train, Z_test), axis=0)
z_mean = Z.mean(axis=0).reshape(1, -1)  # keep it 2D, as inverse_transform expects
report['County SDs from Mean'] = (county_z - z_mean).ravel()
report['Amount of Change'] = (ss.inverse_transform(county_z) - ss.inverse_transform(z_mean)).ravel()
# Predicted change in HDI if the county's value for a feature moved to the mean.
report['Impact on HDI'] = ((z_mean - county_z) * enet.coef_).ravel()
report = report.round(3)


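def get_hdi(my_hdi):
    """Return the country in `hdi` whose HDI is closest to my_hdi."""
    compare = 1  # smallest difference found so far; HDI values lie between 0 and 1
    country = None
    for i in range(len(hdi)):
        this_hdi = hdi.iloc[i, 1]  # column 0: country name, column 1: HDI
        if abs(my_hdi - this_hdi) < compare:
            compare = abs(my_hdi - this_hdi)
            country = hdi.iloc[i, 0]

    return country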
    
county_hdi = float(county_y.iloc[0])
ranked = report.sort_values(by='Impact on HDI', ascending=False)
# Total gain if the four features with the largest positive impact were brought to the mean.
improvement = ranked['Impact on HDI'].head(4).sum()
# Country names in country_hdi.csv carry leading whitespace, so strip it before printing.
currently_like = get_hdi(county_hdi).strip()
would_be_like = get_hdi(county_hdi + improvement).strip()
print(f"With an HDI of {round(county_hdi, 3)}, {county_input}, compared to the international community, currently has an HDI most similar to {currently_like}.")
print(f"According to our model, if {county_input} successfully remediated the top 4 items it could have an HDI more similar to {would_be_like}.")

ranked

Input County Wayne County
Input Two Letter State Code MI


With an HDI of 0.841, Wayne County, MI, compared to the international community, currently has an HDI most similar to Brunei.
According to our model, if Wayne County, MI successfully remediated the top 4 items it could have an HDI more similar to Slovakia.
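
The country comparison printed above is a nearest-value lookup on HDI. A minimal, dependency-free sketch of that idea follows. The country names and HDI values are made-up placeholders rather than rows from `country_hdi.csv`, and `closest_country` is a hypothetical helper name, not a function from the notebook.

```python
def closest_country(table, my_hdi):
    """Return the name whose HDI is nearest to my_hdi (the first entry wins on ties)."""
    name, _ = min(table, key=lambda row: abs(row[1] - my_hdi))
    return name

# Placeholder (name, HDI) pairs, for illustration only.
toy_table = [("Country A", 0.70), ("Country B", 0.84), ("Country C", 0.86)]

# 0.841 is the county HDI reported above; 0.016 is an assumed improvement.
current = closest_country(toy_table, 0.841)
improved = closest_country(toy_table, 0.841 + 0.016)
print(current, "->", improved)
```

With these placeholder values the lookup moves from the second entry to the third once the assumed improvement is added, which mirrors how the two printed sentences can name different countries.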


| Feature | Mean Value | County Value | County SDs from Mean | Amount of Change | Impact on HDI |
|---|---|---|---|---|---|
| Motor Vehicle Theft rate per 100000 | 106.465 | 859.079 | 7.354 | 752.615 | 0.006 |
| Percent of population that didn't work over the past year | 26.435 | 32.402 | 0.641 | 5.966 | 0.005 |
| % Smokers | 21.345 | 24.000 | 0.693 | 2.655 | 0.003 |
| % Low birthweight | 8.134 | 11.000 | 1.580 | 2.866 | 0.002 |
| Population growth, 2010-2016 | 0.343 | -2.911 | -0.696 | -3.253 | 0.002 |
| ... | ... | ... | ... | ... | ... |
| % With Access to Exercise Opportunities | 64.255 | 95.000 | 1.490 | 30.745 | -0.002 |
| Percent of returns with itemized deductions | 21.344 | 24.243 | 0.348 | 2.899 | -0.002 |
| Rural-urban continuum code (1-9) | 4.359 | 1.000 | -1.517 | -3.359 | -0.003 |
| Total civilian population aged 18-64 rate | 0.578 | 0.617 | 0.877 | 0.039 | -0.003 |
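
Each "Impact on HDI" entry is the model coefficient multiplied by the gap, in standard deviations, between the average county and the selected county. It can be read as the predicted change in HDI if that one feature moved to the mean. The sketch below works through the arithmetic in plain Python. All numbers, including the coefficient, are assumed for illustration and are not taken from the fitted model, and `impact_on_hdi` is a hypothetical helper name.

```python
def impact_on_hdi(county_value, mean, std, coef):
    """Predicted HDI change if county_value moved to the mean, for a linear model on z-scores."""
    county_z = (county_value - mean) / std
    mean_z = 0.0  # the mean maps to z = 0 when the scaler was fit on the same data
    return (mean_z - county_z) * coef

# Hypothetical feature: the county sits 2 standard deviations above the mean and
# the assumed coefficient is negative, so moving to the mean raises the predicted HDI.
impact = impact_on_hdi(county_value=30.0, mean=20.0, std=5.0, coef=-0.004)
print(round(impact, 3))
```

In the notebook the scaler is fit on the training split only, so the mean of the combined standardized data is close to zero rather than exactly zero, which is why the report subtracts `Z.mean(axis = 0)` explicitly.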
