# Customizable Index Workbook

This Jupyter notebook is designed to be customized by decision-makers who are allocating broadband resources in the Appalachian region.

Run each cell in order to load and clean the data, then follow the instructions to create your own prioritization system.

## Setup

In [None]:
#load packages
import pandas as pd
import geopandas as gpd
import numpy as np
import folium

Download the App_Data shapefile from the raw_data folder of the GitHub repository: https://github.com/CPLN-680-Spring-2022/Aiken_Ben_AppalachiaBroadband/tree/main/raw_data <br>
Then add the file's pathname to the code below.

In [None]:
#load data
app_data = gpd.read_file('___insert App_Data shapefile pathname here___')
#make geojson version
counties_geojson = app_data.to_crs(epsg=4326).to_json()

In [None]:
#clean up column names
#assign the result: the inplace argument of set_axis was removed in pandas 2.0
app_data = app_data.set_axis([
    'County',
    'STUSPS',
    'STATEFP',
    'COUNTYFP',
    'GEOID',
    'total pop',
    'pop_density',
    'pop_density_land',
    'Pop per sq mile',
    'pop_density_percentile',
    'male median age',
    'female median age',
    '16 to 19',
    '20 to 24',
    '25 to 44',
    '45 to 54',
    '55 to 64',
    '65+',
    'Under 18',
    '18 to 64',
    '16 to 44',
    '45 to 64',
    'pct 16 to 19',
    'pct 20 to 24',
    'pct 25 to 44',
    'pct 45 to 54',
    'pct 55 to 64',
    'pct under 18',
    'pct 18 to 64',
    'pct 16 to 44',
    'pct 45 to 64',
    'pct 65+',
    'Total Households for internet',
    'Total with Internet',
    'Total Households for computer',
    'Total Households for income and internet',
    '<10k',
    '10k to 20k',
    '20k to 35k',
    '35k to 50k',
    '50k to 75k',
    '>75k',
    'Total Households for age groups',
    'Total Households for education',
    'Less than HS grad',
    'HS grad but not college',
    "Bachelor's or higher",
    'Total Households for labor force',
    'Civilian labor force',
    'Civilian labor force, employed',
    'Civilian labor force, unemployed',
    'Not in labor force',
    'pct with internet',
    'pct with dial-up',
    'pct with broadband',
    'pct with cellular',
    'pct with BB:cable, fiber optic, or DSL',
    'pct with satellite',
    'pct with other',
    'pct without internet',
    'pct with computer',
    'pct with comp + dial-up',
    'pct with comp + broadband',
    'pct with comp no internet',
    'pct without a computer',
    'pct <10k',
    'pct <10k with dial-up',
    'pct <10k with broadband',
    'pct <10k without internet sub',
    'pct 10k-20k',
    'pct 10k-20k with dial-up',
    'pct 10k-20k with broadband',
    'pct 10k-20k without internet sub',
    'pct 20k-35k',
    'pct 20k-35k with dial-up',
    'pct 20k-35k with broadband',
    'pct 20k-35k without internet sub',
    'pct 35k-50k',
    'pct 35k-50k with dial-up',
    'pct 35k-50k with broadband',
    'pct 35k-50k without internet sub',
    'pct 50k-75k',
    'pct 50k-75k with dial-up',
    'pct 50k-75k with broadband',
    'pct 50k-75k without internet sub',
    'pct >75k',
    'pct >75k with dial-up',
    'pct >75k with broadband',
    'pct >75k without internet sub',
    'pct under 18, has a computer',
    'pct under 18, has a comp + dial-up',
    'pct under 18, has a comp + broadband',
    'pct under 18, has a comp, no internet',
    'pct under 18, no comp',
    'pct 18-64, has a computer',
    'pct 18-64, has a comp + dial-up',
    'pct 18-64, has a comp + broadband',
    'pct 18-64, has a comp, no internet',
    'pct 18-64, no comp',
    'pct 65+, has a computer',
    'pct 65+, has a comp + dial-up',
    'pct 65+, has a comp + broadband',
    'pct 65+, has a comp, no internet',
    'pct 65+, no comp',
    'pct <HS',
    'pct <HS, has a computer',
    'pct <HS, has a comp + dial-up',
    'pct <HS, has a comp + broadband',
    'pct <HS, has a comp, no internet',
    'pct <HS, no comp',
    'pct HS grad',
    'pct HS grad, has a computer',
    'pct HS grad, has a comp + dial-up',
    'pct HS grad, has a comp + broadband',
    'pct HS grad, has a comp, no internet',
    'pct HS grad, no comp',
    'pct Bach+',
    'pct Bach+, has a computer',
    'pct Bach+, has a comp + dial-up',
    'pct Bach+, has a comp + broadband',
    'pct Bach+, has a comp, no internet',
    'pct Bach+, no comp',
    'pct civilian-employed',
    'pct civilian-employed, has a computer',
    'pct civilian-employed, has a comp + dial-up',
    'pct civilian-employed, has a comp + broadband',
    'pct civilian-employed, has a comp, no internet',
    'pct civilian-employed, no comp',
    'pct civilian-unemployed',
    'pct civilian-unemployed, has a computer',
    'pct civilian-unemployed, has a comp + dial-up',
    'pct civilian-unemployed, has a comp + broadband',
    'pct civilian-unemployed, has a comp, no internet',
    'pct civilian-unemployed, no comp',
    'pct not in labor force',
    'pct not in labor force, has a computer',
    'pct not in labor force, has a comp + dial-up',
    'pct not in labor force, has a comp + broadband',
    'pct not in labor force, has a comp, no internet',
    'pct not in labor force, no comp',
    'ALAND',
    'AWATER',
    'SUBREGION',
    'Index, FY 2020',
    'Index Rank, FY 2020',
    'Index Percentile, FY 2020',
    'Economic Status, FY 2020',
    'Index, FY 2019',
    'Index Rank, FY 2019',
    'Index Percentile, FY 2019',
    'Economic Status, FY 2019',
    'Index, FY 2018',
    'Index Rank, FY 2018',
    'Index Percentile, FY 2018',
    'Economic Status, FY 2018',
    'Index, FY 2017',
    'Index Rank, FY 2017',
    'Index Percentile, FY 2017',
    'Economic Status, FY 2017',
    'Index, FY 2016',
    'Index Rank, FY 2016',
    'Index Percentile, FY 2016',
    'Economic Status, FY 2016',
    'Index, FY 2015',
    'Index Rank, FY 2015',
    'Index Percentile, FY 2015',
    'Economic Status, FY 2015',
    'Index, FY 2014',
    'Index Rank, FY 2014',
    'Index Percentile, FY 2014',
    'Economic Status, FY 2014',
    'Index, FY 2013',
    'Index Rank, FY 2013',
    'Index Percentile, FY 2013',
    'Economic Status, FY 2013',
    'Index, FY 2012',
    'Index Rank, FY 2012',
    'Index Percentile, FY 2012',
    'Economic Status, FY 2012',
    'Index, FY 2011',
    'Index Rank, FY 2011',
    'Index Percentile, FY 2011',
    'Economic Status, FY 2011',
    'Index, FY 2010',
    'Index Rank, FY 2010',
    'Index Percentile, FY 2010',
    'Economic Status, FY 2010',
    'Index, FY 2009',
    'Index Rank, FY 2009',
    'Index Percentile, FY 2009',
    'Economic Status, FY 2009',
    'Index, FY 2008',
    'Index Rank, FY 2008',
    'Index Percentile, FY 2008',
    'Economic Status, FY 2008',
    'Index, FY 2007',
    'Index Rank, FY 2007',
    'Index Percentile, FY 2007',
    'Economic Status, FY 2007',
    '5 year Economic Percentile',
    '5 year Economic Designation',
    'Access to Broadband',
    'Broadband Provider Count',
    'Lowest Broadband Price',
    'pct access to Broadband',
    'Higher Ed Inst Count',
    'Higher Ed Enrollment Count',
    'school density',
    'pct enrolled',
    'enrollment_percentile',
    'Mine Count 2020',
    'mine density',
    'mine per capita',
    'Mine Category',
    'bb_access_percentile',
    'bb_usage_percentile',
    'bb_price_percentile',
    'young_working_age_percentile',
    'hs_less_percentile',
    'hs_grad_percentile',
    'bach+_percentile',
    'Broadband Infrastructure Index',
    'Broadband Subsidy Index',
    'geometry',
], axis=1)
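Renaming by position only works when the list of new names has exactly one entry per existing column, in the same order. The sketch below illustrates that check on a toy table; the column names `NAME` and `B01003_001E` and the values are made up for illustration and are not taken from the App_Data shapefile.

```python
import pandas as pd

# Toy stand-in for app_data; the real table has far more columns.
toy = pd.DataFrame(
    {"NAME": ["Perry"], "GEOID": ["21193"], "B01003_001E": [28000]}
)

# Hypothetical replacement names: one per existing column, in the same order.
new_names = ["County", "GEOID", "total pop"]

# set_axis raises ValueError on a length mismatch; checking first
# produces a message that says how many names are expected.
if len(new_names) != len(toy.columns):
    raise ValueError(
        f"expected {len(toy.columns)} names, got {len(new_names)}"
    )

# set_axis returns a new DataFrame, so the result has to be assigned.
renamed = toy.set_axis(new_names, axis=1)
print(list(renamed.columns))
```

If the check fails on the real data, comparing `list(app_data.columns)` with the list of new names side by side should show where the two diverge.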

## Calculate Index Values

Now that the data is loaded, the next cell calculates an example Broadband Infrastructure Index (BII).

In [None]:
#Example Code:
app_data['Example Broadband Infrastructure Index'] = ((2 * (100 - app_data['bb_access_percentile'])) + #Low access x 2
                                              (1.5 * (100 - app_data['bb_usage_percentile'])) + #Low usage x 1.5
                                              (0.5 * (app_data['bb_price_percentile'])) + #highest cost x .5
                                              (1.25 * (100 * app_data['Index Percentile, FY 2020'])) + #low 2020 econ x 1.25
                                              (1.25 * (100 * app_data['5 year Economic Percentile'])) + #low 5-year econ x 1.25
                                              (0.5 * (app_data['Mine Category'])) + #high mine location x .5
                                              (1 * (app_data['young_working_age_percentile'])) + #high young working age x 1
                                              (1 * (100 - app_data['pop_density_percentile'])) +  #low density x 1
                                              (0.5 * (100 - app_data['enrollment_percentile'])) + #low ed access x .5 
                                              (0.5 * (app_data['hs_less_percentile'])) + #high <HS x .5
                                              (0.25 * (app_data['hs_grad_percentile'])) + #high HS+ x .25
                                              (0.25 * (100 - app_data['bach+_percentile'])) #low bach degree x .25
                                              )

The BII equation above is just an example. In the cell below, create your own BII using the following steps:

1) Insert the features you think are most important by copying their column names from the list in the Setup section.<br>
2) Decide whether each feature should raise the index when its value is high (leave it alone) or when its value is low (subtract it from 100, like the first feature in the example above).<br>
3) Multiply it by your desired weight (like 2 for the first feature in the example above).

Give the features you think are most important higher weights, and give the features you want to include but consider less important lower weights.
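The three steps can be sketched on a toy table. This is an illustration only: the counties, the percentile values, and the helper `weighted_index` are hypothetical, and the two features are just a subset of the columns used in the example above.

```python
import pandas as pd

# Toy percentile data (0-100) for three hypothetical counties.
toy = pd.DataFrame(
    {
        "County": ["A", "B", "C"],
        "bb_access_percentile": [10.0, 50.0, 90.0],
        "hs_less_percentile": [80.0, 40.0, 20.0],
    }
)

# Each entry is (column, weight, invert).
# invert=True means a low value should raise the index (step 2).
features = [
    ("bb_access_percentile", 2.0, True),   # low access x 2
    ("hs_less_percentile", 0.5, False),    # high <HS x .5
]


def weighted_index(df, features):
    """Sum weight * value over the chosen features (hypothetical helper)."""
    total = pd.Series(0.0, index=df.index)
    for column, weight, invert in features:
        values = 100 - df[column] if invert else df[column]
        total = total + weight * values
    return total


toy["Toy Index"] = weighted_index(toy, features)
print(toy[["County", "Toy Index"]])
```

County A has the lowest access and the highest share without a high school diploma, so it gets the highest score: 2 x (100 - 10) + 0.5 x 80 = 220.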

In [None]:
#Fill in:                                    (weight * (value))   
app_data['Broadband Infrastructure Index'] = ((()) + #feature 1
                                              (()) + #feature 2
                                              (()) + #feature 3
                                              (()) + #feature 4
                                              (()) #feature 5
                                              )

The next cell creates an example Broadband Subsidy Index (BSI).

In [None]:
#Example Code:
app_data['Example Broadband Subsidy Index'] = ((2 * (app_data['bb_access_percentile'])) + #High access x 2
                                       (2 * (100 - app_data['bb_usage_percentile'])) + #Low usage x 2
                                       (0.5 * (app_data['bb_price_percentile'])) + #high cost x .5
                                       (1 * (100 * app_data['Index Percentile, FY 2020']))) #low 2020 econ x 1

Follow the same steps as above to create your BSI in the cell below.

In [None]:
#Fill in:                                    (weight * (value))   
app_data['Broadband Subsidy Index'] = ((()) + #feature 1
                                              (()) + #feature 2
                                              (()) + #feature 3
                                              (()) #feature 4
                                              )

## Map Index Values

The cell below creates a map of your custom BII values. To save the map as an HTML file, uncomment the last line and add your desired pathname.

In [None]:
#map: BII
m = folium.Map(
    location=[37.59846053047453, -83.45363151803205],
    tiles='cartodbpositron',
    zoom_start=5.5
)

folium.Choropleth(
    geo_data=counties_geojson,
    data=app_data,
    key_on='feature.properties.GEOID',
    columns=["GEOID", 'Broadband Infrastructure Index'],
    fill_color='YlGnBu',
    fill_opacity=0.7,
    line_opacity=0.5,
    line_weight=0.1,
    legend_name='Broadband Infrastructure Index',
    name='choropleth',
).add_to(m)

m
#m.save('____insert pathname here______')

Next, the cell below will create a map with the values of your custom BSI.  Once again, you can save your map using the last line of code.

In [None]:
#map: BSI
m = folium.Map(
    location=[37.59846053047453, -83.45363151803205],
    tiles='cartodbpositron',
    zoom_start=5.5
)

folium.Choropleth(
    geo_data=counties_geojson,
    data=app_data,
    key_on='feature.properties.GEOID',
    columns=["GEOID", 'Broadband Subsidy Index'],
    fill_color='YlGnBu',
    fill_opacity=0.7,
    line_opacity=0.5,
    line_weight=0.1,
    legend_name='Broadband Subsidy Index',
    name='choropleth',
).add_to(m)

m
#m.save('____insert pathname here______')
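Both maps join the table to the GeoJSON through the `GEOID` key, so a county only receives a value when the same `GEOID` appears on both sides. The sketch below shows one way to compare the two sets of keys before mapping. It uses a small hand-built GeoJSON string and a toy table rather than the real `counties_geojson` and `app_data`, and the identifiers are placeholders.

```python
import json

import pandas as pd

# Hand-built stand-in for counties_geojson (geometries omitted for brevity).
toy_geojson = json.dumps(
    {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "properties": {"GEOID": "21193"}, "geometry": None},
            {"type": "Feature", "properties": {"GEOID": "54059"}, "geometry": None},
        ],
    }
)

# Toy stand-in for app_data.
toy = pd.DataFrame({"GEOID": ["21193", "47013"], "Toy Index": [220.0, 120.0]})

# Collect the join keys from each side.
geojson_ids = {
    feature["properties"]["GEOID"]
    for feature in json.loads(toy_geojson)["features"]
}
data_ids = set(toy["GEOID"])

# Keys present on one side only would not be joined on the map.
missing_from_data = sorted(geojson_ids - data_ids)
missing_from_geojson = sorted(data_ids - geojson_ids)

print("In GeoJSON but not in table:", missing_from_data)
print("In table but not in GeoJSON:", missing_from_geojson)
```

Two empty lists mean every county on the map has a matching row in the table.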

## List top 10 of each Index

Finally, the cell below will list the top 10 counties according to BII.

In [None]:
BII_order = app_data.sort_values(by=['Broadband Infrastructure Index'])
BII_order[-10:]

And the last cell will list the top 10 counties according to BSI.

In [None]:
BSI_order = app_data.sort_values(by=['Broadband Subsidy Index'])
BSI_order[-10:]
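The two cells above sort in ascending order and keep the last ten rows, so the highest-scoring county appears at the bottom of each list. As an alternative sketch, pandas' `nlargest` returns the same rows with the highest score first. The example uses a toy table with made-up scores and takes the top 3 instead of the top 10.

```python
import pandas as pd

# Toy scores for five hypothetical counties.
toy = pd.DataFrame(
    {
        "County": ["A", "B", "C", "D", "E"],
        "Toy Index": [30.0, 220.0, 120.0, 75.0, 180.0],
    }
)

# Highest scores first.
top3 = toy.nlargest(3, "Toy Index")

# Same rows as sorting ascending and slicing the tail, in reverse order.
tail3 = toy.sort_values(by=["Toy Index"])[-3:]

print(top3)
```

With the real data, the equivalent call would be `app_data.nlargest(10, 'Broadband Infrastructure Index')`, assuming that column has been filled in.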

## Save your list

Add a pathname below to save the table, including your index values, as a CSV file.

In [None]:
app_data.to_csv('____insert desired pathname here____')
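The cell above writes every column, including the geometry, which ends up in the CSV as text. If only the identifiers and index values are needed, a smaller file can be written instead. The sketch below is one possible approach on a toy table: the rows, the file name `toy_indexes.csv`, and the temporary folder are placeholders. It also reads the file back with `GEOID` as text, since identifiers that start with a zero would otherwise lose it.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for app_data; the geometry is represented by placeholder text.
toy = pd.DataFrame(
    {
        "County": ["A", "B"],
        "GEOID": ["01007", "47013"],
        "Toy Index": [220.0, 120.0],
        "geometry": ["(placeholder)", "(placeholder)"],
    }
)

# Placeholder output location in a temporary folder.
out_path = Path(tempfile.mkdtemp()) / "toy_indexes.csv"

# Keep only the columns of interest and skip the row index.
toy[["County", "GEOID", "Toy Index"]].to_csv(out_path, index=False)

# Read GEOID back as text so the leading zero is preserved.
reloaded = pd.read_csv(out_path, dtype={"GEOID": str})
print(reloaded)
```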