# Processing the data


In [1]:
import pandas as pd
import numpy as np

# src is the local package. See README.md.
from src import get_cook_county_fcc_data
from src import get_max_speed_by_block_code
from src import get_cook_county_block_code_map
from src import combine_fastest_speed_with_map_data

running: /Users/erik/broadband_access_research/broadband-map-experiment/src/get_max_speed_by_block_code.py


The first step is to load the data from the FCC. The full file is large (~10 GB), so we extract the rows for
Cook County and save them as a .csv.

In [None]:
df = get_cook_county_fcc_data.read_fcc_cook()
get_cook_county_fcc_data.save_fcc_cook(df)
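
The internals of `read_fcc_cook` are not shown in this notebook, so the following is only a minimal sketch of one way to pull Cook County rows out of a file too large to load at once. It streams the CSV row by row with the standard library. The column name `BlockCode` and the helper `filter_cook_rows` are assumptions for illustration. The prefix `17031` is the Illinois state FIPS code (17) followed by the Cook County FIPS code (031), which is how census block codes for the county begin.

```python
import csv
import io

# State FIPS 17 (Illinois) + county FIPS 031 (Cook County).
COOK_COUNTY_PREFIX = "17031"


def filter_cook_rows(src, dst, block_col="BlockCode"):
    """Stream rows from src to dst, keeping only Cook County blocks.

    `block_col` is an assumed column name; adjust it to the real header.
    Only one row is held in memory at a time.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:
        if row[block_col].startswith(COOK_COUNTY_PREFIX):
            writer.writerow(row)
            kept += 1
    return kept


# Tiny in-memory stand-in for the real file (made-up rows).
sample = io.StringIO(
    "BlockCode,Consumer,MaxAdDown\n"
    "170310101001000,1,100\n"
    "170430101001000,1,25\n"
    "170310101001001,0,1000\n"
)
out = io.StringIO()
kept_count = filter_cook_rows(sample, out)
```

With real files, `src` and `dst` would be handles from `open(..., newline="")`. Reading the block code as text rather than as a number keeps the prefix test simple and avoids any loss of leading zeros for other states.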


The next step is to select the Cook County records that are consumer listings and save
the fastest speed for each census block. (There are multiple records per block.)
Note: it might be a good idea to refactor and separate the "isolate Cook County" step from
the "get max speeds" step, so that it is easier to insert other analyses into the process.

In [None]:
df = get_max_speed_by_block_code.load_cook_county()
max_speed_by_block_code = get_max_speed_by_block_code.select_consumer(df)
get_max_speed_by_block_code.save_max_speeds(max_speed_by_block_code)
print('hello')
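
As a rough illustration of the "fastest speed per block" step, here is a minimal sketch that keeps only consumer rows and tracks the maximum speed seen for each block code. It is not the code in `src`. The column names `BlockCode`, `Consumer`, and `MaxAdDown`, and the convention that consumer listings are flagged with `"1"`, are assumptions.

```python
def max_speed_by_block(rows, block_col="BlockCode",
                       consumer_col="Consumer", speed_col="MaxAdDown"):
    """Return {block_code: fastest consumer speed} from an iterable of row dicts.

    Column names and the "1" consumer flag are assumptions for illustration.
    """
    fastest = {}
    for row in rows:
        if row[consumer_col] != "1":
            continue  # skip non-consumer (e.g. business-only) listings
        speed = float(row[speed_col])
        block = row[block_col]
        if speed > fastest.get(block, float("-inf")):
            fastest[block] = speed
    return fastest


# Made-up rows: two consumer records and one non-consumer record for the
# first block, one consumer record for the second.
rows = [
    {"BlockCode": "170310101001000", "Consumer": "1", "MaxAdDown": "100"},
    {"BlockCode": "170310101001000", "Consumer": "1", "MaxAdDown": "25"},
    {"BlockCode": "170310101001000", "Consumer": "0", "MaxAdDown": "1000"},
    {"BlockCode": "170310101001001", "Consumer": "1", "MaxAdDown": "10"},
]
fastest = max_speed_by_block(rows)
```

Note that the 1000 Mbps record does not count toward the first block because it is not a consumer listing. With a pandas DataFrame, the same idea is a filter followed by a group-by on the block code and a `max()` over the speed column.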

Meanwhile, I used the command line tool ogr2ogr to convert the
shapefiles in /data/raw/tl_2018_17_tabblock10/* into a GeoJSON file,
which is saved as /data/raw/tl_2018_17_tabblock10.geojson.
This file lists all the census blocks in Illinois, each with its block code.
It is fairly large (750 MB), so we will extract just the Cook County
data from it.

In [None]:
map_il = get_cook_county_block_code_map.load_block_codes()
map_cook_county = get_cook_county_block_code_map.get_cook_county_codes(map_il)
get_cook_county_block_code_map.save_cook_county_map(map_cook_county)
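
For readers without the `src` package at hand, the county extraction can be sketched as a filter over the features of the GeoJSON. This is a minimal sketch and not the actual `get_cook_county_codes` implementation. It assumes each feature carries a `COUNTYFP10` property, as the 2010 TIGER/Line block files do to my knowledge, and the helper name `select_county` is hypothetical.

```python
import json


def select_county(feature_collection, county_fips="031", prop="COUNTYFP10"):
    """Return a new FeatureCollection with only the features of one county.

    `prop` is the assumed name of the county FIPS property; "031" is Cook County.
    """
    features = [
        feature
        for feature in feature_collection["features"]
        if feature["properties"].get(prop) == county_fips
    ]
    return {"type": "FeatureCollection", "features": features}


# Made-up two-feature collection standing in for the statewide file.
map_il_sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"COUNTYFP10": "031", "GEOID10": "170310101001000"}},
        {"type": "Feature", "geometry": None,
         "properties": {"COUNTYFP10": "043", "GEOID10": "170430101001000"}},
    ],
}
cook_sample = select_county(map_il_sample)
cook_json = json.dumps(cook_sample)  # serialized form, ready to write to disk
```

Loading a 750 MB GeoJSON with `json.load` needs a good deal of memory. If that becomes a problem, filtering at conversion time or using a geospatial library would be alternatives, but neither is shown here.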

Next, combine the Cook County speed-by-block-code data with the
Cook County block code map data. The result will be saved to
a GeoJSON file.

In [2]:
map_cook_county = combine_fastest_speed_with_map_data.load_cook_county_map()
speeds = combine_fastest_speed_with_map_data.load_max_speeds()
map_d = combine_fastest_speed_with_map_data.combine_map_and_speeds(map_cook_county, speeds)
combine_fastest_speed_with_map_data.save_speed_map(map_d)

tl_2018_17_tabblock10.geojson loaded as map_d
max_speed_by_block_code.csv loaded as speed_by_block
combining data complete
writing to cook_county_map_with_top_speeds.geojson done
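
The combining step amounts to a join on the block code. The sketch below shows the idea with plain dictionaries. It is an assumption about the approach, not the code in `combine_fastest_speed_with_map_data`. The property names `GEOID10` (the 15-digit block identifier in the map data) and `max_speed`, and the helper `attach_speeds`, are chosen for illustration.

```python
def attach_speeds(feature_collection, speeds,
                  id_prop="GEOID10", speed_prop="max_speed"):
    """Add a speed property to every feature, looked up by block code.

    Mutates the features in place and returns the same collection.
    Blocks with no speed record get None (null once written as JSON).
    """
    for feature in feature_collection["features"]:
        props = feature["properties"]
        props[speed_prop] = speeds.get(props[id_prop])
    return feature_collection


# Made-up map with two blocks, only one of which has a speed record.
map_sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"GEOID10": "170310101001000"}},
        {"type": "Feature", "geometry": None,
         "properties": {"GEOID10": "170310101001001"}},
    ],
}
speeds_sample = {"170310101001000": 100.0}
combined = attach_speeds(map_sample, speeds_sample)
```

The lookup only works if the keys on both sides have the same type. If the speeds were read from a .csv with the block code parsed as an integer, convert it to a string first.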


Now this file is ready to be uploaded to Mapbox as a tileset, where it
will be displayed with the census blocks color-coded according to
their maximum available internet speed.
