# Real Choropleths from Zip Codes with GeoTable

In the second part of the workshop, we will cover how to process data to generate the choropleths, including file formats, spatial reference systems, table joins and color stops.

We will use the following dataset, courtesy of [NYC Open Data](https://data.cityofnewyork.us).

- [NYC Zip Code Boundaries](https://data.cityofnewyork.us/Business/Zip-Code-Boundaries/i8iw-xf4u)

For more 2D spatial data, please see [NYC DoITT 2D Data](https://www1.nyc.gov/site/doitt/residents/gis-2d-data.page).

In [None]:
pip install geotable --upgrade

## File Formats

Standard file formats for geospatial data include GeoJSON, shapefile, KML and KMZ.

### Exercise: Determine File Format

1. Visit [NYC Police Precincts](https://data.cityofnewyork.us/Public-Safety/Police-Precincts/78dh-3ptz).
2. Click the Export tab and download each of the geospatial file formats.
3. Note the file extension and file size for each file format.
4. Visit [NYC Zip Code Boundaries](https://data.cityofnewyork.us/Business/Zip-Code-Boundaries/i8iw-xf4u).
5. Download the file and examine its contents to determine its file format.
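One way to approach the last step is to list the members of the downloaded archive and look at their extensions. The sketch below is a minimal illustration, not part of geotable: `guess_formats` and `FORMAT_BY_EXTENSION` are hypothetical helpers, and the member names are made up so that the example runs without downloading anything.

```python
import io
import zipfile
from os.path import splitext

# Extensions that hint at each geospatial format
FORMAT_BY_EXTENSION = {
    '.geojson': 'geojson',
    '.shp': 'shapefile',
    '.kml': 'kml',
    '.kmz': 'kmz',
}


def guess_formats(file_names):
    """Return the set of formats suggested by the file extensions."""
    formats = set()
    for name in file_names:
        extension = splitext(name)[1].lower()
        if extension in FORMAT_BY_EXTENSION:
            formats.add(FORMAT_BY_EXTENSION[extension])
    return formats


# Build a small in-memory archive with made-up member names
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    for name in ['ZIP_CODE.shp', 'ZIP_CODE.shx', 'ZIP_CODE.dbf', 'ZIP_CODE.prj']:
        archive.writestr(name, b'')

# List the members of the archive and guess the format
with zipfile.ZipFile(buffer) as archive:
    member_names = archive.namelist()
print(guess_formats(member_names))
```

For a file on disk, passing its path to `zipfile.ZipFile` in place of `buffer` should give the real member names.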

In [None]:
# Load NYC zip codes via URL using geotable
import geotable
url = 'https://data.cityofnewyork.us/download/i8iw-xf4u/application%2Fzip'
t = raw_nyc_zip_code_table = geotable.load(url)
len(t)

In [None]:
# Show the first two rows
t[:2]

In [None]:
# Render geometries
t.draw()

In [None]:
# Render each geometry with a random color
from geotable import ColorfulGeometryCollection
ColorfulGeometryCollection(t.geometries)

In [None]:
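from os import makedirs
from os.path import join

datasets_folder = 'datasets'
# Make sure the folder exists before saving
makedirs(datasets_folder, exist_ok=True)
# Save as a geojson
t.save_geojson(join(datasets_folder, 'raw-nyc-zip-codes.geojson'))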

In [None]:
ls $datasets_folder -lh

In [None]:
# Save as a geojson using the longitude latitude coordinate system
t.save_geojson(join(
    datasets_folder, 'nyc-zip-codes.geojson',
), target_proj4=geotable.LONGITUDE_LATITUDE_PROJ4)

# Save as a shapefile using the longitude latitude coordinate system
t.save_shp(join(
    datasets_folder, 'nyc-zip-codes.shp.zip',
), target_proj4=geotable.LONGITUDE_LATITUDE_PROJ4)

# Save as a kmz using the longitude latitude coordinate system
t.save_kmz(join(
    datasets_folder, 'nyc-zip-codes.kmz',
), target_proj4=geotable.LONGITUDE_LATITUDE_PROJ4)

# Save as a csv using the longitude latitude coordinate system
t.save_csv(join(
    datasets_folder, 'nyc-zip-codes.csv',
), target_proj4=geotable.LONGITUDE_LATITUDE_PROJ4)

In [None]:
ls $datasets_folder/*.geojson -lh

In [None]:
ls $datasets_folder -lh

In [None]:
# Load geojson
t = geotable.load(join(datasets_folder, 'nyc-zip-codes.geojson'))
t[:2]

In [None]:
# Load kmz
t = geotable.load(join(datasets_folder, 'nyc-zip-codes.kmz'))
t[:2]

In [None]:
t['POPULATION'].describe()

In [None]:
from shapely.geometry import GeometryCollection
GeometryCollection(t.geometries).centroid.coords[0]

In [None]:
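import json
# Read the geojson saved in the datasets folder into a dictionary
with open(join(datasets_folder, 'nyc-zip-codes.geojson'), 'rt') as f:
    d = json.load(f)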

In [None]:
d.keys()

In [None]:
d['features'][0]['properties']
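To make the structure of the loaded dictionary easier to see, here is a minimal sketch of a GeoJSON feature collection built by hand. The zip code, population and square polygon are made-up values for illustration; only the nesting of `type`, `features`, `properties` and `geometry` is the point.

```python
import json

# A made-up feature collection with a single square polygon
feature_collection = {
    'type': 'FeatureCollection',
    'features': [{
        'type': 'Feature',
        'properties': {'ZIPCODE': '00000', 'POPULATION': 0},
        'geometry': {
            'type': 'Polygon',
            'coordinates': [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
        },
    }],
}

# Round trip through text, as when saving to and loading from a file
text = json.dumps(feature_collection)
d_example = json.loads(text)
print(sorted(d_example.keys()))
print(d_example['features'][0]['properties'])
```

Each feature pairs one geometry with one dictionary of properties, which is why a choropleth can color a shape by a property such as `POPULATION`.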

In [None]:
# Load geojson from shapely geometries

In [None]:
# Load shapely geometries from geojson
from shapely.geometry import shape
shape(d['features'][0]['geometry'])

In [None]:
from shapely.geometry import GeometryCollection
collection = GeometryCollection([shape(_['geometry']) for _ in d['features']])
collection.centroid.coords[0]

In [None]:
center_coordinates = collection.centroid.coords[0]
center_coordinates

In [None]:
# Get geojson
# Render geojson

In [None]:
from mapboxgl.utils import create_color_stops, create_numeric_stops
from mapboxgl.viz import ChoroplethViz
from os import getenv

MAPBOX_TOKEN = getenv('MAPBOX_TOKEN', 'YOUR-MAPBOX-TOKEN')

In [None]:
v = ChoroplethViz(
    d,
    style='mapbox://styles/mapbox/dark-v10',
    access_token=MAPBOX_TOKEN,
    color_property='POPULATION',
    color_stops=create_color_stops([0, 49.5, 27985, 54445, 109069], colors='Greens'),
    color_function_type='interpolate',
    line_stroke='--',
    line_color='Yellow',
    line_width=1,
    line_opacity=0.9,
    opacity=0.8,
    center=center_coordinates,
    zoom=9,
    below_layer='waterway-label',
    legend_layout='horizontal',
    legend_key_shape='bar',
    legend_key_borders_on=False)
v.show()

In [None]:
t[:3]

In [None]:
# load geojson
# load shapefile
# load kml
# load kmz

# transform shapefile -> geojson
# transform kml -> geojson
# transform kmz -> geojson

In [None]:
# Topics to cover: geojson, shapefile, kml
# How to transform between formats

# Load zip codes
# Load bbl buildings

## Spatial Reference Systems

In [None]:
t = geotable.load(join(datasets_folder, 'nyc-zip-codes.geojson'))

In [None]:
t.geometry_proj4[0]

In [None]:
t[:2]

In [None]:
# Load into the spherical mercator spatial reference
t = geotable.load(
    join(datasets_folder, 'nyc-zip-codes.geojson'),
    target_proj4=geotable.SPHERICAL_MERCATOR_PROJ4)
t.geometry_proj4[0]

In [None]:
t[:2]
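To give a feel for what changes between the two spatial references, the sketch below converts a single point between longitude latitude and spherical mercator using the standard formulas. This is only an illustration of the arithmetic, not how geotable performs the transformation; the function names and the sample point (a rough location in New York City) are assumptions.

```python
from math import atan, degrees, exp, log, pi, radians, tan

# Sphere radius used by spherical mercator, in meters
EARTH_RADIUS_IN_METERS = 6378137


def to_spherical_mercator(longitude, latitude):
    """Convert degrees of longitude and latitude into meters."""
    x = EARTH_RADIUS_IN_METERS * radians(longitude)
    y = EARTH_RADIUS_IN_METERS * log(tan(pi / 4 + radians(latitude) / 2))
    return x, y


def to_longitude_latitude(x, y):
    """Convert spherical mercator meters back into degrees."""
    longitude = degrees(x / EARTH_RADIUS_IN_METERS)
    latitude = degrees(2 * atan(exp(y / EARTH_RADIUS_IN_METERS)) - pi / 2)
    return longitude, latitude


# A rough, made-up sample point in New York City
x, y = to_spherical_mercator(-73.94, 40.70)
print(x, y)
print(to_longitude_latitude(x, y))
```

Coordinates in degrees stay within roughly ±180 and ±90, while spherical mercator coordinates are in meters and run into the millions, which is a quick way to tell the two apart when inspecting a table.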

In [None]:
t = geotable.load('https://data.cityofnewyork.us/download/i8iw-xf4u/application%2Fzip')

In [None]:
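# Load into longitude latitude spatial reference
t = geotable.load('https://data.cityofnewyork.us/download/i8iw-xf4u/application%2Fzip', target_proj4=geotable.LONGITUDE_LATITUDE_PROJ4)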

In [None]:
t['geometry_proj4'][0]

# How to transform between spatial reference systems

In [None]:
# How to identify which spatial reference system a dataset uses
# compare geopy geocode with actual
# get distance and rank
# https://spatialreference.org

## Table Joins

In [None]:
# How to join tables

In [None]:
import pandas as pd

In [None]:
url = 'https://data.cityofnewyork.us/api/views/kku6-nxdu/rows.csv'
statistics_table = pd.read_csv(url, dtype=str)

In [None]:
len(statistics_table)

In [None]:
statistics_table.iloc[0]

In [None]:
t.iloc[0]

In [None]:
joined_t = pd.merge(t, statistics_table, left_on='ZIPCODE', right_on='JURISDICTION NAME')

In [None]:
dict(joined_t.iloc[0])
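By default, `pd.merge` performs an inner join, so zip codes without a matching row in the statistics table are dropped silently. The sketch below shows one way to check for such rows using `how='left'` and `indicator=True`. The two small tables and their values are made up for illustration; only the column names `ZIPCODE` and `JURISDICTION NAME` come from the join above.

```python
import pandas as pd

# Made-up stand-ins for the zip code table and the statistics table
zip_table = pd.DataFrame({
    'ZIPCODE': ['10001', '10002', '10003'],
    'POPULATION': [100, 200, 300],
})
example_statistics_table = pd.DataFrame({
    'JURISDICTION NAME': ['10001', '10002', '10004'],
    'VALUE': ['1', '2', '3'],
})

# Keep every zip code and record where each row came from
checked_table = pd.merge(
    zip_table, example_statistics_table,
    left_on='ZIPCODE', right_on='JURISDICTION NAME',
    how='left', indicator=True)

# Zip codes that found no partner in the statistics table
unmatched_zip_codes = checked_table.loc[
    checked_table['_merge'] == 'left_only', 'ZIPCODE'].tolist()
print(unmatched_zip_codes)
```

Keeping both key columns as strings, as `dtype=str` does when reading the CSV, avoids mismatches between a text zip code and a numeric one.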

## Color Stops

In [None]:
# How to determine the minimum, maximum and in-between color stops

In [None]:
# Use describe to get the minimum, quartiles and maximum
t['POPULATION'].describe()
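The same five numbers that `describe` reports (minimum, 25%, 50%, 75%, maximum) can serve as breaks for the color stops. The sketch below computes them with the standard library so the arithmetic is visible. `get_quartile_stops` is a hypothetical helper and the population values are made up; with `method='inclusive'`, the quartiles should agree with the default percentiles that pandas reports.

```python
from statistics import quantiles


def get_quartile_stops(values):
    """Return [min, 25%, 50%, 75%, max] for use as color stop breaks."""
    sorted_values = sorted(values)
    # The inclusive method interpolates between the smallest and largest value
    q1, q2, q3 = quantiles(sorted_values, n=4, method='inclusive')
    return [sorted_values[0], q1, q2, q3, sorted_values[-1]]


# Made-up population values for illustration
populations = [0, 10, 20, 30, 40]
stops = get_quartile_stops(populations)
print(stops)
```

The resulting list can be passed as the first argument of `create_color_stops`, in the same position as the hard-coded breaks used in the choropleth above.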