# Converting County Raw Data into Boundary GeoJSON

## Part 1: Introduction

This Jupyter Notebook creates a county boundary GeoJSON file from a county GeoJSON file or Shapefile.

## Part 2: Preparation

We will use **Jupyter Notebook (Anaconda 3)** to edit and run the script. Anaconda installation instructions can be found <a href='https://docs.anaconda.com/anaconda/install/'>here</a>. Note that this script runs on Python 3.

***You can usually download county boundary data from state data portals.***

To run this script you need:
- a county GeoJSON file or Shapefile stored in the **state** folder
- the directory path (**geojsons** folder > **state** folder)

The script currently writes one GeoJSON file:
- **state_County_boundaries.json**

>Originally created on Feb 4 2021<br>
@author: Yijing Zhou @YijingZhou33

## Part 3: Get Started

### Step 1: Import modules

In [None]:
## standard library
import json
import os
import string

## third-party packages
import folium
import geopandas as gpd
import pandas as pd

### Step 2: Manual items to change
> Uncomment one of the code blocks based on file type

In [None]:
###### Target state ######
state = 'New Hampshire'

###### Rawdata is Shapefile ######
countydata = 'New Hampshire_counties'

###### Rawdata is GeoJSON ######
# countydata = '' 

### Step 3: Set file path

In [None]:
rootpath = os.path.dirname(os.getcwd())
output = os.path.join(rootpath, 'geojsons', state, state + '_County_boundaries.json')
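
The path above assumes that a `geojsons/<state>` folder exists under the parent of the working directory. The sketch below shows one way that assumption could be checked before any data is read. `resolve_output_path` is a hypothetical helper, not part of this notebook, and a temporary directory stands in for the real project root.

```python
import os
import tempfile


def resolve_output_path(rootpath, state):
    """Return the boundary file path, failing early if the state folder is missing.

    Hypothetical helper for illustration; the notebook builds the path inline.
    """
    state_dir = os.path.join(rootpath, 'geojsons', state)
    if not os.path.isdir(state_dir):
        raise FileNotFoundError('Expected a state folder at ' + state_dir)
    return os.path.join(state_dir, state + '_County_boundaries.json')


# a temporary directory stands in for the real project root
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, 'geojsons', 'New Hampshire'))

demo_output = resolve_output_path(demo_root, 'New Hampshire')
print(os.path.basename(demo_output))

# a mistyped state name is reported instead of surfacing later as a read error
try:
    resolve_output_path(demo_root, 'New Hampshir')
    typo_detected = False
except FileNotFoundError:
    typo_detected = True
print(typo_detected)
```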

## Part 4: Build up county GeoJSON schema

### Step 4: Convert rawdata into GeoJSON
> Uncomment one of the code blocks based on file type

In [None]:
###### Rawdata is Shapefile ######
def shp_to_gdf(rawdata):
    path = os.path.join(rootpath, 'geojsons', state, rawdata)
    ## let geopandas infer the driver from the data source, then reproject to WGS 84 (EPSG:4326)
    shp = gpd.read_file(path).to_crs('EPSG:4326')
    return shp

# **************** uncomment **********************
gdf_county = shp_to_gdf(countydata)
# *************************************************

###### Rawdata is GeoJSON ######
# def geojson_to_gdf(rawdata):
#     path = os.path.join(rootpath, 'geojsons', state, rawdata + '.geojson')
#     geojson = gpd.read_file(path).to_crs('EPSG:4326')
#     return geojson

# **************** uncomment **********************
# gdf_county = geojson_to_gdf(countydata)
# *************************************************

### Step 5: Inspect the dataframe to find the county name column

We're going to extract the columns containing the county name and the geometry. Since column naming conventions differ between states, the county name column needs to be renamed.<br>
> By default, the script checks a list of common column names, but you may need to enter one manually.

In [None]:
gdf_county.head()

In [None]:
def rename(df):
    ## possible county column names in the dataframe, checked in this order
    clist = ['COUNTY', 'NAME', 'COUNTY_NAM', 'COUNTY_NAME', 'CTY_NAME']

    ## take the first match so that several matching columns are not joined into one invalid name
    matches = [c for c in clist if c in df.columns]
    if matches:
        cname = matches[0]
    else:
        ## you may need to enter one if it isn't in clist
        cname = input('Please enter the column storing county names: ').strip()

    df = df[[cname, 'geometry']].rename(columns={cname: 'County'})
    ## capitalize the first letter of each word in the county name and append ' County'
    df['County'] = df['County'].apply(lambda name: string.capwords(name) + ' County')

    return df

gdf_merged = rename(gdf_county)
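
To illustrate the column lookup and the name formatting on their own, here is a minimal sketch in plain Python. `pick_county_column` and the sample column list are hypothetical and exist only for this example.

```python
import string


def pick_county_column(columns, candidates):
    """Return the first candidate that appears in columns, or None if none match."""
    for name in candidates:
        if name in columns:
            return name
    return None


candidates = ['COUNTY', 'NAME', 'COUNTY_NAM', 'COUNTY_NAME', 'CTY_NAME']
# hypothetical column list in which two candidates are present
columns = ['FIPS', 'NAME', 'COUNTY', 'geometry']

chosen = pick_county_column(columns, candidates)
print(chosen)

# string.capwords capitalizes the first letter of each word and lowercases the rest
label = string.capwords('GRAFTON') + ' County'
print(label)
mixed_case = string.capwords('McHenry')
print(mixed_case)
```

Because `string.capwords` lowercases everything after the first letter of each word, names with internal capitals (such as "McHenry") lose them and may need manual correction.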

### Step 6: Convert GeoJSON into JSON

In [None]:
def conversion(inputfile):
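    ## serialize the GeoDataFrame to a GeoJSON string and parse it into a dict
    inputfile = json.loads(inputfile.to_json())
    ## flatten each feature into one dataframe row with dotted column names,
    ## e.g. 'properties.County' and 'geometry.coordinates'
    df = pd.json_normalize(inputfile['features'])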
    return df

df_merged = conversion(gdf_merged)
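
The later steps rely on dotted column names such as `properties.County` and `geometry.coordinates`. The sketch below is a plain-Python stand-in that shows where those names come from; it is not how pandas implements `json_normalize`, and the `flatten` helper and the sample feature (with made-up coordinates) are assumptions for illustration.

```python
def flatten(record, parent_key='', sep='.'):
    """Flatten nested dicts into one dict with dotted keys; lists are kept as values."""
    flat = {}
    for key, value in record.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# hypothetical feature with made-up coordinates
feature = {
    'type': 'Feature',
    'properties': {'County': 'Grafton County'},
    'geometry': {
        'type': 'Polygon',
        'coordinates': [[[-72.0, 43.5], [-71.5, 43.5], [-71.5, 44.0], [-72.0, 43.5]]],
    },
}

row = flatten(feature)
print(sorted(row))
```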

### Step 7: Round coordinates

In [None]:
def round_coordinates(coords, precision):
    ## recursively round every number in a nested coordinate list
    def round_element(e):
        if isinstance(e, list):
            return round_coordinates(e, precision)
        else:
            return round(e, precision)
    return [round_element(e) for e in coords]

## round every coordinate to 4 decimal places
df_merged['geometry.coordinates'] = round_coordinates(df_merged['geometry.coordinates'], 4)
df_merged.head()
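
As a worked example of the rounding, the sketch below applies the same recursive idea to a small nested list. The coordinates are made up for illustration. Four decimal places of a degree correspond to roughly 11 m of latitude.

```python
def round_coordinates(coords, precision):
    """Recursively round every number in an arbitrarily nested list."""
    return [
        round_coordinates(c, precision) if isinstance(c, list) else round(c, precision)
        for c in coords
    ]


# hypothetical coordinates, nested the way a Polygon ring is
ring = [[[-71.123456, 43.987654], [-71.000049, 44.000051]]]

rounded = round_coordinates(ring, 4)
print(rounded)
# → [[[-71.1235, 43.9877], [-71.0, 44.0001]]]
```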

## Part 5: Create County GeoJSON

### Step 8: Create GeoJSON features

In [None]:
def create_geojson_features(df):
    print('> Creating GeoJSON features...')
    features = []
    geojson = {
        'type': 'FeatureCollection',
        'features': features
    }
        
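    for _, row in df.iterrows():
        ## Polygon coordinates nest three levels deep and MultiPolygon four,
        ## so a number at the third level means Polygon
        if isinstance(row['geometry.coordinates'][0][0][0], (int, float)):
            geometry_type = 'Polygon'
        else:
            geometry_type = 'MultiPolygon'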
        feature = {
            'type': 'Feature',
            'geometry': {
                'type': geometry_type, 
                'coordinates': row['geometry.coordinates']
            },
            'properties': {
                'County': row['properties.County'], 
                'State': state
            }
        }

        features.append(feature)
    return geojson

data_geojson = create_geojson_features(df_merged)
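
The geometry type check above depends on how deeply the coordinates are nested: a Polygon is a list of rings of points, while a MultiPolygon is a list of Polygons. A minimal sketch of that idea, assuming only these two geometry types occur; `nesting_depth` and `geometry_type_from_depth` are hypothetical helpers and the coordinates are made up.

```python
def nesting_depth(coords):
    """Count how many list levels wrap the first number in coords."""
    depth = 0
    while isinstance(coords, list) and coords:
        depth += 1
        coords = coords[0]
    return depth


def geometry_type_from_depth(coords):
    """Map nesting depth to a geometry type (Polygon or MultiPolygon only)."""
    depth = nesting_depth(coords)
    if depth == 3:
        return 'Polygon'
    if depth == 4:
        return 'MultiPolygon'
    raise ValueError('Unexpected nesting depth: ' + str(depth))


# hypothetical ring with made-up coordinates
ring = [[-72.0, 43.5], [-71.5, 43.5], [-71.5, 44.0], [-72.0, 43.5]]
polygon = [ring]             # list of rings
multipolygon = [[ring], [ring]]  # list of polygons

print(geometry_type_from_depth(polygon))
print(geometry_type_from_depth(multipolygon))
```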

### Step 9: Generate GeoJSON file

In [None]:
print('> Creating GeoJSON file...')
with open(output, 'w') as outfile:
    json.dump(data_geojson, outfile)
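
A quick way to confirm that the file was written as expected is to read it back and list the county names. The sketch below does this against a small stand-in file in a temporary directory. `summarize_geojson` and the demo data are hypothetical.

```python
import json
import os
import tempfile


def summarize_geojson(path):
    """Load a GeoJSON file and return its top-level type and the county names."""
    with open(path) as infile:
        data = json.load(infile)
    counties = [feature['properties']['County'] for feature in data['features']]
    return data['type'], counties


# hypothetical stand-in for the file written above, with made-up coordinates
demo = {
    'type': 'FeatureCollection',
    'features': [{
        'type': 'Feature',
        'geometry': {
            'type': 'Polygon',
            'coordinates': [[[-72.0, 43.5], [-71.5, 43.5], [-71.5, 44.0], [-72.0, 43.5]]],
        },
        'properties': {'County': 'Grafton County', 'State': 'New Hampshire'},
    }],
}
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_County_boundaries.json')
with open(demo_path, 'w') as outfile:
    json.dump(demo, outfile)

collection_type, counties = summarize_geojson(demo_path)
print(collection_type, counties)
```

In the notebook itself, the same check could be run on `output` instead of `demo_path`.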

## Part 6: Inspect county boundary map

In [None]:
print('> Making map...')
## change the location ([latitude, longitude]) and zoom_start to center the map on your state
m = folium.Map(location=[42.3756, -93.6397], control_scale=True, zoom_start=4)

## check that the county boundary GeoJSON file renders properly
folium.GeoJson(
    output,
    tooltip=folium.GeoJsonTooltip(fields=('County', 'State'),
                                  aliases=('County', 'State')),
    show=True
).add_to(m)
m
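
Instead of typing the map center by hand, it could be derived from the data. The sketch below computes the midpoint of the bounding box over all features. `iter_points` and `center_of` are hypothetical helpers and the two square features are made up. Note that GeoJSON stores points as [longitude, latitude], while the `location` argument of `folium.Map` expects [latitude, longitude].

```python
def iter_points(coords):
    """Yield individual points from arbitrarily nested coordinate lists."""
    if isinstance(coords[0], (int, float)):
        yield coords
    else:
        for part in coords:
            yield from iter_points(part)


def center_of(geojson):
    """Return [latitude, longitude] of the bounding-box midpoint of all features."""
    lons, lats = [], []
    for feature in geojson['features']:
        for point in iter_points(feature['geometry']['coordinates']):
            lons.append(point[0])
            lats.append(point[1])
    # swap to [latitude, longitude], the order folium expects
    return [(min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2]


# hypothetical FeatureCollection with made-up coordinates
demo = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {},
         'geometry': {'type': 'Polygon', 'coordinates': [
             [[-72.0, 43.0], [-71.5, 43.0], [-71.5, 44.0], [-72.0, 43.0]]]}},
        {'type': 'Feature', 'properties': {},
         'geometry': {'type': 'MultiPolygon', 'coordinates': [
             [[[-71.5, 44.0], [-71.0, 44.0], [-71.0, 45.0], [-71.5, 44.0]]]]}},
    ],
}

center = center_of(demo)
print(center)
# → [44.0, -71.5]
```

With the notebook's own data, `center_of(data_geojson)` could be passed as `location`, with `zoom_start` raised to suit the size of the state.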