# Converting County GeoJSON into Regular Bounding Boxes

## Part 1: Introduction

This Jupyter Notebook creates a rectangular bounding box for each county from a county boundary GeoJSON file or Shapefile.

## Part 2: Preparation

We will use **Jupyter Notebook (Anaconda 3)** to edit and run the script. Information on Anaconda installation can be found <a href='https://docs.anaconda.com/anaconda/install/'>here</a>. Please note that this script runs on Python 3.

***You can usually download a county boundary GeoJSON file from the state's data portal, which is typically an ArcGIS Hub site.***

To run this script you need:
- a county GeoJSON file or Shapefile stored in the **state** folder
- the directory path (**geojson** folder > **state** folder)

The script currently writes one GeoJSON file:
- **state_County_bbox.json**

>Original created on Jan 31 2021<br>
@author: Yijing Zhou @YijingZhou33

## Part 3: Get Started

###  Step 1: Import modules

In [None]:
import pandas as pd
import os
import geopandas as gpd
import json
from itertools import chain
import string
import folium

### Step 2: Manual items to change

In [None]:
state = 'Minnesota'

###### Rawdata is Shapefile ######
rawdata = 'shp_bdry_mcdistricts_2013'

###### Rawdata is GeoJSON ######
# rawdata = 'Maryland_Physical_Boundaries_-_County_Boundaries__Detailed_'

output = os.path.join('geojson', state, state + '_County_bbox.json')
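
Before running the rest of the notebook, it can help to confirm that the names set in Step 2 resolve to the expected location. The sketch below is one way to do that. `resolve_paths` is a hypothetical helper, not part of the notebook, and it assumes the folder layout described in Part 2 (a Shapefile referenced by name only, or a file ending in `.geojson`).

```python
import os

def resolve_paths(state, rawdata, is_geojson=False):
    """Return (input_path, output_path) following the geojson/<state>/ layout."""
    # Hypothetical helper: GeoJSON inputs get the '.geojson' suffix,
    # Shapefile inputs are referenced by name only, as in Step 3.
    name = rawdata + '.geojson' if is_geojson else rawdata
    input_path = os.path.join('geojson', state, name)
    output_path = os.path.join('geojson', state, state + '_County_bbox.json')
    return input_path, output_path

input_path, output_path = resolve_paths('Minnesota', 'shp_bdry_mcdistricts_2013')
if not os.path.exists(input_path):
    print('> Input not found:', input_path)
```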

## Part 4: Build up county GeoJSON schema

###  Step 3: Convert GeoJSON/Shapefile to JSON

In [None]:
def conversion(inputfile):
    ## convert file to json 
    inputfile = json.loads(inputfile.to_json())
    ## display features properties as dataframe
    df = pd.json_normalize(inputfile['features'])
    return df

###### Rawdata is Shapefile ######

# **************** uncomment **********************
shp = os.path.join('geojson', state, rawdata)
## read_file infers the format from the path, so no driver argument is passed
county_shp = gpd.read_file(shp).to_crs('EPSG:4326')
df = conversion(county_shp)
# *************************************************

###### Rawdata is GeoJSON ######

# **************** uncomment **********************
# geojson = os.path.join('geojson', state, rawdata + '.geojson')
# county_geojson = gpd.read_file(geojson).to_crs('EPSG:4326')
# df = conversion(county_geojson)
# *************************************************
df
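
To show where the column names used in Step 4 come from, the following sketch mimics what `pd.json_normalize` does to a single GeoJSON feature: nested dictionaries are flattened into dot-separated keys, while lists such as the coordinate array are kept as values. The `flatten` function and the sample feature are illustrative stand-ins, not the pandas implementation and not real county data.

```python
def flatten(record, prefix=''):
    """Flatten nested dicts into dot-separated keys; leave lists untouched."""
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + '.'))
        else:
            flat[name] = value
    return flat

# Made-up feature with the same shape as one entry of inputfile['features']
feature = {
    'type': 'Feature',
    'properties': {'NAME': 'EXAMPLE'},
    'geometry': {
        'type': 'Polygon',
        'coordinates': [[[-94.0, 45.0], [-93.0, 45.0], [-93.0, 46.0], [-94.0, 45.0]]],
    },
}

print(sorted(flatten(feature)))
# ['geometry.coordinates', 'geometry.type', 'properties.NAME', 'type']
```

This is why Step 4 looks for names such as `properties.NAME` and reads the geometry from `geometry.coordinates`.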

### Step 4: Clean up and format columns

In [None]:
def rename(df):
    ## possible county column names in the dataframe
    clist = ['properties.COUNTY', 'properties.NAME', 'properties.COUNTY_NAM', 'properties.COUNTY_NAME', 'properties.CTY_NAME']
    
    ## use the first candidate found; joining several matches would produce an invalid column name
    matches = [c for c in clist if c in df.columns]
    if matches:
        cname = matches[0]
    else:
        cname = input('Please enter the column storing county names: ').strip()
          
    df = df[[cname, 'geometry.coordinates']].rename(
            columns={cname:'County', 'geometry.coordinates':'boundingBox'})      
    ## capitalize the first letter of each word in the county name
    df['County'] = df['County'].apply(lambda row: string.capwords(row))
    df['State'] = state
    
    return df

df = rename(df)
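
Step 4 normalizes county names with `string.capwords`, which splits on whitespace, capitalizes each piece, and rejoins the pieces with single spaces. The short comparison below shows how that differs from `str.title` for names with repeated spaces or apostrophes. The sample names are only illustrations.

```python
import string

names = ['LAC QUI PARLE', 'st.  louis', "o'brien"]

for name in names:
    # capwords lowercases everything after the first letter of each word;
    # title() restarts capitalization after every non-letter character
    print(repr(string.capwords(name)), repr(name.title()))
```

Names that need internal capitals would have to be corrected by hand after this step, since `capwords` produces `O'brien` rather than `O'Brien`.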

###  Step 5: Create bounding box

In [None]:
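def bbox(points):
    ## keep only x and y so that positions carrying a z value do not break unpacking
    x_coordinates, y_coordinates = list(zip(*points))[:2]
    return ','.join(str(x) for x in [min(x_coordinates), min(y_coordinates), max(x_coordinates), max(y_coordinates)])

def to_bbox(coordinates):
    ## geometry is Polygon: coordinates[0] is the exterior ring
    if isinstance(coordinates[0][0][0], (int, float)):
        return bbox(coordinates[0])
    ## geometry is MultiPolygon: merge the exterior rings of all polygons
    return bbox(list(chain.from_iterable(polygon[0] for polygon in coordinates)))

def coordinates_bbox(df):
    ## assign the result back to the column; values written to rows yielded by iterrows() may not reach the dataframe
    df['boundingBox'] = df['boundingBox'].apply(to_bbox)
    return df

df = coordinates_bbox(df)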
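
As a worked example of this step, the sketch below computes the extent of one Polygon and one MultiPolygon in plain Python. `exterior_bbox` is a hypothetical name, the coordinates are made up, and it returns a tuple rather than the comma-separated string used in the notebook.

```python
from itertools import chain

def exterior_bbox(coordinates):
    """Return (minX, minY, maxX, maxY) for Polygon or MultiPolygon coordinates."""
    if isinstance(coordinates[0][0][0], (int, float)):
        # Polygon: the first ring is the exterior ring
        points = coordinates[0]
    else:
        # MultiPolygon: pool the exterior ring of every member polygon
        points = list(chain.from_iterable(polygon[0] for polygon in coordinates))
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

# Made-up coordinates, in (longitude, latitude) order
polygon = [[[-94.0, 45.0], [-93.0, 45.5], [-93.5, 46.0], [-94.0, 45.0]]]
multipolygon = [
    polygon,
    [[[-92.0, 44.0], [-91.5, 44.0], [-91.5, 44.5], [-92.0, 44.0]]],
]

print(exterior_bbox(polygon))       # (-94.0, 45.0, -93.0, 46.0)
print(exterior_bbox(multipolygon))  # (-94.0, 44.0, -91.5, 46.0)
```

Only exterior rings are considered, which is sufficient because interior rings (holes) lie inside the exterior ring and cannot extend the extent.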

###  Step 6: Round coordinates to 2 decimal places

In [None]:
## create regular bounding box coordinate pairs and round them to 2 decimal places
df = pd.concat([df, df['boundingBox'].str.split(',', expand=True).astype(float).round(2)], axis=1).rename(
    columns={0:'minX', 1:'minY', 2:'maxX', 3:'maxY'})
df['maxXmaxY'] = df.apply(lambda row: [row.maxX, row.maxY], axis = 1)
df['maxXminY'] = df.apply(lambda row: [row.maxX, row.minY], axis = 1)
df['minXminY'] = df.apply(lambda row: [row.minX, row.minY], axis = 1)
df['minXmaxY'] = df.apply(lambda row: [row.minX, row.maxY], axis = 1)
df['Coordinates'] = df[['maxXmaxY', 'maxXminY', 'minXminY', 'minXmaxY', 'maxXmaxY']].values.tolist()

## clean up unnecessary columns
df_clean = df.drop(columns =['minX', 'minY', 'maxX', 'maxY', 'maxXmaxY', 'maxXminY', 'minXminY', 'minXmaxY', 'boundingBox'])
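
The same transformation can be followed on a single value. The sketch below turns one comma-separated bounding box string into the closed five-corner ring that Step 6 stores in `Coordinates`. `bbox_to_ring` is a hypothetical helper and the input string is made up.

```python
def bbox_to_ring(bbox_string, ndigits=2):
    """Turn 'minX,minY,maxX,maxY' into a closed ring of five [x, y] corners."""
    min_x, min_y, max_x, max_y = (
        round(float(value), ndigits) for value in bbox_string.split(',')
    )
    # Same corner order as the notebook; the first corner is repeated to close the ring
    return [[max_x, max_y], [max_x, min_y], [min_x, min_y], [min_x, max_y], [max_x, max_y]]

# Made-up bounding box string
ring = bbox_to_ring('-97.2392,43.4994,-89.4917,49.3844')
print(ring)
# [[-89.49, 49.38], [-89.49, 43.5], [-97.24, 43.5], [-97.24, 49.38], [-89.49, 49.38]]
```

Rounding to two decimals can move each edge by up to 0.005 degrees (roughly half a kilometer in latitude), so the rounded rectangle may sit slightly inside the true extent.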

## Part 5: Create County GeoJSON

### Step 7: Create GeoJSON features

In [None]:
## build a FeatureCollection with one rectangular Polygon per county
def create_geojson_features(df):
    print('> Creating GeoJSON features...')
    features = []
    geojson = {
        'type': 'FeatureCollection',
        'features': features
    }
    for _, row in df.iterrows():
        feature = {
            'type': 'Feature',
            'geometry': {
                'type':'Polygon', 
                'coordinates':[row['Coordinates']]
            },
            'properties': {
                'County': row['County'] + ' County',
                'State': row['State']
            }
        }

        features.append(feature)
    return geojson

data_geojson = create_geojson_features(df_clean)
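
A quick structural check can catch malformed rectangles before the file is written. The sketch below assumes each feature should hold exactly one ring of five positions whose first and last positions match. `check_bbox_features` is a hypothetical helper and the sample collection is made up.

```python
def check_bbox_features(collection):
    """Return a list of problems found in a bounding-box FeatureCollection."""
    problems = []
    for i, feature in enumerate(collection.get('features', [])):
        rings = feature['geometry']['coordinates']
        ring = rings[0] if rings else []
        if len(rings) != 1 or len(ring) != 5:
            problems.append('feature %d: expected one ring with five positions' % i)
        elif ring[0] != ring[-1]:
            problems.append('feature %d: ring is not closed' % i)
    return problems

# Made-up collection with one valid and one unclosed rectangle
closed = [[-93.0, 46.0], [-93.0, 45.0], [-94.0, 45.0], [-94.0, 46.0], [-93.0, 46.0]]
unclosed = closed[:-1] + [[-93.0, 45.9]]
sample = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {},
         'geometry': {'type': 'Polygon', 'coordinates': [closed]}},
        {'type': 'Feature', 'properties': {},
         'geometry': {'type': 'Polygon', 'coordinates': [unclosed]}},
    ],
}

print(check_bbox_features(sample))  # ['feature 1: ring is not closed']
```

An empty list means every feature passed the check.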

### Step 8: Generate GeoJSON file

In [None]:
print('> Creating GeoJSON file...')
with open(output, 'w') as outfile:
    json.dump(data_geojson, outfile)
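
To confirm that the file on disk matches what was built in memory, the output can be read back with `json.load`. The sketch below demonstrates the round trip on a made-up collection, with a temporary folder standing in for the **state** folder.

```python
import json
import os
import tempfile

# Made-up collection with a single rectangular feature
collection = {
    'type': 'FeatureCollection',
    'features': [{
        'type': 'Feature',
        'geometry': {
            'type': 'Polygon',
            'coordinates': [[[-93.0, 46.0], [-93.0, 45.0], [-94.0, 45.0],
                             [-94.0, 46.0], [-93.0, 46.0]]],
        },
        'properties': {'County': 'Example County', 'State': 'Example'},
    }],
}

# A temporary folder stands in for geojson/<state>/
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'Example_County_bbox.json')
    with open(path, 'w') as outfile:
        json.dump(collection, outfile)
    with open(path) as infile:
        reloaded = json.load(infile)

print(reloaded == collection)
print(len(reloaded['features']))
```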

## Part 6: Inspect bounding box map

In [None]:
print('> Making map...')
## change the location ([latitude, longitude]) to center the map on your state
m = folium.Map(location = [42.3756, -93.6397], control_scale = True, zoom_start = 5)

## check that the bounding box GeoJSON renders properly
folium.GeoJson(data_geojson,
               tooltip = folium.GeoJsonTooltip(fields=('County', 'State'),
                                               aliases=('County', 'State')),
               show = True).add_to(m)
m
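
Instead of editing the map location by hand for every state, the center could be derived from the bounding boxes themselves. The sketch below is one way to do this. `collection_center` is a hypothetical helper and the sample rectangles are made up.

```python
def collection_center(collection):
    """Return [latitude, longitude] at the middle of all feature rings."""
    xs, ys = [], []
    for feature in collection['features']:
        for x, y in feature['geometry']['coordinates'][0]:
            xs.append(x)
            ys.append(y)
    # GeoJSON positions are [longitude, latitude]; folium expects [latitude, longitude]
    return [(min(ys) + max(ys)) / 2, (min(xs) + max(xs)) / 2]

# Made-up collection with two rectangular features
sample = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {},
         'geometry': {'type': 'Polygon', 'coordinates': [[
             [-93.0, 46.0], [-93.0, 45.0], [-94.0, 45.0], [-94.0, 46.0], [-93.0, 46.0]]]}},
        {'type': 'Feature', 'properties': {},
         'geometry': {'type': 'Polygon', 'coordinates': [[
             [-91.0, 45.0], [-91.0, 44.0], [-92.0, 44.0], [-92.0, 45.0], [-91.0, 45.0]]]}},
    ],
}

print(collection_center(sample))  # [45.0, -92.5]
```

The returned pair could then be passed as `location` when creating the map, for example `collection_center(data_geojson)`.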