# Automate OpenIndexMaps GeoJSON Creation
## *Regular Bounding Box*

## Part 1: Introduction

This demonstration is the Jupyter Notebook version of the ***indexmap*** script, which converts **regular** bounding boxes stored in a CSV file to <a href="https://openindexmaps.org/">OpenIndexMaps GeoJSON</a>.

## Part 2: Preparation

We will use **Jupyter Notebook (Anaconda 3)** to edit and run the script. Anaconda installation instructions can be found <a href='https://docs.anaconda.com/anaconda/install/'>here</a>. Note that the script runs on Python 3.

The following dependencies must be installed:
- <a href='https://pandas.pydata.org/docs/getting_started/install.html'>pandas</a>
- <a href='https://numpy.org/install/'>numpy</a>
- <a href='https://python-visualization.github.io/folium/installing.html'>folium</a>

To run this script you need:
- **code**.csv, formatted according to the GBL Metadata Template
- the directory layout **data** folder > **code** folder > **code**.csv

The script currently writes one GeoJSON file:
- **code**.geojson
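
The directory layout above can be sketched as a small helper. This is only an illustration: `index_map_paths` and its `root` parameter are hypothetical names that do not appear in the original script, which builds the same paths inline with `os.path.join`.

```python
import os

def index_map_paths(code, root='data'):
    """Return the (csv_path, geojson_path) pair for one code.

    Hypothetical helper: mirrors the data > code > code.csv layout.
    """
    folder = os.path.join(root, code)
    return (os.path.join(folder, code + '.csv'),
            os.path.join(folder, code + '.geojson'))

csv_path, geojson_path = index_map_paths('03d-02')
print(csv_path)
# Check that the input file is in place before running the notebook.
print(os.path.exists(csv_path))
```

Checking `os.path.exists(csv_path)` before Step 3 gives a clearer failure than the `FileNotFoundError` raised by `pd.read_csv`.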

>Originally created on Dec 1 2020<br>
@author: Yijing Zhou @YijingZhou33

## Part 3: Get Started

### Step 1: Import modules

In [1]:
import os
import pandas as pd
import json
import folium
import numpy as np

### Step 2: Manual items to change

In [2]:
##### Manually changed items #####
## code and title of the metadata 
code = '03d-02'
title = 'Iowa County Atlases - 03d-02'

## Part 4: Create OpenIndexMaps GeoJSON

### Step 3: Convert the GeoBlacklight metadata CSV file to a dataframe

In [3]:
## list of metadata fields from the GBL metadata template for open data portals desired in the final OpenIndexMap geojson.
collist = ['Title', 'Bounding Box', 'Identifier']

## convert the whole csv file to dataframe
df = pd.read_csv(os.path.join('data', code, code+'.csv'))

## check if the metadata contains 'Image' column, if so then add it to the list
## also more properties can be added here!
if 'Image' in df.columns:
    collist.append('Image')

## only extract fields required for OpenIndexMap geojson properties
df = df[collist]

df.head()

| | Title | Bounding Box | Identifier |
|---|---|---|---|
| 0 | Plat book of Adair County, Iowa | -94.7025, 41.1561, -94.2424, 41.5033 | a0c1f157-6f8d-49e2-b30c-d3823b865500 |
| 1 | Plat book of Adams County, Iowa | -94.9308, 40.8989, -94.4722, 41.1578 | 8761dd95-7911-46f9-8c1b-641735e591dc |
| 2 | Plat book of Audubon County, Iowa | -95.0947, 41.5032, -94.7014, 41.8626 | 4fa559cd-91f6-4f9f-94ea-99c5f0efa2df |
| 3 | Plat book of Benton County, Iowa | -92.2996, 41.8617, -91.8318, 42.2987 | 62abb614-fc33-4368-8daa-2821995d86f8 |
| 4 | Plat book of Black Hawk County, Iowa | -92.5554, 42.2972, -92.0647, 42.6413 | 132fd790-ff0b-4771-938a-f16b72325216 |
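
Because every later step depends on the `Bounding Box` strings being well formed, it may help to validate them first. The sketch below is not part of the original script: `parse_bbox` is a hypothetical helper, and it assumes the `minX, minY, maxX, maxY` order shown above and boxes that do not cross the antimeridian.

```python
def parse_bbox(text):
    """Parse 'minX, minY, maxX, maxY' into four floats and check their order.

    Hypothetical helper; assumes the box does not cross the antimeridian.
    """
    parts = [float(p) for p in text.split(',')]
    if len(parts) != 4:
        raise ValueError(f'expected 4 values, got {len(parts)}: {text!r}')
    min_x, min_y, max_x, max_y = parts
    if not (-180 <= min_x <= max_x <= 180 and -90 <= min_y <= max_y <= 90):
        raise ValueError(f'coordinates out of range or out of order: {text!r}')
    return min_x, min_y, max_x, max_y

print(parse_bbox('-94.7025, 41.1561, -94.2424, 41.5033'))
# (-94.7025, 41.1561, -94.2424, 41.5033)
```

Applied to each row, for example with `df['Bounding Box'].map(parse_bbox)`, a malformed value raises a `ValueError` that quotes the offending string.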


In [4]:
## list all records that share a title with another record
## if any are listed, delete the redundant rows from the csv file and save it
## then go to Kernel > Restart & Run All
df[df.duplicated(['Title'], keep=False)].sort_values('Title')

| | Title | Bounding Box | Identifier |
|---|---|---|---|
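
The empty result above means no titles are duplicated. To show what the check returns when duplicates do exist, here is a small sketch on a made-up dataframe; the titles and identifiers are placeholders, not records from the Iowa dataset.

```python
import pandas as pd

# Toy data (hypothetical): the first and third rows share a title.
toy = pd.DataFrame({
    'Title': ['Plat book of A', 'Plat book of B', 'Plat book of A'],
    'Identifier': ['id-1', 'id-2', 'id-3'],
})

# keep=False marks every member of a duplicated group, not only the repeats,
# so both copies are listed and can be compared side by side.
dupes = toy[toy.duplicated(['Title'], keep=False)].sort_values('Title')
print(dupes)
```

With the default `keep='first'`, only the later copy would be flagged, which makes it harder to see which record it duplicates.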


### Step 4: Build the OpenIndexMaps schema

In [5]:
## create regular bounding box coordinate pairs and round them to 2 decimal places
df = pd.concat([df, df['Bounding Box'].str.split(',', expand=True).astype(float).round(2)], axis=1).rename(
    columns={0:'minX', 1:'minY', 2:'maxX', 3:'maxY'})
df['maxXmaxY'] = df.apply(lambda row: [row.maxX, row.maxY], axis = 1)
df['maxXminY'] = df.apply(lambda row: [row.maxX, row.minY], axis = 1)
df['minXminY'] = df.apply(lambda row: [row.minX, row.minY], axis = 1)
df['minXmaxY'] = df.apply(lambda row: [row.minX, row.maxY], axis = 1)
df['coordinates'] = df[['maxXmaxY', 'maxXminY', 'minXminY', 'minXmaxY', 'maxXmaxY']].values.tolist()

## concatenate landing page links
df['websiteURL'] = 'https://geo.btaa.org/catalog/' + df['Identifier']

## clean up unnecessary columns
df_clean = df.drop(columns =['minX', 'minY', 'maxX', 'maxY', 'maxXmaxY', 'maxXminY', 'minXminY', 'minXmaxY', 'Bounding Box'])

df_clean.head()

| | Title | Identifier | coordinates | websiteURL |
|---|---|---|---|---|
| 0 | Plat book of Adair County, Iowa | a0c1f157-6f8d-49e2-b30c-d3823b865500 | [[-94.24, 41.5], [-94.24, 41.16], [-94.7, 41.1... | https://geo.btaa.org/catalog/a0c1f157-6f8d-49e... |
| 1 | Plat book of Adams County, Iowa | 8761dd95-7911-46f9-8c1b-641735e591dc | [[-94.47, 41.16], [-94.47, 40.9], [-94.93, 40.... | https://geo.btaa.org/catalog/8761dd95-7911-46f... |
| 2 | Plat book of Audubon County, Iowa | 4fa559cd-91f6-4f9f-94ea-99c5f0efa2df | [[-94.7, 41.86], [-94.7, 41.5], [-95.09, 41.5]... | https://geo.btaa.org/catalog/4fa559cd-91f6-4f9... |
| 3 | Plat book of Benton County, Iowa | 62abb614-fc33-4368-8daa-2821995d86f8 | [[-91.83, 42.3], [-91.83, 41.86], [-92.3, 41.8... | https://geo.btaa.org/catalog/62abb614-fc33-436... |
| 4 | Plat book of Black Hawk County, Iowa | 132fd790-ff0b-4771-938a-f16b72325216 | [[-92.06, 42.64], [-92.06, 42.3], [-92.56, 42.... | https://geo.btaa.org/catalog/132fd790-ff0b-477... |
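
To make the corner ordering explicit, the same conversion can be written for a single bounding box in plain Python. This is a sketch rather than the script's own code: `bbox_to_ring` and `ndigits` are hypothetical names, and the corner order simply copies the one used in the cell above.

```python
def bbox_to_ring(min_x, min_y, max_x, max_y, ndigits=2):
    """Return a closed ring of [x, y] corners for one bounding box.

    Hypothetical helper mirroring the notebook's corner order:
    NE, SE, SW, NW, then NE again to close the ring.
    """
    min_x, min_y, max_x, max_y = (
        round(v, ndigits) for v in (min_x, min_y, max_x, max_y)
    )
    return [
        [max_x, max_y],
        [max_x, min_y],
        [min_x, min_y],
        [min_x, max_y],
        [max_x, max_y],
    ]

ring = bbox_to_ring(-94.7025, 41.1561, -94.2424, 41.5033)
print(ring)
# [[-94.24, 41.5], [-94.24, 41.16], [-94.7, 41.16], [-94.7, 41.5], [-94.24, 41.5]]
```

The first and last positions are identical because a GeoJSON polygon ring must be closed. Note that this order runs clockwise, whereas RFC 7946 recommends counterclockwise exterior rings; the specification also says parsers should not reject rings that ignore this, so most tools will still read the output.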


### Step 5: Create GeoJSON features

In [6]:
## build a FeatureCollection with one Polygon feature per dataframe row
## note: reads the global `title` set in Step 2
def create_geojson_features(df):
    print('> Creating GeoJSON features...')
    features = []
    geojson = {
        'type': 'FeatureCollection',
        'title': title,
        'features': features
    }
    for _, row in df.iterrows():
        feature = {
            'type': 'Feature',
            'id': row['Identifier'],
            'geometry': {
                'type':'Polygon', 
                'coordinates':[row['coordinates']]
            },
            'properties': {
                'label': row['Title'],
                'title': row['Title'],
                'recordIdentifier': row['Identifier'],
                'websiteUrl': row['websiteURL']
            }
        }
        ### add more properties here if applicable
        if 'Image' in df.columns:
            feature['properties']['thumbnailUrl'] = row['Image']

        features.append(feature)
    return geojson

data_geojson = create_geojson_features(df_clean)

> Creating GeoJSON features...
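
Before writing the file, it can be useful to sanity-check each feature. The sketch below is an assumption-laden illustration: `check_feature` is a hypothetical helper, and the property names it looks for are simply the ones this notebook writes, not a statement of what the OpenIndexMaps specification requires.

```python
def check_feature(feature):
    """Return a list of problems found in one feature (empty if none).

    Hypothetical helper; checks only what this notebook produces.
    """
    problems = []
    geometry = feature.get('geometry', {})
    if geometry.get('type') != 'Polygon':
        problems.append('geometry type is not Polygon')
    rings = geometry.get('coordinates', [])
    if not rings:
        problems.append('polygon has no rings')
    for ring in rings:
        # A GeoJSON linear ring is closed and has four or more positions.
        if len(ring) < 4:
            problems.append('ring has fewer than 4 positions')
        elif ring[0] != ring[-1]:
            problems.append('ring is not closed')
    properties = feature.get('properties', {})
    for key in ('label', 'title', 'recordIdentifier', 'websiteUrl'):
        if not properties.get(key):
            problems.append(f'missing property: {key}')
    return problems

sample = {
    'type': 'Feature',
    'id': 'a0c1f157-6f8d-49e2-b30c-d3823b865500',
    'geometry': {
        'type': 'Polygon',
        'coordinates': [[[-94.24, 41.5], [-94.24, 41.16], [-94.7, 41.16],
                         [-94.7, 41.5], [-94.24, 41.5]]],
    },
    'properties': {
        'label': 'Plat book of Adair County, Iowa',
        'title': 'Plat book of Adair County, Iowa',
        'recordIdentifier': 'a0c1f157-6f8d-49e2-b30c-d3823b865500',
        'websiteUrl': 'https://geo.btaa.org/catalog/a0c1f157-6f8d-49e2-b30c-d3823b865500',
    },
}
print(check_feature(sample))
# []
```

In the notebook, the same check could be run over `data_geojson['features']` to list any feature that reports problems.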


### Step 6: Generate the GeoJSON file

In [7]:
with open(os.path.join('data', code, code+'.geojson'), 'w') as txtfile:
    json.dump(data_geojson, txtfile)
print('> Creating GeoJSON file...')

> Creating GeoJSON file...
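
One way to confirm that the file was written correctly is to load it back and compare it with the in-memory dictionary. The following sketch uses a temporary directory and a placeholder collection so that it runs on its own; the file name `example.geojson` and the collection contents are made up for illustration.

```python
import json
import os
import tempfile

# Placeholder collection (hypothetical); in the notebook this is data_geojson.
collection = {'type': 'FeatureCollection', 'title': 'Example', 'features': []}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.geojson')
    with open(path, 'w') as outfile:
        json.dump(collection, outfile)
    # Reading the file back proves it is valid JSON with the same content.
    with open(path) as infile:
        loaded = json.load(infile)

print(loaded == collection)
# True
```

If the file is meant to be read by people as well as software, passing `indent=2` to `json.dump` produces a more readable layout at the cost of a larger file.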


## Part 5: Draw the index maps

In [8]:
print('> Making map...')
## change the location here to zoom to the center
m = folium.Map(location = [42.3756, -93.6397], control_scale = True, zoom_start = 7)

## read the file back to check that the index map geojson renders properly
## (the with block closes the file handle once the text has been read)
with open(os.path.join('data', code, code+'.geojson'), 'r') as geofile:
    geojson_text = geofile.read()

folium.GeoJson(geojson_text,
               tooltip = folium.GeoJsonTooltip(fields=('title', 'websiteUrl'),
                                               aliases=('title','websiteUrl')),
               show = True).add_to(m)
m

> Making map...
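
The map center above is typed in by hand. As an alternative, it could be derived from the bounding boxes themselves. This is a sketch under stated assumptions: `bbox_center` is a hypothetical helper, the boxes are `(minX, minY, maxX, maxY)` tuples, and none of them crosses the antimeridian.

```python
def bbox_center(bboxes):
    """Return [lat, lon] at the center of the extent covering all bboxes.

    Hypothetical helper; each bbox is (minX, minY, maxX, maxY).
    """
    min_x = min(b[0] for b in bboxes)
    min_y = min(b[1] for b in bboxes)
    max_x = max(b[2] for b in bboxes)
    max_y = max(b[3] for b in bboxes)
    # folium expects latitude first, so y comes before x.
    return [(min_y + max_y) / 2, (min_x + max_x) / 2]

# The first two bounding boxes from the Step 3 output.
bboxes = [
    (-94.7025, 41.1561, -94.2424, 41.5033),
    (-94.9308, 40.8989, -94.4722, 41.1578),
]
center = bbox_center(bboxes)
print(center)
```

The result can be passed as `location=center` to `folium.Map`, which avoids editing the coordinates for each new dataset.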
