# Spatial Index Notebook (general) with utilities module

### Goals: (1) Generate and save colonies dataset (2) Create function that creates service index. (3) Apply service index function to all available services (4) Add overall PSI column. (5) Save/ship to colleagues

In this notebook, I will complete the following tasks:
* Load in `colonies_with_neighbors` and process **[DONE]**
    * Merge population data with colonies data
    * Create nbr_dist column
    * Add ndmc_dist column
    * Remove extraneous columns and save as pickle file
* Calculate and add services indices to GeoDataFrame
    * Where needed, convert CSV lat/long to GeoDataFrame with Shapely Points
    * Convert shapefiles to the same coordinate reference system
    * Check that all point coordinates are in Delhi
    * Make sure that no geometry fields are missing
    * Calculate service index and add to GeoDataFrame
* Descriptive statistics for service indices
    * mean, min, max
    * Grouped by settlement type: take average PSI for all colonies with a specific settlement type
    * Grouped by MCD category: take average PSI for all colonies with a specific MCD category
    * Grouped by distance from NDMC? (optional)
* Save service indices and descriptive statistics to .csv file

In [319]:
# Import necessary modules
from itertools import islice
import pickle
from importlib import reload
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
from shapely.geometry import box, Polygon, Point
from shapely.ops import cascaded_union
from pyproj import CRS
import spatial_index_utils

# Reload spatial_index_utils
reload(spatial_index_utils)

# Constants
# Pseudo Mercator
epsg_code = 3857 

## Process and save colonies shapefile for PSI

In [None]:
# open colonies_with_neighbors file
# This has NDMC+JJC colonies with neighbors of each polygon
# based on other polygons that intersect, cross, touch
# or overlap.

with open('colonies_with_neighbors.data', 'rb') as f:
    colonies = pickle.load(f)

In [None]:
# Reproject to EPSG 3857
colonies = spatial_index_utils.reproject_gdf(colonies, epsg_code)

In [None]:
## Calculate distances from polygon centroid to centroids of its neighbors

# Distance should be in meters
# https://en.wikipedia.org/wiki/Easting_and_northing

colonies = spatial_index_utils.calc_nbr_dist(colonies)

In [None]:
# Import 2020 population data
worldpop2020 = pd.read_csv("population_data/pop_colony_wp_2020.csv")

# Restrict dataframe to only two columns:
# layer: population data
# uso_area_u: unique id for colonies
worldpop2020 = worldpop2020[['layer', 'uso_area_u']]

# Merge population data with colonies data
colonies = colonies.merge(worldpop2020, how='inner', 
                          left_on="USO_AREA_U", right_on='uso_area_u')

# Rename 'layer' column as 'population'
colonies = colonies.rename(columns={'layer': 'population'})

# Remove additional column 'uso_area_u'
colonies = colonies.drop(columns=['uso_area_u'])

In [None]:
# Code to generate ndmc_distances

## 28.632846 77.219639 Rajiv Chowk is considered the center of the city. 
# Cities of Delhi used used this same notional center.
# EPSG3857, WGS84

colonies['ndmc_dist'] = 0
ndmc = Point(28.632846, 77.219639)
for idx, row in colonies.iterrows():
    colonies.loc[idx, 'ndmc_dist'] = ndmc.distance(row['centroid'])

In [None]:
# Remove centroid and polygon_neighbors columns, which
# are no longer needed now that we have nbr_dist
colonies = colonies.drop(columns=['centroid', 'polygon_neighbors'])

In [None]:
colonies.head(2)

In [None]:
## Save `colonies` as pickle file
with open('colonies_for_psi.data', 'wb') as f:
    pickle.dump(colonies, f)

## START HERE: Load colonies GeoDataFrame for PSI

In [111]:
with open('colonies_for_psi.data', 'rb') as f:
    colonies = pickle.load(f)

In [112]:
colonies.head(2)

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0


## Load Ration Shop and check for missing geometries or duplicate rows

In [113]:
# Read in ration shop data
ration_shops = gpd.read_file('RationShops.shp')

# Make sure that all ration shops have a geometry
ration_shops[ration_shops['geometry'] == None]

Unnamed: 0,S No.,District,License No,FPS ID,Circle,FPS Shop N,Address Of,Contact No,Latitude,Longitude,geometry


In [114]:
# Check for duplicate rows
ration_shops[ration_shops.duplicated()]

Unnamed: 0,S No.,District,License No,FPS ID,Circle,FPS Shop N,Address Of,Contact No,Latitude,Longitude,geometry


## TODO: 
* Check for geometry duplicates
* Check that all geometries are in Delhi
* Create function `check_shapefile`
* Duplicates
    * be more selected in checking for duplicates? Check for most minimal things
    * be restricted in what I am checking out - map no, reg no, and geometry.
    * print this out...
* Unauthorized colonies

## Convert ration shop index creation into function 

In [122]:
colonies = spatial_index_utils.create_service_index(polygon_gdf=colonies, 
                                                    point_gdf=ration_shops, 
                                                    service_name="ration", 
                                                    epsg_code=epsg_code)

GeoDataFrame now has the following CRS:

PROJCRS["WGS 84 / Pseudo-Mercator",BASEGEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4326]],CONVERSION["Popular Visualisation Pseudo-Mercator",METHOD["Popular Visualisation Pseudo Mercator",ID["EPSG",1024]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["easting (X)",east,ORDER[1],LENGTHUNIT["metre",1]],AXIS["northing (Y)",north,ORDER[2],LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["World - 85°S to 85°N"],BBOX[-85.06,-180,85.06,180]],ID["EPSG",3857]]


In [123]:
colonies.head()

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist,ration_idx
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0,0.014054
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0,0.000176
2,Bhor Garh,1276,H,Industrial,"POLYGON Z ((8581345.143 3353980.079 0.000, 858...","[(2082, 1167.0172138500839), (1148, 1349.08887...",6984.20004,1437127.0,2.9e-05
3,Gautam Colony,1528,G,RUAC,"POLYGON Z ((8580819.492 3356801.814 0.000, 858...","[(1760, 568.7093143498952), (2082, 1269.644646...",27286.639479,1438223.0,0.011375
4,Kureni,2082,H,RV,"POLYGON Z ((8582448.764 3356971.996 0.000, 858...","[(1276, 1167.0172138500839), (1528, 1269.64464...",30131.546842,1437800.0,0.006188


## Load in Schools Dataset and Generate Index

In [126]:
# Read in schools data
schools = gpd.read_file('DelhiSchoolsMerged.shp')

# Make sure that all ration shops have a geometry
schools[schools['geometry'] == None]

Unnamed: 0,objectid_1,objectid,vilname,schname,schcd,schcat,school_cat,pincode,rururb,location,...,school_typ,schmgt,management,dtname,stname,stcode11,dtcode11,sdtcode11,sdtname,geometry


In [127]:
# Check for duplicate rows
schools[schools.duplicated()]

Unnamed: 0,objectid_1,objectid,vilname,schname,schcd,schcat,school_cat,pincode,rururb,location,...,school_typ,schmgt,management,dtname,stname,stcode11,dtcode11,sdtcode11,sdtname,geometry


In [128]:
colonies = spatial_index_utils.create_service_index(polygon_gdf=colonies, 
                                                    point_gdf=schools, 
                                                    service_name="school", 
                                                    epsg_code=epsg_code)

GeoDataFrame now has the following CRS:

PROJCRS["WGS 84 / Pseudo-Mercator",BASEGEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4326]],CONVERSION["Popular Visualisation Pseudo-Mercator",METHOD["Popular Visualisation Pseudo Mercator",ID["EPSG",1024]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["easting (X)",east,ORDER[1],LENGTHUNIT["metre",1]],AXIS["northing (Y)",north,ORDER[2],LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["World - 85°S to 85°N"],BBOX[-85.06,-180,85.06,180]],ID["EPSG",3857]]


In [129]:
colonies.head()

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist,ration_idx,school_idx
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0,0.014054,0.038763
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0,0.000176,0.025417
2,Bhor Garh,1276,H,Industrial,"POLYGON Z ((8581345.143 3353980.079 0.000, 858...","[(2082, 1167.0172138500839), (1148, 1349.08887...",6984.20004,1437127.0,2.9e-05,0.020475
3,Gautam Colony,1528,G,RUAC,"POLYGON Z ((8580819.492 3356801.814 0.000, 858...","[(1760, 568.7093143498952), (2082, 1269.644646...",27286.639479,1438223.0,0.011375,0.010481
4,Kureni,2082,H,RV,"POLYGON Z ((8582448.764 3356971.996 0.000, 858...","[(1276, 1167.0172138500839), (1528, 1269.64464...",30131.546842,1437800.0,0.006188,0.005708


## Convert CSV data into shapefiles

In [235]:
# Delhi constants
lat_min = 28.412593
lat_max = 28.881338
lon_min = 76.83806899999999
lon_max = 77.3484578

In [236]:
delhi_bbox = box(lon_min, lat_min, lon_max, lat_max)

### ATM dataset

In [357]:
atm = gpd.read_file('atm_wgs84.shp')

In [358]:
atm.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [359]:
len(atm)

8226

In [360]:
atm = atm[atm.intersects(delhi_bbox)]

In [361]:
len(atm)

8215

In [362]:
# Make sure that all have a geometry
atm[atm['geometry'] == None]

Unnamed: 0,objectid,Bank_Type,bank_name,bk_corp_bc,Lattitude,Longitude,geometry


In [363]:
# Check for duplicate rows
atm[atm.duplicated()]

Unnamed: 0,objectid,Bank_Type,bank_name,bk_corp_bc,Lattitude,Longitude,geometry


In [364]:
colonies = spatial_index_utils.create_service_index(polygon_gdf=colonies, 
                                                    point_gdf=atm, 
                                                    service_name="atm", 
                                                    epsg_code=epsg_code)

GeoDataFrame now has the following CRS:

PROJCRS["WGS 84 / Pseudo-Mercator",BASEGEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4326]],CONVERSION["Popular Visualisation Pseudo-Mercator",METHOD["Popular Visualisation Pseudo Mercator",ID["EPSG",1024]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["easting (X)",east,ORDER[1],LENGTHUNIT["metre",1]],AXIS["northing (Y)",north,ORDER[2],LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["World - 85°S to 85°N"],BBOX[-85.06,-180,85.06,180]],ID["EPSG",3857]]
printing new point column
                    AREA 

In [365]:
colonies.head(2)

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist,ration_idx,school_idx,bank_idx,atm_idx
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0,0.014054,0.038763,0.002249,1.4e-05
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0,0.000176,0.025417,0.002235,0.013669


In [246]:
colonies['atm_idx'].max()

nan

### Bank Index

In [348]:
bank = gpd.read_file('bank_wgs84.shp')

In [349]:
bank.head()

Unnamed: 0,objectid,bank_name,bank_cd,Latitude,Longitude,field_6,geometry
0,138188,Rural Cooperative Bank,NRCB,28.436036,77.175627,,POINT (77.17563 28.43604)
1,138229,Punjab National Bank,PUNB,28.440588,77.165772,,POINT (77.16577 28.44059)
2,138358,Canara Bank,CNRB,28.45269,77.15218,,POINT (77.15218 28.45269)
3,138386,HDFC Bank Ltd,HDFC,28.4542,77.168,,POINT (77.16800 28.45420)
4,138403,Canara Bank,CNRB,28.45502,77.1837,,POINT (77.18370 28.45502)


In [350]:
bank.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [289]:
#bank = bank[bank.intersects(delhi_bbox)]

In [352]:
# Make sure that all have a geometry
bank[bank['geometry'] == None]

Unnamed: 0,objectid,bank_name,bank_cd,Latitude,Longitude,field_6,geometry


In [353]:
bank[bank.duplicated()]

Unnamed: 0,objectid,bank_name,bank_cd,Latitude,Longitude,field_6,geometry


In [355]:
colonies = spatial_index_utils.create_service_index(polygon_gdf=colonies, 
                                                    point_gdf=bank, 
                                                    service_name="bank", 
                                                    epsg_code=epsg_code)

GeoDataFrame now has the following CRS:

PROJCRS["WGS 84 / Pseudo-Mercator",BASEGEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4326]],CONVERSION["Popular Visualisation Pseudo-Mercator",METHOD["Popular Visualisation Pseudo Mercator",ID["EPSG",1024]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["easting (X)",east,ORDER[1],LENGTHUNIT["metre",1]],AXIS["northing (Y)",north,ORDER[2],LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["World - 85°S to 85°N"],BBOX[-85.06,-180,85.06,180]],ID["EPSG",3857]]
printing new point column
                    AREA 

In [356]:
colonies.head()

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist,ration_idx,school_idx,bank_idx
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0,0.014054,0.038763,0.002249
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0,0.000176,0.025417,0.002235
2,Bhor Garh,1276,H,Industrial,"POLYGON Z ((8581345.143 3353980.079 0.000, 858...","[(2082, 1167.0172138500839), (1148, 1349.08887...",6984.20004,1437127.0,2.9e-05,0.020475,0.001429
3,Gautam Colony,1528,G,RUAC,"POLYGON Z ((8580819.492 3356801.814 0.000, 858...","[(1760, 568.7093143498952), (2082, 1269.644646...",27286.639479,1438223.0,0.011375,0.010481,0.004003
4,Kureni,2082,H,RV,"POLYGON Z ((8582448.764 3356971.996 0.000, 858...","[(1276, 1167.0172138500839), (1528, 1269.64464...",30131.546842,1437800.0,0.006188,0.005708,0.002308


### Metro

In [366]:
metro = gpd.read_file('metro_wgs84.shp')

In [367]:
metro.head()

Unnamed: 0,S #,Metro Stat,Latitude,Longitude,geometry
0,1,Dilshad Garden Metro Station,28.675898,77.321517,POINT (77.32152 28.67590)
1,2,Jhilmil Metro Station,28.675773,77.312494,POINT (77.31249 28.67577)
2,3,Mansarovar Park Metro Station,28.675566,77.300119,POINT (77.30012 28.67557)
3,4,Shahadra Metro Station,28.673475,77.290076,POINT (77.29008 28.67347)
4,5,Welcome Metro Station,28.67179,77.277681,POINT (77.27768 28.67179)


In [368]:
# Make sure that all have a geometry
metro[metro['geometry'] == None]

Unnamed: 0,S #,Metro Stat,Latitude,Longitude,geometry


In [369]:
# Make sure none are duplicated
metro[metro.duplicated()]

Unnamed: 0,S #,Metro Stat,Latitude,Longitude,geometry


In [370]:
colonies = spatial_index_utils.create_service_index(polygon_gdf=colonies, 
                                                    point_gdf=metro, 
                                                    service_name="metro", 
                                                    epsg_code=epsg_code)

GeoDataFrame now has the following CRS:

PROJCRS["WGS 84 / Pseudo-Mercator",BASEGEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4326]],CONVERSION["Popular Visualisation Pseudo-Mercator",METHOD["Popular Visualisation Pseudo Mercator",ID["EPSG",1024]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["easting (X)",east,ORDER[1],LENGTHUNIT["metre",1]],AXIS["northing (Y)",north,ORDER[2],LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["World - 85°S to 85°N"],BBOX[-85.06,-180,85.06,180]],ID["EPSG",3857]]
printing new point column
                    AREA 

In [371]:
colonies.head()

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist,ration_idx,school_idx,bank_idx,atm_idx,metro_idx
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0,0.014054,0.038763,0.002249,1.4e-05,0.0
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0,0.000176,0.025417,0.002235,0.013669,0.0
2,Bhor Garh,1276,H,Industrial,"POLYGON Z ((8581345.143 3353980.079 0.000, 858...","[(2082, 1167.0172138500839), (1148, 1349.08887...",6984.20004,1437127.0,2.9e-05,0.020475,0.001429,0.00532,0.0
3,Gautam Colony,1528,G,RUAC,"POLYGON Z ((8580819.492 3356801.814 0.000, 858...","[(1760, 568.7093143498952), (2082, 1269.644646...",27286.639479,1438223.0,0.011375,0.010481,0.004003,0.007698,0.0
4,Kureni,2082,H,RV,"POLYGON Z ((8582448.764 3356971.996 0.000, 858...","[(1276, 1167.0172138500839), (1528, 1269.64464...",30131.546842,1437800.0,0.006188,0.005708,0.002308,0.002876,0.0


### Police

In [375]:
police = gpd.read_file('police_wgs84.shp')

In [376]:
police.head()

Unnamed: 0,NAME,POLICE_STA,DISTRICT,FID,Station,Latitude,Longitude,layer,path,geometry
0,PS CIVIL LINES,CIVIL LINES,NORTH,,,,,Police_Station,C:/Users/bwbel/Downloads/Public Services-20200...,POINT (77.22172 28.68900)
1,PS TIMAR PUR,TIMAR PUR,NORTH,,,,,Police_Station,C:/Users/bwbel/Downloads/Public Services-20200...,POINT (77.22440 28.70679)
2,PS ROOP NAGAR,ROOP NAGAR,NORTH,,,,,Police_Station,C:/Users/bwbel/Downloads/Public Services-20200...,POINT (77.20254 28.68486)
3,PS SARAI ROHILLA,SARAI ROHILLA,NORTH,,,,,Police_Station,C:/Users/bwbel/Downloads/Public Services-20200...,POINT (77.18357 28.66875)
4,PS BARA HINDU RAO,BARA HINDU RAO,NORTH,,,,,Police_Station,C:/Users/bwbel/Downloads/Public Services-20200...,POINT (77.20814 28.66568)


In [377]:
# Make sure that all have a geometry
police[police['geometry'] == None]

Unnamed: 0,NAME,POLICE_STA,DISTRICT,FID,Station,Latitude,Longitude,layer,path,geometry


In [378]:
# Make sure none are duplicated
police[police.duplicated()]

Unnamed: 0,NAME,POLICE_STA,DISTRICT,FID,Station,Latitude,Longitude,layer,path,geometry


In [379]:
colonies = spatial_index_utils.create_service_index(polygon_gdf=colonies, 
                                                    point_gdf=police, 
                                                    service_name="police", 
                                                    epsg_code=epsg_code)

GeoDataFrame now has the following CRS:

PROJCRS["WGS 84 / Pseudo-Mercator",BASEGEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4326]],CONVERSION["Popular Visualisation Pseudo-Mercator",METHOD["Popular Visualisation Pseudo Mercator",ID["EPSG",1024]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["easting (X)",east,ORDER[1],LENGTHUNIT["metre",1]],AXIS["northing (Y)",north,ORDER[2],LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["World - 85°S to 85°N"],BBOX[-85.06,-180,85.06,180]],ID["EPSG",3857]]
printing new point column
                    AREA 

In [380]:
colonies.head()

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist,ration_idx,school_idx,bank_idx,atm_idx,metro_idx,police_idx
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0,0.014054,0.038763,0.002249,1.4e-05,0.0,0.0
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0,0.000176,0.025417,0.002235,0.013669,0.0,0.060013
2,Bhor Garh,1276,H,Industrial,"POLYGON Z ((8581345.143 3353980.079 0.000, 858...","[(2082, 1167.0172138500839), (1148, 1349.08887...",6984.20004,1437127.0,2.9e-05,0.020475,0.001429,0.00532,0.0,0.0
3,Gautam Colony,1528,G,RUAC,"POLYGON Z ((8580819.492 3356801.814 0.000, 858...","[(1760, 568.7093143498952), (2082, 1269.644646...",27286.639479,1438223.0,0.011375,0.010481,0.004003,0.007698,0.0,1.8e-05
4,Kureni,2082,H,RV,"POLYGON Z ((8582448.764 3356971.996 0.000, 858...","[(1276, 1167.0172138500839), (1528, 1269.64464...",30131.546842,1437800.0,0.006188,0.005708,0.002308,0.002876,0.0,0.0


### Bus

In [381]:
bus = gpd.read_file('bus.shp')

In [382]:
bus.head()

Unnamed: 0,stop_id,stop_code,stop_name,stop_lat,stop_lon,geometry
0,0,,Adarsh Nagar / Bharola Village,28.715917,77.170867,POINT (77.17087 28.71592)
1,1,,British High Comission,28.598533,77.191383,POINT (77.19138 28.59853)
2,2,,Azad Market,28.6647,77.2084,POINT (77.20840 28.66470)
3,3,,Kidwai Nagar,28.5757,77.2097,POINT (77.20970 28.57570)
4,4,,Rashid Market,28.6502,77.278667,POINT (77.27867 28.65020)


In [383]:
len(bus)

3210

In [384]:
# Make sure that all have a geometry
bus[bus['geometry'] == None]

Unnamed: 0,stop_id,stop_code,stop_name,stop_lat,stop_lon,geometry


In [385]:
# Make sure none are duplicated
bus[bus.duplicated()]

Unnamed: 0,stop_id,stop_code,stop_name,stop_lat,stop_lon,geometry


In [386]:
colonies = spatial_index_utils.create_service_index(polygon_gdf=colonies, 
                                                    point_gdf=bus, 
                                                    service_name="bus", 
                                                    epsg_code=epsg_code)

GeoDataFrame now has the following CRS:

PROJCRS["WGS 84 / Pseudo-Mercator",BASEGEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4326]],CONVERSION["Popular Visualisation Pseudo-Mercator",METHOD["Popular Visualisation Pseudo Mercator",ID["EPSG",1024]],PARAMETER["Latitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["easting (X)",east,ORDER[1],LENGTHUNIT["metre",1]],AXIS["northing (Y)",north,ORDER[2],LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["World - 85°S to 85°N"],BBOX[-85.06,-180,85.06,180]],ID["EPSG",3857]]
printing new point column
                    AREA 

In [387]:
colonies.head()

Unnamed: 0,AREA,USO_AREA_U,HOUSETAX_C,USO_FINAL,geometry,nbr_dist,population,ndmc_dist,ration_idx,school_idx,bank_idx,atm_idx,metro_idx,police_idx,bus_idx
0,Singhola,3058,H,RV,"POLYGON Z ((8587300.847 3355178.518 0.000, 858...","[(2183, 2016.7464333325045), (1148, 1905.25373...",4415.586042,1440049.0,0.014054,0.038763,0.002249,1.4e-05,0.0,0.0,0.0
1,Indra Colony (Narela),1760,G,RUAC,"POLYGON Z ((8580725.093 3357134.173 0.000, 858...","[(1528, 568.7093143498952), (2869, 601.1292656...",4547.507628,1437958.0,0.000176,0.025417,0.002235,0.013669,0.0,0.060013,0.005749
2,Bhor Garh,1276,H,Industrial,"POLYGON Z ((8581345.143 3353980.079 0.000, 858...","[(2082, 1167.0172138500839), (1148, 1349.08887...",6984.20004,1437127.0,2.9e-05,0.020475,0.001429,0.00532,0.0,0.0,3.5e-05
3,Gautam Colony,1528,G,RUAC,"POLYGON Z ((8580819.492 3356801.814 0.000, 858...","[(1760, 568.7093143498952), (2082, 1269.644646...",27286.639479,1438223.0,0.011375,0.010481,0.004003,0.007698,0.0,1.8e-05,0.004751
4,Kureni,2082,H,RV,"POLYGON Z ((8582448.764 3356971.996 0.000, 858...","[(1276, 1167.0172138500839), (1528, 1269.64464...",30131.546842,1437800.0,0.006188,0.005708,0.002308,0.002876,0.0,0.0,0.006876


## Save ration shop index (and descriptive data) to CSV and pickle files

In [393]:
colonies.to_csv('colonies_psi_15july.csv')

In [392]:
with open('colonies_psi_15july.data', 'wb') as f:
    pickle.dump(colonies, f)

In [390]:
pickle.dump?

[1;31mSignature:[0m [0mpickle[0m[1;33m.[0m[0mdump[0m[1;33m([0m[0mobj[0m[1;33m,[0m [0mfile[0m[1;33m,[0m [0mprotocol[0m[1;33m=[0m[1;32mNone[0m[1;33m,[0m [1;33m*[0m[1;33m,[0m [0mfix_imports[0m[1;33m=[0m[1;32mTrue[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[1;31mDocstring:[0m
Write a pickled representation of obj to the open file object file.

This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may
be more efficient.

The optional *protocol* argument tells the pickler to use the given
protocol supported protocols are 0, 1, 2, 3 and 4.  The default
protocol is 3; a backward-incompatible protocol designed for Python 3.

Specifying a negative protocol version selects the highest protocol
version supported.  The higher the protocol used, the more recent the
version of Python needed to read the pickle produced.

The *file* argument must have a write() method that accepts a single
bytes argument.  It can thus be a file object opened for binary
wri