# OSM Data Exploration

## Extraction of districts from shape files
For our experiments we consider two underdeveloped districts: Araria in Bihar and Namsai in Arunachal Pradesh. The choice is motivated by this [DNA India](https://www.dnaindia.com/india/report-out-of-niti-aayog-s-20-most-underdeveloped-districts-19-are-ruled-by-bjp-or-its-allies-2598984) news article, which quotes a NITI Aayog report. For comparison we also consider a developed city, Bangalore, in the south of India.

In [None]:
import os
from dotenv import load_dotenv
load_dotenv()
# Path to the India level-2 shapefile (district-level administrative boundaries)
india_shape = os.environ.get("DATA_DIR") + "/gadm36_shp/gadm36_IND_2.shp"
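
`os.environ.get` returns `None` when a variable is missing, so the string concatenation above would fail with a `TypeError` that does not name the variable. A minimal sketch of a more defensive lookup follows. The helper name `data_path` is hypothetical and is not part of the original notebook.

```python
import os
from pathlib import Path


def data_path(env_var: str, *parts: str) -> Path:
    """Build a path below the directory named by an environment variable.

    Hypothetical helper: raises a descriptive error if the variable is unset
    or empty instead of failing later on `None + str`.
    """
    base = os.environ.get(env_var)
    if not base:
        raise KeyError(f"Environment variable {env_var!r} is not set")
    return Path(base).joinpath(*parts)
```

With `DATA_DIR` set, the shapefile path could then be written as `data_path("DATA_DIR", "gadm36_shp", "gadm36_IND_2.shp")`.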

In [None]:
import geopandas as gpd
import matplotlib.pyplot as plt
%matplotlib inline

# Load the district boundaries and inspect them
india_gpd = gpd.read_file(india_shape)
india_gpd.plot();

In [None]:
# Extract Araria district in Bihar state
araria_gdf = india_gpd[india_gpd['NAME_2'] == 'Araria']
araria_gdf

In [None]:
# Keep only the two columns of interest: district name and geometry
araria_gdf = araria_gdf[['NAME_2', 'geometry']]
araria_gdf.plot()

In [None]:
# Extract Namsai district in Arunachal Pradesh state. 
namsai_gdf = india_gpd[india_gpd['NAME_2'] == 'Namsai']
namsai_gdf

In [None]:
# Keep only the two columns of interest: district name and geometry
namsai_gdf = namsai_gdf[['NAME_2', 'geometry']]
namsai_gdf.plot()

In [None]:
# Extract Bangalore district 
bangalore_gdf = india_gpd[india_gpd['NAME_2'] == 'Bangalore']
bangalore_gdf = bangalore_gdf[['NAME_2', 'geometry']]
bangalore_gdf.plot()

## Creating geographic extracts from OpenStreetMap Data

Given a geopandas data frame representing a district boundary, we find its bounding box.

In [None]:
# Inspect the coordinate reference system (CRS) of the Araria data frame
araria_gdf.crs

In [None]:
araria_bbox = araria_gdf.bounds
print(araria_bbox)
type(araria_gdf)
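
`GeoDataFrame.bounds` returns one row per geometry with the columns `minx`, `miny`, `maxx` and `maxy`. To make clear what those four numbers mean, here is a small pure-Python sketch that computes the same quantities for a single polygon ring. The coordinates are made up for illustration and are not Araria's real boundary.

```python
def bounding_box(coords):
    """Return (minx, miny, maxx, maxy) for a sequence of (x, y) pairs."""
    xs, ys = zip(*coords)
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical closed ring in (longitude, latitude) order
ring = [(87.0, 26.0), (87.5, 25.9), (87.7, 26.4), (87.2, 26.6), (87.0, 26.0)]

minx, miny, maxx, maxy = bounding_box(ring)
# In west, south, east, north terms: w=minx, s=miny, e=maxx, n=maxy
print(minx, miny, maxx, maxy)
```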

## Fetch OpenStreetMap Data within Boundaries as Data Frame
We use the `add_basemap` function of contextily to add a background map to our plot. We pass the coordinate reference system (CRS) of the boundary extracted from the shape file so that the basemap is drawn in the same CRS.

In [None]:
import contextily as ctx 
araria_ax = araria_gdf.plot(figsize=(20, 20), alpha=0.5, edgecolor='k')
ctx.add_basemap(araria_ax, crs=araria_gdf.crs, zoom=12)

In [None]:
# Use contextily to download the basemap tiles and store them as a GeoTIFF raster file
w, s, e, n = (araria_bbox.minx.values[0], araria_bbox.miny.values[0], araria_bbox.maxx.values[0], araria_bbox.maxy.values[0])
_ = ctx.bounds2raster(w, s, e, n, ll=True, path = os.environ.get("OSM_DIR") + "araria.tif", 
                        source=ctx.providers.CartoDB.Positron)
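
The `zoom` level determines how many map tiles cover the bounding box, and therefore how much data is downloaded. The sketch below uses the standard slippy-map tile numbering described on the OpenStreetMap wiki to estimate that count. It is an illustration only, not contextily's internal code, and the function names are hypothetical. The formula is valid for latitudes within roughly ±85.05 degrees.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the slippy-map tile containing a point."""
    tiles_per_side = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * tiles_per_side)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * tiles_per_side)
    return x, y


def tile_count(west, south, east, north, zoom):
    """Estimate how many tiles cover a lon/lat bounding box at a zoom level."""
    x_min, y_max = lonlat_to_tile(west, south, zoom)  # tile y grows southwards
    x_max, y_min = lonlat_to_tile(east, north, zoom)
    return (x_max - x_min + 1) * (y_max - y_min + 1)
```

Each extra zoom level roughly quadruples the number of tiles for the same area, so it may be worth calling something like `tile_count(w, s, e, n, zoom)` before starting a large download.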

In [None]:
import rasterio
from rasterio.plot import show

# Set the figure defaults before plotting so that they apply to this figure
plt.rcParams["figure.figsize"] = (20, 20)
plt.rcParams["grid.color"] = 'k'
plt.rcParams["grid.linestyle"] = ":"
plt.rcParams["grid.linewidth"] = 0.5
plt.rcParams["grid.alpha"] = 0.5

# Open the saved raster and display its first band
r = rasterio.open(os.environ.get("OSM_DIR") + "araria.tif")
plt.imshow(r.read(1))
#show(r, 2)
plt.show()

Besides the raster image tiles of the map, there is also the nodes-and-edges (graph) model associated with a map. This is vector data rather than imagery. We fetch it with osmnx, which returns it as a networkx graph, and visualize it below.

In [None]:
import osmnx as ox
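# n, s, e, w come from the bounding box cell above.
# This positional order (north, south, east, west) is the osmnx 1.x signature.
araria_graph = ox.graph_from_bbox(n, s, e, w)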

In [None]:
type(araria_graph)

In [None]:
araria_fig, araria_ax = ox.plot_graph(araria_graph)
plt.tight_layout()
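
In the graph model, nodes are points such as intersections and dead ends, and edges are the street segments that connect them. The toy example below shows the same idea with plain Python containers. The node ids, coordinates and tags are invented for illustration and do not come from the Araria data.

```python
from collections import defaultdict

# Hypothetical nodes: id -> (longitude, latitude)
nodes = {1: (87.40, 26.10), 2: (87.45, 26.12), 3: (87.50, 26.15)}

# Hypothetical edges: (from_node, to_node, attributes)
edges = [
    (1, 2, {"highway": "residential"}),
    (2, 3, {"highway": "primary"}),
]

# Build an adjacency list, treating every street as two-way for simplicity
adjacency = defaultdict(list)
for u, v, attrs in edges:
    adjacency[u].append(v)
    adjacency[v].append(u)

# Degree = number of street segments meeting at each node
degree = {node: len(adjacency[node]) for node in nodes}
print(degree)
```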

The following section creates a GeoDataFrame of OSM entities that lie within a north, south, east, west bounding box and match `tags`, a dictionary of tags used to find objects in the selected area. The results are the union, not the intersection, of the matches for each individual tag. All OpenStreetMap tags can be found [here](https://wiki.openstreetmap.org/wiki/Map_features).

In [None]:
tags = {'amenity':True, 'building':True, 'emergency':True, 'highway':True, 'footway':True, 'landuse': True, 'water': True}
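
The difference between union and intersection matching can be seen on a handful of made-up elements. The sketch below mimics the tag semantics in plain Python, where a value of `True` means "any value for this key". It is not osmnx code, and the element list and helper names are hypothetical.

```python
# Hypothetical query, in the same shape as the tags dictionary above
query = {"amenity": True, "building": True}

# Hypothetical OSM-like elements
elements = [
    {"id": 1, "tags": {"amenity": "school"}},
    {"id": 2, "tags": {"building": "yes"}},
    {"id": 3, "tags": {"amenity": "hospital", "building": "yes"}},
    {"id": 4, "tags": {"natural": "wood"}},
]


def matches_key(element_tags, key, wanted):
    """True if the element has `key` with an accepted value."""
    if key not in element_tags:
        return False
    if wanted is True:
        return True
    if isinstance(wanted, list):
        return element_tags[key] in wanted
    return element_tags[key] == wanted


# Union: an element is kept if it matches at least one tag
union_ids = [el["id"] for el in elements
             if any(matches_key(el["tags"], k, v) for k, v in query.items())]

# Intersection: an element would have to match every tag
intersection_ids = [el["id"] for el in elements
                    if all(matches_key(el["tags"], k, v) for k, v in query.items())]

print(union_ids, intersection_ids)
```

Because the results are a union, adding more keys to `tags` can only grow the result set, never shrink it.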

In [None]:
araria_osmdf = ox.geometries.geometries_from_bbox(n, s, e, w, tags=tags)

In [None]:
araria_osmdf.head()

In [None]:
# Save the data frame as a CSV file
araria_osmdf.to_csv(os.environ.get("OSM_DIR") + "araria_osmdf.csv")