## Spatial Data Science with CityJSON

This notebook ***works with*** the product of [osm_LoD1_3DCityModel](https://github.com/AdrianKriger/osm_LoD1_3DCityModel): a previously created CityJSON city model.

**This notebook will:**

> **1. allow the user to execute an application of Spatial Data Science**  
>
>> **a)  population estimation and**  
>> **b)  a measure of [Building Volume per Capita](https://www.researchgate.net/publication/343185735_Building_Volume_Per_Capita_BVPC_A_Spatially_Explicit_Measure_of_Inequality_Relevant_to_the_SDGs).**
>
> **2. produce an interactive visualization** *- via [pydeck](https://deckgl.readthedocs.io/en/latest/) - which a user can navigate, query and share* ***[to do]***.  

In [None]:
#load the magic

%matplotlib inline
import os
from pathlib import Path

import numpy as np
import pandas as pd
import geopandas as gpd
import shapely
from shapely.geometry import Polygon, shape, mapping
import json
import geojson

from cjio import cityjson

import matplotlib.pyplot as plt
import pydeck as pdk

**The area under investigation is [University Estate](https://en.wikipedia.org/wiki/University_Estate). Its 3D CityJSON model is available as `citjsnClean_uEstate10m.json` in the [result folder](https://github.com/AdrianKriger/osm_LoD1_3DCityModel/blob/main/village_campus/result/citjsnClean_uEstate10m.json).**

In [None]:
#- use the same parameter file from osm_LoD1_3DCityModel ~~ osm3DuEstate_param.json
#- the context manager closes the file once it has been read
with open('osm3DuEstate_param.json') as f:
    jparams = json.load(f)

In [None]:
cm = cityjson.load(path=jparams['cjsn_solid'])

In [None]:
df = cm.to_dataframe()
df = df[1:]

In [None]:
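footprints = []

for co_id, co in cm.cityobjects.items():
    if co.type == 'Building':
        # each building is a Solid with a single shell; its last surface is taken as the footprint
        [shell] = co.geometry[0].boundaries
        rings = shell[-1]
        # the first ring is the exterior; any further rings are treated as holes
        footprints.append(Polygon(rings[0], rings[1:]))

# Create a GeoDataFrame
gdf = gpd.GeoDataFrame(df, geometry=footprints, crs=jparams['crs'])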
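The footprint extraction above relies on the nesting of a Solid's boundaries: solid, then shells, then surfaces, then rings, then `(x, y, z)` vertices. The following is a minimal, library-free sketch of that nesting on a hypothetical box (walls omitted for brevity). The helper names `ground_surface` and `ring_area` are illustrative assumptions, not part of cjio, and the sketch assumes a perfectly flat ground surface, which may not hold for every building in a real model.

```python
# Hypothetical 10 m x 5 m x 3 m box; each surface is a list of rings,
# each ring a list of (x, y, z) vertices. Walls are left out for brevity.
roof = [[(0, 0, 3), (10, 0, 3), (10, 5, 3), (0, 5, 3)]]
ground = [[(0, 0, 0), (0, 5, 0), (10, 5, 0), (10, 0, 0)]]  # one exterior ring, no holes
shell = [roof, ground]  # the ground surface comes last, as the notebook assumes


def ground_surface(shell):
    """Return the surface whose vertices all lie at the lowest z value (flat ground assumed)."""
    z_min = min(z for surface in shell for ring in surface for (_, _, z) in ring)
    for surface in shell:
        if all(z == z_min for ring in surface for (_, _, z) in ring):
            return surface
    return None


def ring_area(ring):
    """Planar (x, y) area of a ring via the shoelace formula."""
    total = 0.0
    for (x1, y1, _), (x2, y2, _) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


footprint = ground_surface(shell)
print("Footprint area:", ring_area(footprint[0]))
```

Selecting the surface by elevation rather than by position is one way to avoid depending on the order in which surfaces were written; it is offered only as an alternative to consider.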

## 1. Spatial Data Science

<div class="alert alert-block alert-warning"><b>We start with basic spatial analysis</b>  
    
     
- We'll estimate the population, within our area of interest, and then  
- calculate the Building Volume Per Capita (BVPC).
</div>

While population estimation is well documented, recent investigations to **understand overcrowding** have led to newer measures.  

The most notable of these is **Building Volume Per Capita (BVPC)** [(Ghosh, T; et al. 2020)](https://www.researchgate.net/publication/343185735_Building_Volume_Per_Capita_BVPC_A_Spatially_Explicit_Measure_of_Inequality_Relevant_to_the_SDGs). BVPC is the number of cubic meters of building per person. **BVPC tells us how much space one person has per residential living unit** (a house / apartment / etc.). It is ***a proxy measure of economic inequality and a direct measure of housing inequality***.

BVPC builds on the work of [(Reddy, A and Leslie, T.F., 2013)](https://www.tandfonline.com/doi/abs/10.1080/02723638.2015.1060696?journalCode=rurb20) and attempts to integrate with several **[Sustainable Development Goals](https://sdgs.un.org/goals)** (most notably: **[SDG 11: Developing sustainable cities and communities](https://sdgs.un.org/goals/goal11)**). It captures the average ***'living space'*** each person has in their home.

<div class="alert alert-block alert-info"><b>These analyses expect the user to have some basic knowledge of the environment under investigation</b> </div>

In [None]:
gdf.head(2)

| | osm_id | osm_address | osm_building | osm_building:levels | plus_code | ground_height | building_height | roof_height | osm_name | osm_office | osm_type | osm_website | osm_operator | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 739615941 | 739615941.0 | 10 Rhodes Avenue University Estate Cape Town | house | 2 | 4FRW3C6X+WRG | 96.75 | 6.9 | 103.65 | | | | | | POLYGON Z ((264270.087 6241814.254 96.750, 264... |
| 740820432 | 740820432.0 | 100 Upper Roodebloem Road University Estate Ca... | house | 2 | 4FRW3F62+R87 | 85.37 | 6.9 | 92.27 | | | | | | POLYGON Z ((264380.263 6241798.225 83.530, 264... |


<div class="alert alert-block alert-success"><b>1.  a) Estimate Population:</b></div>

In [None]:
#--we only want building=house or =apartment or =residential
gdf = gdf[gdf["osm_building"].isin(['house', 'apartment', 'residential'])].copy()

In [None]:
len(gdf)

295

**This area is urban with single level housing units. Estimating the population is therefore fairly straightforward.**

<div class="alert alert-block alert-info"><b>We start with local knowledge.</b></div>

**On average there are roughly `4` people per `building:house` in this area.**  

**Additionally an *informal* structure is tagged `building:residential` and houses `3` people.**

<div class="alert alert-block alert-warning"><b></b>  
    
**Furthermore:**  
    - For `building:apartment`, the `building:flats` *'key:value'* pair *(the number of units)* is harvested and multiplied by `3` people per apartment.  
    - Student accommodation is tagged `building:residential` with `residential:student`. If `level: > 1`, the `building:flats` *'key:value'* pair *(the number of units)* is harvested and multiplied by `1` person per apartment; otherwise `3` people are assumed in a house share.
    
**The tagging scheme and numbers are based on *how your community is mapped* and local knowledge.**
</div>

In [None]:
def pop(row):
    # NOTE: 'flats', 'res' and 'levels' must match the column names harvested from OSM
    if row['osm_building'] == 'house':
        return 4
    if row['osm_building'] == 'apartment':
        return row['flats'] * 3
    # test student accommodation before the generic residential case; otherwise it is never reached
    if row['osm_building'] == 'residential' and row.get('res') == 'student':
        if row['levels'] > 1:
            return row['flats'] * 1
        return 3
    if row['osm_building'] == 'residential': #here should be an additional: and row['res'] == 'informal':
        return 3

gdf['pop'] = gdf.apply(pop, axis=1)

est_pop = gdf['pop'].sum()
print('The estimated population is:', est_pop)

The estimated population is: 1180


**The official [STATSSA 2011 census figure](https://en.wikipedia.org/wiki/University_Estate) for this community is 987.** Compared with the estimate of 1 180, this suggests a population growth rate of approximately 1.49% per year.

This growth rate is calculated using the formula for **[Annual population growth](https://databank.worldbank.org/metadataglossary/health-nutrition-and-population-statistics/series/SP.POP.GROW):**

$$r = \frac{\ln\left(\frac{\text{End Population}}{\text{Start Population}}\right)}{n} \times 100 = \frac{\ln\left(\frac{1180}{987}\right)}{12} \times 100 \approx 1.49\%$$
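As a quick check of the arithmetic, the formula can be evaluated directly. This is a minimal sketch; the function name `annual_growth_rate` is illustrative, and the inputs are the census figure, the estimate and the `n = 12` years used in the formula above.

```python
import math


def annual_growth_rate(start_pop, end_pop, years):
    """Continuous (logarithmic) annual growth rate, in percent."""
    return math.log(end_pop / start_pop) / years * 100


rate = annual_growth_rate(start_pop=987, end_pop=1180, years=12)
print(f"Annual growth rate: {rate:.2f}%")
```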


<div class="alert alert-block alert-success"><b>1. b) Building Volume Per Capita (BVPC):</b>  
BVPC = sum of building volume divided by the total population of a community</div>

In [None]:
gdf['area'] = gdf['geometry'].area                     # footprint area, in the units of the CRS
gdf['volume'] = gdf['area'] * gdf['building_height']   # LoD1 block volume: footprint area x height
gdf['bvpc'] = gdf['volume'] / gdf['pop']               # volume per person, per building

gdf.tail(2)

| | osm_id | osm_address | osm_building | osm_building:levels | plus_code | ground_height | building_height | roof_height | osm_name | osm_office | osm_type | osm_website | osm_operator | geometry | pop | area | volume | bvpc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1025219390 | 1025219000.0 | 10 Kylemore Road University Estate Cape Town | house | 2 | 4FRW3C6X+RJF | 105.13 | 6.9 | 112.03 | | | | | | POLYGON Z ((264216.844 6241792.852 101.710, 26... | 5 | 123.180202 | 849.943394 | 169.988679 |
| 1025219391 | 1025219000.0 | 2 Rhodes Avenue University Estate Cape Town | house | 2 | 4FRW3C7X+2JJ | 101.25 | 6.9 | 108.15 | | | | | | POLYGON Z ((264223.549 6241845.738 101.250, 26... | 5 | 105.445078 | 727.571038 | 145.514208 |
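The per-building values can be verified by hand. The sketch below recomputes `volume` and `bvpc` for 10 Kylemore Road using only the `area`, `building_height` and `pop` values printed in the output above; the variable names are illustrative.

```python
area = 123.180202       # footprint area from the output above
building_height = 6.9   # building height from the output above
occupants = 5           # the `pop` value printed for this building

volume = area * building_height      # LoD1 block volume
bvpc_building = volume / occupants   # volume per person for this building

print(round(volume, 3), round(bvpc_building, 3))
```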


In [None]:
print(gdf['bvpc'].describe())

count    295.000000
mean     170.914266
std      101.480504
min       27.591950
25%      103.277695
50%      143.377712
75%      212.110726
max      777.464197
Name: bvpc, dtype: float64


In [None]:
bvpc = round(gdf['volume'].sum() / est_pop, 3)

print('Building Volume Per Capita (BVPC):', bvpc)

Building Volume Per Capita (BVPC): 170.914
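Note that the community-level figure (total volume divided by total population) and the mean of the per-building `bvpc` column are different statistics. They coincide only when every building is assigned the same number of occupants. A minimal sketch with two hypothetical buildings (the numbers are invented purely for illustration) shows the difference:

```python
# Two hypothetical buildings; values are illustrative only
buildings = [
    {"volume": 800.0, "pop": 4},  # 200 cubic meters per person
    {"volume": 300.0, "pop": 6},  # 50 cubic meters per person
]

# mean of the per-building ratios (what describe() reports as 'mean')
mean_of_ratios = sum(b["volume"] / b["pop"] for b in buildings) / len(buildings)

# ratio of the totals (the community-level BVPC)
ratio_of_totals = sum(b["volume"] for b in buildings) / sum(b["pop"] for b in buildings)

print(mean_of_ratios, ratio_of_totals)
```

The ratio of totals weights each building by the number of people living in it, which is why it is the figure used for the community as a whole.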


**This BVPC value is general.**  

We can separate `building:house` and `building:residential` to understand the differences between ***formal and informal*** housing in this area.
    
**We want to understand the living space *(the cubic-meter BVPC value)* each person has in their home.**

In [None]:
formal = gdf[gdf["osm_building"].isin(['house'])].copy()
f_pop = formal['pop'].sum()

informal = gdf[gdf["osm_building"].isin(['residential'])].copy()
inf_pop = informal['pop'].sum()

# divide each group's volume by that group's own population (not the overall estimate);
# guard against an empty group to avoid dividing by zero
bvpc_formal = round(formal['volume'].sum() / f_pop, 3) if f_pop else 0.0
bvpc_informal = round(informal['volume'].sum() / inf_pop, 3) if inf_pop else 0.0

print('FORMAL: Population: ', f_pop, ' with Building Volume Per Capita (BVPC):', bvpc_formal)
print('')
print('INFORMAL: Population: ', inf_pop, ' with Building Volume Per Capita (BVPC)', bvpc_informal)

FORMAL: Population:  1475  with Building Volume Per Capita (BVPC): 170.914

INFORMAL: Population:  0  with Building Volume Per Capita (BVPC) 0.0
