# Week 10

Example geospatial analysis, with "cinemas in Singapore" data.

### 1 Setup

We will use our trusty `pandas`, but will also use `geopandas` and `folium`. Please install them if you haven't done so (see the video for installation instructions). 

In [None]:
!pip install geopandas
!pip install folium

### 2 Loading data

If you are on Colab, please download and unzip the folder with the sample data.

In [None]:
!wget -q https://github.com/gei1002/sampledata/raw/main/week10_data.zip
!unzip -q week10_data.zip

### 3 Loading modules

In [None]:
import pandas as pd
import geopandas as gpd
import folium
from folium import Choropleth

### 4 Analysis

Here we load the Singapore shapefile as a geopandas dataframe. Note that we only need to specify the directory that contains the shapefile.

In [None]:
sg = gpd.read_file("singapore/shp")

In [None]:
sg.head()

We need to change the Coordinate Reference System (CRS) to EPSG 4326, because we **always** need this CRS for folium. Some shapefiles already use this CRS, but ours does not, so we convert it. We use `inplace=True` to change our original dataframe, `sg`, instead of creating a new one.

In [None]:
sg.to_crs(epsg=4326, inplace=True)

In [None]:
sg.head()
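
As a quick sanity check, it can help to look at the `crs` attribute before and after converting. The sketch below does not use the course data: it builds a one-point geopandas dataframe in EPSG 3414 (SVY21, a projected CRS often used for Singapore data, chosen here only as an assumption for illustration) and converts it without `inplace=True`, which returns a converted copy and leaves the original untouched.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative point in EPSG 3414 (SVY21), in metres; not taken from the course shapefile.
demo = gpd.GeoDataFrame(
    {"name": ["demo point"]},
    geometry=[Point(28001.642, 38744.572)],
    crs="EPSG:3414",
)
print(demo.crs.to_epsg())  # CRS before conversion

# Without inplace=True, to_crs() returns a converted copy.
demo_wgs84 = demo.to_crs(epsg=4326)
print(demo_wgs84.crs.to_epsg())  # CRS after conversion
print(demo_wgs84.geometry.iloc[0])  # now expressed as longitude and latitude in degrees
```

Running `sg.crs` on the real dataframe in the same way should show whether the conversion above took effect.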

We then set **REGION_N** as the index.

In [None]:
sg.set_index("REGION_N", inplace=True)

In [None]:
sg.head()
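
One thing to watch with `inplace=True`: the method then returns `None`, so its result should not be assigned back to a variable. The following is a minimal sketch on a made-up table (the `value` column and its numbers are placeholders, not course data) that contrasts the two styles.

```python
import pandas as pd

# Made-up table standing in for `sg`; only the column name REGION_N comes from the shapefile.
demo = pd.DataFrame({"REGION_N": ["North Region", "West Region"], "value": [1, 2]})

# Option 1: keep the result of set_index() in a new variable; `demo` is unchanged.
indexed = demo.set_index("REGION_N")

# Option 2: modify the dataframe itself; the call then returns None.
result = demo.set_index("REGION_N", inplace=True)

print(indexed.index.tolist())
print(demo.index.tolist())
print(result)
```

Either style works, as long as you do not combine them by writing something like `sg = sg.set_index(..., inplace=True)`.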

The next step is to load our cinemas data into a dataframe. This time we are using a csv file, but the syntax is almost the same. Just note that it is `read_csv()` instead of `read_excel()`.

In [None]:
df = pd.read_csv("data/cinemas_singapore.csv")

In [None]:
df.head()

We also need to set the index to **Region**, to match our geopandas dataframe. The values in the index ("North Region", "West Region", etc.) need to be the same in both dataframes, even if the name of the original column is not (i.e., "REGION_N" in `sg` and "Region" in `df`).

In [None]:
df.set_index("Region", inplace=True)

In [None]:
df
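
Because the match depends on the index values being identical, it may be worth checking for labels that appear in only one of the two indexes. This is a minimal sketch with hypothetical stand-ins for `sg.index` and `df`: the counts are made up and the lowercase "east region" is a deliberate mismatch.

```python
import pandas as pd

# Hypothetical stand-ins for the indexes of `sg` and `df`; the counts are made up.
shape_regions = pd.Index(["North Region", "West Region", "East Region"], name="REGION_N")
cinemas = pd.DataFrame(
    {"Region": ["North Region", "West Region", "east region"], "Cinemas": [1, 2, 3]}
).set_index("Region")

# Labels present in one index but not the other cannot be matched on the map.
missing_in_data = shape_regions.difference(cinemas.index)
missing_in_shapes = cinemas.index.difference(shape_regions)

print(list(missing_in_data))
print(list(missing_in_shapes))
```

With the real data, `sg.index.difference(df.index)` and `df.index.difference(sg.index)` should both come back empty if the labels line up.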

### 5 Creating the choropleth

Some notes for the `Map()` function:
- `location` indicates the starting position of your map in latitude and longitude, separated by a comma, inside square brackets.
- `zoom_start` indicates the level at which you want the map to be initially zoomed.
- `tiles` indicates which base map you want to use. It takes the following options:
    - openStreetMap
    - cartodbPositron
    - cartodbDark_Matter
    - stamenTerrain
    - stamenToner
    - stamenWatercolor
    
For the `Choropleth()` function:
- `geo_data` specifies the boundaries of each geographical area. You always need to append `__geo_interface__` to the name of your geopandas dataframe.
- `data` specifies the quantitative values that will be used to determine the color of each geographical area.  
- `key_on` will always be set to `feature.id`.  
- `fill_color` sets the color scale. It takes ColorBrewer palette names such as "Blues" or "YlGn", many of which also appear among the Matplotlib color maps: https://matplotlib.org/stable/tutorials/colors/colormaps.html.
- `legend_name` labels the legend in the top right corner of the map.
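
To see why `key_on="feature.id"` works together with setting the index, it may help to picture what `__geo_interface__` hands over: a GeoJSON-like dictionary in which each feature's `id` is the index label of its row. The sketch below is a simplified, hand-written picture of that matching, not folium's actual implementation. The geometries are left out and the counts are made up.

```python
# Hand-written stand-in for what `sg.__geo_interface__` returns: a GeoJSON-like
# dictionary in which each feature's "id" is the dataframe index label.
# Geometries are omitted (None) and the counts below are made up.
geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": "North Region", "properties": {}, "geometry": None},
        {"type": "Feature", "id": "West Region", "properties": {}, "geometry": None},
    ],
}
counts = {"North Region": 1, "West Region": 2}  # plays the role of df["Cinemas"]

# key_on="feature.id" means: take each feature's "id" and look it up in `data`.
matched = {feature["id"]: counts.get(feature["id"]) for feature in geo["features"]}
print(matched)
```

A feature whose `id` has no counterpart in the data would get no value here, which is why the index labels of `sg` and `df` have to agree.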


In [None]:
sg_map = folium.Map(location=[1.35, 103.8198], zoom_start=11, tiles="openStreetMap")

Choropleth(geo_data=sg.__geo_interface__,
           data=df["Cinemas"],
           key_on="feature.id",
           fill_color="Blues",
           legend_name="Cinemas in Singapore").add_to(sg_map)

sg_map