# Polygon data

In [None]:
# import modules
import geopandas as gpd
import folium
import json
import branca

## Polygon data formats

In this tutorial we will work with a shapefile of *polygons* representing municipalities in the provinces of Utrecht and Noord-Holland. 

In [None]:
df = gpd.read_file('Data/polygon/Utrecht_NoordHolland.shp')

In [None]:
df.head()

Provinces are identified by the "prov_name" column. Municipalities are identified by "gem_name" (gem stands for *gemeente*, the Dutch word for municipality). A municipality can be further subdivided into multiple connected small cities and villages, and each row of the table is one such subdivision. The "inhabitant" column gives the number of people living in each subdivision according to the census. For instance, the gemeente of *Amsterdam* is made up of 8 densely populated parts.

In [None]:
# show the number of subdivisions in each gemeente
df['gem_name'].value_counts()

In [None]:
df.sort_values(by='inhabitant', ascending=False).head()

As we saw for Points and LineStrings, a basic plot of a single polygon can be obtained by displaying its geometry:

In [None]:
df.geometry[0]

The `exterior` attribute contains the geometry of the polygon's outer boundary:

In [None]:
df.geometry[0].exterior

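The `area` attribute reports the area enclosed by the boundary, expressed in the squared units of the coordinate system (here, square metres). A quick comparison with the figure listed for the [city of Utrecht](https://en.wikipedia.org/wiki/Utrecht) suggests the values are reasonably accurate.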

In [None]:
# divide by 10**6 (1000 m x 1000 m) to convert sq. m to sq. km
df[df['gem_name']=='Utrecht'].area.sum()/10**6
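
For a simple planar polygon, the enclosed area can be derived from the vertex coordinates with the shoelace formula. The sketch below is a pure-Python illustration of that idea, not the code `geopandas` actually runs; the helper `shoelace_area` and the sample square are hypothetical, and the coordinates are assumed to be in metres in a projected coordinate system.

```python
def shoelace_area(coords):
    """Return the unsigned area of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# hypothetical 2 km x 2 km square, coordinates in metres
square = [(0, 0), (2000, 0), (2000, 2000), (0, 2000)]
area_sqkm = shoelace_area(square) / 10**6  # sq. m -> sq. km
print(area_sqkm)
```

Because the result is in squared coordinate units, the computation is only meaningful in a projected system with metric units; with longitude/latitude coordinates the same formula would return square degrees.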

## Polygon data visualization

While not extremely sophisticated, `geopandas` offers some useful plotting functionality for polygons. Here, for instance, is a quick color-coded plot of the number of inhabitants within each polygon.

In [None]:
df.plot(column = 'inhabitant', cmap='YlOrRd', legend=True);

We resort to `folium` for better visualization. First, we transform our coordinate system to the standard "EPSG:4326" (longitude/latitude) used by the library. We then create a base map centered on the average latitude/longitude coordinates of the gemeente of Utrecht. After that, we plot all the polygons for this gemeente along with some informative pop-up windows.

In [None]:
df.to_crs(epsg=4326, inplace=True)

In [None]:
# select Utrecht rows
utrecht_sel = df['gem_name']=='Utrecht'

# compute average coordinates
avg_x_coord = (df[utrecht_sel].bounds.minx+df[utrecht_sel].bounds.maxx)/2
avg_y_coord = (df[utrecht_sel].bounds.miny+df[utrecht_sel].bounds.maxy)/2
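
The centering logic can be sketched without any geospatial library. The example below assumes each polygon's bounds are available as a `(minx, miny, maxx, maxy)` tuple in degrees; the helper `map_center` and the sample values are made up for illustration. Note that `folium.Map` expects `location` as `[latitude, longitude]`, that is `[y, x]`, which is why the y average comes first.

```python
def map_center(bounds_list):
    """Average the bounding-box midpoints and return [lat, lon]."""
    mid_x = [(minx + maxx) / 2 for minx, _, maxx, _ in bounds_list]
    mid_y = [(miny + maxy) / 2 for _, miny, _, maxy in bounds_list]
    # folium wants latitude (y) first, then longitude (x)
    return [sum(mid_y) / len(mid_y), sum(mid_x) / len(mid_x)]


# made-up bounds for two neighbouring polygons
bounds = [(5.0, 52.0, 5.2, 52.1), (5.1, 52.05, 5.3, 52.15)]
center = map_center(bounds)
print(center)
```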

In [None]:
# create folium base map
poly_map = folium.Map(
    location=[avg_y_coord.mean(),avg_x_coord.mean()],    
    zoom_start=12
)

# visualize Utrecht polygons
utrecht_gjson = folium.features.GeoJson(
    df[utrecht_sel],
).add_to(poly_map)

# add informative pop-up windows
folium.features.GeoJsonPopup(
    fields=['mzr_name', 'gem_name', 'inhabitant'],
    aliases=['Location', 'Municipality', 'Population'],
    labels=True
).add_to(utrecht_gjson)

# visualize map
poly_map

### Tooltip

Clicking on each displayed element to retrieve information can be tedious; `folium` offers an alternative with the `Tooltip` feature, which shows the information when the mouse pointer hovers over a polygon.

In [None]:
# create folium base map
poly_map = folium.Map(
    location=[avg_y_coord.mean(),avg_x_coord.mean()],    
    zoom_start=12
)

# visualize Utrecht polygons
utrecht_gjson = folium.features.GeoJson(
    df[utrecht_sel],
).add_to(poly_map)

# add tooltip functionality
folium.features.GeoJsonTooltip(
    fields=['mzr_name', 'gem_name', 'inhabitant'],
    aliases=['Location', 'Municipality', 'Population']
).add_to(utrecht_gjson)

# visualize map
poly_map

### Choropleth

While prettier, the maps above are less informative than the earlier `geopandas` plot, which showed how inhabitants are distributed across the municipalities.

That type of map is known as a [choropleth map](https://en.wikipedia.org/wiki/Choropleth_map). Choropleth maps are statistical thematic maps that use color coding to summarize a geographic characteristic within spatial enumeration units, such as totals (e.g. total population) or averages (e.g. population density or per-capita income). They provide an easy way to visualize how a variable changes across a geographic area or to show the level of variability within a region.

[There are many ways](https://towardsdatascience.com/creating-choropleth-maps-with-pythons-folium-library-cfacfb40f56a) to create such visualizations with `folium`. Below we provide a solution that relies on a `style_function` and the `branca` library, as introduced in the [previous notebook](./2_Lines.ipynb) on `LineStrings`.

In [None]:
# create linear colormap interpolating 3 colors
colors = branca.colormap.LinearColormap(
    ['green', 'yellow', 'red'], vmin=df.inhabitant.min(), vmax=df.inhabitant.max())
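
Conceptually, a linear colormap places its anchor colors at evenly spaced positions between `vmin` and `vmax` and blends the two nearest anchors for any value in between. The pure-Python sketch below illustrates that idea under stated assumptions: the helper `linear_colormap` and the RGB triples are hypothetical, values outside the range are clamped, `vmax` is assumed to be larger than `vmin`, and `branca`'s own implementation may differ in its details.

```python
def linear_colormap(colors, vmin, vmax):
    """Return a function mapping a value in [vmin, vmax] to a hex color."""

    def to_hex(rgb):
        return "#{:02x}{:02x}{:02x}".format(*(round(c) for c in rgb))

    def mapper(value):
        # clamp the value, then rescale it to a position along the color list
        t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
        pos = t * (len(colors) - 1)
        i = min(int(pos), len(colors) - 2)  # index of the left anchor color
        frac = pos - i                      # weight of the right anchor color
        left, right = colors[i], colors[i + 1]
        return to_hex(tuple(l + frac * (r - l) for l, r in zip(left, right)))

    return mapper


# assumed RGB values for the three anchor colors
GREEN, YELLOW, RED = (0, 128, 0), (255, 255, 0), (255, 0, 0)
cmap = linear_colormap([GREEN, YELLOW, RED], vmin=0, vmax=1000)
print(cmap(0), cmap(500), cmap(1000))
```

With three anchors, the middle color sits exactly halfway through the range, so a skewed variable such as population will push most polygons toward the low end of the scale.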

In [None]:
# define style function
def population_choropleth(row):
    return {
        "fillColor": colors(row['properties']['inhabitant']),
        "color": "white",
        "weight": 1,
        "fillOpacity": 0.75,
    }
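
`folium` calls the style function once per GeoJSON feature, passing the feature as a dictionary in which the attribute columns sit under the `properties` key. The sketch below shows that contract on a hand-built feature with made-up values; `fake_colors` is a hypothetical stand-in for the `branca` colormap, and `population_style` mirrors the style function above so the example runs without any mapping library.

```python
import json


def fake_colors(value):
    """Hypothetical stand-in for the colormap: a simple two-class rule."""
    return "#ff0000" if value > 10000 else "#008000"


def population_style(feature):
    """Map one GeoJSON feature (a dict) to a style dictionary."""
    return {
        "fillColor": fake_colors(feature["properties"]["inhabitant"]),
        "color": "white",
        "weight": 1,
        "fillOpacity": 0.75,
    }


# hand-built GeoJSON feature with made-up values
feature = json.loads("""
{
  "type": "Feature",
  "properties": {"gem_name": "Example", "inhabitant": 25000},
  "geometry": {
    "type": "Polygon",
    "coordinates": [[[5.0, 52.0], [5.1, 52.0], [5.1, 52.1], [5.0, 52.0]]]
  }
}
""")

style = population_style(feature)
print(style)
```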

In [None]:
# create base map
poly_map = folium.Map(
    location=[avg_y_coord.mean(),avg_x_coord.mean()],    
    zoom_start=8
)

# overlay choropleth
gjson = folium.features.GeoJson(
    df,
    style_function=population_choropleth,
).add_to(poly_map)

# add colormap to the map
poly_map.add_child(colors)

# display
poly_map

### Exercise

You are given a geospatial dataset for the provinces in Vietnam. The dataset contains the polygon geometry along with information on the name of the province, whether it is a city or not, and its surface area in sq. km. 

In [None]:
gdf_vietnam = gpd.read_file('./Data/case_study/vietnam_bound.geojson')
gdf_vietnam['area_sqkm']=round(gdf_vietnam.to_crs(epsg=9215).area/10**6)
gdf_vietnam = gdf_vietnam[['VARNAME_1','ENGTYPE_1','area_sqkm','geometry']]
gdf_vietnam.columns = ['name','type','area_sqkm','geometry']

In [None]:
gdf_vietnam.plot(column='type');

Your task is to use `folium`, as shown earlier in the notebook, to display a choropleth map of the Vietnam provinces, where:

1. The map is centered on Hanoi, i.e. approximately `latitude = 21.03 N`, `longitude = 105.8 E`;
2. The choropleth shows the surface area of each province;
3. Colors are assigned using a linear `branca` colormap scaled between the minimum and the maximum;
4. The colorbar is visible on the map;
5. The map has a `Tooltip` feature that shows all information available on the province.

#### Solution

In [None]:
""" Your code here"""