# 6-Visualize Clusters

In this notebook we use the results from the previous one to display the geographic clusters on a map.

**Requirements:**

- Run the `5-trip-endpoints.ipynb` notebook and its dependencies first.
- Recommended install: [ipywidgets](https://ipywidgets.readthedocs.io/en/stable/user_install.html). Enable it with `jupyter nbextension enable --py widgetsnbextension --sys-prefix` for Jupyter Notebook, or with `jupyter labextension install @jupyter-widgets/jupyterlab-manager` for JupyterLab.

In [1]:
import numpy as np
import folium

from folium.vector_layers import PolyLine
from sqlapi import VedDb
from h3 import h3
from shapely.geometry import Polygon
from shapely.ops import cascaded_union

Create a `VedDb` object to interface with the database.

In [2]:
db = VedDb()

To illustrate how to retrieve the hexagons of an endpoint cluster, let's run a simple query that returns all the H3 codes for cluster number 23.

In [3]:
sql = """
select h3 from cluster_point where cluster_id = 23
"""
hexes = list({h[0] for h in db.query(sql)})
hexes

['8c274996972dbff',
 '8c27499690de3ff',
 '8c27499690de1ff',
 '8c27499690de5ff',
 '8c27499690c25ff',
 '8c27499690d17ff',
 '8c27499690d3dff',
 '8c27499690d11ff',
 '8c27499690de7ff',
 '8c27499690d57ff',
 '8c27499690d33ff',
 '8c27499690dc9ff',
 '8c27499690d1dff',
 '8c27499690d15ff',
 '8c27499690d3bff',
 '8c27499690d8dff',
 '8c274996972d5ff',
 '8c27499690dc7ff',
 '8c27499690de9ff',
 '8c27499690d19ff',
 '8c27499690d07ff',
 '8c27499690debff',
 '8c27499690d1bff',
 '8c27499690c27ff',
 '8c27499690d13ff',
 '8c27499690dedff',
 '8c27499690dc5ff',
 '8c27499690d03ff',
 '8c27499690d0bff',
 '8c27499690d37ff',
 '8c274996972ddff',
 '8c274996972d9ff',
 '8c274996972cbff',
 '8c27499690d31ff',
 '8c27499690d39ff']

The code above uses a Python `set` comprehension to retrieve the unique H3 codes. Several points in the same cluster are likely to fall within the same H3 hexagon, so the query can return the same code many times; the `set` eliminates these repetitions.
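
As a small illustration of the deduplication step, the sketch below uses made-up rows shaped like single-column query results (an assumption about what `db.query` returns). Sorting the set is optional; it only makes the output order repeatable, since a `set` has no guaranteed order.

```python
# Hypothetical rows, shaped like one-column query results.
rows = [
    ("8c274996972dbff",),
    ("8c27499690de3ff",),
    ("8c274996972dbff",),  # duplicate H3 code
]

# The set comprehension keeps each H3 code once; sorted() fixes the order.
unique_hexes = sorted({row[0] for row in rows})
print(unique_hexes)
```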

To convert an H3 code into a map object, we must first expand it into a set of six geo locations using the `h3_to_geo_boundary` function.

In [4]:
h = hexes[0]

The function call merely returns the geographic coordinates of the hexagon's vertices as (latitude, longitude) pairs. Note how we copy the first coordinate to the back of the list to "close" the hexagon.

In [5]:
geo_lst = h3.h3_to_geo_boundary(h3_address=h)
geo_lst.append(geo_lst[0])
hexagon = np.array(geo_lst)
hexagon

array([[ 42.24338307, -83.74964819],
       [ 42.24335015, -83.74977634],
       [ 42.24325502, -83.74979783],
       [ 42.24319282, -83.74969117],
       [ 42.24322574, -83.74956301],
       [ 42.24332087, -83.74954153],
       [ 42.24338307, -83.74964819]])
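
A minimal sketch of the ring-closing step as a reusable helper. The name `close_ring` is hypothetical, and the function copies its input instead of appending in place, so calling it on an already closed ring changes nothing.

```python
def close_ring(points):
    """Return a copy of points whose last vertex repeats the first."""
    ring = [tuple(p) for p in points]
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring

# Three made-up (lat, lon) vertices.
triangle = [(42.0, -83.0), (42.1, -83.0), (42.0, -83.1)]
closed = close_ring(triangle)
print(len(triangle), len(closed))
```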

## Display the Hexagon on the Map

To display the hexagon, we can now use code similar to what we used in notebook number 4 to display a trip on the map.

In [6]:
tiles = "cartodbpositron"
map = folium.Map(prefer_canvas=True)
t = folium.TileLayer(tiles).add_to(map)

Determine the shape's bounding box and fit the map view to it.

In [7]:
min_lat, max_lat = hexagon[:, 0].min(), hexagon[:, 0].max()
min_lon, max_lon = hexagon[:, 1].min(), hexagon[:, 1].max()
map.fit_bounds([[min_lat, min_lon], [max_lat, max_lon]])
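
The bounding-box computation recurs in the later map cells, so it could be factored into a helper. This is a sketch in plain Python; `bounding_box` is a hypothetical name, and the sample points are the hexagon vertices printed earlier.

```python
def bounding_box(points):
    """Return [[min_lat, min_lon], [max_lat, max_lon]] for (lat, lon) pairs."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]

# Vertices of the sample hexagon, as (lat, lon).
hexagon_points = [
    (42.24338307, -83.74964819),
    (42.24335015, -83.74977634),
    (42.24325502, -83.74979783),
    (42.24319282, -83.74969117),
    (42.24322574, -83.74956301),
    (42.24332087, -83.74954153),
]
bounds = bounding_box(hexagon_points)
print(bounds)
```

The result has the same south-west and north-east corner layout that is passed to `fit_bounds`.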

In [8]:
color = '#3388ff'
opacity = 0.7
polyline = PolyLine(hexagon, color=color, opacity=opacity, fill=True, fill_color=color)
p = polyline.add_to(map)

In [9]:
map

## Display the Cluster on the Map

We can now extend the code to the whole cluster and display all the hexagons side by side.

In [10]:
tiles = "cartodbpositron"
map = folium.Map(prefer_canvas=True)
t = folium.TileLayer(tiles).add_to(map)

bb_list = []  # List for the bounding-box calculation

for h in hexes:
    geo_lst = h3.h3_to_geo_boundary(h3_address=h)
    bb_list.extend(geo_lst)
    geo_lst.append(geo_lst[0])
    polyline = PolyLine(geo_lst, color=color, opacity=opacity, fill=True, fill_color=color)
    p = polyline.add_to(map)
    
locations = np.array(bb_list)
min_lat, max_lat = locations[:, 0].min(), locations[:, 0].max()
min_lon, max_lon = locations[:, 1].min(), locations[:, 1].max()
map.fit_bounds([[min_lat, min_lon], [max_lat, max_lon]])
map

But we can still do better: how about displaying only the cluster outline? We can do so by merging all the hexagons together using Shapely's `Polygon` object and the `cascaded_union` function. The idea is to model each hexagon as a `Polygon` and then merge them into a single map polygon. (Recent Shapely releases deprecate `cascaded_union` in favor of `unary_union`, which accepts the same list of geometries.)

In [11]:
def create_map_polygon(xy, tooltip='',
                       color='#3388ff',
                       opacity=0.7,
                       fill_color='#3388ff',
                       fill_opacity=0.4, 
                       weight=3):
    points = [[x[0], x[1]] for x in xy]
    polygon = folium.vector_layers.Polygon(locations=points,
                                           tooltip=tooltip,
                                           fill=True,
                                           color=color,
                                           fill_color=fill_color,
                                           fill_opacity=fill_opacity,
                                           weight=weight,
                                           opacity=opacity)
    return polygon

Start by creating the map with the whitewashed tiles.

In [12]:
tiles = "cartodbpositron"
map = folium.Map(prefer_canvas=True)
t = folium.TileLayer(tiles).add_to(map)

Now, generate the H3 hexagons and convert them into Shapely `Polygon`s.

In [13]:
bb_list = []  # List for the bounding-box calculation
polygons = []
for h in hexes:
    points = h3.h3_to_geo_boundary(h3_address=h)
    xy = [[x[1], x[0]] for x in points]
    xy.append([points[0][1], points[0][0]])
    polygons.append(Polygon(xy))
    bb_list.extend(points)
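
The loop above swaps each pair because H3 returns (latitude, longitude), while Shapely treats the first coordinate as x (longitude) and the second as y (latitude). A sketch of that swap as two hypothetical helpers, `to_xy` and `to_lat_lon`:

```python
def to_xy(lat_lon_points):
    """Convert (lat, lon) pairs to (x=lon, y=lat) pairs."""
    return [(lon, lat) for lat, lon in lat_lon_points]

def to_lat_lon(xy_points):
    """Convert (x=lon, y=lat) pairs back to (lat, lon) pairs."""
    return [(lat, lon) for lon, lat in xy_points]

# Two made-up (lat, lon) vertices.
sample = [(42.2433, -83.7496), (42.2432, -83.7497)]
xy = to_xy(sample)
print(xy)
print(to_lat_lon(xy) == sample)
```

The second conversion is the one needed when the merged outline goes back to the map, which expects (latitude, longitude) again.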

With a single call to `cascaded_union`, we merge all the `Polygon` objects into a single one that we can conveniently plot on the map. Note that by converting all hexagons into a single polygon, we are not only making the display cleaner, but also reducing the number of redundant points used to define the shape. If you decide to store the shape in a database, this will prove extremely useful.

In [14]:
merged = cascaded_union(polygons)
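
To see why merging removes redundant points, consider this plain-Python sketch. It is not how Shapely computes a union; it only counts edges on a made-up 2x2 grid of unit squares, where any edge shared by two cells is interior and disappears from the merged outline.

```python
from collections import Counter

def square_edges(col, row):
    """Return the four edges of the unit square whose lower-left corner is (col, row)."""
    corners = [(col, row), (col + 1, row), (col + 1, row + 1), (col, row + 1)]
    edges = []
    for i in range(4):
        a, b = corners[i], corners[(i + 1) % 4]
        # Order the endpoints so the same edge compares equal from either cell.
        edges.append((min(a, b), max(a, b)))
    return edges

cells = [(0, 0), (1, 0), (0, 1), (1, 1)]
edge_counts = Counter(e for c in cells for e in square_edges(*c))

total_edges = sum(edge_counts.values())                   # edges stored cell by cell
outline = [e for e, n in edge_counts.items() if n == 1]   # edges owned by one cell only
print(total_edges, len(outline))
```

Storing the four cells separately takes 16 edges, while the merged outline needs only 8.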

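The polygon merge operation may produce something other than a single simple polygon: if the cluster's hexagons are not all contiguous, the result is a `MultiPolygon` made of several disjoint parts. To handle this situation gracefully, we check whether the generated object is a `Polygon` or a `MultiPolygon`. The former is plotted directly; for the latter, we plot only the part whose outline has the most vertices. In both cases only the exterior ring is used, so any interior "holes" are ignored.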

In [15]:
if merged.geom_type == "MultiPolygon":
    max_len = 0
    largest = None
    for geom in merged.geoms:
        xy = geom.exterior.coords.xy
        lxy = list(zip(xy[1], xy[0]))

        if len(lxy) > max_len:
            max_len = len(lxy)
            largest = lxy

    create_map_polygon(largest).add_to(map)
elif merged.geom_type == "Polygon":
    xy = merged.exterior.coords.xy
    lxy = list(zip(xy[1], xy[0]))

    create_map_polygon(lxy).add_to(map)
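
The cell above picks the part whose outline has the most vertices. If "largest" should mean area instead, one alternative is to compare areas; Shapely geometries expose an `area` attribute for this. The sketch below shows the underlying idea in plain Python with the shoelace formula. The name `ring_area` and the sample rings are hypothetical, and the result is in squared coordinate units, which is adequate only for ranking nearby shapes.

```python
def ring_area(ring):
    """Absolute area enclosed by a ring of (x, y) points (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# A unit square traced with 8 vertices, and a 3x3 square traced with 4.
small_detailed = [(0, 0), (0.5, 0), (1, 0), (1, 0.5),
                  (1, 1), (0.5, 1), (0, 1), (0, 0.5)]
large_simple = [(0, 0), (3, 0), (3, 3), (0, 3)]

rings = [small_detailed, large_simple]
by_vertices = max(rings, key=len)       # the criterion used in the cell above
by_area = max(rings, key=ring_area)     # the alternative criterion
print(by_vertices is small_detailed, by_area is large_simple)
```

The two criteria disagree here, which is the case to watch for when a small part of a cluster happens to have a more detailed outline than a large one.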

We can now fit the map view to the shape's boundaries and display it.

In [16]:
locations = np.array(bb_list)
min_lat, max_lat = locations[:, 0].min(), locations[:, 0].max()
min_lon, max_lon = locations[:, 1].min(), locations[:, 1].max()
map.fit_bounds([[min_lat, min_lon], [max_lat, max_lon]])
map

If you hover the mouse cursor over the shape, you will see an empty tooltip. The tooltip would become useful if we could automatically name these clusters using real street names. This is the challenge for the next notebook.