<a href="https://colab.research.google.com/github/kentstephen/duckdb_h3/blob/main/road_complexity.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Analyzing Road Complexity by Aggregating Intersection Data from Overture Maps

In [None]:
! pip install keplergl duckdb pandas -q

In [None]:
import duckdb
con = duckdb.connect(config={"allow_unsigned_extensions": True})
con.sql(""" INSTALL h3ext FROM 'https://pub-cc26a6fd5d8240078bd0c2e0623393a5.r2.dev';
            LOAD h3ext;
            INSTALL spatial;
            LOAD spatial;
            INSTALL httpfs;
            LOAD httpfs;
            SET s3_region='us-west-2';""")

This is how you find areas to query from Overture. It takes some getting used to, but you can query the `division_area` data directly to troubleshoot. I suggest looking at [Overture's Docs](https://docs.overturemaps.org/schema/reference/divisions/division_area/) to get a sense of what to look for.
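
If the lookup below comes back empty, one way to troubleshoot is to list what actually exists in a region before filtering on a name. The following is a minimal sketch: `build_division_lookup_sql` and `DIVISION_AREA_PATH` are hypothetical names introduced here, and the columns are the same ones the notebook's own query filters on.

```python
# Hypothetical helper (not part of the original notebook): builds a query that
# lists every subtype and primary name available in one region.
DIVISION_AREA_PATH = (
    "s3://overturemaps-us-west-2/release/2024-06-13-beta.0/"
    "theme=divisions/type=division_area/*"
)


def build_division_lookup_sql(country: str, region: str) -> str:
    """Return SQL listing the subtype and primary name of each area in a region."""
    return f"""
        SELECT DISTINCT subtype, names.primary AS name
        FROM read_parquet('{DIVISION_AREA_PATH}', hive_partitioning=1)
        WHERE country = '{country}'
          AND region = '{region}'
        ORDER BY subtype, name
    """


lookup_sql = build_division_lookup_sql("US", "US-CA")
print(lookup_sql)
# In the notebook you would then run it with: con.sql(lookup_sql).show()
```

Scanning the output for the spelling Overture uses for a county or locality tends to be quicker than guessing at `names.primary` values.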

In [None]:
bbox, wkt_geom = con.sql("""
select
    bbox,
    ST_AsText(ST_GeomFromWKB(geometry)) AS wkt
from read_parquet('s3://overturemaps-us-west-2/release/2024-06-13-beta.0/theme=divisions/type=division_area/*', filename=true, hive_partitioning=1)
where
    country = 'US'
    and region = 'US-CA'
    and subtype = 'county'
   and names.primary = 'Los Angeles'

"""
).fetchall()[0]

### We are querying the connector type from the transportation theme.
1. We first create a geom from Overture's binary geometries, then find centroids (probably unnecessary, because connectors should already be points, but useful if there's a surprise).

2. We filter by bbox and wkt (I call it `wkt_geom` because if I am using shapely too it can get confusing) to find our area of interest.

3. We then make cells with the lat and long, adding a `COUNT(1)` to get the number of connectors in each cell.

4. Then we turn the H3 cell IDs into strings, which is useful for visualizing in the [Kepler.gl](https://kepler.gl) web app or [Foursquare Studio](https://studio.foursquare.com/), but not necessary for Kepler in Jupyter.

5. We then calculate the SUM of the counts, but I'm not sure I really need to do that, since the previous step already returns one row per cell. Maybe some SQL wizard will let me know.
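
The bbox filter in step 2 is a cheap prefilter: four numeric comparisons decide whether two rectangles can overlap at all, so the exact `ST_Intersects` check only has to run on the rows that survive. Here is a minimal sketch of that overlap test in plain Python. The `BBox` type, the `bboxes_overlap` function, and the coordinates are illustrative assumptions, not values from Overture.

```python
from typing import TypedDict


class BBox(TypedDict):
    xmin: float
    xmax: float
    ymin: float
    ymax: float


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """Mirror the four comparisons used in the query's WHERE clause."""
    return (
        a["xmin"] <= b["xmax"]
        and a["xmax"] >= b["xmin"]
        and a["ymin"] <= b["ymax"]
        and a["ymax"] >= b["ymin"]
    )


# Made-up coordinates for illustration only.
area: BBox = {"xmin": -119.0, "xmax": -117.6, "ymin": 33.7, "ymax": 34.8}
inside: BBox = {"xmin": -118.3, "xmax": -118.3, "ymin": 34.0, "ymax": 34.0}
outside: BBox = {"xmin": -122.4, "xmax": -122.4, "ymin": 37.8, "ymax": 37.8}

print(bboxes_overlap(inside, area))
print(bboxes_overlap(outside, area))
```

A point that passes this test can still fall outside the county polygon, which is why the query keeps the `ST_Intersects` condition as well.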

I keep hearing about the Ibis framework. If someone wants to convert this query into Ibis, I'd be very curious to see what it would look like.

In [None]:
resolution = 9
query = f"""
    WITH geometry_cte AS (
        SELECT
            ST_Y(ST_Centroid(ST_GeomFromWKB(geometry))) AS latitude,
            ST_X(ST_Centroid(ST_GeomFromWKB(geometry))) AS longitude
        FROM read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=transportation/type=connector/*', filename=true, hive_partitioning=1)
         WHERE
            bbox.xmin <= {bbox["xmax"]}
            AND bbox.xmax >= {bbox["xmin"]}
            AND bbox.ymin <= {bbox["ymax"]}
            AND bbox.ymax >= {bbox["ymin"]}
            AND ST_Intersects(ST_GeomFromWKB(geometry), ST_GeomFromText('{wkt_geom}'))
    ),
    h3_cells_cte AS (
        SELECT
            h3_latlng_to_cell(latitude, longitude, {resolution}) AS cell_id,
            COUNT(1) as cnt
        FROM geometry_cte
        GROUP BY 1
    )
        SELECT
            h3_h3_to_string(cell_id) AS cell_id,
            h3_cell_to_boundary_wkt(cell_id) as cell_boundary,
            SUM(cnt) as cnt
        FROM h3_cells_cte
        GROUP BY ALL
        """
df = con.sql(query).df()
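
On the question of whether the final `SUM` is needed: the first `GROUP BY` already produces one row per cell, so summing again per cell should return the same numbers, assuming the string ID and boundary are each determined by the cell ID. The sketch below checks that idea on toy data. It uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the `points` table and its values are made up.

```python
import sqlite3

# Toy stand-in for geometry_cte: one row per connector, tagged with its cell.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (cell_id TEXT)")
conn.executemany(
    "INSERT INTO points VALUES (?)",
    [("a",), ("a",), ("b",), ("a",), ("c",), ("b",)],
)

# Counts straight from the first aggregation.
inner = conn.execute(
    "SELECT cell_id, COUNT(1) AS cnt FROM points GROUP BY 1 ORDER BY 1"
).fetchall()

# Counts after a second aggregation over the already-grouped rows.
outer = conn.execute(
    """
    WITH cells AS (
        SELECT cell_id, COUNT(1) AS cnt FROM points GROUP BY 1
    )
    SELECT cell_id, SUM(cnt) AS cnt
    FROM cells
    GROUP BY cell_id
    ORDER BY cell_id
    """
).fetchall()

print(inner)
print(outer)
print(inner == outer)
```

If the same holds for the real query, the last `SELECT` could return `cnt` directly and drop the `GROUP BY ALL`.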

This is the Kepler config. To save your own, launch the map, make your changes, run the `print(map_1.config)` cell below, and then swap the printed config in for the dict here.
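
As an alternative to pasting the printed dict back into the notebook, the config could be written to a JSON file and reloaded later. This is only a sketch under assumptions: `edited_config` is a small stand-in for whatever `map_1.config` returns, and the file name `kepler_config.json` is hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Toy stand-in for the dict you would get from map_1.config after editing the map.
edited_config = {
    "version": "v1",
    "config": {
        "mapState": {"latitude": 34.035, "longitude": -118.335, "zoom": 11.4},
        "mapStyle": {"styleType": "dark"},
    },
}

# Hypothetical file location; any writable path works.
config_path = Path(tempfile.gettempdir()) / "kepler_config.json"
config_path.write_text(json.dumps(edited_config, indent=2))

# Later, or in another session: read the file back into a dict.
restored_config = json.loads(config_path.read_text())
print(restored_config == edited_config)
```

The restored dict can then be assigned to the map's `config` attribute the same way `kepler_config` is in this notebook.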

In [None]:
kepler_config = {'version': 'v1', 'config': {'visState': {'filters': [], 'layers': [{'id': '5t1wszk', 'type': 'geojson', 'config': {'dataId': 'Intersection Denisty', 'label': 'Intersection Denisty', 'color': [34, 63, 154], 'highlightColor': [252, 242, 26, 255], 'columns': {'geojson': 'cell_boundary'}, 'isVisible': True, 'visConfig': {'opacity': 0.57, 'strokeOpacity': 0.8, 'thickness': 0.5, 'strokeColor': [218, 112, 191], 'colorRange': {'name': 'Uber Viz Diverging 2.5', 'type': 'diverging', 'category': 'Uber', 'colors': ['#00939C', '#3EADB3', '#7CC7CB', '#BAE1E2', '#F8C0AA', '#E68F71', '#D45F39', '#C22E00']}, 'strokeColorRange': {'name': 'Global Warming', 'type': 'sequential', 'category': 'Uber', 'colors': ['#5A1846', '#900C3F', '#C70039', '#E3611C', '#F1920E', '#FFC300']}, 'radius': 10, 'sizeRange': [0, 10], 'radiusRange': [0, 50], 'heightRange': [0, 500], 'elevationScale': 5, 'enableElevationZoomFactor': True, 'stroked': False, 'filled': True, 'enable3d': False, 'wireframe': False}, 'hidden': False, 'textLabel': [{'field': None, 'color': [255, 255, 255], 'size': 18, 'offset': [0, 0], 'anchor': 'start', 'alignment': 'center'}]}, 'visualChannels': {'colorField': {'name': 'cnt', 'type': 'integer'}, 'colorScale': 'quantile', 'strokeColorField': None, 'strokeColorScale': 'quantile', 'sizeField': None, 'sizeScale': 'linear', 'heightField': None, 'heightScale': 'linear', 'radiusField': None, 'radiusScale': 'linear'}}], 'interactionConfig': {'tooltip': {'fieldsToShow': {'Intersection Denisty': [{'name': 'cnt', 'format': None}]}, 'compareMode': False, 'compareType': 'absolute', 'enabled': True}, 'brush': {'size': 0.5, 'enabled': False}, 'geocoder': {'enabled': False}, 'coordinate': {'enabled': False}}, 'layerBlending': 'normal', 'splitMaps': [], 'animationConfig': {'currentTime': None, 'speed': 1}}, 'mapState': {'bearing': 0, 'dragRotate': False, 'latitude': 34.03500665586551, 'longitude': -118.33464976136455, 'pitch': 0, 'zoom': 11.43681840906374, 'isSplit': False}, 
'mapStyle': {'styleType': 'dark', 'topLayerGroups': {'label': True, 'road': True}, 'visibleLayerGroups': {'label': True, 'road': True, 'border': False, 'building': True, 'water': True, 'land': True, '3d building': False}, 'threeDBuildingColor': [9.665468314072013, 17.18305478057247, 31.1442867897876], 'mapStyles': {}}}}

This is for Colab, which needs its custom widget manager enabled before the Kepler map will display:

In [None]:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    from google.colab import output
    output.enable_custom_widget_manager()
else:
    print("Not running in Google Colab. Skipping custom widget manager setup.")

In [None]:
# Import the Kepler.gl Jupyter widget
from keplergl import KeplerGl

# Create an instance of KeplerGl
map_1 = KeplerGl()

# Add your data to the map, we only want the boundary and the cnt
map_1.add_data(data=df[["cell_boundary", "cnt"]], name='Intersection Denisty')

map_1.config = kepler_config
# Display the map
map_1

In [None]:
print(map_1.config)

## If you're thinking:
### *Hey Stephen that was pretty cool but I want to see what the whole world looks like, all right?*
Uncomment the following cells and see for yourself! It takes me about 5 minutes in VS Code and a good deal longer on Colab, so I suggest running them locally.

I think it's prudent not to go past resolution 6. Resolution 4 is the most Colab would let me visualize, and I would keep it there if you are going to keep using Kepler in Jupyter.
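
To get a feel for why higher resolutions get heavy, it helps to look at how many H3 cells cover the globe at each resolution: the total is `2 + 120 * 7**resolution`. The sketch below prints that for the resolutions mentioned in this notebook. `h3_cell_count` is a name made up here, and the figures are an upper bound, since only cells that contain at least one connector end up in the result.

```python
def h3_cell_count(resolution: int) -> int:
    """Total number of H3 cells covering the globe at a given resolution."""
    if not 0 <= resolution <= 15:
        raise ValueError("H3 resolutions run from 0 to 15")
    return 2 + 120 * 7**resolution


for res in (4, 6, 9):
    print(f"resolution {res}: {h3_cell_count(res):,} cells")
```

Each step up in resolution multiplies the cell count by roughly seven, so the jump from 4 to 6 is about a factor of 49 in the number of polygons Kepler may have to draw.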

In [None]:
# resolution = 4
# query = f"""
#     WITH geometry_cte AS (
#         SELECT
#             ST_Y(ST_Centroid(ST_GeomFromWKB(geometry))) AS latitude,
#             ST_X(ST_Centroid(ST_GeomFromWKB(geometry))) AS longitude
#         FROM read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=transportation/type=connector/*', filename=true, hive_partitioning=1)
#     ),
#     h3_cells_cte AS (
#         SELECT
#             h3_latlng_to_cell(latitude, longitude, {resolution}) AS cell_id,
#             COUNT(1) as cnt
#         FROM geometry_cte
#         GROUP BY 1
#     )
#         SELECT
#             h3_h3_to_string(cell_id) AS cell_id,
#             h3_cell_to_boundary_wkt(cell_id) as cell_boundary,
#             SUM(cnt) as cnt
#         FROM h3_cells_cte
#         GROUP BY ALL
#         """
# world_df = con.sql(query).df()

You can uncomment and run this if you'd like to use the [Kepler.gl](https://kepler.gl) web app or [Foursquare Studio](https://studio.foursquare.com/). You just need the `cell_id` string and `cnt`. It's a little easier on your machine when you don't use the hexagon polygon. Just drop in the CSV file.

In [None]:
# con.sql("""copy (
#     select
#         cell_id,
#         cnt
#     from world_df
#     )
#         to 'world_intersections_r_4.csv'

#         """)

In [None]:
kepler_config_world = {'version': 'v1', 'config': {'visState': {'filters': [{'dataId': ['Intersection Denisty'], 'id': '304eu5ond', 'name': ['cnt'], 'type': 'range', 'value': [1000, 641169], 'enlarged': False, 'plotType': 'histogram', 'animationWindow': 'free', 'yAxis': None, 'speed': 1}], 'layers': [{'id': 'cwsk4o', 'type': 'geojson', 'config': {'dataId': 'Intersection Denisty', 'label': 'Intersection Denisty', 'color': [18, 147, 154], 'highlightColor': [252, 242, 26, 255], 'columns': {'geojson': 'cell_boundary'}, 'isVisible': True, 'visConfig': {'opacity': 0.8, 'strokeOpacity': 0.8, 'thickness': 0.5, 'strokeColor': [221, 178, 124], 'colorRange': {'name': 'Global Warming', 'type': 'sequential', 'category': 'Uber', 'colors': ['#5A1846', '#900C3F', '#C70039', '#E3611C', '#F1920E', '#FFC300']}, 'strokeColorRange': {'name': 'Global Warming', 'type': 'sequential', 'category': 'Uber', 'colors': ['#5A1846', '#900C3F', '#C70039', '#E3611C', '#F1920E', '#FFC300']}, 'radius': 10, 'sizeRange': [0, 10], 'radiusRange': [0, 50], 'heightRange': [0, 500], 'elevationScale': 5, 'enableElevationZoomFactor': True, 'stroked': False, 'filled': True, 'enable3d': False, 'wireframe': False}, 'hidden': False, 'textLabel': [{'field': None, 'color': [255, 255, 255], 'size': 18, 'offset': [0, 0], 'anchor': 'start', 'alignment': 'center'}]}, 'visualChannels': {'colorField': {'name': 'cnt', 'type': 'integer'}, 'colorScale': 'quantile', 'strokeColorField': None, 'strokeColorScale': 'quantile', 'sizeField': None, 'sizeScale': 'linear', 'heightField': None, 'heightScale': 'linear', 'radiusField': None, 'radiusScale': 'linear'}}], 'interactionConfig': {'tooltip': {'fieldsToShow': {'Intersection Denisty': [{'name': 'cnt', 'format': None}]}, 'compareMode': False, 'compareType': 'absolute', 'enabled': True}, 'brush': {'size': 0.5, 'enabled': False}, 'geocoder': {'enabled': False}, 'coordinate': {'enabled': False}}, 'layerBlending': 'normal', 'splitMaps': [], 'animationConfig': {'currentTime': None, 
'speed': 1}}, 'mapState': {'bearing': 0, 'dragRotate': False, 'latitude': 22.75549645676714, 'longitude': 22.565008872105324, 'pitch': 0, 'zoom': 0.7987392656530802, 'isSplit': False}, 'mapStyle': {'styleType': 'dark', 'topLayerGroups': {}, 'visibleLayerGroups': {'label': False, 'road': False, 'border': False, 'building': True, 'water': True, 'land': True, '3d building': False}, 'threeDBuildingColor': [9.665468314072013, 17.18305478057247, 31.1442867897876], 'mapStyles': {}}}}

In [None]:
# # Import the Kepler.gl Jupyter widget
# from keplergl import KeplerGl

# # Create an instance of KeplerGl
# map_2 = KeplerGl()

# # Add your data to the map
# map_2.add_data(data=world_df[["cell_boundary", "cnt"]], name='Intersection Denisty')

# map_2.config = kepler_config_world
# # Display the map
# map_2

In [None]:
# print(map_2.config)