# `dfipy` Quick Start Guide - Creating a Polygon

This notebook will guide you through querying a large synthetic traffic dataset in the General System Platform.

Please refer to our [developers guide](https://developers.generalsystem.com) for the most up-to-date companion documentation.

Additional resources and help are available on the [General System support pages](https://support.generalsystem.com).

### Let's go

### Install Dependencies

If you are using Google Colab, this will set up all required dependencies:

In [None]:
from copy import deepcopy
from getpass import getpass
from typing import Dict, List, Optional, Union

import geopandas as gpd
import pandas as pd
import requests
from IPython.display import Image
from shapely import points
from shapely.geometry import Polygon

# Google Colab setup
try:
    from google.colab import output

    output.enable_custom_widget_manager()  # allows KeplerGL map to display

    ! pip install dfipy==9.0.1 h3==3.7.6 keplergl==0.3.2
except ModuleNotFoundError:
    # Not running on Colab: the packages are expected to be installed already
    pass

import dfi.models.filters.geometry as geom
from dfi import Client
from dfi.models.filters import TimeRange
from keplergl import KeplerGl


Next, enter your API access token.

In [None]:
api_token = getpass("Enter your API access token: ")


The code below connects to the correct instance and namespace so that you can access the desired dataset.

In [None]:
base_url = "https://api.aspen.generalsystem.com"
dataset_id = "gs.prod-3"

dfi = Client(
    api_token=api_token,
    base_url=base_url,
    progress_bar=True,
)


### This tutorial will go over some of the basics of how to use DFI. It's designed for new users who are less familiar with Python.

In this tutorial we will cover some of the basic concepts such as:

- Building and displaying a single polygon
- Building and displaying multiple polygons
- Adding polygons to existing maps
- Saving polygon configurations



# First we will define our polygon using known points.

This works by choosing a name for your polygon and then listing its corner coordinates as [longitude, latitude] pairs in decimal degrees. Note that the first point is repeated at the end of the list to close the polygon.

In [None]:
tower_bridge = [
    [-0.07497244569200447, 51.506879956036116],
    [-0.07417159032214424, 51.50674703079539],
    [-0.07577330106092327, 51.50420476093609],
    [-0.07646737571403371, 51.50438754324021],
    [-0.07497244569200447, 51.506879956036116],
]
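
Before drawing a polygon it can be worth checking that the coordinate list is well formed. The sketch below is not part of `dfipy`; `find_ring_problems` is a hypothetical helper that only checks the conventions used in this notebook: [longitude, latitude] order, decimal degrees, and a first point repeated at the end.

```python
from typing import List


def find_ring_problems(ring: List[List[float]]) -> List[str]:
    """Return a list of human-readable problems found in a [lon, lat] ring."""
    problems = []
    # A closed ring needs at least 3 corners plus the repeated first point
    if len(ring) < 4:
        problems.append("fewer than 4 points, so this cannot be a closed ring")
    # The first point must be repeated at the end to close the ring
    if ring and ring[0] != ring[-1]:
        problems.append("first and last points differ, so the ring is not closed")
    # Longitude comes first and must lie in [-180, 180]; latitude in [-90, 90]
    for lon, lat in ring:
        if not -180 <= lon <= 180:
            problems.append(f"longitude {lon} is outside [-180, 180]")
        if not -90 <= lat <= 90:
            problems.append(f"latitude {lat} is outside [-90, 90]")
    return problems


tower_bridge = [
    [-0.07497244569200447, 51.506879956036116],
    [-0.07417159032214424, 51.50674703079539],
    [-0.07577330106092327, 51.50420476093609],
    [-0.07646737571403371, 51.50438754324021],
    [-0.07497244569200447, 51.506879956036116],
]

# Same polygon with the closing point removed, to show a failing case
open_ring = tower_bridge[:-1]

print(find_ring_problems(tower_bridge))
print(find_ring_problems(open_ring))
```

An empty list means no problems were found. The range check cannot detect swapped coordinates for locations like London, where both values are valid as either a longitude or a latitude, so it is still worth looking at the map.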


Next we visualise the polygon with Kepler, via the `show_map` function defined below.

In [None]:
def show_map(
    list_polygons: Optional[List[List[List[float]]]] = None,
    dict_dfs: Optional[Dict[str, Union[gpd.GeoDataFrame, pd.DataFrame]]] = None,
    map_height: int = 1200,
    config: Optional[dict] = None,
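) -> KeplerGl:
    """Build a KeplerGl map from polygon coordinate lists and/or named dataframes.

    list_polygons: polygons, each given as a list of [longitude, latitude] points.
    dict_dfs: dataframes to add to the map, keyed by the name shown in Kepler.
    map_height: height of the map widget.
    config: optional Kepler configuration dictionary.
    """
    if list_polygons is None:
        list_polygons = []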

    dict_polygons = {f"polygon {idx}": poly for idx, poly in enumerate(list_polygons)}

    kepler_data = {}

    if len(dict_polygons) > 0:
        kepler_data.update(
            {
                "polygons": gpd.GeoDataFrame(
                    dict_polygons.keys(),
                    geometry=[Polygon(x) for x in dict_polygons.values()],
                )
            }
        )

    if dict_dfs is not None:
        for key, df in dict_dfs.items():
            kepler_data.update({key: df.copy()})

    if config is None:
        return KeplerGl(data=deepcopy(kepler_data), height=map_height)
    return KeplerGl(data=deepcopy(kepler_data), height=map_height, config=config)


In [None]:
show_map(map_height=400, list_polygons=[tower_bridge])


We can use the same process to show multiple polygons.

In [None]:
tower_bridge = [
    [-0.07497244569200447, 51.506879956036116],
    [-0.07417159032214424, 51.50674703079539],
    [-0.07577330106092327, 51.50420476093609],
    [-0.07646737571403371, 51.50438754324021],
    [-0.07497244569200447, 51.506879956036116],
]
london_bridge = [
    [-0.08774610658001879, 51.50673153372835],
    [-0.08845738460526711, 51.5068549133129],
    [-0.08752456096646478, 51.509104717729485],
    [-0.08703482855566101, 51.50898860151003],
    [-0.08774610658001879, 51.50673153372835],
]
southwark_bridge = [
    [-0.09464692707221982, 51.50800378328573],
    [-0.09450260195906579, 51.5079669682238],
    [-0.09330918296130603, 51.50976974829333],
    [-0.09366147964557642, 51.509867198333346],
    [-0.09464692707221982, 51.50800378328573],
]


Next we run the same code as before, this time listing all of the desired polygons.

In [None]:
show_map(map_height=400, list_polygons=[tower_bridge, london_bridge, southwark_bridge])


We can also display a map and use Kepler's built-in drawing tools to create a polygon. The steps are as follows:

- First, you run the code below
- Select the polygon symbol on the right side of the map and draw your polygon
- Copy this geometry by right clicking on the inside of the polygon
- Paste the geometry into a separate cell and label it accordingly

In [None]:
# Display the gif example of how to draw a polygon
image_path = (
    "https://raw.githubusercontent.com/thegeneralsystem/dfipy-examples/main/examples/pictures/draw_a_polygon.gif"
)
Image(image_path)


In [None]:
# this creates a blank Kepler map, centered on London so that we can draw a polygon on it

url = "https://raw.githubusercontent.com/thegeneralsystem/dfipy-examples/main/examples/kepler_config/create_a_polygon.json"
response = requests.get(url, timeout=30)
kepler_config = response.json()

show_map(map_height=400, config=kepler_config)


Next we paste the coordinates into a variable.

In [None]:
# Display the GIF example of how to format the new variable
image_path = (
    "https://raw.githubusercontent.com/thegeneralsystem/dfipy-examples/main/examples/pictures/paste_coordinates.gif"
)
Image(image_path)


In [None]:
waterloo_bridge = [
    [-0.11833561399725126, 51.50956788006355],
    [-0.11741263557939023, 51.50984639894247],
    [-0.11534292639998052, 51.50739189270742],
    [-0.11615402864550156, 51.5070785419682],
    [-0.11833561399725126, 51.50956788006355],
]


After drawing your polygon, paste the coordinates in this format:

`polygon_name = [[long1, lat1], [long2, lat2], etc.]`

Below is a dictionary of example polygons created from pasted coordinates.

In [None]:
bridge_dict = {
    "tower_bridge": [
        [-0.07497244569200447, 51.506879956036116],
        [-0.07417159032214424, 51.50674703079539],
        [-0.07577330106092327, 51.50420476093609],
        [-0.07646737571403371, 51.50438754324021],
        [-0.07497244569200447, 51.506879956036116],
    ],
    "london_bridge": [
        [-0.08774610658001879, 51.50673153372835],
        [-0.08845738460526711, 51.5068549133129],
        [-0.08752456096646478, 51.509104717729485],
        [-0.08703482855566101, 51.50898860151003],
        [-0.08774610658001879, 51.50673153372835],
    ],
    "southwark_bridge": [
        [-0.09464692707221982, 51.50800378328573],
        [-0.09450260195906579, 51.5079669682238],
        [-0.09330918296130603, 51.50976974829333],
        [-0.09366147964557642, 51.509867198333346],
        [-0.09464692707221982, 51.50800378328573],
    ],
    "waterloo_bridge": [
        [-0.11833561399725126, 51.50956788006355],
        [-0.11741263557939023, 51.50984639894247],
        [-0.11534292639998052, 51.50739189270742],
        [-0.11615402864550156, 51.5070785419682],
        [-0.11833561399725126, 51.50956788006355],
    ],
    "westminster_bridge": [
        [-0.12358711992523448, 51.50076409572846],
        [-0.12357022401304299, 51.501063851357834],
        [-0.11992915504511152, 51.50091660322511],
        [-0.11993760300072383, 51.500643141144444],
        [-0.12358711992523448, 51.50076409572846],
    ],
    "blackfriars_bridge": [
        [-0.10461014353003963, 51.51097942297182],
        [-0.10409203766559522, 51.51097942297182],
        [-0.10413404624939136, 51.50844332058941],
        [-0.1046801578360697, 51.50846075132744],
        [-0.10461014353003963, 51.51097942297182],
    ],
}


# Saving multiple polygons to one map

It is often easier to keep multiple polygons that will be referenced in the future on one map. This simplifies the code and makes it easy to add data and export our work to share with others.

The first step is to assign the map we already created to a new variable to simplify the code.

In [None]:
map_1 = show_map(
    map_height=400,
    list_polygons=[waterloo_bridge, tower_bridge, london_bridge, southwark_bridge],
)


Now we can show the map with far less code.

In [None]:
map_1


# Saving the polygons

You can export and save your polygons to share with co-workers using the code below.


In [None]:
map_1.save_to_html(file_name="map.html")


Replace 'map.html' with the desired file name, keeping the `.html` extension. The file will be saved in your current working directory.

You can then download the file to be shared and used on other projects.
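
The HTML export saves the rendered map. If you also want to keep the raw polygon coordinates for reuse in another notebook, one option is to write them out as GeoJSON, a format most GIS tools can read. The sketch below uses only the Python standard library; `polygons_to_geojson` and the file name `bridge_polygons.geojson` are hypothetical names chosen for illustration, not part of `dfipy` or Kepler.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def polygons_to_geojson(polygons: Dict[str, List[List[float]]]) -> dict:
    """Wrap named [lon, lat] rings in a GeoJSON FeatureCollection."""
    features = []
    for name, ring in polygons.items():
        features.append(
            {
                "type": "Feature",
                "properties": {"name": name},
                # A GeoJSON Polygon is a list of rings; we only have the outer ring
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        )
    return {"type": "FeatureCollection", "features": features}


# One polygon from this notebook; pass the full bridge_dict in practice
example_polygons = {
    "tower_bridge": [
        [-0.07497244569200447, 51.506879956036116],
        [-0.07417159032214424, 51.50674703079539],
        [-0.07577330106092327, 51.50420476093609],
        [-0.07646737571403371, 51.50438754324021],
        [-0.07497244569200447, 51.506879956036116],
    ]
}

collection = polygons_to_geojson(example_polygons)

# Written to the system temp directory here; change the path to suit your project
out_path = Path(tempfile.gettempdir()) / "bridge_polygons.geojson"
with open(out_path, "w", encoding="utf-8") as f:
    json.dump(collection, f, indent=2)

# Read the file back to confirm the coordinates survive the round trip
with open(out_path, encoding="utf-8") as f:
    reloaded = json.load(f)

print(f"Saved {len(reloaded['features'])} polygon(s) to {out_path}")
```

To recover a coordinate list later, read `feature["geometry"]["coordinates"][0]` for each feature in the reloaded collection.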

# Query the Polygons

Now that we have polygons, what can we do with them? While there are many functions in DFI, we will only be using a few during this example. We will focus on functions that will allow us to collect all the records inside of our polygons, then figure out the unique records, and finally visualise the results.

**Bear in mind that the synthetic dataset the DFI is querying contains over 92 billion records.**

Below is a basic query to introduce you to the idea. First, your time window must be defined using the [TimeRange](https://dfipy.docs.generalsystem.com/reference/models/filters/time_range/) class: in this example we used the `from_strings` method. Next, define the geometry you want to search within via the [Polygon](https://dfipy.docs.generalsystem.com/reference/models/filters/geometry/polygon/) class, which is part of the geometry module. Finally, you can use `query.records()`, with the geometry and time range as arguments, to retrieve the data from the DFI!

In [None]:
time_range = TimeRange().from_strings(min_time="2022-01-01T08:00:00+00:00", max_time="2022-01-01T09:30:00+00:00")
tower_bridge_polygon = geom.Polygon().from_raw_coords(coordinates=tower_bridge, geojson=True)

tower_bridge_records = dfi.query.records(dataset_id=dataset_id, geometry=tower_bridge_polygon, time_range=time_range)
print(f"Records downloaded: {len(tower_bridge_records):,}")
print(f"Vehicles found: {len(tower_bridge_records.id.unique()):,}")
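
The time strings above are ISO 8601 timestamps with an explicit UTC offset. As a quick sanity check before running a query, the window can be parsed with the standard library. This is a sketch that is independent of `dfipy`; it does not describe how `TimeRange` validates its input.

```python
from datetime import datetime, timedelta

min_time = "2022-01-01T08:00:00+00:00"
max_time = "2022-01-01T09:30:00+00:00"

# fromisoformat understands the "+00:00" offset and returns timezone-aware datetimes
start = datetime.fromisoformat(min_time)
end = datetime.fromisoformat(max_time)

# Guard against the two most common mistakes: missing offsets and a reversed window
if start.tzinfo is None or end.tzinfo is None:
    raise ValueError("both timestamps should carry a UTC offset")
if end <= start:
    raise ValueError("max_time must be later than min_time")

window = end - start
print(f"Query window: {window}")
```

For the values used in this notebook the window is one and a half hours long.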


For this example we will determine which bridge across the River Thames in central London is the most used between 08:00 and 09:30. Using the polygons we have already drawn, let's see which is the busiest.

In [None]:
# Create polygon geometries for, and query, each bridge in the dictionary
bridge_records = {}
for bridge_name, bridge_coords in bridge_dict.items():
    bridge_polygon = geom.Polygon().from_raw_coords(coordinates=bridge_coords, geojson=True)

    records = dfi.query.records(dataset_id=dataset_id, geometry=bridge_polygon, time_range=time_range)
    # Create Shapely Points from coordinates column and form a GeoDataFrame
    bridge_records[bridge_name] = gpd.GeoDataFrame(
        records[["id", "time"]], geometry=points(records.coordinate.to_list()), crs="EPSG:4326"
    )
for bridge_name, bridge_df in bridge_records.items():
    print(f"Vehicles found on {bridge_name}: {len(bridge_df.id.unique())}")


After this quick search it is easy to see that London Bridge is by far the busiest, with 698 unique vehicle IDs passing through within the time range. Southwark Bridge is the least busy, with only 4 vehicles.
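
With more polygons, reading the printed counts by eye becomes harder, so it can help to sort them. The sketch below shows the idea with the standard library only. `rank_by_unique_ids` is a hypothetical helper, and the IDs in `toy_ids` are made up for illustration; they are not results from the DFI.

```python
from typing import Dict, Iterable, List, Tuple


def rank_by_unique_ids(ids_by_name: Dict[str, Iterable[str]]) -> List[Tuple[str, int]]:
    """Count distinct IDs per name and return (name, count) pairs, busiest first."""
    counts = {name: len(set(ids)) for name, ids in ids_by_name.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Made-up IDs: repeated entries stand for one vehicle recorded several times
toy_ids = {
    "bridge_a": ["v1", "v2", "v2", "v3"],
    "bridge_b": ["v1"],
    "bridge_c": ["v4", "v5"],
}

ranking = rank_by_unique_ids(toy_ids)
for name, count in ranking:
    print(f"{name}: {count} unique IDs")
```

With the data from this notebook, the input could be built as `{name: df.id for name, df in bridge_records.items()}`.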

Finally, let's visualise the results for the busiest bridge using a heatmap.


In [None]:
kepler_map = show_map(map_height=400, dict_dfs=bridge_records, config=kepler_config)
kepler_map


End of notebook