# 06-Trajectory Querying

This notebook shows how to query quadkey-indexed trajectories using the Extended Vehicle Energy Dataset.

**Requirements**: Run the `calculate-trajectories.py` script before running this notebook.

In [None]:
import folium
import numpy as np
import pandas as pd
import osmnx as ox
import geopandas as gpd
import networkx as nx
import math

from itertools import pairwise
from db.api import EVedDb
from folium.vector_layers import PolyLine, CircleMarker
from pyquadkey2 import quadkey
from numba import jit
from tqdm.notebook import tqdm
from raster.drawing import smooth_line
from geo.qk import tile_to_str

from geo.trajectory import GraphRoute, GraphTrajectory, load_signal_range, load_trajectory_points, load_link_points, load_matching_links

## 06.01-Preparation

We start by loading the road network from Ann Arbor, Michigan, using the `GraphRoute` class.

In [None]:
gr = GraphRoute('Ann Arbor, Michigan')

Now, we create an arbitrary route using two addresses. The `generate_route` function geocodes both addresses and returns the graph path that best represents the route.

In [None]:
route = gr.generate_route(addr_ini="122 N Thayer St, Ann Arbor, MI 48104, USA",
                          addr_end="1431 Ardmoor Ave, Ann Arbor, MI 48103, USA")

The `fit_bounding_box` function takes a list of locations, fits a bounding box around them, and sets the map center and zoom accordingly.

In [None]:
def fit_bounding_box(html_map, bb_list):
    if isinstance(bb_list, list):
        ll = np.array(bb_list)
    else:
        ll = bb_list
        
    min_lat, max_lat = ll[:, 0].min(), ll[:, 0].max()
    min_lon, max_lon = ll[:, 1].min(), ll[:, 1].max()
    html_map.fit_bounds([[min_lat, min_lon], [max_lat, max_lon]])
    return html_map

In [None]:
def map_graph_route(graph_route):
    """Draw the route of `graph_route` on a folium map, one segment per node pair."""
    html_map = folium.Map(prefer_canvas=True, tiles="cartodbpositron", max_zoom=20, control_scale=True)

    bb_list = []
    route_nodes = graph_route.get_route_nodes()
    
    for loc in route_nodes:
        bb_list.append((loc['y'], loc['x']))
    
    for l0, l1 in pairwise(route_nodes):
        line = [(l0['y'], l0['x']), (l1['y'], l1['x'])]
        
        PolyLine(line, weight=5, opacity=0.5).add_to(html_map)
        
    return fit_bounding_box(html_map, bb_list)

In [None]:
map_graph_route(gr)

## 06.02-Querying Using an Arbitrary Trajectory

In this section, we use the route generated above as a query trajectory and search the database for overlapping _trajectories_ and _trajectory segments_. We start by declaring some supporting functions, explaining the process along the way.

Let's try it out by converting the above route to the corresponding level 20 quadkeys.

In [None]:
route_df = pd.DataFrame(data=gr.get_route_quadkeys(), columns=["quadkey", "bearing"])
route_df
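For readers unfamiliar with the encoding, the two columns of `route_df` can be illustrated with a minimal, standalone sketch. It assumes the standard Bing Maps tile system for quadkeys and the initial great-circle bearing between consecutive nodes. The function names below are hypothetical, and the actual `get_route_quadkeys` implementation may differ in its details.

```python
import math


def tile_to_quadkey(tile_x: int, tile_y: int, level: int) -> str:
    """Interleave the bits of the tile coordinates into a base-4 string."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def latlon_to_quadkey(lat: float, lon: float, level: int = 20) -> str:
    """Quadkey of the Web Mercator tile that contains a WGS84 location."""
    lat = max(-85.05112878, min(85.05112878, lat))  # Web Mercator latitude limit
    n = 1 << level  # number of tiles per axis at this level
    x = (lon + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1.0 + sin_lat) / (1.0 - sin_lat)) / (4.0 * math.pi)
    tile_x = min(n - 1, max(0, int(x * n)))
    tile_y = min(n - 1, max(0, int(y * n)))
    return tile_to_quadkey(tile_x, tile_y, level)


def initial_bearing(lat0: float, lon0: float, lat1: float, lon1: float) -> float:
    """Initial great-circle bearing from point 0 to point 1, in degrees [0, 360)."""
    phi0, phi1 = math.radians(lat0), math.radians(lat1)
    dlon = math.radians(lon1 - lon0)
    y = math.sin(dlon) * math.cos(phi1)
    x = math.cos(phi0) * math.sin(phi1) - math.sin(phi0) * math.cos(phi1) * math.cos(dlon)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


# Illustrative coordinates near Ann Arbor (not taken from the dataset).
start, end = (42.2808, -83.7430), (42.2790, -83.7500)
print(latlon_to_quadkey(*start), initial_bearing(*start, *end))
```

A useful property of this encoding is that the quadkey of a location at a coarse level is a prefix of its quadkey at any finer level, which is what makes quadkeys convenient as a spatial index.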

With the quadkeys and bearings in `route_df`, we can now match the quadkeys to the existing _links_ while enforcing a similar bearing. This is, in essence, how we query. Let's see the result of querying the links that overlap the query trajectory. The `get_overlapping_links` function of the `GraphRoute` class returns a list of tuples containing the `link_id`, `traj_id`, `signal_ini` and `signal_end` values. The last two are identifiers in the `signal` table and delimit the range of signals in the link.

In [None]:
gr.get_overlapping_links()
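The matching idea can be sketched in plain Python as follows. This is only an illustration: the in-memory `links_by_quadkey` dictionary, the `tolerance` parameter, and its 15-degree default are assumptions made for the example, not the database schema or the threshold that `GraphRoute` actually uses.

```python
def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings, in degrees [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def overlapping_links(route, links_by_quadkey, tolerance=15.0):
    """Return (link_id, traj_id) pairs whose quadkey and bearing match the route.

    route            -- list of (quadkey, bearing) pairs for the query
    links_by_quadkey -- hypothetical index: quadkey -> [(link_id, traj_id, bearing)]
    tolerance        -- hypothetical maximum bearing difference, in degrees
    """
    matches = []
    for quadkey, bearing in route:
        for link_id, traj_id, link_bearing in links_by_quadkey.get(quadkey, []):
            if angle_diff(bearing, link_bearing) <= tolerance:
                matches.append((link_id, traj_id))
    return matches


# Toy data: the link traveling in the opposite direction is rejected.
toy_route = [("0302", 355.0), ("0303", 90.0)]
toy_index = {
    "0302": [(1, 10, 5.0), (2, 11, 180.0)],
    "0303": [(3, 10, 92.0)],
}
print(overlapping_links(toy_route, toy_index))
```

Comparing bearings on a circle matters here: 355 and 5 degrees differ by only 10 degrees, so a naive subtraction would wrongly reject links that cross north.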

To get the matching trajectories, we only need to retrieve the unique values of `traj_id` from the list above. This is already done for you in the `get_matching_trajectories` function.

In [None]:
gr.get_matching_trajectories()[0]
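Conceptually, this step is a deduplication over the second field of each link tuple. A minimal sketch with made-up tuples (the values are illustrative, and the helper name is hypothetical) could look like this:

```python
def unique_trajectories(links):
    """Collect the distinct traj_id values from (link_id, traj_id, signal_ini, signal_end) tuples."""
    return sorted({traj_id for _, traj_id, _, _ in links})


# Illustrative tuples, not real database rows.
toy_links = [(10, 1, 100, 120), (11, 1, 121, 140), (20, 7, 300, 310)]
print(unique_trajectories(toy_links))
```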

Note that, for convenience, `get_matching_trajectories` returns a tuple containing the unique trajectory identifiers and the same result as `get_overlapping_links`. We could now retrieve all the trajectory data from the database, but most of these trajectories overlap the query trajectory only slightly. To get the trajectories that overlap the most with the query trajectory, we use the `get_top_match_trajectories` function. By default, it returns the top 5% of trajectories by match.

In [None]:
gr.get_top_match_trajectories()

As you can see, the number of trajectories drops substantially when we filter out the lower-matching 95%.

In [None]:
gr.get_matching_trajectories()[0].shape[0], gr.get_top_match_trajectories().shape[0]

In [None]:
def map_matching_links(graph_route):
    """Draw the query route (thick line) and every overlapping link (thin red lines)."""
    html_map = folium.Map(prefer_canvas=True, tiles="cartodbpositron", max_zoom=20, control_scale=True)

    bb_list = []
            
    line = [(loc['y'], loc['x']) for loc in graph_route.get_route_nodes()]
    bb_list.extend(line)
        
    PolyLine(line, weight=12, opacity=0.5).add_to(html_map)
    
    ranges = graph_route.get_overlapping_signal_ranges()
    for r in tqdm(ranges):
        line = load_signal_range(r)
        if len(line):
            bb_list.extend(line)
            PolyLine(line, weight=3, color="red", opacity=0.5, popup=r).add_to(html_map)

    return fit_bounding_box(html_map, bb_list)

In [None]:
map_matching_links(gr)

In [None]:
match_df = pd.DataFrame(data=gr.calculate_trajectory_matches(), columns=['traj_id', 'similarity'])

In [None]:
match_df["percent_rank"] = match_df["similarity"].rank(pct=True)

In [None]:
match_df.sort_values("percent_rank", ascending=False)

In [None]:
match_df[match_df["percent_rank"] > 0.95].sort_values("percent_rank", ascending=False)
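To make the percent-rank filter explicit, here is a small pandas-free sketch. It assumes the default behavior of `rank(pct=True)`, where tied values share the average of their ranks and the result is divided by the number of values. The similarity scores are invented for the example.

```python
def percent_rank(values):
    """Average-method percentile rank of each value, in (0, 1]."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank / n
        i = j + 1
    return ranks


def top_matches(traj_ids, similarities, top=0.05):
    """Keep the trajectory ids whose percent rank exceeds 1 - top."""
    ranks = percent_rank(similarities)
    return [t for t, r in zip(traj_ids, ranks) if r > 1.0 - top]


# Invented similarity scores for four trajectories.
toy_ids = [101, 102, 103, 104]
toy_similarities = [0.1, 0.5, 0.5, 0.9]
print(percent_rank(toy_similarities))
print(top_matches(toy_ids, toy_similarities))
```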

In [None]:
def map_top_matching_trajectories_r(graph_route, top=0.05):
    """Draw the query route (thick line) and its top-matching trajectories (thin red lines)."""
    html_map = folium.Map(prefer_canvas=True, tiles="cartodbpositron", max_zoom=20, control_scale=True)

    bb_list = []
    
    line = []
    for loc in graph_route.get_route_nodes():
        p = (loc['y'], loc['x'])
        line.append(p)
        bb_list.append(p)
        
    PolyLine(line, weight=12, opacity=0.5).add_to(html_map)
    
    trajectories = graph_route.get_top_match_trajectories(top=top)
    for traj_id in trajectories:
        line = load_trajectory_points(int(traj_id))
        if len(line) > 0:
            bb_list.extend(line)
            PolyLine(line, weight=3, color="red", opacity=0.5).add_to(html_map)

    return fit_bounding_box(html_map, bb_list)

In [None]:
map_top_matching_trajectories_r(gr)

## 06.03-Querying Using an Existing Trajectory

In this section, we perform the same query, this time using a known trajectory from the database instead of a generated route.
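One plausible way to picture this kind of query is to score every other trajectory by the share of the query trajectory's quadkeys that it also visits. The sketch below is a simplified assumption for illustration only: it ignores bearings, uses a hypothetical in-memory `quadkeys_by_traj` dictionary, and is not necessarily how `GraphTrajectory` computes its matches.

```python
from collections import Counter


def rank_by_shared_quadkeys(query_id, quadkeys_by_traj):
    """Rank other trajectories by the fraction of the query's quadkeys they share.

    quadkeys_by_traj -- hypothetical mapping: traj_id -> set of quadkeys
    """
    query = quadkeys_by_traj[query_id]
    scores = Counter()
    for traj_id, quadkeys in quadkeys_by_traj.items():
        if traj_id == query_id:
            continue  # the query trajectory trivially matches itself
        shared = len(query & quadkeys)
        if shared:
            scores[traj_id] = shared / len(query)
    return scores.most_common()


# Toy index with short, made-up quadkeys.
toy_quadkeys = {
    4: {"a", "b", "c", "d"},
    9: {"a", "b", "c"},
    12: {"d"},
    30: {"z"},
}
print(rank_by_shared_quadkeys(4, toy_quadkeys))
```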

In [None]:
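def map_top_matching_trajectories_t(traj_id, top=0.05):
    """Draw trajectory `traj_id` (thick line) and its top-matching trajectories (thin red lines)."""
    html_map = folium.Map(prefer_canvas=True, tiles="cartodbpositron", max_zoom=20, control_scale=True)

    bb_list = []
    gt = GraphTrajectory(int(traj_id))

    line = load_trajectory_points(int(traj_id))
    PolyLine(line, weight=12, opacity=0.5).add_to(html_map)
    # Include the query trajectory so the bounds are valid even when nothing matches.
    bb_list.extend(line)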
    
    trajectories = gt.get_top_matching_trajectories(top)
    for trajectory in trajectories:
        if trajectory != traj_id:
            line = load_trajectory_points(int(trajectory))
            if len(line) > 0:
                bb_list.extend(line)
                PolyLine(line, weight=3, color="red", opacity=0.5, popup=str(trajectory)).add_to(html_map)
                
    return fit_bounding_box(html_map, bb_list)

In [None]:
map_top_matching_trajectories_t(traj_id=4, top=0.01)

In [None]:
gt = GraphTrajectory(4)

In [None]:
gt.get_top_matching_trajectories()

In [None]:
def map_matching_links_t(traj_id):
    """Draw trajectory `traj_id` (thick line) and the links that match it (thin red lines)."""
    html_map = folium.Map(prefer_canvas=True, tiles="cartodbpositron", max_zoom=20, control_scale=True)
    
    bb_list = []
    gt = GraphTrajectory(int(traj_id))

    line = load_trajectory_points(int(traj_id))
    PolyLine(line, weight=12, opacity=0.5).add_to(html_map)
    
    bb_list.extend(line)
    
    links = gt.get_matching_links()
    print(len(links))
    for link in links:
        line = load_link_points(int(link))
        if len(line) > 0:
            bb_list.extend(line)
            PolyLine(line, weight=3, color="red", opacity=0.5, popup=str(link)).add_to(html_map)

    return fit_bounding_box(html_map, bb_list)

In [None]:
map_matching_links_t(traj_id=4)

In [None]:
df = load_matching_links(4)

In [None]:
df

In [None]:
df[df["traj_id"] != 4]