## Setup

In [7]:
from IPython.display import IFrame, HTML

In [8]:
CSS = """
.output {
    align-items: center;
}
"""

HTML('<style>{}</style>'.format(CSS))

## Map Matching

### Background
After the data has been sorted, the GPS pings need to be map-matched before they can be analysed accurately.
Some pings are inaccurate and fall off the road network.
Benefits of map-matching include:
- The route taken by the vehicle can be compared properly with routes obtained from the Google Maps API or OpenStreetMap API
- Distances can be calculated accurately
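
To illustrate the distance point, here is a minimal sketch of how a single off-road ping inflates the measured length of a trajectory. The coordinates are made-up values near Singapore, and `haversine_m` and `path_length_m` are hypothetical helper names, not part of the project code.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def path_length_m(points):
    """Sum of the distances between consecutive (lat, lon) points."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


# Hypothetical pings: a straight road, and the same road with one noisy middle ping
clean = [(1.3000, 103.8000), (1.3000, 103.8010), (1.3000, 103.8020)]
noisy = [(1.3000, 103.8000), (1.3005, 103.8010), (1.3000, 103.8020)]

print(f"clean: {path_length_m(clean):.1f} m, noisy: {path_length_m(noisy):.1f} m")
```

The noisy trace comes out longer than the clean one even though the vehicle drove the same road, which is the error that snapping pings to the road network is meant to remove.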

Inspired by [this article](https://towardsdatascience.com/map-matching-done-right-using-valhallas-meili-f635ebd17053), this project uses [`Valhalla`](https://github.com/valhalla/valhalla), a C++ library, to do the map matching.
`Valhalla` is an open source routing engine that provides features such as turn-by-turn directions, isochrones, tour optimisation and map matching, for use with OpenStreetMap data.

Put simply, `Valhalla`'s map matching finds the most likely sequence of road segments using the Viterbi algorithm on a Hidden Markov Model.
In layman's terms, for each point the candidate road segments within a given radius are found, and the sequence of candidates with the highest overall probability is chosen.
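
The idea can be sketched with a toy example. This is not Valhalla's implementation: the Gaussian emission model, the `sigma` value, the stay/switch probabilities and the distances are all assumptions chosen for illustration. Each ping has two candidate roads (0 and 1), and the Viterbi algorithm picks the most likely sequence of roads rather than the nearest road for each ping in isolation.

```python
import math


def viterbi(emission_logp, transition_logp):
    """Return the most likely state sequence.

    emission_logp[t][s]: log-probability of observation t given state s.
    transition_logp[a][b]: log-probability of moving from state a to state b.
    A uniform prior over the initial state is assumed.
    """
    n_states = len(emission_logp[0])
    score = list(emission_logp[0])
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in emission_logp[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda a: prev[a] + transition_logp[a][s])
            pointers.append(best)
            score.append(prev[best] + transition_logp[best][s] + obs[s])
        back.append(pointers)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical distances (metres) from three pings to two parallel roads
distances = [(5, 20), (14, 10), (4, 25)]
sigma = 10.0  # assumed GPS noise scale

# Closer roads are more likely (Gaussian in distance, constants dropped)
emission = [[-0.5 * (d / sigma) ** 2 for d in row] for row in distances]
# Staying on the same road is assumed far more likely than jumping across
transition = [[math.log(0.9), math.log(0.1)],
              [math.log(0.1), math.log(0.9)]]

nearest = [min(range(2), key=lambda s: row[s]) for row in distances]
matched = viterbi(emission, transition)
print("nearest road per ping:", nearest)
print("Viterbi sequence:     ", matched)
```

In this toy case the middle ping is slightly closer to road 1, so a nearest-road rule would jump roads and back again, whereas the Viterbi sequence stays on road 0 throughout.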

### Implementation details
In order to utilise `Valhalla`'s functionality, it first needs to be run on a server.
For quick reference, a demo server with an open-source web app interface is available [here](https://valhalla.openstreetmap.de/).

For the purposes of this project, a local server was used as the demo servers are rate-limited.
The server was set up based on instructions from [here](https://gis-ops.com/valhalla-part-1-how-to-install-on-ubuntu/) and [here](https://gis-ops.com/valhalla-part-2-how-to-run-valhalla-on-ubuntu/).
The OpenStreetMap data for Singapore and Jakarta were obtained from [Geofabrik](https://download.geofabrik.de/).

After setting up the server locally at `http://localhost:8002`, requests can then be sent to it to obtain a map-matched version of trajectories.
Instead of matching all trajectories, only the subset that starts and ends at Points of Interest (POIs) was considered.

#### Preparing individual trajectory data from processed data

In [None]:
import os

import pandas as pd

path = "./data"
files = os.listdir(path)

path_data = os.path.join(path, "processed_sgp.ftr")

# Load the processed pings and group them by trajectory
df = pd.read_feather(path_data)
groups = df.groupby("trj_id")

In [None]:
import ast

# Reading in the list of trajectories
with open(os.path.join(path, "jkt_matching_poi.txt"), "r") as f:
    # literal_eval parses the list literal without executing arbitrary code
    l = list(map(str, ast.literal_eval(f.read())))

#### Forming the request and sending to Valhalla

In [None]:
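from tqdm import tqdm

# match_location is defined in the next cell; run that cell first
for trj_id in tqdm(l):
    # Skip trajectories that have already been matched
    if os.path.isfile(f"./data/matched/sgp/{trj_id}.ftr"):
        continue
    try:
        # Assumes trj_id values are strings, matching the str() cast above
        match_location(groups.get_group(trj_id))
    except Exception as e:
        print(f"Request for {trj_id} failed with message {e}")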

In [None]:
import requests


def match_location(group):
    """Map-match one trajectory with Valhalla and save it as a Feather file."""
    # Valhalla expects the trace as a list of {"lat": ..., "lon": ...} points
    shape = (
        group[["rawlat", "rawlng"]]
        .rename(columns={"rawlat": "lat", "rawlng": "lon"})
        .to_dict(orient="records")
    )
    name = group["trj_id"].iloc[0]
    payload = {
        "shape": shape,
        "search_radius": 150,
        "shape_match": "map_snap",
        "costing": "auto",
        "format": "osrm",
    }

    url = "http://localhost:8002/trace_route"
    r = requests.post(url, json=payload)
    # Raise an error (caught by the calling loop) if the request failed
    r.raise_for_status()

    # Parsing the response: unmatched points come back as null tracepoints,
    # so they are given a placeholder location of [0.0, 0.0]
    tracepoints = r.json()["tracepoints"]
    locations = [tp["location"] if tp else [0.0, 0.0] for tp in tracepoints]

    # Save the data (reset_index returns a new frame, so the group is not modified)
    matched = group.reset_index(drop=True)
    matched["location"] = locations
    matched.to_feather(f"./data/matched/sgp/{name}.ftr")
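
To make the parsing step easier to follow, here is a minimal offline sketch of the part of the response that is used. The `sample_body` below is a hypothetical, heavily trimmed response (real responses carry more fields), and `extract_locations` is an illustrative helper name. Each tracepoint holds a `[lon, lat]` location, and a ping that could not be matched appears as `null`.

```python
import json

# Hypothetical, trimmed response body for a three-point trace
sample_body = """
{"tracepoints": [
  {"location": [103.8001, 1.3001], "name": "Example Road", "waypoint_index": 0},
  null,
  {"location": [103.8021, 1.3002], "name": "Example Road", "waypoint_index": 1}
]}
"""

PLACEHOLDER = [0.0, 0.0]  # location recorded for pings with no match


def extract_locations(body):
    """Return one [lon, lat] pair per input ping, using a placeholder for nulls."""
    tracepoints = json.loads(body)["tracepoints"]
    return [tp["location"] if tp is not None else PLACEHOLDER for tp in tracepoints]


locations = extract_locations(sample_body)
unmatched = sum(loc == PLACEHOLDER for loc in locations)
print(locations)
print(f"{unmatched} of {len(locations)} pings had no match")
```

Keeping one entry per input ping, including the placeholders, is what allows the matched locations to be written back as a column alongside the raw data.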

#### Visualising the outputs
The matched data can then be plotted to visualise the significance of map matching.
Several examples are shown below.
The yellow dots are the raw data, while the blue dots are the matched data.

In [None]:
from keplergl import KeplerGl


def visualise(trj_id, config=None):
    """Plot the raw and matched points of one trajectory and save the map as HTML."""
    # Read the matched file written by match_location
    df = pd.read_feather(f"./data/matched/sgp/{trj_id}.ftr")

    # Matched locations are stored as [lon, lat] pairs
    matched_df = pd.DataFrame({
        "lon": df["location"].apply(lambda x: x[0]),
        "lat": df["location"].apply(lambda x: x[1]),
    })

    raw_df = df[["rawlng", "rawlat"]].rename(columns={"rawlng": "lon", "rawlat": "lat"})

    title = f"{trj_id}.html"
    kepler_map = KeplerGl(height=600, width=600)
    # add data to keplergl map
    kepler_map.add_data(data=raw_df, name="Raw data")
    kepler_map.add_data(data=matched_df, name="Matched data")
    if config:
        kepler_map.config = config
    kepler_map.save_to_html(file_name=title)
    return kepler_map

### Good example outputs
- 999
- 2644
- 3026
- 7387
- 9173

##### 9173

In [12]:
IFrame('./data/matched/final_maps/9173.html', width=600, height=600)

##### 7387

In [15]:
IFrame('./data/matched/final_maps/7387.html', width=600, height=600)

##### 3026

In [14]:
IFrame('./data/matched/final_maps/3026.html', width=600, height=600)