# Visualizing Adatrap data (public transportation trips in Santiago) in Flowmap using the Unitrip format

This notebook shows how to use the `unitrip` module to convert public transportation trip data to the Unitrip format. The resulting data is then converted into the input files needed to visualize it in flowmap.blue.

## Preamble

In [1]:
import seaborn as sns
import pandas as pd
import geopandas as gpd

import sys
import os.path
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(os.getcwd()))))

%matplotlib inline
sns.set(context='notebook', font='Lucida Sans Unicode', style='white', palette='plasma')

Load the Adatrap public transportation trip data.

In [3]:
adatrap_table = pd.read_csv('../data/other_format/adatrap/2019-05-19.viajes', sep='|')
# Keep the trip identifiers plus boarding/alighting stops and times for up to four stages.
columns = ['id', 'nviaje', 'netapa',
           'paraderosubida_1era', 'paraderosubida_2da', 'paraderosubida_3era', 'paraderosubida_4ta',
           'tiemposubida_1era', 'tiemposubida_2da', 'tiemposubida_3era', 'tiemposubida_4ta',
           'paraderobajada_1era', 'paraderobajada_2da', 'paraderobajada_3era', 'paraderobajada_4ta',
           'tiempobajada_1era', 'tiempobajada_2da', 'tiempobajada_3era', 'tiempobajada_4ta']
adatrap_table = adatrap_table[columns]
adatrap_table.head()


Unnamed: 0,id,nviaje,netapa,paraderosubida_1era,paraderosubida_2da,paraderosubida_3era,paraderosubida_4ta,tiemposubida_1era,tiemposubida_2da,tiemposubida_3era,tiemposubida_4ta,paraderobajada_1era,paraderobajada_2da,paraderobajada_3era,paraderobajada_4ta,tiempobajada_1era,tiempobajada_2da,tiempobajada_3era,tiempobajada_4ta
0,1163146,1,2,L-1-6-5-PO,T-20-73-OP-45,-,-,2019-05-19 16:10:32,2019-05-19 16:27:44,-,-,T-20-203-NS-10,-,-,-,2019-05-19 16:18:21,-,-,-
1,3819058,1,2,T-4-12-PO-20,E-17-12-SN-30,-,-,2019-05-19 14:50:24,2019-05-19 15:15:01,-,-,E-17-12-NS-25,L-17-39-5-NS,-,-,2019-05-19 15:07:57,2019-05-19 15:24:08,-,-
2,3819058,2,2,L-17-39-5-NS,E-17-12-SN-35,-,-,2019-05-19 20:01:32,2019-05-19 20:14:01,-,-,E-17-12-NS-20,T-3-12-OP-25,-,-,2019-05-19 20:08:34,2019-05-19 20:29:37,-,-
3,3862258,1,2,L-27-22-15-SN,LA CISTERNA L4A,-,-,2019-05-19 15:26:05,2019-05-19 15:43:13,-,-,I-26-228-SN-25,VICUNA MACKENNA,-,-,2019-05-19 15:39:50,2019-05-19 15:54:43,-,-
4,3862258,2,1,VICUNA MACKENNA,-,-,-,2019-05-19 18:36:56,-,-,-,-,-,-,-,-,-,-,-


Load the stop and station data, including their locations.

In [4]:
stations_table = pd.read_csv('../data/other_format/adatrap/DIC_777_fixed.csv', sep=',')
stations_table = stations_table[['parada/est.metro', 'x', 'y']]
stations_table.head()

Unnamed: 0,parada/est.metro,x,y
0,L-13-16-PO-5,335889,6292782
1,L-13-36-NS-7,334219,6292589
2,L-13-36-SN-17,334281,6292830
3,L-13-31-OP-5,334732,6292745
4,L-13-31-SN-10,334731,6292721


The `adatrap_to_unitrip` function converts the Adatrap data to the Unitrip format and stores the result as a Parquet file.

In [5]:
from unitrip.format_conversion.adatrap_converter import adatrap_to_unitrip

adatrap_to_unitrip(adatrap_table, stations_table, '../data/unified_format/santiago_adatrap_unitrip.parquet', h3_res=12)

There are 918270 trips from the first stop to the first depature
There are 230216 trips from the second stop to the second depature
There are 26509 trips from the third stop to the third depature
There are 666 trips from the fourth stop to the fourth depature
There are 1175661 trips in total
Fixing stations name format ...
There are 887331 trips with recognizable stations
File stored at ../data/unified_format/santiago_adatrap_unitrip.parquet
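The per-stage counts in the log add up to the reported total (918270 + 230216 + 26509 + 666 = 1175661), which suggests that every stage with both a boarding and an alighting stop becomes one record. The following is a minimal sketch of that reshaping under this assumption. It is not the code of `adatrap_to_unitrip`; the helper name `stage_legs` is hypothetical, and `-` is assumed to be the missing-value placeholder seen in the Adatrap table.

```python
# Hypothetical sketch: split one wide Adatrap row into per-stage legs.
# Not the implementation of adatrap_to_unitrip; '-' is assumed to mean "missing".
STAGE_SUFFIXES = ['1era', '2da', '3era', '4ta']

def stage_legs(row):
    """Return one dict per stage that has both a boarding and an alighting stop."""
    legs = []
    for suffix in STAGE_SUFFIXES:
        origin = row.get(f'paraderosubida_{suffix}', '-')
        destination = row.get(f'paraderobajada_{suffix}', '-')
        if origin == '-' or destination == '-':
            continue  # stage not taken, or alighting stop missing
        legs.append({
            'user_id': row['id'],
            'origin': origin,
            'destination': destination,
            'start_time': row.get(f'tiemposubida_{suffix}', '-'),
            'end_time': row.get(f'tiempobajada_{suffix}', '-'),
        })
    return legs

# Second row of the Adatrap table (unused third and fourth stages omitted).
example_row = {
    'id': 3819058,
    'paraderosubida_1era': 'T-4-12-PO-20', 'paraderobajada_1era': 'E-17-12-NS-25',
    'tiemposubida_1era': '2019-05-19 14:50:24', 'tiempobajada_1era': '2019-05-19 15:07:57',
    'paraderosubida_2da': 'E-17-12-SN-30', 'paraderobajada_2da': 'L-17-39-5-NS',
    'tiemposubida_2da': '2019-05-19 15:15:01', 'tiempobajada_2da': '2019-05-19 15:24:08',
}
legs = stage_legs(example_row)
print(len(legs))  # 2
```

A row such as the last one in the table, which has a boarding stop but no alighting stop, yields no legs in this sketch.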


## Let's aggregate trips and create flows in unitrip format (unitrip -> uniflow)

Load the Unitrip Parquet file into a DataFrame, keeping only the user and the origin and destination H3 cells.

In [6]:
santiago_unitrip = pd.read_parquet('../data/unified_format/santiago_adatrap_unitrip.parquet', columns=['user_id', 'o_h3_cell', 'd_h3_cell'])
santiago_unitrip.head()

trip_id,user_id,o_h3_cell,d_h3_cell
0,1163146,8cb2c55450d2dff,8cb2c5541c589ff
1,3819058,8cb2c55603401ff,8cb2c556d284bff
2,3819058,8cb2c519e4d0bff,8cb2c556d2a2dff
3,3862258,8cb2c544052e9ff,8cb2c546302b9ff
4,4235386,8cb2c540b81e1ff,8cb2c5476d221ff


Trips are aggregated over H3 cells at a coarser resolution (`flow_res`), and the resulting flows are filtered by a minimum number of trips per flow (`minimun_trips`).

In [7]:
from unitrip.unified_format.trip_to_flow import unitrip_to_uniflow

unitrip_to_uniflow(santiago_unitrip, '../data/unified_format/santiago_adatrap_uniflow.parquet', flow_res=8, minimun_trips=10)

Set the h3 cell columns to a parent resolution ...
Trips from one cell to the same cell are deleted, and the same OD trips made by a user are grouped together. ...
There are 866892 unique trips
Aggregating trips of the same OD and with a minimun number of trips ...
There are 14874 flows
File stored at ../data/unified_format/santiago_adatrap_uniflow.parquet
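The steps listed in the log can be sketched with the standard library alone. This is an illustration of the described logic, not the code of `unitrip_to_uniflow`: the function name `trips_to_flows` is hypothetical, the cells are assumed to be at the flow resolution already, and treating the threshold as inclusive (`>=`) is an assumption.

```python
from collections import Counter

def trips_to_flows(trips, minimum_trips=10):
    """trips: iterable of (user_id, origin_cell, destination_cell) tuples."""
    # Drop trips within one cell; a set keeps one record per (user, origin, destination).
    unique_trips = {(u, o, d) for u, o, d in trips if o != d}
    # Count the remaining trips per origin-destination pair.
    counts = Counter((o, d) for _, o, d in unique_trips)
    # Keep flows that reach the threshold (inclusive here, by assumption).
    return {od: n for od, n in counts.items() if n >= minimum_trips}

toy_trips = [
    ('u1', 'A', 'B'), ('u1', 'A', 'B'),  # repeated by the same user: counted once
    ('u2', 'A', 'B'),
    ('u3', 'C', 'C'),                    # same origin and destination: dropped
    ('u3', 'B', 'A'),
]
flows = trips_to_flows(toy_trips, minimum_trips=2)
print(flows)  # {('A', 'B'): 2}
```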


## Let's create the flowmap.blue input from flows in unitrip format (uniflow)

Load the Uniflow Parquet file into a DataFrame.

In [8]:
santiago_uniflow = pd.read_parquet('../data/unified_format/santiago_adatrap_uniflow.parquet')
santiago_uniflow.head()

Unnamed: 0,o_h3_cell,d_h3_cell,count
10,88b2c50825fffff,88b2c50827fffff,13
14,88b2c50825fffff,88b2c5086dfffff,27
18,88b2c50825fffff,88b2c5090bfffff,110
19,88b2c50825fffff,88b2c50919fffff,16
21,88b2c50825fffff,88b2c50929fffff,24
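The cell IDs in this table start with `88`, whereas the Unitrip cells started with `8c`: in an H3 cell index written as hexadecimal, the second digit encodes the resolution (8 and 12 here). As a sketch of what moving to a parent resolution means at the bit level, the hypothetical helper below derives a parent index with plain integer operations. In practice the h3 library provides this (`h3.cell_to_parent` in h3-py 4.x, `h3.h3_to_parent` in 3.x); whether `unitrip_to_uniflow` calls it is an assumption.

```python
def cell_to_parent_hex(cell_hex, parent_res):
    """Bit-level sketch of an H3 parent lookup (hypothetical helper, not the h3 API)."""
    index = int(cell_hex, 16)
    child_res = (index >> 52) & 0xF  # the resolution is stored in bits 52-55
    if not 0 <= parent_res <= child_res:
        raise ValueError('parent_res must be between 0 and the cell resolution')
    # Overwrite the resolution field with the parent resolution.
    index = (index & ~(0xF << 52)) | (parent_res << 52)
    # Each finer resolution uses 3 bits; unused digits are set to 7 (binary 111).
    unused_bits = 3 * (15 - parent_res)
    index |= (1 << unused_bits) - 1
    return format(index, 'x')

# Origin cell of the first Unitrip record, moved from resolution 12 to 8.
print(cell_to_parent_hex('8cb2c55450d2dff', 8))  # 88b2c55451fffff
```

The trailing run of `f` digits in the resolution-8 IDs comes from those unused digits being filled with ones.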


The `generate_flows_locations` function converts the flows (in Uniflow format) into the two input files that flowmap.blue expects: a flows file and a locations file.

In [9]:
from unitrip.use_cases.flowmap_blue.flow_to_flowmap import generate_flows_locations

generate_flows_locations(santiago_uniflow, '../data/use_cases/flowmap/santiago_adatrap_flows.csv', '../data/use_cases/flowmap/santiago_adatrap_locations.csv')

flows file stored at ../data/use_cases/flowmap/santiago_adatrap_flows.csv
locations file stored at ../data/use_cases/flowmap/santiago_adatrap_locations.csv
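flowmap.blue reads a locations table with the columns `id`, `name`, `lat`, `lon` and a flows table with the columns `origin`, `dest`, `count`. The sketch below shows one way such files could be produced from flow records. It is not the code of `generate_flows_locations`: the function `write_flowmap_csvs` and the `centroid_of` lookup are hypothetical, and the coordinates in the example are placeholders rather than real cell centers (a real pipeline would presumably take them from the h3 library, for example `h3.cell_to_latlng` in h3-py 4.x).

```python
import csv
import io

def write_flowmap_csvs(flows, centroid_of, flows_file, locations_file):
    """flows: iterable of (origin_cell, dest_cell, count); centroid_of: cell -> (lat, lon)."""
    flows = list(flows)
    flow_writer = csv.writer(flows_file, lineterminator='\n')
    flow_writer.writerow(['origin', 'dest', 'count'])
    flow_writer.writerows(flows)

    # Every cell that appears as an origin or destination needs a location row.
    cells = sorted({cell for o, d, _ in flows for cell in (o, d)})
    location_writer = csv.writer(locations_file, lineterminator='\n')
    location_writer.writerow(['id', 'name', 'lat', 'lon'])
    for cell in cells:
        lat, lon = centroid_of(cell)
        location_writer.writerow([cell, cell, lat, lon])

# First two flows from the Uniflow table; the coordinates are placeholders, not real centroids.
sample_flows = [('88b2c50825fffff', '88b2c50827fffff', 13),
                ('88b2c50825fffff', '88b2c5086dfffff', 27)]
placeholder_centroids = {cell: (0.0, 0.0) for flow in sample_flows for cell in flow[:2]}

flows_csv, locations_csv = io.StringIO(), io.StringIO()
write_flowmap_csvs(sample_flows, placeholder_centroids.get, flows_csv, locations_csv)
print(flows_csv.getvalue())
```

Writing to in-memory buffers keeps the example free of file paths; any open text file object works in their place.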


Once these two files have been uploaded to flowmap.blue, the visualization can be seen [here](https://www.flowmap.blue/1FYctUexmJY863rKdKh1EIXdcGeKrpJKMmLGbO8Mo9Ho).