# Partial Data - Smartcity TPN Traffic Light Preemption

First, we need to import some Python modules. All chart-related functions are in our [helper file](charts_helper/helper.py).

In [1]:
import pandas as pd
import triscale
import sys
sys.path.append('/home/rodrigo/charts')

from charts_helper import helper

## Getting started

We need to define the experiment parameters, considering the experiment's runtime durations and desired confidence intervals. 

Using [_Triscale_](https://github.com/romain-jacob/triscale) [[Paper]](http://doi.org/10.5281/zenodo.3464273), we see that we need at least 25 samples to obtain a confidence level of 90% at the 15th percentile while still being able to discard the worst outlier. Without the outlier exclusion, 15 samples are enough to get the same confidence level at the same percentile.

In [2]:
percentile = 15
confidence = 90
triscale.experiment_sizing(
    percentile, 
    confidence,
    robustness=1,
    verbose=True); 

A one-sided bound of the 	15-th percentile
with a confidence level of	90 % 
requires a minimum of 		25 samples
with the worst 			1 run(s) excluded
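
The sample counts quoted above can be cross-checked with a short binomial argument. The sketch below is not Triscale's implementation; `min_samples` is a hypothetical helper, and it assumes independent runs where each sample falls in the tail beyond the percentile with probability `min(p, 100 - p) / 100`. It searches for the smallest number of samples such that at least `robustness + 1` of them land in that tail with the requested confidence.

```python
from math import comb


def min_samples(percentile, confidence, robustness=0, max_n=10_000):
    """Smallest N such that at least `robustness + 1` of N i.i.d. samples
    fall beyond the percentile with probability >= confidence (in %)."""
    # Probability that a single sample falls in the tail cut off by the percentile.
    tail = min(percentile, 100 - percentile) / 100.0
    target = confidence / 100.0
    for n in range(robustness + 1, max_n + 1):
        # P(at most `robustness` samples land in the tail): binomial CDF.
        too_few = sum(
            comb(n, k) * tail**k * (1.0 - tail) ** (n - k)
            for k in range(robustness + 1)
        )
        if 1.0 - too_few >= target:
            return n
    raise ValueError("no sample size up to max_n reaches the requested confidence")


print(min_samples(15, 90, robustness=1))  # one outlier may be discarded
print(min_samples(15, 90, robustness=0))  # no outlier exclusion
print(min_samples(85, 90, robustness=1))  # complementary percentile
```

Under these assumptions the three calls reproduce the figures used in this notebook (25, 15 and 25 samples).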



Triscale also tells us that the required number of samples is symmetric: it is the same for a given percentile and for its complement (here, the 15th and the 85th).

In [3]:
percentile = 15 
confidence = 90 # the confidence level, in %

if (triscale.experiment_sizing(percentile,confidence,robustness=1) == 
    triscale.experiment_sizing(100-percentile,confidence,robustness=1)):
    print("It takes the same number of samples to estimate \
the \n{}-th and \n{}-th percentiles.".format(percentile, 100-percentile))

It takes the same number of samples to estimate the 
15-th and 
85-th percentiles.


## _Turin TuSTScenario_

This scenario was adapted from [this repository](https://github.com/marcorapelli/TuSTScenario) [[Paper]](https://ieeexplore.ieee.org/abstract/document/8958652). We made some minor changes to those files (the original work used mesoscopic simulation).

We chose three EVs to evaluate our proposal. There is nothing special about their routes, except that they avoid low-priority streets. We do not want leader vehicles to be blocked, which raises the global time-loss and can happen with any preemption strategy (including No Preemption, which does nothing). A fair comparison requires that no blocking situation occurs, except, of course, those caused by traffic jams. We record the GPS positions of our EVs during the simulation. The EVs' routes are shown below:

In [15]:
df = pd.read_csv('/home/rodrigo/charts/smartcity-tpn/turin-gps.csv')
helper.make_map(df,10,helper.make_title('route','turin'))

We restricted the vehicle entry window to 6h-8h AM. A single simulation (with a pre-defined seed) was run from 6h up to the EV's entry time (7h AM), and the simulation state was saved. Starting from that saved state, the EV was inserted into the network, and each subsequent simulation was given a new seed and run independently.

Hence, our experiment for this scenario is:

_Number of vehicles to be inserted in the original scenario (24h):_ **2,202,814**

_Number of vehicles to be inserted in our version (6h-8h):_ **175,873**

| EV  | Distance (m) | Traffic Lights |
|-----|:------------:|---------------:|
| EV1 | 9,961.68     | 35             |
| EV2 | 9,952.10     | 19             |
| EV3 | 6,059.09     | 31             |

Our results are presented below.

In [9]:
df = pd.read_csv('/home/rodrigo/charts/smartcity-tpn/turin.csv')
df_algs = df[df['alg'] != 'no-preemption']
df_no_preemption = df[df['alg'] == 'no-preemption']

We want to see how many times better (or worse) our algorithm and the others are than the No Preemption baseline. The metric is the Time-loss: the difference between the actual total travel time and the optimal total travel time, when the vehicle is free-flowing. Hence, the Time-loss Improvement is $\frac{timeloss~before}{timeloss~after}$ if $timeloss~before \geq timeloss~after$, and $-\frac{timeloss~after}{timeloss~before}$ otherwise.

In [10]:
helper.make_boxplot_grouped(df_algs, 'imp', helper.make_title('tl-imp','turin'))
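
As a minimal sketch of the signed ratio defined above: the function name `timeloss_improvement` is hypothetical (it is not part of `helper`), and we assume that "before" is the No Preemption time-loss and "after" is the time-loss of the evaluated algorithm, both strictly positive and in the same unit.

```python
def timeloss_improvement(before, after):
    """Signed 'how many times better' factor between two time-loss values."""
    if before <= 0 or after <= 0:
        raise ValueError("time-loss values must be strictly positive")
    if before >= after:
        # The algorithm is at least as good as the baseline.
        return before / after
    # The algorithm is worse than the baseline: report a negative factor.
    return -after / before


print(timeloss_improvement(300.0, 100.0))  # baseline 300 s, algorithm 100 s
print(timeloss_improvement(100.0, 300.0))  # baseline 100 s, algorithm 300 s
```

With this definition the factor is never strictly between -1 and 1: equal time-loss values give exactly 1, and any degradation gives a value below -1.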

If we are instead interested in the Time-Loss Improvement as a percentage, we use $(1-\frac{timeloss~after}{timeloss~before})\times100$. However, this metric is bounded above by $100\%$, while there is no limit on the negative side.

In [11]:
helper.make_boxplot_grouped(df_algs, 'perc', 'Time-Loss Improvement - {}'.format(helper.scenarios['turin']))
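
The asymmetry of the percentage metric can be illustrated with a few values. This is only a sketch: `timeloss_improvement_percent` is a hypothetical name, and the inputs are made-up numbers chosen for illustration, not measurements from our experiments.

```python
def timeloss_improvement_percent(before, after):
    """Percentage reduction of the time-loss relative to the baseline."""
    if before <= 0:
        raise ValueError("baseline time-loss must be strictly positive")
    return (1 - after / before) * 100


# Improvements approach, but never exceed, 100 %.
print(timeloss_improvement_percent(400.0, 100.0))
print(timeloss_improvement_percent(400.0, 0.0))
# Degradations are unbounded on the negative side.
print(timeloss_improvement_percent(100.0, 400.0))
```

A four-fold improvement therefore shows up as 75%, while a four-fold degradation shows up as -300%, which is why both views of the metric are presented.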

Next, we look at the Time-Loss variation of the No Preemption version alone, to see the potential and the limits of our improvement. For example, the No Preemption Time-Loss for EV2 varies between 340 and 360 seconds (with one outlier at 448.55 seconds). In this case we can obtain a good percentage improvement, as we did, but not a many-times improvement (see the first chart), at least compared with the other two EVs.

In [12]:
helper.make_boxplot(df_no_preemption, 'tl', helper.make_title('tl-no-preemption','turin'))

To better understand the situation, we can compare the raw Time-Loss of the solutions. This captures the scale of our improvement (lower is better).

In [13]:
helper.make_boxplot_grouped(df_algs, 'tl', helper.make_title('tl-algs','turin'))

Last but not least, we can check the runtime cost of the experiments. The medians are around 25k seconds (approximately 7 hours) per experiment. For this scenario, we could run up to 200 of the 300 simulations (25 seeds $\times$ 4 algorithms $\times$ 3 EVs) on our current computers.

In [14]:
helper.make_boxplot_grouped(df_algs, 'rt', helper.make_title('runtime','turin'))
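
The runtime figures above follow from simple arithmetic, sketched here. The constants are taken from the text; the sequential estimate additionally assumes that every run takes the median runtime, which is only a rough approximation.

```python
SEEDS = 25
ALGORITHMS = 4
EVS = 3
MEDIAN_RUNTIME_S = 25_000  # approximate median runtime of one simulation
COMPLETED_RUNS = 200       # simulations we could run so far

total_runs = SEEDS * ALGORITHMS * EVS
hours_per_run = MEDIAN_RUNTIME_S / 3600
completed_share = COMPLETED_RUNS / total_runs
# Rough sequential cost if every run took the median runtime (assumption).
sequential_hours = total_runs * hours_per_run

print(f"planned runs: {total_runs}")
print(f"hours per run: {hours_per_run:.2f}")
print(f"completed share: {completed_share:.0%}")
print(f"sequential estimate: {sequential_hours:.0f} h")
```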

## _TAPAS Cologne_

This scenario was adapted from [this page](https://sumo.dlr.de/docs/Data/Scenarios/TAPASCologne.html) [[Paper]](https://elib.dlr.de/45058/2/SRL_81_-_Beitrag_Varschen.pdf). We followed the same process as for the Turin TuSTScenario, so we only state the differences.

_Number of vehicles to be inserted in the original scenario (24h):_ **1,549,612**

_Number of vehicles to be inserted in our version (6h-8h):_ **252,754**

| EV  | Distance (m) | Traffic Lights |
|-----|:------------:|---------------:|
| EV1 | 8,351.42     | 26             |
| EV2 | 5,044.34     | 27             |
| EV3 | 6,344.31     | 24             |

The charts below can be interpreted in the same way as in the previous scenario. They show the map with the EVs' routes, followed by our results.

In [20]:
df = pd.read_csv('/home/rodrigo/charts/smartcity-tpn/cologne-gps.csv')
helper.make_map(df,9.25,helper.make_title('route','cologne'))

In [16]:
df = pd.read_csv('/home/rodrigo/charts/smartcity-tpn/cologne.csv')
df_algs = df[df['alg'] != 'no-preemption']
df_no_preemption = df[df['alg'] == 'no-preemption']

In [17]:
helper.make_boxplot_grouped(df_algs, 'imp', helper.make_title('tl-imp','cologne'))

In [18]:
helper.make_boxplot_grouped(df_algs, 'perc', helper.make_title('tl-perc','cologne'))

In [19]:
helper.make_boxplot(df_no_preemption, 'tl', helper.make_title('tl-no-preemption', 'cologne'))

In [20]:
helper.make_boxplot_grouped(df_algs, 'tl', helper.make_title('tl-algs','cologne'))

In [21]:
helper.make_boxplot_grouped(df_algs, 'rt', helper.make_title('runtime', 'cologne'))

## _Metro OD SP 2017_

This scenario was built in this work, using the 2017 survey data from Metro SP (available [here](https://transparencia.metrosp.com.br/dataset/pesquisa-origem-e-destino/resource/4362eaa3-c0aa-410a-a32b-37355c091075)).

The coverage area is the Metropolitan Region of São Paulo, which comprises 39 cities and a total of 517 survey zones (see picture below). A trip in this survey has an origin zone, a destination zone, a travel mode, and a departure time (within a 24-hour daily window). The original data is a sample of ~90k daily trips, which corresponds to 28M motorized (including shared) trips.

| ![Coverage Area - OD Survey - Metro SP 2017](./smartcity-tpn/coverage-area.png) |
|:--:| 
| *Source: [This PDF, page 33](https://transparencia.metrosp.com.br/dataset/pesquisa-origem-e-destino/resource/b3d93105-f91e-43c6-b4c0-8d9c617a27fc)* |

We use this data to generate 11M vehicle trips (for the shared trips, we assumed an average load of 40 passengers per vehicle), including ordinary automobiles, motorcycles and buses. The complete SUMO network and files are available [here](https://gitlab.com/rodrigo.g.branco/metro-od-2017).
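
To illustrate how an average load turns passenger trips into vehicle trips, here is a small sketch. It is not the generation script used for this scenario: `vehicle_trips` is a hypothetical helper, it assumes one vehicle per individual (non-shared) trip, and the input numbers are made up for illustration.

```python
from math import ceil


def vehicle_trips(individual_trips, shared_passenger_trips, avg_load=40):
    """Estimate vehicle trips from passenger trips.

    Assumptions (not taken from the survey): every individual trip uses its
    own vehicle, and shared trips are grouped `avg_load` passengers per vehicle.
    """
    if avg_load <= 0:
        raise ValueError("avg_load must be positive")
    return individual_trips + ceil(shared_passenger_trips / avg_load)


# Illustrative numbers only: 1,000 individual trips and 400 shared passenger trips.
print(vehicle_trips(1_000, 400))
```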

However, SUMO cannot handle this amount of data in a reasonable running time. Because of that, we limited our experiment to an area known as the Expanded Center of São Paulo (we present this area when the EVs are introduced). Besides that, the 6h-8h window used in the previous scenarios is still too much for SUMO here (approx. 1M trips). So the departure window for this scenario is, in fact, 6h30-7h30.

With that in mind, we have this configuration:

_Number of vehicles to be inserted in the original (complete) scenario (24h):_ **13,215,558**

_Number of vehicles to be inserted in our version, restricted to the Expanded Center (6h30-7h30):_ **601,114**

| EV  | Distance (m) | Traffic Lights |
|-----|:------------:|---------------:|
| EV1 | 17,760.26    | 86             |
| EV2 | 6,912.21     | 15             |
| EV3 | 11,486.74    | 61             |

Now we can see our EVs' routes, plus the boundary of the Expanded Center.

In [22]:
df = pd.read_csv('/home/rodrigo/charts/smartcity-tpn/expanded-sp-gps.csv')
helper.make_map(df, 11.10, helper.make_title('route','metro-od-2017'))

In [23]:
df = pd.read_csv('/home/rodrigo/charts/smartcity-tpn/expanded-center.csv')
df_algs = df[df['alg'] != 'no-preemption']
df_no_preemption = df[df['alg'] == 'no-preemption']

In [24]:
helper.make_boxplot_grouped(df_algs, 'imp', helper.make_title('tl-imp', 'metro-od-2017'))

In [25]:
helper.make_boxplot_grouped(df_algs, 'perc', helper.make_title('tl-perc', 'metro-od-2017'))

In [26]:
helper.make_boxplot(df_no_preemption, 'tl', helper.make_title('tl-no-preemption', 'metro-od-2017'))

In [27]:
helper.make_boxplot_grouped(df_algs, 'tl', helper.make_title('tl-algs', 'metro-od-2017'))

In [28]:
helper.make_boxplot_grouped(df_algs, 'rt', helper.make_title('runtime','metro-od-2017'))