# TriScale Demo

This notebook demonstrates TriScale: what the framework's functions do and how to use them.


## List of Imports

In [None]:
import os
from pathlib import Path

import pandas as pd
import numpy as np

import triscale


%load_ext autoreload
%autoreload 2

## Experiment sizing

During the design phase of an experiment, one important question to answer is "how many times should the experiment be performed?"

... 

TriScale's `experiment_sizing()` function directly returns the minimal number of samples required to estimate a given KPI (a percentile) with a given confidence level. You can try out sample values in the following cell.

In [None]:
# Select the percentile we want to estimate 
percentile = 50

# Select the desired level of confidence for the estimation
confidence = 95 # in %

# Compute the minimal number of samples N required
triscale.experiment_sizing(
    percentile, 
    confidence,
    verbose=True);              
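
For a one-sided interval, the reasoning behind such a number can be sketched in a few lines. If `p` is the probability mass in the smaller tail of the percentile, all `N` independent samples miss that tail with probability `(1 - p)**N`, so the most extreme sample on that side bounds the percentile with confidence `1 - (1 - p)**N`. The helper below, `min_samples_one_sided`, is a hypothetical name and only illustrates this order-statistics argument under the assumption of independent, identically distributed samples; it is not TriScale's implementation and ignores any other options `experiment_sizing()` may offer.

```python
import math


def min_samples_one_sided(percentile, confidence):
    """Smallest N such that 1 - (1 - p)**N >= confidence (illustrative sketch).

    percentile and confidence are given in percent, as in the cell above.
    """
    p = percentile / 100
    c = confidence / 100
    # By symmetry, the 75th percentile needs as many samples as the 25th.
    p = min(p, 1 - p)
    # Solve (1 - p)**N <= 1 - c for the smallest integer N.
    return math.ceil(math.log(1 - c) / math.log(1 - p))


print(min_samples_one_sided(50, 95))  # 5
```

With the median and a 95% confidence level, five samples are enough because `1 - 0.5**5 = 0.96875` exceeds 0.95, whereas four samples only reach 0.9375.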

To get a better feeling for how this minimal number of samples evolves with increasing confidence and more extreme percentiles, let us compute it over a range of values and display the results in a table (rows are confidence levels, columns are the percentiles to estimate).

In [None]:
percentiles = [0.1, 1, 5, 10, 25, 50, 75, 90, 95, 99, 99.9]
confidences = [75, 90, 95, 99, 99.9, 99.99]
min_number_samples = []

for c in confidences:
    tmp = []
    for p in percentiles:
        N = triscale.experiment_sizing(p,c)
        tmp.append(N[0])
    min_number_samples.append(tmp)
    
df = pd.DataFrame(columns=percentiles, data=min_number_samples)
df['Confidence level'] = confidences
df.set_index('Confidence level', inplace=True)

display(df)

Conversely, we can derive what inter-sample ranges define a CI for a certain percentile and confidence level.
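
One way to picture this, assuming independent and identically distributed samples: the number of samples that fall at or below the p-th percentile follows a binomial distribution, so the probability that the k-th smallest sample lies at or below the percentile is `P(Binom(N, p) >= k)`. The sketch below searches for the largest such `k` that still meets the confidence level, which gives a one-sided lower bound. `lower_bound_rank` is a hypothetical helper written for this illustration, not a TriScale function.

```python
import math


def lower_bound_rank(n_samples, percentile, confidence):
    """Rank (1-indexed, in sorted order) of the sample that lower-bounds the
    percentile with at least the requested confidence, or None if there are
    too few samples. percentile and confidence are in percent.
    """
    p = percentile / 100
    c = confidence / 100
    best = None
    cdf = 0.0  # running value of P(Binom(n_samples, p) <= k - 1)
    for k in range(1, n_samples + 1):
        cdf += math.comb(n_samples, k - 1) * p ** (k - 1) * (1 - p) ** (n_samples - k + 1)
        if 1 - cdf >= c:
            best = k  # the k-th smallest sample is still a valid bound
        else:
            break  # the probability only decreases as k grows
    return best


print(lower_bound_rank(5, 50, 95))   # 1
print(lower_bound_rank(10, 50, 95))  # 2
print(lower_bound_rank(4, 50, 95))   # None
```

With five samples, only the smallest one bounds the median at 95% confidence; with ten samples, the second smallest does, which yields a tighter bound.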

(...)

## Computation of metrics

TriScale computes a metric ... (blablabla)

Valid input format: a two-dimensional series used to compute
the metric, with one control variate (`x`) and one independent variate (`y`).
- When a string is passed, `data` is expected to be a name of a csv file
(comma separated) with `x` data in the first column and `y` data in the
second column.
- When a pandas DataFrame is passed, `data` must contain (at least)
columns named `x` and `y`.

In [None]:
# Input data file
data = 'ExampleData/bbr_datalink_delay_run9_flow1.csv'

# Definition of a TriScale metric
metric = {  'measure': 50,   # The percentile used as metric
            'name': 'One-way delay',   # For plotting only
            'unit': 'ms',     # For plotting only
            'bounds': [0,100], # Used for scaling (see convergence test description)
         }

convergence = {'expected': True}

has_converged, metric_measure, plot = triscale.analysis_metric( 
    data,
    metric,
    plot=True,
    convergence=convergence,
    custom_layout=None)

print(has_converged, metric_measure)

When `plot=True` is passed as an argument to the `analysis_metric` function, the raw data is plotted together with the convergence test information (if convergence is expected). To see this better, we can zoom in on the previous plot:

In [None]:
plot.update_layout(yaxis_range=[64,68])
plot.show()

The `Metric` data are computed over a sliding window of the measured data points (`Data`). The convergence test consists of performing a linear regression (`Slope`). TriScale defines that a run _has converged_ when the slope of the linear regression is sufficiently close to 0.

Formally, a run _has converged_ if the confidence interval for the regression slope falls within the tolerance. TriScale uses default values of 95% confidence level and 1% tolerance (controllable by the user; see the `convergence` parameter of the `analysis_metric` function).
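
The decision rule can be sketched as follows. This is a minimal illustration, not TriScale's code: it swaps in an ordinary least-squares fit with a normal-approximation confidence interval, whereas TriScale's own regression method may differ. It also assumes that `x` is rescaled to [0, 1] from its observed range and `y` from the metric `bounds`. The names `slope_confidence_interval` and `slope_within_tolerance` are hypothetical.

```python
import math
from statistics import NormalDist, mean


def slope_confidence_interval(xs, ys, confidence=0.95):
    """Least-squares slope CI using a normal approximation (assumption)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return slope - z * std_err, slope + z * std_err


def slope_within_tolerance(xs, ys, bounds, tolerance=0.01, confidence=0.95):
    """True if the whole slope CI lies within [-tolerance, +tolerance]."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = bounds
    # Rescale both axes to [0, 1] so the tolerance is unit-free (assumption).
    xs_scaled = [(x - x_min) / (x_max - x_min) for x in xs]
    ys_scaled = [(y - y_min) / (y_max - y_min) for y in ys]
    low, high = slope_confidence_interval(xs_scaled, ys_scaled, confidence)
    return -tolerance <= low and high <= tolerance


x = list(range(20))
steady = [50 + (0.01 if i % 2 else -0.01) for i in x]  # flat, tiny noise
drifting = [10 + 4 * i for i in x]                     # clear upward trend
print(slope_within_tolerance(x, steady, bounds=(0, 100)))
print(slope_within_tolerance(x, drifting, bounds=(0, 100)))
```

Requiring the whole interval, and not just the point estimate, to lie within the tolerance means that a noisy run with too few points is not declared converged by chance.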

## Computation of KPIs

TriScale computes a metric ... (blablabla)

## Computation of Variability Scores

TriScale computes a metric ... (blablabla)

## Network profiling

In [None]:
# Recompute
# data_file = Path('UseCase_Glossy/Data_FlockLab/2019-08_FlockLab_sky.csv')
# df = flocklab.parse_data_file(str(data_file), active_link_threshold=50)

# Load
df = pd.read_csv('ExampleData/network_profiling.csv')

link_quality_bounds = [0,100]
link_quality_name = 'PRR [%]'

# Produce the plot
fig_theil, fig_autocorr = triscale.network_profiling(
                            df, 
                            link_quality_bounds, 
                            link_quality_name,
                            )
fig_autocorr.show()
fig_theil.show()
