# Atlas Trends API demonstration

The RIPE Atlas Trends API allows one to find patterns (*clusters*) in RIPE Atlas RTT measurements, in a way similar to what a human expert would do. The clustering is done using a nonparametric Bayesian model, the Hierarchical Dirichlet Process Hidden Markov Model (HDP-HMM).

1. [Minimal Example](#Minimal-Example)
1. [Examples](#Examples)
1. [API Endpoints](#API-Endpoints)

Notebook cells can be run by pressing <kbd>Shift</kbd>+<kbd>Enter</kbd>.

In [None]:
# Make the `trends` module (API client and various utilities) importable.
import sys
try:
    import google.colab  # Only importable when running on Google Colab.
    !git clone https://github.com/maxmouchet/atlas-trends-demo.git
    sys.path.append('atlas-trends-demo')
except ImportError:
    # Not on Colab: look for the module in the parent directory.
    sys.path.append('..')

In [None]:
from trends import *

In [None]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
set_mpl_style(font_size=12)

## Minimal Example

In this section we show the minimal code necessary to fetch a time series from the API and to display the segmentation.

In [None]:
client = AtlasTrendsClient(verbose=True) # The `verbose` flag enables query time reporting.

In [None]:
df = client.fetch_trends(
    msm_id   = 1437285,                   # Atlas measurement ID
    prb_id   = 6222,                      # Atlas probe ID
    start_dt = utc_datetime(2018, 5, 2),  # (Optional) Default: stop date - 7 days
    stop_dt  = utc_datetime(2018, 5, 10), # (Optional) Default: the current date
    as_df    = True                       # (Optional) Returns a Pandas DataFrame instead of a JSON object
)

In [None]:
df.head(3)

In [None]:
plot_trends(df)

## Examples

### Persistent Congestion

In [None]:
df = client.fetch_trends(1791307, 6042, start_dt=utc_datetime(2018,5,2), stop_dt=utc_datetime(2018,5,10), as_df=True)
plot_trends(df)

In [None]:
plot_trends(df)
plt.xlim(utc_datetime(2018, 5, 3, 10), utc_datetime(2018, 5, 6, 10));

In this example some link on the path seems to experience periodic congestion in the evening.  
*(Ticks on the x-axis correspond to midnight UTC.)*

The green state, which lasts 40 minutes on average, seems to correspond to a high traffic level on a link that is not yet saturated.  
The pink state, which lasts 2h45 on average, seems to correspond to a saturated link.  

In [None]:
plt.figure(figsize=(4, 2.5))
plot_kde(df, states=[3, 6])
plt.legend(['pre-congestion state', 'congestion state']);
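The average durations quoted for the two states can be derived from segment boundaries. The sketch below shows the arithmetic on synthetic `(state, start, stop)` tuples. The tuple layout, the helper name `mean_duration_by_state`, and the values are assumptions made for illustration. They are not output of the API and not the measured data.

```python
from collections import defaultdict

def mean_duration_by_state(segments):
    """Average segment length (in seconds) for each state.

    `segments` is an iterable of (state, start, stop) tuples with
    start/stop in seconds. This layout is hypothetical.
    """
    totals = defaultdict(lambda: [0, 0])  # state -> [total seconds, count]
    for state, start, stop in segments:
        totals[state][0] += stop - start
        totals[state][1] += 1
    return {state: total / count for state, (total, count) in totals.items()}

# Synthetic segments (NOT real data): two visits to each of states 3 and 6.
segments = [
    (3, 0, 1800),       # 30 min
    (6, 1800, 10800),   # 150 min
    (3, 10800, 13800),  # 50 min
    (6, 13800, 24600),  # 180 min
]
means = mean_duration_by_state(segments)
for state, seconds in sorted(means.items()):
    print('state {}: {:.0f} min on average'.format(state, seconds / 60))
```

With these synthetic values the averages come out at 40 and 165 minutes.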

## API Endpoints

The API base URL is https://trends.atlas.ripe.net/api/v1/.

The API offers 3 endpoints:

Method | Path | Parameters | Description | Example
:------|:-----|:-----------|:------------|:-------
GET | **`/trends/:msm_id/:prb_id`** | `start`, `stop` | Segment a time series and return the RTT and its associated state | [/trends/1437285/6222?start=1525212000&stop=1525298400](https://trends.atlas.ripe.net/api/v1/trends/1437285/6222?start=1525212000&stop=1525298400)
GET | **`/trends/:msm_id/:prb_id/summary`** | `start`, `stop` | Segment a time series and return the segments | [/trends/1437285/6222/summary?start=1525212000&stop=1525298400](https://trends.atlas.ripe.net/api/v1/trends/1437285/6222/summary?start=1525212000&stop=1525298400)
GET | **`/ticks/:msm_id/:prb_id`** | `start`, `stop` | Output the ticks (with deduplication, ...) | [/ticks/1437285/6222?start=1525212000&stop=1525298400](https://trends.atlas.ripe.net/api/v1/ticks/1437285/6222?start=1525212000&stop=1525298400)

- `start` and `stop` are UTC times, given either as a Unix timestamp or as `YYYY-MM-DDTHH:MM`, where `THH:MM` is optional (defaults to the start of the day).
- Requests fail for time windows longer than a month or shorter than 100 ticks. A tick is one run of the measurement, so for a measurement that runs every 4 minutes (the default) the window must span at least 400 minutes.
- Segmentation time grows linearly with the number of observations.
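As a minimal sketch of the constraints above, the snippet below builds a `/trends` URL with Unix timestamps after checking the window length. The helper `build_trends_url` is hypothetical and not part of the `trends` module. Reading "a month" as 31 days is an assumption, as is the default 4-minute interval passed to the helper.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

BASE_URL = "https://trends.atlas.ripe.net/api/v1"  # base URL given above
MIN_TICKS = 100                  # minimum window length, in ticks
MAX_WINDOW = timedelta(days=31)  # assumption: "a month" read as 31 days

def build_trends_url(msm_id, prb_id, start, stop, interval=timedelta(minutes=4)):
    """Hypothetical helper: validate the window, then build a /trends URL."""
    window = stop - start
    if window < MIN_TICKS * interval:
        raise ValueError("window shorter than {} ticks".format(MIN_TICKS))
    if window > MAX_WINDOW:
        raise ValueError("window longer than a month")
    # The API accepts Unix timestamps (seconds since the epoch, UTC).
    query = urlencode({"start": int(start.timestamp()),
                       "stop": int(stop.timestamp())})
    return "{}/trends/{}/{}?{}".format(BASE_URL, msm_id, prb_id, query)

start = datetime(2018, 5, 2, tzinfo=timezone.utc)
url = build_trends_url(1437285, 6222, start, start + timedelta(days=1))
print(url)
```

Using timezone-aware `datetime` objects keeps the timestamps independent of the local time zone of the machine running the notebook.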

In [None]:
client = AtlasTrendsClient()

In [None]:
query = {
    'msm_id':   1437285,
    'prb_id':   6222,
    'start_dt': utc_datetime(2018, 5, 2),
    'stop_dt':  utc_datetime(2018, 5, 2, 12)
}

**`Ticks` endpoint**

The `/ticks` endpoint returns the minimum RTT for a given measurement and probe pair at a constant time interval. Duplicated results caused by probe connectivity problems are suppressed, and missing results are explicitly inserted.

In [None]:
res = client.fetch_ticks(**query)
schema = res['metadata']['schema']
print(schema)

In [None]:
for (i, result) in enumerate(res['results'][:2]):
    print('\nResult #{}'.format(i))
    for (key, value) in zip(schema, result):
        print('- {} = {}'.format(key, value))
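To make the regularisation described for the `/ticks` endpoint concrete, here is a rough sketch of one way to put raw results on a constant-interval grid. It is not the service's implementation. The function `regularize`, the `(timestamp, rtt)` input layout, and the choice to keep the smallest RTT when several results fall in the same tick are all assumptions.

```python
def regularize(results, start, stop, interval):
    """Return one (tick_time, rtt_or_None) pair per interval in [start, stop).

    `results` is an iterable of (timestamp, rtt) pairs in seconds and
    milliseconds; this layout is hypothetical.
    """
    best = {}
    for ts, rtt in results:
        if rtt is None or not (start <= ts < stop):
            continue
        # Snap the timestamp to the beginning of its tick.
        tick = start + ((ts - start) // interval) * interval
        # Duplicates within a tick collapse to a single (smallest) value.
        best[tick] = min(rtt, best.get(tick, rtt))
    # Ticks without any result are inserted explicitly as None.
    return [(tick, best.get(tick)) for tick in range(start, stop, interval)]

# Synthetic input: one duplicate, two results in the same tick, one gap.
raw = [(0, 10.2), (0, 10.2), (245, 11.0), (250, 10.5), (725, 12.1)]
ticks = regularize(raw, start=0, stop=960, interval=240)
print(ticks)
# [(0, 10.2), (240, 10.5), (480, None), (720, 12.1)]
```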

**`Trends` endpoint**

The `/trends` endpoint returns the minimum RTT and the associated segmentation.

In [None]:
res = client.fetch_trends(**query)
schema = res['metadata']['schema']
print(schema)

In [None]:
for (i, result) in enumerate(res['results'][:2]):
    print('\nResult #{}'.format(i))
    for (key, value) in zip(schema, result):
        print('- {} = {}'.format(key, value))

**`Summary` endpoint**

A summary of the time series can also be requested by appending `/summary` to the path.

In [None]:
res = client.fetch_summary(**query)

In [None]:
res['states']

In [None]:
res['segments']

### DataFrame conversion

*Ticks* and *trends* results can be converted to a [Pandas](https://pandas.pydata.org/) DataFrame, either with the `to_dataframe` function or by passing `as_df=True` to the fetch call.

In [None]:
res = client.fetch_trends(**query)
to_dataframe(res).head(2)

In [None]:
client.fetch_trends(**query, as_df=True).head(2)
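Because a response carries a schema and a list of rows, the conversion can also be approximated by hand. The sketch below is a guess at the general idea, not the actual code of `to_dataframe`. The column names `timestamp`, `rtt` and `state` and the sample rows are hypothetical; the real names are the ones printed from `res['metadata']['schema']`.

```python
import pandas as pd

# Hypothetical response with the same overall shape as the API results.
res = {
    "metadata": {"schema": ["timestamp", "rtt", "state"]},
    "results": [
        [1525219200, 10.2, 1],
        [1525219440, 10.4, 1],
    ],
}

# One column per schema entry, one row per result.
df = pd.DataFrame(res["results"], columns=res["metadata"]["schema"])

# Interpret the (assumed) Unix timestamps as UTC and use them as the index.
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="s", utc=True)
df = df.set_index("timestamp")
print(df)
```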