# Distribution Plot

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/labmlai/inspectus/blob/main/notebooks/distribution_plot.ipynb)

This notebook demonstrates the distribution plot in inspectus. A distribution plot shows how the distribution of a data series changes from step to step.
At each step, the distribution of the data is calculated and a maximum of 5 bands is drawn from 9 basis points (percentiles): 0, 6.68, 15.87, 30.85, 50.00, 69.15, 84.13, 93.32, and 100.00.
Use the minimap to focus on and zoom into part of the plot. To select a single plot, use the legend at the top right.
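
As a rough illustration of what "calculating the distribution at one step" involves, the sketch below evaluates the nine basis points on a sample with NumPy. It does not call inspectus; the names `BASIS_POINTS`, `step_values`, and `band_edges` are hypothetical, and the library's internal computation may differ. It also checks an observation: the inner seven basis points coincide with the standard normal CDF at -1.5σ to +1.5σ in steps of 0.5σ. Whether inspectus derives them this way is an assumption.

```python
import math

import numpy as np

# The nine basis points listed above, as percentiles.
BASIS_POINTS = [0, 6.68, 15.87, 30.85, 50.00, 69.15, 84.13, 93.32, 100.00]

# Hypothetical data for a single step: 1000 samples from a standard normal.
rng = np.random.default_rng(0)
step_values = rng.normal(loc=0.0, scale=1.0, size=1000)

# Value of the data at each basis point. Neighbouring edges bound the bands.
band_edges = np.percentile(step_values, BASIS_POINTS)


def normal_cdf_percent(z):
    """Standard normal CDF at z, expressed as a percentage."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


# The inner seven basis points match the normal CDF at these z-scores.
z_scores = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
cdf_points = [round(normal_cdf_percent(z), 2) for z in z_scores]

print(band_edges)
print(cdf_points)
```

The first and last edges are the minimum and maximum of the sample, and the middle edge is its median.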

![Training Loss Zoomed in](../assets/mnist_train_loss_tail.png)

Refer to the function documentation for more information. We'll explore common use cases in this notebook.

In [None]:
!pip install inspectus
!pip install --upgrade altair
!pip install labml

In [None]:
# Line chart: a single value at each point
import inspectus

inspectus.distribution({'x': list(range(100))})

To plot a distribution, provide a list of values at each point instead of a single value. The plot then shows the distribution of those values at each point.

In [None]:
import numpy as np

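def random_loss():
    """Return a (500, 100) array: 500 steps, each with 100 noisy samples around a decaying 1/x**2 curve."""
    return np.array([[1/x**2 + var/5 for var in np.random.rand(100)] for x in np.linspace(1, 5, num=500)])

train_loss = random_loss()
valid_loss = random_loss() + 1  # shift up so the two series are easy to tell apart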

inspectus.distribution({'training_loss': train_loss, 'validation_loss': valid_loss})

In [None]:
# Without the mean
inspectus.distribution({'training_loss': train_loss, 'validation_loss': valid_loss}, include_mean=False)

In [None]:
# With borders
inspectus.distribution({'training_loss': train_loss, 'validation_loss': valid_loss}, include_borders=True)

The `distribution` function also directly supports the output format of the inspectus data logger.

In [None]:
from inspectus import data_logger

d_log = data_logger.DataLogger('sample')
d_log.clear()
# Save one entry (the list of values) per step
for i, entry in enumerate(train_loss):
    d_log.save('train_loss', entry, i)
for i, entry in enumerate(valid_loss):
    d_log.save('valid_loss', entry, i)

# Read the logged series back
tl = d_log.read('train_loss')
vl = d_log.read('valid_loss')

# Plot the data read from the logger
inspectus.distribution({'training_loss': tl, 'validation_loss': vl})

When passing a list of dictionaries as data, each dictionary should use one of these formats:
  - `"values": [list of values], "step": step`
    - In this case the distribution is calculated for each step.
  - `"histogram": [list of values for basis points], "step": step, "mean": mean`
    - In this case the given values are used directly to render the bands on the plot.
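
The two formats can be sketched as plain Python data structures. This is a minimal illustration that assumes the `"histogram"` list holds the data values at the nine basis points in the order listed earlier, and that plain lists and floats are acceptable; the names `series`, `values_format`, and `histogram_format` are hypothetical. The final call is left commented out, as a suggestion of how the result might be passed on.

```python
import numpy as np

# The nine basis points from the introduction, as percentiles.
BASIS_POINTS = [0, 6.68, 15.87, 30.85, 50.00, 69.15, 84.13, 93.32, 100.00]

# Hypothetical series: 5 steps with 100 samples each.
rng = np.random.default_rng(0)
series = rng.random((5, 100))

# Format 1: raw values per step; the distribution is left to be calculated.
values_format = [
    {"values": row.tolist(), "step": step}
    for step, row in enumerate(series)
]

# Format 2: values at the basis points plus the mean, precomputed per step.
# Assumption: the list follows the order of BASIS_POINTS.
histogram_format = [
    {
        "histogram": np.percentile(row, BASIS_POINTS).tolist(),
        "step": step,
        "mean": float(row.mean()),
    }
    for step, row in enumerate(series)
]

# Either list could then be used as a series, for example:
# inspectus.distribution({"loss": values_format})
```

The second format is useful when only summary statistics are stored per step rather than every sample.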

To calculate the percentiles at the basis points manually, use `inspectus.series_to_distribution`.