<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#New-features" data-toc-modified-id="New-features-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>New features</a></span></li><li><span><a href="#New-in-v0.4.0" data-toc-modified-id="New-in-v0.4.0-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>New in v0.4.0</a></span></li><li><span><a href="#Setup" data-toc-modified-id="Setup-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Setup</a></span></li><li><span><a href="#Data" data-toc-modified-id="Data-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Data</a></span></li><li><span><a href="#Representation" data-toc-modified-id="Representation-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Representation</a></span></li><li><span><a href="#Support-for-Indices-(including-Date-dtype)" data-toc-modified-id="Support-for-Indices-(including-Date-dtype)-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Support for Indices (including <code>Date</code> dtype)</a></span></li><li><span><a href="#Customization" data-toc-modified-id="Customization-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Customization</a></span></li><li><span><a href="#Custom-Graph-Objects" data-toc-modified-id="Custom-Graph-Objects-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Custom Graph Objects</a></span></li></ul></div>

# Jupyter DataTables v0.3.0 - ChartJS

<br>

## New features

- **ChartJS** charts (see https://github.com/CermakM/jupyter-datatables/issues/9)
    - [x] Create `Bar` graph object
    - [x] Create `CategoricalBar` graph object
    - [x] [optional] Create `Line` graph object
    - [x] [optional] Create `Scatter` graph object
    - [x] Create `Histogram` graph object
    - [x] <del>Create `TimeSeries` graph object</del> Implemented via `Line` with a timeseries index
    - [x] ChartJS graphs are persistent
    - [x] [stretch] There is a link between the table and ChartJS tooltip
    
- **modular** architecture (see https://github.com/CermakM/jupyter-datatables/issues/10)
    - [x] it is possible to add custom data type mapping from Jupyter Notebook
    - [x] it is possible to map data types to custom plotting function directly from Jupyter Notebook
    - [x] custom graph objects
    
- interactive **tooltips**
- static mode is more explanatory
- sample size includes outliers
    
## New in v0.4.0

- [x] Interactive chart **Toolbar**
- [x] Graph type selection via the toolbar settings
- [x] Refactorings:
    - `dTypePlotMap` -> `chartMap`

## Setup

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import sys
import string

import numpy as np
import pandas as pd

In [3]:
sys.path.insert(0, '/home/macermak/code/jupyter-datatables/')

In [4]:
from jupyter_datatables import enable
from jupyter_datatables import disable
from jupyter_datatables import init_datatables_mode

In [5]:
init_datatables_mode()

<JupyterRequire.display.SafeScript object>


---

## Data

In [6]:
df      = pd.DataFrame(np.random.randn(50, 5), columns=list(string.ascii_uppercase[:5]))
df_long = pd.DataFrame(np.random.randn(int(1e5), 5), columns=list(string.ascii_uppercase[:5]))
df_wide = pd.DataFrame(np.random.randn(50, 20), columns=list(string.ascii_uppercase[:20]))

labels = ["{0} - {1}".format(i, i + 9) for i in range(0, 100, 10)]
df_categorical = pd.DataFrame({'value': np.random.randint(0, 100, 20)})
df_categorical['group'] = pd.cut(df_categorical.value, range(0, 105, 10), right=False, labels=labels)
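
The binning used for `df_categorical` can be checked on a few hand-picked values. This is a minimal sketch that reuses the same `labels` and bin edges as the cell above; the sample values `[0, 9, 10, 99]` are chosen here purely for illustration. With `right=False` each bin is closed on the left and open on the right, so `10` lands in `10 - 19` rather than `0 - 9`.

```python
import pandas as pd

# Same labels and bin edges as in the cell above
labels = ["{0} - {1}".format(i, i + 9) for i in range(0, 100, 10)]
bins = list(range(0, 105, 10))  # 0, 10, ..., 100 -> ten bins

# Illustrative values only (not part of the original data)
values = pd.Series([0, 9, 10, 99])

# right=False makes the intervals [0, 10), [10, 20), ..., [90, 100)
groups = pd.cut(values, bins, right=False, labels=labels)

print(groups.astype(str).tolist())
# ['0 - 9', '0 - 9', '10 - 19', '90 - 99']
```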

In [7]:
dft = pd.DataFrame({'A': np.random.rand(5),
                    'B': [1, 1, 3, 2, 1],
                    'C': 'This is a very long sentence that should automatically be trimmed',
                    'D': [pd.Timestamp('20010101'), pd.Timestamp('20010102'), pd.Timestamp('20010103'), pd.Timestamp('20010104'), pd.Timestamp('20010105')],
                    'E': pd.Series([1.0] * 5).astype('float32'),
                    'F': [False, True, False, False, True],
                   })

dft.D = dft.D.apply(pd.to_datetime)
dft.set_index('D', inplace=True)

dft.index.name = None  # `del dft.index.name` is rejected by recent pandas versions

---

## Representation

In [8]:
df

<JupyterRequire.display.SafeScript object>

|   | A | B | C | D | E |
|---|---:|---:|---:|---:|---:|
| 0 | -0.977652 | 1.926043 | -0.266936 | -0.398156 | 0.499058 |
| 1 | 0.04218 | -0.583094 | -0.651699 | -0.301726 | -0.756926 |
| 2 | -0.471586 | -1.408047 | -0.85691 | 1.053319 | 1.293189 |
| 3 | 0.657062 | 0.663418 | 0.853544 | 1.163067 | 0.046137 |
| 4 | -0.553223 | 0.335225 | -0.704246 | 2.596127 | -0.593825 |
| 5 | 0.496776 | -0.314547 | -0.666387 | 0.489981 | 0.205503 |
| 6 | 0.618987 | -1.747047 | -0.52746 | 1.255368 | 0.217774 |
| 7 | 1.915062 | 1.533664 | 0.503438 | 0.647955 | 1.050213 |
| 8 | 1.539036 | -0.849044 | -0.289341 | 0.527538 | 1.537687 |
| 9 | -0.743186 | 1.589183 | -1.792678 | -0.224399 | -0.401293 |


In [11]:
enable()

In [None]:
df_long

Notice the automatic sampling: the table shows 5,902 rows sampled from the original 100,000 while still remaining representative of the data!

If you wish to override the automatically computed sample size, you can set it explicitly:

In [None]:
from jupyter_datatables.config import defaults

defaults.sample_size = 1000

In [None]:
df_long

To return to the automatically computed sample size, set `sample_size` back to `None`:

In [None]:
defaults.sample_size = None

Sampling can also be disabled completely (although this is not recommended). `defaults.limit` specifies the threshold above which a sample size is computed; setting it to `None` turns sampling off.

In [None]:
defaults.limit = None
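
The interplay between `defaults.limit` and `defaults.sample_size` described above can be modelled roughly as follows. This is only a sketch of the described behaviour, not the library's implementation: `maybe_sample`, its parameters, and the placeholder `compute_size` rule are hypothetical names introduced for illustration, and the formula the library actually uses to compute the sample size is not reproduced here.

```python
import pandas as pd


def maybe_sample(df, limit, sample_size=None, compute_size=None, seed=0):
    """Hypothetical model of the sampling rules described in the text."""
    # limit=None disables sampling; small tables are shown in full
    if limit is None or len(df) <= limit:
        return df

    if sample_size is not None:
        # an explicit sample size overrides the automatic one
        n = sample_size
    else:
        # placeholder rule (assumption): NOT the library's formula
        rule = compute_size or (lambda size: max(1, size // 10))
        n = rule(len(df))

    return df.sample(n=min(n, len(df)), random_state=seed)


frame = pd.DataFrame({"x": range(100)})

print(len(maybe_sample(frame, limit=None)))                 # 100, sampling disabled
print(len(maybe_sample(frame, limit=50, sample_size=10)))   # 10, fixed sample size
print(len(maybe_sample(frame, limit=50)))                   # 10, placeholder rule
print(len(maybe_sample(frame, limit=200, sample_size=10)))  # 100, below the limit
```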

Let's take a sample of 10,000 rows from the table; otherwise the computation would take a while and consume quite a lot of resources:

In [None]:
df_long.sample(10000)

Wide DataTables work as expected:

In [None]:
df_wide

## Support for Indices (including `Date` dtype)

Let's change the default plot for `num` from `Histogram` to `Line` and check our timeseries-like DataFrame:

In [None]:
%%requirejs

// `dTypePlotMap` was renamed to `chartMap` in v0.4.0
$.fn.dataTable.defaults.chartMap['num'].unshift('Line')

In [None]:
dft

---

## Customization

In [None]:
%load_ext jupyter_require

In [None]:
%%requirejs

let defaultElementConfig = $("<pre/>").html(JSON.stringify(Chart.defaults.global.elements, null, 4))

element.append(defaultElementConfig)

Check out the [ChartJS](https://www.chartjs.org/docs/latest/general/) docs for more information about the default settings.

---

## Custom Graph Objects

You can create custom graph objects by implementing a function with the following signature:

```ts
interface Index {
    data: Array<any>,
    dtype: string
}

type GraphObject = (data: Array<any>, index: Array<Index>, dtype: string) => Chart
```

Suppose we want to plot colours and we want a special kind of plot for that:

In [None]:
%%requirejs chartjs

let isValidColour = function(colour) {
    let s = new Option().style
    s.color = colour
    
    return s.color !== '' || console.debug(`Invalid CSS colour: '${colour}'.`)
}

let ColorPalette = function(data, index, dtype) {
    const canvas = document.createElement('canvas')
    const ctx    = canvas.getContext('2d')
    
    // perform check if the pattern is correct
    if ( !data.every( d => typeof(d) === 'string' && isValidColour(d) ) ) {
        console.debug("Data does not match colour pattern.")
        return
    }
    
    // evenly slice the Pie chart by number of colours
    const slices = new Array(data.length).fill(Number(1 / data.length).toFixed(2))
    const labels = index[0].data
    
    let chart = new Chart(ctx, {
        type: 'pie',
        data: {
            labels: labels,
            datasets: [{
                data: slices,
                backgroundColor: data,
            }]
        },
    })
    
    return chart
}

// Register the new chart
$.fn.dataTable.defaults.graphObjects['ColorPalette'] = ColorPalette

And set it as the default for the dtype you want to use it for (in this case `string`):

The default setting is:

```
{
    boolean:   ['CategoricalBar', 'Histogram'],
    date:      ['CategoricalBar', 'Histogram'],
    num:       ['Histogram', 'CategoricalBar', 'Bar', 'Line'],
    string:    ['CategoricalBar'],

    undefined: ['Bar']
}
```

The order specifies the fallback plots.
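
The fallback order can be pictured with a small Python model. This is a sketch under stated assumptions, not the plugin's JavaScript code: `resolve_chart`, `color_palette`, and `categorical_bar` are hypothetical stand-ins, and the colour check is a toy set lookup rather than real CSS validation. The idea it illustrates is that the charts listed for a dtype are tried in order and the first graph object that returns a chart wins.

```python
def resolve_chart(dtype, data, chart_map, graph_objects):
    """Return the name of the first graph object that yields a chart."""
    # unknown dtypes use the 'undefined' entry of the map
    candidates = chart_map.get(dtype, chart_map.get("undefined", []))
    for name in candidates:
        factory = graph_objects.get(name)
        if factory is None:
            continue  # chart name without a registered graph object
        chart = factory(data)
        if chart is not None:
            return name, chart
    return None, None


def color_palette(data):
    # toy stand-in for a colour check: accepts a few known names only
    known = {"red", "green", "blue"}
    if not all(d in known for d in data):
        return None  # signal "cannot plot this", so the next chart is tried
    return {"type": "pie", "colors": list(data)}


def categorical_bar(data):
    return {"type": "bar", "values": list(data)}


# 'ColorPalette' placed first, as after `unshift('ColorPalette')`
chart_map = {"string": ["ColorPalette", "CategoricalBar"], "undefined": ["Bar"]}
graph_objects = {"ColorPalette": color_palette, "CategoricalBar": categorical_bar}

valid_name, _ = resolve_chart("string", ["red", "blue"], chart_map, graph_objects)
fallback_name, _ = resolve_chart("string", ["red", "invalid"], chart_map, graph_objects)

print(valid_name)     # ColorPalette
print(fallback_name)  # CategoricalBar
```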

In [None]:
%%requirejs

$.fn.dataTable.defaults.chartMap['string'].unshift('ColorPalette')

In [None]:
df_colours = pd.DataFrame([
    {
        "colour": "red",
        "value" : "rgb(255, 99, 132)",
    },
    {
        "colour": "blue",
        "value" : "rgb(54, 162, 235)"
    },
    {
        "colour": "lightyellow",
        "value" : "rgba(255, 205, 86, 0.3)"  # alpha values via `rgba()`
    },
    {
        "colour": "darkorange",
        "value" : "darkorange"  # any valid CSS specifier
    }
])

df_colours.set_index("colour", inplace=True)

# As of v0.3.0, DataTables do not support index names properly
df_colours.index.name = None

df_colours

If any colour value fails our check, `ColorPalette` returns nothing and the next chart in the list (here the default `CategoricalBar`) is used instead:

In [None]:
df_other = pd.DataFrame([
    {
        "colour": "red",
        "value" : "red",
    },
    {
        "colour": "green",
        "value" : "invalid",
    },
    {
        "colour": "blue",
        "value" : "blue",
    },
    {
        "colour": "other",
        "value" : "invalid",
    }
])

df_other.set_index("colour", inplace=True)

df_other.index.name = None

df_other