# Khiva's time series resizing
This interactive notebook illustrates Khiva's time series resizing methods. It shows how they can be used to plot a time series with fewer points while keeping a very similar visual appearance.

This exercise is focused on:
    
1. Time series resizing.
2. Time series visualisation using fewer points while keeping a very similar visual appearance.

The algorithms used in this notebook come from Khiva's [dimensionality](http://khiva-python.readthedocs.io/en/latest/khiva.html#module-khiva.dimensionality) module. This notebook uses three of them:

1. [Visvalingam](http://khiva-python.readthedocs.io/en/latest/khiva.html#khiva.dimensionality.visvalingam)
2. [PIP](http://khiva-python.readthedocs.io/en/latest/khiva.html#khiva.dimensionality.pip)
3. [PAA](http://khiva-python.readthedocs.io/en/latest/khiva.html#khiva.dimensionality.paa)


In [68]:
from __future__ import print_function  # kept first so the cell also runs as a plain script

import time
import warnings

# Warning is the base class of DeprecationWarning, so one filter silences both.
warnings.filterwarnings("ignore", category=Warning)

import matplotlib.pyplot as plt
import pandas as pd
from ipywidgets import IntSlider, SelectionSlider, interact

from khiva.array import *
from khiva.dimensionality import *
from khiva.library import *

%config IPCompleter.greedy=True
%matplotlib inline
plt.rcParams['figure.figsize'] = [16, 5]

## Backend 
The next cell prints the backend in use. Khiva provides CPU, CUDA and OPENCL backends.

This interactive application runs on **hub.mybinder**, which does not provide a GPU and offers only limited CPU resources, so resizing will take some time.

In [69]:
print(get_backend())

KHIVABackend.KHIVA_BACKEND_CPU


## Data load
The next cell loads a file containing electricity consumption data. The **value** column holds the electrical consumption of the given site. This notebook uses only the first 3000 points because of the limited CPU capabilities of **hub.mybinder**.

In [70]:
df = pd.read_csv('../../energy/data/data-enerNoc/all-data/csv/6.csv')
number_of_points = 3000
# Series.as_matrix() was removed in pandas 1.0; to_numpy() is its replacement.
values = df['value'].to_numpy()[0:number_of_points]
a = Array([list(range(number_of_points)), values.tolist()])
df.head(5)

| Unnamed: 0 | timestamp | dttm_utc | value | estimated | anomaly |
|---|---|---|---|---|---|
| 0 | 1325376600 | 2012-01-01 00:10:00 | 52.1147 | 0 | |
| 1 | 1325376900 | 2012-01-01 00:15:00 | 50.9517 | 0 | |
| 2 | 1325377200 | 2012-01-01 00:20:00 | 49.8164 | 0 | |
| 3 | 1325377500 | 2012-01-01 00:25:00 | 49.1795 | 0 | |
| 4 | 1325377800 | 2012-01-01 00:30:00 | 47.6288 | 0 | |


## Visvalingam
See the algorithm [documentation](http://khiva-python.readthedocs.io/en/latest/khiva.html#khiva.dimensionality.visvalingam) for details on how it works.
Use the slider to select the target number of points.
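
To give an intuition for what the slider controls, here is a minimal pure-Python sketch of the general Visvalingam idea: repeatedly drop the interior point whose triangle with its two neighbours has the smallest area. This is not Khiva's implementation; the names `triangle_area` and `visvalingam_sketch` and the sample data are hypothetical and chosen only for illustration.

```python
def triangle_area(p, q, r):
    """Area of the triangle formed by three (x, y) points."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def visvalingam_sketch(points, target):
    """Drop the least significant interior point until `target` points remain.

    Illustrative only (quadratic time); not the Khiva implementation.
    """
    pts = list(points)
    target = max(target, 2)  # the two endpoints are always kept
    while len(pts) > target:
        # Area associated with every interior point and its two neighbours.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        # Remove the point with the smallest area (+1 because areas skips index 0).
        del pts[areas.index(min(areas)) + 1]
    return pts


# Hypothetical toy series with a single sharp peak at x = 2.
series = [(0, 0.0), (1, 0.1), (2, 5.0), (3, 0.2), (4, 0.0), (5, 0.1), (6, 0.0)]
reduced = visvalingam_sketch(series, 4)
print(reduced)  # [(0, 0.0), (2, 5.0), (3, 0.2), (6, 0.0)]
```

The peak survives because removing it would change the shape the most, which is why the reduced plot tends to look similar to the original.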

In [71]:
def use_visvalingam(points):
    global a
    global values
       
    if(points == number_of_points): 
        b = a.to_numpy()
        plt.plot(b[0], b[1])
        plt.title("No resize applied.")
    else:
        start = time.time()
        b = visvalingam(a, int(points)).to_numpy()
        print("Time taken: " + str(time.time() - start) + " seconds.")
        plt.plot(b[0], b[1])
        plt.title("Visvalingam applied. Converting " + str(number_of_points) + " points to " + str(points) + " points.")
    plt.show()
    
   
interact(use_visvalingam,points=IntSlider(min=100, max=number_of_points, step=1, continuous_update=False, value = number_of_points));


## PIP
Use the slider to select the desired number of points, and see the algorithm [documentation](http://khiva-python.readthedocs.io/en/latest/khiva.html#khiva.dimensionality.pip) to understand how to use it correctly.
> Note: This algorithm is slower than Visvalingam, but it preserves the peaks of the time series better. It gets slower as the target number of points increases.
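
The note above can be made concrete with a small pure-Python sketch of the Perceptually Important Points idea: start from the two endpoints and repeatedly add the point that lies farthest from the line joining its already selected neighbours. This is not Khiva's code. The function name `pip_sketch` is hypothetical, and the use of vertical distance (rather than another distance measure) is an assumption made for simplicity.

```python
def pip_sketch(points, target):
    """Select `target` perceptually important points from a list of (x, y) pairs.

    Illustrative only; assumes at least two points and strictly increasing x.
    """
    n = len(points)
    selected = [0, n - 1]  # indices of the chosen points, kept sorted
    while len(selected) < min(target, n):
        best_index, best_distance = None, -1.0
        # Scan every gap between two consecutive selected points.
        for left, right in zip(selected, selected[1:]):
            (x1, y1), (x2, y2) = points[left], points[right]
            for i in range(left + 1, right):
                x, y = points[i]
                # Vertical distance to the straight line joining the two neighbours.
                y_on_line = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
                distance = abs(y - y_on_line)
                if distance > best_distance:
                    best_index, best_distance = i, distance
        selected.append(best_index)
        selected.sort()
    return [points[i] for i in selected]


# Hypothetical toy series with a single sharp peak at x = 2.
series = [(0, 0.0), (1, 0.1), (2, 5.0), (3, 0.2), (4, 0.0), (5, 0.1), (6, 0.0)]
print(pip_sketch(series, 3))  # [(0, 0.0), (2, 5.0), (6, 0.0)]
```

Each added point requires rescanning the unselected points, which is consistent with the running time growing as more target points are requested.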

In [72]:
def use_pip(pips):
    global a
    global values
       
    if(pips == number_of_points):
        plt.plot(range(number_of_points), values)
        plt.title("No resize applied.")
    else:
        start = time.time()
        b = pip(a, int(pips)).to_numpy()
        print("Time taken: " + str(time.time() - start) + " seconds.")
        plt.plot(b[0], b[1])
        plt.title("PIP applied. Converting " + str(number_of_points) + " points to " + str(pips) + " points.")
    plt.show()
    
   
interact(use_pip,pips=IntSlider(min=100, max=number_of_points, step=1, continuous_update=False, value = 100));


## PAA
The PAA algorithm can only reduce a time series to a number of points that is a factor (divisor) of the original series length. For this reason, the slider offers only the factors of the time series length.
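
As a quick illustration of which slider values this constraint allows, the sketch below lists the divisors of 3000, the series length used in this notebook. The helper name `divisors_of` is hypothetical; it tests candidates only up to the square root of `n`, which is one possible way to compute the list.

```python
def divisors_of(n):
    """Return the sorted divisors of n, testing candidates only up to sqrt(n)."""
    small, large = [], []
    i = 1
    while i * i <= n:
        if n % i == 0:
            small.append(i)
            if i != n // i:  # avoid listing a square root twice
                large.append(n // i)
        i += 1
    return small + large[::-1]


valid_bins = divisors_of(3000)
print(len(valid_bins))  # 32
print(valid_bins[:8])   # [1, 2, 3, 4, 5, 6, 8, 10]
```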

For more information about this algorithm, see its [documentation](http://khiva-python.readthedocs.io/en/latest/khiva.html#khiva.dimensionality.paa).
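
The core idea of Piecewise Aggregate Approximation can be sketched in a few lines of plain Python: split the series into equally sized consecutive segments and replace each segment by its mean. This is a simplified one-dimensional illustration, not Khiva's implementation; the function name `paa_sketch` is hypothetical.

```python
def paa_sketch(values, bins):
    """Average consecutive, equally sized segments of `values` into `bins` points."""
    if len(values) % bins != 0:
        raise ValueError("bins must divide the series length exactly")
    size = len(values) // bins
    return [sum(values[i:i + size]) / size for i in range(0, len(values), size)]


print(paa_sketch([1.0, 3.0, 2.0, 4.0, 10.0, 20.0], 3))  # [2.0, 3.0, 15.0]
```

Because every segment must have the same length, the number of bins has to divide the series length, which is why the slider is restricted to those values.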

In [None]:

def get_factors(x):
    factors = []
    for i in range(1, x + 1):
        if x % i == 0:
            factors.append(i)
    return factors

divisors = get_factors(number_of_points)

def use_paa(bins):
    global a
    global values
    
    if(bins == number_of_points): 
        plt.plot(range(number_of_points), values)
        plt.title("No resize applied.")
    else:
        start = time.time()
        b = paa(a, int(bins)).to_numpy()
        print("Time taken: " + str(time.time() - start) + " seconds.")
        plt.plot(b[0], b[1])
        plt.title("PAA applied. Converting " + str(number_of_points) + " points to " + str(bins) + " points.")

    plt.show()
    
interact(use_paa, bins=SelectionSlider(options=divisors, continuous_update=False, value=divisors[-1]));
