# HMTSR: Hierarchical Multiresolution Time Series Representation

This notebook shows how to apply the representation method proposed in:

Igor Manojlović, Goran Švenda, Aleksandar Erdeljan, Milan Gavrić, Darko Čapko, *Hierarchical Multiresolution Representation of Streaming Time Series*, Big Data Research 26: 100256, 2021, DOI: [10.1016/j.bdr.2021.100256](https://doi.org/10.1016/j.bdr.2021.100256)


In [None]:
from culearn.features import *
from random import random

# First, we need to prepare the time resolutions.
resolutions = TimeTree(
    TimeResolution(minutes=5),
    TimeResolution(minutes=10),
    TimeResolution(minutes=15),
    TimeResolution(minutes=30),
)

# Then, we can create the representation model.
model = HMTSR(resolutions, MultiSeriesDictionary())
# By default, the model keeps the buffer and the disk in a dictionary
# and applies Piecewise Statistical Approximation to the time series values.
# Both can be changed via the `multiseries` and `approximation` parameters, respectively.

# The model is updated by processing time series tuples as a stream.
for i in range(1000):
    # Here we simply generate a random time series value.
    x = TimeSeriesTuple(Time.unix(i), random())

    # The process function will update the model.
    model.process(x)

# Finally, we can explore the results:
for resolution in resolutions:
    print("Time resolution:", resolution)
    print("\tTuple in buffer:\n\t", model.multiseries.buffer(resolution))
    print("\tLast tuple on disk:\n\t", model.multiseries.disc(resolution)[-1])
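
To make the buffer and disk output above easier to interpret, here is a minimal, library-independent sketch of the general idea: each resolution keeps one open segment of running statistics in a buffer and moves it to storage once a tuple from the next time window arrives. This is not culearn's implementation. `Segment`, `MultiresolutionSketch`, the choice of count, mean, minimum and maximum as statistics, and the use of plain integer seconds as timestamps are all assumptions made for illustration. For simplicity, every resolution is updated directly from the raw stream, which may differ from how HMTSR relates its levels.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Running statistics for one time window (hypothetical helper)."""
    start: int                      # window start, in seconds
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value):
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self):
        return self.total / self.count


class MultiresolutionSketch:
    """One open segment (buffer) and a list of closed segments (disc) per resolution."""

    def __init__(self, window_seconds):
        self.buffer = {w: None for w in window_seconds}
        self.disc = {w: [] for w in window_seconds}

    def process(self, timestamp, value):
        for w in self.buffer:
            start = timestamp - timestamp % w   # align to the window boundary
            segment = self.buffer[w]
            if segment is not None and segment.start != start:
                # The tuple belongs to a new window: close the buffered segment.
                self.disc[w].append(segment)
                segment = None
            if segment is None:
                segment = self.buffer[w] = Segment(start)
            segment.add(value)


# Assumed setup: 5-minute and 10-minute windows, one sample per second.
sketch = MultiresolutionSketch([300, 600])
for t in range(1000):
    sketch.process(t, float(t))

for w in sketch.buffer:
    print(w, "closed segments:", len(sketch.disc[w]), "open segment starts at:", sketch.buffer[w].start)
```

With 1000 one-second samples, the 5-minute level of this sketch ends up with three closed segments and a fourth still open in the buffer, while the 10-minute level has a single closed segment.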

In [None]:
from culearn.data import REFIT
from culearn.util import parallel

# We can also use predefined data sources.
source = REFIT('../data/REFIT')
# Check out other data sources in the Datasets notebook.

# We just need to prepare the dataset first.
dataset = source.dataset()
# Note that REFIT does not currently support automatic download,
# so this call raises an exception instructing you to download the data manually.
# Once the data is in place, re-execute this code to create the dataset.

# We can process the time series in parallel, but we need a new process function
# that builds a separate model for each time series.
def process(ts):
    m = HMTSR(resolutions, MultiSeriesDictionary())
    for x in ts.stream():
        m.process(x)
    return ts.ts_id, m

# Finally, we can call the new process function from multiple threads.
models = parallel(process, dataset.y)
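
The pattern used above, a function that builds one result per series and returns it together with the series identifier, can be sketched with the standard library alone. The example below is an analogy rather than a description of `culearn.util.parallel`, whose internals and return type are not shown in this notebook. The `summarize` function and the toy `series` dictionary are hypothetical stand-ins for `process` and `dataset.y`.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(item):
    """Stand-in for `process`: return the series id with a per-series result."""
    series_id, values = item
    return series_id, sum(values) / len(values)


# Hypothetical toy data: series id -> values.
series = {"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0]}

# Each series is handled independently, so the work maps cleanly onto a pool.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(pool.map(summarize, series.items()))

print(results)
```

Returning the identifier alongside the result keeps each result tied to its series regardless of how the work is scheduled, which is why `process` returns `ts.ts_id` together with the model.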