# An Introduction to Time Series Classification in Julia

## Preliminaries

In [1]:
#using MLJ
using MLJTime

## Load data

* link to the data set description
* plot a time series from X
* show the unique class values in y
* describe the learning problem

In [2]:
X, y = ts_dataset("Chinatown");
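Two of the items listed above, showing the unique class values and pulling out a single series, can be sketched in plain Julia. The data below is a made-up stand-in (a matrix with one series per row), not the real data set, and the names `X_demo` and `y_demo` are hypothetical; the `X` returned by `ts_dataset` is a table, so the row access would look different there.

```julia
# Stand-in data (made up for illustration): 6 series of length 24, one per row,
# and one class label per series.
X_demo = [sin(2pi * t / 24) + 0.1 * i for i in 1:6, t in 1:24]
y_demo = [1.0, 2.0, 2.0, 1.0, 2.0, 2.0]

# Unique class values, in order of first appearance.
classes = unique(y_demo)

# Number of series per class.
class_counts = Dict(c => count(v -> v == c, y_demo) for c in classes)

# A single series is one row of the matrix.
first_series = X_demo[1, :]

println("classes: ", classes)
println("class counts: ", class_counts)
println("length of first series: ", length(first_series))
```

With a plotting package such as Plots.jl loaded, `plot(first_series)` would draw the extracted series.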

## Split data into training and test set

* perhaps add a helper, similar to scikit-learn's train_test_split, that does the split in a single line of code

In [6]:
train, test = partition(eachindex(y), 0.7, shuffle=true, rng=1234); # 70:30 split

X_train, y_train = X[train], y[train]
X_test, y_test = X[test], y[test]

(Table with 109 rows, 24 columns:
Columns:
#   colname  type
────────────────────
1   1        Float64
2   2        Float64
3   3        Float64
4   4        Float64
5   5        Float64
6   6        Float64
7   7        Float64
8   8        Float64
9   9        Float64
10  10       Float64
11  11       Float64
12  12       Float64
13  13       Float64
14  14       Float64
15  15       Float64
16  16       Float64
17  17       Float64
18  18       Float64
19  19       Float64
20  20       Float64
21  21       Float64
22  22       Float64
23  23       Float64
24  24       Float64, CategoricalArrays.CategoricalValue{Float64,UInt32}[1.0, 2.0, 2.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 1.0  …  2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 2.0, 1.0, 2.0, 2.0])
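A one-line split helper in the spirit of scikit-learn could look roughly like the sketch below. The function `train_test_indices` is hypothetical and not part of MLJ or MLJTime; it only returns shuffled row indices, which would then be used to index `X` and `y` as in the cell above.

```julia
using Random

# Hypothetical helper (not part of MLJ or MLJTime): shuffle the row indices 1:n
# and split them into a training part and a test part.
function train_test_indices(n::Integer; train_fraction::Real=0.7,
                            rng::AbstractRNG=Random.default_rng())
    0 < train_fraction < 1 ||
        throw(ArgumentError("train_fraction must lie strictly between 0 and 1"))
    idx = shuffle(rng, 1:n)
    n_train = round(Int, train_fraction * n)
    return idx[1:n_train], idx[n_train+1:end]
end

# Usage on 10 made-up rows: 7 indices for training, 3 for testing.
train_idx, test_idx = train_test_indices(10; train_fraction=0.7,
                                         rng=MersenneTwister(1234))
println("train: ", train_idx)
println("test:  ", test_idx)
```

Passing an explicit `rng` keeps the split reproducible, which mirrors the `rng=1234` argument given to `partition`.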

## Build time series classification model

* describe that the interface is the same as in MLJ (hyper-parameters, machine, fit/predict, etc.)
* link to the time series forest paper (Deng et al., 2013, "A time series forest for classification and feature extraction")
* describe the algorithm in simple words
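In simple words, a time series forest summarises random intervals of each series by three numbers (mean, standard deviation and slope), fits a decision tree on these summary features, and combines many such trees by voting. The sketch below covers only the feature extraction step. All function names are hypothetical and this is not MLJTime's implementation; drawing about sqrt(m) intervals per tree for a series of length m is a common simplification and is assumed here.

```julia
using Random
using Statistics

# Least-squares slope of the values against their positions 1, 2, ..., n.
function slope(v::AbstractVector{<:Real})
    n = length(v)
    n < 2 && return 0.0
    t = collect(1.0:n)
    return cov(t, v) / var(t)
end

# Mean, standard deviation and slope of a series over the interval start:stop.
function interval_features(series::AbstractVector{<:Real}, start::Integer, stop::Integer)
    segment = view(series, start:stop)
    return (mean(segment), std(segment), slope(segment))
end

# Draw `n_intervals` random intervals of at least `min_length` points
# from a series of length `m`.
function random_intervals(rng::AbstractRNG, m::Integer, n_intervals::Integer;
                          min_length::Integer=3)
    map(1:n_intervals) do _
        start = rand(rng, 1:(m - min_length + 1))
        stop = rand(rng, (start + min_length - 1):m)
        (start, stop)
    end
end

# Usage on a made-up series of length 24.
rng = MersenneTwister(1234)
series = collect(1.0:24.0)
n_intervals = floor(Int, sqrt(length(series)))
intervals = random_intervals(rng, length(series), n_intervals)
features = [interval_features(series, a, b) for (a, b) in intervals]
println(features)
```

Each tree in the forest would see its own set of random intervals, so the feature vectors differ from tree to tree.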

In [27]:
model = TimeSeriesForestClassifier(n_trees=100)
mach = machine(model, X_train, y_train)

Machine{TimeSeriesForestClassifier} @ 5…60


## Fit model

In [8]:
fit!(mach)

┌ Info: Training Machine{TimeSeriesForestClassifier} @ 6…67.
└ @ MLJBase /Users/mloning/.julia/packages/MLJBase/8HOpr/src/machines.jl:187


Machine{TimeSeriesForestClassifier} @ 6…67


## Make prediction

In [9]:
y_pred = predict(mach, X_test);

## Evaluate predictive performance

In [28]:
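# FIXME: ad-hoc conversion of the probabilistic predictions to class labels.
# It reads the internal field prob_given_ref and assigns class 1 only when the
# probability stored under key 1 is exactly 1; every other instance becomes class 2.
y1 = map(x -> x.prob_given_ref[1] == 1 ? 1 : 2, y_pred)
MLJTime.L1(y1, y_test)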

98.1651376146789
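The conversion in the cell above is flagged as needing a fix. A more conventional rule picks the class with the highest predicted probability and then reports the share of correct labels. The sketch below shows this in plain Julia on made-up predictions, represented as dictionaries from label to probability; the helper names are hypothetical. Within MLJ, `predict_mode` and `accuracy` are the usual tools for this, though whether MLJTime re-exports them is an assumption to check.

```julia
using Statistics

# Label with the highest predicted probability.
most_likely(probs::Dict) = findmax(probs)[2]

# Share of predictions that match the true labels.
accuracy_share(y_hat, y_true) = mean(y_hat .== y_true)

# Made-up probabilistic predictions for three instances.
predictions = [
    Dict(1 => 0.90, 2 => 0.10),
    Dict(1 => 0.40, 2 => 0.60),
    Dict(1 => 0.55, 2 => 0.45),
]
y_true = [1, 2, 2]

labels = most_likely.(predictions)

# For comparison: the rule from the notebook cell, which assigns class 1
# only when its probability is exactly 1.
strict_labels = [p[1] == 1 ? 1 : 2 for p in predictions]

println("most likely labels: ", labels)
println("strict-rule labels: ", strict_labels)
println("accuracy of most likely labels: ", accuracy_share(labels, y_true))
```

The two rules disagree whenever the probability for class 1 is high but below 1, which is the case the strict rule never labels as class 1.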

## Tuning

In [16]:
using MLJBase: L1, CV, range, cross_entropy
using MLJTuning

In [24]:
tsf = TimeSeriesForestClassifier()
r = range(tsf, :n_trees, lower=100, upper=500, scale=:log)
cv = CV(nfolds=10, shuffle=true)
tuned_model = TunedModel(model=tsf, ranges=[r], measure=cross_entropy, resampling=cv)
mach = machine(tuned_model, X_train, y_train)

Machine{ProbabilisticTunedModel{Grid,…}} @ 6…74

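The resampling strategy `CV(nfolds=10, shuffle=true)` evaluates each candidate model on ten train/test splits of the training data. The sketch below illustrates the idea in plain Julia; the functions `cv_folds` and `train_test_pairs` are hypothetical and do not reproduce how MLJBase assigns rows to folds.

```julia
using Random

# Shuffle the indices 1:n and deal them into `nfolds` folds of nearly equal size.
function cv_folds(n::Integer, nfolds::Integer; rng::AbstractRNG=MersenneTwister(0))
    2 <= nfolds <= n || throw(ArgumentError("need 2 <= nfolds <= n"))
    idx = shuffle(rng, 1:n)
    return [idx[k:nfolds:end] for k in 1:nfolds]
end

# Each fold serves as the test set once; the remaining folds form the training set.
function train_test_pairs(folds)
    return [(reduce(vcat, folds[setdiff(1:length(folds), k)]), folds[k])
            for k in 1:length(folds)]
end

# Usage on 25 made-up rows with 10 folds.
folds = cv_folds(25, 10; rng=MersenneTwister(1234))
pairs = train_test_pairs(folds)
println("fold sizes: ", length.(folds))
```

The measure passed to `TunedModel` is then averaged over the ten test folds for every candidate value of the hyper-parameter.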

In [25]:
fit!(mach, force=true)

┌ Info: Training Machine{ProbabilisticTunedModel{Grid,…}} @ 6…74.
└ @ MLJBase /Users/mloning/.julia/packages/MLJBase/8HOpr/src/machines.jl:187
┌ Info: Attempting to evaluate 10 models.
└ @ MLJTuning /Users/mloning/.julia/packages/MLJTuning/JZ7ZX/src/tuned_models.jl:501


Machine{ProbabilisticTunedModel{Grid,…}} @ 6…74

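The log message reports 10 candidate models, which fits a grid of 10 values for n_trees between 100 and 500 spaced evenly on a log scale. The sketch below shows roughly how such a grid can be built in plain Julia. The helper `log_grid` is hypothetical, and the exact values that MLJTuning generates may differ.

```julia
# Hypothetical helper: `n` integer grid points between `lower` and `upper`,
# evenly spaced on a log scale, with duplicates after rounding removed.
function log_grid(lower::Integer, upper::Integer, n::Integer)
    points = exp.(range(log(lower), stop=log(upper), length=n))
    return unique(round.(Int, points))
end

grid = log_grid(100, 500, 10)
println(grid)
```

On a log scale the candidate values lie closer together near the lower bound than near the upper bound.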

In [26]:
using MLJBase: fitted_params  # not covered by the selective import in the tuning cell
fitted_params(mach).best_model

In [132]:
y_pred = predict(mach, X_test);

In [133]:
# same ad-hoc conversion to class labels as in the earlier evaluation cell
y1 = map(x -> x.prob_given_ref[1] == 1 ? 1 : 2, y_pred)
MLJTime.L1(y1, y_test)

88.07339449541284