# Run models

In Notebook 3 we learn how to train built-in ecg models and use them for prediction. We use the [fft_inception](https://github.com/analysiscenter/ecg/blob/unify_models/doc/fft_model.md) model as an example.

First, some necessary imports. Note that ```ModelEcgBatch```, which contains the models, is imported rather than plain ```EcgBatch```:

In [1]:
import sys
import os
import numpy as np
from sklearn.metrics import f1_score
sys.path.append(os.path.join("..", "..", ".."))

import ecg.dataset as ds
from ecg.batch import ModelEcgBatch

Using TensorFlow backend.


Then we create an ecg dataset (see [Notebook 1](https://github.com/analysiscenter/ecg/blob/unify_models/doc/ecg_tutorial_part_1.ipynb) for details):

In [2]:
index = ds.FilesIndex(path=".../data/ECG/*.hea", no_ext=True, sort=True)
eds = ds.Dataset(index, batch_class=ModelEcgBatch)

Now we want to divide the dataset into two parts that will be used for training and validation. The ```cv_split``` method does this:

In [3]:
eds.cv_split(0.8)

Now 80% of the dataset is in ```eds.train``` and the rest is in ```eds.test```.
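
To make the idea of an 80/20 split concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the helper `split_indices`, its `seed` parameter, and the choice to shuffle before splitting are assumptions made for illustration only.

```python
import random

def split_indices(n_items, train_share, seed=0):
    """Hypothetical helper: split item positions into train and test parts."""
    indices = list(range(n_items))
    # Shuffling is an assumption; the source does not say whether cv_split shuffles.
    random.Random(seed).shuffle(indices)
    n_train = int(round(n_items * train_share))
    return indices[:n_train], indices[n_train:]

# With 10 items and a share of 0.8, 8 go to train and 2 to test.
train_idx, test_idx = split_indices(10, 0.8)
```

Every item ends up in exactly one of the two parts, which is the property we rely on when we later train on one part and predict on the other.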

Let's define a preprocessing pipeline. This part is common to training and prediction:

In [4]:
preprocess_pipeline = (ds.Pipeline()
                       .load(fmt="wfdb", components=["signal", "meta"])
                       .load(src=".../data/ECG/REFERENCE.csv",
                             fmt="csv", components="target")
                       .drop_labels(["~"])
                       .replace_labels({"N": "NO", "O": "NO"})
                       .random_resample_signals("normal", loc=300, scale=10)
                       .drop_short_signals(4000)
                       .segment_signals(3000, 3000)
                       .binarize_labels()
                       .apply(np.transpose, [0, 2, 1])
                       .ravel())
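
The segmentation and transpose steps above can be pictured with a small NumPy sketch. This is an illustration under stated assumptions, not the library code: we assume that the two arguments of ```segment_signals``` are a segment length and a step, and that a signal is stored as an array of shape (channels, samples). The helper `segment_signal` is hypothetical.

```python
import numpy as np

def segment_signal(signal, length, step):
    """Hypothetical helper: cut (channels, samples) into (segments, channels, length)."""
    n_samples = signal.shape[1]
    starts = range(0, n_samples - length + 1, step)
    return np.stack([signal[:, start:start + length] for start in starts])

# A made-up single-channel signal with 9000 samples.
signal = np.zeros((1, 9000))

# Length 3000 with step 3000 gives non-overlapping segments.
segments = segment_signal(signal, 3000, 3000)

# Same axis order as in the pipeline: swap the last two axes.
channels_last = np.transpose(segments, [0, 2, 1])
```

Here `segments` has shape (3, 1, 3000) and `channels_last` has shape (3, 3000, 1), so after the transpose the channel axis comes last.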

## Train pipeline

The train pipeline is the preprocess pipeline plus a ```train_on_batch``` action. We use pipeline algebra to merge the two pipelines:

In [5]:
fft_train_pipeline = (preprocess_pipeline +
                      ds.Pipeline().train_on_batch('fft_inception', metrics=f1_score, average='macro'))
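
As a rough mental model of what `+` does here, consider the toy class below. It is a hypothetical stand-in written for this explanation, not ```ds.Pipeline```: a pipeline is treated as an ordered list of actions, and adding two pipelines concatenates their lists without modifying either operand.

```python
class ToyPipeline:
    """Hypothetical stand-in for illustration; not the real ds.Pipeline."""

    def __init__(self, actions=None):
        self.actions = list(actions or [])

    def add(self, func):
        # Return a new pipeline with one more action at the end.
        return ToyPipeline(self.actions + [func])

    def __add__(self, other):
        # "Pipeline algebra": the sum runs the left actions, then the right ones.
        return ToyPipeline(self.actions + other.actions)

    def run(self, value):
        for action in self.actions:
            value = action(value)
        return value

preprocess = ToyPipeline().add(lambda x: x * 2)
train = preprocess + ToyPipeline().add(lambda x: x + 1)
result = train.run(5)  # (5 * 2) + 1
```

Because `preprocess` is left unchanged by the addition, the same preprocessing part can be reused in a second pipeline, which is how we use it for prediction later on.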

Then we only have to pass the dataset to the pipeline and start the calculation:

In [6]:
fft_trained = (eds.train >> fft_train_pipeline).run(batch_size=300, shuffle=True, drop_last=True, n_epochs=1, prefetch=0)

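
The train pipeline passes ```f1_score``` with ```average='macro'``` as the metric. To show what macro averaging means, here is a hand-written sketch in plain Python. It is for illustration only (the notebook itself uses the scikit-learn function), and the helper `macro_f1` and the label values are made up.

```python
def macro_f1(y_true, y_pred):
    """Illustrative macro F1: unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*tp / (2*tp + fp + fn); fall back to 0.0 if the denominator is zero.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Made-up labels: class 0 has F1 = 2/3, class 1 has F1 = 0.8.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
score = macro_f1(y_true, y_pred)
```

Each class contributes equally to the result regardless of how many samples it has, so a rare class that is predicted poorly lowers the score noticeably.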


As a result we obtain the pipeline ```fft_trained```, which contains the trained model. Let's make a prediction!

## Predict pipeline

The predict pipeline is the preprocess pipeline plus an ```import_model``` action plus a ```predict_on_batch``` action. A model can be imported from a dump file or from a pipeline with a trained model. We show the second option since we already have the ```fft_trained``` pipeline:

In [7]:
fft_predict_pipeline = ((ds.Pipeline()
                         .import_model('fft_inception', fft_trained)
                         .init_variable("prediction", [])) +
                        preprocess_pipeline +
                        ds.Pipeline().predict_on_batch('fft_inception'))

Note that we also add the ```init_variable``` action. It defines an empty list ```prediction``` that will store the output of the model.

To start the calculation we pass the ecg dataset into the pipeline and call the ```run``` action:

In [8]:
predicted = (eds.test >> fft_predict_pipeline).run(batch_size=300, shuffle=False, drop_last=False, n_epochs=1)

To see the output we read the pipeline variable ```prediction```:

In [9]:
print(predicted.get_variable('prediction'))

[array([[ 0.03159175,  0.96840829],
       [ 0.05374619,  0.94625378],
       [ 0.05049713,  0.94950283],
       ..., 
       [ 0.0613869 ,  0.93861312],
       [ 0.06168694,  0.93831307],
       [ 0.06264812,  0.93735182]], dtype=float32), array([[ 0.02915464,  0.9708454 ],
       [ 0.02819072,  0.97180927],
       [ 0.02667561,  0.97332442],
       ..., 
       [ 0.02050101,  0.97949904],
       [ 0.04027931,  0.95972073],
       [ 0.03398024,  0.96601969]], dtype=float32), array([[ 0.00240983,  0.99759018],
       [ 0.02517843,  0.97482163],
       [ 0.04068676,  0.95931321],
       ..., 
       [ 0.0757554 ,  0.92424452],
       [ 0.08056658,  0.91943341],
       [ 0.07819081,  0.92180914]], dtype=float32), array([[ 0.01943815,  0.98056191],
       [ 0.06241378,  0.93758619],
       [ 0.062431  ,  0.93756902],
       ..., 
       [ 0.04630136,  0.95369864],
       [ 0.0429494 ,  0.95705062],
       [ 0.04455574,  0.95544428]], dtype=float32), array([[ 0.06281535,  0.93718469],
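
The output above is a list with one two-column array per batch. A possible next step, sketched here with NumPy, is to stack the batches and take the column with the larger value in each row. The numbers below are made up for illustration, and which label corresponds to which column is not stated in this notebook, so we only report column indices.

```python
import numpy as np

# Made-up stand-in for predicted.get_variable('prediction'):
# a list of per-batch arrays with two columns each.
batch_outputs = [
    np.array([[0.03, 0.97], [0.05, 0.95]], dtype=np.float32),
    np.array([[0.60, 0.40]], dtype=np.float32),
]

# Stack all batches into one (n_rows, 2) array.
probabilities = np.concatenate(batch_outputs, axis=0)

# Index of the larger column in each row.
predicted_columns = probabilities.argmax(axis=1)
```

With these illustrative values `probabilities` has shape (3, 2) and `predicted_columns` holds one column index per row.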
    

This is the end of Notebook 3. See previous topics in [Notebook 1](https://github.com/analysiscenter/ecg/blob/unify_models/doc/ecg_tutorial_part_1.ipynb) and [Notebook 2](https://github.com/analysiscenter/ecg/blob/unify_models/doc/ecg_tutorial_part_2.ipynb). See more on ecg models [here](https://github.com/analysiscenter/ecg/blob/unify_models/doc/models.md).