# How to optimize a forecaster

## Introduction
`forecaster.optimize` traverses the existing optimization methods (onnxruntime, openvino, jit, …) and saves the model with the lowest latency, under the given data and search restrictions (`accelerator`, `precision`, `accuracy_criterion`), in `forecaster.accelerated_model`. It must be called before `predict` and `evaluate`. It currently supports only non-distributed models.

## Set up
Before we begin, install Chronos if it isn't already available. This guide uses PyTorch as the deep learning backend.

```bash
pip install --pre --upgrade bigdl-chronos[pytorch,inference]
```

## Forecaster preparation

Before inference, a forecaster must be created and trained. Training is covered in detail in the previous guide, [Train forecaster on single node](https://bigdl.readthedocs.io/en/latest/doc/Chronos/Howto/how_to_train_forecaster_on_one_node.html), so here we directly create and train a `TCNForecaster` on the nyc taxi dataset.

```python
# data preparation
def get_data():
    from bigdl.chronos.data import get_public_dataset
    from sklearn.preprocessing import StandardScaler

    # load the nyc taxi dataset
    tsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')

    # fit the scaler on the training split only, then reuse it for the other splits
    stand = StandardScaler()
    for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
        tsdata.impute()\
              .scale(stand, fit=tsdata is tsdata_train)

    # convert `tsdata_train` and `tsdata_test` to pytorch dataloader
    train_data = tsdata_train.to_torch_data_loader(roll=True, lookback=48, horizon=1)
    test_data = tsdata_test.to_torch_data_loader(roll=True, lookback=48, horizon=1)

    return train_data, test_data

# trained forecaster preparation
def get_trained_forecaster(train_data):
    from bigdl.chronos.forecaster.tcn_forecaster import TCNForecaster
    # create a TCNForecaster
    forecaster = TCNForecaster(past_seq_len=48,
                               future_seq_len=1,
                               input_feature_num=1,
                               output_feature_num=1)

    # train the forecaster on the training data
    forecaster.fit(train_data)
    return forecaster
```

There are also `batch_size` and `quantize` parameters you may want to change. If you are not familiar with manual hyperparameter tuning, leave `batch_size` at its default value.

```python
# get data for training and testing
train_data, test_data = get_data()
# get a trained forecaster
forecaster = get_trained_forecaster(train_data)
```

Now call `optimize` on the trained forecaster. It traverses the existing optimization methods (onnxruntime, openvino, jit, …) and saves the model with the lowest latency, under the given data and search restrictions (`accelerator`, `precision`, `accuracy_criterion`), in `forecaster.accelerated_model`.
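The search restrictions named above are most naturally passed as keyword arguments. The helper below is a minimal sketch of that idea, not part of Chronos: `optimize_with_restrictions` is a hypothetical name, and it assumes that `optimize` accepts `accelerator`, `precision` and `accuracy_criterion` as keywords alongside `thread_num`. The accepted values are also an assumption, so check the API reference for your installed version.

```python
def optimize_with_restrictions(forecaster, train_data, validation_data,
                               thread_num=1, accelerator=None,
                               precision=None, accuracy_criterion=None):
    """Hypothetical wrapper: forward only the restrictions that were set.

    Assumes `forecaster.optimize` takes these keyword names; they come from
    the restriction names in this guide and are not a verified signature.
    """
    restrictions = {
        "accelerator": accelerator,
        "precision": precision,
        "accuracy_criterion": accuracy_criterion,
    }
    # Drop unset restrictions so the forecaster's own defaults apply.
    active = {name: value for name, value in restrictions.items()
              if value is not None}
    forecaster.optimize(train_data, validation_data,
                        thread_num=thread_num, **active)
    return active
```

With the objects prepared earlier, a call might look like `optimize_with_restrictions(forecaster, train_data, test_data, accelerator="onnxruntime", precision="fp32")`. The two string values are guesses based on the method names in the results table, not confirmed argument values.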

```python
forecaster.optimize(train_data, test_data, thread_num=1)
```

```
==========================Optimization Results==========================
 -------------------------------- ---------------------- -------------- ----------------------
|             method             |        status        | latency(ms)  |       accuracy       |
 -------------------------------- ---------------------- -------------- ----------------------
|            original            |      successful      |    0.798     |        0.023         |
|              bf16              |   fail to forward    |     None     |         None         |
|          static_int8           |      successful      |    1.416     |        0.022         |
|         jit_fp32_ipex          |    early stopped     |    26.346    |         None         |
|  jit_fp32_ipex_channels_last   |   fail to forward    |     None     |         None         |
|         jit_bf16_ipex          |      successful      |    0.405     |        0.023         |
|  jit_bf16_ipex_channels_last   |   fail to forward    |     None     |         None         |
|         openvino_fp32          |      successful      |    0.191     |        0.023*        |
|         openvino_int8          |      successful      |    0.185     |         0.26         |
|        onnxruntime_fp32        |      successful      |    0.075     |        0.023*        |
|    onnxruntime_int8_qlinear    |      successful      |    0.097     |        0.023         |
 -------------------------------- ---------------------- -------------- ----------------------
* means we assume the precision of the traced model does not change, so we don't recompute accuracy to save time.
Optimization cost 22.5s in total.
===========================Stop Optimization===========================
```
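To make the selection rule concrete, here is a small sketch that replays it on the rows of the table above. It is not the Chronos implementation. `Result`, `pick_fastest` and `max_metric` are hypothetical names, and the sketch assumes the accuracy column is an error metric where lower is better.

```python
from typing import NamedTuple, Optional


class Result(NamedTuple):
    method: str
    status: str
    latency_ms: Optional[float]
    accuracy: Optional[float]


# Rows copied from the results table above (the "*" markers are dropped).
RESULTS = [
    Result("original", "successful", 0.798, 0.023),
    Result("bf16", "fail to forward", None, None),
    Result("static_int8", "successful", 1.416, 0.022),
    Result("jit_fp32_ipex", "early stopped", 26.346, None),
    Result("jit_fp32_ipex_channels_last", "fail to forward", None, None),
    Result("jit_bf16_ipex", "successful", 0.405, 0.023),
    Result("jit_bf16_ipex_channels_last", "fail to forward", None, None),
    Result("openvino_fp32", "successful", 0.191, 0.023),
    Result("openvino_int8", "successful", 0.185, 0.26),
    Result("onnxruntime_fp32", "successful", 0.075, 0.023),
    Result("onnxruntime_int8_qlinear", "successful", 0.097, 0.023),
]


def pick_fastest(results, max_metric=None):
    """Return the successful result with the lowest latency, or None.

    `max_metric` is a hypothetical upper bound on the accuracy column,
    assuming lower values are better; it is not Chronos'
    `accuracy_criterion` and may be defined differently there.
    """
    candidates = [r for r in results
                  if r.status == "successful" and r.latency_ms is not None]
    if max_metric is not None:
        candidates = [r for r in candidates
                      if r.accuracy is not None and r.accuracy <= max_metric]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.latency_ms)


best = pick_fastest(RESULTS)
print(best.method, best.latency_ms)  # onnxruntime_fp32 0.075
```

In this run the fastest successful method is `onnxruntime_fp32`, roughly ten times faster than the original model at the same reported accuracy. Methods that failed or stopped early never become candidates.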