# Quick Start

This walkthrough builds a model for intraday volume prediction: forecasting trading volume over the next 10 minutes. For more data science use cases, see the [Use Cases](./mini_use_cases.ipynb) notebook.

In [1]:
!pip install -U onetick-ds-framework
!pip install -U onetick.py

In [2]:
import os
import yaml

from dsframework.utils import build_experiment
import dsframework
import onetick.py as otp
print(dsframework.__version__)
print(otp.__version__)

from dsframework.utils import logger
import logging
logger.setLevel(logging.ERROR)

0.0.72
1.14.42


# Create experiment based on a config file
Configuring experiments is easy with a YAML config file. You can examine the file [here](./volume_prediction_config.yml).

In [3]:
config_path = './volume_prediction_config.yml'
with open(config_path) as f:  # close the file once the config is loaded
    config = yaml.load(f, Loader=yaml.Loader)
config['training']['search_cv'] = False  # turn off hyperparameter tuning for this example
exp = build_experiment(config)
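
Because the loaded config is a plain nested dictionary, any setting can be changed in code before `build_experiment` is called, as done above for `search_cv`. The sketch below shows one way to apply several such overrides without mutating the loaded config. `with_overrides` is a hypothetical helper, not part of `dsframework`, and the sample dictionary only mirrors the single key used above (its initial value is assumed).

```python
import copy


def with_overrides(config, overrides):
    """Return a copy of a nested config dict with dotted-path overrides applied.

    Hypothetical helper for illustration; not part of dsframework.
    """
    result = copy.deepcopy(config)  # leave the loaded config untouched
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})  # create missing sections
        node[leaf] = value
    return result


# Stand-in for the dictionary loaded from the YAML file (value assumed)
base_config = {"training": {"search_cv": True}}
quick_config = with_overrides(base_config, {"training.search_cv": False})
print(quick_config)
```

Working on a copy keeps the original settings available, which helps when comparing a quick run against a fully tuned one.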

# Get data
You can examine the data specified in the config file and analyze it using your favorite Python libraries. The data is returned in a pandas DataFrame.

In [4]:
exp.get_data()

Unnamed: 0,Time,hhmm_fut,VOLUME_fut
0,2021-04-01 09:40:00,09:40,31967
1,2021-04-01 09:50:00,09:50,13194
2,2021-04-01 10:00:00,10:00,9774
3,2021-04-01 10:10:00,10:10,26026
4,2021-04-01 10:20:00,10:20,10889
...,...,...,...
10011,2022-04-01 15:20:00,15:20,9723
10012,2022-04-01 15:30:00,15:30,9506
10013,2022-04-01 15:40:00,15:40,10394
10014,2022-04-01 15:50:00,15:50,11058


# Prepare Features
The `prepare_data` method adds features, applies preprocessing, and splits the data into train/validation/test sets as specified in the config file. After `prepare_data` is called, the following DataFrames are available on `exp`:

    x_train - features for training
    x_val - features for validation
    x_test - features for testing
    y_train - targets for training
    y_val - targets for validation 
    y_test - targets for testing
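
The actual split is controlled by the config file. For time series, a common choice is to split in time order without shuffling, so that validation and test rows always come after the training rows. The sketch below illustrates that idea only; the function name and the fractions are assumptions, not values taken from `dsframework` or from this config.

```python
def chronological_split(rows, val_fraction=0.15, test_fraction=0.15):
    """Split time-ordered rows into train/validation/test without shuffling.

    The fractions are illustrative defaults, not the framework's settings.
    """
    n = len(rows)
    n_test = int(n * test_fraction)
    n_val = int(n * val_fraction)
    n_train = n - n_val - n_test
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test


rows = list(range(20))  # stand-in for 20 time-ordered observations
train, val, test = chronological_split(rows)
print(len(train), len(val), len(test))
```

Keeping the order intact avoids training on observations that occur after the ones used for evaluation.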

In [8]:
exp.prepare_data()

exp.x_train # can also examine exp.y_train, etc

Unnamed: 0,VOLUME_fut_lag_1,VOLUME_fut_lag_2,VOLUME_fut_lag_3,VOLUME_fut_lag_39,VOLUME_fut_lag_40
820,0.386654,0.545518,0.548112,0.432484,0.418640
821,0.420214,0.386639,0.545519,0.460330,0.432452
822,0.418077,0.420198,0.386641,0.436759,0.460295
823,0.533263,0.418060,0.420199,0.451760,0.436726
824,0.452859,0.533240,0.418061,0.492920,0.451727
...,...,...,...,...,...
7354,0.421037,0.368229,0.362486,0.346420,0.344855
7355,0.383425,0.421020,0.368230,0.371892,0.346395
7356,0.369573,0.383411,0.421021,0.344477,0.371865
7357,0.362603,0.369560,0.383412,0.329824,0.344453
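
The column names above suggest that each feature is the target series shifted back by a fixed number of steps. A minimal sketch of building such lag features follows, using the first five raw `VOLUME_fut` values shown earlier. `make_lag_features` is a hypothetical helper, and the scaling that `prepare_data` applies is left out.

```python
def make_lag_features(values, lags):
    """Build one feature row per time step from earlier values in the series.

    Rows start at index max(lags) because earlier steps lack a full history.
    Hypothetical helper for illustration; not part of dsframework.
    """
    start = max(lags)
    rows = []
    for t in range(start, len(values)):
        row = {f"lag_{k}": values[t - k] for k in lags}
        row["index"] = t
        rows.append(row)
    return rows


# First five VOLUME_fut values from the raw data shown earlier
volume = [31967, 13194, 9774, 26026, 10889]
features = make_lag_features(volume, lags=[1, 2, 3])
for row in features:
    print(row)
```

Each value moves one lag column to the right on the next row, which matches the diagonal pattern visible in the `x_train` output.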


# Train model
The next step is to train the model based on the parameters in the config file. If ranges of values are specified for model parameters, hyperparameter optimization is performed using grid search (see [models and hyperparameters](./models_and_hyperparameters.ipynb)). In this example the search is skipped because `search_cv` was set to `False` above.

In [9]:
%%capture
exp.init_fit()

In [10]:
exp.current_model_params # parameters of the model

{'learning_rate': 0.01,
 'n_estimators': 100,
 'max_depth': 2,
 'min_child_weight': 2,
 'max_delta_step': 0,
 'subsample': 0.9,
 'nthread': 2}
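
Grid search, mentioned above, evaluates every combination of the candidate values and keeps the best one. The sketch below shows the idea with the standard library only. It is not the framework's implementation: the candidate ranges are made up, and the toy scoring function stands in for a cross-validated metric.

```python
from itertools import product


def expand_grid(param_grid):
    """Yield every combination of values from a dict of parameter lists."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(param_grid, score_fn):
    """Return the best-scoring parameter set (higher score is better)."""
    return max(expand_grid(param_grid), key=score_fn)


# Candidate ranges are made up for illustration; names match the output above
grid = {"learning_rate": [0.01, 0.1], "max_depth": [2, 3, 4]}
candidates = list(expand_grid(grid))
print(len(candidates))  # 2 * 3 = 6 combinations

# Toy score standing in for a cross-validated metric
best = grid_search(grid, lambda p: -abs(p["max_depth"] - 3) - p["learning_rate"])
print(best)
```

The number of combinations grows multiplicatively with each added parameter, which is why disabling the search keeps this quick start fast.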

# Predict targets
With a trained model, generate predictions for the test features.

In [11]:
predictions = exp.predict(x=exp.x_test)
predictions

Unnamed: 0,VOLUME_fut_PREDICTION
8513,32975.180390
8514,26944.110516
8515,20511.256732
8516,21225.624316
8517,19664.905713
...,...
10011,9138.217749
10012,9587.655103
10013,10531.510608
10014,10749.750977


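# Evaluate the predictions

The predictions above are in raw volume units, so they are compared against the unprocessed targets for the test rows rather than the preprocessed `y_test`.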

In [15]:
%%capture
metrics = exp.calc_metrics(y=exp.y_unprocessed.loc[exp.y_test.index],
                           prediction=predictions)

In [16]:
metrics

{'VOLUME_fut_R2': 0.6943093856701434,
 'VOLUME_fut_MAE': 2585.8595131799684,
 'VOLUME_fut_RMSE': 3766.6875414285173,
 'VOLUME_fut_MAPE': 0.256538897755401}
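
For reference, the four metrics can be written out by hand. The sketch below uses the standard textbook definitions, with MAPE expressed as a fraction, which appears consistent with the value reported above. It is not taken from `dsframework`. The sample inputs are the last four rows of the raw data and prediction outputs shown earlier, assuming they align by index.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R2, MAE, RMSE and MAPE for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "R2": 1 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        # MAPE as a fraction (0.25 means 25%); assumes no zero targets
        "MAPE": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Last four test rows: VOLUME_fut and VOLUME_fut_PREDICTION from the outputs above
actual = [9723, 9506, 10394, 11058]
predicted = [9138.217749, 9587.655103, 10531.510608, 10749.750977]
print(regression_metrics(actual, predicted))
```

RMSE penalizes large misses more heavily than MAE, so a gap between the two, as in the reported metrics, points to a few large errors.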


