# Intraday Volume Prediction: tuning hyperparameters and comparing models

Problem: predict trading volume in the next 10 minutes for Nasdaq E-mini futures.

Approach: use the Market Data Research Environment (MDRE) to access the relevant market data and run experiments.

In this notebook, we show how to create a model, tune hyperparameters, and choose among several models.

More details about data preparation appear in a Medium [post](https://medium.com/machine-learning-with-market-data/trading-volume-prediction-on-the-example-of-nasdaq-index-futures-6033de7ba716?source=collection_home---6------0-----------------------).

In [22]:
%%capture
!pip install -U onetick-ds-framework

In [17]:
import os
import yaml
import pandas as pd

from dsframework.utils import build_experiment
import dsframework
dsframework.__version__

'0.0.72'

# XGBoost

Load the experiment described in the config [file](./volume_prediction_config.yml) and tune hyperparameters using a grid search over the combinations of parameter values specified in the config for the selected model. In this case we use `XGBRegressor` with the following grid:

            init_params:
                learning_rate: [.01, .03, .1]
                n_estimators: [100, 500, 1000]
                max_depth: [2, 3, 4]
                min_child_weight: [2, 4]
                max_delta_step: [0]
                subsample: [0.9]
                nthread: [2]
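
To get a sense of the size of this search, the grid can be enumerated directly. The sketch below is plain Python and independent of the framework; the `param_grid` dictionary simply mirrors the values listed above.

```python
from itertools import product

# Mirrors the init_params grid from the config excerpt above.
param_grid = {
    "learning_rate": [0.01, 0.03, 0.1],
    "n_estimators": [100, 500, 1000],
    "max_depth": [2, 3, 4],
    "min_child_weight": [2, 4],
    "max_delta_step": [0],
    "subsample": [0.9],
    "nthread": [2],
}

# Each combination of values is one candidate evaluated by the grid search.
combinations = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]
n_candidates = len(combinations)
print(n_candidates)  # 54
```

Three parameters take three values each and one takes two, so the grid contains 3 × 3 × 3 × 2 = 54 candidates.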

In [2]:
%%time
config_path = './volume_prediction_config.yml'

# Use a context manager so the config file is closed after loading.
with open(config_path) as f:
    config = yaml.load(f, Loader=yaml.Loader)
config['training']['use'] = 'XGBRegressor'
exp = build_experiment(config)

exp.get_data()
exp.prepare_data()
exp.init_fit(local_mode=True)

`ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.
Log channel is reconnecting. Logs produced while the connection was down can be found on the head node of the cluster in `ray_client_server_[port].out`
The `start_trial` operation took 12.847 s, which may be a performance bottleneck.
The `start_trial` operation took 7.420 s, which may be a performance bottleneck.
The `start_trial` operation took 3.069 s, which may be a performance bottleneck.
The `start_trial` operation took 14.499 s, which may be a performance bottleneck.
The `start_trial` operation took 6.126 s, which may be a performance bottleneck.
The `start_trial` operation took 2.667 s, which may be a performance bottleneck.
The `start_trial` operation took 15.972 s, which may be a performance bottleneck.
The `start_trial` operation took 6.340 s, which may be a performance bottleneck.
The `start_trial` operation took 2.795 

CPU times: user 2.83 s, sys: 267 ms, total: 3.09 s
Wall time: 6min 23s


<dsframework.impl.models.regressor_models.XGBRegressor at 0x7f2924cee1c0>

The best hyperparameters are:

In [4]:
exp.current_model_params

{'learning_rate': 0.03,
 'n_estimators': 500,
 'max_depth': 3,
 'min_child_weight': 2,
 'max_delta_step': 0,
 'subsample': 0.9,
 'nthread': 2}

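This set of hyperparameters was selected based on `mean_test_score`, the model's average performance across the folds of walk-forward validation. The evaluation metric is MAE, and the scores in the results table are negative (apparently negated MAE), so values closer to zero are better. The relevant section of the config file: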

    search_cv:
        val_type: 'WalkForward' # 'Simple','Cross','WalkForward'
        folds: 5
        eval_metric: 'MAE'
        early_stopping: True
        #Search params:
        tuning_cv: True
        search_optimization: 'grid' # 'grid', 'random', 'bayesian', 'bohb', 'hyperopt'
        n_trials: 2 # is used for 'random', 'bayesian', 'bohb'
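
Walk-forward validation keeps every validation fold strictly later in time than the data used to train for it. The sketch below shows one common variant, an expanding training window over equally sized blocks. `walk_forward_splits` is a hypothetical helper written for illustration; the framework's `WalkForward` splitter may size or position its folds differently.

```python
def walk_forward_splits(n_samples, n_folds):
    """Yield (train_indices, val_indices) pairs with an expanding training window.

    Illustrative only: the data is cut into n_folds + 1 consecutive blocks,
    and fold k trains on the first k blocks and validates on block k + 1.
    """
    block = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * block
        # The last fold absorbs any leftover samples.
        val_end = n_samples if k == n_folds else train_end + block
        yield range(0, train_end), range(train_end, val_end)


splits = list(walk_forward_splits(n_samples=12, n_folds=5))
for train_idx, val_idx in splits:
    print(list(train_idx), "->", list(val_idx))
```

Each candidate from the grid is scored on every fold, and the per-fold scores are then averaged.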


In [13]:
params_comparison = pd.DataFrame(exp.cv_model.cv_results_)
params_comparison.sort_values(by='rank_test_score').reset_index().head()

Unnamed: 0,index,params,split0_test_score,split1_test_score,split2_test_score,split3_test_score,split4_test_score,mean_test_score,std_test_score,rank_test_score,time_total_s,training_iteration,param_learning_rate,param_n_estimators,param_max_depth,param_min_child_weight,param_max_delta_step,param_subsample,param_nthread
0,22,"{'learning_rate': 0.03, 'n_estimators': 500, 'max_depth': 3, 'min_child_weight': 2, 'max_delta_step': 0, 'subsample': 0.9, 'nthread': 2}",-1927.179122,-1731.081079,-2301.349205,-2020.802291,-2402.91701,-2076.665741,245.690164,1,6.084126,1,0.03,500,3,2,0,0.9,2
1,40,"{'learning_rate': 0.03, 'n_estimators': 1000, 'max_depth': 3, 'min_child_weight': 2, 'max_delta_step': 0, 'subsample': 0.9, 'nthread': 2}",-1927.179122,-1731.081079,-2301.349205,-2020.802291,-2402.91701,-2076.665741,245.690164,1,6.196923,1,0.03,1000,3,2,0,0.9,2
2,31,"{'learning_rate': 0.03, 'n_estimators': 500, 'max_depth': 3, 'min_child_weight': 4, 'max_delta_step': 0, 'subsample': 0.9, 'nthread': 2}",-1925.679314,-1734.779033,-2301.324955,-2020.315396,-2402.317327,-2076.883205,244.695429,3,6.381348,1,0.03,500,3,4,0,0.9,2
3,49,"{'learning_rate': 0.03, 'n_estimators': 1000, 'max_depth': 3, 'min_child_weight': 4, 'max_delta_step': 0, 'subsample': 0.9, 'nthread': 2}",-1925.679314,-1734.779033,-2301.324955,-2020.315396,-2402.317327,-2076.883205,244.695429,3,5.997833,1,0.03,1000,3,4,0,0.9,2
4,39,"{'learning_rate': 0.01, 'n_estimators': 1000, 'max_depth': 3, 'min_child_weight': 2, 'max_delta_step': 0, 'subsample': 0.9, 'nthread': 2}",-1922.950299,-1735.087423,-2301.679248,-2020.21987,-2407.894771,-2077.566322,246.507612,5,17.69006,1,0.01,1000,3,2,0,0.9,2
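
The `mean_test_score` and `std_test_score` columns can be reproduced from the per-fold scores. The sketch below uses the top-ranked row of the table; the standard deviation that matches the table is the population one (dividing by the number of folds).

```python
from statistics import fmean, pstdev

# split0..split4 test scores of the top-ranked row in the table above.
split_scores = [-1927.179122, -1731.081079, -2301.349205, -2020.802291, -2402.91701]

mean_test_score = fmean(split_scores)   # average over the five folds
std_test_score = pstdev(split_scores)   # population standard deviation

print(round(mean_test_score, 3), round(std_test_score, 3))
```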


The model with the best set of hyperparameters identified above can now be used to compute predictions on the test data defined in the config:

    splitters:
        -
            class: dsframework.impl.SplitXYTrainTest
            use: True
            columns_to_predict: ['VOLUME_fut']
            shuffle: False
            test_size: 0.1506
            val_size: 0.15
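
With `shuffle: False`, the split presumably preserves time order, which matters for a forecasting problem. The sketch below illustrates such a chronological split. It assumes that both fractions are taken from the full dataset and that the test set is the most recent block; the framework may compute the boundaries differently, and `chronological_split` is a hypothetical helper, not part of `dsframework`.

```python
def chronological_split(n_samples, test_size, val_size):
    """Return index ranges (train, val, test) in time order, without shuffling.

    Assumption: test_size and val_size are fractions of the full dataset.
    """
    n_test = round(n_samples * test_size)
    n_val = round(n_samples * val_size)
    n_train = n_samples - n_val - n_test
    train = range(0, n_train)
    val = range(n_train, n_train + n_val)
    test = range(n_train + n_val, n_samples)
    return train, val, test


# 10,000 rows is an arbitrary example size, not the size of the actual dataset.
train, val, test = chronological_split(10_000, test_size=0.1506, val_size=0.15)
print(len(train), len(val), len(test))
```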
            
The performance of the model can be examined on the test set.

In [12]:
predictions = exp.predict(x=exp.x_test)
metrics = exp.calc_metrics(y=exp.y_unprocessed.loc[exp.y_test.index],
                           prediction=predictions)
metrics

{'VOLUME_fut_R2': 0.7510247238605942,
 'VOLUME_fut_MAE': 2287.483682523331,
 'VOLUME_fut_RMSE': 3399.3560457247017,
 'VOLUME_fut_MAPE': 0.19789891907778864}
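
For reference, the four metrics can be written out using their standard textbook definitions. This is a plain-Python sketch, not the implementation behind `exp.calc_metrics`, which may differ in details such as how zero targets are handled in MAPE; `regression_metrics` is a hypothetical helper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Standard definitions of R2, MAE, RMSE, and MAPE (MAPE as a fraction)."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "MAPE": mape}


# Toy numbers chosen for easy checking by hand, not market data.
toy = regression_metrics([100, 200, 300, 400], [110, 190, 330, 360])
print(toy)
```

Under these definitions, a MAPE of about 0.198 means the predictions are off by roughly 19.8% of the actual volume on average.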

# DecisionTreeRegressor

Trying a different model with its own set of hyperparameters takes just a few lines:


In [19]:
import ray
ray.shutdown()

In [20]:
# %%capture
config_path = './volume_prediction_config.yml'

# Use a context manager so the config file is closed after loading.
with open(config_path) as f:
    config = yaml.load(f, Loader=yaml.Loader)
config['training']['use'] = 'DecisionTreeRegressor'
exp = build_experiment(config)

exp.get_data()
exp.prepare_data()
exp.init_fit(local_mode=True)

predictions = exp.predict(x=exp.x_test)
metrics = exp.calc_metrics(y=exp.y_unprocessed.loc[exp.y_test.index],
                           prediction=predictions)

2022-10-03 14:59:04,549	INFO worker.py:862 -- Using address ray://172.16.1.89:10001 set in the environment variable RAY_ADDRESS
2022-10-03 14:59:04,551	INFO client_builder.py:224 -- Passing the following kwargs to ray.init() on the server: local_mode
`ray.get_gpu_ids()` will always return the empty list when called from the driver. This is because Ray does not manage GPU allocations to the driver process.


In [21]:
params_comparison = pd.DataFrame(exp.cv_model.cv_results_)
params_comparison.sort_values(by='rank_test_score').reset_index()

Unnamed: 0,index,params,split0_test_score,split1_test_score,split2_test_score,split3_test_score,split4_test_score,mean_test_score,std_test_score,rank_test_score,time_total_s,training_iteration,param_max_depth
0,3,{'max_depth': 5},-2066.528301,-1886.882701,-2422.438167,-2089.33001,-2636.770868,-2220.390009,270.656148,1,0.064043,1,5
1,4,{'max_depth': 7},-2227.903955,-1856.63307,-2484.075173,-2076.426661,-2553.763625,-2239.760497,257.640358,2,0.075886,1,7
2,2,{'max_depth': 4},-2102.88873,-1866.945114,-2435.780523,-2160.145183,-2651.614936,-2243.474897,272.779301,3,0.057943,1,4
3,1,{'max_depth': 2},-2177.698384,-1894.29453,-2694.022462,-2396.275317,-2891.87628,-2410.833395,355.801792,4,0.050941,1,2
4,5,{'max_depth': 10},-2429.684288,-1990.989936,-2690.971599,-2360.483148,-2824.956956,-2459.417185,288.905814,5,0.105283,1,10
5,0,{'max_depth': 1},-2478.47255,-2106.533304,-2942.229808,-2491.228747,-3449.093549,-2693.511592,461.404052,6,0.037644,1,1
6,6,{'max_depth': 50},-2926.554592,-2522.378005,-3167.398689,-2687.239044,-3414.674969,-2943.64906,321.072242,7,0.179526,1,50


When choosing among models (e.g., when comparing a DNN and XGBoost), it can be informative to compare both the test-set performance and the walk-forward `mean_test_score` and `std_test_score` values.

In [22]:
predictions = exp.predict(x=exp.x_test)
metrics = exp.calc_metrics(y=exp.y_unprocessed.loc[exp.y_test.index],
                           prediction=predictions)
metrics

{'VOLUME_fut_R2': 0.7206484757661189,
 'VOLUME_fut_MAE': 2430.6749551020084,
 'VOLUME_fut_RMSE': 3600.759091668688,
 'VOLUME_fut_MAPE': 0.21395223528229643}
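
The two runs can be put side by side with a few lines of plain Python. The numbers below are copied (rounded) from the outputs above. The walk-forward values are the best `mean_test_score` of each model with the sign flipped, on the assumption that the scores are negated MAE. `relative_reduction` is a hypothetical helper.

```python
# Test-set metrics from the two runs above (rounded).
xgb = {"R2": 0.7510, "MAE": 2287.48, "RMSE": 3399.36, "MAPE": 0.1979}
tree = {"R2": 0.7206, "MAE": 2430.67, "RMSE": 3600.76, "MAPE": 0.2140}

# Best walk-forward mean_test_score per model, sign flipped (assumed negated MAE).
cv_mae = {"XGBRegressor": 2076.67, "DecisionTreeRegressor": 2220.39}


def relative_reduction(baseline, candidate):
    """Fraction by which `candidate` lowers an error metric relative to `baseline`."""
    return (baseline - candidate) / baseline


test_mae_gain = relative_reduction(tree["MAE"], xgb["MAE"])
cv_mae_gain = relative_reduction(
    cv_mae["DecisionTreeRegressor"], cv_mae["XGBRegressor"]
)
print(f"test MAE reduction: {test_mae_gain:.1%}")
print(f"walk-forward MAE reduction: {cv_mae_gain:.1%}")
```

On these numbers, the MAE of `XGBRegressor` is roughly 6% lower than that of `DecisionTreeRegressor` on both the test set and the walk-forward folds, so the two views agree on the ranking.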