<a href="https://colab.research.google.com/github/WideSu/Python-for-DS/blob/main/Compare_HyperParam_Tuning_Methods.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

TO-DO
- [x] Measure the average run time and RMSE for each epoch using scikit-learn random search
- [ ] Test TPE hyperparameter tuning with HyperOpt, Ray, and Optuna
- [ ] Plot the RMSE over time
- [ ] Try the different samplers in Optuna: Random, TPE, CMA-ES, NSGA-II

Expected outcomes:
- A table of the average RMSE and execution time for each hyperparameter tuning method

|HPO Package                                  |Avg RMSE                        |Avg Time Elapsed                                             |
|---------------------------------------------|--------------------------------|-------------------------------------------------------------|
|Scikit-learn                                 |                                |                                                             |
|HyperOpt                                     |                                |                                                             |
|Ray                                          |                                |                                                             |
|Optuna                                       |                                |                                                             |

- One time series plot

<img src="https://user-images.githubusercontent.com/44923423/171923215-292e776a-79aa-4a08-8e81-a2ef627bd42a.png" data-canonical-src="https://user-images.githubusercontent.com/44923423/171923215-292e776a-79aa-4a08-8e81-a2ef627bd42a.png" width="500" height="300" />
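
The summary table above is still empty. One way it could be filled in is to average the per-window RMSE and run time for each library. The sketch below uses only the standard library; the `records` layout, the helper names `summarize` and `to_markdown`, and the numbers are all hypothetical placeholders, not results from this notebook.

```python
from statistics import mean

# Hypothetical per-window records; the numbers are placeholders, not real results.
records = {
    "Optuna": [
        {"rmse": 0.10, "seconds": 30.0},
        {"rmse": 0.20, "seconds": 50.0},
    ],
}


def summarize(records):
    """Return {library: (avg_rmse, avg_seconds)} from per-window records."""
    summary = {}
    for library, rows in records.items():
        summary[library] = (
            mean(row["rmse"] for row in rows),
            mean(row["seconds"] for row in rows),
        )
    return summary


def to_markdown(summary):
    """Render the summary as a markdown table with one row per library."""
    lines = ["|HPO Package|Avg RMSE|Avg Time Elapsed (s)|", "|---|---|---|"]
    for library, (avg_rmse, avg_seconds) in summary.items():
        lines.append(f"|{library}|{avg_rmse:.4f}|{avg_seconds:.1f}|")
    return "\n".join(lines)


summary = summarize(records)
print(to_markdown(summary))
```

In the notebook itself, the same averages could presumably be taken from the `Smallest RMSE` and `Time Ellipsed` columns that each tuning cell records.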


|Library|Pros|Cons|Scenario|
|-|-|-|-|
|Scikit-learn|Flexible and basic|Only two basic methods (grid and random search); newer methods are not stable|Traditional tuning|
|HyperOpt|Fast and flexible; newer search methods (TPE, ATPE)|Outdated interface|Time-limited tuning|
|Ray|Systematic and well wrapped|Heavily customized and less flexible; slow initialization|Fast development and deployment with various tuning methods|
|Optuna|Performs well and is lightweight; includes all popular, stable tuning methods|Not all methods are equally well wrapped|When accuracy and flexibility are required|
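
The scikit-learn row refers to grid search and random search. As a rough illustration of how the two differ, the pure-Python sketch below enumerates every combination of a small grid and, separately, draws a fixed number of random samples from ranges similar to the ones used later in this notebook. It does not call scikit-learn; `toy_loss` is a made-up stand-in for a real model's validation error, and all names and values are assumptions chosen for illustration.

```python
import itertools
import random

# Toy search space; names and values are illustrative only.
grid = {"learning_rate": [0.01, 0.05, 0.1], "num_leaves": [16, 64, 256]}


def grid_candidates(grid):
    """Enumerate every combination of the listed values, as grid search does."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def random_candidates(n_iter, seed=0):
    """Draw n_iter independent samples from the ranges, as random search does."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        yield {
            "learning_rate": rng.uniform(0.01, 0.1),
            "num_leaves": rng.randint(10, 512),
        }


def toy_loss(params):
    """Stand-in objective (not a real model); smaller is better."""
    return (params["learning_rate"] - 0.05) ** 2 + (params["num_leaves"] - 64) ** 2 / 1e6


best_grid = min(grid_candidates(grid), key=toy_loss)
best_random = min(random_candidates(n_iter=9), key=toy_loss)
print(best_grid)
print(best_random)
```

Grid search cost grows with the product of the list lengths, while random search cost is set directly by `n_iter`. That is one reason random search is often preferred when many hyperparameters are tuned at once.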


In [None]:
# @title Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
%cd /content/drive/MyDrive/HPO/

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/MyDrive/HPO


In [None]:
# @title Install and import packages
! pip install python-dateutil
! pip install lightgbm
! pip install optuna
import pandas as pd
import dateutil
import dateutil.relativedelta  # explicit submodule import for dateutil.relativedelta.relativedelta
import datetime
import optuna
from tqdm import tqdm, trange
from lightgbm import LGBMRegressor
import sklearn
import sklearn.metrics  # explicit submodule import for sklearn.metrics.mean_squared_error
import math
import time

In [None]:
# @title Read-in data and check data type and volume
df = pd.read_csv('./exp_data.csv')
df.info()

In [None]:
# @title Convert the date column to datetime type
df["date"] = pd.to_datetime(df["date"])

In [None]:
library_evaluation_df = {
    'Library' : [],
    'Train Start Date': [],
    'Train End Date': [],
    'Test Start Date': [],
    'Test End Date': [],
    'Smallest RMSE': [],
    'Time Ellipsed': []
}
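
The tuning cell in this notebook evaluates on rolling windows: each step trains on the preceding 180 months, tests on the following month, and then slides forward by one month, 60 times in total. The sketch below reproduces only that date arithmetic with the standard library, so the window boundaries can be checked without loading any data. The helpers `add_months` and `rolling_windows` are hypothetical names introduced here, and `add_months` assumes first-of-month dates like the ones the notebook uses.

```python
import datetime


def add_months(d, months):
    """Shift a first-of-month datetime by a number of months (may be negative)."""
    index = d.year * 12 + (d.month - 1) + months
    return d.replace(year=index // 12, month=index % 12 + 1)


def rolling_windows(first_train_end, n_windows, train_months=180, test_months=1):
    """Yield (train_start, train_end, test_end) for each one-month step."""
    train_end = first_train_end
    for _ in range(n_windows):
        yield (
            add_months(train_end, -train_months),
            train_end,
            add_months(train_end, test_months),
        )
        train_end = add_months(train_end, 1)


windows = list(rolling_windows(datetime.datetime(2015, 12, 1), n_windows=60))
print(windows[0])
print(windows[-1])
```

The first tuple should match the first date line that the tuning cell prints (2000-12-01, 2015-12-01, 2016-01-01).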

In [None]:
# @title Optuna hyperparameter tuning
# Configuration
train_timespan_months = 180
whole_period_months = 60
test_timespan_months = 1
first_end_time = datetime.datetime(2015, 12, 1)
feat_cols = ['absacc', 'acc', 'age', 'agr', 'baspread','bm', 'bm_ia',
             'cash', 'cashdebt', 'cashpr', 'cfp', 'cfp_ia', 'chatoia', 'chcsho', 'chempia', 'chinv', 'chmom',
             'chpmia', 'chtx', 'cinvest', 'convind', 'currat', 'depr', 'divi', 'divo', 'dolvol', 'dy', 
             'egr', 'ep', 'gma', 'grcapx', 'grltnoa', 'herf', 'hire', 'ill', 'indmom', 'invest', 'lev', 'lgr',
             'maxret', 'mom12m', 'mom1m', 'mom36m', 'mom6m', 'ms', 'mve_ia', 'mvel1', 'nincr', 'operprof',
             'orgcap', 'pchcapx_ia', 'pchcurrat', 'pchdepr', 'pchgm_pchsale', 'pchquick', 'pchsale_pchinvt',
             'pchsale_pchrect', 'pchsale_pchxsga', 'pchsaleinv', 'pctacc', 'ps', 'quick', 'rd', 'rd_mve',
             'rd_sale', 'realestate', 'retvol', 'roaq', 'roavol', 'roeq', 'roic', 'rsup', 'salecash', 'pricedelay',
             'saleinv', 'salerec', 'secured', 'securedind', 'sgr', 'sin', 'sp', 'std_dolvol', 'std_turn',
             'stdacc', 'stdcf', 'tang', 'tb', 'turn', 'zerotrade','aeavol','ear','beta','betasq','idiovol']
y_col = 'ret'
train_end_date = first_end_time
time_usage = []
score_list = []
timeline = []

# Evaluation details for each train and test timespan
evaluate_detail_df = {
    'Train Start Date': [],
    'Train End Date': [],
    'Test Start Date': [],
    'Test End Date': [],
    'Smallest RMSE': [],
    'Time Ellipsed': []
}
predict_times = 60
for period_time in trange(predict_times):
    train_start_date = train_end_date - dateutil.relativedelta.relativedelta(months=train_timespan_months)
    test_end_date = train_end_date + dateutil.relativedelta.relativedelta(months=test_timespan_months)
    print(train_start_date, train_end_date, test_end_date)
    train_data = df.query(f'"{train_start_date}" < date <= "{train_end_date}"')
    test_data = df.query(f'"{train_end_date}" < date <= "{test_end_date}"')
    X_train = train_data[feat_cols].values
    y_train = train_data[y_col].values
    X_test = test_data[feat_cols].values
    y_test = test_data[y_col].values.ravel()
    study = optuna.create_study(sampler=optuna.samplers.TPESampler())  # Create a new study.
    def objective(trial):
        param = {
        'n_estimators': trial.suggest_int('n_estimators', 50, 500),   
        'num_leaves': trial.suggest_int('num_leaves', 10, 512),
        'min_data_in_leaf': trial.suggest_int('min_data_in_leaf', 10, 80),
        'bagging_fraction': trial.suggest_float('bagging_fraction', 0.0, 1.0), # subsample; LightGBM applies it only when bagging_freq > 0
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.1),  # eta
        'lambda_l1': trial.suggest_float('lambda_l1', 0.01, 1),  # reg_alpha
        'lambda_l2': trial.suggest_float('lambda_l2', 0.01, 1), # reg_lambda
        }
        model = LGBMRegressor(seed=42, **param)
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        mse = sklearn.metrics.mean_squared_error(y_test, y_pred)
        rmse = math.sqrt(mse)
        return rmse  # An objective value linked with the Trial object.
    ts = time.time()
    study.optimize(objective, n_trials=1)  # Run one trial per window; with n_trials=1, TPE has no history and samples at random.
    te = time.time()
    exc_time = te-ts
    evaluate_detail_df['Smallest RMSE'].append(study.best_value)
    evaluate_detail_df['Time Ellipsed'].append(exc_time)
    evaluate_detail_df['Train Start Date'].append(train_start_date)
    evaluate_detail_df['Train End Date'].append(train_end_date)
    evaluate_detail_df['Test Start Date'].append(train_end_date+dateutil.relativedelta.relativedelta(months=1))
    evaluate_detail_df['Test End Date'].append(test_end_date)
    train_end_date += dateutil.relativedelta.relativedelta(months=1)
evaluate_detail_df = pd.DataFrame(evaluate_detail_df)

  0%|          | 0/60 [00:00<?, ?it/s]
[I 2022-06-03 17:32:45,287] A new study created in memory with name: no-name-0fdd27fe-29f4-4e18-bd90-e385c9590cf7

2000-12-01 00:00:00 2015-12-01 00:00:00 2016-01-01 00:00:00

[I 2022-06-03 17:33:44,046] Trial 0 finished with value: 0.12014407101854203 and parameters: {'n_estimators': 268, 'num_leaves': 372, 'min_data_in_leaf': 26, 'bagging_fraction': 0.974294625529873, 'learning_rate': 0.051771955019260726, 'lambda_l1': 0.3197602119136842, 'lambda_l2': 0.3139139070581193}. Best is trial 0 with value: 0.12014407101854203.
  2%|▏         | 1/60 [00:58<57:54, 58.90s/it]
[I 2022-06-03 17:33:44,103] A new study created in memory with name: no-name-65861a4c-4122-488c-a4f0-e8217ac3b9a7

2001-01-01 00:00:00 2016-01-01 00:00:00 2016-02-01 00:00:00

[I 2022-06-03 17:35:16,232] Trial 0 finished with value: 0.08019383980045353 and parameters: {'n_estimators': 401, 'num_leaves': 444, 'min_data_in_leaf': 20, 'bagging_fraction': 0.2717637088424665, 'learning_rate': 0.05733894722006798, 'lambda_l1': 0.5653191869972338, 'lambda_l2': 0.8047878837165326}. Best is trial 0 with value: 0.08019383980045353.
  3%|▎         | 2/60 [02:31<1:15:51, 78.48s/it]
[I 2022-06-03 17:35:16,289] A new study created in memory with name: no-name-c30b1107-40ae-4b49-9a27-f0e0e08cf214

2001-02-01 00:00:00 2016-02-01 00:00:00 2016-03-01 00:00:00

[I 2022-06-03 17:36:28,274] Trial 0 finished with value: 0.10378380773668443 and parameters: {'n_estimators': 366, 'num_leaves': 465, 'min_data_in_leaf': 38, 'bagging_fraction': 0.8282910831787512, 'learning_rate': 0.03809146297301107, 'lambda_l1': 0.8236603636591376, 'lambda_l2': 0.14037952991211028}. Best is trial 0 with value: 0.10378380773668443.
  5%|▌         | 3/60 [03:43<1:11:45, 75.54s/it]
[I 2022-06-03 17:36:28,330] A new study created in memory with name: no-name-91fda144-817c-4e70-8082-ec21b5a784ed

2001-03-01 00:00:00 2016-03-01 00:00:00 2016-04-01 00:00:00

[I 2022-06-03 17:36:55,614] Trial 0 finished with value: 0.07491520892194346 and parameters: {'n_estimators': 419, 'num_leaves': 93, 'min_data_in_leaf': 38, 'bagging_fraction': 0.4458782310190025, 'learning_rate': 0.052092798825081894, 'lambda_l1': 0.9359970735445483, 'lambda_l2': 0.3903733063558175}. Best is trial 0 with value: 0.07491520892194346.
  7%|▋         | 4/60 [04:10<52:44, 56.51s/it]
[I 2022-06-03 17:36:55,676] A new study created in memory with name: no-name-92b19a33-bd7c-4cce-9969-d5dff48db2ed

2001-04-01 00:00:00 2016-04-01 00:00:00 2016-05-01 00:00:00

[I 2022-06-03 17:37:12,801] Trial 0 finished with value: 0.05822410892525416 and parameters: {'n_estimators': 83, 'num_leaves': 428, 'min_data_in_leaf': 27, 'bagging_fraction': 0.8688430879194784, 'learning_rate': 0.07079634257143526, 'lambda_l1': 0.9471216182592902, 'lambda_l2': 0.6970212138114318}. Best is trial 0 with value: 0.05822410892525416.
  8%|▊         | 5/60 [04:27<38:48, 42.33s/it]
[I 2022-06-03 17:37:12,859] A new study created in memory with name: no-name-ca51c7e9-b11b-463d-a370-7c1134c3049a

2001-05-01 00:00:00 2016-05-01 00:00:00 2016-06-01 00:00:00

[I 2022-06-03 17:37:23,138] Trial 0 finished with value: 0.0670729828126723 and parameters: {'n_estimators': 275, 'num_leaves': 41, 'min_data_in_leaf': 68, 'bagging_fraction': 0.29148774622621343, 'learning_rate': 0.08126330758317332, 'lambda_l1': 0.5699393766073408, 'lambda_l2': 0.6982690793842373}. Best is trial 0 with value: 0.0670729828126723.
 10%|█         | 6/60 [04:37<28:18, 31.45s/it]
[I 2022-06-03 17:37:23,193] A new study created in memory with name: no-name-c239c106-195d-41e2-958e-96cd1356d3af

2001-06-01 00:00:00 2016-06-01 00:00:00 2016-07-01 00:00:00

[I 2022-06-03 17:38:29,270] Trial 0 finished with value: 0.06862326520561161 and parameters: {'n_estimators': 391, 'num_leaves': 291, 'min_data_in_leaf': 16, 'bagging_fraction': 0.587390898939794, 'learning_rate': 0.013391466404517691, 'lambda_l1': 0.8748935957969195, 'lambda_l2': 0.41062010331531973}. Best is trial 0 with value: 0.06862326520561161.
 12%|█▏        | 7/60 [05:44<37:47, 42.79s/it]
[I 2022-06-03 17:38:29,324] A new study created in memory with name: no-name-6ab0f8c2-53f4-42f9-853d-f43f98d09ff2

2001-07-01 00:00:00 2016-07-01 00:00:00 2016-08-01 00:00:00

[I 2022-06-03 17:38:48,310] Trial 0 finished with value: 0.06619546110893992 and parameters: {'n_estimators': 266, 'num_leaves': 84, 'min_data_in_leaf': 34, 'bagging_fraction': 0.6374022882315279, 'learning_rate': 0.07182212771811257, 'lambda_l1': 0.23276959579485487, 'lambda_l2': 0.9457321919893327}. Best is trial 0 with value: 0.06619546110893992.
 13%|█▎        | 8/60 [06:03<30:31, 35.23s/it]
[I 2022-06-03 17:38:48,366] A new study created in memory with name: no-name-2a8df19f-7dfd-4b2c-83dc-38f306227e05

2001-08-01 00:00:00 2016-08-01 00:00:00 2016-09-01 00:00:00

[I 2022-06-03 17:39:02,148] Trial 0 finished with value: 0.05389885816508892 and parameters: {'n_estimators': 389, 'num_leaves': 38, 'min_data_in_leaf': 34, 'bagging_fraction': 0.6202801173451391, 'learning_rate': 0.07548623777265732, 'lambda_l1': 0.5411668511619852, 'lambda_l2': 0.9807127830301795}. Best is trial 0 with value: 0.05389885816508892.
 15%|█▌        | 9/60 [06:16<24:15, 28.54s/it]
[I 2022-06-03 17:39:02,203] A new study created in memory with name: no-name-af51ca3c-7fd2-48c6-a269-6af7e01f9396

2001-09-01 00:00:00 2016-09-01 00:00:00 2016-10-01 00:00:00

[I 2022-06-03 17:39:10,891] Trial 0 finished with value: 0.07736818783760037 and parameters: {'n_estimators': 110, 'num_leaves': 100, 'min_data_in_leaf': 22, 'bagging_fraction': 0.7029934014451122, 'learning_rate': 0.041083482563717875, 'lambda_l1': 0.5006470435455179, 'lambda_l2': 0.6047476253218793}. Best is trial 0 with value: 0.07736818783760037.
 17%|█▋        | 10/60 [06:25<18:41, 22.43s/it]
[I 2022-06-03 17:39:10,948] A new study created in memory with name: no-name-407f6546-7410-4555-a842-ae1ec936264a

2001-10-01 00:00:00 2016-10-01 00:00:00 2016-11-01 00:00:00

[I 2022-06-03 17:39:51,765] Trial 0 finished with value: 0.08978101459707476 and parameters: {'n_estimators': 474, 'num_leaves': 141, 'min_data_in_leaf': 40, 'bagging_fraction': 0.0016121502430528345, 'learning_rate': 0.08534978448354451, 'lambda_l1': 0.04088545950547744, 'lambda_l2': 0.13712495358570456}. Best is trial 0 with value: 0.08978101459707476.
 18%|█▊        | 11/60 [07:06<22:55, 28.07s/it]
[I 2022-06-03 17:39:51,822] A new study created in memory with name: no-name-e97f29a8-7e96-4ae9-9b90-4956fdbcdd39

2001-11-01 00:00:00 2016-11-01 00:00:00 2016-12-01 00:00:00

[I 2022-06-03 17:40:55,504] Trial 0 finished with value: 0.053357450879461095 and parameters: {'n_estimators': 343, 'num_leaves': 387, 'min_data_in_leaf': 44, 'bagging_fraction': 0.05610239264344852, 'learning_rate': 0.08177932331220801, 'lambda_l1': 0.4935200710348668, 'lambda_l2': 0.045361363401450985}. Best is trial 0 with value: 0.053357450879461095.
 20%|██        | 12/60 [08:10<31:08, 38.93s/it]
[I 2022-06-03 17:40:55,573] A new study created in memory with name: no-name-cb2bb1f1-0a1c-45a6-9f53-dd4195041404

2001-12-01 00:00:00 2016-12-01 00:00:00 2017-01-01 00:00:00

[I 2022-06-03 17:42:00,505] Trial 0 finished with value: 0.06786859463669276 and parameters: {'n_estimators': 331, 'num_leaves': 338, 'min_data_in_leaf': 37, 'bagging_fraction': 0.4120767657470459, 'learning_rate': 0.035714329189798166, 'lambda_l1': 0.09928730067493845, 'lambda_l2': 0.5909144825432318}. Best is trial 0 with value: 0.06786859463669276.
 22%|██▏       | 13/60 [09:15<36:40, 46.82s/it]
[I 2022-06-03 17:42:00,564] A new study created in memory with name: no-name-a54fff28-145f-4649-b052-1a4baf829cfd

2002-01-01 00:00:00 2017-01-01 00:00:00 2017-02-01 00:00:00

[I 2022-06-03 17:42:33,577] Trial 0 finished with value: 0.05840770262127854 and parameters: {'n_estimators': 178, 'num_leaves': 330, 'min_data_in_leaf': 57, 'bagging_fraction': 0.3087993684493945, 'learning_rate': 0.0802808574114693, 'lambda_l1': 0.23376195594959268, 'lambda_l2': 0.8151718540086543}. Best is trial 0 with value: 0.05840770262127854.
 23%|██▎       | 14/60 [09:48<32:42, 42.67s/it]
[I 2022-06-03 17:42:33,636] A new study created in memory with name: no-name-b4454f09-94f5-47c3-a470-0b1e13e3fbb4

2002-02-01 00:00:00 2017-02-01 00:00:00 2017-03-01 00:00:00

[I 2022-06-03 17:42:50,316] Trial 0 finished with value: 0.043667678289728916 and parameters: {'n_estimators': 317, 'num_leaves': 67, 'min_data_in_leaf': 43, 'bagging_fraction': 0.4572084838831454, 'learning_rate': 0.08534007925739444, 'lambda_l1': 0.25097502950996214, 'lambda_l2': 0.7540607208532455}. Best is trial 0 with value: 0.043667678289728916.
 25%|██▌       | 15/60 [10:05<26:08, 34.85s/it]
[I 2022-06-03 17:42:50,379] A new study created in memory with name: no-name-008e4d48-4bca-4088-a680-6f0b82671196

2002-03-01 00:00:00 2017-03-01 00:00:00 2017-04-01 00:00:00

[I 2022-06-03 17:43:03,012] Trial 0 finished with value: 0.050683171881216973 and parameters: {'n_estimators': 221, 'num_leaves': 51, 'min_data_in_leaf': 52, 'bagging_fraction': 0.015440183528511398, 'learning_rate': 0.04087071058850183, 'lambda_l1': 0.8905086617395727, 'lambda_l2': 0.9939924598761507}. Best is trial 0 with value: 0.050683171881216973.
 27%|██▋       | 16/60 [10:17<20:40, 28.18s/it]
[I 2022-06-03 17:43:03,068] A new study created in memory with name: no-name-03b5feba-35d6-4de6-8800-6664bf962e1f

2002-04-01 00:00:00 2017-04-01 00:00:00 2017-05-01 00:00:00

[I 2022-06-03 17:44:14,156] Trial 0 finished with value: 0.06448410626079916 and parameters: {'n_estimators': 287, 'num_leaves': 440, 'min_data_in_leaf': 40, 'bagging_fraction': 0.20025922322680179, 'learning_rate': 0.01462589285371228, 'lambda_l1': 0.17592454617194878, 'lambda_l2': 0.4781595839022791}. Best is trial 0 with value: 0.06448410626079916.
 28%|██▊       | 17/60 [11:29<29:27, 41.10s/it]
[I 2022-06-03 17:44:14,215] A new study created in memory with name: no-name-323ee53d-1765-4f44-8f80-61f2dde1cb27

2002-05-01 00:00:00 2017-05-01 00:00:00 2017-06-01 00:00:00

[I 2022-06-03 17:44:30,934] Trial 0 finished with value: 0.054389736674299756 and parameters: {'n_estimators': 168, 'num_leaves': 151, 'min_data_in_leaf': 57, 'bagging_fraction': 0.9900115918129653, 'learning_rate': 0.08168970189626208, 'lambda_l1': 0.6553521119894905, 'lambda_l2': 0.4691514771427516}. Best is trial 0 with value: 0.054389736674299756.
 30%|███       | 18/60 [11:45<23:39, 33.79s/it]
[I 2022-06-03 17:44:30,988] A new study created in memory with name: no-name-39604374-17e6-4283-a3f2-a103c8aeb246

2002-06-01 00:00:00 2017-06-01 00:00:00 2017-07-01 00:00:00

[I 2022-06-03 17:46:05,906] Trial 0 finished with value: 0.05413798508330158 and parameters: {'n_estimators': 461, 'num_leaves': 399, 'min_data_in_leaf': 41, 'bagging_fraction': 0.7093930048075773, 'learning_rate': 0.08531406531772866, 'lambda_l1': 0.16003353963669711, 'lambda_l2': 0.13750236934913815}. Best is trial 0 with value: 0.05413798508330158.
 32%|███▏      | 19/60 [13:20<35:38, 52.17s/it]
[I 2022-06-03 17:46:05,963] A new study created in memory with name: no-name-7d588669-9b5b-48f4-8969-9f12853a6fe0

2002-07-01 00:00:00 2017-07-01 00:00:00 2017-08-01 00:00:00

[I 2022-06-03 17:46:31,132] Trial 0 finished with value: 0.06251742031691136 and parameters: {'n_estimators': 148, 'num_leaves': 261, 'min_data_in_leaf': 29, 'bagging_fraction': 0.12724388442785783, 'learning_rate': 0.017286928973210127, 'lambda_l1': 0.07156539207676664, 'lambda_l2': 0.20489513231777673}. Best is trial 0 with value: 0.06251742031691136.
 33%|███▎      | 20/60 [13:45<29:23, 44.08s/it]
[I 2022-06-03 17:46:31,189] A new study created in memory with name: no-name-b083e792-0a1e-4958-a830-a56ca3c0e785

2002-08-01 00:00:00 2017-08-01 00:00:00 2017-09-01 00:00:00

[I 2022-06-03 17:46:39,088] Trial 0 finished with value: 0.06584304130032842 and parameters: {'n_estimators': 50, 'num_leaves': 305, 'min_data_in_leaf': 64, 'bagging_fraction': 0.4476580975750709, 'learning_rate': 0.05744774163161348, 'lambda_l1': 0.9787287182108987, 'lambda_l2': 0.2883498709174521}. Best is trial 0 with value: 0.06584304130032842.
 35%|███▌      | 21/60 [13:53<21:36, 33.24s/it]
[I 2022-06-03 17:46:39,146] A new study created in memory with name: no-name-9f97716b-e0fd-4993-94a7-f1143e663dd9

2002-09-01 00:00:00 2017-09-01 00:00:00 2017-10-01 00:00:00

[I 2022-06-03 17:47:07,149] Trial 0 finished with value: 0.06493398201087233 and parameters: {'n_estimators': 326, 'num_leaves': 136, 'min_data_in_leaf': 60, 'bagging_fraction': 0.8920240286170281, 'learning_rate': 0.09215456500164018, 'lambda_l1': 0.7574197631965031, 'lambda_l2': 0.23235223587163836}. Best is trial 0 with value: 0.06493398201087233.
 37%|███▋      | 22/60 [14:21<20:03, 31.68s/it]
[I 2022-06-03 17:47:07,221] A new study created in memory with name: no-name-94f48fb3-e135-473f-b6b8-017394b5ae1f

2002-10-01 00:00:00 2017-10-01 00:00:00 2017-11-01 00:00:00

[I 2022-06-03 17:47:37,200] Trial 0 finished with value: 0.06856124875539424 and parameters: {'n_estimators': 320, 'num_leaves': 109, 'min_data_in_leaf': 11, 'bagging_fraction': 0.9537057644860628, 'learning_rate': 0.02836326101654256, 'lambda_l1': 0.11492008822027182, 'lambda_l2': 0.23464357483935422}. Best is trial 0 with value: 0.06856124875539424.
 38%|███▊      | 23/60 [14:52<19:14, 31.19s/it]
[I 2022-06-03 17:47:37,258] A new study created in memory with name: no-name-1e883bdc-b688-49fa-a201-21c6c9904de5

2002-11-01 00:00:00 2017-11-01 00:00:00 2017-12-01 00:00:00

[I 2022-06-03 17:48:20,726] Trial 0 finished with value: 0.05310693725158352 and parameters: {'n_estimators': 427, 'num_leaves': 156, 'min_data_in_leaf': 37, 'bagging_fraction': 0.279136558722585, 'learning_rate': 0.03831238557177387, 'lambda_l1': 0.5462274589144304, 'lambda_l2': 0.5381234693428405}. Best is trial 0 with value: 0.05310693725158352.
 40%|████      | 24/60 [15:35<20:56, 34.89s/it]
[I 2022-06-03 17:48:20,784] A new study created in memory with name: no-name-9b58e089-4381-4202-9b9b-fde24e7171e6

2002-12-01 00:00:00 2017-12-01 00:00:00 2018-01-01 00:00:00

[I 2022-06-03 17:48:31,140] Trial 0 finished with value: 0.07387120619606334 and parameters: {'n_estimators': 395, 'num_leaves': 21, 'min_data_in_leaf': 20, 'bagging_fraction': 0.46998672312877454, 'learning_rate': 0.07226513944794064, 'lambda_l1': 0.37736435371296656, 'lambda_l2': 0.77064680157497}. Best is trial 0 with value: 0.07387120619606334.
 42%|████▏     | 25/60 [15:45<16:04, 27.55s/it]
[I 2022-06-03 17:48:31,200] A new study created in memory with name: no-name-6fe5480d-914d-4b92-9248-6f102b15a1c7

2003-01-01 00:00:00 2018-01-01 00:00:00 2018-02-01 00:00:00

[I 2022-06-03 17:48:40,603] Trial 0 finished with value: 0.08416154026369368 and parameters: {'n_estimators': 304, 'num_leaves': 25, 'min_data_in_leaf': 13, 'bagging_fraction': 0.0065226764782314595, 'learning_rate': 0.020757643419170003, 'lambda_l1': 0.25955614484203077, 'lambda_l2': 0.40287879250759295}. Best is trial 0 with value: 0.08416154026369368.
 43%|████▎     | 26/60 [15:55<12:32, 22.12s/it]
[I 2022-06-03 17:48:40,668] A new study created in memory with name: no-name-0f89a158-32db-4bd9-b796-e08e1bdb675a

2003-02-01 00:00:00 2018-02-01 00:00:00 2018-03-01 00:00:00

[I 2022-06-03 17:50:23,748] Trial 0 finished with value: 0.06787789697429485 and parameters: {'n_estimators': 492, 'num_leaves': 459, 'min_data_in_leaf': 65, 'bagging_fraction': 0.8373236134422632, 'learning_rate': 0.0908774773207321, 'lambda_l1': 0.16525772073519746, 'lambda_l2': 0.19909274613640474}. Best is trial 0 with value: 0.06787789697429485.
 45%|████▌     | 27/60 [17:38<25:32, 46.43s/it]
[I 2022-06-03 17:50:23,808] A new study created in memory with name: no-name-7b623f97-5039-40cc-8a03-800ad8c673da

2003-03-01 00:00:00 2018-03-01 00:00:00 2018-04-01 00:00:00

[I 2022-06-03 17:50:54,958] Trial 0 finished with value: 0.07104535503072233 and parameters: {'n_estimators': 363, 'num_leaves': 131, 'min_data_in_leaf': 30, 'bagging_fraction': 0.29512537653422444, 'learning_rate': 0.08243363030424106, 'lambda_l1': 0.1761165932779036, 'lambda_l2': 0.18898916474325297}. Best is trial 0 with value: 0.07104535503072233.
 47%|████▋     | 28/60 [18:09<22:19, 41.87s/it]
[I 2022-06-03 17:50:55,020] A new study created in memory with name: no-name-125e8fec-697c-4ff7-b975-ba91cb59520c

2003-04-01 00:00:00 2018-04-01 00:00:00 2018-05-01 00:00:00

[I 2022-06-03 17:52:20,265] Trial 0 finished with value: 0.06661817727116191 and parameters: {'n_estimators': 325, 'num_leaves': 443, 'min_data_in_leaf': 46, 'bagging_fraction': 0.3214931133232928, 'learning_rate': 0.020619143523789827, 'lambda_l1': 0.42061386394185657, 'lambda_l2': 0.7748432683562756}. Best is trial 0 with value: 0.06661817727116191.
 48%|████▊     | 29/60 [19:35<28:21, 54.90s/it]
[I 2022-06-03 17:52:20,361] A new study created in memory with name: no-name-b2525b5d-d2fc-46bc-8de3-ae941c0dcc7d

2003-05-01 00:00:00 2018-05-01 00:00:00 2018-06-01 00:00:00

[I 2022-06-03 17:53:04,047] Trial 0 finished with value: 0.06016003721951752 and parameters: {'n_estimators': 175, 'num_leaves': 292, 'min_data_in_leaf': 50, 'bagging_fraction': 0.4176560357284428, 'learning_rate': 0.08731857128936854, 'lambda_l1': 0.04376576727742454, 'lambda_l2': 0.6848812674778916}. Best is trial 0 with value: 0.06016003721951752.
 50%|█████     | 30/60 [20:18<25:46, 51.56s/it]
[I 2022-06-03 17:53:04,138] A new study created in memory with name: no-name-46ec6b8e-d095-4b9a-a1d7-232c9c334497

2003-06-01 00:00:00 2018-06-01 00:00:00 2018-07-01 00:00:00

[I 2022-06-03 17:53:37,481] Trial 0 finished with value: 0.07026551035301298 and parameters: {'n_estimators': 288, 'num_leaves': 154, 'min_data_in_leaf': 53, 'bagging_fraction': 0.8352190628367414, 'learning_rate': 0.03040261011950935, 'lambda_l1': 0.9202533685921809, 'lambda_l2': 0.010472158221088835}. Best is trial 0 with value: 0.07026551035301298.
 52%|█████▏    | 31/60 [20:52<22:17, 46.12s/it]
[I 2022-06-03 17:53:37,539] A new study created in memory with name: no-name-c85dc120-715e-40de-bbcd-6329968c7a50

2003-07-01 00:00:00 2018-07-01 00:00:00 2018-08-01 00:00:00

[I 2022-06-03 17:53:58,666] Trial 0 finished with value: 0.06689355894368446 and parameters: {'n_estimators': 109, 'num_leaves': 269, 'min_data_in_leaf': 31, 'bagging_fraction': 0.47110980616081677, 'learning_rate': 0.09941505886774622, 'lambda_l1': 0.41982298146960056, 'lambda_l2': 0.7269674208358973}. Best is trial 0 with value: 0.06689355894368446.
 53%|█████▎    | 32/60 [21:13<18:01, 38.64s/it]
[I 2022-06-03 17:53:58,723] A new study created in memory with name: no-name-0b39107d-b5bf-4069-8ef3-ef55f8f3b8ce

2003-08-01 00:00:00 2018-08-01 00:00:00 2018-09-01 00:00:00

[I 2022-06-03 17:54:21,151] Trial 0 finished with value: 0.05445439134307995 and parameters: {'n_estimators': 58, 'num_leaves': 379, 'min_data_in_leaf': 27, 'bagging_fraction': 0.33511624198315826, 'learning_rate': 0.09442538411911199, 'lambda_l1': 0.5949607333053314, 'lambda_l2': 0.0679188678267868}. Best is trial 0 with value: 0.05445439134307995.
 55%|█████▌    | 33/60 [21:35<15:12, 33.79s/it]
[I 2022-06-03 17:54:21,211] A new study created in memory with name: no-name-746835f6-cfb6-4873-af13-22a9117a4728

2003-09-01 00:00:00 2018-09-01 00:00:00 2018-10-01 00:00:00

[I 2022-06-03 17:55:06,773] Trial 0 finished with value: 0.14362461625253856 and parameters: {'n_estimators': 212, 'num_leaves': 284, 'min_data_in_leaf': 33, 'bagging_fraction': 0.7425109197493517, 'learning_rate': 0.07367208886324825, 'lambda_l1': 0.6804928698468536, 'lambda_l2': 0.07425334865053872}. Best is trial 0 with value: 0.14362461625253856.
 57%|█████▋    | 34/60 [22:21<16:10, 37.34s/it]
[I 2022-06-03 17:55:06,830] A new study created in memory with name: no-name-95189103-d9c9-450a-bf6d-cdf276d541c9

2003-10-01 00:00:00 2018-10-01 00:00:00 2018-11-01 00:00:00

[I 2022-06-03 17:56:58,589] Trial 0 finished with value: 0.08074941024069487 and parameters: {'n_estimators': 358, 'num_leaves': 408, 'min_data_in_leaf': 54, 'bagging_fraction': 0.9382036681572317, 'learning_rate': 0.04741101922331131, 'lambda_l1': 0.03774935090242905, 'lambda_l2': 0.021492905657798312}. Best is trial 0 with value: 0.08074941024069487.
 58%|█████▊    | 35/60 [24:13<24:52, 59.69s/it]
[I 2022-06-03 17:56:58,659] A new study created in memory with name: no-name-f17cc5f8-3fc8-409c-9035-ef0da468245e

2003-11-01 00:00:00 2018-11-01 00:00:00 2018-12-01 00:00:00

[I 2022-06-03 17:57:28,600] Trial 0 finished with value: 0.12698566341191195 and parameters: {'n_estimators': 393, 'num_leaves': 104, 'min_data_in_leaf': 72, 'bagging_fraction': 0.2132112929986858, 'learning_rate': 0.025769210754735303, 'lambda_l1': 0.9643151897910137, 'lambda_l2': 0.8999098630084961}. Best is trial 0 with value: 0.12698566341191195.
 60%|██████    | 36/60 [24:43<20:18, 50.78s/it]
[I 2022-06-03 17:57:28,657] A new study created in memory with name: no-name-b63def76-b958-4e99-94db-8441dbaf60a9

2003-12-01 00:00:00 2018-12-01 00:00:00 2019-01-01 00:00:00

[I 2022-06-03 17:58:19,501] Trial 0 finished with value: 0.11341128711715351 and parameters: {'n_estimators': 266, 'num_leaves': 344, 'min_data_in_leaf': 60, 'bagging_fraction': 0.07996230764472201, 'learning_rate': 0.06507197719176742, 'lambda_l1': 0.28958795057341025, 'lambda_l2': 0.08163162825167028}. Best is trial 0 with value: 0.11341128711715351.
 62%|██████▏   | 37/60 [25:34<19:28, 50.82s/it]
[I 2022-06-03 17:58:19,563] A new study created in memory with name: no-name-59147eb5-0b43-43db-bbc8-29bf08252ade

2004-01-01 00:00:00 2019-01-01 00:00:00 2019-02-01 00:00:00

[I 2022-06-03 17:58:27,778] Trial 0 finished with value: 0.06947361412380704 and parameters: {'n_estimators': 200, 'num_leaves': 38, 'min_data_in_leaf': 79, 'bagging_fraction': 0.4607507372812357, 'learning_rate': 0.022160674493709974, 'lambda_l1': 0.25340965196225423, 'lambda_l2': 0.6367550351029837}. Best is trial 0 with value: 0.06947361412380704.
 63%|██████▎   | 38/60 [25:42<13:57, 38.06s/it]
[I 2022-06-03 17:58:27,835] A new study created in memory with name: no-name-0e7f8d9c-f601-4b5e-84aa-3b5cd8021120

2004-02-01 00:00:00 2019-02-01 00:00:00 2019-03-01 00:00:00

[I 2022-06-03 17:58:55,326] Trial 0 finished with value: 0.05995539898786268 and parameters: {'n_estimators': 116, 'num_leaves': 460, 'min_data_in_leaf': 50, 'bagging_fraction': 0.640332375934401, 'learning_rate': 0.023708919602925795, 'lambda_l1': 0.431811031642084, 'lambda_l2': 0.419769037352957}. Best is trial 0 with value: 0.05995539898786268.
 65%|██████▌   | 39/60 [26:10<12:12, 34.90s/it]
[I 2022-06-03 17:58:55,387] A new study created in memory with name: no-name-872661e4-416e-4070-bfa5-c10cceb33841

2004-03-01 00:00:00 2019-03-01 00:00:00 2019-04-01 00:00:00

[I 2022-06-03 17:59:16,432] Trial 0 finished with value: 0.07424296567168902 and parameters: {'n_estimators': 165, 'num_leaves': 200, 'min_data_in_leaf': 64, 'bagging_fraction': 0.10538964006824703, 'learning_rate': 0.07449781259793092, 'lambda_l1': 0.5558595347092928, 'lambda_l2': 0.3699401418476587}. Best is trial 0 with value: 0.07424296567168902.
 67%|██████▋   | 40/60 [26:31<10:15, 30.77s/it]
[I 2022-06-03 17:59:16,493] A new study created in memory with name: no-name-a70af87e-f14e-4ef8-82b8-4e13e9cfcb97

2004-04-01 00:00:00 2019-04-01 00:00:00 2019-05-01 00:00:00

[I 2022-06-03 18:00:31,953] Trial 0 finished with value: 0.11516233501442177 and parameters: {'n_estimators': 290, 'num_leaves': 505, 'min_data_in_leaf': 50, 'bagging_fraction': 0.008769024137896264, 'learning_rate': 0.09468538846863302, 'lambda_l1': 0.1731306920626565, 'lambda_l2': 0.3879132657284738}. Best is trial 0 with value: 0.11516233501442177.
 68%|██████▊   | 41/60 [27:46<13:59, 44.19s/it]
[I 2022-06-03 18:00:32,012] A new study created in memory with name: no-name-329b8697-c009-49a3-8850-f376d237707a

2004-05-01 00:00:00 2019-05-01 00:00:00 2019-06-01 00:00:00

[I 2022-06-03 18:00:42,510] Trial 0 finished with value: 0.08406336898445514 and parameters: {'n_estimators': 120, 'num_leaves': 117, 'min_data_in_leaf': 62, 'bagging_fraction': 0.7590115505877337, 'learning_rate': 0.09210694128601205, 'lambda_l1': 0.7340352722226295, 'lambda_l2': 0.8714137528925119}. Best is trial 0 with value: 0.08406336898445514.
 70%|███████   | 42/60 [27:57<10:13, 34.10s/it]
[I 2022-06-03 18:00:42,572] A new study created in memory with name: no-name-f4c712b3-348c-4008-93c8-9426b49d8e72

2004-06-01 00:00:00 2019-06-01 00:00:00 2019-07-01 00:00:00

[I 2022-06-03 18:01:02,417] Trial 0 finished with value: 0.06688736368508577 and parameters: {'n_estimators': 202, 'num_leaves': 140, 'min_data_in_leaf': 69, 'bagging_fraction': 0.7672834567713196, 'learning_rate': 0.0410200283494049, 'lambda_l1': 0.7340917520319944, 'lambda_l2': 0.3819495968515241}. Best is trial 0 with value: 0.06688736368508577.
 72%|███████▏  | 43/60 [28:17<08:27, 29.84s/it]
[I 2022-06-03 18:01:02,478] A new study created in memory with name: no-name-704863a5-3341-40d3-b917-5ecf3ab0479b

2004-07-01 00:00:00 2019-07-01 00:00:00 2019-08-01 00:00:00

[I 2022-06-03 18:01:34,295] Trial 0 finished with value: 0.09432848189304384 and parameters: {'n_estimators': 194, 'num_leaves': 287, 'min_data_in_leaf': 62, 'bagging_fraction': 0.029600718347063792, 'learning_rate': 0.055676940822447805, 'lambda_l1': 0.5453545212086743, 'lambda_l2': 0.9248576909187926}. Best is trial 0 with value: 0.09432848189304384.
 73%|███████▎  | 44/60 [28:49<08:07, 30.45s/it]
[I 2022-06-03 18:01:34,353] A new study created in memory with name: no-name-10c84f17-7a21-4e84-a66e-8170a17e1a80


2004-08-01 00:00:00 2019-08-01 00:00:00 2019-09-01 00:00:00


[32m[I 2022-06-03 18:02:25,468][0m Trial 0 finished with value: 0.05902564562608192 and parameters: {'n_estimators': 283, 'num_leaves': 435, 'min_data_in_leaf': 63, 'bagging_fraction': 0.07188048343531883, 'learning_rate': 0.016234580655725386, 'lambda_l1': 0.9772196908034232, 'lambda_l2': 0.20070024169624257}. Best is trial 0 with value: 0.05902564562608192.[0m
 75%|███████▌  | 45/60 [29:40<09:10, 36.67s/it][32m[I 2022-06-03 18:02:25,529][0m A new study created in memory with name: no-name-211f0e21-6505-40a5-9418-581470969738[0m


2004-09-01 00:00:00 2019-09-01 00:00:00 2019-10-01 00:00:00


[32m[I 2022-06-03 18:03:08,821][0m Trial 0 finished with value: 0.07615698337417376 and parameters: {'n_estimators': 174, 'num_leaves': 451, 'min_data_in_leaf': 72, 'bagging_fraction': 0.5135105519428946, 'learning_rate': 0.0709753428778593, 'lambda_l1': 0.09466089090567917, 'lambda_l2': 0.10887156788225674}. Best is trial 0 with value: 0.07615698337417376.[0m
 77%|███████▋  | 46/60 [30:23<09:01, 38.67s/it][32m[I 2022-06-03 18:03:08,881][0m A new study created in memory with name: no-name-374d3ae9-bd2f-4d53-9aca-84515841a9a7[0m


2004-10-01 00:00:00 2019-10-01 00:00:00 2019-11-01 00:00:00


[32m[I 2022-06-03 18:03:46,470][0m Trial 0 finished with value: 0.0667533052236141 and parameters: {'n_estimators': 242, 'num_leaves': 217, 'min_data_in_leaf': 48, 'bagging_fraction': 0.1055316403732749, 'learning_rate': 0.027714067524089753, 'lambda_l1': 0.02750198272496303, 'lambda_l2': 0.8611320729155278}. Best is trial 0 with value: 0.0667533052236141.[0m
 78%|███████▊  | 47/60 [31:01<08:18, 38.37s/it][32m[I 2022-06-03 18:03:46,532][0m A new study created in memory with name: no-name-85a6cc40-af6d-4356-8c14-a8f0ade52641[0m


2004-11-01 00:00:00 2019-11-01 00:00:00 2019-12-01 00:00:00


[32m[I 2022-06-03 18:04:56,319][0m Trial 0 finished with value: 0.06045654831801104 and parameters: {'n_estimators': 362, 'num_leaves': 430, 'min_data_in_leaf': 38, 'bagging_fraction': 0.7030868599049996, 'learning_rate': 0.09352853749079416, 'lambda_l1': 0.5741793785037419, 'lambda_l2': 0.21416632548064257}. Best is trial 0 with value: 0.06045654831801104.[0m
 80%|████████  | 48/60 [32:11<09:33, 47.81s/it][32m[I 2022-06-03 18:04:56,380][0m A new study created in memory with name: no-name-57bc6db6-ff0e-44ca-9430-561c4f730ae7[0m


2004-12-01 00:00:00 2019-12-01 00:00:00 2020-01-01 00:00:00


[32m[I 2022-06-03 18:05:38,322][0m Trial 0 finished with value: 0.08351162654688528 and parameters: {'n_estimators': 197, 'num_leaves': 424, 'min_data_in_leaf': 32, 'bagging_fraction': 0.7595939130832353, 'learning_rate': 0.08856328324491973, 'lambda_l1': 0.7094158403461719, 'lambda_l2': 0.02609183118933176}. Best is trial 0 with value: 0.08351162654688528.[0m
 82%|████████▏ | 49/60 [32:53<08:26, 46.07s/it][32m[I 2022-06-03 18:05:38,386][0m A new study created in memory with name: no-name-4599c4a1-ea0f-4436-96e6-f03caef93df0[0m


2005-01-01 00:00:00 2020-01-01 00:00:00 2020-02-01 00:00:00


[32m[I 2022-06-03 18:05:47,918][0m Trial 0 finished with value: 0.13431642118584788 and parameters: {'n_estimators': 327, 'num_leaves': 21, 'min_data_in_leaf': 16, 'bagging_fraction': 0.05282848345114377, 'learning_rate': 0.024039254164137065, 'lambda_l1': 0.14450282322391914, 'lambda_l2': 0.36486259326583903}. Best is trial 0 with value: 0.13431642118584788.[0m
 83%|████████▎ | 50/60 [33:02<05:51, 35.13s/it][32m[I 2022-06-03 18:05:47,984][0m A new study created in memory with name: no-name-00180656-94c8-4216-ad8e-eb742ef00467[0m


2005-02-01 00:00:00 2020-02-01 00:00:00 2020-03-01 00:00:00


[32m[I 2022-06-03 18:06:56,429][0m Trial 0 finished with value: 0.48531170403061596 and parameters: {'n_estimators': 384, 'num_leaves': 359, 'min_data_in_leaf': 48, 'bagging_fraction': 0.45494623299391723, 'learning_rate': 0.04896289048319624, 'lambda_l1': 0.7687039298972054, 'lambda_l2': 0.07645993082932771}. Best is trial 0 with value: 0.48531170403061596.[0m
 85%|████████▌ | 51/60 [34:11<06:46, 45.14s/it][32m[I 2022-06-03 18:06:56,495][0m A new study created in memory with name: no-name-97fa739a-0a43-4a09-bf75-47327b6b3d23[0m


2005-03-01 00:00:00 2020-03-01 00:00:00 2020-04-01 00:00:00


[32m[I 2022-06-03 18:07:28,569][0m Trial 0 finished with value: 0.2862720749588326 and parameters: {'n_estimators': 245, 'num_leaves': 188, 'min_data_in_leaf': 13, 'bagging_fraction': 0.5308634667631523, 'learning_rate': 0.06252248870912558, 'lambda_l1': 0.6657344678513178, 'lambda_l2': 0.2295909867513167}. Best is trial 0 with value: 0.2862720749588326.[0m
 87%|████████▋ | 52/60 [34:43<05:29, 41.24s/it][32m[I 2022-06-03 18:07:28,629][0m A new study created in memory with name: no-name-b5b8df89-7956-4a42-9208-ceaa3545eef1[0m


2005-04-01 00:00:00 2020-04-01 00:00:00 2020-05-01 00:00:00


[32m[I 2022-06-03 18:08:05,336][0m Trial 0 finished with value: 0.10768620067121613 and parameters: {'n_estimators': 192, 'num_leaves': 338, 'min_data_in_leaf': 43, 'bagging_fraction': 0.06396795219300289, 'learning_rate': 0.08210209790909856, 'lambda_l1': 0.8731982351713249, 'lambda_l2': 0.5382193285870668}. Best is trial 0 with value: 0.10768620067121613.[0m
 88%|████████▊ | 53/60 [35:20<04:39, 39.90s/it][32m[I 2022-06-03 18:08:05,404][0m A new study created in memory with name: no-name-a83f2d11-bedf-4a2f-9a62-0fb15b5bba33[0m


2005-05-01 00:00:00 2020-05-01 00:00:00 2020-06-01 00:00:00


[32m[I 2022-06-03 18:09:08,546][0m Trial 0 finished with value: 0.07675069471293311 and parameters: {'n_estimators': 236, 'num_leaves': 466, 'min_data_in_leaf': 26, 'bagging_fraction': 0.7871640983401632, 'learning_rate': 0.020463175140088014, 'lambda_l1': 0.4170042828158193, 'lambda_l2': 0.5862071342200781}. Best is trial 0 with value: 0.07675069471293311.[0m
 90%|█████████ | 54/60 [36:23<04:41, 46.89s/it][32m[I 2022-06-03 18:09:08,612][0m A new study created in memory with name: no-name-3bd336d7-b93a-4022-bf85-b6687763baba[0m


2005-06-01 00:00:00 2020-06-01 00:00:00 2020-07-01 00:00:00


[32m[I 2022-06-03 18:10:18,974][0m Trial 0 finished with value: 0.10076033730179461 and parameters: {'n_estimators': 314, 'num_leaves': 382, 'min_data_in_leaf': 26, 'bagging_fraction': 0.7164363960492673, 'learning_rate': 0.06149639526582263, 'lambda_l1': 0.033703641328631156, 'lambda_l2': 0.41265195777749186}. Best is trial 0 with value: 0.10076033730179461.[0m
 92%|█████████▏| 55/60 [37:33<04:29, 53.95s/it][32m[I 2022-06-03 18:10:19,035][0m A new study created in memory with name: no-name-da8e7f5e-beea-4ded-a72f-5b44a3f6d5f3[0m


2005-07-01 00:00:00 2020-07-01 00:00:00 2020-08-01 00:00:00


[32m[I 2022-06-03 18:10:53,275][0m Trial 0 finished with value: 0.08636640562117433 and parameters: {'n_estimators': 181, 'num_leaves': 320, 'min_data_in_leaf': 53, 'bagging_fraction': 0.5711139456491199, 'learning_rate': 0.08542226380970386, 'lambda_l1': 0.24789044778174046, 'lambda_l2': 0.9200393465060456}. Best is trial 0 with value: 0.08636640562117433.[0m
 93%|█████████▎| 56/60 [38:08<03:12, 48.06s/it][32m[I 2022-06-03 18:10:53,335][0m A new study created in memory with name: no-name-04a8b0b8-4d58-4ccf-8fa4-38572c7e5ff8[0m


2005-08-01 00:00:00 2020-08-01 00:00:00 2020-09-01 00:00:00


[32m[I 2022-06-03 18:11:31,528][0m Trial 0 finished with value: 0.08366220221107927 and parameters: {'n_estimators': 341, 'num_leaves': 180, 'min_data_in_leaf': 73, 'bagging_fraction': 0.09546303148034319, 'learning_rate': 0.08476369163748282, 'lambda_l1': 0.2190238421195959, 'lambda_l2': 0.4773624396487671}. Best is trial 0 with value: 0.08366220221107927.[0m
 95%|█████████▌| 57/60 [38:46<02:15, 45.11s/it][32m[I 2022-06-03 18:11:31,588][0m A new study created in memory with name: no-name-8992c273-ebc3-480a-9a13-396f03b35569[0m


2005-09-01 00:00:00 2020-09-01 00:00:00 2020-10-01 00:00:00


[32m[I 2022-06-03 18:13:12,005][0m Trial 0 finished with value: 0.08473459445233832 and parameters: {'n_estimators': 367, 'num_leaves': 490, 'min_data_in_leaf': 33, 'bagging_fraction': 0.47112341979694927, 'learning_rate': 0.01893882822917211, 'lambda_l1': 0.3548998999783392, 'lambda_l2': 0.244676063140908}. Best is trial 0 with value: 0.08473459445233832.[0m
 97%|█████████▋| 58/60 [40:26<02:03, 61.72s/it][32m[I 2022-06-03 18:13:12,062][0m A new study created in memory with name: no-name-812a578e-fffb-4955-8756-eb91fd07fd61[0m


2005-10-01 00:00:00 2020-10-01 00:00:00 2020-11-01 00:00:00


[32m[I 2022-06-03 18:13:33,090][0m Trial 0 finished with value: 0.16823991270667032 and parameters: {'n_estimators': 239, 'num_leaves': 123, 'min_data_in_leaf': 43, 'bagging_fraction': 0.3856901931840869, 'learning_rate': 0.06451953053786437, 'lambda_l1': 0.4671681551781985, 'lambda_l2': 0.7455652397075785}. Best is trial 0 with value: 0.16823991270667032.[0m
 98%|█████████▊| 59/60 [40:47<00:49, 49.53s/it][32m[I 2022-06-03 18:13:33,146][0m A new study created in memory with name: no-name-a2a59e1f-a306-485e-89e0-2525bf43215b[0m


2005-11-01 00:00:00 2020-11-01 00:00:00 2020-12-01 00:00:00


[32m[I 2022-06-03 18:14:24,296][0m Trial 0 finished with value: 0.06104910464372973 and parameters: {'n_estimators': 326, 'num_leaves': 237, 'min_data_in_leaf': 47, 'bagging_fraction': 0.7353679216692118, 'learning_rate': 0.01659753603452322, 'lambda_l1': 0.17769560328863834, 'lambda_l2': 0.04840970419390444}. Best is trial 0 with value: 0.06104910464372973.[0m
100%|██████████| 60/60 [41:39<00:00, 41.65s/it]


In [None]:
evaluate_detail_df

| |Train Start Date|Train End Date|Test Start Date|Test End Date|Smallest RMSE|Time Ellipsed|
|-|-|-|-|-|-|-|
|0|2000-12-01|2015-12-01|2016-01-01|2016-01-01|0.120144|58.760128|
|1|2001-01-01|2016-01-01|2016-02-01|2016-02-01|0.080194|92.129778|
|2|2001-02-01|2016-02-01|2016-03-01|2016-03-01|0.103784|71.986675|
|3|2001-03-01|2016-03-01|2016-04-01|2016-04-01|0.074915|27.287619|
|4|2001-04-01|2016-04-01|2016-05-01|2016-05-01|0.058224|17.13049|
|5|2001-05-01|2016-05-01|2016-06-01|2016-06-01|0.067073|10.279614|
|6|2001-06-01|2016-06-01|2016-07-01|2016-07-01|0.068623|66.076798|
|7|2001-07-01|2016-07-01|2016-08-01|2016-08-01|0.066195|18.987839|
|8|2001-08-01|2016-08-01|2016-09-01|2016-09-01|0.053899|13.782289|
|9|2001-09-01|2016-09-01|2016-10-01|2016-10-01|0.077368|8.68783|


In [None]:
# Append the Optuna results to the cross-library comparison table.
library_evaluation_df['Library'].extend(['Optuna'] * len(evaluate_detail_df))
for col in ['Train Start Date', 'Train End Date', 'Test Start Date',
            'Test End Date', 'Smallest RMSE', 'Time Ellipsed']:
    library_evaluation_df[col].extend(evaluate_detail_df[col])

In [None]:
pd.DataFrame(library_evaluation_df)

| |Library|Train Start Date|Train End Date|Test Start Date|Test End Date|Smallest RMSE|Time Ellipsed|
|-|-|-|-|-|-|-|-|
|0|Optuna|2000-12-01|2015-12-01|2016-01-01|2016-01-01|0.120144|58.760128|
|1|Optuna|2001-01-01|2016-01-01|2016-02-01|2016-02-01|0.080194|92.129778|
|2|Optuna|2001-02-01|2016-02-01|2016-03-01|2016-03-01|0.103784|71.986675|
|3|Optuna|2001-03-01|2016-03-01|2016-04-01|2016-04-01|0.074915|27.287619|
|4|Optuna|2001-04-01|2016-04-01|2016-05-01|2016-05-01|0.058224|17.13049|
|5|Optuna|2001-05-01|2016-05-01|2016-06-01|2016-06-01|0.067073|10.279614|
|6|Optuna|2001-06-01|2016-06-01|2016-07-01|2016-07-01|0.068623|66.076798|
|7|Optuna|2001-07-01|2016-07-01|2016-08-01|2016-08-01|0.066195|18.987839|
|8|Optuna|2001-08-01|2016-08-01|2016-09-01|2016-09-01|0.053899|13.782289|
|9|Optuna|2001-09-01|2016-09-01|2016-10-01|2016-10-01|0.077368|8.68783|
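
The stated goal of this notebook is a table of average RMSE and average elapsed time per HPO package. A minimal sketch of that aggregation follows, assuming `library_evaluation_df` keeps the dict-of-lists layout used above. The rows below are toy placeholder values, not measured results, and the `Scikit-learn` label is only there to show a second group.

```python
import pandas as pd

# Placeholder rows only: same layout as `library_evaluation_df`,
# filled with toy numbers rather than measured results.
library_evaluation_df = {
    'Library': ['Optuna', 'Optuna', 'Scikit-learn', 'Scikit-learn'],
    'Smallest RMSE': [0.10, 0.30, 0.20, 0.40],
    'Time Ellipsed': [10.0, 30.0, 20.0, 40.0],
}

# Average the per-window metrics for each library.
summary = (
    pd.DataFrame(library_evaluation_df)
    .groupby('Library')[['Smallest RMSE', 'Time Ellipsed']]
    .mean()
    .rename(columns={'Smallest RMSE': 'Avg RMSE',
                     'Time Ellipsed': 'Avg Time Ellipsed'})
)
print(summary)
```

Each row of `summary` then corresponds to one row of the comparison table at the top of the notebook.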


# Scikit-learn

Refer to [scikit-learn `RandomizedSearchCV`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html).
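
The evaluation uses a walk-forward scheme: a 180-month training window, a 1-month test window, and a one-month step repeated 60 times. The sketch below reproduces only the date arithmetic, using the standard library. The helpers `add_months` and `rolling_windows` are hypothetical names introduced for illustration, and the code assumes every date falls on the first of a month, as in the printed output.

```python
import datetime


def add_months(d, months):
    """Shift a first-of-month datetime by a whole number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return d.replace(year=total // 12, month=total % 12 + 1)


def rolling_windows(first_end, n_periods, train_months=180, test_months=1):
    """Yield (train_start, train_end, test_end) for each walk-forward step."""
    train_end = first_end
    for _ in range(n_periods):
        yield (add_months(train_end, -train_months),
               train_end,
               add_months(train_end, test_months))
        train_end = add_months(train_end, 1)  # slide the window by one month


windows = list(rolling_windows(datetime.datetime(2015, 12, 1), n_periods=60))
print(windows[0])   # first window
print(windows[-1])  # last window
```

The first and last tuples should match the first and last date triples printed by the tuning loops in this notebook.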

In [None]:
# @title Scikit-learn random search
import datetime
import math
import time

import dateutil.relativedelta
import pandas as pd
import sklearn.metrics
from lightgbm import LGBMRegressor
from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from tqdm import trange

# Configuration (`df` is the data frame loaded earlier in the notebook)
train_timespan_months = 180
whole_period_months = 60
test_timespan_months = 1
first_end_time = datetime.datetime(2015, 12, 1)
feat_cols = ['absacc', 'acc', 'age', 'agr', 'baspread','bm', 'bm_ia',
             'cash', 'cashdebt', 'cashpr', 'cfp', 'cfp_ia', 'chatoia', 'chcsho', 'chempia', 'chinv', 'chmom',
             'chpmia', 'chtx', 'cinvest', 'convind', 'currat', 'depr', 'divi', 'divo', 'dolvol', 'dy',
             'egr', 'ep', 'gma', 'grcapx', 'grltnoa', 'herf', 'hire', 'ill', 'indmom', 'invest', 'lev', 'lgr',
             'maxret', 'mom12m', 'mom1m', 'mom36m', 'mom6m', 'ms', 'mve_ia', 'mvel1', 'nincr', 'operprof',
             'orgcap', 'pchcapx_ia', 'pchcurrat', 'pchdepr', 'pchgm_pchsale', 'pchquick', 'pchsale_pchinvt',
             'pchsale_pchrect', 'pchsale_pchxsga', 'pchsaleinv', 'pctacc', 'ps', 'quick', 'rd', 'rd_mve',
             'rd_sale', 'realestate', 'retvol', 'roaq', 'roavol', 'roeq', 'roic', 'rsup', 'salecash', 'pricedelay',
             'saleinv', 'salerec', 'secured', 'securedind', 'sgr', 'sin', 'sp', 'std_dolvol', 'std_turn',
             'stdacc', 'stdcf', 'tang', 'tb', 'turn', 'zerotrade','aeavol','ear','beta','betasq','idiovol']
y_col = 'ret'
train_end_date = first_end_time
time_usage = []
score_list = []
timeline = []

# Evaluation details for each train and test timespan
evaluate_detail_df = {
    'Train Start Date': [],
    'Train End Date': [],
    'Test Start Date': [],
    'Test End Date': [],
    'Smallest RMSE': [],
    'Time Ellipsed': []
}
predict_times = 60


def rmse(reg, X, y):
    """Root mean squared error of the fitted estimator `reg` on (X, y)."""
    y_pred = reg.predict(X)
    mse = sklearn.metrics.mean_squared_error(y, y_pred)
    return math.sqrt(mse)


def neg_rmse(reg, X, y):
    """Scorer for scikit-learn, which maximizes scores, so return -RMSE."""
    return -rmse(reg, X, y)


for period_time in trange(predict_times):
    train_start_date = train_end_date - dateutil.relativedelta.relativedelta(months=train_timespan_months)
    test_end_date = train_end_date + dateutil.relativedelta.relativedelta(months=test_timespan_months)
    print(train_start_date, train_end_date, test_end_date)
    train_data = df.query(f'"{train_start_date}" < date <= "{train_end_date}"')
    test_data = df.query(f'"{train_end_date}" < date <= "{test_end_date}"')
    X_train = train_data[feat_cols].values
    y_train = train_data[y_col].values
    # Held-out month; the cross-validated search below does not use it.
    X_test = test_data[feat_cols].values
    y_test = test_data[y_col].values.ravel()
    model = LGBMRegressor(seed=42)
    param_distribution = dict(
        # LightGBM expects integers for the next three parameters, so sample
        # integers here; float samples are a likely cause of failed fits.
        n_estimators=randint(50, 551),       # integers in [50, 550]
        num_leaves=randint(10, 523),         # integers in [10, 522]
        min_data_in_leaf=randint(10, 91),    # integers in [10, 90]
        bagging_fraction=uniform(loc=0, scale=0.1),   # subsample
        learning_rate=uniform(loc=0.01, scale=0.1),   # eta
        lambda_l1=uniform(loc=0.01, scale=1),         # reg_alpha
        lambda_l2=uniform(loc=0.01, scale=1),         # reg_lambda
    )
    # Defaults: n_iter=10 candidates x 5-fold CV = 50 fits per window.
    search_cv = RandomizedSearchCV(model,
                                   param_distribution,
                                   scoring=neg_rmse,
                                   random_state=0)
    ts = time.time()
    search_cv.fit(X_train, y_train)
    te = time.time()
    exc_time = te - ts
    # best_score_ is the largest -RMSE; flip the sign to get the smallest RMSE.
    evaluate_detail_df['Smallest RMSE'].append(-search_cv.best_score_)
    evaluate_detail_df['Time Ellipsed'].append(exc_time)
    evaluate_detail_df['Train Start Date'].append(train_start_date)
    evaluate_detail_df['Train End Date'].append(train_end_date)
    evaluate_detail_df['Test Start Date'].append(train_end_date + dateutil.relativedelta.relativedelta(months=1))
    evaluate_detail_df['Test End Date'].append(test_end_date)
    train_end_date += dateutil.relativedelta.relativedelta(months=1)
evaluate_detail_df = pd.DataFrame(evaluate_detail_df)

  0%|          | 0/60 [00:00<?, ?it/s]

2000-12-01 00:00:00 2015-12-01 00:00:00 2016-01-01 00:00:00


50 fits failed out of a total of 50.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
5 fits failed with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/lightgbm/engine.py", line 197, in train
    booster = Booster(params=params, train_set=train_set)
  File "/usr/local/lib/python3.7/dist-packages/lightgbm/basic.py", line 1552, in __init__
    train_set.construct().handle,
  File "/usr/local/lib/python3.7/dist-packages/lightgbm/basic.py", line 1001, in construct
    categorical_feature=self.categorical_feature, params=self.params)
  File "/usr/local/lib/python3.7/dist-packages/lightgbm/basic.py", line 791, in _lazy_init
    self.__init_from_np2d(data, params_str, ref_dataset)
  File "/

NotFittedError: ignored
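
All 50 fits in the recorded output failed, and the traceback is cut off before the final error message, so the cause cannot be confirmed from the log alone. One plausible culprit is the parameter sampling: `scipy.stats.uniform` draws floats, while LightGBM expects integers for `n_estimators`, `num_leaves`, and `min_data_in_leaf`. The sketch below only compares the two samplers, using the `num_leaves` range as an example; the variable names are illustrative.

```python
from scipy.stats import randint, uniform

# Continuous uniform on [10, 522]: draws are floats.
float_draws = uniform(loc=10, scale=512).rvs(size=5, random_state=0)

# Discrete uniform on the integers 10..522 (upper bound is exclusive).
int_draws = randint(10, 523).rvs(size=5, random_state=0)

print(float_draws)
print(int_draws)
```

Passing the `randint` distribution to `RandomizedSearchCV` for the integer-valued parameters keeps the same search range while yielding values of the expected type.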