## Parameter management

This notebook walks through managing hyperparameters with Hydra.

* Reference:
  * [How to manage Python and machine learning parameters with Hydra](https://zenn.dev/kwashizzz/articles/ml-hydra-param)

In [22]:
import os
import seaborn as sns
import sys
import pandas as pd
from hydra import compose, initialize
from omegaconf import DictConfig, ListConfig, OmegaConf
from sklearn.model_selection import train_test_split


# Import the model module
sys.path.append('../../')
sys.dont_write_bytecode = True
from src.train.lightGBM.model import Model


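## Loading the configuration file

To load a Hydra configuration file (.yaml) inside a notebook, use the Compose API: `initialize` sets the config search path, and `compose` then builds the config object.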

In [23]:
with initialize(version_base=None, config_path='../../config', job_name="test"):
    # compose returns the config in dictionary-like form (a DictConfig)
    cfg = compose(config_name="train")

In [24]:
cfg['model']

{'params': {'objective': 'regression', "metric'": 'rmse', 'num_leaves': 100, 'max_depth': 10, 'feature_fraction': 0.8, 'subsample_freq': 1, 'bagging_fraction': 0.95, 'learning_rate': 0.1, 'boosting': 'gbdt', 'lambda_l1': 0.1, 'lambda_l2': 10, 'random_state': 42, 'verbosity': -1}}

## Creating the training data

In [25]:
# Load the dataset
df = sns.load_dataset('titanic')

# Select the explanatory variables (everything except the target columns)
X = pd.get_dummies(
    df.loc[:, (df.columns != 'survived') & (df.columns != 'alive')],
    drop_first=True
)
y = df['survived']
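
To make the feature construction above easier to follow, here is a small sketch on a hand-made frame (the values are illustrative and not taken from the Titanic data). `alive` is excluded along with `survived` because it holds the same information as text, so keeping it would leak the target. `drop_first=True` drops the first level of each categorical column, so a column with k categories becomes k-1 dummy columns.

```python
import pandas as pd

# Tiny hand-made frame with the same kinds of columns as the Titanic data
# (values are illustrative only).
toy = pd.DataFrame(
    {
        "survived": [0, 1, 1, 0],
        "age": [22.0, 38.0, 26.0, 35.0],
        "sex": ["male", "female", "female", "male"],
        "embarked": ["S", "C", "Q", "S"],
        "alive": ["no", "yes", "yes", "no"],
    }
)

# Drop the target and its text duplicate `alive`, then one-hot encode.
features = toy.drop(columns=["survived", "alive"])
X_toy = pd.get_dummies(features, drop_first=True)
y_toy = toy["survived"]

# Numeric columns pass through; each categorical column loses its first level.
print(list(X_toy.columns))
```

Using `DataFrame.drop(columns=...)` here is equivalent to the boolean column mask in the cell above and may be easier to read when several columns are excluded.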

In [30]:
# Instantiate the model
model = Model(params=OmegaConf.to_container(cfg['model']))

# Preprocessing
model.preprocessing(X, y)

# Train the model
model.train()

# Predict
model.predict(X)

[1]	valid_0's rmse: 0.471568
[2]	valid_0's rmse: 0.452985
[3]	valid_0's rmse: 0.436971
[4]	valid_0's rmse: 0.425039
[5]	valid_0's rmse: 0.412421
[6]	valid_0's rmse: 0.403422
[7]	valid_0's rmse: 0.39577
[8]	valid_0's rmse: 0.38823
[9]	valid_0's rmse: 0.383366
[10]	valid_0's rmse: 0.378937
[11]	valid_0's rmse: 0.375149
[12]	valid_0's rmse: 0.371092
[13]	valid_0's rmse: 0.368492
[14]	valid_0's rmse: 0.36584
[15]	valid_0's rmse: 0.363963
[16]	valid_0's rmse: 0.362676
[17]	valid_0's rmse: 0.360535
[18]	valid_0's rmse: 0.359405
[19]	valid_0's rmse: 0.358393
[20]	valid_0's rmse: 0.357472
[21]	valid_0's rmse: 0.357059
[22]	valid_0's rmse: 0.356041
[23]	valid_0's rmse: 0.355005
[24]	valid_0's rmse: 0.355156
[25]	valid_0's rmse: 0.354678
[26]	valid_0's rmse: 0.354073
[27]	valid_0's rmse: 0.353313
[28]	valid_0's rmse: 0.352416
[29]	valid_0's rmse: 0.352023
[30]	valid_0's rmse: 0.351048
[31]	valid_0's rmse: 0.35183
[32]	valid_0's rmse: 0.351163
[33]	valid_0's rmse: 0.351447
[34]	valid_0's rmse: 0.