# FLAML

* https://github.com/microsoft/FLAML

<a href="https://colab.research.google.com/github/fuyu-quant/Data_Science/blob/main/Tabel_Data/AutoML/FLAML.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Overview
* Available estimators  
lgbm, xgboost, xgb_limitdepth, rf, extra_tree, lrl1, lrl2, catboost, kneighbor, prophet, arima, sarimax, transformer, temporal_fusion_transformer  
https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML/#estimator-and-search-space
* Reference  
https://github.com/microsoft/FLAML/blob/main/notebook/automl_classification.ipynb

In [1]:
%%capture
!pip install flaml

Successfully installed flaml-1.0.14 lightgbm-3.3.3
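
After installation it can be useful to confirm which version the running kernel actually sees. The following is a minimal sketch using only the standard library (Python 3.8 or later); `installed_version` is a hypothetical helper name introduced here, not part of FLAML.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed version string, or None if flaml is not installed
print("flaml:", installed_version("flaml"))
```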


In [9]:
# FLAML
from flaml import AutoML

import numpy as np
import pandas as pd

from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

## Preparing the data

In [3]:
iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], test_size=0.25, random_state=0
)
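
The iris dataset has 150 samples, so `test_size=0.25` does not divide evenly. The sketch below works out the resulting sizes, assuming the test set is rounded up as scikit-learn does for a fractional `test_size`. `split_sizes` is a hypothetical helper for illustration only; the real sizes can be checked with `X_train.shape` and `X_test.shape`.

```python
import math


def split_sizes(n_samples, test_size):
    """Train/test sizes for a fractional test_size, rounding the test set up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


n_train, n_test = split_sizes(150, 0.25)
print(n_train, n_test)  # 112 38
```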

## Training with FLAML

In [None]:
automl = AutoML()

settings = {
    "time_budget": 60,                              # time budget for the search, in seconds
    # Estimators to search over (optional). Custom learners such as 'RGF'
    # must be registered with automl.add_learner() before they can be listed.
    "estimator_list": ['lgbm', 'rf', 'xgboost'],
    # Optimization metric; a custom function also works
    # (https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML/#optimization-metric)
    "metric": 'accuracy',
    "task": 'classification',                       # task type
    # "log_file_name": 'airlines_experiment.log',   # log file name
    "seed": 3655,                                   # random seed
}

automl.fit(X_train, y_train, **settings)
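
A name in `estimator_list` that FLAML does not recognize will cause the fit to fail, so a quick check before a long run may be worthwhile. The sketch below compares a list against the built-in names given in the overview of this notebook. `KNOWN_ESTIMATORS` and `unknown_estimators` are hypothetical names introduced here, not FLAML API, and the set may not match every FLAML release.

```python
# Built-in estimator names as listed in the overview of this notebook
# (an assumption: the exact set can differ between FLAML versions)
KNOWN_ESTIMATORS = {
    "lgbm", "xgboost", "xgb_limitdepth", "rf", "extra_tree", "lrl1", "lrl2",
    "catboost", "kneighbor", "prophet", "arima", "sarimax", "transformer",
    "temporal_fusion_transformer",
}


def unknown_estimators(estimator_list, known=KNOWN_ESTIMATORS):
    """Return the names in estimator_list that are not in the known set."""
    return [name for name in estimator_list if name not in known]


print(unknown_estimators(["lgbm", "rf", "xgboost"]))  # []
print(unknown_estimators(["RGF", "lgbm"]))            # ['RGF']
```

Names flagged this way are not necessarily wrong: a custom learner registered beforehand would also be outside the built-in set.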

## Inference with FLAML

In [11]:
# Predicted class labels
y_pred = automl.predict(X_test)

# Predicted class probabilities, one column per class (iris has three classes)
y_pred_proba = automl.predict_proba(X_test)

acc_score = accuracy_score(y_test, y_pred)
print(acc_score)

0.9473684210526315
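
For a multiclass problem such as iris, the probability output has one column per class, and the predicted label typically corresponds to the column with the highest probability. The sketch below illustrates that relationship and the accuracy computation in plain Python. The probabilities and labels are made up for illustration, and `argmax_labels` and `accuracy` are hypothetical helpers, not FLAML or scikit-learn functions.

```python
def argmax_labels(proba_rows):
    """Pick, for each row, the index of the largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in proba_rows]


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up probabilities for four samples and three classes
proba = [
    [0.90, 0.05, 0.05],
    [0.10, 0.70, 0.20],
    [0.20, 0.30, 0.50],
    [0.40, 0.35, 0.25],
]
y_true = [0, 1, 2, 1]

y_hat = argmax_labels(proba)
print(y_hat)                    # [0, 1, 2, 0]
print(accuracy(y_true, y_hat))  # 0.75
```

If the test set holds 38 samples, the score printed above is consistent with 36 correct predictions, since 36 / 38 ≈ 0.9474.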


In [13]:
print('Best estimator:', automl.best_estimator)
print('Best hyperparameters:', automl.best_config)
print('Best accuracy on validation data: {0:.4g}'.format(1 - automl.best_loss))
print('Training time of the best model: {0:.4g} s'.format(automl.best_config_train_time))

Best estimator: rf
Best hyperparameters: {'n_estimators': 4, 'max_features': 0.6051754338344674, 'max_leaves': 4, 'criterion': 'entropy'}
Best accuracy on validation data: 0.9822
Training time of the best model: 0.121 s


## Customizing the metric

In [14]:
def custom_metric(X_val, y_val, estimator, labels, X_train, y_train,
                  weight_val=None, weight_train=None, config=None,
                  groups_val=None, groups_train=None):
    """Return (value to minimize, dict of metrics to log)."""
    from sklearn.metrics import log_loss
    import time
    # Validation loss, timing the per-sample prediction cost
    start = time.time()
    y_pred = estimator.predict_proba(X_val)
    pred_time = (time.time() - start) / len(X_val)
    val_loss = log_loss(y_val, y_pred, labels=labels,
                        sample_weight=weight_val)
    # Training loss, used to penalize the gap between training and validation
    y_pred = estimator.predict_proba(X_train)
    train_loss = log_loss(y_train, y_pred, labels=labels,
                          sample_weight=weight_train)
    alpha = 0.5
    # Equivalent to val_loss + alpha * (val_loss - train_loss)
    return val_loss * (1 + alpha) - alpha * train_loss, {
        "val_loss": val_loss, "train_loss": train_loss, "pred_time": pred_time
    }
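
The value returned by `custom_metric` can be rewritten as `val_loss + alpha * (val_loss - train_loss)`, so it adds a penalty that grows with the gap between validation and training loss. The sketch below checks this with made-up loss values in plain Python. `log_loss` here is a simplified, unweighted stand-in that assumes integer labels indexing the probability columns, and `penalized_loss` is a hypothetical helper name; neither is the scikit-learn or FLAML implementation.

```python
import math


def log_loss(y_true, proba_rows):
    """Mean negative log-probability assigned to the true class (unweighted)."""
    eps = 1e-15  # clip to avoid log(0)
    total = 0.0
    for label, row in zip(y_true, proba_rows):
        p = min(max(row[label], eps), 1.0)
        total += -math.log(p)
    return total / len(y_true)


def penalized_loss(val_loss, train_loss, alpha=0.5):
    """val_loss * (1 + alpha) - alpha * train_loss, as in custom_metric."""
    return val_loss * (1 + alpha) - alpha * train_loss


# Made-up numbers: the model fits training data better than validation data
val_loss, train_loss = 0.30, 0.10
objective = penalized_loss(val_loss, train_loss)
gap_form = val_loss + 0.5 * (val_loss - train_loss)
print(round(objective, 6), round(gap_form, 6))  # 0.4 0.4
```

With equal training and validation loss the penalty vanishes and the objective reduces to the validation loss itself.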

In [None]:
automl = AutoML()
settings = {"time_budget": 10,                                         # time budget in seconds
            "metric": custom_metric,                                   # pass the function instead of a metric name
            "task": 'classification',                                  # task type
            "log_file_name": 'airlines_experiment_custom_metric.log',  # log file name
            }

automl.fit(X_train=X_train, y_train=y_train, **settings)