### FLAML
FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently, and economically. It frees users from selecting learners and tuning the hyperparameters of each learner.

### Main Features
For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It supports both classical machine learning models and deep neural networks.

It is easy to customize or extend. Customization comes in a smooth range of levels: minimal (only the computational resource budget), medium (e.g., a scikit-style learner, search space, and metric), or full (arbitrary training and evaluation code). Users customize only what they need, when they need it, and leave the rest to the library.
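
As an illustration of the medium level, the sketch below defines a custom metric as a plain Python function. It assumes the callable signature that FLAML's documentation describes for custom metrics (validation data, estimator, labels, training data, optional weights) and a return value of `(value_to_minimize, metrics_to_log)`; check this against the installed version. The name `error_rate_metric` and the `MajorityClassStub` estimator are hypothetical and exist only so the function can be exercised without running a search.

```python
def error_rate_metric(X_val, y_val, estimator, labels, X_train, y_train,
                      weight_val=None, weight_train=None, *args):
    """Return (value to minimize, dict of extra metrics to log)."""
    val_pred = estimator.predict(X_val)
    train_pred = estimator.predict(X_train)
    # Fraction of misclassified rows on each split
    val_error = sum(p != y for p, y in zip(val_pred, y_val)) / len(y_val)
    train_error = sum(p != y for p, y in zip(train_pred, y_train)) / len(y_train)
    return val_error, {"val_error": val_error, "train_error": train_error}


class MajorityClassStub:
    """Hypothetical stand-in estimator that always predicts class 0."""

    def predict(self, X):
        return [0 for _ in X]


value, logged = error_rate_metric(
    [[1], [2], [3], [4]], [0, 0, 1, 0],  # validation rows and labels
    MajorityClassStub(), [0, 1],          # estimator and label set
    [[5], [6]], [0, 1],                   # training rows and labels
)
print(value, logged)
```

Under the same assumption, the function would then be passed in place of a metric name, e.g. `automl.fit(..., metric=error_rate_metric)`.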

It supports fast and economical automatic tuning, and can handle large search spaces with heterogeneous evaluation costs and complex constraints, guidance, and early stopping. FLAML is powered by a cost-effective hyperparameter optimization and learner selection method invented by Microsoft Research.
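
To make "heterogeneous evaluation cost" under a budget concrete, here is a toy sketch. It is plain random search with a cost cap, not FLAML's search method, which this notebook does not describe. The search space, the cost model, and the error function are all hypothetical.

```python
import random


def budgeted_random_search(sample_config, evaluate, cost_of, budget, seed=0):
    """Random search that stops before the cumulative trial cost exceeds `budget`."""
    rng = random.Random(seed)
    best_config, best_error, spent = None, float("inf"), 0.0
    while True:
        config = sample_config(rng)
        cost = cost_of(config)
        if spent + cost > budget:
            break  # the next trial would not fit in the remaining budget
        spent += cost
        error = evaluate(config)
        if error < best_error:
            best_config, best_error = config, error
    return best_config, best_error, spent


# Hypothetical toy problem: larger models cost more to evaluate but have lower error.
def sample_config(rng):
    return {"n_estimators": rng.choice([4, 8, 16, 32, 64])}


def cost_of(config):
    return config["n_estimators"] / 4.0


def evaluate(config):
    return 1.0 / (1.0 + config["n_estimators"])


best_config, best_error, spent = budgeted_random_search(
    sample_config, evaluate, cost_of, budget=30.0
)
print(best_config, round(best_error, 4), spent)
```

Because trials differ in cost, the number of configurations tried within the same budget depends on which ones are sampled. A cost-aware method aims to spend that budget more carefully than this uniform sampling does.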

In [1]:
# Installation
!pip install flaml

Collecting flaml
  Downloading FLAML-1.0.13-py3-none-any.whl (205 kB)
Collecting lightgbm>=2.3.1
  Downloading lightgbm-3.3.3-py3-none-win_amd64.whl (1.0 MB)
Collecting xgboost>=0.90
  Downloading xgboost-1.7.1-py3-none-win_amd64.whl (89.1 MB)
Installing collected packages: xgboost, lightgbm, flaml
Successfully installed flaml-1.0.13 lightgbm-3.3.3 xgboost-1.7.1




### A basic classification example

In [2]:
from flaml import AutoML
from sklearn.datasets import load_iris

# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 1,  # in seconds
    "metric": 'accuracy',
    "task": 'classification',
    "log_file_name": "iris.log",
}
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)


[flaml.automl: 11-14 23:52:40] {2600} INFO - task = classification
[flaml.automl: 11-14 23:52:40] {2602} INFO - Data split method: stratified
[flaml.automl: 11-14 23:52:40] {2605} INFO - Evaluation method: cv
[flaml.automl: 11-14 23:52:40] {2727} INFO - Minimizing error metric: 1-accuracy
[flaml.automl: 11-14 23:52:40] {2869} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'xgboost', 'extra_tree', 'xgb_limitdepth', 'lrl1']
[flaml.automl: 11-14 23:52:40] {3164} INFO - iteration 0, current learner lgbm
[flaml.automl: 11-14 23:52:41] {3297} INFO - Estimated sufficient time budget=286s. Estimated necessary time budget=7s.
[flaml.automl: 11-14 23:52:41] {3344} INFO -  at 0.0s,	estimator lgbm's best error=0.0733,	best estimator lgbm's best error=0.0733
[flaml.automl: 11-14 23:52:41] {3164} INFO - iteration 1, current learner lgbm
[flaml.automl: 11-14 23:52:41] {3344} INFO -  at 0.1s,	estimator lgbm's best error=0.0733,	best estimator lgbm's best error=0.0733
[flaml.automl: 11-14 23:
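
The log reports an error to minimize (`1-accuracy`), so a best error of 0.0733 corresponds to a cross-validated accuracy of about 0.9267. The sketch below pulls those numbers out of progress lines copied from the output above. The regular expression is written against this particular log format and is an assumption; other FLAML versions may format the lines differently.

```python
import re

# Pattern for progress lines such as:
#   "... at 0.0s, estimator lgbm's best error=0.0733, best estimator lgbm's best error=0.0733"
PROGRESS = re.compile(
    r"at (?P<elapsed>[\d.]+)s,\s+estimator (?P<learner>\w+)'s best error=(?P<error>[\d.]+),"
    r"\s+best estimator (?P<best_learner>\w+)'s best error=(?P<best_error>[\d.]+)"
)


def parse_progress(line):
    """Return a dict for a progress line, or None for any other log line."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return {
        "elapsed_s": float(match["elapsed"]),
        "learner": match["learner"],
        "best_learner": match["best_learner"],
        "best_error": float(match["best_error"]),
    }


log_lines = [
    "[flaml.automl: 11-14 23:52:40] {3164} INFO - iteration 0, current learner lgbm",
    "[flaml.automl: 11-14 23:52:41] {3344} INFO -  at 0.0s,\testimator lgbm's best error=0.0733,\tbest estimator lgbm's best error=0.0733",
    "[flaml.automl: 11-14 23:52:41] {3344} INFO -  at 0.1s,\testimator lgbm's best error=0.0733,\tbest estimator lgbm's best error=0.0733",
]
records = [r for r in map(parse_progress, log_lines) if r is not None]
best_accuracy = round(1 - min(r["best_error"] for r in records), 4)
print(records[-1]["best_learner"], best_accuracy)
```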

In [3]:
# Predict
print(automl.predict_proba(X_train))
# Print the best model
print(automl.model.estimator)

[[0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02250246]
 [0.9530079  0.0244897  0.02

### A basic regression example

In [4]:
from flaml import AutoML
from sklearn.datasets import fetch_california_housing

# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 1,  # in seconds
    "metric": 'r2',
    "task": 'regression',
    "log_file_name": "california.log",
}
X_train, y_train = fetch_california_housing(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)


[flaml.automl: 11-14 23:54:36] {2600} INFO - task = regression
[flaml.automl: 11-14 23:54:36] {2602} INFO - Data split method: uniform
[flaml.automl: 11-14 23:54:36] {2605} INFO - Evaluation method: holdout
[flaml.automl: 11-14 23:54:36] {2727} INFO - Minimizing error metric: 1-r2
[flaml.automl: 11-14 23:54:36] {2869} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'xgboost', 'extra_tree', 'xgb_limitdepth']
[flaml.automl: 11-14 23:54:36] {3164} INFO - iteration 0, current learner lgbm
[flaml.automl: 11-14 23:54:36] {3297} INFO - Estimated sufficient time budget=160s. Estimated necessary time budget=1s.
[flaml.automl: 11-14 23:54:36] {3344} INFO -  at 0.0s,	estimator lgbm's best error=0.7393,	best estimator lgbm's best error=0.7393
[flaml.automl: 11-14 23:54:36] {3164} INFO - iteration 1, current learner lgbm
[flaml.automl: 11-14 23:54:36] {3344} INFO -  at 0.1s,	estimator lgbm's best error=0.7393,	best estimator lgbm's best error=0.7393
[flaml.automl: 11-14 23:54:36] {3164} IN
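
For regression the minimized quantity is `1-r2`, so the early best error of 0.7393 corresponds to an R² of about 0.2607 on the holdout set. As a reminder of what that number measures, here is a minimal pure-Python R² using the standard definition (one minus the residual sum of squares over the total sum of squares). It is an illustration, not the code FLAML runs.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(y_true, y_true)                # error 1 - r2 is 0
mean_only = r2_score(y_true, [2.5] * 4)           # error 1 - r2 is 1
partial = r2_score(y_true, [1.5, 2.5, 2.5, 3.5])  # somewhere in between
print(perfect, mean_only, round(partial, 4))

# The value reported in the log above, converted back to R²
print(round(1 - 0.7393, 4))
```

A model that always predicts the mean has an error of 1 under this metric, so values below 1 indicate an improvement over that baseline.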

In [5]:
# Predict
print(automl.predict(X_train))
# Print the best model
print(automl.model.estimator)

[4.06421067 3.90548325 3.99106909 ... 0.86693755 0.95347687 0.82396956]
LGBMRegressor(colsample_bytree=0.6649148062238498,
              learning_rate=0.17402065726724145, max_bin=255,
              min_child_samples=3, n_estimators=148, num_leaves=18,
              reg_alpha=0.0009765625, reg_lambda=0.006761362450996487,
              verbose=-1)


### AutoML - Time Series Forecast

In [6]:
!pip install "flaml[ts_forecast]"

Collecting prophet>=1.0.1
  Downloading prophet-1.1.1-cp39-cp39-win_amd64.whl (12.1 MB)
Collecting hcrystalball==0.1.10
  Downloading hcrystalball-0.1.10-py2.py3-none-any.whl (786 kB)
Collecting holidays<0.14
  Downloading holidays-0.13-py3-none-any.whl (172 kB)
Collecting workalendar>=10.1
  Downloading workalendar-16.4.0-py3-none-any.whl (208 kB)
Collecting prophet>=1.0.1
  Downloading prophet-1.1-cp39-cp39-win_amd64.whl (12.1 MB)
Collecting cmdstanpy>=1.0.1
  Downloading cmdstanpy-1.0.8-py3-none-any.whl (81 kB)
Collecting pyluach
  Downloading pyluach-2.0.2-py3-none-any.whl (22 kB)
Collecting tzdata
  Downloading tzdata-2022.6-py2.py3-none-any.whl (338 kB)
Collecting lunardate
  Downloading lunardate-0.2.0-py3-none-any.whl (5.6 kB)
Installing collected packages: tzdata, pyluach, lunardate, workalendar, holidays, cmdstanpy, prophet, hcrystalball
  Attempting uninstall: holidays
    Found existing installation: holidays 0.16
    Uninstalling holidays-0.16:
      Successfully uninstall

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fbprophet 0.7.1 requires cmdstanpy==0.9.5, but you have cmdstanpy 1.0.8 which is incompatible.


In [7]:
import numpy as np
from flaml import AutoML

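# 96 monthly timestamps, 2014-01 through 2021-12 (the end of np.arange is exclusive)
X_train = np.arange('2014-01', '2022-01', dtype='datetime64[M]')
# Synthetic target: 84 random values, one per training month (2014-01 through 2020-12)
y_train = np.random.random(size=84)
automl = AutoML()
automl.fit(X_train=X_train[:84],  # a single column of timestamps
           y_train=y_train,  # value for each timestamp
           period=12,  # time horizon to forecast, e.g., 12 months
           task='ts_forecast', time_budget=15,  # time budget in seconds
           log_file_name="ts_forecast.log",
           eval_method="holdout",
           )
# Forecast the 12 held-back months (2021-01 through 2021-12)
print(automl.predict(X_train[84:]))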

[flaml.automl: 11-14 23:55:48] {2600} INFO - task = ts_forecast
[flaml.automl: 11-14 23:55:48] {2602} INFO - Data split method: time
[flaml.automl: 11-14 23:55:48] {2605} INFO - Evaluation method: holdout
[flaml.automl: 11-14 23:55:48] {2727} INFO - Minimizing error metric: mape
Importing plotly failed. Interactive plots will not work.
[flaml.automl: 11-14 23:55:49] {2869} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'xgboost', 'extra_tree', 'xgb_limitdepth', 'prophet', 'arima', 'sarimax']
[flaml.automl: 11-14 23:55:49] {3164} INFO - iteration 0, current learner lgbm
[flaml.automl: 11-14 23:55:50] {3297} INFO - Estimated sufficient time budget=11275s. Estimated necessary time budget=11s.
[flaml.automl: 11-14 23:55:50] {3344} INFO -  at 1.6s,	estimator lgbm's best error=2.0298,	best estimator lgbm's best error=2.0298
[flaml.automl: 11-14 23:55:50] {3164} INFO - iteration 1, current learner lgbm
[flaml.automl: 11-14 23:55:50] {3344} INFO -  at 1.6s,	estimator lgbm's best erro

[0.35592384 0.66935056 0.42003267 0.24122833 0.66935056 0.42003267
 0.67638959 0.50795631 0.48049062 0.63562776 0.46613119 0.39557645]
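
The forecasting run minimizes `mape` (mean absolute percentage error). The sketch below implements the standard definition in pure Python, expressed as a fraction rather than a percentage. That FLAML reports the value on the same scale is an assumption. It also shows why a best error above 1, as in the log above, is plausible here: the targets are random values in [0, 1), and true values near zero inflate the ratio.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.1 means 10%)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Well-scaled targets: errors of 10%, 10%, and 0%
moderate = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])

# A target close to zero dominates the average even if the other point is exact
inflated = mape([0.01, 0.5], [0.5, 0.5])

print(round(moderate, 4), round(inflated, 2))
```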
