
# **FLAML - Fast, Lightweight and Economic AutoML**

In [1]:
!pip install flaml

Collecting flaml
  Downloading FLAML-0.6.5-py3-none-any.whl (156 kB)

## Classification on Iris Data

In [2]:
from flaml import AutoML
from sklearn.datasets import load_iris
# Initialize an AutoML instance
automl = AutoML()
# Specify the AutoML goal and constraints
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": "accuracy",
    "task": "classification",
    "log_file_name": "iris.log",
}
# Load the full Iris dataset (no hold-out split is made here)
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train, **automl_settings)

[flaml.automl: 10-02 06:58:43] {1432} INFO - Evaluation method: cv
[flaml.automl: 10-02 06:58:43] {1478} INFO - Minimizing error metric: 1-accuracy
[flaml.automl: 10-02 06:58:43] {1515} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'catboost', 'xgboost', 'extra_tree', 'lrl1']
[flaml.automl: 10-02 06:58:43] {1748} INFO - iteration 0, current learner lgbm
[flaml.automl: 10-02 06:58:43] {1866} INFO - Estimated sufficient time budget=524s. Estimated necessary time budget=10s.
[flaml.automl: 10-02 06:58:43] {1944} INFO -  at 0.1s,	estimator lgbm's best error=0.0733,	best estimator lgbm's best error=0.0733
[flaml.automl: 10-02 06:58:43] {1748} INFO - iteration 1, current learner lgbm
[flaml.automl: 10-02 06:58:43] {1944} INFO -  at 0.1s,	estimator lgbm's best error=0.0733,	best estimator lgbm's best error=0.0733
[flaml.automl: 10-02 06:58:43] {1748} INFO - iteration 2, current learner lgbm
[flaml.automl: 10-02 06:58:43] {1944} INFO -  at 0.2s,	estimator lgbm's best error=0.0533,	b
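
The progress lines above report an error of `1-accuracy`, so the best accuracy so far is one minus the printed value. As a minimal sketch, the snippet below pulls those numbers out of the console text with a regular expression. The pattern and the helper name `parse_progress` are assumptions written against the format shown in this notebook; they are not part of FLAML, and the format may differ in other versions.

```python
import re

# Lines copied from the console output above. The last one is cut off in the
# captured output and is kept only to show that incomplete lines are skipped.
LOG_LINES = [
    "[flaml.automl: 10-02 06:58:43] {1944} INFO -  at 0.1s,\testimator lgbm's best error=0.0733,\tbest estimator lgbm's best error=0.0733",
    "[flaml.automl: 10-02 06:58:43] {1748} INFO - iteration 1, current learner lgbm",
    "[flaml.automl: 10-02 06:58:43] {1944} INFO -  at 0.1s,\testimator lgbm's best error=0.0733,\tbest estimator lgbm's best error=0.0733",
    "[flaml.automl: 10-02 06:58:43] {1944} INFO -  at 0.2s,\testimator lgbm's best error=0.0533,\tb",
]

# Hypothetical pattern for the "at <t>s, estimator ... best error=..." lines.
PROGRESS = re.compile(
    r"at (?P<elapsed>[\d.]+)s,"
    r"\s+estimator (?P<learner>\w+)'s best error=(?P<error>[\d.]+),"
    r"\s+best estimator (?P<best>\w+)'s best error=(?P<best_error>[\d.]+)"
)


def parse_progress(lines):
    """Return one dict per complete progress line; other lines are ignored."""
    records = []
    for line in lines:
        match = PROGRESS.search(line)
        if match is None:
            continue  # iteration headers and truncated lines do not match
        best_error = float(match.group("best_error"))
        records.append({
            "elapsed_s": float(match.group("elapsed")),
            "learner": match.group("learner"),
            "best_learner": match.group("best"),
            "best_error": best_error,
            # The log states the metric is 1-accuracy, so invert it here.
            "accuracy": 1.0 - best_error,
        })
    return records


for record in parse_progress(LOG_LINES):
    print(record["elapsed_s"], record["best_learner"], round(record["accuracy"], 4))
```

Parsing the console text is only a convenience for quick inspection; the file named by `log_file_name` is the more durable record of the run.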

In [3]:
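# Predict class probabilities (on the training data, so this is not a held-out evaluation)
print(automl.predict_proba(X_train))
# Show the best model found during the search
print(automl.model)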

[[1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [0.82142857 0.16785714 0.01071429]
 [0.82142857 0.16785714 0.01071429]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [0.82142857 0.16785714 0.01071429]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.  
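
Each row printed above holds one probability per class and sums to 1. A minimal sketch of turning such rows into a predicted class index and a confidence value follows. It uses two distinct rows copied from the output; the helper name `to_labels` is hypothetical. For Iris the column index coincides with the integer target (0, 1, 2), which is an assumption that need not hold for other label encodings.

```python
# Two distinct rows copied from the predict_proba output above.
probabilities = [
    [1.0, 0.0, 0.0],
    [0.82142857, 0.16785714, 0.01071429],
]


def to_labels(rows):
    """Return, for each row, the column index holding the largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


labels = to_labels(probabilities)
confidences = [max(row) for row in probabilities]

for row, label, confidence in zip(probabilities, labels, confidences):
    # Each row should sum to (approximately) 1.
    print(label, confidence, round(sum(row), 6))
```

In practice `automl.predict(X_train)` returns the labels directly; the manual step is useful mainly when a confidence threshold is needed.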

## Regression on Boston Data

In [4]:
from flaml import AutoML
from sklearn.datasets import load_boston
# Initialize an AutoML instance
automl = AutoML()
# Specify the AutoML goal and constraints
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": "r2",
    "task": "regression",
    "log_file_name": "boston.log",
}
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell only runs on older scikit-learn versions (see the warning below).
X_train, y_train = load_boston(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train, **automl_settings)


    The Boston housing prices dataset has an ethical problem. You can refer to
    the documentation of this function for further details.

    The scikit-learn maintainers therefore strongly discourage the use of this
    dataset unless the purpose of the code is to study and educate about
    ethical issues in data science and machine learning.

    In this case special case, you can fetch the dataset from the original
    source::

        import pandas as pd
        import numpy as np


        data_url = "http://lib.stat.cmu.edu/datasets/boston"
        raw_df = pd.read_csv(data_url, sep="\s+", skiprows=22, header=None)
        data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
        target = raw_df.values[1::2, 2]

    Alternative datasets include the California housing dataset (i.e.
    func:`~sklearn.datasets.fetch_california_housing`) and the Ames housing
    dataset. You can load the datasets as follows:

        from sklearn.datasets import fetch_californi

In [5]:
# Predict (on the training data, so this is not a held-out evaluation)
print(automl.predict(X_train))
# Show the best model found during the search
print(automl.model)

[25.85426879 22.17737651 33.79509964 33.47189557 35.61127167 26.6926382
 21.53656446 20.99173684 16.18673463 18.75433976 18.20961633 20.48286627
 20.20745331 19.5131759  18.82401423 19.81864373 21.91420674 17.53721008
 18.69698259 19.07769813 14.3021372  18.32367106 16.63348583 15.34809335
 16.27978499 15.13176242 16.94561436 15.20221211 18.67737191 20.75076542
 13.88486622 17.70376242 13.89318927 14.90744983 14.32837615 21.02020963
 20.89410764 21.52937191 23.19568492 30.3824402  34.45306735 29.43834549
 24.62262048 24.62262048 22.21884844 20.87353145 20.36481607 18.09945043
 15.55273032 18.67322033 20.4058421  21.53510487 24.1301244  21.73345138
 17.74586736 34.48285914 23.17178682 31.55632404 23.26235097 20.39233374
 18.98627393 17.89811291 22.71262836 25.23349363 32.76974079 25.04432847
 19.83367435 20.94710592 19.92700235 21.23614492 23.62275307 21.08771163
 22.31713619 23.46816704 24.39319272 22.93449232 21.05889974 21.25605159
 21.17094097 21.4104049  26.32531539 24.84165632 23.
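
The regression run above was configured with `"metric": "r2"`. As a minimal sketch of what that score measures, the snippet below computes the coefficient of determination, R² = 1 − SS_res / SS_tot, in plain Python. The function name `r2` and the toy numbers are illustrative assumptions; they are not values from the Boston data or from FLAML.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Toy values, chosen only to illustrate the formula.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print(round(r2(y_true, y_pred), 4))
# A perfect prediction scores 1; always predicting the mean scores 0.
print(r2(y_true, y_true))
print(r2(y_true, [sum(y_true) / len(y_true)] * len(y_true)))
```

Because the predictions printed above were made on the same rows used for training, an R² computed from them would describe training fit rather than generalization.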

## Time Series Forecast

In [1]:
!pip install "flaml[forecast]"

In [2]:
import numpy as np
from flaml import AutoML
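# Create 84 monthly timestamps, 2014-01 through 2020-12 (the stop value is exclusive)
X_train = np.arange("2014-01", "2021-01", dtype="datetime64[M]")
# Random targets for the first 72 months (2014-01 through 2019-12); placeholder data only
y_train = np.random.random(size=72)
# Initialize an AutoML instance
automl = AutoML()
# Train on the first 72 months; the last 12 are held back for prediction
automl.fit(X_train=X_train[:72],
           y_train=y_train,
           period=12,  # forecast horizon, in time steps
           task="forecast",
           time_budget=15,  # in seconds
           log_file_name="forecast.log",
           )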

[flaml.automl: 10-02 07:34:50] {1432} INFO - Evaluation method: cv
[flaml.automl: 10-02 07:34:50] {1478} INFO - Minimizing error metric: mape
[flaml.automl: 10-02 07:34:51] {1515} INFO - List of ML learners in AutoML Run: ['prophet', 'arima', 'sarimax']
[flaml.automl: 10-02 07:34:51] {1748} INFO - iteration 0, current learner prophet
INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:prophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
INFO:prophet:n_changepoints greater than number of observations. Using 8.
INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:prophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True 

In [3]:
# Forecast the 12 held-back months (2020-01 through 2020-12)
print(automl.predict(X_train[72:]))

0     0.585458
1     0.695978
2     0.529412
3     0.763986
4     0.914638
5     0.439150
6     0.731521
7     0.669782
8     0.718833
9     0.277179
10    0.559964
11    0.413329
Name: yhat, dtype: float64
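
The forecasting log above reports that the search minimizes `mape`, the mean absolute percentage error. A minimal sketch of that metric in plain Python follows. The function name `mape` and the toy numbers are illustrative assumptions; the training targets in this notebook are random, so no true values exist to compare the printed forecast against.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


# Toy values, chosen only to illustrate the formula.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

# Per-point errors are 10%, 10% and 0%, so the mean is about 6.67%.
print(round(mape(actual, forecast), 4))
```

MAPE weights errors relative to the size of the actual value, so it becomes unstable for targets close to zero, which is worth keeping in mind for data like the uniform random series used here.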
