![](https://github.com/microsoft/FLAML/raw/main/docs/images/FLAML.png)

**FLAML** (https://github.com/microsoft/FLAML) is a lightweight Python library that automatically finds accurate machine learning models at low computational cost. It frees users from choosing a learner and tuning its hyperparameters by hand. Its simple, lightweight design makes it easy to extend, for example with custom learners or metrics. FLAML is powered by a cost-effective hyperparameter optimization and learner selection method developed by Microsoft Research.

In [None]:
!pip install flaml[notebook];

In [None]:
import pandas as pd
df = pd.read_csv('../input/tabular-playground-series-aug-2021/train.csv')
df.head()

In [None]:
# Import the AutoML class from the flaml package
from flaml import AutoML
automl = AutoML()

In [None]:
X = df.drop(['id', 'loss'], axis=1)
y = df['loss']

In [None]:
settings = {
    "time_budget": 21600,  # total running time in seconds (6 hours)
    "metric": 'rmse',  # primary metric; regression options include 'rmse', 'mae', 'mse', 'r2'
    "estimator_list": ['xgboost', 'lgbm', 'catboost'],  # list of ML learners to search over
    "task": 'regression',  # task type
    "log_file_name": 'kaggle_experiment.log',  # flaml log file
}

In [None]:
# The main FLAML AutoML API: search for the best learner and configuration
automl.fit(X_train=X, y_train=y, verbose=0, **settings)

In [None]:
automl.model.estimator

In [None]:
# Retrieve the best configuration found during the search
print('Best hyperparameter config:', automl.best_config)
print('Best rmse on validation data: {0:.4g}'.format(automl.best_loss))
print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))
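
The validation score printed above is a root mean squared error (RMSE). As a minimal sketch of what that number measures, the helper below computes RMSE with NumPy. The function name `rmse` and the sample values are illustrative only and are not part of FLAML.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two arrays of the same shape."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up values, only to show the call.
example_true = [3.0, 0.0, 5.0, 2.0]
example_pred = [1.0, 0.0, 5.0, 4.0]
print(rmse(example_true, example_pred))
```

Applied to rows that were held out from `fit`, the same helper could be used to compare predictions from `automl.predict` against the known `loss` values.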

In [None]:
from flaml.data import get_output_from_log
# Read back the whole search history by reusing the training time budget
time_history, best_valid_loss_history, valid_loss_history, config_history, train_loss_history = \
    get_output_from_log(filename=settings['log_file_name'], time_budget=settings['time_budget'])

In [None]:
import matplotlib.pyplot as plt
import numpy as np

plt.title('Learning Curve')
plt.xlabel('Wall Clock Time (s)')
plt.ylabel('Validation rmse')
plt.scatter(time_history, np.array(valid_loss_history))
plt.step(time_history, np.array(best_valid_loss_history), where='post')
plt.show()
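
In the plot, the scatter shows the validation loss of every trial and the step line shows the best loss found so far. Assuming `best_valid_loss_history` is the running minimum of `valid_loss_history` (an assumption about the log contents, not something checked here), the relationship can be sketched with NumPy on made-up values:

```python
import numpy as np

# Hypothetical per-trial validation losses, in the order the trials finished.
trial_losses = np.array([8.2, 7.9, 8.0, 7.85, 7.95])

# Running minimum: the best loss observed up to and including each trial.
best_so_far = np.minimum.accumulate(trial_losses)
print(best_so_far)
```

A flat stretch in the step line therefore means that the trials in that interval did not improve on the best loss so far.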

In [None]:
# Compute predictions for the test dataset
test = pd.read_csv('../input/tabular-playground-series-aug-2021/test.csv')
# Drop 'id' so the test features match the columns used for training
predictions = automl.predict(test.drop(columns=['id']))
print('Predicted loss values:', predictions)

In [None]:
submission = pd.read_csv('../input/tabular-playground-series-aug-2021/sample_submission.csv')
submission['loss'] = predictions
submission.to_csv('submission_flaml.csv', index=False)
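
Before uploading, it can help to sanity-check the submission frame. The sketch below is one possible check, not part of FLAML or Kaggle tooling. The helper name `check_submission` is hypothetical, and it assumes the frame has an `id` column alongside the `loss` column.

```python
import pandas as pd


def check_submission(submission, expected_rows, target_col="loss"):
    """Raise ValueError if the frame looks unsafe to upload; return True otherwise."""
    if len(submission) != expected_rows:
        raise ValueError(f"expected {expected_rows} rows, got {len(submission)}")
    if submission[target_col].isna().any():
        raise ValueError(f"column {target_col!r} contains missing values")
    if submission["id"].duplicated().any():
        raise ValueError("column 'id' contains duplicates")
    return True


# Tiny made-up frame, only to show the call.
demo = pd.DataFrame({"id": [0, 1, 2], "loss": [6.1, 7.3, 8.0]})
print(check_submission(demo, expected_rows=3))
```

In this notebook the call would look like `check_submission(submission, expected_rows=len(test))`.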