# AutoML with FLAML

Automated machine learning holds much promise as an assistive tool for practitioners working with data. In this notebook, we demonstrate the [FLAML](https://github.com/microsoft/FLAML) library for AutoML. FLAML is a fast and lightweight AutoML library that finds accurate machine learning models efficiently.

We'll work with a small sample of the [credit card fraud dataset](https://www.kaggle.com/mlg-ulb/creditcardfraud). We'll first load the data and perform some automated data profiling. Then, we will use FLAML to automatically train and tune a classifier to detect fraud, optimizing for Average Precision due to the class imbalance.

## Setup

We import all dependencies up front. 
**Note:** Ensure you have installed FLAML in your environment (`pip install flaml`).

In [15]:
import mlflow
import mlflow.sklearn
mlflow.set_experiment("fraud_detection_flaml_v3")

2026/01/28 05:46:44 INFO mlflow.tracking.fluent: Experiment with name 'fraud_detection_flaml_v3' does not exist. Creating a new experiment.


<Experiment: artifact_location='/home/cdsw/.experiments/c8ye-hfco-331k-oxps', creation_time=None, experiment_id='c8ye-hfco-331k-oxps', last_update_time=None, lifecycle_stage='active', name='fraud_detection_flaml_v3', tags={}>

In [16]:
import pydantic
import pydantic_core

print(f"Pydantic version: {pydantic.VERSION}")
print(f"Pydantic Core version: {pydantic_core.__version__}")

Pydantic version: 2.12.5
Pydantic Core version: 2.41.5


In [17]:
import os
import time
import pickle
import pandas as pd

# FLAML Import
from flaml import AutoML

from ydata_profiling import ProfileReport
from sklearn.model_selection import train_test_split
from sklearn.metrics import average_precision_score

### Load and profile the data

In [18]:
transactions = pd.read_csv("data/creditcardsample.csv")

Let's profile the data. Exploratory data analysis is not a mechanical process, so it cannot be fully automated. However, open-source tools can generate the most common charts for us. Here, we use [ydata-profiling](https://github.com/pandas-profiling/pandas-profiling) (formerly pandas-profiling, imported above as `ydata_profiling`) to view histograms of each variable and detect duplicate entries.

In [19]:
ProfileReport(transactions, interactions=None)


There are no missing values, but there are some duplicate rows. We don't want to accidentally pollute our test set with rows that also appear in the training set, so we'll drop them.

In [20]:
unique_transactions = transactions.drop_duplicates()
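
To see what `drop_duplicates` does before trusting it on the real data, here is a minimal sketch on a toy frame. The column values are made up for illustration and are not taken from the fraud dataset; `toy`, `n_duplicates`, and `toy_unique` are hypothetical names.

```python
import pandas as pd

# Toy frame standing in for the transactions data (values are made up).
toy = pd.DataFrame(
    {
        "V1": [0.1, 0.1, -1.3, 2.0],
        "Amount": [10.0, 10.0, 5.5, 99.9],
        "Class": [0, 0, 0, 1],
    }
)

# duplicated() flags each row that repeats an earlier row; the first copy is not flagged.
n_duplicates = int(toy.duplicated().sum())

# drop_duplicates() removes the flagged rows and keeps the original index labels.
toy_unique = toy.drop_duplicates()

print(f"{n_duplicates} duplicate row(s) removed, {len(toy_unique)} rows remain")
```

Running the same two calls on `transactions` would show how many rows the cell above removes.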

In [21]:
X = unique_transactions.drop(["Class", "Time"], axis="columns")
y = unique_transactions.Class

Split the data into train and test sets so we can evaluate performance on a held-out set.

In [22]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=True, stratify=y)
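
The `stratify=y` argument matters because fraud cases are rare: without it, a random split could leave very few positives in the test set. The sketch below illustrates this on synthetic labels, assuming a made-up 2% positive rate. The `random_state=0` argument is an addition for repeatability and is not part of the call above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic imbalanced labels: 1,000 samples, 2% positives (made-up numbers).
y_demo = np.array([1] * 20 + [0] * 980)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, shuffle=True, stratify=y_demo, random_state=0
)

# With stratify, both splits keep the original positive rate.
print(f"train positive rate: {y_tr.mean():.3f}")
print(f"test positive rate:  {y_te.mean():.3f}")
```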

## Fit a model with FLAML

We define the `AutoML` settings below. 

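We set `time_budget` to 30 seconds, which keeps the demonstration short. FLAML will try different algorithms (LightGBM, XGBoost, random forest, etc.) and hyperparameter configurations within this time limit. FLAML's own log output for this run estimates a sufficient budget of 224 seconds, so a longer budget may find a better model.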

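We use `metric='ap'` (Average Precision) because it focuses on how well the rare positive class is ranked, which makes it more informative than accuracy under class imbalance. This is similar to the original TPOT configuration.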
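
To make the metric concrete, here is a small worked example of Average Precision written in plain Python. It is a sketch that assumes no tied scores, not FLAML's or scikit-learn's implementation; the function name `average_precision` and the toy labels and scores are hypothetical.

```python
def average_precision(y_true, scores):
    """Average precision for binary labels, assuming no tied scores."""
    n_positives = sum(y_true)
    if n_positives == 0:
        raise ValueError("average precision is undefined without positive labels")

    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at this rank; recall rises by 1 / n_positives here.
            total += hits / rank
    return total / n_positives


# Two positives among eight examples: ranked 1st and 4th by score.
y_true = [0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.10, 0.40, 0.90, 0.80, 0.30, 0.20, 0.05, 0.15]

ap = average_precision(y_true, scores)
print(f"AP = {ap:.2f}, loss = 1 - AP = {1 - ap:.2f}")
```

The positives sit at ranks 1 and 4, so AP is the mean of 1/1 and 2/4, which is 0.75. Because FLAML minimizes a loss, it reports `1 - ap` for this metric, which would be 0.25 here.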

In [23]:
# Initialize the AutoML instance
automl = AutoML()

# Define settings
automl_settings = {
    "time_budget": 30,  # Total time limit in seconds
    "metric": 'ap',      # Average Precision (good for imbalanced fraud data)
    "task": 'classification',
    "log_file_name": 'flaml.log',
    "seed": 42,
    "n_jobs": -1,        # Use all available CPU cores
}

print(f"Starting AutoML with a time budget of {automl_settings['time_budget']} seconds...")

# Fit the model
with mlflow.start_run(run_name="flaml_run_v1"):
    # Run the FLAML search
    print("Starting FLAML training...")
    automl.fit(X_train=X_train, y_train=y_train, **automl_settings)

    # Report the best result
    print(f"Best ML learner: {automl.best_estimator}")
    print(f"Best loss: {automl.best_loss}")

    # Log the best hyperparameters (config)
    mlflow.log_params(automl.best_config)

    # FLAML minimizes a loss (here 1 - AP), so 1 - best_loss recovers the validation AP
    mlflow.log_metric("best_loss", automl.best_loss)
    mlflow.log_metric("best_ap_score", 1 - automl.best_loss)

    # Log the name of the best estimator (e.g., 'xgboost', 'lgbm')
    mlflow.log_param("best_estimator", automl.best_estimator)

    # Log FLAML's trial history file (the 'log_file_name' setting) as an artifact
    # so the full search history can be downloaded and inspected later
    mlflow.log_artifact(automl_settings["log_file_name"])

    # Log the final model in a format MLflow can serve or deploy later.
    # FLAML stores the best fitted model in automl.model.estimator.
    best_model = automl.model.estimator
    mlflow.sklearn.log_model(
        sk_model=best_model,
        artifact_path="model",
        input_example=X_train.head(1),
    )

    print("Run saved to MLflow!")

Starting AutoML with a time budget of 30 seconds...
Starting FLAML training...
[flaml.automl.logger: 01-28 05:47:37] {1752} INFO - task = classification
[flaml.automl.logger: 01-28 05:47:37] {1763} INFO - Evaluation method: holdout
[flaml.automl.logger: 01-28 05:47:37] {1862} INFO - Minimizing error metric: 1-ap
[flaml.automl.logger: 01-28 05:47:37] {1979} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'xgboost', 'extra_tree', 'xgb_limitdepth', 'sgd', 'lrl1']
[flaml.automl.logger: 01-28 05:47:37] {2282} INFO - iteration 0, current learner lgbm
[flaml.automl.logger: 01-28 05:47:37] {2417} INFO - Estimated sufficient time budget=224s. Estimated necessary time budget=5s.
[flaml.automl.logger: 01-28 05:47:38] {2466} INFO -  at 0.2s,	estimator lgbm's best error=0.1679,	best estimator lgbm's best error=0.1679
[flaml.automl.logger: 01-28 05:47:38] {2282} INFO - iteration 1, current learner lgbm
[flaml.automl.logger: 01-28 05:47:38] {2466} INFO -  at 0.4s,	estimator lgbm's best error

In [9]:
print(X_train.head(1))

            V1        V2        V3        V4        V5        V6        V7  \
3021  1.721567 -0.278398 -0.374122  1.620882 -0.643373 -0.399292 -0.529103   

            V8        V9       V10  ...       V20       V21       V22  \
3021  0.028126  1.068808 -0.411499  ... -0.002246  0.054291  0.135393   

           V23       V24      V25       V26       V27       V28  Amount  
3021  0.177179 -0.151525 -0.38247 -0.763582  0.079404  0.024146  103.71  

[1 rows x 29 columns]


## Evaluation

We can inspect the best model found and evaluate it on our test set.

In [None]:
# Retrieve the best model found
print('Best ML learner:', automl.best_estimator)
print('Best hyperparameter config:', automl.best_config)
print('Best Average Precision on validation data: {0:.4g}'.format(1-automl.best_loss))

# Evaluate on the test set
y_pred_proba = automl.predict_proba(X_test)[:, 1] # Get probabilities for class 1

# Calculate Average Precision Score manually
ap_score = average_precision_score(y_test, y_pred_proba)
print(f"Test Set Average Precision Score: {ap_score:.4f}")
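
When reading the test score, it helps to know what an uninformative model would achieve. The sketch below uses synthetic labels with a made-up 5% positive rate to show that a constant score yields an Average Precision equal to the positive rate. For the notebook's data, the analogous baseline would be `y_test.mean()`.

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Synthetic labels: 5 positives out of 100 (made-up numbers, not the fraud data).
y_demo = np.array([1] * 5 + [0] * 95)

# A model that gives every example the same score cannot rank positives first.
constant_scores = np.zeros(len(y_demo))

baseline_ap = average_precision_score(y_demo, constant_scores)
prevalence = y_demo.mean()

print(f"Baseline AP: {baseline_ap:.3f}, positive rate: {prevalence:.3f}")
```

A test-set AP well above the positive rate suggests the model ranks fraud cases meaningfully better than chance.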

## Save Model

We can save the optimized AutoML object using pickle.

In [None]:
with open('flaml_fraud_model.pkl', 'wb') as f:
    pickle.dump(automl, f)

print("Model saved to flaml_fraud_model.pkl")
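
Reloading works the same way in reverse. The sketch below shows a pickle round trip using a small scikit-learn model as a stand-in for the fitted `AutoML` object, so that it runs without the fraud data. The file name `demo_model.pkl` and the toy data are hypothetical.

```python
import os
import pickle
import tempfile

from sklearn.linear_model import LogisticRegression

# Stand-in for the fitted AutoML object (assumption: the round trip works the same way).
X_demo = [[0.0], [1.0], [2.0], [3.0]]
y_demo = [0, 0, 1, 1]
model = LogisticRegression().fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")  # hypothetical file name

    # Save the fitted model.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # Load it back into a new object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored object should reproduce the original predictions exactly.
same_predictions = bool(
    (restored.predict_proba(X_demo) == model.predict_proba(X_demo)).all()
)
print("Predictions match after reload:", same_predictions)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code. A pickled model is also generally tied to the library versions it was saved with.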

***If this documentation includes code, including but not limited to, code examples, Cloudera makes this available to you under the terms of the Apache License, Version 2.0, including any required notices. A copy of the Apache License Version 2.0 can be found [here](https://opensource.org/licenses/Apache-2.0).***