# HPO for Random Forest with Ray Tune and cuML

This notebook demonstrates how to perform hyperparameter optimization (HPO) for a Random Forest classifier using Ray Tune and cuML. We'll use Ray Tune to efficiently search through hyperparameter combinations while leveraging cuML's GPU-accelerated Random Forest implementation for faster training.

## Problem Overview

We're solving a binary classification problem using the airline dataset, where we predict flight delays. The goal is to find the optimal hyperparameters (number of estimators, max depth, and max features) that maximize the model's accuracy. Ray Tune will orchestrate multiple training trials in parallel, each testing different hyperparameter combinations, while cuML provides GPU acceleration for each individual model training.


### Setup Instructions

#### Brev

```{docref} /cloud/nvidia/brev
For the purpose of this example, follow Option 1 (Setting up your Brev GPU Environment) in the Brev Instance Setup section:
- Create a GPU environment with 4 L4 GPUs
- Make sure to include Jupyter in your setup
- Wait until the "Open Notebook" button is flashing
- Open the Notebook and navigate to a Jupyter terminal
```

#### Environment Setup with uv (included in Brev)

1. Check Your CUDA Version in the Jupyter terminal

Before installing dependencies, verify your CUDA version (shown in the top right corner of the output):

```bash
nvidia-smi
```

2. Create a file named `pyproject.toml` and copy the content below

Based on your CUDA version, choose the matching `cuML` package:

- **CUDA 12.x**: Use `cuml-cu12==26.2.*`
- **CUDA 13.x**: Change to `cuml-cu13==26.2.*`
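
The mapping above can also be scripted. The sketch below is only an illustration and not part of the original setup: the helper name `cuml_requirement` is hypothetical, and it assumes the `nvidia-smi` header contains a `CUDA Version: X.Y` field.

```python
import re

# Requirement strings listed in this guide, keyed by CUDA major version.
SUPPORTED_MAJORS = {12: "cuml-cu12==26.2.*", 13: "cuml-cu13==26.2.*"}


def cuml_requirement(nvidia_smi_output: str) -> str:
    """Hypothetical helper: pick the cuML requirement for the reported CUDA version."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    if match is None:
        raise ValueError("No 'CUDA Version' field found in nvidia-smi output")
    major = int(match.group(1))
    if major not in SUPPORTED_MAJORS:
        raise ValueError(f"No cuML wheel listed in this guide for CUDA {major}.x")
    return SUPPORTED_MAJORS[major]


# Example header line shaped like nvidia-smi output (values are illustrative).
sample = "| NVIDIA-SMI 570.00   Driver Version: 570.00   CUDA Version: 12.8 |"
print(cuml_requirement(sample))
```

In practice you would pass the captured output of `nvidia-smi` instead of the sample string.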

> **TODO (BLOCKED)**: CuPy 13 requires a system CUDA installation (or conda, which is not relevant here). CuPy 14, which is releasing soon, properly supports installations that have only the CUDA wheels; this support comes via the cuda-pathfinder project. Once CuPy 14 is out, upgrading CuPy in an existing cuML environment would fix this (e.g., keeping cuML or cuDF 26.02 but bumping CuPy in your own environment). Existing RAPIDS packages pin `cupy>=13.6`, which would resolve to 14 in a new environment or on upgrade, so this will effectively fix itself. Unfortunately, Brev does not have a system CUDA installed, at least not on GCP machines.

The `pyproject.toml` file should look like this:

```toml
[project]
name = "ray-cuml"
version = "0.1.0"
requires-python = "==3.13.*"
dependencies = [
    "ray[default]==2.53.0",
    "ray[data]==2.53.0",
    "ray[train]==2.53.0",
    "ray[tune]==2.53.0",
    "cuml-cu12==26.2.*",  # Change cu12 to cu13 if you have CUDA 13.x
    "jupyterlab-nvdashboard",
    "ipykernel",
]
```

3. Install Dependencies

```bash
uv sync
```

#### Enable Jupyter nvdashboard

We can use the `jupyterlab-nvdashboard` extension to monitor GPU usage in Jupyter.

To enable the extension, which was installed as part of the setup:

1. Restart Jupyter: `sudo systemctl restart jupyter.service`
2. Exit and reopen the notebook or refresh your browser

## Getting Started

Open a new notebook to get started with this example.

You should now see a button on the left panel that looks like a GPU, which will give you several dashboards to choose from. For the sake of this example, we will look at GPU memory and GPU Utilization.

![GPU Dashboard Button](../../_static/images/examples/cuml-ray-hpo/nvdashboard.png)


### Data Preparation

Copy the `get_data.py` script provided in the `setup` directory to your current Jupyter working directory.

Download the airline dataset. The script supports both a small dataset (for quick testing) and a full dataset (20M rows). By default, it downloads the small dataset. Use the `--full-dataset` flag for the complete dataset. 

In [None]:
! python get_data.py --full-dataset  # remove --full-dataset to download the small dataset

In [None]:
import pandas as pd
import ray
from cuml.ensemble import RandomForestClassifier
from cuml.metrics import accuracy_score
from ray import tune
from ray.tune import RunConfig, TuneConfig
from sklearn.model_selection import train_test_split

In [None]:
def train_rf(config, data_dict):
    """
    Training function for Ray Tune.

    Args:
        config: Dictionary of hyperparameters from Ray Tune
        data_dict: Dictionary containing training and test data
            (pandas DataFrames for features, pandas Series for labels)
    """
    # Extract data
    X_train = data_dict["X_train"]
    X_test = data_dict["X_test"]
    y_train = data_dict["y_train"]
    y_test = data_dict["y_test"]

    # Initialize cuML Random Forest with hyperparameters from config
    rf = RandomForestClassifier(
        n_estimators=config["n_estimators"],
        max_depth=config["max_depth"],
        max_features=config["max_features"],
        random_state=42,
    )

    # Train the model
    rf.fit(X_train, y_train)

    # Evaluate on test set
    predictions = rf.predict(X_test)

    # Calculate accuracy using cuML's metric function
    score = accuracy_score(y_test, predictions)

    # Report metrics back to Ray Tune
    return {"accuracy": score}

## Ray Tune Hyperparameter Search

Now we'll set up Ray Tune to search for optimal hyperparameters. Ray Tune will run multiple trials in parallel, each testing different combinations of hyperparameters. Each trial will train a cuML Random Forest model on a GPU and evaluate its performance.

**Important**: Modify the following according to your setup:
- `ray.init()` parameters: Adjust `num_cpus` and `num_gpus` to match your available resources if you are not using the Brev instance described above.
- `storage_path` in `RunConfig`: Set a valid local path to save Ray Tune results
- `resources` in `tune.with_resources()`: Configure CPU and GPU allocation per trial
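
To see how these settings interact, the sketch below estimates how many trials can run at the same time. It is a back-of-the-envelope illustration, not a Ray API: the helper name `max_concurrent_trials` is hypothetical, and it assumes a trial starts only when all of its requested CPUs and GPUs are free.

```python
def max_concurrent_trials(cluster: dict, per_trial: dict) -> int:
    """Return how many trials fit in the cluster at the same time."""
    return min(
        int(cluster[name] // amount)
        for name, amount in per_trial.items()
        if amount > 0
    )


# Values used in this notebook: ray.init(num_cpus=8, num_gpus=4)
# and resources={"cpu": 2, "gpu": 1} per trial.
cluster = {"cpu": 8, "gpu": 4}
per_trial = {"cpu": 2, "gpu": 1}
print(max_concurrent_trials(cluster, per_trial))  # → 4
```

With these values both CPUs and GPUs allow four trials at once, so neither resource sits idle while the other is the bottleneck.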


In [None]:
# Initialize Ray with resource constraints
# Note: If you see a FutureWarning about RAY_ACCEL_ENV_VAR_OVERRIDE_ON_ZERO, that's okay -
# it's just informing you about future Ray behavior changes and doesn't affect functionality.
ray.init(num_cpus=8, num_gpus=4)

# use airlines_small.parquet if you downloaded the small dataset
df = pd.read_parquet("data/airlines.parquet")

# Define the target label
label = "ArrDelayBinary"

# Prepare features and target
X = df.drop(columns=[label])  # All columns except the target
y = df[label]  # Just the target column


# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# TODO CHECK IF WE SHOULD CONVERT TO NUMPY ARRAYS FOR Ray Tune FOR zero copy efficiency on CPU?
# https://docs.ray.io/en/latest/ray-core/objects.html#fetching-object-data

# Store data in a dictionary to pass to training function
data_dict = {"X_train": X_train, "X_test": X_test, "y_train": y_train, "y_test": y_test}

**Access Ray Dashboard**: The dashboard is available at `http://127.0.0.1:8265` on the Brev instance. To access it from your local machine, run in your local terminal:

```bash
brev port-forward <your-instance-name> -p 8265:8265
```
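
If the dashboard does not load, a quick way to tell whether anything is listening on the port is a plain TCP check. The sketch below uses only the Python standard library; the helper name `port_is_open` is hypothetical, and a successful connection only shows that some service is listening there, not that it is the Ray dashboard.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8265 is the dashboard port mentioned above.
print("Port 8265 reachable:", port_is_open("127.0.0.1", 8265))
```

Run it on the Brev instance after `ray.init()`, or on your local machine once the port forward is active.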

In [None]:
import os

# Define hyperparameter search space
search_space = {
    "n_estimators": tune.grid_search([25, 50, 75, 100]),
    "max_depth": tune.grid_search([10, 20, 30, 40]),
    "max_features": tune.grid_search([0.25, 0.5, 0.75, 1.0]),
}

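# No search algorithm is specified, so Tune's default variant generator is used.
# Because every entry above is a grid_search, it evaluates all combinations
# (4 x 4 x 4 = 64 trials).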
tune_config = TuneConfig(
    metric="accuracy",
    mode="max",
)

run_config = RunConfig(
    name="rf_hyperparameter_tuning_real_data",
    storage_path=os.path.abspath("<your-path>/ray_results"),
)

# Create a trainable with resources
trainable = tune.with_resources(
    tune.with_parameters(train_rf, data_dict=data_dict),
    resources={"cpu": 2, "gpu": 1},  # Each trial uses 1 GPU and 2 CPUs
)

# Run the hyperparameter tuning
tuner = tune.Tuner(
    trainable,
    param_space=search_space,
    tune_config=tune_config,
    run_config=run_config,
)

results = tuner.fit()

# Get the best result
best_result = results.get_best_result(metric="accuracy", mode="max")
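
Ray Tune expands `tune.grid_search` entries into their Cartesian product. The sketch below reproduces that expansion with the standard library only, to make the trial count explicit; it does not call Ray. The figure of four trials at a time is an assumption based on the 4 GPUs and 1 GPU per trial configured in this notebook.

```python
import math
from itertools import product

# Same candidate values as the search space above.
grid = {
    "n_estimators": [25, 50, 75, 100],
    "max_depth": [10, 20, 30, 40],
    "max_features": [0.25, 0.5, 0.75, 1.0],
}

# Cartesian product of all candidate values, one dict per trial.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

# Assumption: 4 trials run concurrently (4 GPUs, 1 GPU per trial).
waves = math.ceil(len(configs) / 4)

print(len(configs))  # → 64
print(configs[0])    # → {'n_estimators': 25, 'max_depth': 10, 'max_features': 0.25}
print(waves)         # → 16
```

Adding a fourth hyperparameter with four values would multiply the trial count to 256, which is why grid search is usually kept to a small number of dimensions.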

In [None]:
# Display results

print("Best hyperparameters found:")
print(f"  n_estimators: {best_result.config['n_estimators']}")
print(f"  max_depth: {best_result.config['max_depth']}")
print(f"  max_features: {best_result.config['max_features']}")
print(f"Best test accuracy: {best_result.metrics['accuracy']:.4f}")
print()
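
Looking only at the single best trial can hide near-ties. The sketch below ranks per-trial records by accuracy; the helper name `top_trials` is hypothetical, and the accuracy values are made-up placeholders, not results from this notebook.

```python
def top_trials(records, metric="accuracy", k=3):
    """Return the k records with the highest value of `metric`."""
    return sorted(records, key=lambda r: r[metric], reverse=True)[:k]


# Placeholder values for illustration only; they are NOT measured results.
records = [
    {"n_estimators": 25, "max_depth": 10, "max_features": 0.25, "accuracy": 0.70},
    {"n_estimators": 50, "max_depth": 20, "max_features": 0.50, "accuracy": 0.74},
    {"n_estimators": 100, "max_depth": 30, "max_features": 0.75, "accuracy": 0.73},
]

for rec in top_trials(records, k=2):
    print(rec)
```

For a real run, `results.get_dataframe()` returns one row per trial as a pandas DataFrame, which can be sorted by its `accuracy` column in the same spirit.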

In [None]:
# Clean up Ray results directory
import os
import shutil

# Use the same path that you set as storage_path in RunConfig
ray_results_path = os.path.abspath("<your-path>/ray_results")
if os.path.exists(ray_results_path):
    print(f"Cleaning Ray results directory: {ray_results_path}")
    shutil.rmtree(ray_results_path)

In [None]:
# Shutdown Ray
ray.shutdown()