# ML Hyperparameter Tuning - Random Search

This notebook demonstrates hyperparameter tuning with `ml_tune_random`, which samples candidate settings from parameter distributions instead of exhaustively evaluating every combination.

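Random search is usually more efficient than grid search in large search spaces, especially when only a few hyperparameters strongly affect the score: every trial tests a fresh value of each parameter, so the important ones are explored more densely for the same budget.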
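
The budget argument can be illustrated with a small local sketch. It does not call `ml_tune_random`; the two parameters `a` and `b` and their ranges are hypothetical, chosen only to compare how many distinct values of one parameter each strategy tries with nine trials.

```python
import itertools
import random

# Budget of 9 trials over two hypothetical parameters, "a" and "b", in [0, 1].
grid_values = [0.1, 0.5, 0.9]
grid_trials = list(itertools.product(grid_values, grid_values))

rng = random.Random(0)  # fixed seed so the draw is repeatable
random_trials = [(rng.random(), rng.random()) for _ in range(len(grid_trials))]

# Count how many different values of "a" each strategy actually tried.
distinct_a_grid = len({a for a, _ in grid_trials})
distinct_a_random = len({a for a, _ in random_trials})

print(f"grid:   {len(grid_trials)} trials, {distinct_a_grid} distinct values of a")
print(f"random: {len(random_trials)} trials, {distinct_a_random} distinct values of a")
```

If only `a` matters, the grid has effectively tested three settings, while the random draw has tested a different value of `a` in every trial.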

Results are written to `tuning_results.json`.

In [1]:
import openeo # type: ignore

In [2]:
connection = openeo.connect(url="http://127.0.0.1:8000")
connection.authenticate_basic("brian", "123456")

<Connection to 'http://127.0.0.1:8000/' with BasicBearerAuth>

In [3]:
training_set = "https://github.com/e-sensing/sitsdata/raw/main/data/samples_deforestation_rondonia.rds"

## Parameter Distributions

Random search supports several distribution types:
- **Arrays** - uniform sampling from discrete values
- **`uniform`** - continuous uniform between min and max
- **`log_uniform`** - log-uniform distribution (good for learning rates spanning orders of magnitude)
- **`int_uniform`** - integer uniform distribution
- **`choice`** - explicit discrete choices from a values array
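
To make the distribution types above concrete, here is a minimal local sketch of how one value could be drawn from each kind of specification. It is an illustration built on Python's `random` module, not the backend's implementation: the helper name `sample_param` is hypothetical, the `values` key for `choice` and the inclusive upper bound for `int_uniform` are assumptions, and only the `type`, `min` and `max` keys are taken from this notebook.

```python
import math
import random


def sample_param(spec, rng):
    """Draw one value from a distribution spec (illustrative only)."""
    if isinstance(spec, list):
        # Plain array: pick one element uniformly at random.
        return rng.choice(spec)
    kind = spec["type"]
    if kind == "uniform":
        return rng.uniform(spec["min"], spec["max"])
    if kind == "log_uniform":
        # Sample uniformly in log space, then map back with exp().
        log_value = rng.uniform(math.log(spec["min"]), math.log(spec["max"]))
        return math.exp(log_value)
    if kind == "int_uniform":
        # Assumption: both bounds are inclusive.
        return rng.randint(spec["min"], spec["max"])
    if kind == "choice":
        # Assumption: the discrete options live under a "values" key.
        return rng.choice(spec["values"])
    raise ValueError(f"unknown distribution type: {kind!r}")


rng = random.Random(42)
example_space = {
    "batch_size": [32, 64],
    "learning_rate": {"type": "log_uniform", "min": 0.0001, "max": 0.001},
    "epochs": {"type": "int_uniform", "min": 30, "max": 80},
}
draw = {name: sample_param(spec, rng) for name, spec in example_space.items()}
print(draw)
```

Sampling in log space is what makes `log_uniform` suitable for learning rates: each order of magnitude between `min` and `max` is equally likely, whereas a plain `uniform` draw would concentrate almost all samples near `max`.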

**Important for deep learning models:** Always include `epochs` in your parameter distributions. If omitted, the model uses its default (e.g., 150 for TempCNN), which can be very slow - especially when the system falls back to CPU due to GPU memory constraints.

In [4]:
# Parameter distributions for random search
# IMPORTANT: Always include 'epochs' for deep learning models - otherwise it uses
# the model default (150 epochs) which is very slow, especially on CPU fallback.
param_distributions = {
    # Discrete choices (uniform sampling from the array); a single value keeps batch_size fixed
    "batch_size": [32],
    # Log-uniform distribution for learning rate (good for rates spanning orders of magnitude)
    "learning_rate": {"type": "log_uniform", "min": 0.0001, "max": 0.001},
    # Integer uniform distribution for epochs (keep low for faster tuning)
    "epochs": {"type": "int_uniform", "min": 30, "max": 80},
}
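
Because a tuning job can run for several minutes, it may be worth checking the distributions locally before submitting. The sketch below is a hypothetical helper, not part of openEO or of `ml_tune_random`, and the backend may apply its own, different validation. It assumes the spec layout used in this notebook (`type`, `min`, `max`) plus a `values` key for `choice`.

```python
def validate_distributions(distributions):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for name, spec in distributions.items():
        if isinstance(spec, list):
            if not spec:
                problems.append(f"{name}: empty list of choices")
            continue
        if not isinstance(spec, dict) or "type" not in spec:
            problems.append(f"{name}: expected a list or a dict with a 'type' key")
            continue
        if spec["type"] == "choice":
            # Assumption: discrete options are given under "values".
            if not spec.get("values"):
                problems.append(f"{name}: 'choice' needs a non-empty 'values' list")
            continue
        low, high = spec.get("min"), spec.get("max")
        if low is None or high is None:
            problems.append(f"{name}: 'min' and 'max' are required")
        elif low > high:
            problems.append(f"{name}: min ({low}) is greater than max ({high})")
        elif spec["type"] == "log_uniform" and low <= 0:
            problems.append(f"{name}: log_uniform needs min > 0")
    return problems


checked = validate_distributions({
    "batch_size": [32],
    "learning_rate": {"type": "log_uniform", "min": 0.0001, "max": 0.001},
    "epochs": {"type": "int_uniform", "min": 30, "max": 80},
})
print(checked)
```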

In [5]:
process_graph = {
    "tempcnn1": {
        "process_id": "mlm_class_tempcnn",
        "arguments": {
            "optimizer": "adam",
            "learning_rate": 0.0005,
            "seed": 42,
            "verbose": True
        },
    },
    "tune1": {
        "process_id": "ml_tune_random",
        "arguments": {
            "model": {"from_node": "tempcnn1"},
            "training_data": training_set,
            "target": "label",
            "parameters": param_distributions,
            "n_iter": 2,  # Number of random samples to evaluate (keep low for testing)
            "scoring": "accuracy",
            "cv": 0,
            "seed": 42
        },
        "result": True,
    },
}

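# Submit the process graph as a batch job.
job = connection.create_job(
    process_graph=process_graph,
    title="TempCNN hyperparameter tuning (random search)",
    description="Random search for TempCNN hyperparameters",
)
# Start the job and block until it finishes, printing status updates.
job.start_and_wait()
# Handle to the job's results; files are downloaded in the next cell.
results = job.get_results()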

0:00:00 Job '6c4c1beeb7959277691878be603d86ef': send 'start'
0:00:00 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:00:05 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:00:11 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:00:19 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:00:29 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:00:41 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:00:56 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:01:15 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:01:39 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:02:09 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:02:46 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:03:33 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:04:31 Job '6c4c1beeb7959277691878be603d86ef': running (progress N/A)
0:05:31 Job '6c4

In [6]:
results.download_files("data/outputs_random/")

[PosixPath('data/outputs_random/tuning_results'),
 PosixPath('data/outputs_random/job-results.json')]
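
The layout of the downloaded files is not shown in this notebook, so the following is only a hedged sketch for a first look at them. The helper `summarize_outputs` is hypothetical: it walks a directory, tries to parse each file as JSON, and reports the top-level keys without assuming any particular schema.

```python
import json
from pathlib import Path


def summarize_outputs(output_dir):
    """Map each file name in output_dir to its top-level JSON keys, or None if not JSON."""
    summary = {}
    for path in sorted(Path(output_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            content = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            summary[path.name] = None  # binary or non-JSON file
            continue
        if isinstance(content, dict):
            summary[path.name] = sorted(content)
        else:
            # Lists, numbers, etc.: report only the Python type name.
            summary[path.name] = type(content).__name__
    return summary
```

Calling `summarize_outputs("data/outputs_random/")` after the download would list what each file contains at the top level, which is a reasonable starting point before reading specific fields.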