### Fine-tune ML models to find the best hyperparameters and nuisance model

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/WinterSchool2026/ch09-causal-inference-extremes/blob/main/notebooks/03_trained_nuisance_models.ipynb)

In [None]:
# Upgrade pip first for better dependency resolution
!pip install -U pip

In [None]:
# Install packages, ensuring numpy is at a version compatible with most 2024-2025 builds
!pip install -q econml numba xarray zarr fsspec aiohttp geopandas dask netcdf4 h5netcdf "numpy<2.0"

In [None]:
%matplotlib inline
import seaborn as sns
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import sys
import os
from google.colab import drive

Mount the folder with the utils functions

In [None]:
# 1. Mount drive if you haven't already
drive.mount('/content/drive')

In [None]:
# 2. Append the PARENT directory (the one that contains utils/), not the utils folder itself
path_to_parent = '/content/drive/MyDrive/09_challenge_EllisWinterSchool'
if path_to_parent not in sys.path:
    sys.path.append(path_to_parent)

# 3. Now Python can import 'utils' as a package from that directory
import utils.utils
from utils.utils import *

print("✅ Success! Functions imported.")

Load sample data (trimmed)

In [None]:
samples = pd.read_csv("/content/drive/MyDrive/09_challenge_EllisWinterSchool/df_ps_trimmed.csv")

Set the variables: Outcome (target), Treatment, Heterogeneity (zones), Confounders.

In [None]:
target = ['DI_agri_extreme_M7']

treatment = ['SMA_2']

zones = ['basin_lv2']

vars_list = ['E_gleam_ds','S_gleam_ds','H_gleam_ds',
            'pev_ds','sro_ds','sp_ds','tp_ds','d2m_ds',
            'agri_irri', 'agri_mix', 'agri_rain',
            'soil_clay', 'soil_oc', 'soil_roots','soil_sand', 'soil_tawc',
            'lst_night_ds','ndvi_ds','ndwi_ds',
            'pop','road','hand','lc2','lc3','lc5','lc8',
            'censo','soi_long','pdo_timeseries_sstens','noaa_globaltmp_comb']

Encode Xi (the heterogeneity features) - create a one-hot encoding (dummy variables)

In [None]:
zones_encoded = encode_categorical_raster(samples[zones[0]], prefix='zone')
samples_zones = samples.join(zones_encoded)
zone_vars = [v for v in samples_zones.columns if v.startswith('zone')]
samples_zones.head()
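
`encode_categorical_raster` comes from the course `utils` package and its implementation is not shown here. As a rough stand-in, assuming it returns one 0/1 column per zone with a `zone_` prefix, the same result can be sketched with `pandas.get_dummies`. The toy `basin_lv2` values below are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the `samples` table; the basin ids are made up.
toy = pd.DataFrame({"basin_lv2": [101, 102, 101, 103]})

# One 0/1 column per distinct zone, named zone_<id>.
toy_encoded = pd.get_dummies(toy["basin_lv2"], prefix="zone", dtype=int)
toy_zones = toy.join(toy_encoded)

# Same selection pattern as in the cell above.
toy_zone_vars = [c for c in toy_zones.columns if c.startswith("zone")]
print(toy_zone_vars)
```

Each row has exactly one dummy set to 1, so the dummies partition the samples by zone.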

---
# Find the best nuisance models for Outcome (O) and Treatment (T)

The goal is to find the best model to predict the **Outcome** and the best model to predict the **Treatment**.

The best models will then be used as the residualizing (nuisance) models in the Causal Forest.
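
To make "residualizing" concrete: in double machine learning, the outcome and the treatment are each predicted from the confounders, and the forest is then fit on what those predictions leave unexplained. The sketch below is a minimal illustration on synthetic data, assuming scikit-learn is available and using a logistic regression as a placeholder for whichever nuisance model is eventually selected. EconML performs this step internally; this is not its implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 500
W = rng.normal(size=(n, 3))                                   # synthetic confounders
T = rng.binomial(1, 1 / (1 + np.exp(-W[:, 0])))               # binary treatment
Y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * T + W[:, 1]))))   # binary outcome

# Out-of-fold class-1 probabilities, so no row is scored by a model that saw it.
p_t = cross_val_predict(LogisticRegression(), W, T, cv=5, method="predict_proba")[:, 1]
p_y = cross_val_predict(LogisticRegression(), W, Y, cv=5, method="predict_proba")[:, 1]

# Residuals: the part of T and Y that the confounders do not explain.
T_res = T - p_t
Y_res = Y - p_y
print(T_res.mean(), Y_res.mean())
```

The better the nuisance models predict, the less confounder signal remains in the residuals, which is why model selection matters here.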

### Train several ML models for Treatment and Outcome classification

Suggestions to use:

- https://pycaret.readthedocs.io/en/stable/api/classification.html
- https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html
- https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_stats.html
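
As one way to follow the scikit-learn suggestion, the sketch below runs `RandomizedSearchCV` with a `PredefinedSplit`, so that candidates are fit on the training rows and scored on the validation rows instead of on shuffled folds. The data are synthetic, and the estimator, parameter ranges and ROC AUC metric are illustrative assumptions, not choices made by this notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import PredefinedSplit, RandomizedSearchCV

rng = np.random.default_rng(42)
X_train, X_val = rng.normal(size=(300, 5)), rng.normal(size=(100, 5))
y_train = (X_train[:, 0] + rng.normal(scale=0.5, size=300) > 0).astype(int)
y_val = (X_val[:, 0] + rng.normal(scale=0.5, size=100) > 0).astype(int)

# Stack train and validation; -1 = always used for fitting, 0 = the single validation fold.
X_all = np.vstack([X_train, X_val])
y_all = np.concatenate([y_train, y_val])
fold = np.concatenate([np.full(len(X_train), -1), np.zeros(len(X_val), dtype=int)])

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 5, None],
        "min_samples_leaf": [1, 5, 20],
    },
    n_iter=5,
    scoring="roc_auc",
    cv=PredefinedSplit(fold),
    refit=False,  # refit separately on train + validation once a model is chosen
    random_state=0,
)
search.fit(X_all, y_all)
print(search.best_params_, round(search.best_score_, 3))
```

Keeping the split temporal matters because shuffled folds would let the search see years that are meant to be held out.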

#### Temporal sampling (train: 2012-2018 / validation: 2019-2020 / test: 2021-2022)

In [None]:
train_data, val_data, test_data = split_data_by_time(
    samples_zones, 
    time_col='time', 
    train_years=(2012, 2018), 
    val_years=(2019, 2020), 
    test_years=(2021, 2022)
)
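
`split_data_by_time` is another course helper whose code is not shown. A minimal sketch of the same idea follows, assuming `time` holds values that `pd.to_datetime` can parse and that the year ranges are inclusive on both ends. The function name `split_by_year` is hypothetical.

```python
import pandas as pd

def split_by_year(df, time_col, train_years, val_years, test_years):
    """Return (train, val, test) frames filtered by inclusive year ranges."""
    years = pd.to_datetime(df[time_col]).dt.year
    parts = []
    for start, end in (train_years, val_years, test_years):
        parts.append(df[years.between(start, end)].copy())
    return tuple(parts)

# Toy frame with one row per year, 2012 to 2022.
toy = pd.DataFrame({"time": pd.to_datetime([f"{y}-07-01" for y in range(2012, 2023)])})

tr, va, te = split_by_year(toy, "time", (2012, 2018), (2019, 2020), (2021, 2022))
print(len(tr), len(va), len(te))  # 7 2 2
```

Splitting by year instead of at random keeps future observations out of the training set.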

---
# Outcome nuisance model

In [None]:
# Binary variable -> classification problem -> probability of extreme drought (predict the raw probability of class 1)

In [None]:
plot_class_distribution(
    train_data, 
    val_data, 
    test_data, 
    target_col=target,
)
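
Alongside the plot, it can help to see the class-1 share per split as numbers, since a shift in the rate of extremes between the training and test years affects how the scores should be read. A small sketch with toy data follows. The labels are invented; on the real data the frames would be `train_data`, `val_data` and `test_data`, with `target[0]` as the column.

```python
import pandas as pd

col = "DI_agri_extreme_M7"
toy_splits = {
    "train": pd.DataFrame({col: [0, 0, 0, 1, 0, 0, 1, 0]}),
    "val": pd.DataFrame({col: [0, 1, 0, 0]}),
    "test": pd.DataFrame({col: [1, 1, 0, 0]}),
}

# Row count and share of class 1 (extreme) in each split.
balance = pd.DataFrame({
    name: {"n": len(df), "share_class_1": df[col].mean()}
    for name, df in toy_splits.items()
}).T
print(balance)
```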

Model training and evaluation

In [None]:
## TODO: train your model here

If you use PyCaret, setting `raw_score=True` in `predict_model` returns the probability score for each class, not only the score of the predicted label.

#### Select best model based on test set

Train model with train + validation samples

Quantify accuracy and plot predictive power on test data
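
The two steps above can be sketched as follows: refit the chosen configuration on the combined training and validation rows, then score the held-out test rows using the class-1 probability. Everything here is a placeholder under stated assumptions: synthetic data, a gradient boosting classifier standing in for the selected model, and ROC AUC, average precision and the Brier score as example metrics.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score, brier_score_loss, roc_auc_score

rng = np.random.default_rng(1)

def make_split(n):
    """Synthetic features and an imbalanced binary label."""
    X = rng.normal(size=(n, 4))
    y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.7, size=n) > 1).astype(int)
    return X, y

(X_tr, y_tr), (X_va, y_va), (X_te, y_te) = make_split(400), make_split(150), make_split(150)

# Refit on train + validation once the configuration is fixed.
model = GradientBoostingClassifier(random_state=0)
model.fit(np.vstack([X_tr, X_va]), np.concatenate([y_tr, y_va]))

# Column 1 of predict_proba is the probability of class 1.
p_test = model.predict_proba(X_te)[:, 1]

metrics = {
    "roc_auc": roc_auc_score(y_te, p_test),
    "average_precision": average_precision_score(y_te, p_test),
    "brier": brier_score_loss(y_te, p_test),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Scoring probabilities, not hard labels, fits the later use of the model, where the predicted probability itself is subtracted from the observed outcome.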

---
# Treatment nuisance model

In [None]:
# Binary variable -> classification problem -> probability of having an extreme soil moisture value (predict the raw probability of class 1)

Model training and evaluation

In [None]:
## TODO: train your model here

If you use PyCaret, setting `raw_score=True` in `predict_model` returns the probability score for each class, not only the score of the predicted label.

#### Select best model based on test set

Train model with train + validation samples

Quantify accuracy and plot predictive power on test data
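
Because the treatment model's predicted probabilities are what gets subtracted from the observed treatment, it may be worth checking calibration as well as ranking. One option, sketched on synthetic data with a logistic regression standing in for the selected model, is scikit-learn's `calibration_curve`, which compares the mean predicted probability with the observed frequency in each bin.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
X = rng.normal(size=(2000, 3))
t = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] - 0.5))))  # synthetic binary treatment

# First 1500 rows play the role of train + validation, the rest the test set.
clf = LogisticRegression().fit(X[:1500], t[:1500])
p = clf.predict_proba(X[1500:])[:, 1]

# Observed frequency vs. mean predicted probability in 5 quantile bins.
prob_true, prob_pred = calibration_curve(t[1500:], p, n_bins=5, strategy="quantile")
for obs, pred in zip(prob_true, prob_pred):
    print(f"predicted {pred:.2f}  observed {obs:.2f}")
```

Plotting `prob_pred` against `prob_true` with `plt.plot`, together with the diagonal, gives a reliability diagram; points near the diagonal indicate well-calibrated probabilities.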