# Hyperparameter Optimization with Optuna in cellmaps_vnn

This tutorial shows how to define a training configuration with search spaces for Optuna, run training using the config file, and then use the resulting optimized parameters for prediction.

### Step 1: Define Your Configuration 
Below we define a configuration dictionary for training.

If a parameter is given a list of values, Optuna will treat it as a search space. If it is a single value, it will remain fixed during training.

In [None]:
# Define or modify your hyperparameter configuration here
config = {
    # Training settings
    'epoch': 20,
    'cuda': 0,
    'zscore_method': 'auc',

    # Optimization settings
    'optimize': 1,  # Set to 1 to enable Optuna optimization
    'n_trials': 2,  # Number of trials for Optuna

    # Parameters (a parameter given as a list of values is treated as a search space)
    'batchsize': [32, 64, 128],
    'lr': [0.1, 0.01, 0.001],
    'wd': [0.0001, 0.001, 0.01],
    'alpha': 0.3,
    'genotype_hiddens': 4,
    'patience': 30,
    'delta': [0.001, 0.002, 0.003],
    'min_dropout_layer': 2,
    'dropout_fraction': 0.3,

    # Input files
    'training_data': '../examples/training_data.txt',
    'predict_data': '../examples/test_data.txt',
    'gene2id': '../examples/gene2ind.txt',
    'cell2id': '../examples/cell2ind.txt',
    'mutations': '../examples/cell2mutation.txt',
    'cn_deletions': '../examples/cell2cndeletion.txt',
    'cn_amplifications': '../examples/cell2cnamplification.txt'
}
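
To see which entries Optuna would vary, the list-versus-scalar rule described above can be sketched in plain Python. The helper name `split_config` and the trimmed `example_config` are illustrative assumptions and are not part of cellmaps_vnn; the package's own parsing may differ.

```python
from math import prod


def split_config(cfg):
    """Separate list-valued entries (search spaces) from fixed scalar entries."""
    search_space = {k: v for k, v in cfg.items() if isinstance(v, list)}
    fixed = {k: v for k, v in cfg.items() if not isinstance(v, list)}
    return search_space, fixed


# Trimmed copy of the tutorial config (hypothetical subset, for illustration only)
example_config = {
    'epoch': 20,
    'batchsize': [32, 64, 128],
    'lr': [0.1, 0.01, 0.001],
    'wd': [0.0001, 0.001, 0.01],
    'alpha': 0.3,
    'delta': [0.001, 0.002, 0.003],
}

search_space, fixed = split_config(example_config)

# Number of distinct combinations across all list-valued parameters
n_combinations = prod(len(values) for values in search_space.values())

print(sorted(search_space))  # names of the parameters that would be searched
print(n_combinations)        # size of the full grid
```

Four lists of three values each give 3 × 3 × 3 × 3 = 81 possible combinations, so `n_trials` of 2 samples only a small part of the space. It keeps this tutorial fast, but a larger value is likely needed for a meaningful search.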

### Step 2: Save Configuration to YAML
We'll save the configuration to a YAML file. The training pipeline will load this file and extract parameter values and ranges.

In [None]:
import yaml

# Save to a YAML config file
config_path = './vnn_config.yaml'

with open(config_path, 'w') as f:
    yaml.dump(config, f, default_flow_style=False, sort_keys=False)

print(f'Configuration saved to {config_path}')

### Step 3: Train the VNN Model with Optuna
Use **cellmaps_vnncmd.py train** and provide the config file via **--config_file**.
This will automatically trigger Optuna-based optimization for any parameter listed with multiple values.

After training completes, the output folder (out_train_optuna) will contain a config.yaml file. This is a flattened version of the original config in which each list of candidate values is replaced by the best value Optuna found.

In [None]:
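import subprocess

train_out = 'out_train_optuna'  # Output directory for the training run
inputdir = '../examples/'       # Input directory passed via --inputdir
# Build the training command; --config_file points to the YAML file saved in the previous step
command = (
    f"cellmaps_vnncmd.py train {train_out} --inputdir {inputdir} --config_file {config_path}"
)
# check=True raises CalledProcessError if the command exits with a non-zero status
subprocess.run(command, shell=True, check=True)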
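
As a rough picture of what "flattened" means here, the sketch below replaces each list-valued entry with a single chosen value. The function `flatten_config` and the `best_params` values are hypothetical and made up for illustration. They are not results from a real run, and cellmaps_vnn may build its config.yaml differently.

```python
def flatten_config(cfg, best_params):
    """Return a copy of cfg where every list-valued entry is replaced by its chosen value."""
    flat = {}
    for key, value in cfg.items():
        if isinstance(value, list):
            chosen = best_params[key]
            # Guard against a chosen value that was never part of the search space
            if chosen not in value:
                raise ValueError(f"{key}={chosen!r} is not in the search space {value}")
            flat[key] = chosen
        else:
            # Scalar entries are fixed and carried over unchanged
            flat[key] = value
    return flat


# Hypothetical inputs: a trimmed config and made-up "best" values
original = {'epoch': 20, 'batchsize': [32, 64, 128], 'lr': [0.1, 0.01, 0.001]}
best_params = {'batchsize': 64, 'lr': 0.01}

flat = flatten_config(original, best_params)
print(flat)
```

Every value in the result is a scalar, so a file in this form can be reused for prediction without triggering another search.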

### Step 4: Make Predictions with the Optimized Model
Use the saved config.yaml from training (with best parameters) to perform prediction.

In [None]:
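test_out = './out_test'                  # Output directory for the prediction run
new_config = f"{train_out}/config.yaml"  # Flattened config written by the training step

# The training output folder serves as the input directory for prediction
command = (
    f"cellmaps_vnncmd.py predict {test_out} --inputdir {train_out} "
    f"--config_file {new_config}"
)
subprocess.run(command, shell=True, check=True)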