# Search Space Basics

In this tutorial, we will walk through defining a search space and using it in Ablator to run ablation studies on various hyperparameters.

## Create a search space with `ablator.config.hpo.SearchSpace`

In Ablator, [`ablator.config.hpo.SearchSpace`](../config.train.parallel.experiment.rst#configurations-for-parallel-models-experiments) defines the search space for a single hyperparameter. Ablator uses it to create many trials, each with a different value for that hyperparameter. A search space specifies which values the hyperparameter can take (a range or a discrete set) and, for ranges, the hyperparameter's data type.

Import `SearchSpace`:

```python
from ablator.config.hpo import SearchSpace
```

The `SearchSpace` class takes the following arguments (create one object per hyperparameter):

- **`value_range`**: defines a numeric range for the hyperparameter, in the format `[<lower_bound>, <upper_bound>]`. In each trial, the hyperparameter takes a value sampled from this range.

- **`categorical_values`**: defines a discrete set of values for the hyperparameter. In each trial, the hyperparameter takes a value sampled from this set.

- **`value_type`**: specifies the hyperparameter's data type. Ablator supports `"int"` for integer values and `"float"` for floating-point values. This argument is required for hyperparameters that take values from a `value_range`.

Note that categorical values do not require `value_type`.

In the example below, we create a search space with a continuous float range `[0.05, 0.1]` and a search space with a discrete set `[32, 64, 128]`:

```python
from ablator.config.hpo import SearchSpace

SearchSpace(value_range=[0.05, 0.1], value_type="float")
SearchSpace(categorical_values=[32, 64, 128])
```
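To make "a value taken from this range" and "a value taken from this set" concrete, here is a minimal sketch that uses only Python's standard library. It is not Ablator's sampler: the `sample_value` helper is hypothetical, and uniform sampling with inclusive integer bounds is an assumption made for illustration.

```python
import random


def sample_value(rng, value_range=None, categorical_values=None, value_type="float"):
    """Draw one trial value for a single hyperparameter (illustrative only)."""
    if categorical_values is not None:
        # Discrete set: pick one of the listed options.
        return rng.choice(categorical_values)
    low, high = value_range
    if value_type == "int":
        # Integer range: both bounds are treated as inclusive in this sketch.
        return rng.randint(int(low), int(high))
    # Float range: uniform draw between the two bounds.
    return rng.uniform(float(low), float(high))


rng = random.Random(0)  # fixed seed so the draws are repeatable
range_draws = [sample_value(rng, value_range=[0.05, 0.1], value_type="float") for _ in range(3)]
set_draws = [sample_value(rng, categorical_values=[32, 64, 128]) for _ in range(3)]
print(range_draws)
print(set_draws)
```

Each call corresponds to one trial: the range-based hyperparameter gets a number between the bounds, and the categorical one gets one of the listed options.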

## Creating search space for hyperparameters

Recall from the [Configuration Basics](./Configuration-Basics.ipynb) tutorial that `ParallelConfig` has a `search_space` argument. This argument is a dictionary of `SearchSpace` objects that captures the search spaces for all hyperparameters included in the ablation study.

A `SearchSpace` can be created for hyperparameters that are either Ablator-predefined configuration attributes or custom configuration attributes:

- **Predefined Configurations**: Ablator offers predefined configurations for optimizers, schedulers, batch size, epochs, and more. These are available out of the box for use in experiments.

- **Custom Configurations** (added by users): Users can define custom configurations for hyperparameters specific to their models, for example the number of hidden layers in a neural network or the activation function.

### Using `SearchSpace` for predefined configurations

#### For optimizers

Ablator supports three predefined optimizers: `SGD`, `Adam`, and `AdamW`. You can create a search space for any parameter of the optimizer chosen for training. For example, to create search spaces for the AdamW optimizer (whose parameters include learning rate, epsilon, and weight decay), you can do the following:

```python
my_search_space = {
    "train_config.optimizer_config.arguments.lr": SearchSpace(
        value_range=[0.01, 0.05],
        value_type="float"
    ),
    "train_config.optimizer_config.arguments.eps": SearchSpace(
        value_range=[1e-9, 1e-7],
        value_type="float"
    ),
    "train_config.optimizer_config.arguments.weight_decay": SearchSpace(
        value_range=[1e-4, 1e-3],
        value_type="float"
    ),
}
```

The syntax for creating search space for optimizers in ablator is:

```python
search_space = {
    "train_config.optimizer_config.arguments.<parameter 1>": search_space_1,
    "train_config.optimizer_config.arguments.<parameter 2>": search_space_2,
    ...
}
```
where `<parameter 1>` and `<parameter 2>` are the parameters for the corresponding optimizer. You can find parameters for all optimizers in the [Configuration Basics](./Configuration-Basics.ipynb) tutorial.
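The dotted keys read as attribute paths into the nested configuration. The sketch below illustrates that idea with a plain nested dictionary standing in for the configuration; `set_by_path` is a hypothetical helper, not an Ablator function, and the initial values in `config` are made up for the example.

```python
def set_by_path(config, dotted_key, value):
    """Set config[a][b][c] = value for dotted_key 'a.b.c' (hypothetical helper)."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node[name]  # a KeyError here means the path does not exist
    if leaf not in node:
        # Refuse to create new entries, so misspelled keys are caught early.
        raise KeyError(dotted_key)
    node[leaf] = value


# Plain-dict stand-in for a nested configuration (values are illustrative).
config = {
    "train_config": {
        "optimizer_config": {
            "arguments": {"lr": 0.01, "eps": 1e-8, "weight_decay": 0.0},
        }
    }
}

set_by_path(config, "train_config.optimizer_config.arguments.lr", 0.03)
print(config["train_config"]["optimizer_config"]["arguments"]["lr"])  # 0.03
```

Rejecting unknown leaves is a design choice of this sketch: a key such as `...arguments.learning_rate` fails loudly instead of silently adding an unused entry.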

#### For schedulers

Ablator supports three predefined schedulers: `StepLR`, `OneCycleLR`, and `ReduceLROnPlateau`. You can create a search space for any parameter of the scheduler chosen for training. For example, to create search spaces for the `ReduceLROnPlateau` scheduler (whose parameters include minimum learning rate, patience, factor, and threshold), you can do the following:

```python
my_search_space = {
    "train_config.scheduler_config.arguments.min_lr": SearchSpace(
        value_range=[1e-6, 1e-4],
        value_type="float"
    ),
    "train_config.scheduler_config.arguments.threshold": SearchSpace(
        value_range=[1e-5, 1e-3],
        value_type="float"
    ),
}
```

The syntax for creating search space for schedulers in ablator is:

```python
search_space = {
    "train_config.scheduler_config.arguments.<parameter 1>": search_space_1,
    "train_config.scheduler_config.arguments.<parameter 2>": search_space_2,
    ...
}
```
where `<parameter 1>` and `<parameter 2>` are the parameters for the corresponding scheduler. You can find parameters for all schedulers in the [Configuration Basics](./Configuration-Basics.ipynb) tutorial.

#### For other parameters
 
You can also define a `SearchSpace` for other `TrainConfig` attributes, such as `epochs` and `batch_size`.

The syntax is:

```python
search_space = {
    "train_config.<parameter 1>": search_space_1,
    "train_config.<parameter 2>": search_space_2,
    ...
}
```
where `<parameter 1>` and `<parameter 2>` are the attributes of `TrainConfig`.

For example, the following snippet tries different values for `batch_size` and `epochs`:

```python
my_search_space = {
    "train_config.batch_size": SearchSpace(
        categorical_values=[32, 64, 128]
    ),
    "train_config.epochs": SearchSpace(
        value_range=[10, 20],
        value_type="int"
    ),
}
```

### Using `SearchSpace` for custom configurations

Previous tutorials showed that you can run ablation experiments to study different components of a model. Suppose we want to study the impact of the hyperparameters `hidden_size` and `activation` on the performance of a model. We first create a custom model configuration with these hyperparameters and use it to build the model:

```python
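import torch.nn as nn
from ablator import ModelConfig

# Depending on your Ablator version, custom config classes may also need
# the `@configclass` decorator (from ablator import configclass).
class CustomModelConfig(ModelConfig):   # hyperparameters to be studied
    hidden_size: int
    activation: str

class MyModel(nn.Module):
    def __init__(self, config: CustomModelConfig) -> None:
        super().__init__()  # must run before submodules are assigned
        activation_list = {"relu": nn.ReLU(), "elu": nn.ELU()}

        input_size = 100
        self.fc1 = nn.Linear(input_size, config.hidden_size)
        self.act1 = activation_list[config.activation]

model_config = CustomModelConfig(
    hidden_size=256,
    activation="relu"
)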

Note that we still need to create a model config object with initial values for the hyperparameters (and pass that to the running configuration), even though later we will create multiple trials with different values for them, taken from the search space. 
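The relationship between the initial values and the per-trial values can be pictured with a small standard-library sketch. `ModelSettings` is a hypothetical dataclass standing in for `CustomModelConfig`, and the values in `proposals` are hand-picked for illustration rather than produced by Ablator.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelSettings:  # hypothetical stand-in for CustomModelConfig
    hidden_size: int
    activation: str


# Initial values, as in the model config object created above.
base = ModelSettings(hidden_size=256, activation="relu")

# Hand-picked values standing in for what a sampler might propose per trial.
proposals = [
    {"hidden_size": 300, "activation": "elu"},
    {"hidden_size": 480, "activation": "relu"},
]

# Each trial is a copy of the base settings with the proposed values applied.
trials = [replace(base, **overrides) for overrides in proposals]
for trial in trials:
    print(trial)
print(base)  # the initial values are left untouched
```

In this picture the initial values act as a template: every trial starts from them and only the searched hyperparameters are replaced.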

The syntax for creating search spaces for model hyperparameters is:

```python
search_space = {
    "model_config.<parameter 1>": search_space_1,
    "model_config.<parameter 2>": search_space_2,
    ...
}
```

where `<parameter 1>` and `<parameter 2>` are attributes of the custom model configuration. For example, the following creates a `SearchSpace` for `hidden_size` (an integer range) and for `activation` (a discrete set):

```python
my_search_space = {
    "model_config.hidden_size": SearchSpace(
        value_range=[250, 500],
        value_type="int"
    ),
    "model_config.activation": SearchSpace(
        categorical_values=["relu", "elu"]
    ),
}
```

Putting everything together:

```python
my_search_space = {
    "train_config.optimizer_config.arguments.lr": SearchSpace(
        value_range=[0.01, 0.05],
        value_type="float"
    ),
    "train_config.optimizer_config.arguments.eps": SearchSpace(
        value_range=[1e-9, 1e-7],
        value_type="float"
    ),
    "train_config.optimizer_config.arguments.weight_decay": SearchSpace(
        value_range=[1e-4, 1e-3],
        value_type="float"
    ),
    "train_config.scheduler_config.arguments.min_lr": SearchSpace(
        value_range=[1e-6, 1e-4],
        value_type="float"
    ),
    "train_config.scheduler_config.arguments.threshold": SearchSpace(
        value_range=[1e-5, 1e-3],
        value_type="float"
    ),
    "train_config.batch_size": SearchSpace(
        categorical_values=[32, 64, 128]
    ),
    "train_config.epochs": SearchSpace(
        value_range=[10, 20],
        value_type="int"
    ),
    "model_config.hidden_size": SearchSpace(
        value_range=[250, 500],
        value_type="int"
    ),
    "model_config.activation": SearchSpace(
        categorical_values=["relu", "elu"]
    ),
}
```
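With a dictionary this large, a quick sanity check can catch mistakes such as a range without a `value_type`. The sketch below encodes the rules described in this tutorial over plain dictionaries that stand in for `SearchSpace` objects. `check_search_space` is a hypothetical helper, and the "exactly one of `value_range` or `categorical_values`" rule is an assumption of this sketch, not a documented Ablator check.

```python
def check_search_space(spec):
    """Return a list of problems found in a {dotted_key: entry} mapping."""
    problems = []
    for key, entry in spec.items():
        value_range = entry.get("value_range")
        choices = entry.get("categorical_values")
        # Assumption: an entry is either a range or a discrete set, not both.
        if (value_range is None) == (choices is None):
            problems.append(f"{key}: set exactly one of value_range / categorical_values")
            continue
        if choices is not None:
            if len(choices) == 0:
                problems.append(f"{key}: categorical_values is empty")
            continue  # categorical entries do not need value_type
        if entry.get("value_type") not in ("int", "float"):
            problems.append(f"{key}: value_range needs value_type 'int' or 'float'")
        if len(value_range) != 2 or float(value_range[0]) > float(value_range[1]):
            problems.append(f"{key}: value_range must be [lower, upper]")
    return problems


spec = {
    "train_config.optimizer_config.arguments.lr": {"value_range": [0.01, 0.05], "value_type": "float"},
    "train_config.batch_size": {"categorical_values": [32, 64, 128]},
    "train_config.epochs": {"value_range": [10, 20]},  # value_type left out on purpose
}
for problem in check_search_space(spec):
    print(problem)
```

Only the `train_config.epochs` entry is reported here, because it defines a range without a `value_type`.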

Finally, the `my_search_space` dictionary is passed to `ParallelConfig`. This is explored in more detail in the [Hyperparameter Optimization](./HPO-tutorial.ipynb) tutorial.

### SearchSpace in YAML files

If you define configurations in a YAML file, you can specify a search space as follows:

```yaml
# other configurations ...
search_space:
  model_config.hidden_size:
    value_range:
    - '250'
    - '500'
    categorical_values: null
    value_type: int
  train_config.optimizer_config.arguments.lr:
    value_range:
    - '0.001'
    - '0.01'
    categorical_values: null
    value_type: float
  model_config.activation:
    value_range: null
    categorical_values:
    - relu
    - leakyRelu
    - elu
    value_type: float
# other configurations ...
```
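Because the range bounds in the YAML above are quoted, a YAML loader returns them as strings, and `value_type` says how to interpret them. The sketch below shows one way such a cast could look. `typed_bounds` is a hypothetical helper, the `loaded` dictionary mimics what a loader would return for two of the entries, and how Ablator performs this conversion internally is not covered here.

```python
def typed_bounds(entry):
    """Convert string bounds such as '250' into numbers using value_type."""
    caster = {"int": int, "float": float}[entry["value_type"]]
    low, high = entry["value_range"]
    return caster(low), caster(high)


# What a YAML loader would return for two of the entries above.
loaded = {
    "model_config.hidden_size": {
        "value_range": ["250", "500"],
        "categorical_values": None,
        "value_type": "int",
    },
    "train_config.optimizer_config.arguments.lr": {
        "value_range": ["0.001", "0.01"],
        "categorical_values": None,
        "value_type": "float",
    },
}

for key, entry in loaded.items():
    print(key, typed_bounds(entry))
# model_config.hidden_size (250, 500)
# train_config.optimizer_config.arguments.lr (0.001, 0.01)
```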

## Conclusion

In this tutorial, we have demonstrated how to create `SearchSpace` objects and use them to define search spaces for various hyperparameters. In the subsequent tutorial, we will explain how to use `search_space` with `ParallelConfig` to launch a parallel ablation experiment.