- `config.yaml`: Main config
- `model/`: Instantiates a model backbone (see `src/models/`)
- `dataset/`: Instantiates a datamodule (see `src/dataloaders/`)
- `loader/`: Defines a PyTorch `DataLoader`
- `task/`: Defines loss, metrics, and optional encoder/decoder (see `src/tasks/`)
- `pipeline/`: Combination of dataset/loader/task for convenience
- `optimizer/`: Instantiates an optimizer
- `scheduler/`: Instantiates a learning rate scheduler
- `trainer/`: Flags for the PyTorch Lightning `Trainer` class
- `callbacks/`: Miscellaneous options for the `Trainer` (see `src/callbacks/`)
- `experiment/`: Defines a full experiment (combination of all of the above configs)
- `generate/`: Additional flags used by the `generate.py` script
This README provides a brief overview of the organization of this configs folder. These configs are composed to define a full Hydra config for every experiment.
The main config is found at `configs/config.yaml`, which is an example experiment for Permuted MNIST. Different combinations of flags can be overridden to define alternate experiments. The config files in this folder define useful combinations of flags that can be composed. Examples of full configs defining end-to-end experiments can be found in `experiment/`.
Flags can also be passed on the command line.
- At the beginning of running `train.py`, the full Hydra config is printed. This is very useful for making sure all flags were passed in as intended. Try running `python -m train` and inspecting the full base config.
- Generally, most dictionaries in the config correspond exactly to the arguments passed into a Python class. For example, the configs in `model/`, `dataset/`, `loader/`, `optimizer/`, `scheduler/`, and `trainer/` define dictionaries which each instantiate exactly one object (a PyTorch `nn.Module`, `SequenceDataset`, PyTorch `DataLoader`, PyTorch optimizer, PyTorch scheduler, and PyTorch Lightning `Trainer`).
- Instantiating objects is controlled by the very useful Hydra `instantiate` utility (a minimal sketch follows this list).
- In this codebase, instead of defining a `_target_=<path>.<to>.<module>`, we use shorthand names for each desired class (wherever a `_name_` attribute appears). The file `src/utils/registry.py` maps these shorthand names found in these configs to the full class path.
- Check the READMEs of the source code. For example, the configs in `configs/model` correspond to classes in `src/models`, and the configs in `configs/dataset` correspond to classes in `src/dataloaders`.
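To make the instantiation step concrete, here is a minimal sketch of Hydra's generic `instantiate` utility using the standard `_target_` style (the config values and the `torch.optim.AdamW` target below are purely illustrative):

```python
# Minimal sketch of Hydra's generic instantiate utility with the standard
# _target_ style (contrast with the _name_ shorthand used in this codebase).
import torch
from hydra.utils import instantiate
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "_target_": "torch.optim.AdamW",  # full class path, resolved by Hydra
    "lr": 0.001,
    "weight_decay": 0.01,
})

model = torch.nn.Linear(10, 10)
# Extra keyword arguments (here the parameter list) are passed through at call time.
optimizer = instantiate(cfg, params=model.parameters())
```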
As a concrete example of the `_name_` shorthand, here is `configs/optimizer/adamw.yaml`:

```yaml
_name_: adamw
lr: 0.001
weight_decay: 0.00
```
- When composed into a larger config, this should define a dictionary under the corresponding sub-config name. For example, the config printed by `python -m train optimizer=adamw optimizer.weight_decay=0.1` includes the following dictionary, confirming that the flags were passed in correctly.
  ```
  ├── optimizer
  │   └── _name_: adamw
  │       lr: 0.001
  │       weight_decay: 0.1
  ```
- The file `src/utils/registry.py` includes an `optimizer` dictionary mapping `adamw: torch.optim.AdamW`.
- The full optimizer config is therefore equivalent to instantiating `torch.optim.AdamW(lr=0.001, weight_decay=0.1)` (a sketch of how this resolution might look follows this list).
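To illustrate how the `_name_` shorthand and the registry fit together, here is a hypothetical sketch; the `optimizer_registry` dict and `instantiate_from_name` helper are illustrative stand-ins, not the actual contents of `src/utils/registry.py`:

```python
# Hypothetical sketch of resolving a _name_ shorthand through a registry.
# The registry dict and helper below are illustrative, not the codebase's code.
import importlib
import torch

optimizer_registry = {"adamw": "torch.optim.AdamW"}

def instantiate_from_name(config, **extra_kwargs):
    """Look up the class path for config['_name_'] and construct the object."""
    config = dict(config)
    module_path, cls_name = optimizer_registry[config.pop("_name_")].rsplit(".", 1)
    cls = getattr(importlib.import_module(module_path), cls_name)
    return cls(**config, **extra_kwargs)

# The composed optimizer config from above then behaves like:
model = torch.nn.Linear(1, 16)
opt = instantiate_from_name(
    {"_name_": "adamw", "lr": 0.001, "weight_decay": 0.1},
    params=model.parameters(),
)
# i.e. torch.optim.AdamW(model.parameters(), lr=0.001, weight_decay=0.1)
```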
The `model/` configs correspond to modules in `src/models/`. See `model/README.md`.
The `dataset/` configs correspond to modules in `src/dataloaders/`.
`loader/` configs are used to instantiate a dataloader such as PyTorch's `torch.utils.data.DataLoader`. Other configs correspond to extensions of this found in the source file `src/dataloaders/base.py`, for example dataloaders that allow sampling the data at different resolutions.
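As a rough illustration, instantiating a basic loader config amounts to an ordinary `DataLoader` call like the following (the argument values and the stand-in dataset are assumptions, not the actual defaults in `loader/default.yaml`):

```python
# Rough Python equivalent of instantiating a basic loader config.
# Argument values are illustrative, not the actual defaults.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the dataset object produced by a dataset config.
train_dataset = TensorDataset(
    torch.randn(100, 1024, 1),     # data
    torch.randint(0, 10, (100,)),  # targets
)

train_loader = DataLoader(train_dataset, batch_size=50, shuffle=True, num_workers=4)
```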
A task is something like "multiclass classification" or "regression", and defines how the model interfaces with the data.
A task defines the loss function and additional metrics, and an optional encoder and decoder.
These configs correspond to modules in `src/tasks/`.
The encoder is the interface between the input data and model backbone. It defines how the input data is transformed before being fed to the model.
The decoder is the interface between the model backbone and the target data. It defines how the model's outputs are transformed so that the task's loss and metrics can be calculated on them.
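As a simple illustration, the pieces a multiclass classification task supplies look roughly like the following (plain PyTorch stand-ins; the actual task classes live in `src/tasks/`):

```python
# Illustrative stand-ins for what a multiclass classification task defines:
# a loss function and a metric computed on (decoded output, target) pairs.
import torch.nn.functional as F

def loss_fn(logits, targets):
    # Cross entropy over class logits of shape (batch, num_classes).
    return F.cross_entropy(logits, targets)

def accuracy(logits, targets):
    return (logits.argmax(dim=-1) == targets).float().mean()
```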
`optimizer/` and `scheduler/` configs are used to instantiate an optimizer class and a scheduler class, respectively.
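In plain PyTorch terms, an optimizer config plus a scheduler config resolve to something like the following (assuming the `plateau` shorthand corresponds to `torch.optim.lr_scheduler.ReduceLROnPlateau`; the values are illustrative):

```python
# Sketch of what composed optimizer/scheduler configs amount to in plain PyTorch.
# Mapping the "plateau" shorthand to ReduceLROnPlateau is an assumption here.
import torch

model = torch.nn.Linear(1, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=0.001, weight_decay=0.01)
# mode="max" matches monitoring a metric such as val/accuracy (see the pipeline example below).
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="max")
```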
A pipeline consists of a dataset + loader + encoder + decoder + task (and optionally optimizer+scheduler). This is sometimes what people refer to as a "task", such as the "CIFAR-10 classification task". A pipeline fully defines a training scheme; combining a pipeline with a model specifies an end-to-end experiment.
Overall, a pipeline fully defines a training experiment aside from the model backbone. This means any pipeline can be flexibly combined with any model backbone to define an experiment, regardless of the dimensions of the data and model, e.g. `python -m train pipeline=cifar model=transformer`.
For example, the CIFAR pipeline config (selected above with `pipeline=cifar`) looks like:

```yaml
defaults:
  - /trainer: default
  - /loader: default
  - /dataset: cifar
  - /task: multiclass_classification
  - /optimizer: adamw
  - /scheduler: plateau

train:
  monitor: val/accuracy # Needed for plateau scheduler
  mode: max

encoder: linear

decoder:
  _name_: sequence
  mode: pool
```
- The `trainer/default.yaml` and `loader/default.yaml` configs specify a basic PyTorch Lightning `Trainer` and PyTorch `DataLoader`.
- The `dataset/cifar.yaml` config defines a dataset object that specifies data and target pairs. In this case, the data has shape `(batch size, 1024, 1)` and the target has shape `(batch size,)`, containing class IDs from 0-9.
- The model is not part of the pipeline; any model can be combined with this pipeline as long as it maps shape `(batch size, 1024, d_input) -> (batch size, 1024, d_output)`.
- The task consists of an encoder, decoder, loss function, and metrics. The encoder interfaces between the input data and the model backbone; this example specifies that the data will pass through an `nn.Linear` mapping the data from `(batch size, 1024, 1) -> (batch size, 1024, d_input)`. The decoder will map the model's outputs from `(batch size, 1024, d_output) -> (batch size,)` by pooling over the sequence length and passing through another `nn.Linear`. Finally, the `multiclass_classification` task defines a cross entropy loss and an accuracy metric. (A shape walk-through of this pipeline is sketched after this list.)
- This pipeline also defines a target optimizer and scheduler, which are optional.
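Here is the shape walk-through referenced above, written with plain PyTorch stand-ins (the `nn.Linear` encoder, identity backbone, mean-pooling decoder, and dimension values are illustrative, not the codebase's actual modules):

```python
# Shape walk-through of the CIFAR pipeline described above (illustrative stand-ins).
import torch
import torch.nn as nn
import torch.nn.functional as F

batch, length, d_input, d_output, n_classes = 32, 1024, 128, 128, 10

x = torch.randn(batch, length, 1)                # sequence of CIFAR pixels: (B, 1024, 1)
targets = torch.randint(0, n_classes, (batch,))  # class IDs 0-9: (B,)

encoder = nn.Linear(1, d_input)                  # encoder:   (B, 1024, 1) -> (B, 1024, d_input)
backbone = nn.Identity()                         # any model: (B, 1024, d_input) -> (B, 1024, d_output)
decoder = nn.Linear(d_output, n_classes)         # applied after pooling over the length dimension

h = backbone(encoder(x))                         # (B, 1024, d_output)
logits = decoder(h.mean(dim=1))                  # pool over length, then project: (B, n_classes)
loss = F.cross_entropy(logits, targets)          # the multiclass_classification loss
```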
An experiment combines every type of config into a complete end-to-end experiment. Generally, this consists of a pipeline and a model, together with training details such as the optimizer and scheduler. See `experiment/README.md`.