# Deep Learning Example: Image Classification

This is a friendly tutorial that covers a basic image classification task. We will use `dysweep` to run experiments over different configurations of both the model design and the dataset.

First, run the following cell to put the project's root directory (the parent of this notebook's directory) on the import path:

In [1]:
# add the parent directory to the import path
import sys
sys.path.append("..")
# remove the current directory from the path
sys.path = sys.path[1:]

## Getting started

To get started, we should first picture the entire pipeline of our experiment. In this simple case, we first download the dataset for classification (here, CIFAR10) and apply the preprocessing and transforms needed to create a dataloader. We then pick a model suited to the task and train it using an optimizer.

Each part of the experiment has its own hyperparameters and configurations, so it helps to plan how to structure the configurations beforehand. Below is a simple sketch of the run configurations, together with a simple training loop that logs all results to `wandb`.


In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torchvision.models import resnet50
from tqdm import tqdm
import wandb

wandb.init(project="image_classification")

# Configurations and hyper-parameters

# Training configurations
EPOCH_COUNT = 10
LR = 0.001
# Data configurations
NUM_WORKERS = 2
BATCH_SIZE = 64
# Model configurations
NUM_CLASSES = 10
PRETRAINED_OR_NOT = False

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


# Define transformations for the train set
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

# Define transformations for the test set
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

# Load datasets
train_set = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=train_transform)
test_set = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=test_transform)

# Create dataloaders
train_loader = torch.utils.data.DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True, num_workers=NUM_WORKERS)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=BATCH_SIZE, shuffle=False, num_workers=NUM_WORKERS)

# Load the ResNet model
model = resnet50(pretrained=PRETRAINED_OR_NOT, num_classes=NUM_CLASSES).to(device) 

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=LR)

# Training function
def train(epoch_count=EPOCH_COUNT):
    for epoch in range(epoch_count):
        wandb.log({"epoch": epoch})
        # (re-)enable training mode; the evaluation below switches it off
        model.train()
        for i, data in tqdm(enumerate(train_loader, 0)):
            inputs, labels = data[0].to(device), data[1].to(device)

            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            wandb.log({"loss": loss.item()})

        # compute the test accuracy in evaluation mode, without tracking gradients
        model.eval()
        correct = 0
        with torch.no_grad():
            for i, data in tqdm(enumerate(test_loader, 0)):
                inputs, labels = data[0].to(device), data[1].to(device)

                outputs = model(inputs)
                # the predicted class is the index of the maximum logit
                _, predicted = torch.max(outputs, 1)
                # count the correct predictions
                correct += (predicted == labels).sum().item()

        wandb.log({"accuracy": correct / len(test_set)})
 
try:   
    train()
finally:
    wandb.finish()
    

## Generic and Configurable using DyPy

To illustrate how our configurations work, we will make the above code configurable down to some of its most basic components. Such a detailed configuration setup is usually unnecessary; we do it here to showcase some features of our pipeline:

1. Outline some of the functionalities of the [DyPy](https://github.com/vahidzee/dypy) library.
2. Illustrate that the configurations can be complex and hierarchical, yet generic in the sense that you can manipulate everything down to the last detail using a configuration file.

We will also define a YAML configuration file that describes the entire experiment, so that experiments are tunable down to the last detail. The dictionary stored in the YAML file, which can be accessed [here](./conf.yaml), is the following:

```python
{
    'data': {
        "dataset_class": "torchvision.datasets.CIFAR10",
        "batch_size": 64,
        "num_workers": 2,
        "train_transforms": [
            {
                "class_path": "torchvision.transforms.RandomHorizontalFlip",
            },
            {
                "class_path": "torchvision.transforms.RandomCrop",
                "init_args": {
                    "size": 32,
                    "padding": 4
                }
            },
            {
                "class_path": "torchvision.transforms.ToTensor",
            },
            {
                "class_path": "torchvision.transforms.Normalize",
                "init_args": {
                    "mean": [0.4914, 0.4822, 0.4465],
                    "std": [0.2023, 0.1994, 0.2010]
                }
            }
        ],
        "test_transforms": [
            {
                "class_path": "torchvision.transforms.ToTensor",
            },
            {
                "class_path": "torchvision.transforms.Normalize",
                "init_args": {
                    "mean": [0.4914, 0.4822, 0.4465],
                    "std": [0.2023, 0.1994, 0.2010]
                }
            }
        ],
    },
    "model": {
        "class_path": "torchvision.models.resnet50",
        "init_args": {
            "pretrained": False,
            "num_classes": 10
        }
    },
    "trainer": {
        "epoch_count": 10,
        "optimizer": {
            "class_path": "torch.optim.SGD",
            "init_args": {
                "lr": 0.001
            }
        }
    }
}
```

Now, assuming the above configuration is loaded into `cfg`, we can rewrite the code as follows:

In [None]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
from tqdm import tqdm
import dypy as dy
import wandb
import yaml

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# load conf.yaml into the dictionary cfg
with open("conf.yaml", 'r') as stream:
    cfg = yaml.safe_load(stream)

# Define transformations for the train set
train_transform = transforms.Compose([
    dy.eval(x['class_path'])(**x.get('init_args', {}))
    for x in cfg['data']['train_transforms']
])

# Define transformations for the test set
test_transform = transforms.Compose([
    dy.eval(x['class_path'])(**x.get('init_args', {}))
    for x in cfg['data']['test_transforms']
])

# Load datasets
train_set = dy.eval(cfg['data']['dataset_class'])(root='./data', train=True, download=True, transform=train_transform)
test_set = dy.eval(cfg['data']['dataset_class'])(root='./data', train=False, download=True, transform=test_transform)

# Create dataloaders
train_loader = torch.utils.data.DataLoader(
    train_set, 
    batch_size=cfg['data']['batch_size'], 
    shuffle=True, 
    num_workers=cfg['data']['num_workers'],
)
test_loader = torch.utils.data.DataLoader(
    test_set, 
    batch_size=cfg['data']['batch_size'], 
    shuffle=False, 
    num_workers=cfg['data']['num_workers'],
)

# Load the ResNet model
model = dy.eval(cfg['model']['class_path'])(**cfg['model']['init_args']).to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = dy.eval(
        cfg['trainer']['optimizer']['class_path']
    )(model.parameters(), **cfg['trainer']['optimizer']['init_args'])
    
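# `train` is the function defined in the first training cell; it logs to wandb,
# so start a fresh run before calling it
wandb.init(project="image_classification", config=cfg)
try:
    train(epoch_count=cfg['trainer']['epoch_count'])
finally:
    wandb.finish()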

You can now edit the YAML file to run a wide range of experiments. Since the configuration is generic, many different setups can be run. For example, you can define your own set of transforms, or even your own model and optimizer.

This is partially due to the dynamic nature of the [DyPy](https://github.com/vahidzee/dypy) library as well. Although not necessarily best practice, one can also define code snippets in the YAML file. This is useful for quick prototyping and testing all the possible configurations without touching the actual source code.
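To make the `class_path` / `init_args` convention concrete, here is a minimal standard-library sketch of resolving a dotted path and instantiating the result. The helpers `resolve` and `instantiate` are hypothetical names introduced for illustration; they only approximate the role `dy.eval` plays in the cell above and are not DyPy's implementation. The example uses `fractions.Fraction` so that it runs without PyTorch installed.

```python
import importlib


def resolve(path):
    """Import the object named by a dotted path such as 'fractions.Fraction'."""
    module_name, _, attribute = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attribute)


def instantiate(spec, *args):
    """Build an object from a {'class_path': ..., 'init_args': ...} dictionary."""
    # 'init_args' is optional, mirroring entries such as ToTensor in conf.yaml
    return resolve(spec["class_path"])(*args, **spec.get("init_args", {}))


spec = {
    "class_path": "fractions.Fraction",
    "init_args": {"numerator": 3, "denominator": 4},
}
obj = instantiate(spec)
print(obj)
```

The same pattern is what lets the configuration swap a transform, a model, or an optimizer by changing a string rather than the source code.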

Furthermore, we will use this configuration as a template and change it automatically using the Weights & Biases (W&B) sweep API.

## From Flat Sweeps to Hierarchical Sweeps

Here is where we start using the Weights & Biases library, which we later extend with our own. First, say we want to sweep over the following hyperparameters:

1. Different learning rate values: `[0.001, 0.01]`.
2. Different optimizers: `[SGD, Adam]`.
3. Different epoch counts: `[10, 20]`.
4. Different batch sizes: `[32, 64]`.
5. For the Adam optimizer, different `weight_decay` values: `[0.1, 0.01]`.

Note that the standard sweep configuration dictionary is limited: parameters must be defined in a flat way. Therefore, we define the sweep configuration as follows:


In [None]:
import wandb
import yaml

sweep_config = {
    'name': 'my-sweep',
    'method': 'grid',
    'metric': {
        'name': 'loss',
        'goal': 'minimize'   
    },
    'parameters': {
        'optimizer_type': {
            'values': ['torch.optim.SGD', 'torch.optim.Adam']
        },
        'optimizer_weight_decay': {
            'values': [0.1, 0.01]  
        },
        'optimizer_lr': {
            'values': [0.001, 0.01],
        },
        'epoch_count': {
            'values': [10, 20],
        },
        'batch_size': {
            'values': [32, 64],
        },
    } 
}

sweep_id = wandb.sweep(sweep_config)

def train():
    # Initialize a new wandb run
    with wandb.init() as run:
        # Update the cfg dictionary with sweep parameters
        cfg = run.config
        
        # change the sweep configuration into the elaborate, yet generic configuration
        with open("conf.yaml", 'r') as stream:
            base_cfg = yaml.safe_load(stream)
        base_cfg['trainer']['optimizer']['class_path'] = cfg['optimizer_type']
        base_cfg['trainer']['optimizer']['init_args']['lr'] = cfg['optimizer_lr']
        if cfg['optimizer_type'] == 'torch.optim.Adam':
            base_cfg['trainer']['optimizer']['init_args']['weight_decay'] = cfg['optimizer_weight_decay']
        base_cfg['trainer']['epoch_count'] = cfg['epoch_count']
        base_cfg['data']['batch_size'] = cfg['batch_size']
        
        #... (insert your training code here) ...
        raise NotImplementedError("Implement your trainer using wandb logs from the code snippet above!")

# Run the sweep
wandb.agent(sweep_id, function=train, count=1)


Since the sweep configuration is flat, a few extra lines are needed to convert it into the generic, hierarchical configuration that the rest of the pipeline expects. One of the features of our library is that it eliminates this conversion step and allows W&B sweeps to work with hierarchies directly.
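The conversion boilerplate that a flat sweep requires can be sketched generically as follows. This is only an illustration of the manual step, under the assumption that each flat parameter maps to one dotted location in the nested configuration; `set_by_path`, `unflatten`, and `KEY_MAP` are hypothetical names and are not part of W&B or dysweep.

```python
import copy


def set_by_path(config, dotted_key, value):
    """Assign `value` at a dotted path such as 'trainer.optimizer.init_args.lr'."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def unflatten(base_config, flat_params, key_map):
    """Return a copy of `base_config` updated with the flat sweep parameters."""
    config = copy.deepcopy(base_config)
    for flat_key, value in flat_params.items():
        set_by_path(config, key_map[flat_key], value)
    return config


# Hypothetical mapping from flat sweep names to locations in the nested config
KEY_MAP = {
    "optimizer_type": "trainer.optimizer.class_path",
    "optimizer_lr": "trainer.optimizer.init_args.lr",
    "epoch_count": "trainer.epoch_count",
    "batch_size": "data.batch_size",
}

base = {
    "data": {"batch_size": 64},
    "trainer": {
        "epoch_count": 10,
        "optimizer": {"class_path": "torch.optim.SGD", "init_args": {"lr": 0.001}},
    },
}
flat = {
    "optimizer_type": "torch.optim.Adam",
    "optimizer_lr": 0.01,
    "epoch_count": 20,
    "batch_size": 32,
}
nested = unflatten(base, flat, KEY_MAP)
print(nested)
```

Every new sweep parameter needs a new entry in the mapping, which is exactly the kind of bookkeeping a hierarchical sweep configuration avoids.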

We recommend working with the `dysweep_run_resume` function that takes in all the sweep configurations as well as a function that has the following signature: `func(config, checkpoint_dir)`. For brevity, we will not explain `checkpoint_dir` here and only focus our attention on `config`. Later on, we will return to that.

In this example, the `dysweep_run_resume` function takes a base configuration as well as a sweep configuration, as follows. The hierarchy also applies to the values being swept over. For instance, here an entire `optimizer` dictionary is swept over, in contrast to sweeping only over primitive values.
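As a mental model of what a grid over a hierarchical configuration means, the sketch below expands every node flagged with `'sweep': True` into the Cartesian product of its values and writes each choice back into a copy of the base configuration. It is a deliberately simplified illustration: `find_sweeps` and `expand` are hypothetical helpers, `sweep_alias` and `dy__upsert` are ignored, and this is not how dysweep itself is implemented.

```python
import copy
import itertools


def find_sweeps(node, path=()):
    """Yield (path, values) for every sub-dictionary flagged with 'sweep': True."""
    if isinstance(node, dict):
        if node.get("sweep") is True:
            yield path, node["values"]
        else:
            for key, child in node.items():
                yield from find_sweeps(child, path + (key,))


def expand(base_config, parameters):
    """Yield one full configuration per grid point of the hierarchical sweep."""
    sweeps = list(find_sweeps(parameters))
    for combo in itertools.product(*(values for _, values in sweeps)):
        config = copy.deepcopy(base_config)
        for (path, _), value in zip(sweeps, combo):
            target = config
            for key in path[:-1]:
                target = target.setdefault(key, {})
            # the swept value may itself be a whole dictionary
            target[path[-1]] = copy.deepcopy(value)
        yield config


base = {
    "data": {"batch_size": 64, "num_workers": 2},
    "trainer": {"epoch_count": 10, "optimizer": {"class_path": "torch.optim.SGD"}},
}
parameters = {
    "data": {"batch_size": {"sweep": True, "values": [32, 64, 128]}},
    "trainer": {
        "epoch_count": {"sweep": True, "values": [10, 20]},
        "optimizer": {
            "sweep": True,
            "values": [
                {"class_path": "torch.optim.Adam", "init_args": {"lr": 0.001}},
                {"class_path": "torch.optim.SGD", "init_args": {"lr": 0.01}},
            ],
        },
    },
}
configs = list(expand(base, parameters))
print(len(configs))  # 3 batch sizes * 2 epoch counts * 2 optimizers
```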

**Note**: `dysweep` uses a decoy run under the project specified that contains specific meta-data to communicate between the vanilla W&B API and our new one. Therefore, the name of the project should be specified when called; otherwise, the decoy run will be created under a different project.

In [None]:
from dysweep import dysweep_run_resume
from pprint import pprint
import yaml

with open("conf.yaml", 'r') as stream:
    base_cfg = yaml.safe_load(stream)

# Define the sweep configuration
sweep_config = {
    'name': 'my-sweep',
    'method': 'grid',
    'metric': {
      'name': 'loss',
      'goal': 'minimize'   
    },
    'parameters': {
        'data': {
            'batch_size': {
                'sweep': True,
                'values': [32, 64, 128]
            },
        },
        'trainer': {
            'epoch_count': {
                'sweep': True,
                'values': [10, 20],
            },
            'optimizer': {
                'sweep': True,
                'sweep_alias': [
                    'adam-lr-0.001-wd-0.1',
                    'adam-lr-0.001-wd-0.01',
                    'adam-lr-0.01-wd-0.1',
                    'adam-lr-0.01-wd-0.01',
                    'sgd-lr-0.001',
                    'sgd-lr-0.01',
                ],
                'values': [
                    {
                        'class_path': 'torch.optim.Adam',
                        'init_args': {
                            'lr': 0.001,
                            'weight_decay': 0.1,
                        },  
                    },
                    {
                        'class_path': 'torch.optim.Adam',
                        'init_args': {
                            'lr': 0.001,
                            'weight_decay': 0.01,
                        },  
                    },
                    {
                        'class_path': 'torch.optim.Adam',
                        'init_args': {
                            'lr': 0.01,
                            'weight_decay': 0.1,
                        },  
                    },
                    {
                        'class_path': 'torch.optim.Adam',
                        'init_args': {
                            'lr': 0.01,
                            'weight_decay': 0.01,
                        },  
                    },
                    {
                        'class_path': 'torch.optim.SGD',
                        'init_args': {
                            'lr': 0.001,
                        },  
                    },
                    {
                        'class_path': 'torch.optim.SGD',
                        'init_args': {
                            'lr': 0.01,
                        },  
                    },
                ]
            }
        }
    }
}

sweep_id = dysweep_run_resume(
    project="tutorial",
    base_config=base_cfg,
    # A hierarchical sweep configuration that is not possible with the normal sweep library
    sweep_configuration=sweep_config,
)

The `dysweep_run_resume` function is multi-purpose: after instantiating the sweep, you can call the same function with different arguments to run a specific configuration of that sweep. Note that you can run the following cell **on any other machine**. This is part of the appeal of sweeps in general: configurations can run on different machines, or even clusters, in parallel. The machine only needs access to the same workspace and project so that it can download the required meta-data.

In [None]:
from dysweep import dysweep_run_resume
from pprint import pprint

# sweep_id = # Enter the sweep id obtained

def train(config, checkpoint_dir):
    ### YOUR TRAINING CODE HERE (glossing over checkpoint_dir) ###
    print("Training on the following configuration:")
    pprint(config)
    raise NotImplementedError("Implement your trainer using wandb logs from the code snippet above!")


ret = dysweep_run_resume(
    project="tutorial",
    function=train,
    count=1,
    sweep_id=sweep_id,
)

### Using Upserts

If you look at the sweep configuration specified earlier, you will notice that a lot of configuration is repeated in the `optimizer` portion. Specifically, for every combination of optimizer type and `weight_decay`, the learning rates have to be repeated as well. This is not ideal, and the `upsert` functionality avoids it: it lets us specify a list of configurations that are merged with the base configuration.

In this particular case, we first specify the optimizer type and `weight_decay` (3 options), and then update `lr` on top of that (2 options). This yields the same set of 6 configurations in a much more concise way.

To do so, simply change the `sweep_config` to the following:

```python
{
    'name': 'my-sweep',
    'method': 'grid',
    'metric': {
      'name': 'loss',
      'goal': 'minimize'   
    },
    'parameters': {
        'data': {
            'batch_size': {
                'sweep': True,
                'values': [32, 64, 128]
            },
        },
        'trainer': {
            'epoch_count': {
                'sweep': True,
                'values': [10, 20],
            },
            'optimizer': {
                'sweep': True,
                'sweep_alias': [
                    'adam-wd-0.1',
                    'adam-wd-0.01',
                    'sgd',
                ],
                'values': [
                    {
                        'class_path': 'torch.optim.Adam',
                        'init_args': {
                            'weight_decay': 0.1,
                        },  
                    },
                    {
                        'class_path': 'torch.optim.Adam',
                        'init_args': {
                            'weight_decay': 0.01,
                        },  
                    },
                    {
                        'class_path': 'torch.optim.SGD',
                    },
                ]
            },
            'dy__upsert': [
                {
                    'sweep': True,
                    'sweep_identifier': 'lr',
                    'sweep_alias': [
                        'lr-0.001',
                        'lr-0.01',
                    ],
                    'values': [
                        {
                            'optimizer': {
                                'init_args': {'lr': 0.001},
                            },
                        },
                        {
                            'optimizer': {
                                'init_args': {'lr': 0.01},
                            },
                        },
                    ]
                }
            ]
        }
    }
}
```


The entire philosophy of dysweep revolves around upserting configurations, and it offers many advanced dynamic functionalities for doing so. You can read more about the ways to alter the sweep configuration in the [documentation](https://dysweep.readthedocs.io/en/latest/).
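The merging behaviour behind an upsert can be pictured as a recursive dictionary merge. The sketch below is an assumption about the semantics, written with a hypothetical `upsert` helper rather than dysweep's own code: nested dictionaries are merged key by key, and everything else is overwritten. Combining three optimizer choices with two learning-rate patches gives six configurations.

```python
import copy
from itertools import product


def upsert(base, patch):
    """Return a copy of `base` with `patch` merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = upsert(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


optimizers = [
    {"optimizer": {"class_path": "torch.optim.Adam", "init_args": {"weight_decay": 0.1}}},
    {"optimizer": {"class_path": "torch.optim.Adam", "init_args": {"weight_decay": 0.01}}},
    {"optimizer": {"class_path": "torch.optim.SGD"}},
]
lr_patches = [{"optimizer": {"init_args": {"lr": lr}}} for lr in (0.001, 0.01)]

# 3 optimizer choices x 2 learning rates
combos = [upsert(opt, patch) for opt, patch in product(optimizers, lr_patches)]
for combo in combos:
    print(combo)
```

Note that the SGD entry has no `init_args` of its own; the merge simply creates the key when the learning-rate patch is applied.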

## Checkpointing and Re-running

In large-scale computing, the sweep server passes sweep configurations on to a target machine that runs them. If the target machine fails, the sweep server cannot re-run the same configuration. To make re-running possible, there should be a systematic repository that records the identifiers of all runs that have finished, are in progress, or have failed.

With dysweep, all of this is handled for you. For every configuration, a dedicated checkpoint directory is provided in which you can store model checkpoints. Whenever you re-run, you can load the model from that directory and continue training from there. This is why the `dysweep_run_resume` function has a `resume` keyword and why the function passed to it takes an extra `checkpoint_dir` argument.

### Re-running

Whenever dysweep instantiates a new sweep, all the logs are saved either in `./dysweep_logs` or in a root directory you specify. In this directory, a set of files is created for each sweep you create. For example, if your sweep has the id `s6kye2ah`, a `checkpoints-s6kye2ah` directory will appear that contains the meta-data and checkpoints of all the runs associated with that sweep.

Under this directory you will see two things:

1. A set of configurations that were run successfully. These `json` files contain the entire configuration that `func` was called with and are stored in `{run_id}-config.json` format.
2. A set of configurations that are currently being run and either have not finished or have failed. These runs contain both configurations and checkpoints. This information is stored in a subdirectory of the format `{i}-{run_id}`, where `i` is simply an index used internally for ordering the runs in progress.

Using these subdirectories, you can find the ids of specific runs that you wish to re-run.

To do so, simply pass the `rerun_id` argument to the `dysweep_run_resume` function. For example, if you wish to re-run the run with `id=xipqslhf`, you can do so as follows:
```python
dysweep_run_resume(
    function=func,
    rerun_id='xipqslhf',
    project='tutorial',
)
```
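To find candidate values for `rerun_id`, the directory layout described above can be scanned with a few lines of standard-library code. The sketch assumes only the naming scheme stated in this section (`{run_id}-config.json` files and `{i}-{run_id}` subdirectories); `list_runs` is a hypothetical helper, and the run ids `abcd1234` and `qwertyui` are made up for the demonstration, which builds a throwaway directory instead of reading real logs.

```python
import tempfile
from pathlib import Path

CONFIG_SUFFIX = "-config.json"


def list_runs(sweep_dir):
    """Split a sweep's checkpoint directory into finished and unfinished run ids."""
    sweep_dir = Path(sweep_dir)
    finished = sorted(
        p.name[: -len(CONFIG_SUFFIX)]
        for p in sweep_dir.glob("*" + CONFIG_SUFFIX)
        if p.is_file()
    )
    indexed = []
    for p in sweep_dir.iterdir():
        index, separator, run_id = p.name.partition("-")
        if p.is_dir() and separator and index.isdigit():
            indexed.append((int(index), run_id))
    # order unfinished runs by their internal index
    unfinished = [run_id for _, run_id in sorted(indexed)]
    return finished, unfinished


# Build a temporary directory that mimics the layout, then scan it
with tempfile.TemporaryDirectory() as root:
    sweep_dir = Path(root) / "checkpoints-s6kye2ah"
    sweep_dir.mkdir()
    (sweep_dir / "abcd1234-config.json").write_text("{}")
    (sweep_dir / "0-xipqslhf").mkdir()
    (sweep_dir / "1-qwertyui").mkdir()
    finished, unfinished = list_runs(sweep_dir)

print("finished:", finished)
print("unfinished:", unfinished)
```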

### Resuming 

To resume a specific run, set the `resume` argument to `True`. With this option, whenever the function `func` is called, `checkpoint_dir` points to a directory that contains all of your previous checkpoints, so you can load the model from it and continue training from there.
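A function that cooperates with resuming might look roughly like the sketch below. It keeps only an epoch counter in a JSON file so that it runs without PyTorch; in a real trainer you would store model and optimizer state instead. The file name `state.json` and the `stop_after` parameter (used here only to simulate a preempted run) are assumptions made for illustration, not part of dysweep.

```python
import json
import tempfile
from pathlib import Path


def train(config, checkpoint_dir, stop_after=None):
    """Run the remaining epochs, checkpointing after each one."""
    state_file = Path(checkpoint_dir) / "state.json"  # hypothetical file name
    start_epoch = 0
    if state_file.exists():
        # a previous attempt left a checkpoint: continue where it stopped
        start_epoch = json.loads(state_file.read_text())["next_epoch"]

    epochs_run = []
    for epoch in range(start_epoch, config["trainer"]["epoch_count"]):
        if stop_after is not None and len(epochs_run) == stop_after:
            break  # simulate the machine being preempted
        epochs_run.append(epoch)  # the actual training step would go here
        state_file.write_text(json.dumps({"next_epoch": epoch + 1}))
    return epochs_run


with tempfile.TemporaryDirectory() as checkpoint_dir:
    config = {"trainer": {"epoch_count": 5}}
    first_attempt = train(config, checkpoint_dir, stop_after=2)
    second_attempt = train(config, checkpoint_dir)

print(first_attempt, second_attempt)
```

The second call picks up at the epoch recorded by the first, which is the behaviour the `resume` option is meant to enable.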

### Multi-resume

Say you have run a sweep for a long time and come back to find that 10 of the runs have failed because the machine cluster preempted the tasks. You want to resume all 10 runs **using a single process** on a machine that you know will not preempt. In that case, call `dysweep_run_resume` with `resume` set to `True`; instead of passing a single `rerun_id`, set the `count` argument to the number of runs you wish to resume.

```python
dysweep_run_resume(
    function=func,
    resume=True,
    count=10, # resume 10 runs
    project='tutorial',
)
```

Using the information in this section, you can experiment with the training task to try out all of these capabilities. We have implemented the full task in two scripts: one for [instantiating](../testing/main_sweep_maker1.py) the sweep and one for [running](../testing/main_sweep_user.py) it.

To do so, run the following command:

```bash
cd ../testing/
python main_sweep_runner.py
```

Then, you can access the sweep id by simply looking at the output of the command. Let's denote the output by `ssssssss`. Then, you can run the following command to run the sweep:

```bash
python main_sweep_user.py --sweep_id=ssssssss --count=<the-number-of-runs-to-trigger-with-the-process>
```

You can also interrupt any of the processes and take a look at the `dysweep_logs` directory to see how the logs are being saved. Under each `checkpoints-<sweep_id>` directory, you will see the configurations that have run successfully (in the form of `<run-id>-config.json`), plus a set of directories for the runs that are either running at the moment or have been killed or interrupted (in the form of `<i>-<run-id>`).

Now, you can re-run any of the runs that have logs in the `dysweep_logs` directory. To do so, simply check the run identifier of your desired run (let's assume it is `rrrrrrrr`) and run the following command:

```bash
python main_sweep_user.py --sweep_id=ssssssss --rerun_id=rrrrrrrr --resume=<True/False>
```

With the `resume` flag you choose whether to re-run from scratch or to resume. Note that the code in the testing directory follows the `jsonargparse` standard, which we describe in the documentation.

## Beyond Hyperparameter Search

Our aim with this library is to allow a more generic and configurable way of running experiments. We therefore intend it to be used for **any** large-scale computation involving different configurations that can be **generated in a systematic way**.

For example, let's assume you want to check your model's performance across a variety of datasets, or check the effect of normalization on the input. Instead of running a different sweep for each dataset, we can define a standard and generic configuration scheme for all possible datasets and then plug them into our sweep configuration. This way, we can run our model on multiple datasets using a single sweep.

Moreover, as a special use case in research, after developing an optimized model we want to benchmark it against different baselines on different datasets. All of these configurations can be systematically swept over using dysweep.

### Checking the Effect of Normalization -- A Use-Case of List Operations

Assume you want to check how much normalization affects the performance of the model. If we look at the configuration used for this problem, we can see that the training and testing transforms each contain a list of transformations, where the last transformation normalizes with respect to the mean and standard deviation of the training set.

Using dysweep upserts, you can also define operations over the configuration. Recall that the philosophy behind dysweep is to start from the base configuration and then apply changes to it. One such change is to **remove** the last element of the lists `train_transforms` and `test_transforms`, which lets us run the model without normalization.

The following sweep configuration does exactly that:


```python
{
    'name': 'normalization-sweep',
    'method': 'grid',
    'metric': {
      'name': 'loss',
      'goal': 'minimize'   
    },
    'parameters': {
        'data': {
            'dy__upsert': [
                {
                    'sweep': True,
                    'sweep_identifier': 'norm_no_norm',
                    'sweep_alias': ['without_norm', 'with_norm'],
                    'values': [
                        # remove the last transform from both train and test transforms
                        {
                            'train_transforms': {
                                'dy__list__operations': [
                                    {'dy__remove': -1}
                                ]
                            },
                            'test_transforms': {
                                'dy__list__operations': [
                                    {'dy__remove': -1}
                                ]
                            }
                        },
                        # leave as is
                        {}
                    ]   
                }
            ]
        },
    }
}
```

You can also do the same thing without list operations, but it would require a hefty configuration in which you define both lists, `train_transforms` and `test_transforms`, for each of the two cases. We have implemented the sweep instantiator in [this](../testing/sweep_maker2.py) script, which you can run as before.



### Checking the Model on Different Datasets

To demonstrate that our package goes beyond hyperparameter tuning, we have devised another sweep configuration that runs experiments on different datasets. To this end, we also consider the [CIFAR100](https://www.cs.toronto.edu/~kriz/cifar.html) classification task. To change datasets, we only need to change the transforms and the dataset class, as well as the number of classes in the model's output logits. This can be done using the following:

```python
{
    'name': 'dataset-sweep',
    'method': 'grid',
    'metric': {
      'name': 'loss',
      'goal': 'minimize'   
    },
    'parameters': {
        'dy__upsert': [
            {
                'sweep': True,
                'sweep_identifier': 'dataset',
                'sweep_alias': ['CIFAR100', 'CIFAR10'],
                'values': [
                    # CIFAR100
                    {
                        'data': {
                            'dataset_class': 'torchvision.datasets.CIFAR100',
                            'train_transforms': {
                                'dy__list__operations': [
                                    {
                                        'dy__overwrite': [
                                            3,
                                            {
                                                "class_path": "torchvision.transforms.Normalize",
                                                "init_args": {
                                                    "mean": [0.5071, 0.4865, 0.4409],
                                                    "std": [0.2673, 0.2564, 0.2762]
                                                }
                                            }
                                        ]
                                    },
                                ]
                            },
                            'test_transforms': {
                                'dy__list__operations': [
                                    {
                                        'dy__overwrite': [
                                            1,
                                            {
                                                "class_path": "torchvision.transforms.Normalize",
                                                "init_args": {
                                                    "mean": [0.5071, 0.4865, 0.4409],
                                                    "std": [0.2673, 0.2564, 0.2762]
                                                }
                                            }
                                        ]
                                    },
                                ]
                            },
                        },
                        'model': {
                            'init_args': {
                                'num_classes': 100
                            }
                        }
                    },
                    # leave as is 
                    {}
                ]
            },
        ]
    }
}
```
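The two list operations used in these configurations can be read as small edits applied to a copy of the base list. The interpreter below is a simplified sketch based on how the operations are used above (`dy__remove` takes an index, `dy__overwrite` takes an `[index, new_value]` pair); `apply_list_operations` is a hypothetical helper, and dysweep may support more operations or different edge-case behaviour.

```python
import copy


def apply_list_operations(items, operations):
    """Apply a sequence of list operations to a copy of `items`."""
    result = copy.deepcopy(items)
    for operation in operations:
        if "dy__remove" in operation:
            # remove the element at the given index (-1 is the last element)
            del result[operation["dy__remove"]]
        elif "dy__overwrite" in operation:
            index, value = operation["dy__overwrite"]
            result[index] = copy.deepcopy(value)
        else:
            raise ValueError(f"unsupported operation: {operation}")
    return result


# Strings stand in for the transform dictionaries of the base configuration
train_transforms = ["RandomHorizontalFlip", "RandomCrop", "ToTensor", "Normalize(cifar10)"]

without_norm = apply_list_operations(train_transforms, [{"dy__remove": -1}])
for_cifar100 = apply_list_operations(
    train_transforms, [{"dy__overwrite": [3, "Normalize(cifar100)"]}]
)
print(without_norm)
print(for_cifar100)
```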


### Use-Case of Linking using `dy__eval` (Advanced Linking)

While [PyYAML](https://github.com/yaml/pyyaml/) provides dynamic linking capabilities, [DyPy](https://github.com/vahidzee/dypy) takes this a step further by allowing code snippets in the YAML configuration. We leverage this capability of DyPy to introduce a new linking mechanism. As a use case, we will implement the same sweep configuration as before, but this time using `dy__eval` instead of list operations.

For any particular attribute in the configurations, you can use `dy__eval` to set that attribute. `dy__eval` can be set to a code snippet of a function that takes in the **current configuration**, and using that, returns the value of the attribute of interest. 

For example, here, we may only sweep over the attribute `dataset_class` and then according to the value it has been set to, we can define the transform normalization values as well as the model class count. To do so, we can run the following configuration:

```python
get_transform_from_conf = """
def func(conf):
    dataset_type = conf['data']['dataset_class'].split('.')[-1]
    if dataset_type == 'CIFAR100':
        return {
            'mean': [0.5071, 0.4865, 0.4409],
            'std': [0.2673, 0.2564, 0.2762]
        }
    elif dataset_type == 'CIFAR10':
        return {
            'mean': [0.4914, 0.4822, 0.4465],
            'std': [0.2023, 0.1994, 0.2010]
        }
"""

get_num_classes_from_conf = """
def func(conf):
    dataset_type = conf['data']['dataset_class'].split('.')[-1]
    if dataset_type == 'CIFAR100':
        return 100
    elif dataset_type == 'CIFAR10':
        return 10
"""

{
    'name': 'dataset-sweep-dy-eval',
    'method': 'grid',
    'metric': {
      'name': 'loss',
      'goal': 'minimize'   
    },
    'parameters': {
        'data': {
            'dataset_class': {
                'sweep': True,
                'values': [
                    'torchvision.datasets.CIFAR100',
                    'torchvision.datasets.CIFAR10'
                ]
            }
        },
        'dy__upsert': [
            {
                'data': {
                    "train_transforms": [
                        {
                            "class_path": "torchvision.transforms.RandomHorizontalFlip",
                        },
                        {
                            "class_path": "torchvision.transforms.RandomCrop",
                            "init_args": {
                                "size": 32,
                                "padding": 4
                            }
                        },
                        {
                            "class_path": "torchvision.transforms.ToTensor",
                        },
                        {
                            "class_path": "torchvision.transforms.Normalize",
                            # set the transform according to the dataset dynamically
                            "init_args": {
                                "dy__eval": {
                                    "expression": get_transform_from_conf,
                                    "function_of_interest": "func"
                                }
                            }
                        }
                    ],
                    "test_transforms": [
                        {
                            "class_path": "torchvision.transforms.ToTensor",
                        },
                        {
                            "class_path": "torchvision.transforms.Normalize",
                            # set the transform according to the dataset dynamically
                            "init_args": {
                                "dy__eval": {
                                    "expression": get_transform_from_conf,
                                    "function_of_interest": "func"
                                }
                            }
                        }
                    ],
                },
                'model' : {
                    'init_args': {
                        'num_classes': {
                            'dy__eval': {
                                'expression': get_num_classes_from_conf,
                                'function_of_interest': 'func',
                            }
                        }
                    }
                }
            }
        ]
    }
}
```

You can also run this configuration by running the third sweep maker [script](../testing/sweep_maker4.py). For more details on `dy__eval`, please check out our documentation.
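Conceptually, evaluating a `dy__eval` entry amounts to executing the code string, looking up the function named by `function_of_interest`, and calling it with the current configuration. The sketch below shows that idea with plain `exec`; `evaluate_snippet` is a hypothetical helper, and this is an assumption about the mechanism rather than DyPy's actual implementation. Since the snippet is executed as code, it should only come from configuration files you trust.

```python
def evaluate_snippet(expression, function_of_interest, conf):
    """Execute a code string and call one of its functions on the configuration."""
    namespace = {}
    exec(expression, namespace)  # runs arbitrary code: trusted input only
    return namespace[function_of_interest](conf)


get_num_classes_from_conf = """
def func(conf):
    dataset_type = conf['data']['dataset_class'].split('.')[-1]
    return {'CIFAR100': 100, 'CIFAR10': 10}[dataset_type]
"""

conf = {"data": {"dataset_class": "torchvision.datasets.CIFAR100"}}
num_classes = evaluate_snippet(get_num_classes_from_conf, "func", conf)
print(num_classes)
```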
