In [1]:
from utils import *
from simulations import *

cuda


### Before reading this notebook, please follow the instructions in [INSTALL.md](../INSTALL.md)

## I - How to launch a simulation?

- All you need to do is specify the hyperparameters of the experiment in a yaml config file. The table below lists the hyperparameters to provide for each setup (note that the config `example.yaml` already contains the structure to follow):

| Name of the hyperparameter | Description | Default value (ours) |
|----------------------------|-------------|----------------------|
| seed | The seed used to make the training reproducible | 2021 |
| N_fold | The number of folds used for cross-validation | 3 |
| im_size | The size of the patches used for the training phases | 128 |
| setup | The setup considered for the experiment (SrcOnly, Mix or Update) | 'SrcOnly' |
| precisions | Extra details about the experiment (if None, deduced from source_path and target_path) | s=[source]_t=[target] |
| source->nb_source_max | The maximal number of source patches used during training | 10**(8) |
| source->filename | The filename of the source domain | 'source-none.hdf5' |
| source->name | The name of the source domain (deduced from source->filename, no need to add it) | 'source-none' |
| target->nb_target_max | The maximal number of target patches used during training | 10**(8) |
| target->filename | The filename of the target domain | 'target-qf(5).hdf5' |
| target->name | The name of the target domain (deduced from target->filename, no need to add it) | 'target-qf(5)' |
| training->save_at_each_epoch | If True, the weights of the detector are saved at each epoch, for the first fold only | true |
| training->max_epochs | The maximal number of epochs for the training phases | 30 |
| training->earlystop_patience | The maximal number of epochs to wait before early stopping | 5 |
| training->lr | The initial learning rate for the training phases | 0.0001 |
| training->batch_size | The batch size used during the training phases | 128 |
| eval->batch_size | The batch size used during the evaluation phases | 512 |
| eval->domain_filenames | The filenames of the domains used for the evaluation phases | ["target-qf(5).hdf5", "target-qf(10).hdf5", "target-qf(20).hdf5", "target-qf(50).hdf5", "target-qf(100).hdf5", "target-none.hdf5"] |
| eval->domain_names | The names of the domains used for the evaluation phases (deduced from domain_filenames, no need to add it) | ["qf(5)", "qf(10)", "qf(20)", "qf(50)", "qf(100)", "none"] |
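
To make the table more concrete, here is a minimal sketch of the same defaults as a nested Python dictionary. It assumes that `a->b` in the table means "key `b` nested under key `a`", and that the deduced names follow the rule suggested by the default values (drop the `.hdf5` extension, and for evaluation domains also drop the `target-` prefix). The helpers `domain_name` and `eval_domain_name` are hypothetical and are not part of `simulations.py`; `example.yaml` remains the reference for the real structure.

```python
from pathlib import PurePosixPath


def domain_name(filename: str) -> str:
    """Hypothetical helper: 'source-none.hdf5' -> 'source-none'."""
    return PurePosixPath(filename).stem


def eval_domain_name(filename: str) -> str:
    """Hypothetical helper: 'target-qf(5).hdf5' -> 'qf(5)' (Python 3.9+)."""
    return domain_name(filename).removeprefix("target-")


# Assumed nesting: 'source->filename' becomes config['source']['filename']
config = {
    "seed": 2021,
    "N_fold": 3,
    "im_size": 128,
    "setup": "SrcOnly",
    "precisions": None,
    "source": {"nb_source_max": 10**8, "filename": "source-none.hdf5"},
    "target": {"nb_target_max": 10**8, "filename": "target-qf(5).hdf5"},
    "training": {
        "save_at_each_epoch": True,
        "max_epochs": 30,
        "earlystop_patience": 5,
        "lr": 0.0001,
        "batch_size": 128,
    },
    "eval": {
        "batch_size": 512,
        "domain_filenames": [
            "target-qf(5).hdf5", "target-qf(10).hdf5", "target-qf(20).hdf5",
            "target-qf(50).hdf5", "target-qf(100).hdf5", "target-none.hdf5",
        ],
    },
}

# Fill in the entries the table marks as "deduced, no need to add it"
config["source"]["name"] = domain_name(config["source"]["filename"])
config["target"]["name"] = domain_name(config["target"]["filename"])
config["eval"]["domain_names"] = [
    eval_domain_name(f) for f in config["eval"]["domain_filenames"]
]
```

The three deduced entries are computed rather than typed by hand, which mirrors the "no need to add it" remarks in the table.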


For what follows, note that the source and target filenames are stored in the lists `sources` and `targets`, which were implicitly imported above via `simulations.py`.

In [None]:
print(sources)
print(targets)

- **Example 1: we want to test the experiment `SrcOnly_s=none_t=qf(5)`**

In [None]:
simulate('./Results/SrcOnly-s=none_t=qf(5)/hyperparameters-SrcOnly-s=none_t=qf(5).yaml')

- **Example 2: we want to test the experiment `TgtOnly_s=qf(5)_t=qf(5)`**

*Technically, the TgtOnly setup is just a SrcOnly setup with another source. Hence, we did not explicitly implement a TgtOnly setup in our code.*

In [None]:
simulate('./Results/SrcOnly-s=qf(5)_t=qf(5)/hyperparameters-SrcOnly-s=qf(5)_t=qf(5).yaml')
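
The config paths used in these examples follow a regular pattern. As a minimal sketch, assuming the pattern holds for every experiment, a hypothetical helper `config_path` (not part of the repository) can build the path and apply the TgtOnly-to-SrcOnly mapping described above:

```python
def config_path(setup, source, target, results_dir="./Results"):
    """Hypothetical helper: build the yaml path passed to simulate()."""
    # TgtOnly is not a setup of its own: it is SrcOnly trained on the target domain
    if setup == "TgtOnly":
        setup, source = "SrcOnly", target
    run = f"{setup}-s={source}_t={target}"
    return f"{results_dir}/{run}/hyperparameters-{run}.yaml"


print(config_path("SrcOnly", "none", "qf(5)"))
print(config_path("TgtOnly", "qf(5)", "qf(5)"))
```

The two printed paths are the ones used in Examples 1 and 2.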

- **Example 3: we want to test the experiment `Update(sigma=8)_s=none_t=qf(5)`**

*For this setup, you also need to specify the bandwidth parameters for each final dense layer. This is done with an extra key 'sigmas' that you add to the config file.*

*You can also use the key 'precisions' to indicate that you chose a specific bandwidth for your experiment, so that it appears in the names of the folder and the file containing the results.*
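
As a rough sketch of what this could look like on the Python side, the hypothetical helper below adds the two keys to a config dictionary. Several points are assumptions, not facts from the code base: that 'sigmas' sits at the top level of the config, that it holds one value per final dense layer, that there are three such layers, and the format of the 'precisions' string. Check `example.yaml` and `simulations.py` for the actual conventions.

```python
import copy


def with_sigmas(config, sigma, n_layers, precisions=None):
    """Hypothetical helper: return a copy of `config` set up for the Update setup."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    updated["setup"] = "Update"
    # Assumption: one bandwidth per final dense layer, stored as a list
    updated["sigmas"] = [sigma] * n_layers
    if precisions is not None:
        updated["precisions"] = precisions
    return updated


base = {"setup": "SrcOnly", "precisions": None}
# n_layers=3 and the precisions string are illustrative values only
update_config = with_sigmas(base, sigma=8, n_layers=3,
                            precisions="sigma=8_s=none_t=qf(5)")
print(update_config["sigmas"])
```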


In [None]:
simulate('./Results/Update-s=none_t=qf(5)/hyperparameters-Update-s=none_t=qf(5).yaml')

## II - Can I reproduce the GIF shown in the README to see what is going on in each experiment?

In [None]:
import torch
import imageio

Of course! Setting the key `save_at_each_epoch` to True saves the weights of your detector at each epoch of the first training phase (first fold).
Once you have all the weights, you can use the function below.

It requires imageio, which you can install with `pip install imageio`.

You also need a batch and its associated labels from your domain beforehand.

To get them, you can do something like this:

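```
from torch.utils.data import DataLoader

# `your_domain_path` is a placeholder for the path to your domain file
my_set = MyDataset(your_domain_path, key1='test_0', key2='l_test_0')
my_dataloader = DataLoader(my_set, batch_size=512, shuffle=True)

# Fix the seed so that the shuffled batch is the same on every run
torch.manual_seed(10)
it = iter(my_dataloader)
batch, labels = next(it)
```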

Note that you also need to provide, through a config file, the same hyperparameters that you used for your experiment.

In [None]:
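def create_gif(hyperparameters_config_file, batch, labels, n_epochs=25):
    """Plot the embedding distribution for each saved epoch and assemble the plots into Evolution.gif.

    n_epochs must not exceed the number of checkpoints saved during training.
    """
    hyperparameters = initialize_hyperparameters(hyperparameters_config_file)
    setup = hyperparameters['setup']  # read once: nested quotes in f-strings fail before Python 3.12
    my_detector = ForgeryDetector(hyperparameters)
    my_detector.to(device)

    # The batch must live on the same device as the detector; the labels are used as a NumPy mask
    batch = batch.to(device)
    labels = labels.cpu().numpy()

    for i in range(n_epochs):
        # Checkpoints are numbered from 1
        weights_path = f'./Results/{my_detector.folder_path}/{setup}-{i+1}.pt'
        my_detector.load_state_dict(torch.load(weights_path, map_location=device))
        my_detector.eval()

        with torch.no_grad():
            embedding = my_detector(batch).cpu().numpy()

        # Scale each class by its own standard deviation (ddof=1 matches the default of torch.std)
        real = embedding[labels == 0].reshape(-1)
        forged = embedding[labels == 1].reshape(-1)
        real = real / real.std(ddof=1)
        forged = forged / forged.std(ddof=1)

        plt.figure(figsize=(24, 8))
        plt.hist(real, alpha=0.5, label='real', bins=50, color='#1ABC9C', density=True)
        plt.hist(forged, alpha=0.5, label='forged', bins=50, color='#186A3B', density=True)
        plt.plot([0, 0], [0, 1], color='black', lw=5, linestyle='--', alpha=0.5)
        plt.title(f'Distribution of the final embeddings from your domain ({setup}, epoch {i+1})')
        plt.legend()
        plt.xlim(-5, 5)
        plt.ylim(0, 1)

        plt.savefig(f'{i}.png')
        plt.close()

    # Repeat each frame 10 times so that every epoch stays on screen longer.
    # Only the frames saved above are read (the original read 30 files but saved 25).
    with imageio.get_writer('Evolution.gif', mode='I') as writer:
        for filename in [f'{i}.png' for i in range(n_epochs) for _ in range(10)]:
            image = imageio.imread(filename)
            writer.append_data(image)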
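
The GIF writer receives every PNG ten times in a row, which slows down the animation without changing the writer settings. The sketch below isolates that frame-ordering step in a hypothetical helper `repeated_frames` (not part of the notebook), using plain Python so that it runs without imageio. The number of frames requested must match the number of PNG files actually saved, otherwise reading a missing file will fail.

```python
def repeated_frames(n_epochs, repeats=10):
    """Hypothetical helper: list each '<i>.png' `repeats` times, in epoch order."""
    return [f"{i}.png" for i in range(n_epochs) for _ in range(repeats)]


frames = repeated_frames(n_epochs=3, repeats=2)
print(frames)
# ['0.png', '0.png', '1.png', '1.png', '2.png', '2.png']
```

With 25 saved epochs and the default of 10 repeats, the GIF contains 250 frames.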