In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:95% !important; }</style>"))

## CycleGAN Notes:
Initial trials of a basic Cycle-Consistent Generative Adversarial Network (CycleGAN) on a toy dataset show some positive results, and we hope that further tuning will improve them. The toy dataset is generated by a Python script using constant lab test parameters, which lets us generate a finite number (`n`) of data points. From the toy dataset we pick a subset of points to represent the data from the Lab Test Results (`LTR dataset`); for the simulation dataset (`dataset_Sim`) we pick roughly twice as many points as in the LTR dataset. In this toy problem the Simulation data points are designed to have a systematic error, which results in an (inaccurate) correction factor `c_f = 1`. Our goal in using CycleGANs is to transfer the characteristics of the LTR data distribution to the Simulation points, just as in [this example](https://dmitryulyanov.github.io/feed-forward-neural-doodle/) where CycleGANs were used to transfer the style of Van Gogh paintings to dog photos. <br>Ideally, the modified simulation data generated by the CycleGAN should be much closer to the LTR data points than the initial simulation data. Since lab test data in real applications contains random errors and outliers, the CycleGAN process should also account for these errors and identify them automatically.
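
As a rough illustration of the setup described above, the sketch below builds a toy pool of `n` points, picks an LTR subset, and picks about twice as many Simulation points with `c_f = 1`. The generation script itself is not shown in these notes, so the function name `make_toy_data`, the diameter range, the linear relation used for the "true" `c_f`, and all default values are assumptions made purely for illustration.

```python
import numpy as np

def make_toy_data(n=50, n_ltr=5, slope=0.002, d_min=100.0, d_max=200.0, seed=42):
    """Hypothetical toy-data generator (not the original script)."""
    rng = np.random.default_rng(seed)
    d = np.linspace(d_min, d_max, n)

    # Assumed "real world" relation: c_f > 1 and increasing with D.
    c_f_true = 1.05 + slope * (d - d_min)

    # LTR subset: a few points that follow the assumed relation.
    ltr_idx = np.sort(rng.choice(n, size=n_ltr, replace=False))
    dataset_ltr = np.column_stack([d[ltr_idx], c_f_true[ltr_idx]])

    # Simulation subset: about twice as many points, all with the
    # systematically wrong correction factor c_f = 1.
    sim_idx = np.sort(rng.choice(n, size=2 * n_ltr, replace=False))
    dataset_sim = np.column_stack([d[sim_idx], np.ones(sim_idx.size)])
    return dataset_ltr, dataset_sim

dataset_ltr, dataset_sim = make_toy_data()
```

Each returned array has two columns, `D` and `c_f`, which is the layout assumed by the other sketches in these notes.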

We will evaluate CycleGANs performance in two cases:
* **Case 1**: The LTR dataset is clean, without any outliers or errors, and shows a clear monotonically increasing trend.
* **Case 2**: The LTR dataset contains one outlier point, where we manipulate the correction factor of this point (`D == 132`) to an erroneous value.

<table>
    <tr>
        <td><b><center>Case 1</center></b><img src='images/CycleGAN_ToyData_Case1.png'></td>
        <td><b><center>Case 2</center></b><img src='images/CycleGAN_ToyData_Case2.png'></td>
    </tr>
</table>

To make the results reproducible we set `numpy.random.seed(seed = 42)`, so that the weights and other randomly initialized parameters start from the same values every time the notebook is re-run.
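
A minimal sketch of the seeding step is shown below. The helper name `set_seeds` is hypothetical. Note that `numpy.random.seed` only controls NumPy's global generator; if the model is built with a deep-learning framework (these notes do not say which one), that framework's own seed would most likely have to be fixed as well.

```python
import random
import numpy as np

def set_seeds(seed=42):
    """Seed the Python and NumPy global random generators."""
    random.seed(seed)
    np.random.seed(seed)
    # Assumption: with TensorFlow 2 one would additionally call
    # tf.random.set_seed(seed); it is omitted here so that the
    # sketch runs without a framework installed.

set_seeds(42)
first_draw = np.random.rand(3)
set_seeds(42)
second_draw = np.random.rand(3)  # identical to first_draw
```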

### Desired Output from CycleGANs
The CycleGAN is expected to learn the following:<br>
1. The Simulation points have a systematic error (`c_f = 1`); although they come from a physics-based model, they do not capture the real-world lab test data (or represent reality).
2. The Lab Test Results (LTR) data is representative of the real world. Although the LTR data also reflects physics (such as linear dependencies: `c_f` $\propto$ `dia`), it is non-linear (as in Case 2 above) because of uncertainties and outliers in the data.<br>
Finally, the CycleGAN should synthesize new data (i.e. transform the Simulation points into LTR-style data) that captures the non-linearity in the LTR data, exhibits the same slope as the LTR data, and is physically consistent (`c_f > 1`).  

In both cases, we expect the `c_f` value to be > 1 (as it is in the real world), and the CycleGAN should capture the monotonically increasing trend of `D` vs. `c_f` in the LTR dataset, taking into account the neighborhood of points in the Simulation dataset. In _Case 2_, the CycleGAN output should be only minimally affected by the outlier point in the LTR dataset; the desired output is represented by the **hand-drawn**, dotted green line. In other words, the points newly synthesized by the CycleGAN should have a lower systematic error than the Simulation points.
<tr>        
    <td><img src='images/CycleGAN_ToyProblem_ExpectedOutput.png' style="width: 1000px;"/></td>        
</tr>
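
The expectations above (`c_f > 1`, a monotonically increasing trend, and a slope similar to the LTR data) can be turned into simple numeric checks. The sketch below is one possible way to do this; the function name `evaluate_output` and the choice of a straight-line fit for the slope are assumptions, not part of the original notebook.

```python
import numpy as np

def evaluate_output(d, c_f):
    """Hypothetical checks on a set of (D, c_f) points."""
    d = np.asarray(d, dtype=float)
    c_f = np.asarray(c_f, dtype=float)

    # Fraction of points that violate the physical expectation c_f > 1.
    frac_unphysical = float(np.mean(c_f <= 1.0))

    # Monotonic trend: c_f must not decrease when points are ordered by D.
    c_sorted = c_f[np.argsort(d)]
    is_monotonic = bool(np.all(np.diff(c_sorted) >= 0.0))

    # Slope of a least-squares straight line through the points.
    slope = float(np.polyfit(d, c_f, 1)[0])

    return {"frac_unphysical": frac_unphysical,
            "is_monotonic": is_monotonic,
            "slope": slope}

report = evaluate_output([100, 120, 140], [1.1, 1.2, 1.3])
```

Running the same function on the LTR points and on the synthesized points gives two slopes that can be compared directly.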

## <font color='blue'>Hypotheses</font> and Observations:
Hyperparameter tuning for CycleGANs is a complex process compared to hyperparameter tuning of a normal neural network. There are many parameters which impact the output and performance of the CycleGANs on this toy dataset. Some of the recorded observations in response to our hypotheses are:

1. <font color='blue'> Does the presence of outliers in the dataset impact the CycleGAN Output? </font><br>
In **Case 1** (LTR points without any outliers) the style transfer is easy: because the dataset to learn from (the LTR dataset) is smooth and simple, the CycleGAN generates new points that capture the linear, monotonically increasing function after training for just 2000 epochs with `batch_size` = 4. Even so, we still get erroneous correction factors at the start.<br>
In **Case 2**, where we have one outlier in the LTR dataset, the non-linear shape of the LTR dataset distribution is captured only to an extent (after training for 20,000 epochs with `batch_size` = 2). <br> In both cases, the output is still not as expected and we sometimes get erroneous correction factors (< 1).

<table>
    <tr>
        <td><b><center>Case 1</center></b><img src='images/CycleGAN - Output1.png'></td>
        <td><b><center>Case 2</center></b><img src='images/CycleGAN - Output2.png'></td>
    </tr>
</table>

For a comparative study we first run a simulation to set up a **base case**. This base case has 5 LTR points and 12 Simulation points, with the other factors held fixed: number of `training epochs` = 20000, `batch size` = 2, and `update_sim_pool(max_size = 5)`.
<tr>        
    <td><b><center>Base Case</center></b><img src='images/CycleGAN_DataSetSize_CaseStudy_Base.png' style="width: 700px;"/></td>        
</tr>

2. <font color='blue'> Does increasing the sample points in the datasets impact the CycleGAN Output? </font><br>

Here we do two test runs: in **Case A** we increase the number of LTR data points while keeping the Simulation points constant, and in **Case B** we keep the LTR data points constant while increasing the number of Simulation data points. Other parameters are kept constant as in the base case.<br>
<table>
    <tr>
        <td><b><center>Case A - Nr. of LTR Points: 8; Nr. of Simulation Points: 12</center></b><img src='images/CycleGAN_DataSetSize_CaseStudyA_MoreLTRPoints.png'></td>
        <td><b><center>Case B - Nr. of LTR Points: 5; Nr. of Simulation Points: 23</center></b><img src='images/CycleGAN_DataSetSize_CaseStudyB_MoreSIMPoints.png'></td>
    </tr>
</table>

Adding more points to the simulation dataset very slightly reduces the overall error (points with `c_f < 1`), but it does not necessarily help in reaching the expected style transfer of the non-linear distribution shape.

3. <font color='blue'> Does **batch_size** impact the quality of the CycleGAN output? </font><br>
In this simulation, we vary the batch size while keeping the other parameters (number of LTR and simulation points, training epochs, etc.) constant as in the base case. A lower batch size gives slightly better results (in terms of capturing the style transfer). This is expected because we have few LTR data points.
<table>
    <tr>
        <td><b><center>Batch Size: 2</center></b><img src='images/BatchSize2.png'></td>
        <td><b><center>Batch Size: 4</center></b><img src='images/BatchSize4.png'></td>
    </tr>
</table>

4. <font color='blue'> Does the **nr. of training epochs** impact the CycleGAN output? </font><br>
In this simulation, we vary only the `no. of training epochs`, while keeping `batch_size` and other parameters constant as in the base case. Compared to the base case, where we ran the simulation for 20,000 epochs, training the model for more epochs seems to reduce the error in the output, but there is no improvement in the style transfer. In fact, the style transfer seems to get worse if the model is overtrained at a higher number of epochs.
<table>
    <tr>
        <td><b><center>Nr. of Epochs: 30000</center></b><img src='images/CycleGAN_Case1_Epochs30000.png'></td>
        <td><b><center>Nr. of Epochs: 40000</center></b><img src='images/CycleGAN_Case2_Epochs40000.png'></td>
    </tr>
</table>

5. <font color='blue'> What is the impact of modifying the arguments of the `update_sim_pool` function on the CycleGAN output? </font><br>
For this we test the effect of changing the `max_size` argument of the `update_sim_pool` function, keeping other parameters constant as in the base case. The `update_sim_pool` function is based on the `update_image_pool` function, as in the [original CycleGAN implementation](https://github.com/junyanz/CycleGAN/blob/master/util/image_pool.lua).

<table>
    <tr>
        <td><b><center>Max_Size: 2</center></b><img src='images/CycleGAN_USP_MS2_Case1.png'></td>
        <td><b><center>Max_Size: 10</center></b><img src='images/CycleGAN_USP_MS10_Case2.png'></td>
    </tr>
</table>
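
For context, a pool of this kind keeps a short history of previously generated samples and sometimes shows the discriminator an older sample instead of the newest one. The sketch below follows the usual image-pool logic with a 50% swap probability; the signature of the notebook's own `update_sim_pool` is not shown in these notes, so the arguments `pool`, `samples`, and `rng` used here are assumptions.

```python
import numpy as np

def update_sim_pool(pool, samples, max_size=5, rng=None):
    """Sketch of a history pool for generated samples (assumed signature)."""
    rng = np.random.default_rng() if rng is None else rng
    selected = []
    for sample in samples:
        if len(pool) < max_size:
            # Pool not yet full: store the new sample and use it.
            pool.append(sample)
            selected.append(sample)
        elif rng.random() < 0.5:
            # Use the new sample directly, without storing it.
            selected.append(sample)
        else:
            # Use a stored sample and replace it with the new one.
            idx = int(rng.integers(len(pool)))
            selected.append(pool[idx])
            pool[idx] = sample
    return np.asarray(selected)

pool = []
batch = np.arange(10, dtype=float).reshape(5, 2)
out = update_sim_pool(pool, batch, max_size=5, rng=np.random.default_rng(0))
```

With a small `max_size` the discriminator mostly sees recent samples; a larger `max_size` mixes in older generator outputs.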

The training losses for the two generators and the two discriminators seem to be performing satisfactorily.
<br>
<tr>        
    <td><img src='images/CycleGAN_ToyProblem_Loss.png' style="width: 1000px;"/></td>        
</tr>

### Planned Tasks:
* Investigate other loss functions for the discriminator, such as `mse`, `binary cross-entropy`, or other customized loss functions from [SR-GAN](https://arxiv.org/abs/1811.11269).
* Investigate `Physical consistency loss` from [LLNL Implementation of CycleGAN](https://github.com/rushilanirudh/icf-jag-cycleGAN/blob/master/modelsv2.py).<sup>[2]</sup>
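
To make the candidate losses concrete, the NumPy sketch below compares `mse` and `binary cross-entropy` on the same discriminator outputs. It also includes a simple penalty on `c_f < 1` as one conceivable physics-motivated term; this penalty is purely an illustrative assumption and is not the LLNL formulation, which is not reproduced here. All function names are hypothetical.

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean squared error between labels and discriminator outputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def bce_loss(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy; y_pred are probabilities in (0, 1)."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def cf_penalty(c_f, lower=1.0):
    """Assumed penalty: squared shortfall of c_f below the lower bound."""
    c_f = np.asarray(c_f, dtype=float)
    return float(np.mean(np.maximum(0.0, lower - c_f) ** 2))

labels = np.ones(2)             # "real" labels
outputs = np.array([0.5, 0.5])  # undecided discriminator
losses = (mse_loss(labels, outputs), bce_loss(labels, outputs))
```

For an undecided discriminator the cross-entropy is larger than the squared error, which is one reason the two choices can lead to different training dynamics.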

### Further Reading:
1. [CycleGAN (junyanz/CycleGAN)](https://github.com/junyanz/CycleGAN)
2. [Exploring Generative Physics Models with Scientific Priors in Inertial Confinement Fusion](https://arxiv.org/pdf/1910.01666.pdf)

### Discussions:
