Commit c4d7b98: Final version.
lifetheater57 committed Jan 2, 2021 · 1 parent 3028e2e
Showing 5 changed files with 116 additions and 18 deletions.
@@ -12,7 +12,25 @@ Requires: Access to an Nvidia GPU (locally or remotely)

## Images of expected results {#demo-improving-rl-baseline-expected}

You should expect to get results similar to the following by training the DAE for 150 epochs and continuing for another 150 epochs on images of size 240 * 320.

<figure>
<figcaption>Reconstruction of the image by the DAE after training over 150 epochs with a learning rate of `0.001` and `adam` optimizer. On the left, we can see the original image. On the right is the image generated by the DAE.</figcaption>
<img style='width:22em' src="figures/image_through_dae_release_150_epoch.png"/>
</figure>

<figure>
<figcaption>Reconstruction of the image by the DAE after training over 300 epochs with a learning rate of `0.001` and `adam` optimizer. On the left, we can see the original image. On the right is the image generated by the DAE.</figcaption>
<img style='width:22em' src="figures/image_through_dae_release_300_epoch.png"/>
</figure>

You should expect to get results similar to the following by training the $\beta$-VAE for 150 epochs on images of size 240 * 320 with a latent dimension of 128 and without using the DAE for the loss.

<figure>
<figcaption>Reconstruction of the image by a $\beta$-VAE after training over 150 epochs with a learning rate of `0.0005`, a latent dimension of 128 and `adam` optimizer without using a DAE for the loss.
Left: original image, Right: VAE reconstruction of the original image. </figcaption>
<img style='width:22em' src="figures/image_through_beta_vae_release_150_epoch.png"/>
</figure>

## Laptop setup notes {#demo-improving-rl-baseline-laptop-setup}

@@ -123,11 +141,11 @@ Option | Description | Default value
`split` | number of images per file (if used without --compress) | 2000


To reproduce our results, use the map we created by appending `--map-name $PWD/maps/dataset_generator.yaml` to the command. Later, you will need to have the images in png format, so you should use the flag `--compress`.

Once the files are generated, if they were generated in png, `cd` into the folder that contains the images. Then, list them in a file.

laptop $ ls -d $PWD/* > train.txt

Finally, copy `train.txt` into the directory designated by the key `data:files:base` and assign the name of this file to the key `data:files:train` of the config file you will use (`config/defaults.yaml` by default).
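
For example, a minimal sketch of this step, assuming `/path/to/dataset` stands for the directory designated by `data:files:base` and that the file name is stored under a `train:` key in the config file (both assumptions about your actual setup; editing the value by hand works just as well):

laptop $ cp train.txt /path/to/dataset/                                   # /path/to/dataset: placeholder for the data:files:base directory
laptop $ sed -i 's/train: .*/train: "train.txt"/' config/defaults.yaml    # assumes the key is written as train: "..." in the YAML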

@@ -165,12 +183,71 @@ If you wish to train the beta-VAE using directly the original input images and t

If you want to add some tags to your Comet experiment, you can add them with `--comet-tag tag1 tag2`.
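
For example, to tag a DAE training run (the tag names here are arbitrary examples):

laptop $ python3 train.py --comet-tag dae 150-epochs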

In practice, to reproduce the results for a DAE, you can open the config file, set the `model` value to `dae`, and set the `num_epochs` value to the desired number of epochs. Then, execute the following command.

laptop $ python3 train.py

To train a $\beta$-VAE without using a DAE, set the `beta` and `num_epochs` values in the config file, and set the `model` value to `beta_vae`. Then, execute the following command.

laptop $ python3 train.py --no-dae

If you want to sequentially train a DAE and a $\beta$-VAE that uses this DAE, starting from scratch, then set the `model` value of the config file to `dae`, execute the training script, change the `model` value to `beta_vae`, and execute `train.py` again. The following command does it automatically using the default config file.

laptop $ python3 train.py && sed -i 's/module: "dae"/module: "beta_vae"/g' config/defaults.yaml && python3 train.py && sed -i 's/module: "beta_vae"/module: "dae"/g' config/defaults.yaml

For the main training sessions we ran, see the [training sessions](#improving-rl-baseline-final-contribution) section of the report. Note that most of the process we went through involved changing the code rather than only changing config values.

### Analysing the resulting models

In order to check what the DAE and the $\beta$-VAE have learned to reconstruct, and what the variables of the latent space of the $\beta$-VAE represent, two scripts were created.

The first script, `explore_latent_space.py`, can either be used to visualize the output of the VAE or to visualize its traversals, which are the images generated from a particular state by varying each latent variable one at a time.

There are different options available when launching the script.


Option | Description | Default value
--- | --- | ---
`config` | path to config file | ./config/defaults.yaml
`vae-checkpoint` | vae checkpoint to use | None
`select-state` | select the state from which to generate the traversals | false
`state` | state to use if --select-state is not given or if no state has been selected | None
`generate-bounds` | generate the latent space dimension bounds | false
`generate-traversals` | generate traversals of the latent space | false
`dimensions` | dimensions to traverse | None
`bounds-checkpoint` | vae latent space bounds checkpoint to use | None


Using the `--select-state` flag allows you to both visualize the output of the $\beta$-VAE on a random image from the training dataset and choose the state from which to generate the traversals. It will print to the console the state in latent space corresponding to the image. This printed state might have to be reformatted as a space-separated list of numbers to be used with the `--state` flag. Using `--select-state` will also output three png files in a folder named `img` under the directory set in the `output_path` value of the config file (`image_through_beta_vae.png`, `original_image_beta_vae.png`, `decoded_image_beta_vae.png`). The last two files can be used by the `explore_dae.py` script when it is run with the `--single-image` flag.

The latent space bounds need to be generated before the traversals can be generated. They will be generated either automatically for the current model, if the bounds checkpoint file is required but doesn't exist, or explicitly with the `--generate-bounds` flag. The latent space bounds will be saved under `bounds_latest_ckpt.npy` in the directory set in the `output_path` value of the config file.

The `--dimensions` flag sets the dimensions to traverse and takes them as a space-separated list of numbers. If not specified, the script will go through all the latent variables while generating the traversals.
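
Putting these options together, a typical session could look like the sketch below. `<vae_ckpt>` is a placeholder for your checkpoint path, the state values are truncated (the real state has one value per latent dimension), and the spelling of the checkpoint flag is assumed to follow the option name in the table above.

laptop $ python3 explore_latent_space.py --vae-checkpoint <vae_ckpt> --select-state        # pick a state; the latent state is printed to the console
laptop $ python3 explore_latent_space.py --vae-checkpoint <vae_ckpt> --generate-bounds     # saves bounds_latest_ckpt.npy under output_path
laptop $ python3 explore_latent_space.py --vae-checkpoint <vae_ckpt> --generate-traversals --state 0.1 0.3 1.2 --dimensions 0 1 2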

The second script, `explore_dae.py`, can be used to visualize the output of the DAE, or to visualize the output of the DAE for both the original image and the image reconstructed by the VAE.

There are different options available when launching the script.


Option | Description | Default value
--- | --- | ---
`config` | path to config file | ./config/defaults.yaml
`dae-checkpoint` | dae checkpoint to use | None
`dae-output` | generate images to see the original image and its reconstruction by the DAE | false
`print-values` | print output image values to console | false
`single-image` | use latent space exploration output | false


Use the `dae-output` flag to analyse the reconstruction of the images by the DAE. It will save png files in a folder named `img` under the directory set in the `output_path` value of the config file (`image_through_dae.png`, `original_image_dae.png`, `decoded_image_dae.png`).

Use the `print-values` flag if you need to check that the numerical values the DAE outputs make sense.

Use the `single-image` flag to generate a single image in a folder named `img` under the directory set in the `output_path` value of the config file (`image_through_dae.png`). The image consists of four subimages corresponding, clockwise from the top left, to the original image, the original image reconstructed by the DAE, the image sequentially reconstructed by the $\beta$-VAE and the DAE, and the original image reconstructed by the $\beta$-VAE. The images used to generate this combined image are `original_image_beta_vae.png` and `decoded_image_beta_vae.png`, which must be present in the `output_path` folder and can be generated by the `explore_latent_space.py` script. This functionality is useful to analyse how the $\beta$-VAE reconstructs the images and how this relates to the effect of the DAE used to compute the loss during training.
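
As a rough usage sketch (`<dae_ckpt>` is a placeholder for your checkpoint path, and the flag spellings are assumed to follow the option names in the table above):

laptop $ python3 explore_dae.py --dae-checkpoint <dae_ckpt> --dae-output
laptop $ python3 explore_dae.py --dae-checkpoint <dae_ckpt> --single-image --print-values

The first command saves the original/reconstructed image pairs under the `output_path` `img` folder; the second combines the images exported by `explore_latent_space.py` and prints the output values to the console.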

## Troubleshooting {#demo-improving-rl-baseline-troubleshooting}

Nothing ever goes wrong, right?

## Demo failure demonstration {#demo-improving-rl-baseline-failure}

Since we didn't achieve our original goal, we don't really have a failure demonstration. You can see different possible examples in the training sessions section of the [report](#improving-rl-baseline-final-contribution).
@@ -7,10 +7,8 @@ All of the work was done in collaboration between Étienne Boucher (@lifetheater

## The final result {#improving-rl-baseline-final-result}

Sorry if the title of the project misled you! But you can still read the report to learn more about a very interesting idea that you might be able to implement successfully!
Our initial goal was to compare an RL agent trained on top of a perceptual module with the current RL baseline. However, we encountered some bottlenecks in the training of the perceptual model and thus offer a basis for work to be continued rather than an improvement of the RL baseline. Even though our work didn't lead to an impact on the Duckiebots' performance this semester, we hope that it will be full of useful insights for the future of RL agents in Duckietown.

<figure>
<figcaption>Samples of original image (left) and reconstruction by the DAE (right)</figcaption>
<img style='width:20em' src="figures/dae_sample.png"/>
@@ -246,8 +244,7 @@ The fact that the colors were really out of the expected range brought us to qu

<figure>
<figcaption>Reconstruction of the image by the DAE after training over 600 epochs with a learning rate of `0.001` and `adam` optimizer. On the left, we can see the original image. On the right is the image generated by the DAE.</figcaption>
<img style='width:22em' src="figures/image_through_dae_epoch_600_after_drop.png
"/>
<img style='width:22em' src="figures/image_through_dae_epoch_600_after_drop.png"/>
</figure>

<figure>
@@ -360,6 +357,26 @@ Leaving the colorjittering transformations out of the data processing with the s

These experiments might hint that the normalization of input images should be checked, or we should investigate further the impact of the size of the input.

The final DAE model has 32, 32, 64, and 64 filters in the encoder (and the reverse in the decoder), a dense bottleneck layer of 128 neurons, and uses batch normalization along with LeakyReLU non-linearities for the convolution and transposed convolution layers. The weights of the convolution and transposed convolution layers are initialized with a He normal distribution. We can see below the results of training this model for 150 epochs and continuing for another 150 epochs.

<figure>
<figcaption>Reconstruction of the image by the DAE after training over 150 epochs with a learning rate of `0.001` and `adam` optimizer. On the left, we can see the original image. On the right is the image generated by the DAE.</figcaption>
<img style='width:22em' src="figures/image_through_dae_release_150_epoch.png"/>
</figure>

<figure>
<figcaption>Reconstruction of the image by the DAE after training over 300 epochs with a learning rate of `0.001` and `adam` optimizer. On the left, we can see the original image. On the right is the image generated by the DAE.</figcaption>
<img style='width:22em' src="figures/image_through_dae_release_300_epoch.png"/>
</figure>

The final $\beta$-VAE has 32, 64, 128, 256, and 512 filters in the encoder (and the reverse in the decoder), a dense layer of 256 neurons after the encoder and before the decoder, and uses batch normalization along with LeakyReLU non-linearities for the convolution and transposed convolution layers. The weights of the convolution and transposed convolution layers are initialized with a He normal distribution. We can see below the results of training this model for 150 epochs without using the DAE for the loss computation.

<figure>
<figcaption>Reconstruction of the image by a $\beta$-VAE after training over 150 epochs with a learning rate of `0.0005`, a latent dimension of 128 and `adam` optimizer without using a DAE for the loss.
Left: original image, Right: VAE reconstruction of the original image. </figcaption>
<img style='width:22em' src="figures/image_through_beta_vae_release_150_epoch.png"/>
</figure>

To summarize, there are several parameters we considered and varied for the $\beta$-VAE training:

- Data :
@@ -407,7 +424,9 @@ The first one is `explore_latent_space.py` that can either be used to visualize t
<img style='width:20em' src="figures/Traversals2.png"/>
</figure>

The second script is `explore_dae.py`, which can be used to visualise the output of the DAE, or to visualise the output of the DAE for both the original image and the image reconstructed by the VAE, as in figure 5.15. Note that the image filename pattern must match the one that the `explore_latent_space.py` script uses to export its images.

See the *Analysing the resulting models* section of the [instructions](#demo-improving-rl-baseline-run) for more details about how to use these scripts.

## Formal performance evaluation / Results {#improving-rl-baseline-final-formal}
### DAE
@@ -425,20 +444,22 @@ Looking at the DAE reconstructions along the training, we notice that before the
<img style='width:20em' src="figures/dae_train_loss_good_dae_run.svg"/>
</figure>

Then, in the case of that particular training session, the network learned at the end of the first plateau to generate different output values in the different RGB channels. This is something that didn't happen during the other training sessions and that we couldn't explain with precision. Investigation is required to find a way to ensure that learning to output in color happens.

### Beta Variational Auto Encoder

As for the $\beta$-VAE, within the allowed time, we couldn't find a way to reconstruct images with more visually recognisable elements than the sky and the ground.

### Overall results

While the DAE gave satisfactory results, we were not able to obtain a good $\beta$-VAE model. Even after trying different strategies, including smarter weight initialisation and increasing the number of filters and the dimension of the latent space, we couldn't get anything other than the grey sky with a uniform dark ground. There might be some more parameter tuning to be done to be able to reconstruct colors and details.
Nonetheless, the runs on smaller images hint that it might be worth developing a model on smaller images first.

We didn't get to training the RL part of DARLA, so we could not assess the performance of our model following the process outlined earlier. Instead, we have set the basis and infrastructure for future work in that direction.

## Future avenues of development {#improving-rl-baseline-final-next-steps}

The first step would be to complete the search for a disentangled representation and then try using it to train the RL agent.
One way to go would be to start with a model taking 64*64 images, as it seemed like the most promising VAE run; then, depending on the performance of the agent, it could also be interesting to investigate reward shaping and other RL techniques like Rainbow DQN, TD3, and SAC.

<div id="./bibliography.bib"></div>
