A vacation research scheme project.
start date: 28 Nov 2019
end date: 12 Feb 2020
To create terrain for a virtual environment from sonar scans using a GAN.
Created an intensity map of the sonar scans from the given dataset.
The intensity map is an image with dimensions (maximum intensity value × number of angles).
For example, an intensity map covering 30 degrees of scan data has dimensions (176, 15): 176 is the maximum intensity value, and 15 is 30 degrees divided by the 2-degree angle increment between consecutive scans.
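As a rough illustration, a map like this could be built with NumPy as sketched below. The exact pixel encoding is an assumption (here each beam's reading is marked as a bright pixel whose row index is the intensity value), and `build_intensity_map`, `MAX_INTENSITY`, and `ANGLE_STEP` are hypothetical names, not the project's actual code.

```python
import numpy as np

MAX_INTENSITY = 176  # maximum sonar intensity value in the dataset
ANGLE_STEP = 2       # degrees between consecutive scans

def build_intensity_map(scans, sector_degrees=30):
    """Build a (MAX_INTENSITY x num_angles) image from per-angle readings.

    `scans` is assumed to hold one integer intensity in [0, MAX_INTENSITY)
    per beam angle.
    """
    num_angles = sector_degrees // ANGLE_STEP        # e.g. 30 / 2 = 15 columns
    img = np.zeros((MAX_INTENSITY, num_angles), dtype=np.uint8)
    for col, value in enumerate(scans[:num_angles]):
        img[int(value), col] = 255                   # mark the returned intensity
    return img
```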
A CLAHE (Contrast-Limited Adaptive Histogram Equalization) filter was applied to reduce the spotlight effect in the raw camera images.
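A minimal sketch of this step with OpenCV follows; the clip limit and tile grid size are common defaults rather than the project's actual settings, and the file names are hypothetical.

```python
import cv2

# Illustrative CLAHE parameters (assumed, not the project's recorded values)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

gray = cv2.imread("camera_frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
filtered = clahe.apply(gray)                                 # evens out the bright spotlight region
cv2.imwrite("camera_frame_clahe.png", filtered)
```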
Both the intensity map images and the filtered camera images were then resized to 64×64, a suitable input size for the network, and each intensity map was paired with its corresponding camera image for training.
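The pairing step can be pictured as a PyTorch `Dataset` like the sketch below, assuming one intensity map file per camera frame. The A/B key names follow the convention of the referenced pix2pix implementation, but the class itself is hypothetical.

```python
import cv2
import torch
from torch.utils.data import Dataset

class SonarCameraPairs(Dataset):
    """Hypothetical paired dataset: one intensity map per camera frame."""

    def __init__(self, map_paths, camera_paths, size=64):
        assert len(map_paths) == len(camera_paths)
        self.map_paths, self.camera_paths, self.size = map_paths, camera_paths, size

    def __len__(self):
        return len(self.map_paths)

    def _load(self, path):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (self.size, self.size))   # resize to 64x64
        img = img.astype("float32") / 127.5 - 1.0       # scale to [-1, 1] for a tanh generator
        return torch.from_numpy(img).unsqueeze(0)       # shape (1, 64, 64)

    def __getitem__(self, i):
        return {"A": self._load(self.map_paths[i]),     # condition: intensity map
                "B": self._load(self.camera_paths[i])}  # target: camera image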
The pix2pix model from https://github.com/eriklindernoren/PyTorch-GAN/blob/master/implementations/pix2pix/pix2pix.py was used as the basis for this network's implementation.
A few convolutional layers were removed from the generator to suit the dataset.
The generator takes an intensity map as input and is trained to produce an output as close as possible to the corresponding camera image from the dataset.
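For reference, one training step under the referenced implementation's objective (an MSE adversarial loss plus an L1 reconstruction term weighted by λ = 100) might look like the sketch below; `G`, `D`, and the optimizers are assumed to be defined elsewhere.

```python
import torch
import torch.nn as nn

adv_loss, l1_loss, lambda_pixel = nn.MSELoss(), nn.L1Loss(), 100

def train_step(G, D, opt_G, opt_D, real_A, real_B):
    valid = torch.ones_like(D(real_B, real_A))  # "real" labels for the patch discriminator
    fake = torch.zeros_like(valid)

    # Generator: fool D while staying close to the real camera image
    opt_G.zero_grad()
    fake_B = G(real_A)
    loss_G = adv_loss(D(fake_B, real_A), valid) + lambda_pixel * l1_loss(fake_B, real_B)
    loss_G.backward()
    opt_G.step()

    # Discriminator: separate real pairs from generated ones
    opt_D.zero_grad()
    loss_D = 0.5 * (adv_loss(D(real_B, real_A), valid)
                    + adv_loss(D(fake_B.detach(), real_A), fake))
    loss_D.backward()
    opt_D.step()
    return loss_G.item(), loss_D.item()
```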
No data augmentation has been implemented in this model.
Generator Structure
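The exact layers removed are not recorded here, so the following is only a plausible downsized U-Net for single-channel 64×64 inputs (four down blocks, three up blocks with skip connections, and an output layer), in the spirit of the pix2pix generator.

```python
import torch
import torch.nn as nn

def down(cin, cout):  # encoder block: halves the spatial size
    return nn.Sequential(nn.Conv2d(cin, cout, 4, 2, 1),
                         nn.InstanceNorm2d(cout), nn.LeakyReLU(0.2))

def up(cin, cout):    # decoder block: doubles the spatial size
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 4, 2, 1),
                         nn.InstanceNorm2d(cout), nn.ReLU())

class SmallUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.d1, self.d2 = down(1, 64), down(64, 128)
        self.d3, self.d4 = down(128, 256), down(256, 512)
        self.u1, self.u2, self.u3 = up(512, 256), up(512, 128), up(256, 64)
        self.out = nn.Sequential(nn.ConvTranspose2d(128, 1, 4, 2, 1), nn.Tanh())

    def forward(self, x):                          # x: (N, 1, 64, 64)
        d1 = self.d1(x)                            # (N, 64, 32, 32)
        d2 = self.d2(d1)                           # (N, 128, 16, 16)
        d3 = self.d3(d2)                           # (N, 256, 8, 8)
        d4 = self.d4(d3)                           # (N, 512, 4, 4)
        u1 = torch.cat([self.u1(d4), d3], 1)       # skip connection from d3
        u2 = torch.cat([self.u2(u1), d2], 1)       # skip connection from d2
        u3 = torch.cat([self.u3(u2), d1], 1)       # skip connection from d1
        return self.out(u3)                        # (N, 1, 64, 64)
```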
One point to consider is that the dataset only contained camera images for the angle directly below (90°) the autonomous vehicle.
Hence, only a limited amount of the scan data had ground truth (a raw camera image) to compare against; the majority of the scan data lacked ground truth.
During the test phase, the model was able to produce images for scan data that lacked ground truth.
The results were not perfect; however, it was interesting to see the outcome.
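This test-phase generation amounts to running the trained generator on lone intensity maps, roughly as sketched below; `generator` is the trained model and `unpaired_loader` is a hypothetical loader over intensity maps with no matching camera frame.

```python
import torch
from torchvision.utils import save_image

generator.eval()
with torch.no_grad():
    for i, batch in enumerate(unpaired_loader):   # intensity maps only, no ground truth
        fake = generator(batch["A"])              # generated camera-like view
        save_image(fake * 0.5 + 0.5, f"generated_{i:04d}.png")  # undo [-1, 1] scaling
```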
In the result figures below, the first row is the sonar intensity map, the second is the GAN-generated image, and the third is the original camera image.
After training on 0 samples:
After training on 5000 samples:
After training on 12000 samples:
Final image: GAN-generated image tiles for all the sonar scans:
The major limitation of this project was the lack of data; the model would likely have performed better with more training data.
In the future, this project could serve as a foundation for redesigning the way terrains are built in VR/AR environments.
Currently, the machine learning part is separate from the VR part; the project could carry on and merge both parts to visualize the final product.
Ideally, the model would generate terrain in VR/AR as the data is being captured.
Link: https://cirs.udg.edu/caves-dataset/
Mallios, A., Vidal, E., Campos, R., & Carreras, M. (2017). Underwater caves sonar data set. The International Journal of Robotics Research, 36, 1247-1251. doi: 10.1177/0278364917732838
Link: https://github.com/eriklindernoren/PyTorch-GAN/blob/master/implementations/pix2pix/
Link: https://arxiv.org/abs/1611.07004
Isola, P., Zhu, J.-Y., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1125-1134).
| Week | Tasks |
|---|---|
| 1 | - Getting familiar with machine learning / neural networks / GANs<br>- Examining ROS data |
| 2 | - Tried to convert the laser scans to point clouds within the ROS functions<br>- However, the data seemed to become unorganized when transferring from one form to the other |
| 3 | - Trying to create an image directly from the raw laser scan data |
| 4 | - Customize the created image to sync with the camera information<br>- Reshape the synced image to an ideal size for training |
| 5 | - Train the model and examine the output<br>- Recreate the pix2pix model to suit our purpose |
| 6 | - Create the model to suit our purpose<br>- Implement data augmentation techniques on the model<br>- Learn how to use the HPC |
| 7 | - Try to generate data for angles that are outside the camera frame |
| 8 | - Create an image of the full 360° sweep<br>- Apply data augmentation to avoid the spotlight effect |
| 9 | - Apply<br>- Find the correlation between the scan data input and the generated outcome |
| 10 | - Optimize the code; rewrite the code that was from other sources<br>- Crop the images to get rid of the borders |
Supervisors: Dr. Ross Brown (r.brown@qut.edu.au), Dr. Simon Denman (s.denman@qut.edu.au)