Trains a fully convolutional deep neural network to identify and track a character target in a drone simulator via Python Keras

WolfeTyler/DeepLearning-Keras-Drone-Follow-Me-Project

Deep Learning Python Keras Follow Me Project

In this project, I trained a deep neural network to identify and track a target character in a quadcopter drone simulator.

Setup Instructions

Clone the repository

$ git clone https://github.com/udacity/RoboND-DeepLearning.git

Download the data

Save the following three files into the data folder of the cloned repository.

Training Data

Validation Data

Sample Evaluation Data

Download the QuadSim binary

To interface your neural net with the QuadSim simulator, you must use a version of QuadSim that has been custom tailored for this project. The previous version that you might have used for the Controls lab will not work.

The simulator binary can be downloaded here

Install Dependencies

You'll need Python 3 and Jupyter Notebooks installed to do this project. If you don't already have them, the best way to get set up is to use Anaconda, following along with the RoboND-Python-Starterkit.

If for some reason you choose not to use Anaconda, you must install the following frameworks and packages on your system:

  • Python 3.x
  • TensorFlow 1.2.1
  • NumPy 1.11
  • SciPy 0.17.0
  • eventlet
  • Flask
  • h5py
  • PIL
  • python-socketio
  • scikit-image
  • transforms3d
  • PyQt4/PyQt5
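
If you install the packages by hand, a quick check like the one below can show which ones Python cannot find. This is only a sketch: the import names are my assumption about how the packages above are imported (for example, PIL is provided by Pillow and python-socketio is imported as socketio), PyQt is left out because either version may be present, and the check does not verify version numbers.

```python
import importlib.util

# Import names assumed for the packages listed above. Several differ from
# the install names (PIL comes from Pillow, socketio from python-socketio,
# skimage from scikit-image).
REQUIRED_MODULES = [
    "tensorflow", "numpy", "scipy", "eventlet", "flask", "h5py",
    "PIL", "socketio", "skimage", "transforms3d",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```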

Implement the Segmentation Network

  1. Download the training dataset from above and extract to the project data directory.
  2. Implement your solution in model_training.ipynb
  3. Train the network locally, or on AWS.
  4. Continue to experiment with the training data and network until you attain the score you desire.
  5. Once you are comfortable with performance on the training dataset, see how it performs in live simulation!

Collecting Training Data

A simple training dataset has been provided in this project's repository. This dataset will allow you to verify that your segmentation network is semi-functional. However, if you're interested in improving your score, you may want to collect additional training data. To do so, follow the steps below.

The data directory is organized as follows:

data/runs - contains the results of prediction runs
data/train/images - contains images for the training set
data/train/masks - contains masked (labeled) images for the training set
data/validation/images - contains images for the validation set
data/validation/masks - contains masked (labeled) images for the validation set
data/weights - contains trained TensorFlow models

data/raw_sim_data/train/run1
data/raw_sim_data/validation/run1
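
If some of these folders are missing from your clone, they can be created up front. The following is a minimal sketch, not part of the project code: the helper name make_data_dirs is hypothetical, and the folder list is simply copied from the layout above.

```python
from pathlib import Path

# Sub-directories of data/ taken from the layout described above.
SUBDIRS = [
    "runs",
    "train/images",
    "train/masks",
    "validation/images",
    "validation/masks",
    "weights",
    "raw_sim_data/train/run1",
    "raw_sim_data/validation/run1",
]


def make_data_dirs(root):
    """Create the expected folders under `root` and return their paths."""
    root = Path(root)
    created = []
    for sub in SUBDIRS:
        path = root / sub
        # exist_ok=True makes the call safe to repeat on an existing tree.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_data_dirs("data") from the repository root leaves existing folders and their contents untouched.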

Training Set

  1. Run QuadSim
  2. Click the DL Training button
  3. Set patrol points, path points, and spawn points. TODO add link to data collection doc
  4. With the simulator running, press "r" to begin recording.
  5. In the file selection menu navigate to the data/raw_sim_data/train/run1 directory
  6. Optional: to speed up data collection, press "9" (the keys 1-9 control collection speed; lower numbers are slower)
  7. When you have finished collecting data, hit "r" to stop recording.
  8. To reset the simulator, hit "<esc>"
  9. To collect multiple runs create directories data/raw_sim_data/train/run2, data/raw_sim_data/train/run3 and repeat the above steps.

Validation Set

To collect the validation set, repeat the steps above, using the directory data/raw_sim_data/validation rather than data/raw_sim_data/train.

Image Preprocessing

Before the network is trained, the images first need to undergo a preprocessing step. This step transforms the depth masks from the sim into binary masks suitable for training a neural network. It also converts the images from .png to .jpeg to create a smaller dataset, suitable for uploading to AWS. To run preprocessing:

$ python preprocess_ims.py

Training, Predicting and Scoring

Once your training and validation data have been generated or downloaded as described above, you are free to begin working with the neural net.

Note: Training CNNs is a very compute-intensive process.

Training your Model

Prerequisites

  • Training data is in the data directory
  • Validation data is in the data directory
  • The folders data/train/images/, data/train/masks/, data/validation/images/, and data/validation/masks/ should exist and contain the appropriate data
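
These prerequisites can be checked with a few lines of Python before starting a long training run. This is a sketch under one assumption that the text above does not state: that every image has exactly one matching mask, so the two folders of a split should hold the same number of files. The function names are hypothetical.

```python
from pathlib import Path


def count_files(folder):
    """Return the number of files in `folder`, or None if it does not exist."""
    folder = Path(folder)
    if not folder.is_dir():
        return None
    return sum(1 for p in folder.iterdir() if p.is_file())


def check_split(data_root, split):
    """Check that <data_root>/<split>/images and /masks exist and match in size."""
    root = Path(data_root) / split
    n_images = count_files(root / "images")
    n_masks = count_files(root / "masks")
    # Assumption: one mask per image, so the counts should be equal and non-zero.
    ok = bool(
        n_images is not None
        and n_masks is not None
        and n_images > 0
        and n_images == n_masks
    )
    return {"split": split, "images": n_images, "masks": n_masks, "ok": ok}


if __name__ == "__main__":
    for split in ("train", "validation"):
        print(check_split("data", split))
```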

To train, complete the network definition in the model_training.ipynb notebook and then run the training cell with appropriate hyperparameters selected.

After the training run has completed, your model will be stored in the data/weights directory as an HDF5 file, and a configuration_weights file. As long as they are both in the same location, things should work.

Important note: the validation directory is used to store data that will be used during training to produce the plots of the loss and to help determine when the network is overfitting your data.

The sample_evaluation_data directory contains data specifically designed to test the network's performance on the Follow Me task. It contains three directories, each generated using a different sampling method. The structure of these directories is exactly the same as that of the validation and train datasets provided to you. For instance, patrol_with_targ contains an images and a masks subdirectory. If you would like to run the evaluation code on your validation data, move a copy of it into sample_evaluation_data and change the arguments of the corresponding function calls in the model_training.ipynb notebook.

The notebook has examples of how to evaluate your model once you finish training. Think about the sourcing methods, and how the information provided in the evaluation sections relates to the final score. Then try things out that seem like they may work.

Scoring

To score the network on the Follow Me task, two types of error are measured. First, the intersection over union (IoU) of the pixelwise classifications is computed for the target channel.

In addition to this, we determine whether the network detected the target person or not. If more than 3 pixels have a probability greater than 0.5 of being the target person, this counts as the network guessing that the target is in the image.

We determine whether the target is actually in the image by checking whether more than 3 pixels in the label mask contain the target.

Using the above, the numbers of detection true positives, false positives, and false negatives are counted.

How the Final score is Calculated

The final score is the pixelwise average_IoU * (n_true_positive / (n_true_positive + n_false_positive + n_false_negative)), computed on data similar to that provided in sample_evaluation_data.
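
The scoring rules above can be sketched in NumPy as follows. This is an illustration of the formulas as described, not the project's own scoring code; the function names are hypothetical, and returning 0.0 when a union or a detection count is empty is a convention I chose for the sketch.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks for the target channel."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 0.0  # convention for this sketch: empty masks score 0
    return float(np.logical_and(pred, true).sum() / union)


def target_detected(prob_map, min_pixels=3, threshold=0.5):
    """True if more than `min_pixels` pixels exceed `threshold`."""
    return int((np.asarray(prob_map) > threshold).sum()) > min_pixels


def final_score(average_iou, n_true_positive, n_false_positive, n_false_negative):
    """average_IoU weighted by the detection ratio described above."""
    total = n_true_positive + n_false_positive + n_false_negative
    if total == 0:
        return 0.0  # convention for this sketch: no detections counted
    return average_iou * n_true_positive / total
```

For example, an average IoU of 0.5 with 8 true positives, 1 false positive and 1 false negative gives 0.5 * 8 / 10 = 0.4.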

Network Setup

The fully convolutional network (FCN) classifies every pixel in an image. It is built from a series of convolution layers that lead down to a 1x1 convolution.

The first section of the network is the encoder. It ends in a 1x1 convolution layer, whose kernel is not tall or wide but is deep in filters. Although the kernel at the end of the encoder is 1x1, the layer's output isn't necessarily 1x1, because the kernel is applied at every spatial position of its input.

The next section of the network is the decoder. It is composed of transposed convolutions that increase the height and width while reducing the depth, mirroring the encoder in reverse.

To solve the Follow Me challenge, I used three encoder layers, a 1x1 convolution, and three decoder layers.
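
To make the shapes concrete, the sketch below traces how the height, width and depth change through such a network. It only computes shapes and builds no model. Everything numeric in it is an assumption for illustration: the 160x160x3 input, the filter counts, stride-2 encoder layers with "same" padding, and decoder layers that double the height and width.

```python
import math


def conv_same(shape, filters, stride):
    """Output shape of a convolution with 'same' padding: ceil(size / stride)."""
    height, width, _ = shape
    return (math.ceil(height / stride), math.ceil(width / stride), filters)


def upsample(shape, filters, factor=2):
    """Output shape of a decoder step that scales height and width by `factor`."""
    height, width, _ = shape
    return (height * factor, width * factor, filters)


def trace_fcn(input_shape=(160, 160, 3),        # hypothetical input size
              encoder_filters=(32, 64, 128),    # hypothetical filter counts
              mid_filters=256,
              decoder_filters=(128, 64, 32)):
    """Return (layer name, output shape) pairs for the assumed layout."""
    shapes = [("input", input_shape)]
    shape = input_shape
    for i, filters in enumerate(encoder_filters, start=1):
        shape = conv_same(shape, filters, stride=2)
        shapes.append((f"encoder_{i}", shape))
    # The 1x1 convolution changes only the depth, not the height or width.
    shape = conv_same(shape, mid_filters, stride=1)
    shapes.append(("conv_1x1", shape))
    for i, filters in enumerate(decoder_filters, start=1):
        shape = upsample(shape, filters)
        shapes.append((f"decoder_{i}", shape))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_fcn():
        print(f"{name:10s} {shape}")
```

Under these assumptions the 1x1 convolution receives and returns a 20x20 feature map, which illustrates why its output is not itself 1x1, and the last decoder step restores the 160x160 input resolution.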

Network & Hyper Parameters

The source code contained a series of default hyper parameters:

learning_rate = 0
batch_size = 0
num_epochs = 0
steps_per_epoch = 200
validation_steps = 50
workers = 2

I adjusted one parameter at a time, as a controlled experiment, to understand what positive or negative impact each adjustment had on the results and the final model score.

  • Batch_size: I updated this first to align with the volume of images being trained against
  • Num_epochs: I adjusted this second through trial and error, starting with 10, then 20, and finally 30, where the improvement appeared to be trailing off with diminishing returns
  • Steps_per_epoch & validation_steps: I decided to leave these as is
  • Workers: I changed this from 2 down to 1 to keep things simple
  • Learning_rate: I knew from prior experience that this is very important to model performance. I started with a small learning_rate of .00001, but as somewhat expected the val_loss didn't improve as quickly. I adjusted it a few times until I reached .01

learning_rate = 0.01
batch_size = 32
num_epochs = 30
steps_per_epoch = 200
validation_steps = 50
workers = 1
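
As a rough guide to how batch_size, steps_per_epoch and num_epochs interact, the sketch below computes how many images the network sees with the values above. The helper names are hypothetical, and the 4000-image dataset size in the last call is a made-up figure used only to show the calculation.

```python
import math


def images_per_epoch(batch_size, steps_per_epoch):
    """Number of images drawn in one epoch."""
    return batch_size * steps_per_epoch


def steps_to_cover(n_images, batch_size):
    """Steps needed for one full pass over a dataset of `n_images` images."""
    return math.ceil(n_images / batch_size)


if __name__ == "__main__":
    per_epoch = images_per_epoch(batch_size=32, steps_per_epoch=200)
    print("images per epoch:", per_epoch)
    print("images over 30 epochs:", per_epoch * 30)
    # Hypothetical dataset size, used only to illustrate the calculation.
    print("steps for one pass over 4000 images:", steps_to_cover(4000, 32))
```

If steps_per_epoch is much larger than the value returned by steps_to_cover for your dataset, each epoch passes over the same images several times.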

Performance Results

The model could be improved with additional training data both following the hero in dense crowded areas, and also in larger patrol paths with the hero appearing less frequently. Tracking a dog or car could also be completed by the model but the training would need to include other animals/vehicles in an alternate environment and with different patrol patterns.
