RoboND Term 1 Deep Learning Project, Follow-Me
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.

Udacity - Robotics NanoDegree Program

Deep Learning Project

In this project, you will train a deep neural network to identify and track a target in simulation. So-called “follow me” applications like this are key to many fields of robotics and the very same techniques you apply here could be extended to scenarios like advanced cruise control in autonomous vehicles or human-robot collaboration in industry.

alt text

Setup Instructions

Clone the repository

$ git clone

Download the data

Save the following three files into the data folder of the cloned repository.

Training Data

Validation Data

Sample Evaluation Data

Download the QuadSim binary

To interface your neural net with the QuadSim simulator, you must use a version QuadSim that has been custom tailored for this project. The previous version that you might have used for the Controls lab will not work.

The simulator binary can be downloaded here

Install Dependencies

You'll need Python 3 and Jupyter Notebooks installed to do this project. The best way to get setup with these if you are not already is to use Anaconda following along with the RoboND-Python-Starterkit.

If for some reason you choose not to use Anaconda, you must install the following frameworks and packages on your system:

  • Python 3.x
  • Tensorflow 1.2.1
  • NumPy 1.11
  • SciPy 0.17.0
  • eventlet
  • Flask
  • h5py
  • PIL
  • python-socketio
  • scikit-image
  • transforms3d
  • PyQt4/Pyqt5

Implement the Segmentation Network

  1. Download the training dataset from above and extract to the project data directory.
  2. Implement your solution in model_training.ipynb
  3. Train the network locally, or on AWS.
  4. Continue to experiment with the training data and network until you attain the score you desire.
  5. Once you are comfortable with performance on the training dataset, see how it performs in live simulation!

Collecting Training Data

A simple training dataset has been provided in this project's repository. This dataset will allow you to verify that your segmentation network is semi-functional. However, if your interested in improving your score,you may want to collect additional training data. To do it, please see the following steps.

The data directory is organized as follows:

data/runs - contains the results of prediction runs
data/train/images - contains images for the training set
data/train/masks - contains masked (labeled) images for the training set
data/validation/images - contains images for the validation set
data/validation/masks - contains masked (labeled) images for the validation set
data/weights - contains trained TensorFlow models


Training Set

  1. Run QuadSim
  2. Click the DL Training button
  3. Set patrol points, path points, and spawn points. TODO add link to data collection doc
  4. With the simulator running, press "r" to begin recording.
  5. In the file selection menu navigate to the data/raw_sim_data/train/run1 directory
  6. optional to speed up data collection, press "9" (1-9 will slow down collection speed)
  7. When you have finished collecting data, hit "r" to stop recording.
  8. To reset the simulator, hit "<esc>"
  9. To collect multiple runs create directories data/raw_sim_data/train/run2, data/raw_sim_data/train/run3 and repeat the above steps.

Validation Set

To collect the validation set, repeat both sets of steps above, except using the directory data/raw_sim_data/validation instead rather than data/raw_sim_data/train.

Image Preprocessing

Before the network is trained, the images first need to be undergo a preprocessing step. The preprocessing step transforms the depth masks from the sim, into binary masks suitable for training a neural network. It also converts the images from .png to .jpeg to create a reduced sized dataset, suitable for uploading to AWS. To run preprocessing:

$ python

Note: If your data is stored as suggested in the steps above, this script should run without error.

Important Note 1:

Running does not delete files in the processed_data folder. This means if you leave images in processed data and collect a new dataset, some of the data in processed_data will be overwritten some will be left as is. It is recommended to delete the train and validation folders inside processed_data(or the entire folder) before running with a new set of collected data.

Important Note 2:

The notebook, and supporting code assume your data for training/validation is in data/train, and data/validation. After you run you will have new train, and possibly validation folders in the processed_ims. Rename or move data/train, and data/validation, then move data/processed_ims/train, into data/, and data/processed_ims/validationalso into data/

Important Note 3:

Merging multiple train or validation may be difficult, it is recommended that data choices be determined by what you include in raw_sim_data/train/run1 with possibly many different runs in the directory. You can create a tempory folder in data/ and store raw run data you don't currently want to use, but that may be useful for later. Choose which run_x folders to include in raw_sim_data/train, and raw_sim_data/validation, then run from within the 'code/' directory to generate your new training and validation sets.

Training, Predicting and Scoring

With your training and validation data having been generated or downloaded from the above section of this repository, you are free to begin working with the neural net.

Note: Training CNNs is a very compute-intensive process. If your system does not have a recent Nvidia graphics card, with cuDNN and CUDA installed , you may need to perform the training step in the cloud. Instructions for using AWS to train your network in the cloud may be found here

Training your Model


  • Training data is in data directory
  • Validation data is in the data directory
  • The folders data/train/images/, data/train/masks/, data/validation/images/, and data/validation/masks/ should exist and contain the appropriate data

To train complete the network definition in the model_training.ipynb notebook and then run the training cell with appropriate hyperparameters selected.

After the training run has completed, your model will be stored in the data/weights directory as an HDF5 file, and a configuration_weights file. As long as they are both in the same location, things should work.

Important Note the validation directory is used to store data that will be used during training to produce the plots of the loss, and help determine when the network is overfitting your data.

The sample_evalution_data directory contains data specifically designed to test the networks performance on the FollowME task. In sample_evaluation data are three directories each generated using a different sampling method. The structure of these directories is exactly the same as validation, and train datasets provided to you. For instance patrol_with_targ contains an images and masks subdirectory. If you would like to the evaluation code on your validation data a copy of the it should be moved into sample_evaluation_data, and then the appropriate arguments changed to the function calls in the model_training.ipynb notebook.

The notebook has examples of how to evaulate your model once you finish training. Think about the sourcing methods, and how the information provided in the evaluation sections relates to the final score. Then try things out that seem like they may work.


To score the network on the Follow Me task, two types of error are measured. First the intersection over the union for the pixelwise classifications is computed for the target channel.

In addition to this we determine whether the network detected the target person or not. If more then 3 pixels have probability greater then 0.5 of being the target person then this counts as the network guessing the target is in the image.

We determine whether the target is actually in the image by whether there are more then 3 pixels containing the target in the label mask.

Using the above the number of detection true_positives, false positives, false negatives are counted.

How the Final score is Calculated

The final score is the pixelwise average_IoU*(n_true_positive/(n_true_positive+n_false_positive+n_false_negative)) on data similar to that provided in sample_evaulation_data

Ideas for Improving your Score

Collect more data from the sim. Look at the predictions think about what the network is getting wrong, then collect data to counteract this. Or improve your network architecture and hyperparameters.

Obtaining a Leaderboard Score

Share your scores in slack, and keep a tally in a pinned message. Scores should be computed on the sample_evaluation_data. This is for fun, your grade will be determined on unreleased data. If you use the sample_evaluation_data to train the network, it will result in inflated scores, and you will not be able to determine how your network will actually perform when evaluated to determine your grade.

Experimentation: Testing in Simulation

  1. Copy your saved model to the weights directory data/weights.
  2. Launch the simulator, select "Spawn People", and then click the "Follow Me" button.
  3. Run the realtime follower script
$ python my_amazing_model.h5

Note: If you'd like to see an overlay of the detected region on each camera frame from the drone, simply pass the --pred_viz parameter to