4. Data Collection

George E Fouche edited this page Apr 12, 2018 · 4 revisions

Data Collection

Good data is just as important as a good network architecture, so collecting the best data is key to success.

Open QuadSim, put a check mark in Spawn crowd, then click on DL training.

QuadSim Front Page

Local control mode

The quad will start out in patrol mode, but since we have not added any waypoints yet, it will have nowhere to go. To add waypoints we must first switch to local control by pressing the H key.

Quad Start

View and Navigation

  • To zoom out from the quad, we recommend using the mouse wheel.
  • To change the viewing perspective at any time during training, right click on the screen and move the mouse.
  • Use the WASD keys to move the quad forward, left, back, and right.
  • Use Space and C to thrust up and down.
  • Use the Q and E keys to turn the quad toward the left or right.
  • Press G to reset the quad to its starting pose.
  • To look up these and other commands, press the L legend key.

Quad View Adjust

Managing data collection

There are three major aspects of the data collection process that you can control in order to determine the type of data you collect. These are as follows:

  1. The path the quad will take while on patrol.
  2. The path the hero will walk.
  3. The locations where distractors spawn.

Setting Patrol Points

Press the P key to set a patrol point. A green patrol point will appear at the quad's position.

Patrol Point 1

Move to another position and press P again to add one more patrol point somewhere nearby.

Patrol Point 3

We can now put the quad into patrol mode by pressing H. To switch back to local control, press H again. To remove all the patrol points you have set, press L.

Patrol Points Removed

Setting the Hero (Person that the Drone will be following)

To set a hero path point, press O while in local control mode. The hero path points are very similar to the patrol points, except they are always placed at ground level. Decrease the quad's altitude by pressing C to get a better look at the points you are setting. As with patrol points, all the hero path points can be removed at once, in this case by pressing K. The hero will start at the first point you create and walk along the path. On reaching the end, the hero will despawn and then reappear at the beginning of the path.

Hero Path

Setting Spawn points

All the characters in the sim except the hero will respawn after 30-50 seconds at one of the spawn points. We can control the number of people in a given area through the number and location of the spawn points we place. We can set a spawn point at the quad's current x, y position by pressing the I key. Blue markers will appear at the spawn locations. We can remove all the spawn points by pressing J.

Spawn points

Setting up a Data collection Run

When setting spawn and hero path points, it is helpful to rotate the camera so you are viewing the quad from directly above.

Overhead view

To start, let's create a small collection experiment. We will often want to make multiple collection runs, with each run targeting a specific type of data. We will also need a significant sample of data containing the hero. If we create a very large patrol path and hero path, we are unlikely to collect many images containing the hero.

Run Setup

Starting the Collection Run

When we are satisfied with how we have placed the patrol path, hero path, and spawn points, press M to have people start spawning.

Start run spawns

To start recording data, press the R key. Navigate to the raw_sim_data/train/target directory. We are using the target directory because we have elected to have the hero appear in this collection run. Alternatively, randomly chosen people can take the role of the hero. In that case, for the data preparation to turn out correctly, we would select the non_target folder.

Press H to have the quad enter patrol mode. To speed up the data collection process, press the 9 key. Data from this run will be stored in the folder selected. When we are done collecting data, we can press R again to stop recording. While it is not advisable to add/remove the hero path/spawn points while the data collection is running, we can delete the patrol path and modify it if desired.

To reset the sim, press ESC.

Data Structure

The data directory is organized as follows:

data/runs - contains the results of prediction runs
data/train/images - contains images for the training set
data/train/masks - contains masked (labeled) images for the training set
data/validation/images - contains images for the validation set
data/validation/masks - contains masked (labeled) images for the validation set
data/weights - contains trained TensorFlow models

data/raw_sim_data/train/run1
data/raw_sim_data/validation/run1
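
The layout above can be created ahead of time with a short script. This is a minimal sketch and not part of the project code: the helper name `make_data_dirs` and its `root` argument are hypothetical, and only the folder names are taken from the listing above.

```python
from pathlib import Path

# Folder names copied from the listing above; everything else is illustrative.
DATA_SUBDIRS = [
    "runs",
    "train/images",
    "train/masks",
    "validation/images",
    "validation/masks",
    "weights",
    "raw_sim_data/train/run1",
    "raw_sim_data/validation/run1",
]


def make_data_dirs(root="data"):
    """Create the data directory layout under root (hypothetical helper)."""
    root = Path(root)
    created = []
    for sub in DATA_SUBDIRS:
        path = root / sub
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `make_data_dirs("data")` from the project root should leave any existing folders and files untouched, since it only creates what is missing.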

Validation Set

To collect the validation set, repeat both sets of steps above, but use the directory data/raw_sim_data/validation rather than data/raw_sim_data/train.

Image Preprocessing

Before the network is trained, the images need to undergo a preprocessing step. This step transforms the depth masks from the sim into binary masks suitable for training a neural network. It also converts the images from .png to .jpeg to create a smaller dataset, suitable for uploading to AWS. To run preprocessing:

$ python preprocess_ims.py

Note: If your data is stored as suggested in the steps above, this script should run without error.
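
Before running the script, it can help to confirm that the raw data is where the steps above put it. The following is a minimal sketch under that assumption: `find_runs` is a hypothetical helper, not something provided by the project, and preprocess_ims.py may locate its inputs differently.

```python
from pathlib import Path


def find_runs(raw_root="data/raw_sim_data"):
    """Return {split: [run folder names]} for the train and validation splits.

    Hypothetical helper: it only lists sub-folders and does not inspect
    the images inside them.
    """
    raw_root = Path(raw_root)
    runs = {}
    for split in ("train", "validation"):
        split_dir = raw_root / split
        if split_dir.is_dir():
            runs[split] = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
        else:
            runs[split] = []  # split folder is missing entirely
    return runs
```

An empty list for `"train"` or `"validation"` suggests that no run has been recorded into that folder yet.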

Important Note 1:

Running preprocess_ims.py does not delete files in the processed_data folder. This means that if you leave images in processed_data and collect a new dataset, some of the data in processed_data will be overwritten and some will be left as is. It is recommended to delete the train and validation folders inside processed_data (or the entire folder) before running preprocess_ims.py with a new set of collected data.
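
The cleanup recommended in this note could be scripted along the following lines. This is a sketch only: `clear_processed` is a hypothetical helper, and `processed_root` is left as a required argument so you can pass the path of the processed folder as it exists in your checkout.

```python
import shutil
from pathlib import Path


def clear_processed(processed_root, splits=("train", "validation")):
    """Delete the given split folders inside processed_root, if present.

    Hypothetical helper. Returns the list of folders that were removed.
    """
    removed = []
    for split in splits:
        target = Path(processed_root) / split
        if target.is_dir():
            # rmtree permanently deletes the folder and everything in it.
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

Because the deletion is permanent, it is worth checking the returned list (or printing the targets first) before relying on this in a larger workflow.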

Important Note 2:

The notebook and supporting code assume that your data for training and validation is in data/train and data/validation. After you run preprocess_ims.py you will have new train, and possibly validation, folders in processed_ims. Rename or move data/train and data/validation, then move data/processed_ims/train and data/processed_ims/validation into data/.
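
One way to script this swap is sketched below. The function name `install_processed` and the `_old` backup suffix are assumptions made for illustration; only the folder names data/train, data/validation, and data/processed_ims come from the note above.

```python
import shutil
from pathlib import Path


def install_processed(data_root="data", processed_name="processed_ims",
                      backup_suffix="_old"):
    """Back up data/<split>, then move the processed <split> into its place.

    Hypothetical helper. Returns the names of the splits that were installed.
    """
    data_root = Path(data_root)
    installed = []
    for split in ("train", "validation"):
        new = data_root / processed_name / split
        if not new.is_dir():
            continue  # preprocessing may not have produced this split
        current = data_root / split
        if current.exists():
            backup = data_root / (split + backup_suffix)
            if backup.exists():
                raise FileExistsError(f"{backup} already exists; move it away first")
            shutil.move(str(current), str(backup))  # keep the previous data
        shutil.move(str(new), str(current))
        installed.append(split)
    return installed
```

The sketch refuses to overwrite an existing backup rather than guessing which copy should be kept.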

Important Note 3:

Merging multiple train or validation sets may be difficult. Instead, it is recommended that you control which data is used through the run folders you include in raw_sim_data/train (run1 and possibly many other runs). You can create a temporary folder in data/ to store raw run data that you don't currently want to use but that may be useful later. Choose which run_x folders to include in raw_sim_data/train and raw_sim_data/validation, then run preprocess_ims.py from within the 'code/' directory to generate your new training and validation sets.
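
Setting aside runs you do not want in the next dataset could look like the sketch below. `stash_runs`, its arguments, and the stash folder location are all hypothetical; the project does not prescribe a name for the holding folder.

```python
import shutil
from pathlib import Path


def stash_runs(raw_split_dir, keep, stash_dir):
    """Move every run folder not named in keep from raw_split_dir to stash_dir.

    Hypothetical helper. Returns the names of the run folders that were moved.
    """
    raw_split_dir = Path(raw_split_dir)
    stash_dir = Path(stash_dir)
    stash_dir.mkdir(parents=True, exist_ok=True)
    stashed = []
    for run in sorted(raw_split_dir.iterdir()):
        if not run.is_dir() or run.name in keep:
            continue
        destination = stash_dir / run.name
        if destination.exists():
            raise FileExistsError(f"{destination} already exists in the stash")
        shutil.move(str(run), str(destination))
        stashed.append(run.name)
    return stashed
```

For example, `stash_runs("data/raw_sim_data/train", {"run1"}, "data/unused_runs/train")` would keep only run1 in place, where `data/unused_runs` is an illustrative folder name.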