# Paopu - Data Collection

### Display live camera feed

First, let's initialize and display our camera.  

> Our neural network takes a 224x224 pixel image as input.  We'll set our camera to that size to minimize the filesize of our dataset (we've tested that it works for this task).
> In some scenarios it may be better to collect data in a larger image size and downscale to the desired size later.

In [1]:
import traitlets
import ipywidgets.widgets as widgets
from IPython.display import display
from jetbot import Camera, bgr8_to_jpeg

camera = Camera.instance(width=224, height=224)

image = widgets.Image(format='jpeg', width=224, height=224)  # the widget's width and height don't have to match the camera's

camera_link = traitlets.dlink((camera, 'value'), (image, 'value'), transform=bgr8_to_jpeg)

display(image)

Image(value=b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xdb\x00C\x00\x02\x01\x0…

Awesome, next let's create a few directories where we'll store all our data.  We'll create a folder ``dataset2`` that will contain four sub-folders, ``hand``, ``pao``, ``move``, and ``environment``,
where we'll place the images for each class.

In [2]:
import os

hand_dir = '../../../Datasets/dataset2/hand'
pao_dir = '../../../Datasets/dataset2/pao'
move_dir = '../../../Datasets/dataset2/move'
environment_dir = '../../../Datasets/dataset2/environment'

# os.makedirs raises FileExistsError if a directory already exists, so we wrap the calls in "try/except".
# Note that the first error skips the remaining calls, so directories listed after an existing one are not created.
try:
    os.makedirs(hand_dir)
    os.makedirs(pao_dir)
    os.makedirs(move_dir)
    os.makedirs(environment_dir)
except FileExistsError:
    print('Directories not created because they already exist')

Directories not created because they already exist
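
Because the ``try/except`` cell above stops at the first directory that already exists, a partially created dataset folder stays incomplete. One possible alternative, shown here only as a sketch, is to pass ``exist_ok=True`` to ``os.makedirs`` (available in Python 3.2 and later) so that each call succeeds whether or not the directory exists. The helper name ``ensure_dirs`` is hypothetical, and the demo uses a temporary folder rather than the real dataset path.

```python
import os
import tempfile

def ensure_dirs(directories):
    """Create every directory in `directories`, skipping those that already exist."""
    for directory in directories:
        # exist_ok=True suppresses FileExistsError for an existing directory
        os.makedirs(directory, exist_ok=True)

# Demo in a temporary folder: 'hand' exists beforehand, the other three do not
with tempfile.TemporaryDirectory() as root:
    demo_dirs = [os.path.join(root, name) for name in ('hand', 'pao', 'move', 'environment')]
    os.makedirs(demo_dirs[0])
    ensure_dirs(demo_dirs)
    all_created = [os.path.isdir(d) for d in demo_dirs]

print(all_created)  # [True, True, True, True]
```

In this notebook the equivalent call would be ``ensure_dirs([hand_dir, pao_dir, move_dir, environment_dir])``.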


If you refresh the Jupyter file browser on the left, you should now see those directories appear.  Next, let's create and display some buttons that we'll use to save snapshots
for each class label.  We'll also add some text boxes that will display how many images of each category we've collected so far. This is useful because we want to make
sure we collect about as many images for each of the four classes (``hand``, ``pao``, ``move``, and ``environment``).  It also helps to know how many images we've collected overall.

In [3]:
button_layout = widgets.Layout(width='128px', height='64px')

hand_button = widgets.Button(description='add hand', button_style='success', layout=button_layout)
pao_button = widgets.Button(description='add pao', button_style='success', layout=button_layout)
move_button = widgets.Button(description='add move', button_style='danger', layout=button_layout)
environment_button = widgets.Button(description='add environment', button_style='danger', layout=button_layout)

hand_count = widgets.IntText(layout=button_layout, value=len(os.listdir(hand_dir)))
pao_count = widgets.IntText(layout=button_layout, value=len(os.listdir(pao_dir)))
move_count = widgets.IntText(layout=button_layout, value=len(os.listdir(move_dir)))
environment_count = widgets.IntText(layout=button_layout, value=len(os.listdir(environment_dir)))

Right now, these buttons won't do anything.  We have to attach functions that save images for each category to the buttons' ``on_click`` event.  We'll save the value
of the ``Image`` widget (rather than the camera), because it's already in compressed JPEG format!

To make sure we don't repeat any file names (even across different machines!) we'll use Python's ``uuid`` module, which defines the ``uuid1`` function to generate
a unique identifier.  This unique identifier is generated from information like the current time and the machine address.
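
To make the "current time and machine address" part concrete, here is a small sketch that inspects the fields of a version 1 UUID using only the standard library. The variable names are illustrative and not part of the notebook.

```python
from datetime import datetime, timedelta
from uuid import uuid1

first = uuid1()
second = uuid1()

print(first.version)    # 1
print(first != second)  # True: consecutive calls give different identifiers

# The timestamp field counts 100-nanosecond intervals since 1582-10-15 (the UUID epoch)
created = datetime(1582, 10, 15) + timedelta(microseconds=first.time // 10)
print(created)          # close to the current UTC time

# The node field is normally derived from a network hardware (MAC) address;
# a random value may be used when no address can be determined
print(hex(first.node))
```

Because the timestamp and node are both part of the identifier, two robots saving images at the same moment should still produce different file names.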

In [4]:
from uuid import uuid1

def save_snapshot(directory):
    image_path = os.path.join(directory, str(uuid1()) + '.jpg')
    with open(image_path, 'wb') as f:
        f.write(image.value)

def save_hand():
    global hand_dir, hand_count
    save_snapshot(hand_dir)
    hand_count.value = len(os.listdir(hand_dir))
    
def save_pao():
    global pao_dir, pao_count
    save_snapshot(pao_dir)
    pao_count.value = len(os.listdir(pao_dir))
    
def save_move():
    global move_dir, move_count
    save_snapshot(move_dir)
    move_count.value = len(os.listdir(move_dir))

def save_environment():
    global environment_dir, environment_count
    save_snapshot(environment_dir)
    environment_count.value = len(os.listdir(environment_dir))
    
# attach the callbacks, we use a 'lambda' function to ignore the
# parameter that the on_click event would provide to our function
# because we don't need it.
hand_button.on_click(lambda x: save_hand())
pao_button.on_click(lambda x: save_pao())
move_button.on_click(lambda x: save_move())
environment_button.on_click(lambda x: save_environment())

Great! Now the buttons should save images to the ``hand``, ``pao``, ``move``, and ``environment`` directories.  You can use the Jupyter Lab file browser to view these files!
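
Besides opening files in the file browser, a quick programmatic check can confirm that every saved file really starts with the JPEG start-of-image marker (``\xff\xd8``, the first two bytes visible in the widget output above). This is only a sketch: the helper name ``find_non_jpeg_files`` is hypothetical, and the idea that a bad file could be written (for instance if a button were pressed before the first frame arrived) is an assumption rather than something observed here.

```python
import os
import tempfile

JPEG_START = b'\xff\xd8'  # start-of-image marker at the beginning of every JPEG file

def find_non_jpeg_files(directory):
    """Return the names of files in `directory` that do not start with the JPEG marker."""
    bad = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        with open(path, 'rb') as f:
            if f.read(2) != JPEG_START:
                bad.append(name)
    return bad

# Demo with one plausible JPEG header and one empty file
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'good.jpg'), 'wb') as f:
        f.write(b'\xff\xd8\xff\xe0\x00\x10JFIF')
    with open(os.path.join(tmp, 'bad.jpg'), 'wb') as f:
        f.write(b'')
    bad_files = find_non_jpeg_files(tmp)

print(bad_files)  # ['bad.jpg']
```

In this notebook you could call ``find_non_jpeg_files(hand_dir)`` (and likewise for the other class directories); an empty list means every file passed the check.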

Now go ahead and collect some data:

1. Set up a scene that shows one of the classes (``hand``, ``pao``, ``move``, or ``environment``) and press the matching button, for example ``add hand``
2. Do the same for each of the other classes
3. Repeat 1, 2

> REMINDER: You can move the widgets to new windows by right clicking the cell and clicking ``Create New View for Output``.  Or, you can just re-display them
> together, as we will below.

Here are some tips for labeling data

1. Try different orientations
2. Try different lighting
3. Try varied object / collision types; walls, ledges, objects
4. Try different textured floors / objects;  patterned, smooth, glass, etc.

Ultimately, the more data we have of scenarios the robot will encounter in the real world, the better our collision avoidance behavior will be.  It's important
to get *varied* data (as described by the above tips) and not just a lot of data.  You'll probably need at least 100 images of each class (that's a rule of thumb, not a science).  But don't worry, it goes pretty fast once you get going :)
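
As a rough way to track the "at least 100 images of each class" rule of thumb, the sketch below counts the files per class and lists the classes that are still short. The helper names ``count_images`` and ``classes_below_target`` are hypothetical, and the example counts are the ones shown by the counter widgets in this notebook's output.

```python
import os

def count_images(class_dirs):
    """Map each class name to the number of entries in its directory."""
    return {name: len(os.listdir(path)) for name, path in class_dirs.items()}

def classes_below_target(counts, target=100):
    """Return the class names (sorted) whose image count is below `target`."""
    return sorted(name for name, n in counts.items() if n < target)

# Counts as displayed by the widgets in this notebook
counts = {'hand': 35, 'pao': 50, 'move': 93, 'environment': 9}
low = classes_below_target(counts)
print(low)  # ['environment', 'hand', 'move', 'pao']
```

With the real folders you would build ``counts`` with ``count_images({'hand': hand_dir, 'pao': pao_dir, 'move': move_dir, 'environment': environment_dir})``.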

In [5]:
display(image)

Image(value=b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xdb\x00C\x00\x02\x01\x0…

In [6]:
display(widgets.HBox([hand_count, hand_button]))
display(widgets.HBox([pao_count, pao_button]))
display(widgets.HBox([move_count, move_button]))
display(widgets.HBox([environment_count, environment_button]))

HBox(children=(IntText(value=35, layout=Layout(height='64px', width='128px')), Button(button_style='success', …

HBox(children=(IntText(value=50, layout=Layout(height='64px', width='128px')), Button(button_style='success', …

HBox(children=(IntText(value=93, layout=Layout(height='64px', width='128px')), Button(button_style='danger', d…

HBox(children=(IntText(value=9, layout=Layout(height='64px', width='128px')), Button(button_style='danger', de…

## Next

Once you've collected enough data, we'll need to copy that data to our GPU desktop or cloud machine for training.  First, we can call the following *terminal* command to compress
our dataset folder into a single *zip* file.

> The ! prefix indicates that we want to run the cell as a *shell* (or *terminal*) command.

> The -r flag in the zip command below indicates *recursive*, so that we include all nested files. The -q flag indicates *quiet*, so that the zip command doesn't print any output.

In [16]:
!ls ../../../Datasets

dataset2  dataset2.zip	dataset.zip


In [19]:
!zip -r -q ../../../Datasets/dataset2.zip ../../../Datasets/dataset2
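
If the ``zip`` command is not available on your system, a similar archive can be produced from Python. The sketch below uses ``shutil.make_archive`` from the standard library, which walks the folder recursively, much like ``-r``. The helper name ``zip_dataset`` is hypothetical, and the demo runs in a temporary folder rather than on the real dataset.

```python
import os
import shutil
import tempfile
import zipfile

def zip_dataset(dataset_dir):
    """Zip `dataset_dir` recursively into <dataset_dir>.zip and return the archive path."""
    dataset_dir = os.path.abspath(dataset_dir)
    parent, name = os.path.split(dataset_dir)
    # root_dir/base_dir make archive entries start with the folder name (e.g. dataset2/hand/...)
    return shutil.make_archive(dataset_dir, 'zip', root_dir=parent, base_dir=name)

# Demo with a tiny stand-in dataset
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'dataset2', 'hand'))
    with open(os.path.join(tmp, 'dataset2', 'hand', 'example.jpg'), 'wb') as f:
        f.write(b'\xff\xd8')
    archive = zip_dataset(os.path.join(tmp, 'dataset2'))
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print('dataset2/hand/example.jpg' in names)  # True
```

For this notebook, ``zip_dataset('../../../Datasets/dataset2')`` would write ``dataset2.zip`` next to the ``dataset2`` folder.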

You should see a file named ``dataset2.zip`` in the Jupyter Lab file browser.  Download it by right clicking the file and selecting ``Download``.

Next, we'll need to upload this data to our GPU desktop or cloud machine (we refer to this as the *host*) to train the collision avoidance neural network.  We'll assume that you've set up your training
machine as described in the JetBot Wiki.  If you have, you can navigate to ``http://<host_ip_address>:8888`` to open up the Jupyter Lab environment running on the host.  The notebook you'll need to open there is called ``collision_avoidance/train_model.ipynb``.

So head on over to your training machine and follow the instructions there!  Once your model is trained, we'll return to the robot Jupyter Lab environment to use the model for a live demo!