# Data Preparation

## Introduction

The first part of building any AI model is finding appropriate data. For your project, the model will be classifying images, so your data should be images as well. In this module, you will learn how to obtain your data using different methods and how to process it so that it is ready to be ingested by the model.

## Image Representation

Most digital images available today are stored in __raster__ format. A raster image is a two-dimensional table of _pixels_: small square (or rectangular) elements of the image.

![](https://drive.google.com/uc?export=view&id=1AeC4a6YCNDBFm3ArdyiKgM7rCHn92xg1)

There are various ways to encode the information held in a pixel. We will use the most popular one, RGB, in which each pixel is a tuple of three values: the Red, Green, and Blue intensities.


![](https://drive.google.com/uc?export=view&id=1yjsqrUxGGecJXvrwFHSX6ZJ1jrjUK11R)

In other words, every pixel is a sequence of three numbers (usually from 0 to 255), and an image is a table of pixels.

![](https://drive.google.com/uc?export=view&id=1Hi3HFJU-yXJX1vtd7lyhIBtSY5WEeXCW)

Hence, the input representation of any color image can be thought of as a three-dimensional table with dimensions ($width$, $height$, $3$), where the ($width$, $height$) tuple is known as the _image resolution_. The ratio of the two ($width/height$) is known as the _aspect ratio_.
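
As a rough illustration of this layout, the sketch below builds a tiny image by hand as a nested Python list. The names `image`, `resolution`, and `aspect_ratio` are chosen for illustration only. Note that a nested table is indexed row first, so the number of rows gives the height and the length of each row gives the width.

```python
# A tiny image of width 3 and height 2, written out by hand.
# Each pixel is an (R, G, B) tuple with intensities from 0 to 255.
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
WHITE, BLACK, GRAY = (255, 255, 255), (0, 0, 0), (128, 128, 128)

image = [
    [RED, GREEN, BLUE],     # first row of pixels
    [WHITE, BLACK, GRAY],   # second row of pixels
]

height = len(image)            # number of rows
width = len(image[0])          # number of pixels in each row
channels = len(image[0][0])    # three values per pixel: R, G, B

resolution = (width, height)
aspect_ratio = width / height

print(resolution, channels, aspect_ratio)
```

Real photos follow the same idea, only with hundreds or thousands of pixels along each side.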

## Obtaining Data

There are various ways to obtain data for the project. We will focus on two methods:

1. Taking photos with a camera (such as the one in your phone).
2. Taking snapshots from a video feed (for example, from a web camera).

Neither method has a meaningful advantage over the other for this project. You can use whichever you prefer, or a combination of both. Below is the skeleton code that you can use to prepare the data for pre-processing.

Mount your Google Drive using your Gmail account. Proceed to the folder where you will store the data for your project.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
### Name Project folder
proj_name = 'rps'
###

proj_path = '/content/gdrive/MyDrive/'+proj_name

You only need to run the following cells once to create the folders. On any subsequent run, the cells will report an error because the folders already exist.

In [None]:
%mkdir $proj_path

mkdir: cannot create directory ‘/content/gdrive/MyDrive/rps’: File exists


In [None]:
%mkdir {proj_path+'/rock'}

In [None]:
%mkdir {proj_path+'/paper'}

In [None]:
%mkdir {proj_path+'/scissors'}
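
If the "File exists" error on re-runs is a nuisance, one possible alternative is to create the folders from Python with `os.makedirs` and `exist_ok=True`, which does nothing when a folder is already there. The helper name `make_label_dirs` is hypothetical and not part of the original notebook; the usage comment assumes the `proj_path` variable defined above.

```python
import os

def make_label_dirs(root, labels=('rock', 'paper', 'scissors')):
    """Create root and one sub-folder per label; safe to run repeatedly."""
    created = []
    for label in labels:
        path = os.path.join(root, label)
        # exist_ok=True suppresses the error raised when the folder exists.
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created

# In the notebook this would be called as: make_label_dirs(proj_path)
```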

Change current directory to project folder.

In [None]:
## Change the current directory
%cd $proj_path
%ls

/content/gdrive/MyDrive/rps
paper/  rock/  scissors/


### Using Web Camera

Run the following cell. It defines a function that shows your webcam feed, takes a snapshot when you press the Capture button, and saves the file in your current folder.

In [None]:
from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode

def take_photo(filename='photo.jpg', quality=0.8):
  js = Javascript('''
    async function takePhoto(quality) {
      const div = document.createElement('div');
      const capture = document.createElement('button');
      capture.textContent = 'Capture';
      div.appendChild(capture);

      const video = document.createElement('video');
      video.style.display = 'block';
      const stream = await navigator.mediaDevices.getUserMedia({video: true});

      document.body.appendChild(div);
      div.appendChild(video);
      video.srcObject = stream;
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      // Wait for Capture to be clicked.
      await new Promise((resolve) => capture.onclick = resolve);

      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getVideoTracks()[0].stop();
      div.remove();
      return canvas.toDataURL('image/jpeg', quality);
    }
    ''')
  display(js)
  data = eval_js('takePhoto({})'.format(quality))
  binary = b64decode(data.split(',')[1])
  with open(filename, 'wb') as f:
    f.write(binary)
  return filename
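
The last few lines of `take_photo` rely on the fact that `canvas.toDataURL` returns a data URL: a short header such as `data:image/jpeg;base64`, a comma, and then the image bytes encoded as Base64. Below is a minimal sketch of that decoding step, using a made-up payload instead of a real photo. The helper name `decode_data_url` is hypothetical.

```python
from base64 import b64decode, b64encode

def decode_data_url(data_url):
    """Split a data URL into its header and the decoded binary payload."""
    header, encoded = data_url.split(',', 1)
    return header, b64decode(encoded)

# Build a stand-in data URL; a real one would carry JPEG bytes.
payload = b'not a real image'
data_url = 'data:image/jpeg;base64,' + b64encode(payload).decode('ascii')

header, binary = decode_data_url(data_url)
print(header)
print(binary == payload)
```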

Change the parameters in the cell below to choose the number of snapshots for each gesture. Then run the cell and follow the prompts.

![](https://drive.google.com/uc?export=view&id=1YrurRp9FYI3yGmzqqIG9wuPrIpBiAO1D)

In [None]:
from IPython.display import Image

### Choose these parameters
# maxRock - number of images taken with "rock" gesture
# maxPaper - number of images taken with "paper" gesture
# maxScissors - number of images taken with "scissors" gesture
###

maxRock = 5
maxPaper = 5
maxScissors = 5

### End initialization
###
try:
  ct = 1
  # Turn the per-gesture counts into cumulative thresholds for the counter ct.
  maxPaper = maxRock + maxPaper
  maxScissors = maxPaper + maxScissors
  while True:
    if ct <= maxRock:
      print("Show Rock and press Capture!")
      label = "rock"
    elif ct <= maxPaper:
      print("Show Paper and press Capture!")
      label = "paper"
    else:
      print("Show Scissors and press Capture!")
      label = "scissors"

    filename = take_photo(label + "/" +label+'_'+str(ct))

    print('Saved to {}'.format(filename))

    ct += 1
    if ct > maxScissors:
      break
except Exception as err:
  # Errors will be thrown if the user does not have a webcam or if they do not
  # grant the page permission to access it.
  print(str(err))

Show Rock and press Capture!
Saved to rock/rock_1
Show Rock and press Capture!
Saved to rock/rock_2
Show Rock and press Capture!
Saved to rock/rock_3
Show Rock and press Capture!
Saved to rock/rock_4
Show Rock and press Capture!
Saved to rock/rock_5
Show Paper and press Capture!
Saved to paper/paper_6
Show Paper and press Capture!
Saved to paper/paper_7
Show Paper and press Capture!
Saved to paper/paper_8
Show Paper and press Capture!
Saved to paper/paper_9
Show Paper and press Capture!
Saved to paper/paper_10
Show Scissors and press Capture!
Saved to scissors/scissors_11
Show Scissors and press Capture!
Saved to scissors/scissors_12
Show Scissors and press Capture!
Saved to scissors/scissors_13
Show Scissors and press Capture!
Saved to scissors/scissors_14
Show Scissors and press Capture!
Saved to scissors/scissors_15


### Using Photo Camera

0. If you have an iPhone, change the file format before you take pictures: in Settings -> Camera -> Formats, choose Most Compatible so that photos are saved as JPEG. You can change it back after the pictures are taken.
1. Take pictures of the different gestures using your camera.
2. Download and install the Google Drive app (if not already installed).
3. Copy the images into their respective folders (rock, paper, scissors) in your project folder.
4. Run the renaming cell in the data pre-processing section below to rename the files to a common naming convention.

## Tips for Creating Data

1. Ensure the image is of high quality and the gesture is discernible by the human eye. If a human cannot classify the image, neither will a machine!
2. Keep the background clean and the object being classified large enough to improve the quality of learning.
   - If there are extraneous items in the background, the algorithm may incorrectly treat them as part of what distinguishes one class from another.
   - If the object is too small and not prominent, the algorithm will tend to focus more on the background, reducing accuracy.
3. Ensure an appropriate aspect ratio. Changing the aspect ratio stretches or shrinks the object in the image, which makes learning more difficult.
4. Ensure the labels are correct, i.e. each image is stored in the folder that matches its gesture.
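
The last tip can be partly checked automatically. The sketch below counts the files in each label folder so that an empty or unexpectedly small folder stands out; it cannot tell whether a picture actually shows the right gesture. The function name `count_files_per_label` is hypothetical, and the usage comment assumes the `proj_path` variable defined earlier.

```python
import os

def count_files_per_label(root, labels=('rock', 'paper', 'scissors')):
    """Return {label: number of regular files in root/label}."""
    counts = {}
    for label in labels:
        folder = os.path.join(root, label)
        if not os.path.isdir(folder):
            counts[label] = 0   # a missing folder counts as empty
            continue
        counts[label] = sum(
            1 for name in os.listdir(folder)
            if os.path.isfile(os.path.join(folder, name))
        )
    return counts

# In the notebook this would be called as: count_files_per_label(proj_path)
```

Roughly equal counts across the three gestures are a reasonable target, since the capture cell above also takes the same number of snapshots per gesture by default.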

## Data Pre-processing

To uphold the principle of fairness and to simplify the machine learning process, all images need to be converted into the same file format (.jpg) and the same resolution. Please run the cells below: the first renames the files to a common naming convention, and the second converts them to JPEG at a resolution of 640×480.

Participants are also encouraged to edit the pictures manually, following the tips above, to improve the accuracy of their learning algorithm.

In [None]:
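from os import listdir, rename
from os.path import isfile, join

for label in ['rock', 'paper', 'scissors']:
  folder = join(proj_path, label)
  files = sorted(f for f in listdir(folder) if isfile(join(folder, f)))
  # Pass 1: move every file to a temporary name, so that a final name such as
  # rock_1.jpg can never overwrite a file that is still waiting to be renamed.
  for ct, f in enumerate(files, start=1):
    rename(join(folder, f), join(folder, '__tmp_'+str(ct)))
  # Pass 2: give the files their final names label_1.jpg, label_2.jpg, ...
  for ct in range(1, len(files)+1):
    rename(join(folder, '__tmp_'+str(ct)), join(folder, label+'_'+str(ct)+'.jpg'))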

In [None]:
from os import listdir
from os.path import isfile, join
from PIL import Image

proj_path = '/content/gdrive/MyDrive/rps'
for label in ['rock', 'paper', 'scissors']:
  for i in [f for f in listdir(join(proj_path,label)) if isfile(join(proj_path,label, f))]:
    im = Image.open(join(proj_path,label, i))
    # JPEG cannot store transparency or palettes, so convert to plain RGB first.
    im = im.convert('RGB')
    # resize takes (width, height); 640x480 has a 4:3 aspect ratio.
    im = im.resize((640,480))
    im.save(join(proj_path,label, i), 'jpeg')
    print("Successfully Converted {}".format(i))

Successfully Converted rock_1.jpg
Successfully Converted rock_2.jpg
Successfully Converted paper_1.jpg
Successfully Converted paper_2.jpg
Successfully Converted paper_3.jpg
Successfully Converted scissors_1.jpg
Successfully Converted scissors_2.jpg
Successfully Converted scissors_3.jpg
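
Resizing every image straight to 640×480 stretches or squeezes any picture that is not already 4:3, which works against the aspect-ratio tip given earlier. One possible remedy, shown here only as a sketch, is to crop each image to a 4:3 region around its centre before resizing. The helper `center_crop_box` is hypothetical; it returns a `(left, upper, right, lower)` box in the form accepted by Pillow's `Image.crop`.

```python
def center_crop_box(width, height, target_w=640, target_h=480):
    """Largest centred box with the target aspect ratio inside width x height."""
    # Compare aspect ratios using integer cross-multiplication.
    if width * target_h > height * target_w:
        # Image is too wide: keep the full height, trim the sides.
        new_w = height * target_w // target_h
        new_h = height
    else:
        # Image is too tall (or already matches): keep the full width.
        new_w = width
        new_h = width * target_h // target_w
    left = (width - new_w) // 2
    upper = (height - new_h) // 2
    return (left, upper, left + new_w, upper + new_h)

# Possible use inside the conversion loop (assumes a Pillow image `im`):
#   im = im.crop(center_crop_box(*im.size)).resize((640, 480))
```

Cropping discards the edges of the picture, so it only suits images where the gesture sits near the centre.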
