# Gap Framework - Computer Vision

In this session, we will introduce you to preprocessing image data for computer vision. Preprocessing, storage, retrieval and batch management are all handled by two classes, the <b style='color:saddlebrown'>Image</b> and <b style='color:saddlebrown'>Images</b> classes.

    Image - represents a single preprocessed image
    Images - represents a collection (or batch) of preprocessed images

In [None]:
# Let's go to the directory of the Gap Framework
import os
os.chdir("../")
!cd
#ls #on linux

### Setup

Let's start by importing the Gap <b style='color:saddlebrown'>vision</b> module.

In [None]:
# import the Gap Vision module
from gapml.vision import Image, Images

Let's go to a repository of images for sign language. We will use this repository for image preprocessing for computer vision.

In [None]:
os.chdir("../Training/AITraining/Intermediate/Machine Learning/sign-lang")

In [None]:
# The sign language characters (a-z) are labeled 1 .. 26, and 0 is for not a character.
# Each of the training images is under a subdirectory of the corresponding label.
labels = os.listdir("gestures")
print(labels)
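
The label scheme described in the comments above (0 for "not a character", 1 to 26 for a to z) can be decoded with a small helper. This is only a sketch: `label_to_letter` is a hypothetical helper for illustration and is not part of Gap.

```python
import string

def label_to_letter(label):
    """Map an integer label to its letter: 1..26 -> 'a'..'z', 0 -> None."""
    if label == 0:
        return None  # 0 is reserved for "not a character"
    if not 1 <= label <= 26:
        raise ValueError("label must be in the range 0..26")
    return string.ascii_lowercase[label - 1]

print(label_to_letter(1))   # a
print(label_to_letter(26))  # z
```

Note that `os.listdir()` returns the subdirectory names as strings, so a name would need `int(...)` before being passed to a helper like this.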

### Image Class

The <b style='color:saddlebrown'>Image</b> class supports the preprocessing of a single image into machine learning ready data. It can process JPG, PNG, TIF and GIF files. We will start by instantiating an <b style='color:saddlebrown'>Image</b> object for an image in the sign-lang image collection. For parameters, we will give the path to the image, and the label value (1). Labels must be mapped into integer values. For the sign-lang dataset, 1-26 are mapped to the 26 letters of the alphabet, and 0 is for a non-letter.

When we instantiate the image, by default the following will happen:

    1. The image is read in and decompressed, as a numpy array.
    2. The image is processed according to the configuration parameters or defaults (e.g., resize, normalize, flatten,
       channel conversion).
    3. The raw image data, processed image data, thumbnail, and metadata are stored to an HDF5 file.

In [None]:
image = Image('gestures/1/1.jpg', 1)

### Image Properties

Let's look at some properties of the <b style='color:saddlebrown'>Image</b> class.

Note how the shape of the ML ready data is (50, 50, 3). We will change that in a bit.

In [None]:
print( image.name )   # The root name of the image (w/o suffix)
print( image.type )   # Type of image (e.g. jpeg)
print( image.dir )    # The directory where the ML ready data will be stored
print( image.size )   # The original size of the image
print( image.shape )  # The shape of the preprocessed image (ML ready data)
print( image.label )  # The label
print( image.time )   # The amount of time (secs) to preprocess the image

### Image Data

Let's look at both the raw and ML ready data.

In [None]:
print("Raw Data", image.raw)
print("ML ready data", image.data)

By default, the number of channels is preserved (e.g., 1 for grayscale, 3 for RGB, and 4 for RGBA), and the data is normalized. For the latter, the 0..255 pixel values are rescaled to between 0 and 1.
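
As a rough illustration of that rescaling (not Gap's actual implementation), dividing each 8-bit pixel value by 255 maps it into the range 0 to 1. The helper name `normalize` is made up for this sketch, and plain Python lists stand in for the numpy arrays Gap uses.

```python
def normalize(pixels):
    """Rescale 8-bit pixel values (0..255) to floats between 0.0 and 1.0."""
    return [[value / 255.0 for value in row] for row in pixels]

raw = [[0, 51], [102, 255]]   # a tiny 2x2 grayscale image
print(normalize(raw))         # [[0.0, 0.2], [0.4, 1.0]]
```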

Let's now change the preprocessing of the image to a grayscale image and resize it to 32x32. When we print the shape, you can see the 3rd dimension (channels) is gone - indicating a grayscale image, and the size is now 32 by 32.

In [None]:
image = Image('gestures/1/1.jpg', 1, config=['grayscale', 'resize=(32,32)'])

print( image.shape )
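
To see why the channel dimension disappears, here is a minimal pure-Python sketch of a grayscale conversion. It assumes one commonly used set of luminance weights (0.299, 0.587, 0.114); the weights Gap applies internally are not stated in this session. `to_grayscale` and `shape` are illustrative helpers, not Gap functions.

```python
def to_grayscale(rgb):
    """Collapse an H x W x 3 RGB image into an H x W grayscale image."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def shape(image):
    """Return the nested-list dimensions, like a numpy .shape tuple."""
    dims = []
    while isinstance(image, list):
        dims.append(len(image))
        image = image[0]
    return tuple(dims)

rgb = [[[255, 255, 255], [0, 0, 0]],
       [[255, 0, 0], [0, 0, 255]]]
gray = to_grayscale(rgb)
print(shape(rgb))   # (2, 2, 3)
print(shape(gray))  # (2, 2)
```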

Let's now say that the image data will be fed into an ANN (not a CNN) or into a CNN with a 1D input vector. In this case, we need to feed the ML ready data as a flattened 1D vector. We can do that too. Now when we print the shape, you can see it's 1024 (32 x 32).

In [None]:
image = Image('gestures/1/1.jpg', 1, config=['grayscale', 'resize=(32,32)', 'flatten', 'raw'])

print( image.shape )
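
Flattening simply lays the rows of the image end to end. A minimal sketch, using plain lists in place of numpy arrays (`flatten` here is an illustrative helper, not the Gap implementation):

```python
def flatten(image):
    """Flatten a 2D image (list of rows) into a single 1D list, row by row."""
    return [value for row in image for value in row]

image = [[0.0] * 32 for _ in range(32)]  # a blank 32x32 grayscale image
vector = flatten(image)
print(len(vector))  # 1024
```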

### Image Loading

When an image is preprocessed, the ML ready data, raw data and attributes are stored in an HDF5 file. We can subsequently recall (load) the image information from the HDF5 file into an <b style='color:saddlebrown'>Image</b> object.

In [None]:
image = Image()   # Create an empty image object
image.load('1.h5')

Let's see if we get the same properties again.

In [None]:
print( image.name )   # The root name of the image (w/o suffix)
print( image.type )   # Type of image (e.g. jpeg)
print( image.dir )    # The directory where the ML ready data will be stored
print( image.size )   # The original size of the image
print( image.shape )  # The shape of the preprocessed image (ML ready data)
print( image.label )  # The label

Let's check that we get the raw and ML ready data again.

In [None]:
print("Raw Data", image.raw)
print("ML ready data", image.data)

### Thumbnails

We can also generate and store a thumbnail of the original image with the *config* parameter thumb. In the example below, we create a thumbnail with size 16x16.

In [None]:
image = Image('gestures/1/1.jpg', 1, config=['grayscale', 'thumb=(16,16)'])
print(image.thumb)
print("Thumb Shape", image.thumb.shape)

### Asynchronous Preprocessing

The image data can also be preprocessed asynchronously. In this mode, the parameter *ehandler* is set to an event handler (function) that will be called when the image is done being preprocessed. The image object is passed as a parameter to the event handler.

In [None]:
def func(image):
    print("DONE", image.name)

image = Image('gestures/1/1.jpg', 1, ehandler=func)
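
The event handler pattern itself can be sketched with a background thread from the standard library. This is an assumption-laden illustration of the idea, not how Gap implements it: `preprocess` and `preprocess_async` are hypothetical stand-ins.

```python
import threading

def preprocess(path):
    """Stand-in for real preprocessing work; returns a small result record."""
    return {"name": path.rsplit("/", 1)[-1].rsplit(".", 1)[0]}

def preprocess_async(path, ehandler):
    """Run preprocess() on a background thread, then call ehandler(result)."""
    def worker():
        ehandler(preprocess(path))
    thread = threading.Thread(target=worker)
    thread.start()
    return thread

done = []
thread = preprocess_async("gestures/1/1.jpg", done.append)
thread.join()  # wait here only so the demo prints deterministically
print("DONE", done[0]["name"])  # DONE 1
```

The caller gets control back immediately after `thread.start()`; the handler runs whenever the work finishes.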

In [None]:
# Let's cleanup and remove the HDF5 file
os.remove("1.h5")

### Remote Image (Url)

For the <b style='color:saddlebrown'>Image</b> class (and correspondingly the <b style='color:saddlebrown'>Images</b> class), the path to an image file may be specified as a URL, which makes it possible to preprocess images stored at remote locations. In this case, an HTTP request is made to retrieve the image data over the network.

In [None]:
# Let's load an image from the CNN news website
image = Image('https://cdn.cnn.com/cnnnext/dam/assets/180727161452-trump-speech-economy-072718-exlarge-tease.jpg', 2)
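
One plausible way to tell a remote path from a local one is to inspect the URL scheme. The sketch below uses only the standard library; `is_remote` is a hypothetical helper, and how Gap actually makes this decision is not shown in this session.

```python
from urllib.parse import urlparse

def is_remote(path):
    """Return True when the path looks like an HTTP(S) URL rather than a local file."""
    return urlparse(path).scheme in ("http", "https")

print(is_remote("https://example.com/photo.jpg"))  # True
print(is_remote("gestures/1/1.jpg"))               # False
```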

Let's look at some properties.

In [None]:
# Let's display some properties of the image that was fetched from a remote location and then preprocessed into ML ready data.
print(image.name)
print(image.size)
print(image.shape)

### Raw Image (Pixel)

For the <b style='color:saddlebrown'>Image</b> class (and correspondingly the <b style='color:saddlebrown'>Images</b> class), the image argument may alternatively be the raw pixel data itself, which makes it possible to preprocess images that are already in memory without retrieving them from storage.

In [None]:
# import the openCV module
import cv2

# Read the pixel data into memory for an image using openCV
raw = cv2.imread('gestures/1/1.jpg')

# Let's load the image directly from the raw pixel data in memory
image = Image(raw, 1)

Let's look at some properties. 

Note, since this is raw pixel data, the image has no name, and the size is the decompressed (raw) size in memory.

In [None]:
# Let's display some properties of the image that was directly loaded from raw pixel data and then preprocessed into ML ready data.
print(image.name)
print(image.size)
print(image.shape)

### Image Augmentation

Image augmentation is the process of generating (synthesizing) new images from existing images, which can then be used to augment the training process. Augmentation can include rotation, skew, sharpening and blur of existing images. These new images are then fed into the neural network during training to augment the training set. Rotation and skew aid in recognizing images from different angles, and sharpening and blur help generalize recognition (combat overfitting), as well as recognition under different lighting and time-of-day conditions.

The <b style='color:saddlebrown'>Image</b> class supports generating new images by rotation. Any degree of rotation can be specified from 0 to 360.

In [None]:
# Let's now rotate it 90 degrees
rotated = image.rotate(90)

# Let's now look at the rotated image
cv2.imshow('image',rotated)
cv2.waitKey(0)
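
For intuition, a rotation by a multiple of 90 degrees only rearranges pixels and needs no interpolation. The sketch below rotates counterclockwise; the direction Gap uses is not stated here, so treat that as an assumption. `rotate90` is an illustrative helper and does not handle arbitrary angles.

```python
def rotate90(image, turns=1):
    """Rotate a 2D image counterclockwise by 90 degrees, `turns` times."""
    for _ in range(turns % 4):
        # Transpose the image, then reverse the row order
        image = [list(row) for row in zip(*image)][::-1]
    return image

image = [[1, 2],
         [3, 4]]
print(rotate90(image))     # [[2, 4], [1, 3]]
print(rotate90(image, 2))  # [[4, 3], [2, 1]]
```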

## Images Class

The <b style='color:saddlebrown'>Images</b> class supports the preprocessing of a collection of images into machine learning ready data. For required parameters, the <b style='color:saddlebrown'>Images</b> class takes a list of images and either a list of corresponding labels, or a single value, where all the images share the same label.
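
The "list of labels or a single shared value" convention can be sketched as follows. `expand_labels` is a hypothetical helper for illustration; it is not part of Gap.

```python
def expand_labels(paths, labels):
    """Return one label per path from either a single value or a matching list."""
    if isinstance(labels, (list, tuple)):
        if len(labels) != len(paths):
            raise ValueError("need exactly one label per image")
        return list(labels)
    return [labels] * len(paths)  # a single value is shared by every image

paths = ["a.jpg", "b.jpg", "c.jpg"]
print(expand_labels(paths, 1))          # [1, 1, 1]
print(expand_labels(paths, [1, 2, 3]))  # [1, 2, 3]
```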

Let's start by creating an <b style='color:saddlebrown'>Images</b> object for all the images under the subfolder 1 (letter A).

In [None]:
# Let's get a list of all the images in the subfolder for the label 1 (letter A)
imgdir = "gestures/1/"
imglst = [imgdir + x for x in os.listdir(imgdir)]

# There should be 1200 images
len(imglst)

Let's now create an <b style='color:saddlebrown'>Images</b> object and preprocess all the above images.

    1. Process all 1200 images in the subfolder 1.
    2. Set the label to 1.
    3. Convert them to grayscale.
    
By default, the image data will be stored in an HDF5 file with the name 'collection.1.h5'.

In [None]:
# Preprocess the set of images
images = Images(imglst, 1, config=['grayscale'])

# Check that the image data is stored in HDF5 file.
os.path.exists("collection.1.h5")

### Images Properties

Next, we will show some properties of the <b style='color:saddlebrown'>Images</b> class.

Note how fast it was to preprocess the set of 1200 images into machine learning ready data and store them in an HDF5 file.

In [None]:
print( images.name )   # The name of the collection of images
print( images.dir )    # where the ML ready data is stored
print( images.time )   # The length of time to preprocess the collection of images
print( len(images) )     # The len() operator is overridden to return the number of images in the collection

In [None]:
# Let's print the vector of labels
print("LABELS", images.labels)

The <b style='color:saddlebrown'>Image</b> objects for each corresponding image can be accessed using the [] index operator. Let's get the 33rd one.

In [None]:
# The 33rd Image object (indices start at 0)
image = images[32]
print(type(image))
print("Name", image.name)
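
Support for `len()` and the `[]` index operator comes from Python's `__len__` and `__getitem__` special methods. A toy container (the class name `IndexedCollection` is made up for this sketch and is not Gap's code) shows the mechanism:

```python
class IndexedCollection:
    """Toy container showing how len() and [] can be supported."""
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

names = IndexedCollection(f"{i}.jpg" for i in range(1, 1201))
print(len(names))  # 1200
print(names[32])   # 33.jpg
```

Because indices start at 0, index 32 selects the 33rd item.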

### Directories (Subfolders) of Images

The <b style='color:saddlebrown'>Images</b> class can alternately take a list of subfolders (vs. list of images); in which case, all the images under each subfolder are preprocessed into ML ready data. This is useful if your images are separated into subfolders, where each subfolder is a separate class (label) of images. This is a fairly common practice.

In this case, the corresponding label in the same index of the labels parameter will be assigned to each image in the subfolder.

In the example below, we also use the *name* parameter to specify a name (vs. default) for the collection.

In [None]:
# Let's process a list of subfolders of images, and name the collection 'foobar'
images = Images(['gestures/1', 'gestures/2'], [1,2], name='foobar')

In [None]:
# We have two subfolders of 1200 images each, so we should expect 2400 images
print(len(images))

# cleanup
os.remove('foobar.h5')
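
The pairing of subfolders with labels can be sketched with `os.listdir()`. The helper `list_labeled_images` is hypothetical, and the temporary directory tree exists only so the example runs anywhere.

```python
import os
import tempfile

def list_labeled_images(folders, labels):
    """Pair every file found under folders[i] with labels[i]."""
    pairs = []
    for folder, label in zip(folders, labels):
        for filename in sorted(os.listdir(folder)):
            pairs.append((os.path.join(folder, filename), label))
    return pairs

# Build a throwaway directory tree with two subfolders of three files each
with tempfile.TemporaryDirectory() as root:
    folders = []
    for sub in ("1", "2"):
        folder = os.path.join(root, sub)
        os.mkdir(folder)
        for i in range(3):
            open(os.path.join(folder, f"{i}.jpg"), "w").close()
        folders.append(folder)
    pairs = list_labeled_images(folders, [1, 2])

print(len(pairs))                      # 6
print([label for _, label in pairs])   # [1, 1, 1, 2, 2, 2]
```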

### Asynchronous Preprocessing

A collection of images can also be preprocessed asynchronously. In this mode, the parameter *ehandler* is set to an event handler (function) that will be called when the collection of images is done being preprocessed. The <b style='color:saddlebrown'>Images</b> object is passed as a parameter to the event handler.

In [None]:
def func(images):
    print("DONE", images.name, "TIME", images.time)
    
images = Images(imglst, 1, config=['grayscale'], ehandler=func)

### Assembling a Collection

For performance purposes, one may decide to process subparts of a collection asynchronously, and then assemble them into a single collection. The Images class supports this with the overridden *+=* operator, which merges another preprocessed collection into the current one.

In the example below, we first create a collection for the letter 'A' images (label 1) and then separately create a second collection for the letter 'B' images (label 2), and use the *+=* operator to merge the second collection into the first collection.

In [None]:
# Create collection for the letter 'A' images
images = Images(['gestures/1'], 1)

# Create collection for the letter 'B' images and merge it with the letter 'A' image collection
images += Images(['gestures/2'], 2)

# The merged collection should have 2400 images (1200 for 'A', and 1200 for 'B')
print(len(images))
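
In Python, `+=` on an object is handled by the `__iadd__` special method. The toy class below (named `MergeableCollection` for this sketch; it is not Gap's implementation) shows how such a merge could work:

```python
class MergeableCollection:
    """Toy collection that supports merging with the += operator."""
    def __init__(self, items, label):
        self.items = list(items)
        self.labels = [label] * len(self.items)

    def __len__(self):
        return len(self.items)

    def __iadd__(self, other):
        """Merge another collection into this one; called for `self += other`."""
        self.items.extend(other.items)
        self.labels.extend(other.labels)
        return self  # += rebinds the name to whatever __iadd__ returns

a = MergeableCollection(range(1200), 1)
a += MergeableCollection(range(1200), 2)
print(len(a))  # 2400
```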

### Splitting a Collection into Training and Test Data

The *split* property will split the <b style='color:saddlebrown'>Image</b> objects into training and test data. The list of training image objects is then randomized. When used as a setter, the property takes either 1 or 2 arguments. The first argument is the percentage that is test data, and the optional second argument is the seed for the random shuffle.

In [None]:
# Split the image objects into 80% training and 20% test
images.split = 0.20

# Let's verify that the training set is 80% of the collection (1920 of the 2400 merged images) by printing the internal variable _train
print(len(images._train))

# Let's now print the randomized list of image object indices
print("TRAIN INDICES", images._train)

Let's now add the optional parameter for a random seed.

In [None]:
# Split the image objects into 80% training and 20% test
images.split = 0.20, 42

# Let's now print the randomized list of image object indices
print("TRAIN INDICES", images._train)
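
The effect of the seed can be sketched with the standard library: the same seed always produces the same shuffled order. `split_indices` is a hypothetical helper, and shuffling before splitting is an assumption of this sketch rather than a description of Gap's internals.

```python
import random

def split_indices(count, test_fraction, seed=None):
    """Shuffle indices 0..count-1 and split them into (train, test) lists."""
    indices = list(range(count))
    random.Random(seed).shuffle(indices)  # a fixed seed gives a repeatable order
    n_test = round(count * test_fraction)
    return indices[n_test:], indices[:n_test]

train, test = split_indices(1200, 0.20, seed=42)
print(len(train), len(test))  # 960 240

# Re-running with the same seed reproduces the same split
print(split_indices(1200, 0.20, seed=42)[0] == train)  # True
```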

### Batch Feeding (Batch Gradient Descent)

There are three ways to use the <b style='color:saddlebrown'>Images</b> object to feed a neural network. In batch mode, the entire training set can be run through the neural network in a single pass, prior to backward propagation and updating the weights using gradient descent. This is known as 'batch gradient descent'.

When the *split* property is used as a getter, it returns the image data and corresponding labels for the training and test set, similar to using scikit-learn's train_test_split() function.

In [None]:
# Set the percentage and seed, and split the data
images.split = 0.20, 42

# Get the training, test sets and corresponding labels
x_train, x_test, y_train, y_test = images.split

Let's verify and print the len of the train, test and corresponding labels.

In [None]:
print("x_train", len(x_train))
print("y_train", len(y_train))
print("x_test", len(x_test))
print("y_test", len(y_test))

Let's verify that the elements are what we expect.

In [None]:
# Each element in x_train list should be a numpy array
print(type(x_train[0]))
# Each element should be in the shape 50 x 50 pixels
print(x_train[0].shape)

In [None]:
# Each element in y_train should be the label (integer)
print(type(y_train[0]))

### Next Iterating (Stochastic Gradient Descent)

Another way of feeding a neural network is to feed one image at a time and do backward propagation, using gradient descent. This is known as stochastic gradient descent.

The *next()* operator supports iterating through the training list one image object at a time. Once the entire training set has been iterated through, it is reset and the training set is randomly re-shuffled for the next epoch.

In [None]:
# Let's iterate through the ML ready data and label for each image in the training set
while True:
    data, label = next(images)
    if data is None: break
    print(type(data), label)
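
The behavior described above (a `(None, None)` result marking the end of an epoch, followed by a reshuffle) can be sketched with the `__next__` special method. `TrainingFeed` is a made-up class for illustration, not the Gap implementation.

```python
import random

class TrainingFeed:
    """Return (data, label) pairs one at a time; (None, None) marks the end of an epoch."""
    def __init__(self, samples, seed=None):
        self._samples = list(samples)
        self._rng = random.Random(seed)
        self._pos = 0

    def __next__(self):
        if self._pos >= len(self._samples):
            self._rng.shuffle(self._samples)  # reshuffle for the next epoch
            self._pos = 0
            return None, None
        sample = self._samples[self._pos]
        self._pos += 1
        return sample

feed = TrainingFeed([("img0", 1), ("img1", 1), ("img2", 2)], seed=0)
seen = 0
while True:
    data, label = next(feed)
    if data is None:
        break
    seen += 1
print(seen)  # 3
```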

### Mini-batch generation

Another way of feeding a neural network is through mini-batches. A mini-batch is a subset of the training set that contains more than one image. After each mini-batch is fed, backward propagation, using gradient descent, is done.

Typically, minibatches are set to sizes like 30, 50, 100, or 200. We will use the *minibatch* property as a setter to set the mini-batch size to 100.

In [None]:
# Set minibatch size to 100 images
images.minibatch = 100

When we use the *minibatch* property as a getter, it will create a generator.

In [None]:
# Calculate the number of batches
nbatches = len(images) // 100

# process each mini-batch
for _ in range(nbatches):
    # Create a generator for the next minibatch
    g = images.minibatch
    # Get the data, labels for each item in the minibatch
    for data, label in g:
        pass
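
A generic mini-batch generator can be sketched in a few lines. This is not Gap's *minibatch* property, just an illustration of slicing a training list into batches; `minibatches` is a hypothetical helper.

```python
def minibatches(samples, batch_size):
    """Yield consecutive slices of `samples`, each at most `batch_size` long."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

samples = list(range(960))  # e.g., 960 training items
batches = list(minibatches(samples, 100))
print(len(batches))      # 10
print(len(batches[-1]))  # 60
```

In this sketch the last batch is smaller than the rest; counting batches with floor division (`960 // 100`) would give 9 and skip that partial batch.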

## Dataset of Images

Let's load the entire dataset - that's 27 collections of 1200 images each.

In [None]:
# Prepare each set of labeled data into machine learning ready data
# The images are 50x50, bitdepth=8, 1 channel (grayscale)
total = 0
collections=[]
for label in labels:
    # Get a list of all images in the subdirectory for this label (should be 1200 images)
    imgdir = "gestures/" + label + "/"
    imglst = [imgdir + x for x in os.listdir(imgdir)]
    images = Images(imglst, int(label), name='tmp' + label, config=['flatten', 'grayscale', 'raw'])
    collections.append(images)
    print("Processed: " + label, "Number of images:", len(images), "Time: ", images.time)
    total += images.time
    
print("average:", total / len(labels))
    

Let's verify the preprocessing of our image data

In [None]:
# Let's see how many batches (collections) we have (hint: should be 27)
print(len(collections))

# Let's verify that the items in the collections are an Images object
collection = collections[3]
print(type(collection))

# For a collection, let's see how many image objects we have (hint: should be 1200)
print(len(collection))

Let's look at the first Image object in this collection.

In [None]:
# Let's get the first Image item and verify it is an Image object
image = collection[0]
print(type(image))

Let's now view some of the properties and verify that the images got processed as expected.

In [None]:
# Let's get some basic information about the image
print(image.name)  # the root name of the image
print(image.type)  # the image file suffix
print(image.size)  # the size of the image on disk

In [None]:
# Let's now check the raw (uncompressed) unprocessed image
print(image.raw.shape)

In [None]:
# Let's look at how the image got processed.
print(image.shape)  # Note, that the preprocessed image was flattened into a 1D vector. It was 50x50, and now is 2500.

Let's now take a look at the image. Remember to hit any key to exit the viewer (i.e., cv2.waitKey(0))

In [None]:
# Let's view the raw image
import cv2
cv2.imshow('image',image.raw)
cv2.waitKey(0)

# End of Session 3

In [None]:
# some cleanup
os.remove('collection.1.h5')
for i in range(27):
    os.remove('tmp' + str(i) + '.h5')