# Gap Framework - Computer Vision

In this session, we introduce preprocessing image data for computer vision. Preprocessing, storage, retrieval and batch management are all handled by two classes, Image and Images.

    Image - represents a single preprocessed image
    Images - represents a collection (or batch) of preprocessed images

In [1]:
# Let's go to the directory of the Gap Framework
import os
os.chdir("../")
!cd

C:\Users\'\Desktop\Gap-ml


### Setup

Let's start by importing the Gap <b style='color:saddlebrown'>vision</b> module

In [2]:
# import the Gap Vision module
from vision import Image, Images

Let's go to a repository of images for sign language. We will use this repository for image preprocessing for computer vision.

In [3]:
os.chdir("../Training/AITraining/Intermediate/Machine Learning/sign-lang")

In [4]:
# The sign language characters (a-z) are labeled 1 .. 26, and 0 is for not a character.
# Each training image is under a subdirectory named after its label.
labels = os.listdir("gestures")
    

### Image Class

The Image class supports preprocessing a single image into machine learning ready data. It can process JPG, PNG, TIF and GIF files. We will start by instantiating an Image object for an image in the sign-lang image collection. For parameters, we give the path to the image and the label value (1). Labels must be mapped to integer values. For the sign-lang dataset, the labels 1-26 map to the 26 letters of the alphabet, and 0 denotes a non-letter.
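
To make the label convention concrete, here is a minimal sketch of the mapping between integer labels and letters. The helper names `label_to_char` and `char_to_label` are hypothetical and are not part of the Gap framework; returning an empty string for label 0 is also just an assumption for illustration.

```python
import string

def label_to_char(label: int) -> str:
    """Map an integer label to its letter: 1 -> 'a' ... 26 -> 'z'."""
    if label == 0:
        return ""  # assumed convention here for the "not a character" class
    if 1 <= label <= 26:
        return string.ascii_lowercase[label - 1]
    raise ValueError(f"label out of range: {label}")

def char_to_label(ch: str) -> int:
    """Inverse mapping: 'a' -> 1 ... 'z' -> 26, anything else -> 0."""
    ch = ch.lower()
    if len(ch) == 1 and ch in string.ascii_lowercase:
        return string.ascii_lowercase.index(ch) + 1
    return 0

print(label_to_char(1), char_to_label("z"))
```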

When we instantiate the image, by default the following will happen:

    1. Image is read in and decompressed, as a numpy array.
    2. The image is processed according to the configuration parameters or defaults (e.g., resizing, normalization,
       flattening, channel conversion).
    3. The raw image data, processed image data, thumbnail, and metadata are stored in an HDF5 file.
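
The configuration parameters in step 2 are passed as a list of strings; later cells in this session use values such as `'grayscale'`, `'resize=(32,32)'` and `'flatten'`. The sketch below shows one way such strings could be turned into settings. It is only an illustration: `parse_config` is a hypothetical helper, and the session does not show how Gap actually interprets these strings.

```python
import ast

def parse_config(config):
    """Turn ['grayscale', 'resize=(32,32)'] into {'grayscale': True, 'resize': (32, 32)}."""
    settings = {}
    for entry in config:
        # Split "key=value" entries; bare keys such as 'flatten' act as flags.
        key, sep, value = entry.partition("=")
        settings[key.strip()] = ast.literal_eval(value.strip()) if sep else True
    return settings

print(parse_config(['grayscale', 'resize=(32,32)', 'flatten']))
```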

In [6]:
image = Image('gestures/1/1.jpg', 1)

### Image Properties

Let's look at some properties of the Image class.

Note how the shape of the ML ready data is (50, 50, 3). We will change that in a bit.

In [14]:
print( image.name )   # The root name of the image (w/o suffix)
print( image.type )   # Type of image (e.g. jpeg)
print( image.dir )    # The directory where the ML ready data will be stored
print( image.size )   # The original size of the image
print( image.shape )  # The shape of the preprocessed image (ML ready data)
print( image.label )  # The label
print( image.time )   # The amount of time (secs) to preprocess the image

1
jpg
./
1336
(50, 50, 3)
1
0.040000200271606445


### Image Data

Let's look at both the raw and ML ready data.

In [13]:
print("Raw Data", image.raw)
print("ML ready data", image.data)

Label is 1
Raw Data [[[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 ...

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]
  [0 0 0]
  ...
  [0 0 0]
  [0 0 0]
  [0 0 0]]]
ML ready data [[[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]
  ...
  [0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]
  ...
  [0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]
  ...
  [0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 ...

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]
  ...
  [0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]
  ...
  [0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]
  ...
  [0. 0. 0.]
  [0. 0. 0.]
  [0. 0. 0.]]]


By default, the number of channels is preserved (e.g., 1 for grayscale, 3 for RGB, and 4 for RGBA), and the data is normalized: the 0..255 pixel values are rescaled to between 0 and 1.
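
As a rough illustration of this normalization, the NumPy sketch below rescales 8-bit pixel values into the 0 to 1 range. The random array is a stand-in for a decoded image, the `normalize` helper is hypothetical, and the choice of `float32` is an assumption; the dtype Gap uses is not stated here.

```python
import numpy as np

def normalize(raw: np.ndarray) -> np.ndarray:
    """Rescale 0..255 integer pixel values to floats between 0.0 and 1.0."""
    return raw.astype(np.float32) / 255.0

# Stand-in for a decoded 50x50 RGB image with 8-bit pixel values.
raw = np.random.default_rng(0).integers(0, 256, size=(50, 50, 3), dtype=np.uint8)
data = normalize(raw)

print(raw.dtype, raw.shape)
print(data.dtype, data.shape)
```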

Let's now change the preprocessing to convert the image to grayscale and resize it to 32x32. When we print the shape, you can see the 3rd dimension (channels) is gone, indicating a grayscale image, and the size is now 32 by 32.

In [15]:
image = Image('gestures/1/1.jpg', 1, config=['grayscale', 'resize=(32,32)'])

print( image.shape )

(32, 32)
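
For intuition about what these two settings do to the array, here is a NumPy-only sketch. It assumes RGB channel order, uses the common ITU-R BT.601 luma weights for the grayscale conversion, and uses nearest-neighbor sampling for the resize. Both helpers are hypothetical; Gap may well use a different conversion or interpolation method.

```python
import numpy as np

def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Collapse an (H, W, 3) RGB array to (H, W) using BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return rgb.astype(np.float32) @ weights

def resize_nearest(img: np.ndarray, size: tuple) -> np.ndarray:
    """Nearest-neighbor resize of an (H, W) or (H, W, C) array to size=(height, width)."""
    new_h, new_w = size
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h   # source row for each output row
    cols = np.arange(new_w) * w // new_w   # source column for each output column
    return img[rows[:, None], cols]

rgb = np.zeros((50, 50, 3), dtype=np.uint8)   # stand-in for a decoded image
small = resize_nearest(to_grayscale(rgb), (32, 32))
print(small.shape)
```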


Let's now say that the image data will be fed into an ANN (not a CNN), or into a CNN with a 1D input vector. In this case, we need to feed the ML ready data as a flattened 1D vector. We can do that too. Now when we print the shape, you can see it is 1024 (32 x 32).

In [16]:
image = Image('gestures/1/1.jpg', 1, config=['grayscale', 'resize=(32,32)', 'flatten'])

print( image.shape )

(1024,)
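
Flattening only changes the layout of the array, not its values. The NumPy sketch below, using a synthetic 32x32 array as a stand-in for the grayscale image, shows the flattening step and how the 2D layout can be recovered with `reshape` when the original dimensions are known.

```python
import numpy as np

# Stand-in for a 32x32 grayscale image.
image_2d = np.arange(32 * 32, dtype=np.float32).reshape(32, 32)

flat = image_2d.flatten()          # 1D copy in row-major order
restored = flat.reshape(32, 32)    # back to 2D, given the known dimensions

print(flat.shape)
print(restored.shape)
```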


### Image Loading

When an image is preprocessed, the ML ready data, raw data and attributes are stored in an HDF5 file. We can subsequently recall (load) the image information from the HDF5 file into an image object.

In [27]:
image = Image()   # Create an empty image object
image.load('1.h5')

In [28]:
print( image.name )   # The root name of the image (w/o suffix)
print( image.type )   # Type of image (e.g. jpeg)
print( image.dir )    # The directory where the ML ready data will be stored
print( image.size )   # The original size of the image
print( image.shape )  # The shape of the preprocessed image (ML ready data)
print( image.label )  # The label

1
h5
./
18888
(1024,)
1
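
To illustrate the idea of storing and recalling image data with HDF5, here is a minimal round-trip sketch using `h5py`. The dataset names `raw` and `data`, the `label` attribute, and the helpers `save_image` and `load_image` are all assumptions made for this example; the actual layout of the file Gap writes is not shown in this session.

```python
import os
import tempfile

import h5py
import numpy as np

def save_image(path, raw, data, label):
    """Write raw pixels, ML ready data and the label to an HDF5 file."""
    with h5py.File(path, "w") as hf:
        hf.create_dataset("raw", data=raw)
        hf.create_dataset("data", data=data)
        hf.attrs["label"] = label

def load_image(path):
    """Read the arrays and the label back from the HDF5 file."""
    with h5py.File(path, "r") as hf:
        return hf["raw"][:], hf["data"][:], int(hf.attrs["label"])

raw = np.zeros((50, 50, 3), dtype=np.uint8)    # stand-in raw image
data = raw.astype(np.float32).reshape(-1)      # stand-in ML ready vector

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.h5")
    save_image(path, raw, data, label=1)
    raw2, data2, label2 = load_image(path)

print(raw2.shape, data2.shape, label2)
```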


### Images Class

Let's now load an entire dataset of images.

In [None]:
# Prepare each set of labeled data into machine learning ready data
# The images are 50x50, bitdepth=8, 1 channel (grayscale)
total = 0
collections=[]
for label in labels:
    # Get a list of all images in the subdirectory for this label (should be 1200 images)
    imgdir = "gestures/" + label + "/"
    imglst = [imgdir + x for x in os.listdir(imgdir)]
    images = Images(imglst, int(label), dir='tmp' + label, config=['flatten', 'grayscale', 'nostore'])
    collections.append(images)
    print("Processed: " + label, "Number of images:", len(images), "Time: ", images.time)
    total += images.time
    
print("average:", total / len(labels))
    

Let's verify the preprocessing of our image data.

In [None]:
# Let's see how many batches (collections) we have (hint: should be 27)
print(len(collections))

# Let's verify that each item in collections is an Images object
collection = collections[3]
print(type(collection))

# For a collection, let's see how many image objects we have (hint: should be 1200)
print(len(collection))

Let's look at the first Image object in this collection.

In [None]:
# Let's get the first Image item and verify it is an Image object
image = collection[0]
print(type(image))

In [None]:
# Let's get some basic information about the image
print(image.name)  # the root name of the image
print(image.type)  # the image file suffix
print(image.size)  # the size of the image on disk

In [None]:
# Let's now check the raw (uncompressed) unprocessed image
print(image.raw.shape)

In [None]:
# Let's look at how the image got processed.
print(image.shape)  # Note that the preprocessed image was flattened into a 1D vector. It was 50x50, and is now 2500.

In [None]:
# Let's view the raw image
import cv2
cv2.imshow('image', image.raw)
cv2.waitKey(0)            # Wait for a key press
cv2.destroyAllWindows()   # Close the display window afterwards

## Let's presume that it is some later date, and we want to recall the collections from disk.

We start by creating an empty Images() object. We then identify the name of the collection and execute the load() method with the collection name.

In [None]:
# Let's load up one collection
images = Images(dir="tmp0/")
images.load("collection.1")

Wow, we just recalled the whole collection from disk. Let's now look at our machine learning ready data!

In [None]:
print(len(images))
image = images[17]
print(image.name)
print(image.type)
print(image.size)
print(image.shape)

In [None]:
print(image.raw.shape)

# End of Session 3