# Gap Framework - Computer Vision

In this session, we introduce preprocessing image data for computer vision. Preprocessing, storage, retrieval, and batch management are all handled by two classes: Image and Images.

    Image - represents a single preprocessed image
    Images - represents a collection (or batch) of preprocessed images

In [1]:
import os
os.chdir("../")

Let's start by importing the Gap Vision module

In [2]:
# import the Gap Vision module
from image import Image, Images

In [3]:
!cd

C:\Users\'\Desktop\epipog-nlp


In [4]:
os.chdir("../Training/AITraining/Intermediate/Machine Learning/sign-lang")

In [5]:
# The sign language characters (a-z) are labeled 1 .. 26; 0 means not a character.
# Each training image is in a subdirectory named after its label.
labels = os.listdir("gestures")
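
Because os.listdir returns the directory names as strings in no guaranteed order, the labels may be processed as 0, 1, 10, 11, ... rather than in numeric order. A minimal sketch of sorting them numerically, assuming every subdirectory name parses as an integer; the sample list is made up for illustration:

```python
# Hypothetical sample of directory names, as os.listdir might return them.
labels = ["0", "1", "10", "11", "2", "20", "3"]

# Sort by integer value rather than by string comparison.
labels_sorted = sorted(labels, key=int)
print(labels_sorted)
```

Sorting only changes the processing order; each collection still receives its label from its own directory name.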
    

In [8]:
# Prepare each set of labeled data into machine learning ready data
# The images are 50x50, bitdepth=8, 1 channel (grayscale)
collections=[]
for label in labels:
    # Get a list of all images in the subdirectory for this label (should be 1200 images)
    imgdir = "gestures/" + label + "/"
    imglst = [imgdir + x for x in os.listdir(imgdir)]
    images = Images(imglst, int(label), dir='tmp' + label, config=['flatten', 'grayscale'])
    collections.append(images)
    print("Processed: " + label, "Number of images:", len(images), "Time: ", images.time)


Processed: 0 Number of images: 1200 Time:  2.0600028038024902
Processed: 1 Number of images: 1200 Time:  2.1200029850006104
Processed: 10 Number of images: 1200 Time:  2.130002975463867
Processed: 11 Number of images: 1200 Time:  2.350003242492676
Processed: 12 Number of images: 1200 Time:  2.3040056228637695
Processed: 13 Number of images: 1200 Time:  2.220003128051758
Processed: 14 Number of images: 1200 Time:  2.24200439453125
Processed: 15 Number of images: 1200 Time:  2.2900032997131348
Processed: 16 Number of images: 1200 Time:  2.2500030994415283
Processed: 17 Number of images: 1200 Time:  2.15000319480896
Processed: 18 Number of images: 1200 Time:  2.280003070831299
Processed: 19 Number of images: 1200 Time:  2.1800031661987305
Processed: 2 Number of images: 1200 Time:  6.380008935928345
Processed: 20 Number of images: 1200 Time:  5.140007257461548
Processed: 21 Number of images: 1200 Time:  5.071007013320923
Processed: 22 Number of images: 1200 Time:  5.78000807762146
Processed: 23 Number of i
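
The loop above builds each file list by string concatenation and passes every directory entry to Images. A hedged sketch of a more defensive way to build that list, using os.path.join and skipping non-image files. The helper name list_image_paths and the suffix list are assumptions for illustration and are not part of the Gap framework:

```python
import os
import tempfile

def list_image_paths(root, label, suffixes=(".jpg", ".jpeg", ".png")):
    """Return sorted paths of image files under root/label (hypothetical helper)."""
    imgdir = os.path.join(root, label)
    return sorted(
        os.path.join(imgdir, name)
        for name in os.listdir(imgdir)
        if name.lower().endswith(suffixes)
    )

# Demo on a throwaway directory tree standing in for "gestures".
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "3"))
    for name in ("1.jpg", "2.jpg", "notes.txt"):
        open(os.path.join(root, "3", name), "w").close()
    paths = list_image_paths(root, "3")
    names = [os.path.basename(p) for p in paths]
    print(names)
```

The resulting list could be passed to Images in place of imglst; filtering by suffix keeps stray files out of a collection.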

Let's verify the preprocessing of our image data

In [26]:
# Let's see how many batches (collections) we have (hint: should be 27)
print(len(collections))

# Let's verify that each item in collections is an Images object
collection = collections[3]
print(type(collection))

# For a collection, let's see how many image objects we have (hint: should be 1200)
print(len(collection))

27
<class 'image.Images'>
1200
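
The same checks can be applied to every collection rather than to a single one. A minimal sketch that relies only on len(), so plain lists stand in for Images objects here; the function name check_collections is hypothetical:

```python
def check_collections(collections, expected_batches=27, expected_size=1200):
    """Return (index, size) pairs for collections whose size is unexpected."""
    if len(collections) != expected_batches:
        raise ValueError(
            f"expected {expected_batches} collections, got {len(collections)}"
        )
    return [
        (i, len(c)) for i, c in enumerate(collections) if len(c) != expected_size
    ]

# Stand-in data: plain lists instead of Images objects, one deliberately short.
fake = [[0] * 1200 for _ in range(27)]
fake[5] = [0] * 1199
bad = check_collections(fake)
print(bad)
```

An empty result means every collection has the expected number of images.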


Let's look at the first Image object in this collection.

In [27]:
# Let's get the first Image item and verify it is an Image object
image = collection[0]
print(type(image))

<class 'image.Image'>


In [28]:
# Let's get some basic information about the image
print(image.name)  # the root name of the image
print(image.type)  # the image file suffix
print(image.size)  # the size of the image on disk

1
jpg
1459


In [29]:
# Let's now check the raw (uncompressed) unprocessed image
print(image.raw.shape)

(50, 50)


In [30]:
# Let's look at how the image got processed.
# Note that the preprocessed image was flattened into a 1D vector: it was 50x50 and is now 2500.
print(image.shape)

(2500,)
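
To illustrate what flattening does to the shape, here is a small NumPy sketch. It uses a synthetic 50x50 array as a stand-in for image.raw and does not call the Gap framework, so it shows the general technique rather than the framework's own implementation:

```python
import numpy as np

# Synthetic 50x50 8-bit grayscale array standing in for image.raw.
raw = (np.arange(50 * 50) % 256).astype(np.uint8).reshape(50, 50)

# Flatten row by row into a 1D vector of 2500 values.
flat = raw.flatten()
print(flat.shape)

# The 2D layout can be recovered as long as the original shape is known.
restored = flat.reshape(50, 50)
print(np.array_equal(restored, raw))
```

A flattened vector is the input form expected by models without convolutional layers, and reshaping it back is useful for viewing an image.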


In [31]:
# Let's view the raw image
import cv2
cv2.imshow('image',image.raw)
cv2.waitKey(0)

-1

## Let's presume that it is some later date, and we want to recall the collections from disk.

We start by creating an Images object that holds no images and points at the storage directory. We then call the load() method with the name of the collection.

In [35]:
# Let's load up one collection
images = Images(dir="tmp/")
images.load("foobar")

Wow, we just recalled the whole collection from disk. Let's now look at our machine learning ready data!

In [36]:
print(len(images))
image = images[17]
print(image.name)
print(image.type)
print(image.size)
print(image.shape)

1200
None
None
0
None


In [37]:
print(image.raw.shape)

AttributeError: 'NoneType' object has no attribute 'shape'
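
The error above occurs because image.raw is None for this loaded image, so there is no array to take a shape from. One defensive pattern is to check for missing raw data before using it. This is a sketch with a stand-in object rather than a real Image, and the helper raw_shape is hypothetical:

```python
from types import SimpleNamespace

def raw_shape(image):
    """Return image.raw.shape, or None when no raw data is attached."""
    raw = getattr(image, "raw", None)
    return None if raw is None else raw.shape

# Stand-in for a loaded Image whose raw data is missing.
loaded = SimpleNamespace(raw=None)
print(raw_shape(loaded))
```

A None result signals that the raw pixels are unavailable for that image and that only the stored fields can be used.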