# To E or Not to E
## Machine Vision Challenge

Welcome to the ASL Challenge! We'll walk through a real-life example of how IBM machine learning is being used
to help improve the lives of those who are hearing-impaired.

John Pace from Mark III Systems is building an app that can recognize sign language letters in real time. This would allow a person to
understand exactly what someone using ASL is trying to say as it happens, with minimal delay. It will essentially allow two people to communicate
with each other in their own native way of speaking. Communication barriers are broken down, and it will be as if there is no distinction
between speaking and signing.

This Notebook represents a template for rapidly executing a machine vision proof of concept using IBM Watson Studio. In the early stages of any data science project there will be a need to quickly assess whether there is a model capable of performing the inferences that are desired. For example, if we are to develop a deep learning model that will perform real-time ASL translation then we need to first determine if we can classify images of the ASL letters and words. 

A data scientist must conduct experiments, going through an iterative process of hypothesis testing and exploratory data analysis to gauge the feasibility and potential of any given model. In this case, we are making the hypothesis that we can differentiate a few ASL letters with a moderate sample size of images using a particular algorithm. The model will classify images based on whether they display the signs for the letters ‘E’, ‘S’, or ‘Y’. If we are not able to obtain a better than random accuracy result from our model then we would consider the need to revisit the hypothesis and the chosen algorithms.    
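
As a rough yardstick for "better than random", the sketch below computes the chance-level accuracy for a set of balanced classes. It assumes the four folders used later in this Notebook (the three letters plus Controls) hold roughly equal numbers of images; the helper name `chance_accuracy` is ours and is not part of any library.

```python
def chance_accuracy(class_names):
    """Accuracy expected from guessing uniformly at random among balanced classes."""
    if not class_names:
        raise ValueError("at least one class is required")
    return 1.0 / len(class_names)

# Assumption: four balanced classes, matching the folders used in this Notebook.
classes = ["Controls", "E", "S", "Y"]
baseline = chance_accuracy(classes)
print("Chance-level accuracy: {:.0%}".format(baseline))
```

If the classes are noticeably unbalanced, the share of the largest class is usually a fairer baseline than this uniform guess.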

## Step 1: Install & Load Packages

The first step in running a Notebook is to install and load all of the packages needed to import and process data, manipulate images, and train and test machine vision models. There are some standard ingredients such as NumPy and Pandas, Python packages for managing numerical arrays and data frames respectively. To apply machine vision, though, we have also included some popular deep learning packages such as Keras and TensorFlow.

##### Directions
1) Click into the cell below labeled "#Install Packages" and click the "Run" button in the Notebook toolbar.

2) Run the next cell labeled "#Load Packages".

##### Expected Output

When each cell has finished running, a number will appear within the "In [ ]:" statement at the upper left of the cell. The first cell will not produce any output when completed, but the second cell will print a statement that says "Using TensorFlow backend."

In [None]:
#Install Packages

! pip install ibm-cos-sdk -q
! pip install keras==2.1.4 -q
! pip install tensorflow==1.9 -q
! pip install h5py -q # let us save the model

In [None]:
#Load Packages

import zipfile
from ibm_botocore.client import Config
import ibm_boto3
import pandas as pd
import numpy as np
import keras
import matplotlib.pyplot as plt
from keras.layers import Dense,GlobalAveragePooling2D
from keras.applications import MobileNet , ResNet50
from keras.preprocessing import image
from keras.applications.mobilenet import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.optimizers import Adam
import shutil
import tensorflow
import types
from botocore.client import Config
from numpy import loadtxt
from keras.models import load_model
import pickle
%matplotlib inline
from tensorflow import ConfigProto
from tensorflow import InteractiveSession
import os
import requests 

class WatsonWarriors:
    def __init__(self, token):
        self.host = "https://api.watsonwarriors.ai:8487"
        self.token = token

    def complete(self):
        # Post the completion request to the Watson Warriors validation endpoint.
        return requests.post(
            url=self.host + '/validate/0',
            json={'data': 1},
            headers={'Authorization': 'Bearer ' + self.token},
        )

## Step 2: Get the Data

Data science projects rely on large sets of clean and curated data for exploratory analysis, as well as model training and testing. In the context of a machine vision model we need data in the form of a large quantity of labelled images. 

The next section of code will import a zip file containing images nested within four folders. Each folder is named for the classification of the images it contains; the labels are ‘E’, ‘S’, ‘Y’, and ‘Controls’. Staging the images in this way allows our code to infer each image's label from the folder it belongs to. At the end of the code cell, some additional cleanup removes a few bad images.
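
To make the folder-per-class layout concrete, here is a small sketch that builds a tiny in-memory zip archive and counts the image entries in each class folder. The file names and the helper `count_images_per_folder` are made up for illustration; the real archive holds many more files.

```python
import io
import posixpath
import zipfile
from collections import Counter

def count_images_per_folder(zip_bytes):
    """Count .jpg entries in a zip archive, grouped by their parent folder name."""
    counts = Counter()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".jpg"):
                # Zip entry names always use forward slashes.
                folder = posixpath.basename(posixpath.dirname(name))
                counts[folder] += 1
    return dict(counts)

# Build a toy archive (hypothetical file names, empty file contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for path in ["ASL Images/E/E_1.jpg", "ASL Images/E/E_2.jpg",
                 "ASL Images/S/S_1.jpg", "ASL Images/Y/Y_1.jpg",
                 "ASL Images/Controls/Controls_1.jpg"]:
        archive.writestr(path, b"")

toy_counts = count_images_per_folder(buffer.getvalue())
print(toy_counts)
```

A count like this is a quick way to check that every class has a comparable number of images before training.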


##### Directions
1) Run the cell labeled "#Get Data".

##### Expected Output

There is no expected output for this step. When the cell has finished running, a number will appear within the "In [ ]:" statement at the upper left of the cell.

In [None]:
#Get Data

# This function will plot images in the form of a grid with 1 row and 5 columns where images are placed in each column.
def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, ax in zip( images_arr, axes):
        ax.imshow(img)
    plt.tight_layout()
    plt.show()
BATCH_SIZE = 8
IMG_SHAPE  = 224 # Our training images are resized to a width of 224 pixels and a height of 224 pixels
img_width = 224
img_height = 224
image_gen = ImageDataGenerator(rescale=1./255, horizontal_flip=True)


os.getcwd()

#For this step, we are simply passing our "credentials" to IBM cloud to allow us to download data from IBM Cloud Storage. You can sort of think of it as entering a
#website password. Besides our username and password, we're also telling it directly the name of the files we want to use, where we want the files to go, and which files
#are not useful for our purposes.


credentials_1 = {
    'IAM_SERVICE_ID': 'iam-ServiceId-245853a8-b4bb-4cae-9f38-88a5a14930d8',
    'IBM_API_KEY_ID': 'J2qAH8aiRNHF97QXXzMpRvukNb7berjw3QK9IMlNBQ1y',
    'ENDPOINT': 'https://s3-api.us-geo.objectstorage.service.networklayer.com',
    'IBM_AUTH_ENDPOINT': 'https://iam.ng.bluemix.net/oidc/token',
    'BUCKET': 'wildfirepredictionibm-donotdelete-pr-rcabpk7sog3qdr',
    'FILE': 'ASL_Sample_Images.zip'
}

cos = ibm_boto3.client(service_name='s3',
    ibm_api_key_id=credentials_1['IBM_API_KEY_ID'],
    ibm_auth_endpoint=credentials_1['IBM_AUTH_ENDPOINT'],
    config=Config(signature_version='oauth'),
    endpoint_url=credentials_1['ENDPOINT'])

cos.download_file(Bucket=credentials_1['BUCKET'], Key=credentials_1['FILE'], Filename=credentials_1['FILE'])

# Use paths relative to the working directory, matching the download above and the cleanup below.
with zipfile.ZipFile('ASL_Sample_Images.zip', 'r') as zip_ref:
    zip_ref.extractall('./Sample_Images/')

! rm "./Sample_Images/ASL Images/Y/Y_421.jpg"
! rm "./Sample_Images/ASL Images/S/S_421.jpg"
! rm "./Sample_Images/ASL Images/E/E_421.jpg"
! rm "./Sample_Images/ASL Images/Controls/Controls_421.jpg"

## Step 3: Discover Classes and Sample Size

Before we dig further into machine learning, we first need to understand what we are trying to predict. Sometimes we are trying to predict future values of a number; other times we want to know what group, or class, the data belongs to. As discussed, we are working with ASL images of the letters "E", "S", and "Y". These are our "classes", or "labels" as we referred to them earlier. We know which classes our images belong to, but it is important to make sure the computer does as well. A machine learning model only knows as much as we tell it.

For this step, we will simply load our images from the folder and verify our anticipated classes of "E", "S", and "Y", along with "Controls". While "Controls" appears as a class of images, these are really just images we provide to the model as a contrast to our chosen letters. This further highlights the individual characteristics of each class, and our classification algorithm takes care of the rest, finding what makes each set of images unique. With enough data, it will be able to tell which class a new image belongs to.

How will the model know which images belong to which class? Simple. When we load our images, they will be assigned a class based on the name of the folder they are in. So we have a separate folder for E, one for Y, one for S, and the last for Controls.
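
The folder-to-class mapping can be sketched as follows. This mirrors, in simplified form, our understanding of how Keras's `flow_from_directory` assigns labels (subfolder names sorted and numbered from zero); the helper `infer_class_indices` and the temporary directory are illustrative assumptions, and the actual mapping can be checked through `train_generator.class_indices`.

```python
import os
import tempfile

def infer_class_indices(directory):
    """Map each subfolder name to an integer index, in sorted order."""
    classes = sorted(
        entry for entry in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, entry))
    )
    return {name: index for index, name in enumerate(classes)}

# Toy directory tree standing in for './Sample_Images/ASL Images/'.
with tempfile.TemporaryDirectory() as root:
    for folder in ["Y", "E", "Controls", "S"]:
        os.makedirs(os.path.join(root, folder))
    class_indices = infer_class_indices(root)

print(class_indices)
```

The sorted order is why the class probabilities later in this Notebook are listed as Controls, E, S, Y.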

##### Directions
1) Run the cell labeled "#Discover Classes and Sample Size".

##### Expected Output

Observe the three outputs: Image Classes, Class E Contents, and Augmented images (what do you notice?). What you see here is what our model will see. The first two consist of text, and the third will be images displayed below our "Discover Classes and Sample Size" cell.




In [None]:
#Discover Classes and Sample Size

#Here we are setting up our "training set" for machine learning. You can think of it as a very carefully curated dataset that contains attributes we specifically want the model
#to pick up on. 


train_datagen=ImageDataGenerator(preprocessing_function=preprocess_input) #included in our dependencies

train_dir = './Sample_Images/ASL Images/'

e_dir = './Sample_Images/ASL Images/E/'



train_generator=train_datagen.flow_from_directory(train_dir,   # path to the folder that holds the training set; the folder names are the classes
                                                 target_size=(224,224),
                                                 color_mode='rgb',
                                                 batch_size=8,   # batch size may need to be reduced if the memory gets full
                                                 class_mode='categorical',
                                                 shuffle=True)

train_data_gen = image_gen.flow_from_directory(batch_size=BATCH_SIZE, 
                                               directory=train_dir, 
                                               shuffle=True, 
                                               target_size=(IMG_SHAPE,IMG_SHAPE))
augmented_images = [train_data_gen[0][0][7] for i in range(5)]


#Outputs
print("")
print("")
print("")
print("Here we will output our folder names, show example contents of the E folder, and plot a few images with a special type of augmentation applied. Can you guess what was done to the images?")
print("")
print("")
print("-Classes")
print(os.listdir(train_dir))
print("")
print("-Class 'E'")
print("")
print('%.50s' % os.listdir(e_dir))
print("")
print("")
print("Notice anything weird about these photos?")
plotImages(augmented_images)



## Step 4: Augment Images

In the next section we will enhance the original dataset by rotating and stretching the images that were provided. This provides additional examples for training a model and gives the model a richer set of conditions in which to make inferences. This is a critical step in image classification. As data scientists, we find it difficult to collect data representative of every single way an image can appear. After all, we all take photos differently and have different cameras: some have better zoom than others, higher or lower resolution, and so on. Image augmentation reshapes and repositions images in a variety of ways, not only by rotating, expanding, and contracting them, so that our classification model learns to account for these scenarios. Even with a limited dataset, we can augment what we have, turning one image into two (or 3, 4, 5...), and strengthen the accuracy of our model without ever having to take additional images. We'll show you two examples of ways images can be augmented: rotation and zooming.
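
To give a feel for what "random" augmentation means, the sketch below draws a few rotation angles and zoom factors. It is based on our reading of the Keras documentation, where `rotation_range` is a range in degrees and a single float `zoom_range` of z means factors between 1 - z and 1 + z; the helper `sample_augmentation` is hypothetical, and it simplifies matters by drawing one zoom factor where Keras draws one per axis.

```python
import random

def sample_augmentation(rotation_range, zoom_range, rng):
    """Draw one random rotation angle (degrees) and one zoom factor."""
    angle = rng.uniform(-rotation_range, rotation_range)
    zoom = rng.uniform(1.0 - zoom_range, 1.0 + zoom_range)
    return angle, zoom

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# 70 and 0.6 are the values used in the "#Augment Images" cell.
samples = [sample_augmentation(70, 0.6, rng) for _ in range(5)]
for angle, zoom in samples:
    print("rotate by {:+.1f} degrees, zoom factor {:.2f}".format(angle, zoom))
```

Because a fresh angle and factor are drawn every time an image is requested, the same source photo can yield many different training examples.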

##### Directions
1) Run the cell below labeled "#Augment Images".

##### Expected Output

Once the cell has run, you can take a look at how we've changed the images. You'll view one set of images rotated by a random angle of up to 70 degrees, and another in which the images have been zoomed in or out using a zoom range of 0.6.

In [None]:
#Augment Images

image_gen = ImageDataGenerator(rescale=1./255, rotation_range=70)

train_data_gen = image_gen.flow_from_directory(batch_size=BATCH_SIZE, 
                                               directory=train_dir, 
                                               shuffle=True, 
                                               target_size=(IMG_SHAPE, IMG_SHAPE))

augmented_images = [train_data_gen[0][0][3] for i in range(5)]

print("Rotation of up to 70 degrees")
print("")
print("")
plotImages(augmented_images)


image_gen = ImageDataGenerator(rescale=1./255, zoom_range=0.6)

train_data_gen = image_gen.flow_from_directory(batch_size=BATCH_SIZE, 
                                               directory=train_dir, 
                                               shuffle=True, 
                                               target_size=(IMG_SHAPE, IMG_SHAPE))

augmented_images = [train_data_gen[0][0][0] for i in range(5)]

print("Zoom range of 0.6")
print("")
print("")
plotImages(augmented_images)


## Step 5: Import Pre-Trained Model

Time to finally execute a model! Here you'll be working with one that has already been trained. Machine learning models, especially those that work with images, can take quite a while to train. Using a pre-trained model is valuable because it lets you start from one that has already been taught, or "trained", to work with the dataset in question. As new data is generated and uploaded, having a model already trained saves time. A data scientist needs to be careful, though, to make sure the model they've chosen to reuse performs well. You would not want to reuse something that is incomplete or inaccurate. Algorithms are not perfect, but we can build them to be as close to perfect as possible.

The model used here is called MobileNet. MobileNet is a convolutional neural network optimized for high-performance image recognition without requiring the immense computation power typically needed for accurate image classification. A neural network is essentially an algorithm modeled to recognize patterns in a way similar to how humans do. It uses a network of "neurons" that observe details of the data you provide and determine which features make your dataset unique. It tries to identify these features and how much each one matters, and assigns a numerical "weight" to each of them. During training, a neural network repeatedly adjusts these weights, under the provided parameters, until it finds what it believes to be the best balance of weights across all features.

Example:

A neural network runs through an image set and identifies features A, B, and C. On the first run, the features are given weights of 0.3, 0.5, and 0.2 respectively, and the network checks whether that accurately describes your images. It then runs again with a new set of weights and compares the result to the original.

Attempt 1:
A = 0.3
B = 0.5
C = 0.2

Attempt 2:
A = 0.1
B = 0.3
C = 0.6

.
.
.
.
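
The repeated adjustment of weights described above can be sketched with a toy example: a single "neuron" with three weights, nudged step by step to reduce its error on one made-up data point. This is a deliberately tiny illustration using plain gradient descent; the feature values, target, and learning rate are assumptions, and a real network such as MobileNet adjusts millions of weights across many layers.

```python
def predict(weights, features):
    """Weighted sum of the input features."""
    return sum(w * x for w, x in zip(weights, features))

def train_step(weights, features, target, learning_rate):
    """One gradient-descent step on the squared error for a single example."""
    error = predict(weights, features) - target
    return [w - learning_rate * 2 * error * x for w, x in zip(weights, features)]

# Hypothetical example: three feature values (A, B, C) and a desired output.
features = [1.0, 2.0, 3.0]
target = 2.0
weights = [0.3, 0.5, 0.2]  # the "Attempt 1" weights from the text

initial_error = abs(predict(weights, features) - target)
for _ in range(20):
    weights = train_step(weights, features, target, learning_rate=0.01)
final_error = abs(predict(weights, features) - target)

print("error before: {:.4f}, after: {:.6f}".format(initial_error, final_error))
```

Each pass moves every weight a little in the direction that shrinks the error, which is the sense in which each "attempt" improves on the one before it.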


##### Directions
1) Run the cell below labeled "#Import Pre-Trained Model".

##### Expected Output

Below our "Import Pre-Trained Model" cell you will see a table containing columns labeled "Layer", "Output Shape", and "Param #". This is a summary of the loaded model: each row is one layer of the network, "Output Shape" is the shape of the data that layer produces, and "Param #" is the number of weights the layer holds. These parameters can be thought of as the details the model has learned in order to identify the behavior and characteristics of our image set.
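
As a rough guide to reading the "Param #" column, the sketch below shows how parameter counts are commonly computed for two layer types. The helper names and the layer sizes are hypothetical, chosen only for illustration; they are not read from the model loaded in this Notebook.

```python
def dense_params(inputs, units, use_bias=True):
    """Weights in a fully connected layer: one per input-unit pair, plus biases."""
    return inputs * units + (units if use_bias else 0)

def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Weights in a standard 2D convolution layer."""
    return kernel_h * kernel_w * in_channels * filters + (filters if use_bias else 0)

# Hypothetical layer sizes, chosen only for illustration.
head_params = dense_params(inputs=1024, units=4)  # 4 output classes
first_conv = conv2d_params(3, 3, in_channels=3, filters=32, use_bias=False)
print(head_params, first_conv)
```

Adding up the counts for every row gives the total number of parameters reported at the bottom of the summary table.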

In [None]:
#Import Pre-Trained Model

#### download model
cos.download_file(Bucket=credentials_1['BUCKET'],Key='asl_mobile_netv2.h5',Filename='asl_mobile_netv2.h5')     # first model is called 'asl_mobile_net.h5'   the second is called 'asl_mobile_netv2.h5'
from keras.utils.generic_utils import CustomObjectScope

with CustomObjectScope({'relu6': keras.applications.mobilenet.relu6,'DepthwiseConv2D': keras.applications.mobilenet.DepthwiseConv2D}):
    model = load_model('asl_mobile_netv2.h5')
model.summary()

## Step 6: View and Test Model

Now our model is loaded and ready to be tested. We will test it by feeding the pre-trained model a series of images. For each image, it will try to determine which class the image belongs to. The idea is that it should already know what an "E" image looks like, and the same for S and Y. As soon as the model views an image, it makes a determination about which letter the image shows. In a perfect world, it would identify each image's class 100% of the time. In reality, though, there are grey areas and similarities between the image types. Our three image classes of E, S, and Y have traits that they share and others that are unique to each. With enough data, one can attempt to identify all of these traits. In practice, however, a few images from "E" will be labeled as "S", some "S" images will be labeled as "Y", and so on.
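
One common way to see which classes get mixed up is to count every pair of true and predicted labels. The sketch below does this on made-up labels; they are not results from the model in this Notebook, and the helper names are ours.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) pair of labels occurs."""
    return Counter(zip(true_labels, predicted_labels))

def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / float(len(true_labels))

# Made-up labels for illustration; these are not results from the model.
true_labels      = ["E", "E", "E", "S", "S", "Y", "Y", "Controls"]
predicted_labels = ["E", "S", "E", "S", "Y", "Y", "Y", "Controls"]

counts = confusion_counts(true_labels, predicted_labels)
for (true, predicted), n in sorted(counts.items()):
    print("true {:8s} predicted {:8s} count {}".format(true, predicted, n))
print("accuracy: {:.2f}".format(accuracy(true_labels, predicted_labels)))
```

Pairs where the true and predicted labels differ show exactly which letters the model tends to confuse.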

##### Directions

1) Run the cell below labeled "#View and Test Model".

##### Expected Output

An image is displayed along with three pieces of information: 1. its filename (which will indicate whether it is "E", "S", or "Y"), 2. the probability that each class applies to the image, and 3. the predicted class.

The filename, when compared to the predicted class, will tell us whether or not the model accurately identified which letter describes each image. Because the three letters have similarities to each other in terms of visual appearance, we receive scores, or probabilities, that a certain letter is true for an image. They are ordered Controls, E, S, and Y.
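
The step from class probabilities to a predicted class can be sketched as picking the label with the highest score, which is similar in spirit to the `np.argmax` call in the cell below. The probabilities here are made up for illustration, and `predicted_label` is a hypothetical helper.

```python
def predicted_label(probabilities, label_names):
    """Return the label with the highest probability, plus that probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return label_names[best_index], probabilities[best_index]

label_names = ["Controls", "E", "S", "Y"]
# Made-up probabilities for illustration, in the same order as label_names.
probabilities = [0.05, 0.80, 0.10, 0.05]

label, confidence = predicted_label(probabilities, label_names)
print(label, confidence)
```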

A scenario for an "E" image may look as follows:

Class Probabilities [Controls, E, S, Y]:
[0.07, 0.687, 0.227, 0.008]

Predicted Class:

"E"




In [None]:
#View and Test Model

sess = tensorflow.Session()
labels = {}
labels["label_names"] = ["Controls" , "E" , "S" , "Y"]
img_list = ["./Sample_Images/ASL Images/Controls/Controls_3.jpg"  , "./Sample_Images/ASL Images/Y/Y_63.jpg"  , "./Sample_Images/ASL Images/E/E_21.jpg" , "./Sample_Images/ASL Images/S/S_3.jpg"  ]
img_path = img_list[2]

img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
plt.imshow(x/255.)

img = tensorflow.read_file(img_path)
img = tensorflow.image.decode_jpeg(img, channels=3)
img.set_shape([None, None, 3])
img = tensorflow.image.resize_images(img, (224, 224))
img = img.eval(session=sess) # convert to numpy array
img = np.expand_dims(img, 0) # make 'batch' of 1
print("---Filename")
print("")
print("")
print(img_path)
print("")
print("")
pred = model.predict(img)
print("---Class Probabilities, Predicted Class, Original Image")
print("")
print("")
print(pred)
pred = labels["label_names"][np.argmax(pred)]
pred

#print(img_list[1])


## Step 7: Complete Challenge

##### Directions

1) Go back to the Watson Warriors tab, copy the code from the Finish the Challenge task, and paste it into the cell below.

2) Run the cell below.

3) Return to the Watson Warriors tab to see your score!

In [None]:
## Paste code

## This completes the challenge!!