<table style="border: none" align="left">
   <tr style="border: none">
              <th style="border: none"><font face="verdana" size="5" color="black"><b>IBM Cloud Discovery Lab: Neural Network Model</b></font></th>
      <th style="border: none"><img src="https://github.com/pmservice/customer-satisfaction-prediction/blob/master/app/static/images/ml_icon_gray.png?raw=false" alt="Watson Machine Learning icon" height="40" width="40"></th>
   </tr>
</table>

### Lab Details:

This lab demonstrates how Watson Studio integrates with open source frameworks and libraries such as TensorFlow and Keras. Notebooks offer a high level of customization whether you prefer to program in Python, R, or Scala, and they can draw on a broad set of tools within the IBM Cloud environment.

#### Services and Tools:

In addition to Watson Studio, we'll use the following IBM Cloud services:

- Apache Spark;
- Cloud Object Storage;
- Python Web App with Flask;
- Watson Machine Learning;


#### Datasets used:
AIRCRAFT
http://image-net.org/synset?wnid=n02686568

BIRDS
http://image-net.org/synset?wnid=n01503061

HUMANS
http://image-net.org/synset?wnid=n02472987



## Libraries Installation

In [None]:
!pip install wget

In [None]:
!pip install dict

In [None]:
!pip install dictionary

In [None]:
!pip install --upgrade --index-url https://test.pypi.org/simple/ watson-machine-learning-client

## Libraries Import

In [None]:
import os, urllib3, requests, json, time, wget, base64, glob
import shutil, random, tarfile, ibm_boto3
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import tensorflow as tf

from IPython.display import clear_output, Image, display, HTML
from skimage import img_as_float
from six.moves import urllib
from uuid import uuid4
from urllib.request import urlopen
from botocore.client import Config
from watson_machine_learning_client import WatsonMachineLearningAPIClient
from keras.preprocessing import image
from keras.applications.inception_v3 import decode_predictions, preprocess_input

%matplotlib inline

Conveniently, TensorFlow includes a [script](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py) that handles transfer learning for either an [Inception V3 model](https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models) or a [MobileNet](https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html). Both were trained on images from the [ImageNet 2012 competition](http://www.image-net.org/challenges/LSVRC/2012/) (1,000 categories and 1.2 million images).

**Inception V3**: higher accuracy but slower. Top-1 accuracy on ImageNet: 78%; model size: ~85 MB <br />
**MobileNets**: smaller and faster, but lower accuracy. Top-1 accuracy on ImageNet: 70.7%; model size: ~19 MB

## Import Inception V3 Retrain architecture

In [None]:
!wget -O retrain.py https://raw.githubusercontent.com/tensorflow/tensorflow/7f53659bc67bba5567ea3f0b69710329843e0228/tensorflow/examples/image_retraining/retrain.py

## Import architecture labels

In [None]:
!wget -O label_image.py https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/label_image/label_image.py

## Importing the training and testing ImageNet datasets from Dropbox

In [None]:
!wget -O trainingset.zip "https://www.dropbox.com/s/4unryao4xt8ehgp/training_images.zip?dl=0"

In [None]:
!wget -O testingset.zip "https://www.dropbox.com/s/0eu80yday3r4bn8/testing_images.zip?dl=0"

In [None]:
!unzip -o ./trainingset.zip

In [None]:
!unzip -o ./testingset.zip

## Directory Lookup

In [None]:
!ls  -aLF ./training_images/

In [None]:
!ls  -aLF ./testing_images/

In [None]:
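# Remove macOS metadata files, if present, so they are not mistaken for class folders.
for ds_store in ('training_images/.DS_Store', 'testing_images/.DS_Store'):
    if os.path.exists(ds_store):
        os.remove(ds_store)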

## Visualizing the training images

In [None]:
directories = os.listdir('training_images/')
images = []
for folder in os.listdir('training_images'):
    path = os.path.join('training_images', folder)
    images.extend([os.path.join(path, f) for f in os.listdir(path)])

# Plot some sample images in the dataset.
plt.figure(figsize=(20,10))
for i in range(15):
    img = mpimg.imread(random.choice(images))
    plt.subplot(3, 5, i+1)
    plt.imshow(img)
    frame = plt.gca()
    frame.axes.get_xaxis().set_visible(False)
    frame.axes.get_yaxis().set_visible(False)

## Training process: applying transfer learning with the Inception V3 architecture

Normally training a model from scratch would take an enormous amount of time and resources. Here, however, we will only be training the final layer of the network, so the training time is much more reasonable.
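To make "training only the final layer" concrete, here is a minimal NumPy sketch, not the code that `retrain.py` actually runs. It fits a softmax layer on fixed feature vectors with plain gradient descent. The function names, the toy cluster data standing in for Inception features, and the hyperparameters are all assumptions made for illustration.

```python
import numpy as np

def train_final_layer(features, labels, num_classes, steps=200, learning_rate=0.1):
    """Fit a softmax layer on fixed feature vectors with plain gradient descent."""
    num_examples, num_features = features.shape
    weights = np.zeros((num_features, num_classes))
    biases = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = features @ weights + biases
        logits -= logits.max(axis=1, keepdims=True)  # keep exp() numerically stable
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - one_hot) / num_examples  # gradient of mean cross entropy w.r.t. logits
        weights -= learning_rate * (features.T @ grad)
        biases -= learning_rate * grad.sum(axis=0)
    return weights, biases

def predict(features, weights, biases):
    """Return the most likely class index for each feature vector."""
    return np.argmax(features @ weights + biases, axis=1)

# Toy stand-in for precomputed feature vectors: three well-separated clusters.
rng = np.random.RandomState(0)
centers = np.array([[4.0, 0.0], [0.0, 4.0], [-4.0, -4.0]])
toy_labels = np.repeat(np.arange(3), 30)
toy_features = centers[toy_labels] + rng.normal(scale=0.5, size=(90, 2))

weights, biases = train_final_layer(toy_features, toy_labels, num_classes=3)
train_accuracy = np.mean(predict(toy_features, weights, biases) == toy_labels)
print("training accuracy:", train_accuracy)
```

Because the earlier layers are frozen, only `weights` and `biases` are updated, which is why retraining is so much cheaper than training the whole network.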

Let's go over some of the arguments we will be using.

The ***bottleneck_dir*** will be used to cache the outputs of the lower layers on disk so they don't have to be recalculated repeatedly. 'Bottleneck' is an informal term often used for the layer just before the final output layer that actually does the classification. Since each image is reused several times during training, it would be too time-consuming to recalculate the layers before the bottleneck every time. These lower layers never change, so we can run each image through them once, then cache and reuse the outputs.
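The caching idea can be sketched as follows. This is an illustration under stated assumptions, not the on-disk format `retrain.py` uses: `get_or_create_bottleneck` and `fake_lower_layers` are hypothetical names, and the fake function stands in for the expensive forward pass through the frozen layers.

```python
import os
import tempfile
import numpy as np

def get_or_create_bottleneck(image_path, cache_dir, compute_fn):
    """Return the cached feature vector for image_path, computing it only once."""
    cache_path = os.path.join(cache_dir, os.path.basename(image_path) + ".npy")
    if os.path.exists(cache_path):
        return np.load(cache_path)
    values = np.asarray(compute_fn(image_path), dtype=np.float32)
    os.makedirs(cache_dir, exist_ok=True)
    np.save(cache_path, values)
    return values

calls = []

def fake_lower_layers(image_path):
    """Hypothetical stand-in for the expensive pass through the frozen layers."""
    calls.append(image_path)
    return np.full(4, float(len(image_path)))

cache_dir = os.path.join(tempfile.mkdtemp(), "bottlenecks")
first = get_or_create_bottleneck("aircraft/img_001.jpg", cache_dir, fake_lower_layers)
second = get_or_create_bottleneck("aircraft/img_001.jpg", cache_dir, fake_lower_layers)
print("expensive computations performed:", len(calls))
```

The second call reads the vector from disk instead of calling `fake_lower_layers` again. A real cache would also need to keep files from different class folders apart, since this sketch keys only on the file name.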


The ***how_many_training_steps*** option specifies that we want to run this example for 1000 iterations. You can experiment with this value.

The ***model_dir*** option specifies where the script downloads and stores the pretrained Inception model.

The ***summaries_dir*** option specifies where to save summary logs for TensorBoard (which we won't be using here).

The ***output_graph*** option is where the script will write out a version of the Inception v3 neural network with a final layer retrained to our categories. 

The ***output_labels*** will be the file where the labels are stored. These labels are the same as the image folder names.

Lastly, we use the ***image_dir*** argument to pass in the directory whose labeled class folders contain our images.
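As a rough illustration of the folder layout that ***image_dir*** expects, the sketch below maps each class folder name to the image files inside it. The helper `list_labeled_images`, the accepted extensions, and the demo folder names are assumptions for this example; the retraining script may normalize label names differently.

```python
import os
import tempfile

def list_labeled_images(image_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Map each class folder name to the sorted image paths it contains."""
    labeled = {}
    for entry in sorted(os.listdir(image_dir)):
        folder = os.path.join(image_dir, entry)
        if not os.path.isdir(folder):
            continue  # skips stray files such as .DS_Store
        files = [os.path.join(folder, name)
                 for name in sorted(os.listdir(folder))
                 if name.lower().endswith(extensions)]
        if files:
            labeled[entry] = files
    return labeled

# Build a throwaway directory tree to demonstrate the expected layout.
demo_dir = tempfile.mkdtemp()
for label, names in {"aircraft": ["a1.jpg", "a2.JPEG"], "birds": ["b1.jpg"]}.items():
    os.makedirs(os.path.join(demo_dir, label))
    for name in names:
        open(os.path.join(demo_dir, label, name), "w").close()
open(os.path.join(demo_dir, ".DS_Store"), "w").close()

labeled_images = list_labeled_images(demo_dir)
for label, paths in labeled_images.items():
    print(label, len(paths))
```

Each top-level folder becomes one label, so adding a new category only requires adding a new folder of images.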

In [None]:
!python retrain.py \
    --bottleneck_dir=./ml-model/bottlenecks \
    --how_many_training_steps 1000 \
    --learning_rate 0.01 \
    --train_batch_size 200 \
    --model_dir=./ml-model/pretrained_model \
    --summaries_dir=./retrain-logs \
    --output_graph=./ml-model/retrained_graph.pb \
    --output_labels=./ml-model/retrained_labels.txt \
    --image_dir=./training_images/ \
    --saved_model_dir=./saved-model/

### Training Concepts

As the retraining script runs, you'll see a series of step outputs, each showing the following information:

* **Training accuracy**: the percentage of images in the current training batch that were labeled correctly.
* **Validation accuracy**: the percentage of correctly labeled images in a randomly selected group of images held out from the training set.
* **Cross entropy**: a loss function that shows how well the learning process is progressing (the lower the better).

If the *training accuracy* is high but the *validation accuracy* stays low, the model is overfitting or memorizing specific features in the training images that don't help it classify images more generally.

While training, keep an eye on the *cross entropy*. The goal is to get this value as small as possible; a loss that trends downward indicates that the model is learning.
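The two quantities reported at each step can be computed as in this small NumPy sketch. The probability matrix and labels are made-up values for illustration, and the function names are not taken from the retraining script.

```python
import numpy as np

def accuracy(probabilities, labels):
    """Fraction of rows whose highest-probability class matches the true label."""
    return float(np.mean(np.argmax(probabilities, axis=1) == labels))

def cross_entropy(probabilities, labels, epsilon=1e-12):
    """Mean negative log-probability assigned to the true class."""
    true_class_probs = probabilities[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs + epsilon)))

# Made-up softmax outputs for a batch of four images and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.2, 0.6]])
labels = np.array([0, 1, 2, 2])

batch_accuracy = accuracy(probs, labels)
batch_loss = cross_entropy(probs, labels)
print("accuracy:", batch_accuracy, "cross entropy:", batch_loss)
```

Accuracy only counts whether the top prediction is right, while cross entropy also rewards confidence in the correct class, which is why the loss can keep falling after accuracy has levelled off.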


## Visualizing the TensorFlow model

In [None]:
# This visualization code taken from: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                tensor.tensor_content = tf.compat.as_bytes("<stripped %d bytes>"%size)
    return strip_def
  
def rename_nodes(graph_def, rename_func):
    res_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = res_def.node.add() 
        n.MergeFrom(n0)
        n.name = rename_func(n.name)
        for i, s in enumerate(n.input):
            n.input[i] = rename_func(s) if s[0]!='^' else '^'+rename_func(s[1:])
    return res_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))
  
    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

In [None]:
with tf.gfile.FastGFile("./ml-model/retrained_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()

    # Parse the graph.
    graph_def.ParseFromString(f.read())
    show_graph(graph_def)

## Visualizing the testing images

In [None]:
directories_testing = os.listdir('testing_images/')
images_testing = []
for folder in os.listdir('testing_images'):
    path_testing = os.path.join('testing_images', folder)
    # Collect testing images in their own list rather than appending to the training list.
    images_testing.extend([os.path.join(path_testing, f) for f in os.listdir(path_testing)])

# Plot some sample images in the dataset.
plt.figure(figsize=(20,10))
for i in range(15):
    img = mpimg.imread(random.choice(images_testing))
    plt.subplot(3, 5, i+1)
    plt.imshow(img)
    frame = plt.gca()
    frame.axes.get_xaxis().set_visible(False)
    frame.axes.get_yaxis().set_visible(False)

## Validating the model and providing an image path for a test 

In [None]:
model_dir = './ml-model'

test_image = './testing_images/validation_aircrafts/n04583620_4216.JPEG'

input_layer = 'Mul'
input_height = 299
input_width = 299


%env MODEL_DIR=$model_dir
%env INPUT_HEIGHT=$input_height
%env INPUT_WIDTH=$input_width
%env TEST_IMAGE=$test_image
%env INPUT_LAYER=$input_layer

## Running the scripts and getting the results of the Image Recognition tests

In [None]:
img = mpimg.imread(test_image)
plt.figure(figsize=(8,8))
plt.imshow(img)
frame = plt.gca()
frame.axes.get_xaxis().set_visible(False)
frame.axes.get_yaxis().set_visible(False)

!python ./label_image.py \
    --graph=$MODEL_DIR/retrained_graph.pb --labels=$MODEL_DIR/retrained_labels.txt \
    --input_layer=$INPUT_LAYER \
    --output_layer=final_result \
    --input_height=$INPUT_HEIGHT --input_width=$INPUT_WIDTH \
    --image=$TEST_IMAGE
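The labeling script reports a score per class. As a rough sketch of how such scores map back to label names, the snippet below ranks a score vector against a label list. The labels and scores here are hypothetical values, not output from the retrained model.

```python
import numpy as np

def top_k_labels(scores, labels, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1][:k]
    return [(labels[i], float(scores[i])) for i in order]

# Hypothetical labels and scores for illustration only.
example_labels = ["aircraft", "birds", "humans"]
example_scores = [0.91, 0.06, 0.03]

ranked = top_k_labels(example_scores, example_labels, k=2)
for label, score in ranked:
    print("{}: {:.2f}".format(label, score))
```

In practice the label list would be read from the `retrained_labels.txt` file written during training, one label per line.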










## Creating REST API 

In [None]:
cos_credentials = {
  "apikey": "insert your API key here",
  "resource_instance_id": "insert your resource instance ID here"
}

auth_endpoint = 'https://iam.bluemix.net/oidc/token'
service_endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'

In [None]:
cos = ibm_boto3.resource('s3',
                         ibm_api_key_id=cos_credentials['apikey'],
                         ibm_service_instance_id=cos_credentials['resource_instance_id'],
                         ibm_auth_endpoint=auth_endpoint,
                         config=Config(signature_version='oauth'),
                         endpoint_url=service_endpoint)

In [None]:
from uuid import uuid4

bucket_uid = str(uuid4())
buckets = ['training-data-' + bucket_uid, 'training-results-' + bucket_uid]

for bucket in buckets:
    if cos.Bucket(bucket) not in cos.buckets.all():
        print('Creating bucket "{}"...'.format(bucket))
        try:
            cos.create_bucket(Bucket=bucket)
        except ibm_boto3.exceptions.ibm_botocore.client.ClientError as e:
            print('Error: {}.'.format(e.response['Error']['Message']))


In [None]:
data_links = ['http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz',
              'http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz',
              'http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz',
              'http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz']

In [None]:
from urllib.request import urlopen

bucket_obj = cos.Bucket(buckets[0])

for data_link in data_links:
    filename=data_link.split('/')[-1]
    print('Uploading data {}...'.format(filename))
    with urlopen(data_link) as data:
        bucket_obj.upload_fileobj(data, filename)
        print('{} is uploaded.'.format(filename))

In [None]:
for bucket_name in buckets:
    print(bucket_name)
    bucket_obj = cos.Bucket(bucket_name)
    for obj in bucket_obj.objects.all():
        print("  File: {}, {:4.2f}kB".format(obj.key, obj.size/1024))

## Integration with Watson Machine Learning client API services

In [None]:
wml_credentials = {
  "Insert your credentials here"
}

In [None]:
!rm -rf $PIP_BUILD/watson-machine-learning-client

In [None]:
!pip install watson-machine-learning-client --upgrade

In [None]:
client = WatsonMachineLearningAPIClient(wml_credentials)

In [None]:
client.version

In [None]:
# Directory passed to --saved_model_dir in the retraining command.
model_dir_path = './saved-model/'

## Properties to integrate the model with the Watson Machine Learning (WML) API

In [None]:
model_meta_props = {client.repository.ModelMetaNames.NAME: "ibm_disco_lab",
                            client.repository.ModelMetaNames.AUTHOR_NAME: "Jorge Chagas",
                            client.repository.ModelMetaNames.AUTHOR_EMAIL: "jorge.barbosa@ibm.com",
                            client.repository.ModelMetaNames.FRAMEWORK_NAME: "tensorflow",
                            client.repository.ModelMetaNames.FRAMEWORK_VERSION: "1.5",
                            client.repository.ModelMetaNames.RUNTIME_NAME: "python",
                            client.repository.ModelMetaNames.RUNTIME_VERSION: "3.5"}
published_model_details = client.repository.store_model(model=model_dir_path, meta_props=model_meta_props, training_data='./training_images/')

In [None]:
definition_uid = client.repository.get_model_uid(published_model_details) 

In [None]:
definition_uid

In [None]:
model_details = client.repository.get_details(definition_uid)

## Model details in WML

In [None]:
print(json.dumps(model_details, indent=2))

In [None]:
client.repository.list_models()

In [None]:
loaded_model = client.repository.load(definition_uid)

In [None]:
loaded_model

In [None]:
print("Url: " + client.repository.get_model_url(model_details))

In [None]:
model_uid = client.repository.get_model_uid(model_details)
print("Saved model uid: " + model_uid)

## Verification for the model deployment

In [None]:
deployment_details = client.deployments.create(model_uid, "IBM Disco")

In [None]:
scoring_url =  client.deployments.get_scoring_url(deployment_details)
print(scoring_url)

In [None]:
url = 'https://us-south.ml.cloud.ibm.com'
username = 'your_wml_username'
password = 'your_wml_password'
scoring_endpoint = scoring_url

In [None]:
headers = urllib3.util.make_headers(basic_auth='{}:{}'.format(username, password))
path = '{}/v3/identity/token'.format(url)
response = requests.get(path, headers=headers)
mltoken = json.loads(response.text).get('token')
print(mltoken)

In [None]:
img = image.load_img('./testing_images/validation_aircrafts/n04583620_4216.JPEG',target_size=(299,299))
img

In [None]:
input_image = image.img_to_array(img)
input_image = np.expand_dims(input_image, axis=0)
input_image = preprocess_input(input_image).tolist()

In [None]:
scoring_data = {'values': input_image}

In [None]:
predictions = client.deployments.score(scoring_url, scoring_data)
print("Scoring result: " + str(predictions))

In [None]:
predictions.get('values')