# MNIST Classifier Using Watson Machine Learning

This notebook will allow you to submit jobs to the [Watson Machine Learning (WML) service](https://www.ibm.com/cloud/machine-learning). 

At this point in the process you should have a working Watson Studio instance, a Watson Machine Learning service, and a Cloud Object Storage bucket with the MNIST dataset files uploaded to it. From here we will cover steps 6-8 as defined in the GitHub repository, replicated here for clarity:

1. Create a Watson Studio instance 
2. Add a Deep Learning project
3. Hook up the Jupyter notebook
4. Get the data, and upload it to your storage bucket
5. Obtain the credentials for both COS and WML
6. **Add your credentials to the template notebook**
7. **Train the model, monitoring progress and results**
8. **Deploy the model, and then test the endpoint**

*If any of these requirements sound unfamiliar or confusing, please check back on the [README.md](https://github.com/FarrandTom/wml-tf-mnist-classifier) of the GitHub repository.*

Now that you have gotten this far, the basic workflow is as follows:

                                 Download the model .zip
            Download a .zip of your model files to this notebook. We are going to be
            using GitHub to host our model files, as it allows for simple versioning
            and control. There is nothing stopping you from using your Cloud Object
            Storage bucket to hold the model.
                                            |
                                            ↓
                                      Define your job
            Define the credentials for your Cloud services: Watson Machine Learning and
            Cloud Object Storage. During this step you will also define the parameters
            for the job you would like to execute, e.g. the deep learning framework,
            version, and the command to start your job.
                                            |
                                            ↓
                                          Train
                                            |
                                            ↓
                                          Deploy
                                            |
                                            ↓
                                        Inference



## Download the model .zip

#### Install wget

The first step of the process is to install the `wget` Python package in the virtual machine where your notebook is running. This will allow us to download from GitHub the TensorFlow code which we'll be using to perform the digit classification.

To install wget, run the code cell below:

In [1]:
!pip install --upgrade wget

Collecting wget
  Downloading https://files.pythonhosted.org/packages/47/6a/62e288da7bcda82b935ff0c6cfe542970f04e29c756b0e147251b2fb251f/wget-3.2.zip
Building wheels for collected packages: wget
  Running setup.py bdist_wheel for wget ... done
  Stored in directory: /home/dsxuser/.cache/pip/wheels/40/15/30/7d8f7cea2902b4db79e3fea550d7d7b85ecb27ef992b618f3f
Successfully built wget
Installing collected packages: wget
Successfully installed wget-3.2


#### Download Model Code

If you wish to use your own custom model, simply replace the URL with a link to your own .zip file. You could also modify the code below to link to your Cloud Object Storage bucket; however, that is outside the scope of this tutorial.

If you are using a custom model, please pay attention to the note within the code block.

In [None]:
import os
import wget
filename = 'tf-model.zip'
url = 'MY GITHUB LINK HERE/tf-model.zip?raw=true'

# NOTE: The download is skipped when tf-model.zip already exists in the working directory.
# If you have changed the model or switched to your own custom model, delete the old
# tf-model.zip before re-running this cell so that the new model is the one downloaded.
if not os.path.isfile(filename):
    wget.download(url, out=filename)
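
Because a stale archive is easy to miss, it can help to look inside the .zip before submitting it. The following is a minimal sketch using only the Python standard library. The helper name `list_archive_members` is hypothetical and not part of the tutorial's code.

```python
import os
import zipfile


def list_archive_members(zip_path):
    """Return the names of the files stored in a .zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() returns the name of the first corrupt member, or None if all are intact.
        corrupt_member = archive.testzip()
        if corrupt_member is not None:
            raise ValueError("Corrupt file in archive: " + corrupt_member)
        return archive.namelist()


# Only inspect the archive if it has actually been downloaded.
if os.path.isfile("tf-model.zip"):
    print(list_archive_members("tf-model.zip"))
```

If the listing does not include the script named in your execution command, the archive is probably not the one you intended to use.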

## Define Your Job

In this tutorial, you don't train the model by running the sample model-building code in your notebook directly. Instead, you submit a job to run the model-building code in a training run on Watson Machine Learning GPU-accelerated infrastructure.

Submitting a training run requires two things:

- A .zip file containing model-building code
- Metadata

From the previous steps we have taken, we already have both of these prerequisites.

**A little note on terminology**: The model-building code and the metadata together are referred to as the *training definition*. You can think of this as being short for "training run definition".

#### Add Watson Machine Learning Credentials

Next, we shall define the `client` object. This creates a persistent connection to the Watson Machine Learning service we have provisioned, allowing us to define our model.

Replace the `***` with the Watson Machine Learning credentials which we grabbed earlier.

In [3]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

wml_credentials = { "url"         : "https://ibm-watson-ml.mybluemix.net",
                    "username"    : "***",
                    "password"    : "***",
                    "instance_id" : "***"
                   }

client = WatsonMachineLearningAPIClient(wml_credentials)
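
If you would rather not paste secrets directly into a notebook cell, one option is to read them from environment variables. This is only a sketch: the variable names `WML_USERNAME`, `WML_PASSWORD`, `WML_INSTANCE_ID` and `WML_URL`, and the helper `load_wml_credentials`, are assumptions made for illustration and are not defined by Watson Machine Learning.

```python
import os


def load_wml_credentials(environ=None):
    """Build the wml_credentials dictionary from environment variables."""
    if environ is None:
        environ = os.environ
    # Hypothetical variable names; use whatever naming suits your environment.
    required = {
        "username": "WML_USERNAME",
        "password": "WML_PASSWORD",
        "instance_id": "WML_INSTANCE_ID",
    }
    missing = sorted(var for var in required.values() if var not in environ)
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    credentials = {key: environ[var] for key, var in required.items()}
    # Fall back to the URL used in this tutorial when none is provided.
    credentials["url"] = environ.get("WML_URL", "https://ibm-watson-ml.mybluemix.net")
    return credentials
```

The returned dictionary has the same keys as `wml_credentials` above, so it could be passed to `WatsonMachineLearningAPIClient` in its place.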



All model information is stored through the `client.repository` object. Below, we define the metadata for the training definition and store it in the repository.

Feel free to replace the `***` with your e-mail address; the cell will run regardless.

In [None]:
metadata = {
client.repository.DefinitionMetaNames.NAME              : "python-client-tutorial_training-definition",
client.repository.DefinitionMetaNames.AUTHOR_EMAIL      : "***",
client.repository.DefinitionMetaNames.FRAMEWORK_NAME    : "tensorflow",
client.repository.DefinitionMetaNames.FRAMEWORK_VERSION : "1.5",
client.repository.DefinitionMetaNames.RUNTIME_NAME      : "python",
client.repository.DefinitionMetaNames.RUNTIME_VERSION   : "3.5",
client.repository.DefinitionMetaNames.EXECUTION_COMMAND : "python3 convolutional_network.py --trainImagesFile ${DATA_DIR}/train-images-idx3-ubyte.gz --trainLabelsFile ${DATA_DIR}/train-labels-idx1-ubyte.gz --testImagesFile ${DATA_DIR}/t10k-images-idx3-ubyte.gz --testLabelsFile ${DATA_DIR}/t10k-labels-idx1-ubyte.gz --learningRate 0.001 --trainingIters 20000"
}
definition_details = client.repository.store_definition( "tf-model.zip", meta_props=metadata )
definition_uid     = client.repository.get_definition_uid( definition_details )
print( "definition_uid: ", definition_uid )
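
The execution command above is one long string, which makes it easy to mistype a flag. As an illustrative sketch, the same string can be assembled from its parts. The helper `build_execution_command` is hypothetical; the script name, flags, and file names are taken from the cell above.

```python
def build_execution_command(script, data_files, hyperparameters):
    """Assemble the command string from (flag, value) pairs, preserving their order."""
    parts = ["python3", script]
    for flag, file_name in data_files:
        # ${DATA_DIR} is left as a literal placeholder, exactly as in the original command.
        parts += ["--" + flag, "${DATA_DIR}/" + file_name]
    for flag, value in hyperparameters:
        parts += ["--" + flag, str(value)]
    return " ".join(parts)


command = build_execution_command(
    "convolutional_network.py",
    [
        ("trainImagesFile", "train-images-idx3-ubyte.gz"),
        ("trainLabelsFile", "train-labels-idx1-ubyte.gz"),
        ("testImagesFile", "t10k-images-idx3-ubyte.gz"),
        ("testLabelsFile", "t10k-labels-idx1-ubyte.gz"),
    ],
    [("learningRate", 0.001), ("trainingIters", 20000)],
)
print(command)
```

Lists of pairs are used instead of dictionaries so that the flag order is stable regardless of Python version. The resulting string could be used as the `EXECUTION_COMMAND` value.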

You have now connected to the Watson Machine Learning service and *defined* your model, meaning you have laid out the scaffolding for how it should handle data and execute.

You have yet to specify where the data you uploaded to the Cloud Object Storage bucket is stored. That is the next step.

#### Add Cloud Object Storage Credentials

The training configuration metadata passed to `client.training` contains the references to our training data. It tells the Watson Machine Learning service where to find the data we uploaded to Cloud Object Storage and where to write the results.

Replace the `***` with the credentials we grabbed from the Cloud Object Storage bucket.

**Note:** The `client.training` object does not specify any constraints on how your data should be laid out and set up. This is done in the model code which you define.

In [None]:
metadata = {
    client.training.ConfigurationMetaNames.NAME         : "python-client-tutorial_training-run",
    client.training.ConfigurationMetaNames.AUTHOR_EMAIL : "***",
    client.training.ConfigurationMetaNames.TRAINING_DATA_REFERENCE : {
        "connection" : {
            "endpoint_url"      : "***",
            "access_key_id"     : "***",
            "secret_access_key" : "***"
        },
        "source" : {
            "bucket" : "***"
        },
        "type" : "s3"
    },
    client.training.ConfigurationMetaNames.TRAINING_RESULTS_REFERENCE : {
        "connection" : {
            "endpoint_url"      : "***",
            "access_key_id"     : "***",
            "secret_access_key" : "***"
        },
        "target" : {
            "bucket" : "***"
        },
        "type" : "s3"
    }
}
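
The two references above differ only in the bucket name and in whether the bucket sits under `"source"` or `"target"`. If both buckets share the same connection details, a small helper can avoid typing them twice. This is a sketch only; `make_cos_reference` is a hypothetical name, and the dictionary layout simply mirrors the cell above.

```python
def make_cos_reference(connection, bucket, location_key):
    """Build a data reference dictionary in the layout used by the metadata above."""
    if location_key not in ("source", "target"):
        raise ValueError("location_key must be 'source' or 'target'")
    return {
        "connection": dict(connection),  # copy so each reference owns its own dictionary
        location_key: {"bucket": bucket},
        "type": "s3",
    }


# Placeholder values; replace the *** exactly as in the cell above.
cos_connection = {
    "endpoint_url": "***",
    "access_key_id": "***",
    "secret_access_key": "***",
}
training_data_reference = make_cos_reference(cos_connection, "***", "source")
training_results_reference = make_cos_reference(cos_connection, "***", "target")
```

The two resulting dictionaries could then be used as the `TRAINING_DATA_REFERENCE` and `TRAINING_RESULTS_REFERENCE` values.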

## Train the Model

Starting the training run is now very simple, as you have defined all of the information needed to kick it off!

Simply run the block below, which returns the unique identifier of your training run.

In [None]:
run_details = client.training.run(definition_uid, meta_props=metadata)
run_uid     = client.training.get_run_uid(run_details)
print("run_uid: ", run_uid)

#### Monitor Progress

To check the status of the training job, run (and re-run as needed) the following code block. It reports the status of the job and any error codes produced (more detailed logs can be found in the `.txt` files associated with each run).

In [None]:
client.training.get_status(run_uid)
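
Rather than re-running the cell by hand, you could poll until the job finishes. The sketch below is deliberately generic: it takes any function that returns the current state as a string. The assumption that the status dictionary has a `"state"` key, and the state names in the commented usage line, are not confirmed by this tutorial, so check the actual output of `get_status` before relying on them.

```python
import time


def wait_for_terminal_state(get_state, terminal_states, poll_seconds=30,
                            max_polls=120, sleep=time.sleep):
    """Call get_state() repeatedly until it returns one of terminal_states."""
    for _ in range(max_polls):
        state = get_state()
        if state in terminal_states:
            return state
        sleep(poll_seconds)
    raise TimeoutError("Job did not reach a terminal state in time")


# Hypothetical usage; the "state" key and the state names are assumptions:
# final_state = wait_for_terminal_state(
#     lambda: client.training.get_status(run_uid)["state"],
#     {"completed", "error", "canceled"},
# )
```

The `max_polls` limit keeps the cell from blocking forever if the job never leaves a non-terminal state.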

#### Results

The results, alongside all other output from the training run, are written to the results bucket of your Cloud Object Storage instance.

## Deploy the Model

You can use your trained model to classify new images only after the model has been deployed. 

Before deploying the model, you must store your trained model in the Watson Machine Learning repository (you already worked with the repository when defining your model, so there shouldn't be any new syntax).

The code block below stores the trained model in your WML repository.

In [None]:
stored_model_name    = "python-client-tutorial_model"
stored_model_details = client.repository.store_model(run_uid, stored_model_name)
model_uid            = client.repository.get_model_uid(stored_model_details)
print("model_uid: ", model_uid)

Deploy the freshly stored model by running the following block. It will provide an endpoint which you can call to generate inferences on new data!

In [None]:
deployment_name  = "MNIST-Classifier"
deployment_desc  = "Live deployment of the MNIST classifier"
deployment       = client.deployments.create(model_uid, deployment_name, deployment_desc)
scoring_endpoint = client.deployments.get_scoring_url(deployment)
print("scoring_endpoint: ", scoring_endpoint)

## Inference

You can now submit new data to be evaluated by the trained model (a process known as inference). We will be calling our scoring endpoint and passing it a payload of never-before-seen handwritten digits.

First, download the sample file containing the digits "5" and "4" to this notebook.

In [None]:
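filename = 'tf-mnist-test-payload.json'
# Sample payload hosted in the pmservice/wml-sample-models repository on GitHub.
url = 'https://raw.githubusercontent.com/pmservice/wml-sample-models/master/tensorflow/hand-written-digit-recognition/test-data/tf-mnist-test-payload.json'
# The download is skipped if the file already exists in the working directory.
if not os.path.isfile(filename):
    wget.download(url, out=filename)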

You'll now load the contents of the file into a data structure recognised by the model scoring endpoint. In this instance that is JSON (a common data format used in communication over HTTP).

In [None]:
import json

# Read the sample file and pull out the part that is sent to the scoring endpoint.
with open('tf-mnist-test-payload.json') as data_file:
    test_data = json.load(data_file)
payload = test_data['payload']

Finally, call the scoring endpoint!

You should expect to get back: `{'values': [5, 4]}`

In [None]:
client.deployments.score(scoring_endpoint, payload)
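
To check the response programmatically instead of by eye, you can compare its `values` list with the digits you expect. This is a small sketch that assumes the response has the shape shown above, `{'values': [5, 4]}`. The helper `predictions_match` is hypothetical.

```python
def predictions_match(response, expected_digits):
    """Return True if the response's 'values' list equals the expected digits."""
    return list(response.get("values", [])) == list(expected_digits)


# Example with the response this tutorial says to expect.
print(predictions_match({"values": [5, 4]}, [5, 4]))
```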

### That's all folks! Hopefully you enjoyed the tutorial, learnt something and are feeling motivated to create some amazing models using Watson Machine Learning!