<table style="border: none" align="left">
   <tr style="border: none">
      <th style="border: none"><font face="verdana" size="5" color="black"><b>From local scikit-learn model to cloud with Watson Machine Learning client</b></font></th>
      <th style="border: none"><img src="https://github.com/pmservice/customer-satisfaction-prediction/blob/master/app/static/images/ml_icon_gray.png?raw=true" alt="Watson Machine Learning icon" height="40" width="40"></th>
   </tr>
   <tr style="border: none">
       <th style="border: none"><img src="https://github.com/pmservice/wml-sample-models/raw/master/scikit-learn/hand-written-digits-recognition/images/numbers_banner-04.png" width="600" alt="Icon"> </th>
   </tr>
</table>

This notebook contains steps and code for working with the [watson-machine-learning-client](https://pypi.python.org/pypi/watson-machine-learning-client) library, available from the PyPI repository. It introduces commands for getting data, basic data exploration, pipeline creation, model training and evaluation, model persistence to the Watson Machine Learning repository, model deployment, and scoring.

Some familiarity with Python is helpful. This notebook uses Python 3.5, scikit-learn, and the watson-machine-learning-client package.

You will use a toy dataset available in scikit-learn, **sklearn.datasets.load_digits**, which contains images of hand-written digits, and train a model to recognize those digits.

## Learning goals

The learning goals of this notebook are:

-  Load a sample dataset from scikit-learn.
-  Explore data.
-  Prepare data for training and evaluation.
-  Create a scikit-learn machine learning pipeline.
-  Train and evaluate a model.
-  Persist a model in the Watson Machine Learning repository.
-  Deploy a model for online scoring using the client library.
-  Score sample records using client library.


## Contents

This notebook contains the following parts:

1.	[Setup](#setup)
2.	[Load and explore data](#load)
3.	[Create scikit-learn model](#model)
4.	[Persist model](#persistence)
5.	[Deploy and score in the cloud](#scoring)
6.	[Summary and next steps](#summary)

<a id="setup"></a>
## 1. Setup

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a [Watson Machine Learning Service](https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/) instance (a free plan is offered).
-  Configure a local Python environment:
   + Python 3.5
   + scikit-learn 0.17.1
   + xgboost 0.6 (a dependency that is not used in this notebook)
   + watson-machine-learning-client

**Tip**: You can install libraries from [PyPI](https://pypi.python.org/pypi) by running the cell below.

In [1]:
!pip install watson-machine-learning-client --upgrade

Requirement already up-to-date: watson-machine-learning-client in /gpfs/global_fs01/sym_shared/YPProdSpark/user/sec7-2ac43fd194bc9a-8ef090487f81/.local/lib/python3.5/site-packages
Requirement already up-to-date: lomond in /gpfs/global_fs01/sym_shared/YPProdSpark/user/sec7-2ac43fd194bc9a-8ef090487f81/.local/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement already up-to-date: tabulate in /gpfs/global_fs01/sym_shared/YPProdSpark/user/sec7-2ac43fd194bc9a-8ef090487f81/.local/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement already up-to-date: tqdm in /gpfs/global_fs01/sym_shared/YPProdSpark/user/sec7-2ac43fd194bc9a-8ef090487f81/.local/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement already up-to-date: requests in /gpfs/global_fs01/sym_shared/YPProdSpark/user/sec7-2ac43fd194bc9a-8ef090487f81/.local/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement already up-to-date: urllib3 in

<a id="load"></a>
## 2. Load and explore data

In this section you will load the data from scikit-learn sample datasets and perform a basic exploration.

In [2]:
import sklearn
from sklearn import datasets

digits = datasets.load_digits()

The loaded toy dataset consists of 8x8 pixel images of hand-written digits.

Display the data and the label of the first digit using **data** and **target**.

In [3]:
print(digits.data[0].reshape((8, 8)))

[[ 0.  0.  5. 13.  9.  1.  0.  0.]
 [ 0.  0. 13. 15. 10. 15.  5.  0.]
 [ 0.  3. 15.  2.  0. 11.  8.  0.]
 [ 0.  4. 12.  0.  0.  8.  8.  0.]
 [ 0.  5.  8.  0.  0.  9.  8.  0.]
 [ 0.  4. 11.  0.  1. 12.  7.  0.]
 [ 0.  2. 14.  5. 10. 12.  0.  0.]
 [ 0.  0.  6. 13. 10.  0.  0.  0.]]


In [4]:
digits.target[0]

0
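
The raw pixel matrix is easier to read as a picture. The following is a minimal sketch that renders it as text, assuming pixel intensities in the range 0 to 16; the matrix is copied from the output above, and the `render` helper and the `SHADES` character ramp are illustrative names, not part of scikit-learn.

```python
# Pixel intensities of the first digit, copied from the output above.
first_digit = [
    [0, 0, 5, 13, 9, 1, 0, 0],
    [0, 0, 13, 15, 10, 15, 5, 0],
    [0, 3, 15, 2, 0, 11, 8, 0],
    [0, 4, 12, 0, 0, 8, 8, 0],
    [0, 5, 8, 0, 0, 9, 8, 0],
    [0, 4, 11, 0, 1, 12, 7, 0],
    [0, 2, 14, 5, 10, 12, 0, 0],
    [0, 0, 6, 13, 10, 0, 0, 0],
]

# Hypothetical character ramp: blank for empty pixels, '#' for the darkest ones.
SHADES = " .:*#"


def render(image, max_value=16):
    """Map every pixel to one character of SHADES, one string per image row."""
    lines = []
    for row in image:
        chars = [SHADES[int(v) * len(SHADES) // (max_value + 1)] for v in row]
        lines.append("".join(chars))
    return lines


rendered = render(first_digit)
print("\n".join(rendered))
```

The outline of a zero should be visible in the printed text, which matches the label returned by `digits.target[0]`.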

In the next step you will count the data examples.

In [5]:
samples_count = len(digits.images)

print("Number of samples: " + str(samples_count))

Number of samples: 1797


<a id="model"></a>
## 3. Create a Scikit-learn model

In this section you will learn how to prepare data, create a scikit-learn machine learning pipeline, and train a model.

### 3.1: Prepare data

In this subsection you will split your data into train, test, and score datasets.

In [6]:
train_data = digits.data[: int(0.7*samples_count)]
train_labels = digits.target[: int(0.7*samples_count)]

test_data = digits.data[int(0.7*samples_count): int(0.9*samples_count)]
test_labels = digits.target[int(0.7*samples_count): int(0.9*samples_count)]

score_data = digits.data[int(0.9*samples_count): ]

print("Number of training records: " + str(len(train_data)))
print("Number of testing records : " + str(len(test_data)))
print("Number of scoring records : " + str(len(score_data)))

Number of training records: 1257
Number of testing records : 360
Number of scoring records : 180


As you can see, the data has been split into three datasets:

-  The train dataset, which is the largest group, is used for training.
-  The test dataset is used for model evaluation, that is, to test the assumptions of the model.
-  The score dataset is used for scoring in the cloud.
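
The record counts above follow directly from the slice boundaries. As a small sketch of that arithmetic (the `split_boundaries` helper is a hypothetical name introduced only for illustration):

```python
def split_boundaries(samples_count, train_frac=0.7, test_frac=0.9):
    """Return (train_end, test_end) indices for a sequential three-way split."""
    train_end = int(train_frac * samples_count)
    test_end = int(test_frac * samples_count)
    return train_end, test_end


samples_count = 1797  # number of samples reported earlier
train_end, test_end = split_boundaries(samples_count)

sizes = {
    "train": train_end,                 # records [0, train_end)
    "test": test_end - train_end,       # records [train_end, test_end)
    "score": samples_count - test_end,  # records [test_end, samples_count)
}
print(sizes)
```

Note that this split is sequential: the records are taken in their stored order and are not shuffled before slicing.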

### 3.2: Create pipeline and train a model

In this subsection you will create a scikit-learn machine learning pipeline and then train the model.

First, import the scikit-learn machine learning packages that are needed in the subsequent steps.

In [7]:
from sklearn.pipeline import Pipeline
from sklearn import preprocessing
from sklearn import svm, metrics

Standardize features by removing the mean and scaling to unit variance.

In [8]:
scaler = preprocessing.StandardScaler()
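
To make "removing the mean and scaling to unit variance" concrete, here is a plain-Python sketch of the per-feature arithmetic, z = (x - mean) / std. It assumes the population standard deviation and a scale of 1 for constant features, which to my understanding matches how `StandardScaler` behaves; the `standardize` helper and the sample values are hypothetical.

```python
import statistics


def standardize(column):
    """Standardize one feature: subtract the mean, divide by the population std."""
    mean = statistics.mean(column)
    std = statistics.pstdev(column)
    # Assumption: a constant feature keeps a scale of 1 to avoid division by zero.
    scale = std if std > 0 else 1.0
    return [(x - mean) / scale for x in column]


# Hypothetical values of a single pixel feature across eight images.
pixel_feature = [5.0, 13.0, 15.0, 12.0, 8.0, 11.0, 14.0, 6.0]
scaled = standardize(pixel_feature)

print([round(z, 3) for z in scaled])
print("mean:", round(statistics.mean(scaled), 6))
print("std :", round(statistics.pstdev(scaled), 6))
```

After scaling, the feature has a mean of approximately 0 and a standard deviation of approximately 1, so no single pixel dominates the kernel distance computed by the SVM.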

Next, define the estimator you want to use for classification. The following example uses a Support Vector Machine (SVM) with a radial basis function (RBF) kernel.

In [9]:
clf = svm.SVC(kernel='rbf')

Let's build the pipeline now. A pipeline consists of a transformer and an estimator.

In [10]:
pipeline = Pipeline([('scaler', scaler), ('svc', clf)])

Now, you can train your SVM model by using the previously defined **pipeline** and **train data**.

In [11]:
model = pipeline.fit(train_data, train_labels)

You can check your **model quality** now. To evaluate the model, use **test data**.

In [12]:
predicted = model.predict(test_data)

print("Evaluation report: \n\n%s" % metrics.classification_report(test_labels, predicted))

Evaluation report: 

             precision    recall  f1-score   support

          0       1.00      0.97      0.99        37
          1       0.97      0.97      0.97        34
          2       1.00      0.97      0.99        36
          3       1.00      0.94      0.97        35
          4       0.78      0.97      0.87        37
          5       0.97      0.97      0.97        38
          6       0.97      0.86      0.91        36
          7       0.92      0.97      0.94        35
          8       0.91      0.89      0.90        35
          9       0.97      0.92      0.94        37

avg / total       0.95      0.94      0.95       360
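
The precision, recall, and f1-score columns of the report can be reproduced by hand for a single class. The following sketch uses the standard definitions on a small, made-up set of labels (not the notebook's test data); `class_scores` is a hypothetical helper name.

```python
def class_scores(true_labels, predicted_labels, label):
    """Compute precision, recall and F1 for one class from paired labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for illustration only.
true_labels = [4, 4, 4, 6, 6, 9]
predicted_labels = [4, 4, 6, 4, 6, 9]

precision, recall, f1 = class_scores(true_labels, predicted_labels, label=4)
print("precision=%.2f recall=%.2f f1=%.2f" % (precision, recall, f1))
```

A low precision with a high recall, as for digit 4 in the report, means that most true 4s are found but other digits are also misclassified as 4.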



You can now tune your model to achieve better accuracy. For simplicity, this example omits the tuning step.
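
Although tuning is outside the scope of this notebook, a minimal sketch of one common approach, a cross-validated grid search over the SVM hyperparameters, could look as follows. This is an illustration under assumptions, not the notebook's method: the parameter values are arbitrary, and the import from `sklearn.model_selection` requires scikit-learn 0.18 or later (in the 0.17.1 release listed in Setup, `GridSearchCV` lives in `sklearn.grid_search`).

```python
from sklearn import datasets, preprocessing, svm
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Rebuild the training split so that this sketch is self-contained.
digits = datasets.load_digits()
train_end = int(0.7 * len(digits.images))
train_data = digits.data[:train_end]
train_labels = digits.target[:train_end]

pipeline = Pipeline([
    ("scaler", preprocessing.StandardScaler()),
    ("svc", svm.SVC(kernel="rbf")),
])

# Arbitrary example grid; the "svc__" prefix addresses the pipeline step named "svc".
param_grid = {"svc__C": [1, 10], "svc__gamma": [0.001, 0.01]}

search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(train_data, train_labels)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy: %.3f" % search.best_score_)
```

The fitted `search.best_estimator_` is itself a pipeline, so it could be evaluated on the test data in the same way as `model`.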

<a id="persistence"></a>
## 4. Persist model

In this section you will learn how to store your model in the Watson Machine Learning repository by using the common Python client.

**Tip**: You can check the documentation for watson-machine-learning-client by visiting the [documentation webpage](https://wml-api-pyclient.mybluemix.net).

### 4.1: Work with your instance

First, you must import the client library.

In [13]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

Authenticate to the Watson Machine Learning service on IBM Cloud.

**Action**: Put the authentication information from your instance of the Watson Machine Learning service here.

In [14]:
wml_credentials={
  "url": "https://ibm-watson-ml.mybluemix.net",
  "access_key": "****",
  "username": "****",
  "password": "****",
  "instance_id": "****"
}

**Tip**: Authentication information can be found on the **Service Credentials** tab of the service instance created on IBM Cloud. <BR>If you cannot see the **instance_id** field in **Service Credentials**, generate new credentials by pressing the **New credential (+)** button.
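
To avoid pasting secrets into the notebook, the credentials dictionary can also be assembled from environment variables. The following is a sketch under assumptions: the `WML_*` variable names and the `load_wml_credentials` helper are hypothetical and are not defined by the client library; only the dictionary keys come from the cell above.

```python
import os

# Keys expected in the credentials dictionary, as shown in the cell above.
REQUIRED_KEYS = ("url", "access_key", "username", "password", "instance_id")


def load_wml_credentials(environ=os.environ, prefix="WML_"):
    """Build the credentials dict from variables such as WML_URL and WML_USERNAME."""
    credentials = {}
    missing = []
    for key in REQUIRED_KEYS:
        name = prefix + key.upper()
        value = environ.get(name)
        if value:
            credentials[key] = value
        else:
            missing.append(name)
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return credentials


# Demonstration with a stand-in environment; real values come from your service instance.
fake_environ = {
    "WML_URL": "https://ibm-watson-ml.mybluemix.net",
    "WML_ACCESS_KEY": "****",
    "WML_USERNAME": "****",
    "WML_PASSWORD": "****",
    "WML_INSTANCE_ID": "****",
}
example_credentials = load_wml_credentials(fake_environ)
print(sorted(example_credentials))
```

In a real session you would call `load_wml_credentials()` without arguments and pass the result to the client in place of a hard-coded dictionary.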

#### Create the API client by running the code below.

In [15]:
client = WatsonMachineLearningAPIClient(wml_credentials)

#### Get instance details.

In [16]:
import json

instance_details = client.service_instance.get_details()
print(json.dumps(instance_details, indent=2))

{
  "metadata": {
    "created_at": "2018-01-30T10:39:51.938Z",
    "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e",
    "modified_at": "2018-01-31T11:50:25.659Z",
    "guid": "fd1db635-c9f3-4d83-afc7-32efec367e5e"
  },
  "entity": {
    "region": "us-south",
    "account": {
      "id": "8dba8395f5b8dc6fcab49d43161ab9f1",
      "type": "TRIAL",
      "name": "Wojciech Sobala's Account"
    },
    "deployments": {
      "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/deployments"
    },
    "source": "Bluemix",
    "status": "Active",
    "plan_id": "0f2a3c2c-456b-40f3-9b19-726d2740b11c",
    "published_models": {
      "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models"
    },
    "organization_guid": "a7b3ef43-5dc2-41fc-8fa7-29b0846e5807",
    "plan": "standard",
    "space_guid": "a6b628f9-3f64-45c0-bf38-739876b63fec",
 
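
The details are returned as nested dictionaries, so individual fields can be read without printing the whole structure. A minimal sketch, assuming the `metadata`/`entity` layout shown in the output above; the sample dictionary contains only a few of the printed values, and `get_nested` is a hypothetical helper.

```python
def get_nested(details, *keys, default=None):
    """Walk nested dictionaries and return default if any key is missing."""
    current = details
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# A reduced copy of the structure printed above.
sample_details = {
    "metadata": {"guid": "fd1db635-c9f3-4d83-afc7-32efec367e5e"},
    "entity": {"region": "us-south", "status": "Active", "plan": "standard"},
}

print("Status:", get_nested(sample_details, "entity", "status"))
print("Plan  :", get_nested(sample_details, "entity", "plan"))
print("Owner :", get_nested(sample_details, "entity", "owner", default="n/a"))
```

With a live client, the same helper could be applied to the `instance_details` dictionary returned by `client.service_instance.get_details()`.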

### 4.2: Publish model

#### Publish model in Watson Machine Learning repository on Cloud.

Define the model name, author name, and author email.

In [19]:
model_props = {client.repository.ModelMetaNames.AUTHOR_NAME: "IBM", 
               client.repository.ModelMetaNames.AUTHOR_EMAIL: "ibm@ibm.com",
               client.repository.ModelMetaNames.NAME: "LOCALLY created Digits prediction model"}

In [20]:
published_model = client.repository.store_model(model=model, meta_props=model_props,
                                                training_data=train_data, training_target=train_labels)

### 4.3: Get model details

In [21]:
published_model_uid = client.repository.get_model_uid(published_model)
model_details = client.repository.get_details(published_model_uid)

print(json.dumps(model_details, indent=2))

{
  "metadata": {
    "created_at": "2018-01-31T12:11:03.446Z",
    "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/844087b1-ba8d-484d-acfe-7dfc132df858",
    "modified_at": "2018-01-31T12:11:03.515Z",
    "guid": "844087b1-ba8d-484d-acfe-7dfc132df858"
  },
  "entity": {
    "latest_version": {
      "created_at": "2018-01-31T12:11:03.515Z",
      "url": "https://ibm-watson-ml.mybluemix.net/v3/ml_assets/models/844087b1-ba8d-484d-acfe-7dfc132df858/versions/fed3e784-8b83-4cca-9688-154fdb202241",
      "guid": "fed3e784-8b83-4cca-9688-154fdb202241"
    },
    "runtime_environment": "python-3.5",
    "name": "LOCALLY created Digits prediction model",
    "learning_configuration_url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/844087b1-ba8d-484d-acfe-7dfc132df858/learning_configuration",
    "model_type": "scikit-learn-0.17",
    "input_data_schema": {
      "f

#### Get all models

In [23]:
models_details = client.repository.list_models()

------------------------------------  ---------------------------------------  ------------------------  -----------------  -----
GUID                                  NAME                                     CREATED                   FRAMEWORK          TYPE
51cdbdea-28ea-488d-9edd-dab2a8c44208  Sentiment Prediction                     2018-01-30T14:15:06.339Z  mllib-2.0          model
b7e9fc69-f0db-444f-89b4-fe7aa5e8a4f1  Handwritten Digits Recognition           2018-01-30T14:15:24.106Z  scikit-learn-0.17  model
fae8e498-136e-4c35-81bd-a9a69a50b1c7  CHAID PMML model for Iris data           2018-01-30T14:17:19.921Z  pmml-4.2           model
22e4cd95-4ede-4fa7-b361-ef9841a0e31f  My cool mnist model                      2018-01-31T10:21:42.501Z  tensorflow-1.2     model
e4643014-a6c7-45a0-94cd-f52e00f62bf9  My cool mnist model                      2018-01-31T11:48:14.134Z  tensorflow-1.2     model
844087b1-ba8d-484d-acfe-7dfc132df858  LOCALLY created Digits prediction model  2018-01-31T1

### 4.4: Load model

In this subsection you will learn how to load a saved model back from the specified instance of Watson Machine Learning.

In [24]:
loaded_model = client.repository.load(published_model_uid)

You can make predictions to check that the model has been loaded correctly.

In [25]:
test_predictions = loaded_model.predict(test_data[:10])

In [26]:
print(test_predictions)

[4 0 5 3 6 9 6 4 7 5]


As you can see, you are able to make predictions, so the model was loaded back correctly. You have now learned how to save a model to, and load it from, the Watson Machine Learning repository.

### 4.5: Delete model

You can delete a published model from the Watson Machine Learning repository using the code below. The code is commented out at this stage because the model is needed later for deployment.

In [27]:
# client.repository.delete(published_model_uid)

<a id="scoring"></a>
## 5. Deploy and score in the cloud

In this section you will learn how to create an online scoring deployment and score a new data record by using the Watson Machine Learning client.

### 5.1: Create model deployment

#### Create online deployment for published model

In [28]:
created_deployment = client.deployments.create(published_model_uid, "Deployment of locally created scikit model")

**Note**: The scoring URL is read from the **created_deployment** object returned by the call above. The next subsection shows how to retrieve deployment details from the Watson Machine Learning instance.

Now you can print the online scoring endpoint.

In [29]:
scoring_endpoint = client.deployments.get_scoring_url(created_deployment)

print(scoring_endpoint)

https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/844087b1-ba8d-484d-acfe-7dfc132df858/deployments/134f404d-91be-47ad-9524-f8508c1604fe/online


### 5.2: Get deployments

In [30]:
deployments = client.deployments.get_details()

print(json.dumps(deployments, indent=2))

{
  "resources": [
    {
      "metadata": {
        "created_at": "2018-01-30T14:15:59.342Z",
        "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/b7e9fc69-f0db-444f-89b4-fe7aa5e8a4f1/deployments/06faa2cd-e12b-4595-b96d-555d9fae6363",
        "modified_at": "2018-01-30T14:16:00.444Z",
        "guid": "06faa2cd-e12b-4595-b96d-555d9fae6363"
      },
      "entity": {
        "scoring_url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/b7e9fc69-f0db-444f-89b4-fe7aa5e8a4f1/deployments/06faa2cd-e12b-4595-b96d-555d9fae6363/online",
        "runtime_environment": "python-3.5",
        "published_model": {
          "created_at": "2018-01-30T14:15:59.307Z",
          "url": "https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/b7e9fc69-f0db-444f-89b4-fe7aa5e8a4f1",
          "guid": "b7e9fc69-f0db-444f-89b4-fe

You can get the deployment URL from the details of the most recently created deployment.

In [32]:
deployment_url = client.deployments.get_url(created_deployment)

print(deployment_url)

https://ibm-watson-ml.mybluemix.net/v3/wml_instances/fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/844087b1-ba8d-484d-acfe-7dfc132df858/deployments/134f404d-91be-47ad-9524-f8508c1604fe
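
The same information can also be looked up in the full deployments listing. Below is a sketch that searches the `resources` list for a deployment GUID and returns its scoring URL, assuming the `metadata`/`entity` layout printed in section 5.2; the sample dictionary reproduces one entry from that output, and `find_scoring_url` is a hypothetical helper.

```python
def find_scoring_url(deployments, deployment_guid):
    """Return the scoring URL of the deployment with the given GUID, or None."""
    for resource in deployments.get("resources", []):
        if resource.get("metadata", {}).get("guid") == deployment_guid:
            return resource.get("entity", {}).get("scoring_url")
    return None


base = ("https://ibm-watson-ml.mybluemix.net/v3/wml_instances/"
        "fd1db635-c9f3-4d83-afc7-32efec367e5e/published_models/"
        "b7e9fc69-f0db-444f-89b4-fe7aa5e8a4f1/deployments/"
        "06faa2cd-e12b-4595-b96d-555d9fae6363")

# One entry copied from the deployments output in section 5.2.
sample_deployments = {
    "resources": [
        {
            "metadata": {"guid": "06faa2cd-e12b-4595-b96d-555d9fae6363", "url": base},
            "entity": {"scoring_url": base + "/online"},
        }
    ]
}

found_url = find_scoring_url(sample_deployments, "06faa2cd-e12b-4595-b96d-555d9fae6363")
print(found_url)
```

In the outputs shown in this notebook, the scoring URL is the deployment URL followed by `/online`.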


### 5.3: Score

You can use the method below to send a test scoring request to the deployed model.

**Action**: Prepare scoring payload with records to score.

In [33]:
scoring_payload = {"values": [list(score_data[0]), list(score_data[1])]}
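
When scoring many records, it can help to validate the rows before sending them. The following sketch builds the same `{"values": [...]}` structure from any iterable of rows and checks that each record has 64 features (one per pixel of an 8x8 image); `build_scoring_payload` is a hypothetical helper and the sample rows are placeholders, not records from the score dataset.

```python
import json


def build_scoring_payload(rows, expected_features=64):
    """Convert rows to plain Python floats and wrap them in a 'values' payload."""
    values = []
    for index, row in enumerate(rows):
        record = [float(v) for v in row]
        if len(record) != expected_features:
            raise ValueError(
                "Row %d has %d features, expected %d"
                % (index, len(record), expected_features)
            )
        values.append(record)
    return {"values": values}


# Placeholder rows; with the notebook's data you could pass score_data[:2] instead.
sample_rows = [[0.0] * 64, [1.0] * 64]
example_payload = build_scoring_payload(sample_rows)

# The payload contains only built-in types, so it serializes cleanly to JSON.
body = json.dumps(example_payload)
print(len(example_payload["values"]), "records,", len(body), "characters of JSON")
```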

Use the ``client.deployments.score()`` method to run scoring.

In [34]:
predictions = client.deployments.score(scoring_endpoint, scoring_payload)

In [35]:
print(json.dumps(predictions, indent=2))

{
  "fields": [
    "prediction"
  ],
  "values": [
    [
      5
    ],
    [
      2
    ]
  ]
}
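
The response pairs a list of field names with one row of values per scored record. A short sketch of pulling the predicted digits out of that structure, using the response printed above; `extract_predictions` is a hypothetical helper.

```python
def extract_predictions(response, field="prediction"):
    """Return the value of the given field for every scored record."""
    column = response["fields"].index(field)
    return [row[column] for row in response["values"]]


# The response printed above.
sample_response = {"fields": ["prediction"], "values": [[5], [2]]}

predicted_digits = extract_predictions(sample_response)
print(predicted_digits)  # [5, 2]
```

Looking up the column by name rather than by position keeps the code working if a response contains additional fields.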


### 5.4: Delete deployment

Use the following method to delete the deployment.

In [37]:
client.deployments.delete(client.deployments.get_uid(created_deployment))

You can check your deployments by running the ``list`` method.

In [38]:
client.deployments.list()

------------------------------------  -----------------------  ------  ------------------------  -----------------
GUID                                  NAME                     TYPE    CREATED                   FRAMEWORK
06faa2cd-e12b-4595-b96d-555d9fae6363  Digits recognition       online  2018-01-30T14:15:59.342Z  scikit-learn-0.17
56f3501f-844c-4cb7-9bbc-02131da8f6bf  Iris species prediction  online  2018-01-30T14:17:22.742Z  pmml-4.2
b44c9c91-796a-4caf-bb15-357065df1ac9  Mnist model deployment   online  2018-01-31T11:50:01.606Z  tensorflow-1.2
------------------------------------  -----------------------  ------  ------------------------  -----------------


### 5.5: Delete model

In [39]:
client.repository.delete(published_model_uid)

You can check your stored models by running the ``list_models`` method below.

In [41]:
client.repository.list_models()

------------------------------------  ------------------------------  ------------------------  -----------------  -----
GUID                                  NAME                            CREATED                   FRAMEWORK          TYPE
51cdbdea-28ea-488d-9edd-dab2a8c44208  Sentiment Prediction            2018-01-30T14:15:06.339Z  mllib-2.0          model
b7e9fc69-f0db-444f-89b4-fe7aa5e8a4f1  Handwritten Digits Recognition  2018-01-30T14:15:24.106Z  scikit-learn-0.17  model
fae8e498-136e-4c35-81bd-a9a69a50b1c7  CHAID PMML model for Iris data  2018-01-30T14:17:19.921Z  pmml-4.2           model
22e4cd95-4ede-4fa7-b361-ef9841a0e31f  My cool mnist model             2018-01-31T10:21:42.501Z  tensorflow-1.2     model
e4643014-a6c7-45a0-94cd-f52e00f62bf9  My cool mnist model             2018-01-31T11:48:14.134Z  tensorflow-1.2     model
------------------------------------  ------------------------------  ------------------------  -----------------  -----


<a id="summary"></a>
## 6. Summary and next steps     

You have successfully completed this notebook! You learned how to use scikit-learn machine learning as well as Watson Machine Learning for model creation and deployment. Check out our _[Online Documentation](https://console.ng.bluemix.net/docs/services/PredictiveModeling/index.html?pos=2)_ for more samples, tutorials, documentation, how-tos, and blog posts.

### Authors

**Wojciech Sobala**, Data Scientist at IBM developing enterprise-level applications that substantially increase clients' ability to turn data into actionable knowledge.

Copyright © 2017 IBM. This notebook and its source code are released under the terms of the MIT License.