# Use PMML with IBM Watson Machine Learning

### Contents

1. [Set up the environment](#setup_environment)
1. [Explore and prepare training data](#explore_prepare_data)
1. [Create train and test dataset](#train_test_set)
1. [Train the model](#train_model)
1. [Save the model](#save_model)
1. [Deploy and score](#deploy_model)

In [None]:
%matplotlib inline 

import pandas as pd
import numpy as np
import json

from sklearn.model_selection import train_test_split
from sklearn import linear_model

from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

<a id="setup_environment"></a>
## 1. Set up the environment

To authenticate to Watson Machine Learning on IBM Cloud, you need an API key (`api_key`) and the service location.

You can create the API key with the [IBM Cloud CLI](https://cloud.ibm.com/docs/cli/index.html) or directly in the IBM Cloud portal.

Using IBM Cloud CLI:

```
ibmcloud login
ibmcloud iam api-key-create API_KEY_NAME
```

NOTE: To get the service URL, see the [Endpoint URLs section of the Watson Machine Learning docs](https://cloud.ibm.com/apidocs/machine-learning).

**Action**: Enter your API key and service URL (location) in the following cell. `LOCATION` is passed as the `url` credential, so it must be the full endpoint URL.

In [None]:
API_KEY = 'API_KEY'
LOCATION = 'LOCATION'

In [None]:
WML_CREDENTIALS = {
    "apikey": API_KEY,
    "url": LOCATION
}
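
Hardcoding an API key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the same credentials dictionary could be built from environment variables instead. The variable names `WML_API_KEY` and `WML_URL` and the helper `load_wml_credentials` are hypothetical names chosen for this example; they are not part of the Watson Machine Learning client.

```python
import os


def load_wml_credentials(env=os.environ):
    """Build the credentials dict from environment variables.

    WML_API_KEY and WML_URL are hypothetical names chosen for this sketch.
    """
    missing = [name for name in ("WML_API_KEY", "WML_URL") if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {"apikey": env["WML_API_KEY"], "url": env["WML_URL"]}


# Example with an explicit mapping instead of the real environment.
# The key is a placeholder and the URL is only an example endpoint.
example_env = {
    "WML_API_KEY": "my-api-key",
    "WML_URL": "https://us-south.ml.cloud.ibm.com",
}
credentials = load_wml_credentials(example_env)
print(sorted(credentials))
```

Calling `load_wml_credentials()` without arguments would read the real environment and fail early if either variable is unset.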

**Action**: Assign your deployment space ID below.

In [None]:
SPACE_ID = 'SPACE_ID'

**Action**: Assign your Watson Studio project ID below.

In [None]:
PROJECT_ID = 'PROJECT_ID'

### 1.1 Installing IBM Watson Machine Learning library

NOTE: Documentation can be found [here](http://ibm-wml-api-pyclient.mybluemix.net/).

In [None]:
%pip install -U ibm-watson-machine-learning --quiet

In [None]:
from ibm_watson_machine_learning import APIClient

wml_client = APIClient(WML_CREDENTIALS)
print(wml_client.version)

<a id="explore_prepare_data"></a>
## 2. Explore and prepare training data

### 2.1 Importing training data

In [None]:
from sklearn import datasets

iris = datasets.load_iris()

In [None]:
df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                  columns=iris['feature_names'] + ['target'])

### 2.2 Exploring and preparing data

In [None]:
df.head()

In [None]:
df.describe()

<a id="train_test_set"></a>
## 3. Create train and test dataset

NOTE: The data is split into a training dataset (70%) and a test dataset (30%), stratified by the target class.

In [None]:
X = df.drop(['target'], axis=1)
Y = df['target']

print(X.shape)
print(Y.shape)

In [None]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, stratify=Y)
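
To make the effect of `stratify=Y` concrete, here is a minimal pure-Python sketch of a stratified split. It is not scikit-learn's implementation; the helper `stratified_split_indices` is a hypothetical illustration that shuffles each class separately and takes the same share of every class for the test set.

```python
import random
from collections import Counter, defaultdict


def stratified_split_indices(labels, test_size=0.3, seed=0):
    """Return (train_idx, test_idx), taking the same share of each class for testing."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Iris has 150 rows: 50 per class, labelled 0, 1 and 2.
labels = [0] * 50 + [1] * 50 + [2] * 50
train_idx, test_idx = stratified_split_indices(labels)

print(len(train_idx), len(test_idx))
print(Counter(labels[i] for i in test_idx))
```

With 50 rows per class and a 30% test share, each class contributes 15 rows to the test set, so no class ends up over- or under-represented by chance.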

<a id="train_model"></a>
## 4. Train the PMML Pipeline

In [None]:
from sklearn.linear_model import LogisticRegression

In [None]:
# LogisticRegression

logistic_regression = LogisticRegression(solver='lbfgs', penalty='l2', C=1.5)

In [None]:
pmml_pipeline = PMMLPipeline([
  ("classifier", logistic_regression)
])

pmml_pipeline.fit(X_train, Y_train)

### 4.1 Model evaluation

In [None]:
pmml_pipeline.score(X_test, Y_test)
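
For a classifier, `score` on a scikit-learn pipeline returns the mean accuracy on the given data. The following sketch shows what that number means by computing accuracy and simple confusion counts by hand. The helpers `accuracy` and `confusion_counts` are hypothetical, and the labels are made up for illustration; they are not results from the model above.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(y_true, y_pred))


# Illustrative labels only; these are not results from the model above.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

print(accuracy(y_true, y_pred))
print(confusion_counts(y_true, y_pred))
```

A single accuracy figure can hide which classes get confused with each other, which is why the pair counts are worth a look on a multi-class problem like iris.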

<a id="save_model"></a>
## 5. Save the model

### 5.1 Save the model to PMML file

In [None]:
pmml_filename = "logistic_regression_pipeline.xml"
sklearn2pmml(pmml_pipeline, pmml_filename)

In [None]:
!ls
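
Listing the directory only confirms that the file exists. As a hedged sketch, the PMML document could also be inspected with the standard library's XML parser. The helper `summarize_pmml` is hypothetical, and `SAMPLE_PMML` is a hand-written, heavily trimmed stand-in, not actual `sklearn2pmml` output.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_pmml(xml_text):
    """Return the PMML version, the data field names, and the top-level element names."""
    root = ET.fromstring(xml_text)
    if local_name(root.tag) != "PMML":
        raise ValueError("Root element is not <PMML>")
    fields = [
        element.get("name")
        for element in root.iter()
        if local_name(element.tag) == "DataField"
    ]
    children = [local_name(child.tag) for child in root]
    return {"version": root.get("version"), "data_fields": fields, "elements": children}


# Hand-written, heavily trimmed stand-in for a PMML document.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <Header/>
  <DataDictionary>
    <DataField name="target" optype="categorical" dataType="integer"/>
    <DataField name="petal length (cm)" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="classification"/>
</PMML>
"""

summary = summarize_pmml(SAMPLE_PMML)
print(summary)
```

Passing the text of the generated file to the same function would show which PMML version was written, which may be worth comparing against the model type declared when the model is stored.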

### 5.2 Save the model to IBM Watson Studio project

In [None]:
wml_client.set.default_project(PROJECT_ID)

In [None]:
software_spec_uid = wml_client.software_specifications.get_id_by_name("pmml-3.0_4.3")
metadata = {
    wml_client.repository.ModelMetaNames.NAME: 'Logistic Regression PMML pipeline',
    wml_client.repository.ModelMetaNames.TYPE: 'pmml_4.2.1',
    wml_client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid
}

published_model = wml_client.repository.store_model(model=pmml_filename, meta_props=metadata, training_data=df, training_target=Y)

### 5.3 Save the model to IBM Watson Studio space

In [None]:
wml_client.set.default_space(SPACE_ID)

In [None]:
wml_client.spaces.list(limit=10)

In [None]:
published_model = wml_client.repository.store_model(model=pmml_filename, meta_props=metadata, training_data=df, training_target=Y)

In [None]:
published_model_uid = wml_client.repository.get_model_id(published_model)
model_details = wml_client.repository.get_details(published_model_uid)
print(json.dumps(model_details, indent=2))

In [None]:
wml_client.repository.list_models()

In [None]:
# wml_client.repository.delete(published_model_uid)

<a id="deploy_model"></a>
## 6. Deploy and score

NOTE: Create an online deployment of the stored model in IBM Watson Machine Learning, then score it.

In [None]:
metadata = {
    wml_client.deployments.ConfigurationMetaNames.NAME: "preprod_iris_classification_deployment",
    wml_client.deployments.ConfigurationMetaNames.ONLINE: {}
}

created_deployment = wml_client.deployments.create(published_model_uid, meta_props=metadata)

In [None]:
# Get the deployment ID and show the deployment details
deployment_uid = wml_client.deployments.get_id(created_deployment)
wml_client.deployments.get_details(deployment_uid)
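
The details come back as a nested dictionary. The sketch below pulls the deployment state out of such a dictionary. It assumes the state is found under `entity` → `status` → `state`; that layout, the helper `deployment_state`, and the sample dictionary are assumptions made for illustration, so check them against the output printed by the cell above.

```python
def deployment_state(details):
    """Return the state string from a deployment details dict, or None if absent.

    Assumes the layout details["entity"]["status"]["state"]; verify this
    against the real output before relying on it.
    """
    return details.get("entity", {}).get("status", {}).get("state")


# Made-up details dictionary for illustration only.
sample_details = {
    "metadata": {"id": "example-deployment-id"},
    "entity": {"status": {"state": "ready"}},
}

print(deployment_state(sample_details))
print(deployment_state({}))
```

Returning `None` rather than raising keeps the helper usable when the response lacks a status block.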

In [None]:
wml_client.deployments.list()

In [None]:
# wml_client.deployments.delete(deployment_uid)

### 6.1 Score model

NOTE: Test the deployment by sending a sample record to its scoring endpoint.

In [None]:
deployment_id = wml_client.deployments.get_id(created_deployment)

scoring_data = {
    wml_client.deployments.ScoringMetaNames.INPUT_DATA: [
        {
            'fields': ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'],
            'values': [[5.1, 3.5, 1.4, 0.2]]
        }]
}

predictions = wml_client.deployments.score(deployment_id, scoring_data)
print(predictions)
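
The scoring payload and the response both pair a list of `fields` with a list of `values` rows. As a minimal sketch, two small helpers can build the payload from plain rows and turn a response back into one dictionary per row. The helper names, the `"input_data"` and `"predictions"` keys, and the sample response are assumptions for illustration; compare them with the payload above and the output of `print(predictions)`.

```python
def build_scoring_payload(fields, rows, input_key="input_data"):
    """Wrap rows in the fields/values layout used by the scoring call.

    input_key is assumed to match ScoringMetaNames.INPUT_DATA.
    """
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("Each row needs one value per field")
    return {input_key: [{"fields": list(fields), "values": [list(r) for r in rows]}]}


def rows_from_predictions(response):
    """Turn a fields/values response into a list of dicts, one per scored row."""
    rows = []
    for block in response.get("predictions", []):
        for values in block["values"]:
            rows.append(dict(zip(block["fields"], values)))
    return rows


fields = ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]
payload = build_scoring_payload(fields, [[5.1, 3.5, 1.4, 0.2]])
print(payload)

# Made-up response for illustration; real field names and values may differ.
sample_response = {
    "predictions": [
        {"fields": ["predicted_label", "probability_0"], "values": [[0, 0.97], [2, 0.01]]}
    ]
}
print(rows_from_predictions(sample_response))
```

Working with one dictionary per row makes it easier to join predictions back to the input records than indexing into nested lists.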