<table style="border: none" align="left">
   <tr style="border: none">
      <th style="border: none"><font face="verdana" size="5" color="black"><b>Use a continuous learning system to predict the best heart drug</b></font></th>
      <th style="border: none"><img src="https://github.com/pmservice/customer-satisfaction-prediction/blob/master/app/static/images/ml_icon_gray.png?raw=true" alt="Watson Machine Learning icon" height="40" width="40"></th>
   </tr> 
   <tr style="border: none">
       <td style="border: none"><img src="https://github.com/pmservice/wml-sample-models/raw/master/spark/drug-selection/images/learning_banner-05.png" width="600" alt="Icon"></td>
   </tr>
</table>

This notebook contains steps and code to configure a **continuous learning system** using the Watson Machine Learning (WML) client, and to start scoring new data. It introduces commands for getting data, persisting a model to the Watson Machine Learning repository, deploying the model, configuring the continuous learning system, and scoring.

Some familiarity with Python is helpful. This notebook uses Python 3 and Apache Spark 2.1.

You will use the data set published on git, **drug_feedback_data.csv**, which contains anonymous information about patient records. Use the details of this data set to predict the best drug to treat heart disease.

## Learning goals

This notebook teaches you how to:
-  Prepare a feedback data set in Db2 Warehouse on Cloud on IBM Cloud
-  Publish a sample model in the Watson Machine Learning (WML) repository

You will also learn how to use the WML API to:
-  Configure a continuous learning system for the published model 
-  Deploy a model for online scoring 
-  Track model performance changes after learning system iteration 
-  Explore and visualize model performance using the plotly package


## Contents

This notebook contains the following parts:

1.	[Set up the environment](#setup)
2.	[Create the Spark machine learning model](#model)
3.	[Store the model](#load)
4.	[Configure a continuous learning system](#configuration)
5.	[Track the model's performance](#performance)
6.	[Visualize model performance](#visualization)
7.	[Send new records to the feedback data store](#feedback)
8.	[Summary and next steps](#summary)

<a id="setup"></a>
## 1. Set up the environment

Before you use the sample code in this notebook, you must perform the following setup tasks:

-  Create a [Watson Machine Learning (WML) Service](https://console.ng.bluemix.net/catalog/services/ibm-watson-machine-learning/) instance (a free plan is offered and information about how to create the instance is [here](https://dataplatform.ibm.com/docs/content/analyze-data/wml-setup.html))
- Create a [Spark Service](https://console.ng.bluemix.net/catalog/services/spark/) instance (an entry plan is offered).
- Create a [Db2 Warehouse on Cloud Service](https://console.bluemix.net/catalog/services/db2-warehouse-on-cloud/) instance (an entry plan is offered).
- Create the **DRUG_TRAIN_DATA_UPDATED** and **DRUG_FEEDBACK_DATA** tables in **Db2 Warehouse on Cloud**. 
  + Download the [drug_train_data_updated.csv](https://raw.githubusercontent.com/pmservice/wml-sample-models/master/spark/drug-selection/data/drug_train_data_updated.csv) and [drug_feedback_data.csv](https://raw.githubusercontent.com/pmservice/wml-sample-models/master/spark/drug-selection/data/drug_feedback_data.csv) files from the git repository.
  + Click **Open the console** to get started with **Db2 Warehouse on Cloud**.
  + Select **Load Data**, and then select the **Desktop** load type.
  + **Drag and drop** the previously downloaded file, and then click **Next**.
  + Select the **Schema** to import the data into, and then click **New Table**.
  + Enter a name for the **new table**, and then click **Next** to finish the data import.
  + Use `;` as the **field separator**.
  + Click **Next** to create a table with the uploaded data.
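
Before uploading, it can help to confirm that a downloaded file really parses with `;` as the field separator. The following is a minimal sketch, not part of the original notebook. The sample rows are copied from the table preview in section 2.1, and the header line is an assumption (the real file may use different header names or none at all). Replace `sample` with the contents of your downloaded file.

```python
import csv
import io

# Sample text in the assumed file layout (values taken from the preview in section 2.1).
sample = (
    "AGE;SEX;BP;CHOLESTEROL;NA;K;DRUG\n"
    "43;M;HIGH;HIGH;0.656371;0.046979;drugA\n"
    "32;M;HIGH;NORMAL;0.529750;0.056087;drugA\n"
)


def read_semicolon_csv(text):
    """Parse semicolon-separated text into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return list(reader)


rows = read_semicolon_csv(sample)
print(len(rows), "rows; columns:", sorted(rows[0].keys()))
```

If the separator were wrong, each row would come back with a single column holding the whole line, which is easy to spot in the printed column list.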

<a id="model"></a>
## 2. Create the Spark machine learning model

In this section you will learn how to prepare data, create an Apache Spark machine learning pipeline, and train a model.

- [2.1 Load the training data from Db2 Warehouse on Cloud](#load)
- [2.2 Prepare the data](#prep)
- [2.3 Create the pipeline](#pipe)
- [2.4 Train the model](#train)

### 2.1 Load the training data from Db2 Warehouse on Cloud<a id="load"></a>

Run the following cell to load the content of the DRUG_TRAIN_DATA_UPDATED table into a Spark DataFrame.

Enter your authentication data as required. 

**Tip:** The authentication information can be found under the **Service Credentials**  tab of Db2 Warehouse on Cloud service instance created in IBM Cloud. Click **New credential** to create credentials if you do not have any.

In [1]:
db2_service_credentials = {
  "port": 50000,
  "db": "BLUDB",
  "username": "***",
  "ssljdbcurl": "jdbc:db2://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:50001/BLUDB:sslConnection=true;",
  "host": "dashdb-entry-yp-dal10-01.services.dal.bluemix.net",
  "https_url": "https://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:8443",
  "dsn": "***",
  "hostname": "dashdb-entry-yp-dal10-01.services.dal.bluemix.net",
  "jdbcurl": "***",
  "ssldsn": "***",
  "uri": "***",
  "password": "***"
}

In [47]:
# The code was removed by Watson Studio for sharing.

In [9]:
db2_credentials = {
    'jdbcurl': db2_service_credentials['jdbcurl'],
    'user': db2_service_credentials['username'],
    'password': db2_service_credentials['password']
}

In [10]:
tablename = "{schema}.{table}".format(schema=db2_credentials['user'], table='DRUG_TRAIN_DATA_UPDATED')

In [11]:
DRUG_TRAIN_DATA_UPDATED_data = spark.read.jdbc(db2_credentials['jdbcurl'], table=tablename, properties=db2_credentials)

In [12]:
DRUG_TRAIN_DATA_UPDATED_data.show(5)

+---+---+----+-----------+--------+--------+-----+
|AGE|SEX|  BP|CHOLESTEROL|      NA|       K| DRUG|
+---+---+----+-----------+--------+--------+-----+
| 43|  M|HIGH|       HIGH|0.656371|0.046979|drugA|
| 32|  M|HIGH|     NORMAL|0.529750|0.056087|drugA|
| 37|  F|HIGH|       HIGH|0.559171|0.042713|drugA|
| 24|  M|HIGH|     NORMAL|0.613261|0.064726|drugA|
| 29|  M|HIGH|       HIGH|0.625272|0.048637|drugA|
+---+---+----+-----------+--------+--------+-----+
only showing top 5 rows



The DRUG column is the target/label column.

### 2.2 Prepare the data<a id="prep"></a>

In this subsection you will split your data into two data sets: 
- Train data set
- Test data set

In [13]:
(train_data, test_data) = DRUG_TRAIN_DATA_UPDATED_data.randomSplit([0.8, 0.2], 24)

print("Number of records for training: " + str(train_data.count()))
print("Number of records for evaluation: " + str(test_data.count()))

Number of records for training: 150
Number of records for evaluation: 31


As you can see, your data has been successfully split into two data sets:
 - The train data set, which is the larger group, is used for training.
 - The test data set is used for model evaluation.
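
Note that 150 and 31 records are not an exact 80/20 split of the 181 rows. The weights passed to `randomSplit` act as per-row probabilities, so the resulting sizes are approximate. The following pure-Python sketch illustrates that idea under stated assumptions: it is not Spark's implementation, the function name `random_split` is made up for this example, and its output for seed 24 will not match Spark's.

```python
import random


def random_split(rows, weights, seed):
    """Assign each row independently to a split, with probability proportional to its weight."""
    total = sum(weights)
    # Cumulative boundaries in [0, 1], for example [0.8, 1.0] for weights [0.8, 0.2].
    bounds = []
    running = 0.0
    for weight in weights:
        running += weight / total
        bounds.append(running)

    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for row in rows:
        draw = rng.random()
        for i, bound in enumerate(bounds):
            # The last split also catches any rounding error in the final boundary.
            if draw < bound or i == len(bounds) - 1:
                splits[i].append(row)
                break
    return splits


train, test = random_split(list(range(181)), [0.8, 0.2], seed=24)
print(len(train), len(test))
```

Using a fixed seed makes the split reproducible, which is why the notebook passes `24` as the second argument.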

### 2.3 Create the pipeline<a id="pipe"></a>

In this section you will create an Apache Spark machine learning pipeline.

First, import the Apache Spark machine learning packages that will be needed in the subsequent steps.

In [14]:
from pyspark.ml.feature import OneHotEncoder, StringIndexer, IndexToString, VectorAssembler
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml import Pipeline, Model

In the following step, use the StringIndexer transformer to convert all the string fields to numeric ones.

In [15]:
stringIndexer_sex = StringIndexer(inputCol = 'SEX', outputCol = 'SEX_IX')
stringIndexer_bp = StringIndexer(inputCol = 'BP', outputCol = 'BP_IX')
stringIndexer_chol = StringIndexer(inputCol = 'CHOLESTEROL', outputCol = 'CHOL_IX')
stringIndexer_label = StringIndexer(inputCol="DRUG", outputCol="label").fit(DRUG_TRAIN_DATA_UPDATED_data)
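
To make the indexing step more concrete, the sketch below mimics what a fitted `StringIndexer` does by default: each distinct string receives a numeric index, with the most frequent value getting index 0. This is a pure-Python illustration, not Spark code. The helper name `fit_string_index`, the sample values (including `LOW`, which does not appear in the preview above), and the alphabetical tie-breaking are assumptions made for the example.

```python
from collections import Counter


def fit_string_index(values):
    """Map each distinct string to a float index, most frequent value first."""
    counts = Counter(values)
    # Sort by descending frequency; ties are broken alphabetically (an assumption).
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(i) for i, label in enumerate(ordered)}


# Made-up sample of blood pressure categories.
bp_values = ["HIGH", "HIGH", "NORMAL", "HIGH", "LOW", "NORMAL"]
mapping = fit_string_index(bp_values)
print(mapping)
```

The label indexer is fitted on the full data set in the cell above so that its `labels` list is available later for converting predictions back to drug names.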

Create a feature vector by combining all the features together.

In [16]:
vectorAssembler_features = VectorAssembler(inputCols=["AGE", "SEX_IX", "BP_IX", "CHOL_IX", "NA", "K"], outputCol="features")

Next, define the estimators you want to use for classification. Decision Tree is used in the following example.

In [17]:
dt = DecisionTreeClassifier(labelCol="label", featuresCol="features")

Finally, convert the indexed labels back to the original labels.

In [18]:
labelConverter = IndexToString(inputCol="prediction", outputCol="predictedLabel", labels=stringIndexer_label.labels)

Build the pipeline. A pipeline consists of transformers and an estimator.

In [19]:
pipeline_dt = Pipeline(stages=[stringIndexer_label, stringIndexer_sex, stringIndexer_bp, stringIndexer_chol, vectorAssembler_features, dt, labelConverter])

### 2.4 Train the model<a id="train"></a>

Now, you can train your Decision Tree model by using the previously defined pipeline and train data.

In [20]:
model = pipeline_dt.fit(train_data)

You can check your model accuracy now. Use test data to evaluate the model.

In [21]:
predictions = model.transform(test_data)
evaluatorDT = MulticlassClassificationEvaluator(labelCol="label", predictionCol="prediction", metricName="accuracy")
accuracy = evaluatorDT.evaluate(predictions)

print("Accuracy = %g" % accuracy)

Accuracy = 0.870968
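
The accuracy metric is the fraction of test records whose predicted label equals the true label. The reported value is consistent with 27 correct predictions out of the 31 evaluation records. The sketch below recomputes the metric in plain Python. The label lists are illustrative placeholders (the name `drugB` is only used to mark wrong predictions), not the notebook's actual predictions.

```python
def accuracy(labels, predictions):
    """Return the fraction of predictions that equal the true label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one record is required")
    correct = sum(1 for truth, guess in zip(labels, predictions) if truth == guess)
    return correct / len(labels)


# Illustrative data: 31 records, 27 of them predicted correctly.
labels = ["drugA"] * 31
predictions = ["drugA"] * 27 + ["drugB"] * 4
print("Accuracy = %g" % accuracy(labels, predictions))
```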


You can tune your model now to achieve better accuracy. To keep this example simple, the tuning section is omitted.

<a id="load"></a>
## 3. Store the model

In this section you will learn how to store a sample model in the Watson Machine Learning repository by using the repository client.

First, install and import the client library.

In [22]:
!rm -rf $PIP_BUILD/watson-machine-learning-client

In [23]:
!pip install watson-machine-learning-client --upgrade

Collecting watson-machine-learning-client
  Downloading watson_machine_learning_client-1.0.53-py3-none-any.whl (561kB)
    100% |████████████████████████████████| 563kB 1.1MB/s eta 0:00:01
Collecting tqdm (from watson-machine-learning-client)
  Using cached tqdm-4.19.9-py2.py3-none-any.whl
Requirement already up-to-date: tabulate in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s86b-18e61b28c674e4-3fbaf243aed6/.local/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement already up-to-date: urllib3 in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s86b-18e61b28c674e4-3fbaf243aed6/.local/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement already up-to-date: certifi in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s86b-18e61b28c674e4-3fbaf243aed6/.local/lib/python3.5/site-packages (from watson-machine-learning-client)
Requirement already up-to-date: pandas in /gpfs/global_fs01/sym_shared/YPProdSpark/user/s86b-18e61b28c674e4-3fbaf2

**Note**: Apache Spark 2.1 is required.

In [24]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
import json

Authenticate to the Watson Machine Learning service on IBM Cloud.

**Tip**: Authentication information (your credentials) can be found in the <a href="https://console.bluemix.net/docs/services/service_credentials.html#service_credentials" target="_blank" rel="noopener no referrer">Service Credentials</a> tab of the service instance that you created on IBM Cloud. 

If you cannot see the **instance_id** field in **Service Credentials**, click **New credential (+)** to generate new authentication information. 

**Action**: Enter your Watson Machine Learning service instance credentials here.


In [25]:
wml_credentials={
  "url": "https://ibm-watson-ml.mybluemix.net",
  "username": "***",
  "password": "***",
  "instance_id": "***"
}

In [26]:
# The code was removed by Watson Studio for sharing.

Create the WatsonMachineLearningAPIClient.

In [27]:
client = WatsonMachineLearningAPIClient(wml_credentials)

#### Prepare the metadata

**Tip**: If the accuracy value falls below the threshold value, retraining action is required.

Prepare the additional information to be saved as model's metadata:
* TRAINING_DATA_REF
* EVALUATION_METHOD: **multiclass**
* EVALUATION_METRICS name: **accuracy** (metric name used to evaluate the model)
* EVALUATION_METRICS value: **0.87** (accuracy value calculated a few steps above)
* EVALUATION_METRICS threshold: **0.8** (if the accuracy after evaluation using feedback data is below this threshold auto-retraining is triggered)

Prepare the training data reference that will be required by the continuous learning system to trigger retraining action.

**Tip**: All required fields can be found on Service Credentials tab of Db2 Warehouse on Cloud service instance created in IBM Cloud.

In [28]:
training_data_reference = {
 "name": "DRUG feedback",
 "connection": db2_service_credentials,
 "source": {
  "tablename": "DRUG_TRAIN_DATA_UPDATED",
  "type": "dashdb"
 }
}

Add all the information to model meta props.

In [29]:
model_props = {
    client.repository.ModelMetaNames.NAME: "Best Heart Drug Selection",
    client.repository.ModelMetaNames.TRAINING_DATA_REFERENCE: training_data_reference,
    client.repository.ModelMetaNames.EVALUATION_METHOD: "multiclass",
    client.repository.ModelMetaNames.EVALUATION_METRICS: [
        {
           "name": "accuracy",
           "value": accuracy,
           "threshold": 0.8
        }
    ]
}

Store the model.

In [31]:
published_model_details = client.repository.store_model(model=model, meta_props=model_props, training_data=train_data, pipeline=pipeline_dt)
model_uid = client.repository.get_model_uid(published_model_details)

**Tip**: Use `client.repository.ModelMetaNames.show()` to get the list of available props.

Check your model's details:

In [32]:
print(published_model_details)

{'metadata': {'created_at': '2018-03-29T08:35:31.303Z', 'url': 'https://ibm-watson-ml.mybluemix.net/v3/wml_instances/3f6e5c2b-4880-46aa-9d79-62e90ccc9d56/published_models/298832f7-d21d-4a07-a028-b5716dba1264', 'modified_at': '2018-03-29T08:35:31.484Z', 'guid': '298832f7-d21d-4a07-a028-b5716dba1264'}, 'entity': {'latest_version': {'created_at': '2018-03-29T08:35:31.484Z', 'url': 'https://ibm-watson-ml.mybluemix.net/v3/ml_assets/models/298832f7-d21d-4a07-a028-b5716dba1264/versions/f97de42d-c0ea-42f2-a216-8c01744e85bd', 'guid': 'f97de42d-c0ea-42f2-a216-8c01744e85bd'}, 'runtime_environment': 'spark-2.1', 'name': 'Best Heart Drug Selection', 'learning_configuration_url': 'https://ibm-watson-ml.mybluemix.net/v3/wml_instances/3f6e5c2b-4880-46aa-9d79-62e90ccc9d56/published_models/298832f7-d21d-4a07-a028-b5716dba1264/learning_configuration', 'model_type': 'mllib-2.1', 'input_data_schema': {'type': 'struct', 'fields': [{'metadata': {'name': 'AGE', 'scale': 0}, 'nullable': True, 'type': 'integer'

<a id="configuration"></a>
## 4. Configure the continuous learning system

In this section you will learn how to configure the continuous learning system with the WML REST API client.

Use a continuous learning system to:
- Monitor the model quality
- Retrain the model if the quality is below a specified threshold value
- Redeploy the model if the retrained model performs better

For more information about REST APIs, see the [Swagger Documentation](http://watson-ml-api.mybluemix.net/).

- [4.1 Prepare the authorization header](#token)
- [4.2 Configure the continuous learning system for the published model](#config)
- [4.3 Patch the configuration for the stored model](#patch)

### 4.1 Prepare the authorization header<a id="token"></a>

Prepare the authorization header that combines the WML token and Spark instance credentials.

In [34]:
spark_credentials = {
  "tenant_id": "***",
  "tenant_id_full": "***",
  "cluster_master_url": "https://spark.bluemix.net",
  "tenant_secret": "***",
  "instance_id": "***",
  "plan": "ibm.SparkService.PayGoPersonal"
}

In [35]:
# The code was removed by Watson Studio for sharing.

### 4.2 Configure the continuous learning system for the published model<a id="config"></a>

**Tip**:  ```tablename``` is the only difference compared to ```training_data_reference```.

In [36]:
feedback_data_reference = {
 "name": "DRUG feedback",
 "connection": db2_service_credentials,
 "source": {
  "tablename": "DRUG_FEEDBACK_DATA",
  "type": "dashdb"
 }
}
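
Because the training and feedback references differ only in the table name, a small helper can build both from the same connection. This is a sketch for illustration: the function `make_data_reference` is hypothetical and not part of the WML client, and the placeholder connection stands in for `db2_service_credentials`.

```python
def make_data_reference(connection, tablename, name="DRUG feedback", source_type="dashdb"):
    """Build a data reference dictionary in the shape used in this notebook."""
    return {
        "name": name,
        "connection": connection,
        "source": {
            "tablename": tablename,
            "type": source_type,
        },
    }


# Placeholder connection; in the notebook this would be db2_service_credentials.
connection = {"username": "***", "password": "***"}
training_ref = make_data_reference(connection, "DRUG_TRAIN_DATA_UPDATED")
feedback_ref = make_data_reference(connection, "DRUG_FEEDBACK_DATA")
print(training_ref["source"]["tablename"], feedback_ref["source"]["tablename"])
```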

Define values for the following fields to finalize the payload:
- ```min_feedback_data_size``` - the minimum number of records in the feedback data set that is required to start a continuous learning system iteration
- ```auto_retrain``` [never, always, conditionally] - specifies whether the retraining process should be triggered (```conditionally``` triggers retraining when the evaluation result is below the specified threshold value)
- ```auto_redeploy``` [never, always, conditionally] - specifies whether the retrained model should be deployed (```conditionally``` triggers redeployment when the quality of the newly trained model is better)
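
The semantics of the two policy fields can be summarized as simple decision rules. The sketch below encodes them as described in the list above. It illustrates the documented behavior and is not the service's implementation; both function names are made up for this example.

```python
POLICIES = ("never", "always", "conditionally")


def should_retrain(policy, evaluated_value, threshold):
    """Decide whether retraining is triggered for a given auto_retrain policy."""
    if policy not in POLICIES:
        raise ValueError("unknown policy: %r" % policy)
    if policy == "conditionally":
        # Retrain only when the evaluation result falls below the threshold.
        return evaluated_value < threshold
    return policy == "always"


def should_redeploy(policy, new_value, current_value):
    """Decide whether the retrained model is deployed for a given auto_redeploy policy."""
    if policy not in POLICIES:
        raise ValueError("unknown policy: %r" % policy)
    if policy == "conditionally":
        # Redeploy only when the retrained model evaluates better than the current one.
        return new_value > current_value
    return policy == "always"


print(should_retrain("conditionally", evaluated_value=0.75, threshold=0.8))
print(should_redeploy("conditionally", new_value=0.84, current_value=0.75))
```

The numeric values in the two calls are illustrative inputs chosen for this example.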

In [37]:
system_config = {
    client.learning_system.ConfigurationMetaNames.FEEDBACK_DATA_REFERENCE: feedback_data_reference,
    client.learning_system.ConfigurationMetaNames.MIN_FEEDBACK_DATA_SIZE: 10,
    client.learning_system.ConfigurationMetaNames.SPARK_REFERENCE: spark_credentials,
    client.learning_system.ConfigurationMetaNames.AUTO_RETRAIN: "conditionally",
    client.learning_system.ConfigurationMetaNames.AUTO_REDEPLOY: "always"
}

client.learning_system.setup(model_uid=model_uid, meta_props=system_config)

{'auto_redeploy': 'always',
 'auto_retrain': 'conditionally',
 'feedback_data_reference': {'connection': {'db': 'BLUDB',
   'dsn': 'DATABASE=BLUDB;HOSTNAME=dashdb-entry-yp-dal10-01.services.dal.bluemix.net;PORT=50000;PROTOCOL=TCPIP;UID=dash6973;PWD=5338f7276f54;',
   'host': 'dashdb-entry-yp-dal10-01.services.dal.bluemix.net',
   'hostname': 'dashdb-entry-yp-dal10-01.services.dal.bluemix.net',
   'https_url': 'https://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:8443',
   'jdbcurl': 'jdbc:db2://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:50000/BLUDB',
   'password': '5338f7276f54',
   'port': 50000,
   'ssldsn': 'DATABASE=BLUDB;HOSTNAME=dashdb-entry-yp-dal10-01.services.dal.bluemix.net;PORT=50001;PROTOCOL=TCPIP;UID=dash6973;PWD=5338f7276f54;Security=SSL;',
   'ssljdbcurl': 'jdbc:db2://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:50001/BLUDB:sslConnection=true;',
   'uri': 'db2://dash6973:5338f7276f54@dashdb-entry-yp-dal10-01.services.dal.bluemix.net:50000/BLUDB',
   'us

You have successfully configured continuous learning. You can check the details with a GET call:

In [38]:
learning_details = client.learning_system.get_details(model_uid)

### 4.3 Patch the configuration for the stored model<a id="patch"></a>

To update the learning configuration, use the PATCH request as shown below.

In [40]:
feedback_data_reference_updated = {
    "connection": db2_service_credentials,
    "source": {
        "type": "dashdb",
         "tablename": "DRUG_FEEDBACK_DATA"
    }
}

In [44]:
updated_config = {
    client.learning_system.ConfigurationMetaNames.FEEDBACK_DATA_REFERENCE: feedback_data_reference_updated
}

client.learning_system.update(model_uid, updated_config)

{'auto_redeploy': 'always',
 'auto_retrain': 'conditionally',
 'evaluation_definition': {'method': 'multiclass',
  'metrics': [{'name': 'accuracy', 'threshold': 0.8}]},
 'feedback_data_reference': {'connection': {'db': 'BLUDB',
   'dsn': 'DATABASE=BLUDB;HOSTNAME=dashdb-entry-yp-dal10-01.services.dal.bluemix.net;PORT=50000;PROTOCOL=TCPIP;UID=dash6973;PWD=5338f7276f54;',
   'host': 'dashdb-entry-yp-dal10-01.services.dal.bluemix.net',
   'hostname': 'dashdb-entry-yp-dal10-01.services.dal.bluemix.net',
   'https_url': 'https://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:8443',
   'jdbcurl': 'jdbc:db2://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:50000/BLUDB',
   'password': '5338f7276f54',
   'port': 50000,
   'ssldsn': 'DATABASE=BLUDB;HOSTNAME=dashdb-entry-yp-dal10-01.services.dal.bluemix.net;PORT=50001;PROTOCOL=TCPIP;UID=dash6973;PWD=5338f7276f54;Security=SSL;',
   'ssljdbcurl': 'jdbc:db2://dashdb-entry-yp-dal10-01.services.dal.bluemix.net:50001/BLUDB:sslConnection=true;',
  

<a id="performance"></a>
## 5. Track the model's performance

To start a learning system iteration, use the client run(...) method. During the iteration, the published model is evaluated, and if the evaluated accuracy is below a specified threshold value, model retraining is triggered. Both data sets, training and feedback, are used for retraining and evaluation.

Set the `asynchronous` parameter to `False` to run the iteration synchronously, so that its state is printed as it progresses.

In [46]:
run_details = client.learning_system.run(model_uid, asynchronous=False)



#######################################################################

Synchronous run for uid: '604d5ef2-879d-49f0-a7a8-33be2eaaeab4' started

#######################################################################


INITIALIZED
RUNNING................
COMPLETED


--------------------------------------------------------------------------------------------
Successfully finished learning iteration run, run_uid='604d5ef2-879d-49f0-a7a8-33be2eaaeab4'
--------------------------------------------------------------------------------------------
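
If a run is started asynchronously instead, the caller has to check the state itself. The sketch below shows a generic polling loop under stated assumptions: `fetch_state` is a hypothetical callable that you would implement on top of the run details, `FAILED` is an assumed terminal state, and the fake state sequence simply replays the states printed above.

```python
import time

TERMINAL_STATES = {"COMPLETED", "FAILED"}  # "FAILED" is an assumed terminal state


def wait_for_run(fetch_state, poll_seconds=5.0, max_polls=120):
    """Poll fetch_state() until a terminal state is reported; return the distinct states seen."""
    seen = []
    for _ in range(max_polls):
        state = fetch_state()
        # Record a state only when it changes, to keep the history short.
        if not seen or seen[-1] != state:
            seen.append(state)
        if state in TERMINAL_STATES:
            return seen
        time.sleep(poll_seconds)
    raise TimeoutError("run did not finish after %d polls" % max_polls)


# Stand-in for a real status lookup: replays the states from the output above.
fake_states = iter(["INITIALIZED", "RUNNING", "RUNNING", "COMPLETED"])
history = wait_for_run(lambda: next(fake_states), poll_seconds=0)
print(history)
```

Bounding the number of polls keeps a stuck run from blocking the notebook indefinitely.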




#### Get the run uid

In [48]:
run_uid = client.learning_system.get_run_uid(run_details)

#### Get run details

In [53]:
learning_run_details = client.learning_system.get_run_details(run_uid)

#### Get evaluation values

In [51]:
metrics = client.learning_system.get_metrics(model_uid)
# print(json.dumps(metrics, indent=2))

**Tip**: To see the evaluation result, wait for the iteration to complete.

**Note**: To display the evaluation details as a table, you need to install the ```tabulate``` package.

In [52]:
client.learning_system.list_metrics(model_uid)

----------  ------------------------  -----------  ------------------  --------------  -----------------------------------
PHASE       TIMESTAMP                 METRIC NAME  METRIC VALUE        METRIC THRESH.  VERSION
setup       2018-03-29T08:35:31.382Z  accuracy     0.8709677419354839  0.8             f97de42d-c0ea-42f2-a216-8c01744e85b
monitoring  2018-03-29T08:46:59.079Z  accuracy     0.75                0.8             f97de42d-c0ea-42f2-a216-8c01744e85b
training    2018-03-29T08:47:33.793Z  accuracy     0.8398066941113299  0.8             acffe913-247e-4085-a335-540e53aaced
----------  ------------------------  -----------  ------------------  --------------  -----------------------------------


You can see that this iteration consists of the following phases:
- Monitoring - the model quality was checked (evaluation) using feedback data. 
- Training - because the evaluation result (0.75) is below the specified threshold value (0.8), model retraining was triggered. An evaluation of the retrained model shows an accuracy of approximately 0.84.

**Tip**: If the `auto_redeploy` option is set to 'conditionally', the newly trained model will be redeployed because it shows better accuracy than the original one.
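
The same reading of the table can be done programmatically. The sketch below uses a hand-built dictionary named `sample_metrics`: its values are copied from the table above, and its structure (`resources`, `phase`, `values`) is inferred from the plotting code later in this notebook, so treat it as an assumption about the real response rather than a specification.

```python
# Values copied from the metrics table; structure inferred from the plotting cell.
sample_metrics = {
    "resources": [
        {"phase": "setup",
         "values": [{"name": "accuracy", "value": 0.8709677419354839, "threshold": 0.8}]},
        {"phase": "monitoring",
         "values": [{"name": "accuracy", "value": 0.75, "threshold": 0.8}]},
        {"phase": "training",
         "values": [{"name": "accuracy", "value": 0.8398066941113299, "threshold": 0.8}]},
    ]
}


def phases_below_threshold(metrics):
    """Return the phases whose first metric value is below its threshold."""
    below = []
    for resource in metrics["resources"]:
        first = resource["values"][0]
        if first["value"] < first["threshold"]:
            below.append(resource["phase"])
    return below


print(phases_below_threshold(sample_metrics))
```

Only the monitoring phase falls below the threshold, which is what triggered retraining in this iteration.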

<a id="visualization"></a>
## 6. Visualize the model performance

In this section you visualize the iteration results with Plotly, which is an online analytics and data visualization tool.

**Example**: First, install the required packages by running the following commands. Run them only once.

!pip install plotly --user

!pip install cufflinks --user

Import Plotly and the other required packages.

In [None]:
!pip install cufflinks

In [56]:
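import os  # required for os.environ below
import sys
import pandas
import plotly.plotly as py
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import cufflinks as cf
import plotly.graph_objs as go

# Enable offline plotting inside the notebook.
init_notebook_mode(connected=True)
# Make modules in the home directory importable.
sys.path.append("".join([os.environ["HOME"]])) 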

#### Prepare the data for plotly.

In [57]:
phases = []
evaluation_values = []
threshold_values = []

for i,x in enumerate(metrics['resources']):
    phases.append(x['phase'] + '_' + str(i))
    evaluation_values.append(x['values'][0]['value'])
    threshold_values.append(x['values'][0]['threshold'])

#### Plot a line chart.

In [58]:
trace1 = go.Scatter(
    x = phases,
    y = evaluation_values,
    mode = 'lines+markers',
    name = 'accuracy'
)

trace2 = go.Scatter(
    x = phases,
    y = threshold_values,
    mode = 'lines',
    name = 'threshold'
)

layout = dict(title = 'Model performance',
              xaxis = dict(title = 'Phase'),
              yaxis = dict(title = 'Evaluation result'),
              )

fig = dict(data=[trace1, trace2], layout=layout)
iplot(fig)

Within a single continuous learning system iteration you can observe two phases:
* Monitoring - the initial model is evaluated using feedback data
* Training - the model is retrained using a combination of training and feedback data. Next, the model is evaluated.

After retraining, the model accuracy increased to the desired level (that is, above the specified threshold value).

<a id="feedback"></a>
## 7. Send new records to the feedback data store 

You can use the feedback endpoint to send new records to the feedback data store.

Generate some records based on the training data.

In [59]:
from pyspark.sql.functions import UserDefinedFunction, col, column
from pyspark.sql.types import IntegerType

col_name = 'AGE'
udf_add = UserDefinedFunction(lambda x: x + 1, IntegerType())
new_records_df = train_data.select(*[udf_add(column).alias(col_name) if column == col_name else column for column in train_data.columns])
new_records_df = new_records_df.withColumn("K", col("K").cast("double")).withColumn("NA", col("NA").cast("double"))
new_records_pdf = new_records_df.toPandas()

In [60]:
records=[]

import numpy as np

for i in range(new_records_pdf.shape[0]):
    records.append([x.tolist() if type(x).__module__ == np.__name__ else x for x in new_records_pdf.loc[i].values.tolist()])
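
The loop above converts NumPy scalars to native Python values so that the payload can be serialized. Before sending, a quick sanity check can confirm that every record has one value per field and that the payload is JSON-serializable. The sketch below is illustrative only: `validate_records` is a hypothetical helper, and the two sample records are rows from the preview in section 2.1 with `AGE` increased by one.

```python
import json


def validate_records(records, fields):
    """Check record lengths and JSON serializability; return the number of records."""
    for i, record in enumerate(records):
        if len(record) != len(fields):
            raise ValueError(
                "record %d has %d values, expected %d" % (i, len(record), len(fields))
            )
    # json.dumps raises TypeError for values that are not native Python types.
    json.dumps({"fields": fields, "values": records})
    return len(records)


fields = ["AGE", "SEX", "BP", "CHOLESTEROL", "NA", "K", "DRUG"]
sample_records = [
    [44, "M", "HIGH", "HIGH", 0.656371, 0.046979, "drugA"],
    [33, "M", "HIGH", "NORMAL", 0.529750, 0.056087, "drugA"],
]
print(validate_records(sample_records, fields), "records look valid")
```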

In the next step, send the feedback payload.

In [61]:
client.learning_system.send_feedback(model_uid, records, fields=train_data.columns)

{'rows_inserted': 150}

**Tip:** Now, you can run another iteration using the new feedback data.

<a id="summary"></a>
## 8. Summary and next steps     

You successfully completed this notebook! 
 
You learned how to use the continuous learning system of Watson Machine Learning. 
Check out our [Online Documentation](https://dataplatform.ibm.com/docs/content/analyze-data/wml-setup.html)
 for more samples, tutorials, documentation, how-tos, and blog posts. 

### Authors

**Lukasz Cmielowski**, PhD, is an Automation Architect and Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase clients' ability to turn data into actionable knowledge.

**Maria Oleszkiewicz**, MSc, is a developer who took part in building the WML API client used in this notebook.

Copyright © 2018 IBM. This notebook and its source code are released under the terms of the MIT License.
