DISCLAIMER: This application is used for demonstrative and illustrative purposes only and does not constitute an offering that has gone through regulatory review.
This code pattern can be thought of as two distinct parts:
- A predictive model will be built using Spark within a Jupyter Notebook on IBM Watson Studio. The model is then deployed to the Watson Machine Learning service, where it can be accessed via a REST API.
- A Node.js web app that allows a user to input some data to be scored against the previous model.
When the reader has completed this Code Pattern, they will understand how to:
- Build a predictive model within a Jupyter Notebook on Watson Studio
- Deploy the model to the IBM Watson Machine Learning service
- Use a Node.js app to score data against the model through an API call to the Watson Machine Learning service
Sample output
Here's an example of what the final web app looks like
And its result
- The developer creates an IBM Watson Studio Workspace.
- IBM Watson Studio uses an Apache Spark service.
- IBM Watson Studio uses Cloud Object Storage to manage your data.
- IBM Watson Studio uses a Jupyter Notebook to import data and to train and evaluate the model.
- Data is imported and stored on Cloud Object Storage.
- Models trained via Jupyter Notebooks are deployed using the Watson Machine Learning service.
- A Node.js web app is deployed on IBM Cloud; it calls the predictive model hosted on the Watson Machine Learning service.
- A user visits the web app, enters their information, and the predictive model returns a response.
- IBM Watson Studio: Analyze data using RStudio, Jupyter, and Python in a configured, collaborative environment that includes IBM value-adds, such as managed Spark.
- Jupyter Notebook: An open source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text.
- PixieDust: Provides a Python helper library for IPython Notebook.
- Node.js: An open-source JavaScript run-time environment for executing server-side JavaScript code.
- An IBM Cloud Account
- An account on IBM Watson Studio.
NOTE: As of 12/14/2018, the Watson Machine Learning service on IBM Cloud is only available in the Dallas, London, Frankfurt, and Tokyo regions.
- Setup project and data in Watson Studio
- Create and deploy a predictive model with Watson Studio
- The client side
To complete this code pattern we'll need to do a few setup steps before creating our model. In Watson Studio we need to: create a project, add our patient data (which our model will be based on), upload our notebook, and provision a Watson Machine Learning service.
- Log into IBM's Watson Studio. Once in, you'll land on the dashboard.
- Create a new project by clicking `+ New project` and choosing `Data Science`.
- Enter a name for the project and click `Create`.
NOTE: By creating a project in Watson Studio, a free tier `Object Storage` service will be created in your IBM Cloud account. Select the `Free` storage type to avoid fees.
The data used in this example was generated using a normal distribution. Attributes such as age, gender, heart rate, minutes of exercise per week, and cholesterol are used to create the model we will eventually deploy.
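As a rough illustration of how such a data set could be produced, here is a minimal sketch that draws each numeric attribute from a normal distribution. The column names, means, and standard deviations are hypothetical and chosen only for this example; they are not taken from `patientdataV6.csv`.

```python
import csv
import io
import random

# Hypothetical (mean, standard deviation) per numeric attribute.
# These values are assumptions for illustration, not the real data set's parameters.
COLUMNS = {
    "AGE": (50, 12),
    "HEART_RATE": (75, 10),
    "EXERCISE_MIN_PER_WEEK": (120, 45),
    "CHOLESTEROL": (190, 30),
}


def generate_patients(n, seed=0):
    """Return n synthetic patient rows sampled from normal distributions."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    rows = []
    for _ in range(n):
        # Round to whole numbers and clamp at zero so no attribute is negative.
        row = {
            name: max(0, round(rng.gauss(mean, std)))
            for name, (mean, std) in COLUMNS.items()
        }
        row["GENDER"] = rng.choice(["F", "M"])
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(generate_patients(5)))
```

Seeding the generator keeps the sample reproducible, which helps when comparing model runs.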
- From the new project `Overview` panel, click `+ Add to project` on the top right and choose the `Data` asset type.
- A panel on the right of the screen will appear to assist you in uploading data. Follow the numbered steps in the image below.
  - Ensure you're on the `Load` tab. [1]
  - Click on the `browse` option. From your machine, browse to the location of the `patientdataV6.csv` file in this repository, and upload it. [not numbered]
  - Once uploaded, go to the `Files` tab. [2]
  - Ensure the `patientdataV6.csv` file appears. [3]
TIP: Once successfully uploaded, the file should appear in the `Data assets` section of the `Assets` tab.
- Click on the navigation menu on the left (`☰`) to show additional options. Click on the `Watson Services` option.
- From the overview page, click `+ Add service` on the top right and choose the `Machine Learning` service. Select the `Lite` plan to avoid fees.
- Once provisioned, you should see the service listed in the `Watson Services` overview page. Select the service by opening the link in a new tab. We're now in the IBM Cloud tool, where we will create service credentials for our new Watson Machine Learning service. Follow the numbered steps in the image below. We'll be using these credentials in Step 2, so keep them handy!
TIP: You can now go back to the project via the navigation menu on the left (`☰`).
The notebook we'll be using can be viewed in `notebooks/predictiveModel.ipynb`, and a completed version can be found in `examples/exampleOutput.ipynb`.
- From the new project `Overview` panel, click `+ Add to project` on the top right and choose the `Notebook` asset type. Fill in the following information:
  - Select the `From URL` tab. [1]
  - Enter a `Name` for the notebook and optionally a description. [2]
  - Under `Notebook URL` provide the following URL: https://github.com/IBM/predictive-model-on-watson-ml/blob/master/notebooks/predictiveModel.ipynb [3]
  - For `Runtime` select the `Spark Python 3.6` option. [4]
TIP: Once successfully imported, the notebook should appear in the `Notebooks` section of the `Assets` tab.
Now that we're in our Notebook editor, we can start to create our predictive model by stepping through the notebook.
- Click the `(►) Run` button to start stepping through the notebook.
- When you reach the cell entitled 2. Load and explore data, pause and follow the instructions in that cell. On the very next cell we need to add our data. Follow the numbered steps in the image below.
  - Click on the `Data` icon. [1]
  - Select the `Insert to code` option under the file `patientdataV6.csv`. [2]
  - Choose the `Insert SparkSession Data Frame` option. [3]
- The above step will have inserted a chunk of code into your notebook. We need to make two changes:
  - Rename the `df_data_1` variable to `df_data`. [1]
  - Re-add the line `.option('inferSchema','True')\` to the `spark.read` call chain. [2]
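The `inferSchema` option matters because Spark's CSV reader treats every column as a string unless it is asked to infer types, and the model needs numeric columns. The sketch below is a simplified, pure-Python illustration of that idea; it is not Spark's implementation, and the sample rows and column names are made up for the example.

```python
import csv
import io

# Made-up sample rows; the column names are hypothetical.
SAMPLE_CSV = "AGE,CHOLESTEROL,GENDER\n48,203.5,F\n61,187.0,M\n"


def infer_type(values):
    """Return the narrowest of int, float, or str that parses every value."""
    for candidate in (int, float):
        try:
            for value in values:
                candidate(value)
            return candidate
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return str


def read_csv(text, infer_schema=False):
    """Read CSV text into dict rows, optionally converting column types."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not infer_schema or not rows:
        return rows  # every value stays a string
    types = {name: infer_type([row[name] for row in rows]) for name in rows[0]}
    return [{name: types[name](value) for name, value in row.items()} for row in rows]


if __name__ == "__main__":
    print(read_csv(SAMPLE_CSV)[0])
    print(read_csv(SAMPLE_CSV, infer_schema=True)[0])
```

Without inference the `AGE` value is the string `"48"`; with inference it becomes the integer `48`, which is the form a classifier can use as a feature.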
- Keep stepping through the code, pausing on each step to read the code and see the output for the operation we're performing. At the end of Step 4 we'll have used the Random Forest Classifier from PySpark to create a model LOCALLY.
The gist of the next two steps is to use the Watson Machine Learning Python client to persist and deploy the model we just created.
- At the beginning of Step 5. Persist model, before we deploy our model, we need to update the cell with credentials from our Watson Machine Learning service. (Remember that from Step 1.3 Provision a Watson Machine Learning service?)
- Update the `wml_credentials` variable in that cell. Copy and paste the entire credential dictionary, which can be found on the Service Credentials tab of the Watson Machine Learning service instance created on the IBM Cloud.
- Keep stepping through the code, pausing on each step to read the code and see the output for the operation we're performing. At the end of Step 5 we'll have used the Watson Machine Learning service to persist our predictive model! 🎉
- Now let's run Step 6 of the notebook. Deploy our model so we can have an endpoint to score data against.
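To make "an endpoint to score data against" more concrete, here is a minimal sketch of how a scoring request could be assembled. It assumes the endpoint accepts a JSON body with a `fields` list and a `values` list of rows, plus a bearer token; check the Watson Machine Learning documentation for the format your deployment expects. The URL, token, and field names are placeholders, and the request is only built, never sent.

```python
import json
import urllib.request


def build_scoring_request(scoring_url, token, fields, rows):
    """Build (but do not send) a POST request carrying rows to score.

    The {"fields": [...], "values": [[...]]} payload shape is an assumption
    about the scoring API, not something confirmed by this document.
    """
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("each row needs exactly one value per field")
    payload = {"fields": list(fields), "values": [list(row) for row in rows]}
    return urllib.request.Request(
        scoring_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL, token, and hypothetical field names.
    request = build_scoring_request(
        "https://example.com/scoring-endpoint",
        "PLACEHOLDER_TOKEN",
        ["AGE", "CHOLESTEROL"],
        [[48, 203.5]],
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` once a real scoring URL and token are available.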
Now that we have an API, let's create a client side interface that a typical user would interact with.
You can deploy this application as a Cloud Foundry application to IBM Cloud by clicking the button below. This option will create a deployment pipeline, complete with a hosted GitLab project and DevOps toolchain.
- You may be prompted for an IBM Cloud API Key during this process. Use the `Create (+)` button to auto-fill this field and the others.
- Click on the `Deploy` button to deploy the application.
- You can view the URL where the app will live by either waiting for the deployment to finish, or by finding your app from the IBM Cloud dashboard. Click on the application name, then choose `Visit App URL` from the `Overview` page to open the application in a separate tab.
- From the application's overview page, select the `Connections` option from the left menu panel. This will allow us to associate our Watson Machine Learning service with the application. Find the Watson Machine Learning service you provisioned earlier (likely prefixed with `pm-20`), click the `Connect` button, and choose the default options for the IAM generated credentials.
- When prompted to restage your application, click the `Restage` button. The app will take a couple of minutes to come back online. Refresh any web pages that have the app running.
Why do this? The application expects information about the Machine Learning service via environment variables. Associating the application with the service makes details about the service, such as the deployment ID and other sensitive information, accessible through environment variables.
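On Cloud Foundry, credentials for connected services are exposed to the app as JSON in the `VCAP_SERVICES` environment variable. The web app in this pattern is written in Node.js and its exact lookup code is not shown here, so the following Python sketch only illustrates the idea. The function name is hypothetical, and the `pm-20` label prefix is taken from the connection step above.

```python
import json
import os


def find_service_credentials(environ=os.environ, label_prefix="pm-20"):
    """Return the credentials of the first bound service whose label matches.

    VCAP_SERVICES maps each service label to a list of bound instances,
    and each instance carries a "credentials" object.
    """
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    for label, instances in services.items():
        if label.startswith(label_prefix) and instances:
            return instances[0].get("credentials", {})
    return None  # no matching service is bound to the app


if __name__ == "__main__":
    # Fake environment for illustration; real values come from the platform.
    fake_env = {
        "VCAP_SERVICES": json.dumps(
            {"pm-20": [{"name": "example-wml", "credentials": {"url": "https://example.com"}}]}
        )
    }
    print(find_service_credentials(fake_env))
```

Reading credentials from the environment keeps secrets out of the source tree, which is why the app needs the connection and restage steps before it can call the model.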
The app is fairly self-explanatory: simply fill in the data you want to score and click on the `Score now` button to test how those figures would score against our model.
- Verify that the model predicts that there is a risk of heart failure for the patient with these medical characteristics.
- Verify that the model predicts that there is not a risk of heart failure for the patient with these medical characteristics.
This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.