# Tutorial for Using Azure Machine Learning Notebook
Lixun Zhang  
Nov 9, 2015

## 1 Introduction
### 1.1 Who's the target audience
This example is intended for readers who are comfortable developing models in a Jupyter notebook and have a basic understanding of Azure Machine Learning (Azure ML) experiments. The purpose is to give a complete walkthrough of how to use a Jupyter notebook within Azure Machine Learning Studio. Using linear regression as an example, we'll see how Azure ML's Jupyter notebook can be used to access data from an Azure ML experiment, fit a model, set up a web service on Azure ML, and consume the service.

### 1.2 Why Azure ML notebook
The major advantage of using Azure ML notebook is that you won't need to install Python on your local computer. This makes it possible for anyone with internet access to write Python programs using a web browser. Azure ML's notebook is based on the [Anaconda][Anaconda link] distribution, which has many packages installed. 

If you are new to Jupyter notebook, you can learn more about it from [The IPython Notebook][ipython link] and [Jupyter/IPython Notebook Quick Start Guide][jupyter link]. 

If you are new to Azure ML and want to learn more about it, the [Data Scientists' Guide][guide link] can help you get started.

[Anaconda link]: https://www.continuum.io/
[ipython link]: http://ipython.org/notebook.html
[jupyter link]: http://jupyter-notebook-beginner-guide.readthedocs.org/en/latest/index.html
[guide link]: https://gallery.cortanaanalytics.com/Experiment/Tutorial-for-Data-Scientists-3

## 2 Read data
In this example we'll be using a dataset from an Azure ML experiment. In [Figure 1][pic 1] I created an experiment that uses the "Boston" dataset from the R package "MASS". After running the experiment, right-click on the output port of the "Convert to CSV" module and select "Python 2". You should see a new notebook generated ([Figure 2][pic 2]). For the remainder of this example we'll use the data generated from this experiment.

[![Figure 1][pic 1]][pic 1] Figure 1


[![Figure 2][pic 2]][pic 2] Figure 2

In your newly generated Jupyter notebook, run the automatically generated code to bring the data into the current session. These lines use the `Workspace` class from the [azureml][azureml link] package. In your notebook, the values of workspace\_id and authorization\_token will be different from the ones shown here. Similarly, your experiment ID and node\_id will be different from mine.

[pic 1]: https://cloud.githubusercontent.com/assets/9322661/10639017/e7f19826-77db-11e5-9279-b8ca981912d7.PNG
[pic 2]: https://cloud.githubusercontent.com/assets/9322661/11036593/7d3e8d7c-86c7-11e5-818a-713fa52f6ca8.png
[azureml link]: https://github.com/Azure/Azure-MachineLearning-ClientLibrary-Python

In [1]:
from azureml import Workspace
ws = Workspace(
 workspace_id='b2bbeb56a1d04e1599d2510a06c59d87',
 authorization_token='a3978d933cd84e64ab583a616366d160',
 endpoint='https://studioapi.azureml.net'
)
experiment = ws.experiments['b2bbeb56a1d04e1599d2510a06c59d87.f-id.911630d13cbe4407b9fe408b5bb6ddef']
ds = experiment.get_intermediate_dataset(
 node_id='a0a931cf-9fb3-4cb9-83db-f48211be560c-323',
 port_name='Results dataset',
 data_type_id='GenericCSV'
)
frame = ds.to_dataframe()

In [2]:
# frame # commented out to avoid printing
frame.head()

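| | crim | zn | indus | chas | nox | rm | age | dis | rad | tax | ptratio | black | lstat | medv |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18 | 2.31 | 0 | 0.538 | 6.575 | 65.2 | 4.09 | 1 | 296 | 15.3 | 396.9 | 4.98 | 24.0 |
| 1 | 0.02731 | 0 | 7.07 | 0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2 | 242 | 17.8 | 396.9 | 9.14 | 21.6 |
| 2 | 0.02729 | 0 | 7.07 | 0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2 | 242 | 17.8 | 392.83 | 4.03 | 34.7 |
| 3 | 0.03237 | 0 | 2.18 | 0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3 | 222 | 18.7 | 394.63 | 2.94 | 33.4 |
| 4 | 0.06905 | 0 | 2.18 | 0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3 | 222 | 18.7 | 396.9 | 5.33 | 36.2 |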


## 3 Linear regression with sklearn
Now let's assign the data to a new variable "mydata" and use it to develop a linear model. In practice, you may want to spend more time on things like feature engineering and variable selection. In this example, however, we will just fit a model using all variables. 

In [3]:
import pandas as pd
from sklearn.linear_model import LinearRegression

# assign data to mydata
mydata = frame

# create X and y
feature_cols = ['crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat']
X = mydata[feature_cols]
y = mydata.medv

# initiate the linear model and fit with data
lm = LinearRegression()
lm.fit(X, y)

# print the R-squared
print("The R-squared value is: {0:0.4f} \n".format(lm.score(X, y)))

# save intercept and coefficients
param_df = pd.DataFrame({"Features": ['intercept'] + feature_cols, "Coef": [lm.intercept_] + list(lm.coef_)})
cols = param_df.columns.tolist()
cols = cols[-1:]+cols[:-1]
param_df = param_df[cols]
print(param_df)

The R-squared value is: 0.7406 

     Features       Coef
0   intercept  36.459488
1        crim  -0.108011
2          zn   0.046420
3       indus   0.020559
4        chas   2.686734
5         nox -17.766611
6          rm   3.809865
7         age   0.000692
8         dis  -1.475567
9         rad   0.306049
10        tax  -0.012335
11    ptratio  -0.952747
12      black   0.009312
13      lstat  -0.524758
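
To make the fitted model concrete, the sketch below applies the printed intercept and coefficients by hand to the first record from the data preview in section 2. It is only an illustration: the coefficients are the rounded values printed above, so the result agrees with the model's own prediction for that record (about 30.0038, shown in the `predicted` column later in this section) only to a few decimal places. The names `intercept`, `coefs`, `row0`, and `manual_prediction` are introduced here for illustration and are not part of the original notebook.

```python
# Rounded intercept and coefficients copied from the printed output above
intercept = 36.459488
coefs = {
    "crim": -0.108011, "zn": 0.046420, "indus": 0.020559, "chas": 2.686734,
    "nox": -17.766611, "rm": 3.809865, "age": 0.000692, "dis": -1.475567,
    "rad": 0.306049, "tax": -0.012335, "ptratio": -0.952747,
    "black": 0.009312, "lstat": -0.524758,
}

# First record of the dataset (row 0 in the data preview)
row0 = {
    "crim": 0.00632, "zn": 18, "indus": 2.31, "chas": 0, "nox": 0.538,
    "rm": 6.575, "age": 65.2, "dis": 4.09, "rad": 1, "tax": 296,
    "ptratio": 15.3, "black": 396.9, "lstat": 4.98,
}

# A linear model predicts: intercept + sum(coefficient * feature value)
manual_prediction = intercept + sum(coefs[name] * row0[name] for name in coefs)
print("Manual prediction for row 0: {0:0.4f}".format(manual_prediction))
```

This is the same arithmetic that `lm.predict` performs for each row, using the unrounded coefficients.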


Next we'll use the model to make predictions. Typically, predictions are made on a validation dataset. Here, however, the training dataset is used for illustration purposes.

In [4]:
newX = X
newY = y

predicted = lm.predict(newX)
predicted_df = pd.DataFrame({"predicted": predicted})
mydata_with_pd = newX.join(newY).join(predicted_df)
mydata_with_pd.head()

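| | crim | zn | indus | chas | nox | rm | age | dis | rad | tax | ptratio | black | lstat | medv | predicted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18 | 2.31 | 0 | 0.538 | 6.575 | 65.2 | 4.09 | 1 | 296 | 15.3 | 396.9 | 4.98 | 24.0 | 30.003843 |
| 1 | 0.02731 | 0 | 7.07 | 0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2 | 242 | 17.8 | 396.9 | 9.14 | 21.6 | 25.025562 |
| 2 | 0.02729 | 0 | 7.07 | 0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2 | 242 | 17.8 | 392.83 | 4.03 | 34.7 | 30.567597 |
| 3 | 0.03237 | 0 | 2.18 | 0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3 | 222 | 18.7 | 394.63 | 2.94 | 33.4 | 28.607036 |
| 4 | 0.06905 | 0 | 2.18 | 0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3 | 222 | 18.7 | 396.9 | 5.33 | 36.2 | 27.943524 |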
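
The predictions above are made on the same rows the model was trained on. A hold-out evaluation could be sketched as follows, assuming a simple random split of row positions. The helper `split_indices`, the 80/20 ratio, and the seed are illustrative choices and are not part of the original notebook. The sketch uses only the standard library.

```python
import random


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Randomly partition the row positions 0..n_rows-1 into (train, test)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_test = int(round(n_rows * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# The Boston dataset has 506 rows
train_idx, test_idx = split_indices(506)
print("train rows: {0}, test rows: {1}".format(len(train_idx), len(test_idx)))
```

With the notebook's variables, the two lists could then be used as `lm.fit(X.iloc[train_idx], y.iloc[train_idx])` followed by `lm.predict(X.iloc[test_idx])`, so that the error metrics are computed on rows the model has not seen.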


With the predicted values, we can calculate metrics such as mean absolute error, root mean squared error, relative absolute error, and relative squared error. These metrics are standard outputs from the "Evaluate Model" module in Azure ML and you'll notice that the results here are the same as those in [Data Scientists' Guide][guide link].


In [5]:
import numpy as np
obs = mydata_with_pd.medv
pred = mydata_with_pd.predicted

mae = np.mean(abs(pred-obs))
rmse = np.sqrt(np.mean((pred-obs)**2))
rae = np.mean(abs(pred-obs))/np.mean(abs(obs-np.mean(obs)))
rse = np.mean((pred-obs)**2)/np.mean((obs-np.mean(obs))**2)

print("Mean Absolute Error: {0:0.6f}".format(mae))
print("Root Mean Squared Error: {0:0.6f}".format(rmse))
print("Relative Absolute Error: {0:0.6f}".format(rae))
print("Relative Squared Error: {0:0.6f}".format(rse))

Mean Absolute Error: 3.270863
Root Mean Squared Error: 4.679191
Relative Absolute Error: 0.492066
Relative Squared Error: 0.259357
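
The four metrics can also be written as a small reusable function. The sketch below uses only the standard library and a made-up toy sample: the values in `obs_demo` and `pred_demo` are not from the Boston data, and `regression_metrics` is a hypothetical helper name. Note that the relative squared error equals one minus the R-squared value as scikit-learn's `score` defines it, which is consistent with the outputs above: 1 - 0.259357 is approximately 0.7406.

```python
from math import sqrt


def regression_metrics(obs, pred):
    """Return MAE, RMSE, RAE and RSE for paired observed and predicted values."""
    n = float(len(obs))
    mean_obs = sum(obs) / n
    abs_err = sum(abs(p - o) for o, p in zip(obs, pred))
    sq_err = sum((p - o) ** 2 for o, p in zip(obs, pred))
    # Baseline errors of a model that always predicts the mean of the observations
    abs_dev = sum(abs(o - mean_obs) for o in obs)
    sq_dev = sum((o - mean_obs) ** 2 for o in obs)
    return {
        "mae": abs_err / n,
        "rmse": sqrt(sq_err / n),
        "rae": abs_err / abs_dev,
        "rse": sq_err / sq_dev,
    }


# Toy values for illustration only
obs_demo = [1.0, 2.0, 3.0, 4.0]
pred_demo = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(obs_demo, pred_demo)
r_squared = 1.0 - metrics["rse"]
print(metrics, r_squared)
```

Both relative metrics compare the model against a baseline that always predicts the mean, so values well below 1 indicate an improvement over that baseline.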


## 4 Web service

### 4.1 Set up a web service
After developing the model, we can deploy it as a service so others can use it. The "azureml" package's services subpackage can be used for this purpose. The following lines of code set up a web service named "demoservice". Notice that the same workspace\_id and authorization\_token are used as in the data-reading step.

In [6]:
from azureml import services
@services.publish('b2bbeb56a1d04e1599d2510a06c59d87', 'a3978d933cd84e64ab583a616366d160')
@services.types(crim=float, zn=float, indus=float, chas=float, nox=float, rm=float, age=float, 
                dis=float, rad=float, tax=float, ptratio=float, black=float, lstat=float)
@services.returns(float)
def demoservice(crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, black, lstat):
    # predict the label
    feature_vector = [crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, black, lstat]
    return lm.predict(feature_vector)


### 4.2 Consume a web service
After running the above code to set up the web service, we can call it directly, as in the following example. Calling the service this way works only during the current session.

In [7]:
demoservice(0.00632, 18, 2.31, 0, 0.538, 6.575, 65.2, 4.09, 1, 296, 15.3, 396.9, 4.98)

array([ 30.00384338])

To consume the web service outside the current session, we can use a Python script that specifies the service url and the api\_key. The following lines of code print the url, the api\_key, the help\_url, and the service\_id.

In [8]:
demoservice.service.url

u'https://ussouthcentral.services.azureml.net/workspaces/b2bbeb56a1d04e1599d2510a06c59d87/services/ecd2564ad6e042dda752f127ff71efea/execute?api-version=2.0'

In [9]:
demoservice.service.api_key

u'PkNHYOO55Q5pZ2XLW5mTPXXoMJTzjCmItEvixlLCLqEXzCeO53TBOhaBpLrEOlCN1zKP45Ij/rMTopdivEQaXQ=='

In [10]:
demoservice.service.help_url

u'https://studio.azureml.net/apihelp/workspaces/b2bbeb56a1d04e1599d2510a06c59d87/webservices/639f774c52e0429181a62d4b5ccbf353/endpoints/ecd2564ad6e042dda752f127ff71efea/score'

In [11]:
demoservice.service.service_id

'639f774c52e0429181a62d4b5ccbf353'

The help\_url page contains, among other things, sample code written in C#, Python, and R for consuming the web service. We'll use the Python sample code as a starting point in this step. Copy the help\_url value into a new browser tab and open the web page. Scroll down the page until you see the section "Sample Code" as in [Figure 3][pic 3]. Click on the Python tab, copy the code, and paste it below. Then make the following two changes: 1) replace the api\_key with the value printed above, and 2) enter the values for the first two records. Notice that the order of the columns here may be different from the order of the arguments in your definition of the web service, so each value must be placed according to "ColumnNames". After running the code, you should see the predictions.  

[![Python Script][pic 3]][pic 3] Figure 3

[pic 3]: https://cloud.githubusercontent.com/assets/9322661/10396166/f67a4396-6e6e-11e5-85ef-7ec5c59df283.PNG
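
Because the column order expected by the service can differ from the argument order of `demoservice`, one way to reduce mistakes is to build each row of "Values" from a dict keyed by column name. The sketch below is one possible approach, not the generated sample code: `build_payload` and `build_request` are hypothetical helper names, the URL and key are placeholders, and it is written for Python 3 (`urllib.request`), whereas the notebook cell that follows uses Python 2's `urllib2`. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Column order taken from the "ColumnNames" list in the service's sample code
COLUMN_NAMES = ["crim", "zn", "lstat", "age", "tax", "rad", "black",
                "chas", "nox", "rm", "indus", "ptratio", "dis"]


def build_payload(records, column_names=COLUMN_NAMES):
    """Arrange each record (a dict keyed by column name) in the service's column order."""
    values = [[str(record[name]) for name in column_names] for record in records]
    return {
        "Inputs": {"input1": {"ColumnNames": list(column_names), "Values": values}},
        "GlobalParameters": {},
    }


def build_request(url, api_key, payload):
    """Create (but do not send) the HTTP request for the web service."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json",
               "Authorization": "Bearer " + api_key}
    return urllib.request.Request(url, body, headers)


# First record of the dataset, written in any convenient key order
first_record = {"crim": 0.00632, "zn": 18, "indus": 2.31, "chas": 0, "nox": 0.538,
                "rm": 6.575, "age": 65.2, "dis": 4.09, "rad": 1, "tax": 296,
                "ptratio": 15.3, "black": 396.9, "lstat": 4.98}

payload = build_payload([first_record])
# Placeholder URL and key: substitute the values printed by the notebook
request = build_request("https://example.invalid/execute?api-version=2.0",
                        "PLACEHOLDER_KEY", payload)
print(payload["Inputs"]["input1"]["Values"][0])
```

Keeping the column list in one place means a change in the service's expected order only needs to be made once.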

In [12]:
import urllib2
# If you are using Python 3+, import urllib.request instead of urllib2

import json

# Two input records. The column order follows the service's sample code,
# not the order of the arguments in the definition of demoservice.
data = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["crim", "zn", "lstat", "age", "tax", "rad", "black",
                            "chas", "nox", "rm", "indus", "ptratio", "dis"],
            "Values": [["0.00632", "18", "4.98", "65.2", "296", "1", "396.9",
                        "0", "0.538", "6.575", "2.31", "15.3", "4.09"],
                       ["0.02731", "0", "9.14", "78.9", "242", "2", "396.9",
                        "0", "0.469", "6.421", "7.07", "17.8", "4.9671"]]
        }
    },
    "GlobalParameters": {}
}

body = str.encode(json.dumps(data))

url = ('https://ussouthcentral.services.azureml.net/workspaces/b2bbeb56a1d04e1599d2510a06c59d87/'
       'services/ecd2564ad6e042dda752f127ff71efea/execute?api-version=2.0')
# Replace this with the API key for the web service
api_key = 'PkNHYOO55Q5pZ2XLW5mTPXXoMJTzjCmItEvixlLCLqEXzCeO53TBOhaBpLrEOlCN1zKP45Ij/rMTopdivEQaXQ=='
headers = {'Content-Type': 'application/json', 'Authorization': ('Bearer ' + api_key)}

req = urllib2.Request(url, body, headers)

try:
    response = urllib2.urlopen(req)

    # If you are using Python 3+, replace urllib2 with urllib.request in the above code:
    # req = urllib.request.Request(url, body, headers)
    # response = urllib.request.urlopen(req)
    # and catch urllib.error.HTTPError instead of urllib2.HTTPError below

    result = response.read()
    print(result)
except urllib2.HTTPError as error:
    print("The request failed with status code: " + str(error.code))

    # Print the headers - they include the request ID and the timestamp,
    # which are useful for debugging the failure
    print(error.info())

    print(json.loads(error.read()))

{"Results":{"output1":{"type":"table","value":{"Values":[["30.0038433770168"],["25.0255623790533"]]}},"output2":{"type":"table","value":{"Values":[["data:text/plain,","data:text/plain,",null]]}}},"ContainerAllocationDurationMs":0,"ExecutionDurationMs":96,"IsWarmContainer":true}
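
The response is a JSON string, so the predicted values can be pulled out with the `json` module. The sketch below parses a copy of the response shown above (with the `output2` entry left out for brevity). `extract_predictions` is a hypothetical helper name, and the path `Results` > `output1` > `value` > `Values` is assumed from this particular response rather than from any documented schema.

```python
import json

# Response text copied from the output above, with "output2" omitted
result = ('{"Results":{"output1":{"type":"table","value":{"Values":'
          '[["30.0038433770168"],["25.0255623790533"]]}}},'
          '"ContainerAllocationDurationMs":0,"ExecutionDurationMs":96,'
          '"IsWarmContainer":true}')


def extract_predictions(response_text):
    """Return the predictions in the service response as a list of floats."""
    parsed = json.loads(response_text)
    rows = parsed["Results"]["output1"]["value"]["Values"]
    # Each row is a one-element list holding the prediction as a string
    return [float(row[0]) for row in rows]


predictions = extract_predictions(result)
print(predictions)
```

The two values correspond to the first two rows of the `predicted` column computed in section 3.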


### 4.3 Re-publish a web service
By default, running the code in section 4.1 multiple times generates a different service each time, each with its own url, api\_key, help\_url, and service\_id. If you just want to re-publish the same service, you need to specify the service\_id as shown below. This way, you can reuse the same consumption code without having to change the url and api\_key.

In [13]:
from azureml import services
@services.publish('b2bbeb56a1d04e1599d2510a06c59d87', 'a3978d933cd84e64ab583a616366d160')
@services.service_id('639f774c52e0429181a62d4b5ccbf353')
@services.types(crim=float, zn=float, indus=float, chas=float, nox=float, rm=float, age=float, 
                dis=float, rad=float, tax=float, ptratio=float, black=float, lstat=float)
@services.returns(float)
def demoservice(crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, black, lstat):
    # predict the label
    feature_vector = [crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, black, lstat]
    return lm.predict(feature_vector)

## 5 Conclusion
Through this example, you've learned how to access data from an Azure ML experiment, fit a model, deploy the model on Azure, and consume the service, all using Azure ML's Jupyter notebook. 

In addition to carrying out the entire model development process in an Azure ML notebook, you can also use the notebook as a supplement to Azure ML experiments. With Azure ML experiments you can develop a wide variety of models and easily compare their performance. However, some tasks cannot readily be accomplished there. Examples include certain feature selection techniques, advanced visualization options, a wide variety of GBM models, and time series analysis. For these tasks, Python offers a good alternative, and Azure ML's Jupyter notebook allows you to write and run Python programs in the cloud.

As another example of using a Jupyter notebook, you can test Python code that will be used in the "Execute Python Script" module in an Azure ML Studio experiment.

---  
Created by a Microsoft Employee.  
Copyright © Microsoft. All Rights Reserved.