# Deploy a Python model as a realtime web service in Machine Learning Server 

***Applies to: Machine Learning Server 9.2***

In this notebook, we use the attitude dataset to build a linear model to predict the `rating` based on other columns. 

## 1. Read in the attitude dataset

From your local machine, let's begin by reading in the data we will use to build our linear model. We will use the dataset `attitude`. 

In [1]:
# -- Import the dataset from the microsoftml package
from microsoftml.datasets.datasets import DataSetAttitude
attitude = DataSetAttitude()

# -- Represent the dataset as a dataframe.
attitude = attitude.as_df().drop('Unnamed: 0', axis = 1).astype('double')

# -- print top rows of data to inspect the data
attitude.head()

Unnamed: 0,rating,complaints,privileges,learning,raises,critical,advance
0,43.0,51.0,30.0,39.0,61.0,92.0,45.0
1,63.0,64.0,51.0,54.0,63.0,73.0,47.0
2,71.0,70.0,68.0,69.0,76.0,86.0,48.0
3,61.0,63.0,45.0,47.0,54.0,84.0,35.0
4,81.0,78.0,56.0,66.0,71.0,83.0,47.0


## 2. Authenticate and initiate the `DeployClient`

There are several ways to authenticate against Machine Learning Server from your local machine. The method you choose should match the authentication method defined by your administrator. Contact your administrator for authentication credentials.

For simplicity, this example uses the local 'admin' account for authentication.  

1. Begin by importing the DeployClient and MLServer classes from the [azureml-model-management-sdk package](https://docs.microsoft.com/en-us/r-server/python-reference/azureml-model-management-sdk/azureml-model-management-sdk) so you can connect to Machine Learning Server (`use=MLServer`).

1. Then, **fill in your own connection details** for the host and context into the corresponding fields. Learn more in the article "[Connecting to Machine Learning Server in Python](https://docs.microsoft.com/en-us/r-server/operationalize/python/how-to-authenticate-in-python)."

In [2]:
# -- Import the DeployClient and MLServer classes from the azureml-model-management-sdk package.
from azureml.deploy import DeployClient
from azureml.deploy.server import MLServer

# -- Define the location of the ML Server --
# -- for local onebox for Machine Learning Server: http://localhost:12800
# -- Replace with connection details to your instance of ML Server. 
HOST = 'http://localhost:12800'
context = ('admin', 'YOUR_ADMIN_PASSWORD')
client = DeployClient(HOST, use=MLServer, auth=context)

You are now authenticated. 

The **DeployClient** can interact with the web service management APIs to deploy, list, consume and so on. 
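
The cell above hardcodes the password for simplicity. As a minimal sketch of one alternative, the credentials tuple could be built from environment variables, with an interactive prompt as a fallback. The variable names `MLSERVER_USER` and `MLSERVER_PASSWORD` and the helper `build_auth_context` are hypothetical names chosen for this illustration; they are not part of the SDK.

```python
import os
from getpass import getpass


def build_auth_context(env=os.environ):
    """Return a (username, password) tuple for DeployClient's `auth` argument.

    MLSERVER_USER and MLSERVER_PASSWORD are hypothetical variable names
    chosen for this sketch; they are not defined by the SDK.
    """
    user = env.get("MLSERVER_USER", "admin")
    password = env.get("MLSERVER_PASSWORD")
    if password is None:
        # Fall back to an interactive prompt so the password is never
        # stored in the notebook itself.
        password = getpass("Password for {}: ".format(user))
    return (user, password)


# Example with an explicit mapping instead of the real environment.
context = build_auth_context(
    {"MLSERVER_USER": "admin", "MLSERVER_PASSWORD": "YOUR_ADMIN_PASSWORD"})
print(context[0])
```

The resulting tuple can then be passed as `auth=context` when constructing the `DeployClient`, exactly as in the cell above.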

## 3. Create and run a linear model locally

Now that you have built the authentication logic into your application, you can interact with the core APIs using functions in azureml-model-management-sdk to create a model and publish it as a web service.

Create a linear model called `model` using the `attitude` dataset we imported earlier. Given the other columns as predictors, this model estimates `rating`.

We use the [rx_lin_mod](https://docs.microsoft.com/en-us/r-server/python-reference/revoscalepy/rx-lin-mod) function from the [revoscalepy package](https://docs.microsoft.com/en-us/r-server/python-reference/revoscalepy/revoscalepy-package) to build the model.

Then, you can make a prediction locally using this model.

In [3]:
# -- import the needed classes and functions
from revoscalepy import rx_lin_mod, rx_predict

# -- using rx_lin_mod from the revoscalepy package,
# -- create a linear model with the `attitude` dataset
df = attitude
form = "rating ~ complaints + privileges + learning + raises + critical + advance"
model = rx_lin_mod(form, df, method = 'regression')

# -- provide some sample inputs to test the model
myData = df.head(1)

# -- predict locally
print(rx_predict(model, myData))

Rows Read: 30, Total Rows Processed: 30, Total Chunk Time: 0.002 seconds 
Computation time: 0.007 seconds.
Rows Read: 1, Total Rows Processed: 1, Total Chunk Time: 0.001 seconds 
   rating_Pred
0    51.110295


Examine the results of the locally executed code. You can compare these results to the results of the web service in this next step.
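
To make the prediction step less opaque, here is a rough sketch of the arithmetic behind a linear model's prediction: the intercept plus the sum of each coefficient multiplied by the corresponding input value. The coefficients below are hypothetical placeholders chosen for readability. They are not the values fitted by `rx_lin_mod`, so the result will not match the output above.

```python
def linear_prediction(intercept, coefficients, row):
    """Return intercept + sum(coefficient * value) over the model's predictors."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)


# Predictors from the first row of the attitude data shown earlier.
row = {"complaints": 51.0, "privileges": 30.0, "learning": 39.0,
       "raises": 61.0, "critical": 92.0, "advance": 45.0}

# HYPOTHETICAL coefficients, for illustration only; not the fitted values.
intercept = 10.0
coefficients = {"complaints": 0.5, "privileges": 0.0, "learning": 0.25,
                "raises": 0.0, "critical": 0.0, "advance": -0.25}

print(linear_prediction(intercept, coefficients, row))
```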

## 4. Publish the model as a realtime web service

Now let's:
+ Serialize the model
+ Publish the linear model as a Python web service

To publish any model as a realtime service, you must serialize the model object using the `rx_serialize_model` function from revoscalepy. Other serialization methods are unsupported.

In [4]:
# Import the needed classes and functions
from revoscalepy import rx_serialize_model

# Serialize the model with rx_serialize_model
s_model = rx_serialize_model(model, realtime_scoring_only=True)

To publish the realtime service, first initiate a `realtimeDefinition` object. You can then annotate the object with other parameters, such as the version, the serialized model, and a description.

Now, publish it as a realtime service to Machine Learning Server. 

In [5]:
service = client.realtime_service("LinModService") \
        .version('1.0') \
        .serialized_model(s_model) \
        .description("This is a realtime model.") \
        .deploy()


**To learn more about deploying a web service into Machine Learning Server, [see here](https://docs.microsoft.com/en-us/machine-learning-server/operationalize/python/how-to-deploy-manage-web-services).**

## 5. Explore and consume the published web service

Now let's:
+ Use the help function to explore the published service. You can call the `help` function on any `azureml-model-management-sdk` function, even the dynamically generated ones, to learn more about it.

+ Print the capabilities that define the service holdings: service name, version, descriptions, inputs, outputs, and the name of the function to be consumed.

+ Predict an outcome.

+ Download the Swagger-based JSON file. This file is auto-generated at deploy time. **You can share it with application developers or any other authenticated users so they can test and consume the service.** Learn more about [exploring and consuming in this notebook](https://github.com/Microsoft/ML-Server-Python-Samples/blob/master/web-services/deploy-consume/Explore_Consume_Python_Web_Services.ipynb).

In [6]:
# help() prints its output directly and returns None, so no print() is needed.
help(service)

Help on LinmodserviceService in module azureml.deploy.server.service object:

class LinmodserviceService(Service)
 |  Service object from metadata.
 |  
 |  Method resolution order:
 |      LinmodserviceService
 |      Service
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, service, http_client)
 |      Constructor
 |      
 |      :param service:
 |      :param http_client:
 |  
 |  __str__(self)
 |      Return str(self).
 |  
 |  batch(self, records, parallel_count=10)
 |      Register a set of input records for batch execution on this service.
 |      
 |      :param records: The `data.frame` or `list` of
 |             input records to execute.
 |      :param parallel_count: Number of threads used to process entries in
 |             the batch. Default value is 10. Please make sure not to use too
 |             high of a number because it might negatively impact performance.
 |      :return: The `Batch` object to control service batching
 |           

Explore all available functions on the service object by calling `capabilities`.

In [7]:
service.capabilities()

{'api': '/api/LinModService/1.0',
 'artifacts': [],
 'creation_time': '2017-10-18T21:55:46.085642',
 'description': 'This is a realtime model.',
 'inputs': [{'name': 'inputData', 'type': 'data.frame'}],
 'inputs_encoded': [{'name': 'inputData', 'type': 'pandas.DataFrame'}],
 'name': 'LinModService',
 'operation_id': 'consume',
 'outputs': [{'name': 'outputData', 'type': 'data.frame'}],
 'outputs_encoded': [{'name': 'outputData', 'type': 'pandas.DataFrame'}],
 'public-functions': {'batch': 'batch(records, parallel_count=10)',
  'capabilities': 'capabilities()',
  'consume': 'consume(self,inputData)',
  'get_batch': 'get_batch(execution_id)',
  'list_batch_execution': 'list_batch_execution()',
  'swagger': 'swagger(json=True)'},
 'published_by': 'admin',
 'runtime': 'Realtime',
 'snapshot_id': 'e787c12e-79e6-4d4a-85d8-59bd9f9a0015',
 'swagger': 'http://localhost:12800/api/LinModService/1.0/swagger.json',
 'version': '1.0'}
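
The capabilities dictionary is plain Python data, so it can be inspected programmatically. The following sketch condenses it into a one-line summary. The helper `summarize_capabilities` is a hypothetical name introduced here, and the dictionary is a subset copied from the output above so the example runs without a server.

```python
def summarize_capabilities(cap):
    """Return a one-line summary of a capabilities dictionary."""
    inputs = ", ".join("{name} ({type})".format(**i) for i in cap["inputs"])
    outputs = ", ".join("{name} ({type})".format(**o) for o in cap["outputs"])
    return "{} v{} [{}]: {} -> {}".format(
        cap["name"], cap["version"], cap["runtime"], inputs, outputs)


# Subset of the dictionary printed above, copied here so the sketch runs
# without a server. With a live service, use cap = service.capabilities().
cap = {
    "name": "LinModService",
    "version": "1.0",
    "runtime": "Realtime",
    "inputs": [{"name": "inputData", "type": "data.frame"}],
    "outputs": [{"name": "outputData", "type": "data.frame"}],
}
print(summarize_capabilities(cap))
```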

Since you are in the same session as the one in which you deployed the service, you can consume it using the _service api_ object returned from `.deploy()`. You can verify that the results are as expected and that they match the results obtained when the model was run locally earlier.

To consume the realtime service, call `.consume` on the realtime service object directly unless an alias was defined. In the case of an alias, call that alias instead. 

In [8]:
# -- Consume the service. Pluck out the named output, outputData. --

print(service.consume(df.head(1)).outputs['outputData'])

   rating_Pred
0    51.110295
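
Rather than comparing the local and remote predictions by eye, you could check them with a tolerance, since floating-point values that pass through serialization may differ in the last digits. This is a minimal sketch; `predictions_match` is a hypothetical helper, and the values are copied from the outputs shown in this notebook.

```python
import math


def predictions_match(local, remote, rel_tol=1e-6):
    """Return True if two equal-length sequences agree within rel_tol."""
    if len(local) != len(remote):
        return False
    return all(math.isclose(a, b, rel_tol=rel_tol)
               for a, b in zip(local, remote))


# Values copied from the local and web service outputs shown in this notebook.
local_pred = [51.110295]
remote_pred = [51.110295]
print(predictions_match(local_pred, remote_pred))
```

With the data frames from this notebook, each argument could be a column converted to a list, for example the `rating_Pred` column via `.tolist()`.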


Now you can grab the Swagger-based JSON file, which defines the service.

In [9]:
# -- Retrieve the URL of the swagger file for this service.
cap = service.capabilities()
swagger_URL = cap['swagger']
print(swagger_URL)

http://localhost:12800/api/LinModService/1.0/swagger.json
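
To hand the Swagger file to someone else, it has to be saved to disk. The sketch below rebuilds the URL from the pattern printed above and defines a download helper that uses only the standard library. Both function names are hypothetical, and the download step assumes the server accepts a bearer token in the `Authorization` header; how you obtain that token depends on your server's authentication setup and is not shown here.

```python
import json
import urllib.request


def swagger_url(host, name, version):
    """Build the Swagger URL following the pattern printed above."""
    return "{}/api/{}/{}/swagger.json".format(host.rstrip("/"), name, version)


def save_swagger(url, path, access_token):
    """Download the Swagger document and write it to `path`.

    ASSUMPTION: the server accepts a bearer token in the Authorization
    header; obtaining `access_token` is outside the scope of this sketch.
    """
    request = urllib.request.Request(
        url, headers={"Authorization": "Bearer " + access_token})
    with urllib.request.urlopen(request) as response:
        document = json.loads(response.read().decode("utf-8"))
    with open(path, "w") as f:
        json.dump(document, f, indent=2)
    return document


# Only the URL is built here; save_swagger needs a running server.
print(swagger_url("http://localhost:12800", "LinModService", "1.0"))
```

The capabilities output also lists a `swagger(json=True)` function on the service object, which may be the simpler route when you are already in an authenticated session.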


### Learn more about how to [list, get, explore and consume web services in this notebook](https://github.com/Microsoft/ML-Server-Python-Samples/blob/master/operationalize/Explore_Consume_Python_Web_Services.ipynb).


## 6. Delete services

You can call `delete_service` on the `DeployClient` object to delete a specific service on the running Machine Learning Server.

In [10]:
# Uncomment this line if you want to delete the service now.
#client.delete_service('LinModService', version='1.0')

True