Please ensure you have run all previous notebooks in sequence before running this.

Let's assume the last exercise generated a model we are happy with.  We'd now like to deploy it for inferencing.  In this case we want singleton scoring, not batch: when a new "row" is available, we want to call something like a webservice to "score" it.  Databricks (DBX) isn't a great fit for that, so instead we'll deploy our model to something better suited to inferencing: Azure Container Instances (ACI).  

We are going to deploy our model to ACI programmatically.

In [3]:
from azureml.core import Workspace
import azureml.core

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

#'''
ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')
#'''

In [4]:
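# NOTE: service deployment always gets the model from the current working dir.
import os

model_name = "BikeBuyer.mml" # OR BikeBuyer_runHistory.mml
model_name_dbfs = os.path.join("/dbfs", model_name) # local-file view of the DBFS copy (not used below)

print("copy model from dbfs to local")
# dbutils.fs treats a path without a scheme as DBFS; the "file:" prefix targets the driver's local disk
model_local = "file:" + os.getcwd() + "/" + model_name
dbutils.fs.cp(model_name, model_local, True) # True = recursive, since the saved model is a directory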

In [5]:
# Register the model
from azureml.core.model import Model
mymodel = Model.register(model_path = model_name, # local path of the model copied above
                       model_name = model_name, # name the model is registered under; the same string is reused for both
                       description = "ADB trained model by me",
                       workspace = ws)

print(mymodel.name, mymodel.description, mymodel.version)


The model can now be seen in the Azure Portal.  

What happens if you run the above cell again?  When might this be useful?  

In the next cell we are going to _manually_ create a Python scoring file.  This is a slightly bizarre way to do it.  Why do it this way?  Mostly to speed up the workshop.  We build the Python source as a multi-line string and write it to disk, which is a little easier than having to upload a separate scoring file.  

## What is a scoring file?  

A scoring file is the Python script that takes input (in this case JSON), loads our model, runs it, and then does something with the output.  In this notebook we will package the scoring file together with our model and a conda dependencies file, and deploy it all to ACI.
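The shape of a scoring file can be sketched without Spark or Azure ML. The example below is only an illustration of the `init()` / `run()` contract: the lambda standing in for the model and the `Age` column are hypothetical, not part of the workshop data or the real scoring script.

```python
import json

model = None  # set once by init(), reused by every run() call


def init():
    """Called once when the service starts: load the model into memory."""
    global model
    # Hypothetical stand-in for loading a real model:
    # predicts 1.0 when the (made-up) "Age" column exceeds 40.
    model = lambda row: 1.0 if row.get("Age", 0) > 40 else 0.0


def run(input_json):
    """Called once per request: parse the input, score it, return JSON text."""
    try:
        # The payload is a JSON array whose items are themselves JSON strings.
        rows = [json.loads(r) for r in json.loads(input_json)]
        result = ",".join(str(model(r)) for r in rows)
    except Exception as e:
        # Errors are returned to the caller as text rather than raised.
        result = str(e)
    return json.dumps({"result": result})


init()
sample = json.dumps([json.dumps({"Age": 50}), json.dumps({"Age": 30})])
print(run(sample))
```

Keeping the expensive load in `init()` means each request only pays for parsing and scoring.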

In [7]:
#%%writefile score_sparkml.py
score_sparkml = """

import json

def init():
    try:
        # One-time initialization of PySpark and predictive model
        import pyspark
        from azureml.core.model import Model
        from pyspark.ml import PipelineModel
        
        global trainedModel
        global spark
        
        spark = pyspark.sql.SparkSession.builder.appName("ADB and AML notebook by Darwin").getOrCreate()
        model_name = "{model_name}" #interpolated
        model_path = Model.get_model_path(model_name)
        trainedModel = PipelineModel.load(model_path)
    except Exception as e:
        trainedModel = e
    
def run(input_json):
    if isinstance(trainedModel, Exception):
        return json.dumps({{"trainedModel":str(trainedModel)}})
      
    try:
        sc = spark.sparkContext
        input_list = json.loads(input_json)
        input_rdd = sc.parallelize(input_list)
        input_df = spark.read.json(input_rdd)
    
        # Compute prediction
        prediction = trainedModel.transform(input_df)
        #result = prediction.first().prediction
        predictions = prediction.collect()

        #Get each scored result
        preds = [str(x['prediction']) for x in predictions]
        result = ",".join(preds)
    except Exception as e:
        result = str(e)
    return json.dumps({{"result":result}})
    
""".format(model_name=model_name)

exec(score_sparkml)

with open("score_sparkml.py", "w") as file:
    file.write(score_sparkml)
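One detail of the cell above that is easy to trip over: because the source is passed through `str.format`, every literal brace in the generated Python has to be doubled (`{{` and `}}`), while single braces such as `{model_name}` are substituted. A minimal sketch of the same trick, using a made-up one-function template rather than the real scoring script:

```python
import json

# Doubled braces survive .format() as literal braces;
# {model_name} is replaced with the supplied value.
template = """
import json

def run(input_json):
    return json.dumps({{"model": "{model_name}", "echo": json.loads(input_json)}})
"""

source = template.format(model_name="BikeBuyer.mml")

# Executing the generated source in a scratch namespace is a cheap check
# that it is at least valid Python before it is written to disk.
namespace = {}
exec(source, namespace)

result = json.loads(namespace["run"]("[1, 2]"))
print(result)
```

Running `exec` into a separate dictionary, as in this sketch, keeps the generated `run` out of the notebook's own globals. The workshop cell calls `exec(score_sparkml)` directly, which defines `init` and `run` in the notebook session instead.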

In [8]:
from azureml.core.conda_dependencies import CondaDependencies 

myacienv = CondaDependencies.create(conda_packages=['scikit-learn','numpy','pandas'])

with open("mydeployenv.yml","w") as f:
    f.write(myacienv.serialize_to_string())

In [9]:
#deploy to ACI
from azureml.core.webservice import AciWebservice, Webservice

myaci_config = AciWebservice.deploy_configuration(
    cpu_cores = 1, 
    memory_gb = 1, 
    tags = {'name':'BikeBuyer Databricks Azure ML ACI'}, 
    description = 'This is for ADB and AML example. Azure Databricks & Azure ML SDK demo with ACI by Darwin.')

The above cell is just configuration.  Nothing is actually deployed yet.  

The next cell:

* builds an "image".  This is a standard Docker container image, visible in the Azure Portal under your AMLS workspace.  Note that it is persisted in an Azure Container Registry (ACR) that is attached to your AMLS workspace.  This may not be how you want to do this long-term; you may want to use your corporate ACR instead.  
* records which model is deployed in that image.  
* starts the webservice deployment once image creation succeeds.  After the success message below, you should see the deployment start if you keep refreshing the Azure Portal.  
* produces a scoring URI, which you can find in the Azure Portal once the webservice deployment is complete.  In the real world we would want to put Azure API Management (APIM) in front of the service so we can manage upgrades to the webservice, etc.

In [11]:
# this will take 10-15 minutes to finish

service_name = "bikebuyeraciws"
runtime = "spark-py" 
driver_file = "score_sparkml.py"
my_conda_file = "mydeployenv.yml"

# image creation
from azureml.core.image import ContainerImage
myimage_config = ContainerImage.image_configuration(execution_script = driver_file, 
                                    runtime = runtime, 
                                    conda_file = my_conda_file)

# Webservice creation
myservice = Webservice.deploy_from_model(
  workspace=ws, 
  name=service_name,
  deployment_config = myaci_config,
  models = [mymodel],
  image_config = myimage_config
    )

myservice.wait_for_deployment(show_output=True)

In [12]:
# more information on the above commands
help(ContainerImage)

Suppose we don't want to use the portal to manage our resources...

In [14]:
# List images by ws

for i in ContainerImage.list(workspace = ws):
    print('{}(v.{} [{}]) stored at {} with build log {}'.format(i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))

In [15]:
#for using the Web HTTP API 
print(myservice.scoring_uri)
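The scoring URI can be called from any HTTP client, not just the SDK. The sketch below builds a POST request with the standard library only. It assumes the service accepts a JSON body and has no authentication enabled. The URI and the `build_scoring_request` helper are hypothetical placeholders; in the notebook you would pass `myservice.scoring_uri` and `test_json`.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, payload):
    """Build (but do not send) a POST request carrying a JSON body."""
    return urllib.request.Request(
        scoring_uri,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URI and payload, for illustration only.
request = build_scoring_request("http://example.invalid/score", json.dumps(["{}"]))

# To actually send the request:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```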

Let's test the webservice

In [17]:
import json

#get some sample data
test_data_path = "BikeBuyerTest"
test = spark.read.parquet(test_data_path).limit(5)

test_json = json.dumps(test.toJSON().collect())

print(test_json)

In [18]:
#using data defined above predict if we will sell any bikes
myservice.run(input_data=test_json)
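The scoring script returns every outcome as JSON: predictions as a comma-separated string under `result`, a scoring error as text under the same key, and a model-loading failure under `trainedModel`. A small helper can separate these cases. This is a sketch only: `parse_predictions` is a hypothetical name, and it assumes the response arrives either as a JSON string or as an already-decoded dict.

```python
import json


def parse_predictions(response):
    """Turn a scoring response into a list of floats, or raise on an error."""
    if isinstance(response, str):
        response = json.loads(response)
    if "trainedModel" in response:
        # init() stored an exception instead of a model.
        raise RuntimeError("model failed to load: " + response["trainedModel"])
    raw = response["result"]
    try:
        return [float(p) for p in raw.split(",")]
    except ValueError:
        # run() puts the exception text in "result" when scoring fails.
        raise RuntimeError("scoring failed: " + raw)


print(parse_predictions('{"result": "0.0,1.0,1.0"}'))
```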

In [19]:
#uncomment the next line to delete the web service
#myservice.delete()