# Azure Machine Learning Service Recommender
---
## Introduction to Azure Machine Learning  
The **[Azure Machine Learning service (AzureML)](https://docs.microsoft.com/azure/machine-learning/service/overview-what-is-azure-ml)** provides a cloud-based environment you can use to prep data, train, test, deploy, manage, and track machine learning models. By using Azure Machine Learning service, you can start training on your local machine and then scale out to the cloud. With many available compute targets, like [Azure Machine Learning Compute](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute) and [Azure Databricks](https://docs.microsoft.com/en-us/azure/azure-databricks/what-is-azure-databricks), and with [advanced hyperparameter tuning services](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-tune-hyperparameters), you can build better models faster by using the power of the cloud.

Data scientists and AI developers use the main [Azure Machine Learning Python SDK](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/intro?view=azure-ml-py) to build and run machine learning workflows with the Azure Machine Learning service. You can interact with the service in any Python environment, including Jupyter Notebooks or your favorite Python IDE. The Azure Machine Learning SDK allows you the choice of using local or cloud compute resources, while managing and maintaining the complete data science workflow from the cloud.
![AzureML Workflow](https://docs.microsoft.com/en-us/azure/machine-learning/service/media/overview-what-is-azure-ml/aml.png)

### Advantages of using AzureML:
- Manage cloud resources for monitoring, logging, and organizing your machine learning experiments.
- Train models either locally or by using cloud resources, including GPU-accelerated model training.
- Scale out easily as the dataset grows by creating a new compute target and pointing the run at it.

---
## Criteo Data
* Criteo is a personalized retargeting company that works with Internet retailers to serve personalized online display advertisements to consumers who have previously visited the advertiser's website.
* The company provided an obfuscated data set for an online Machine Learning competition through Kaggle.
* We added mocked-up headers to simulate the types of data that would be in the original data set:
     * Google Analytics or other website metrics
     * Demographic information
     * Product information
---
## LightGBM: A Highly Efficient Gradient Boosting Decision Tree
This notebook shows how to train a LightGBM-based model on the Criteo dataset to estimate click-through rates for e-commerce advertisements.

[LightGBM](https://github.com/Microsoft/LightGBM) is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient with the following advantages:
* Fast training speed and high efficiency.
* Low memory usage.
* Great accuracy.
* Support of parallel and GPU learning.
* Capable of handling large-scale data.
---
## Prerequisites
   - **Azure Subscription**
     - If you don’t have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning service today](https://azure.microsoft.com/en-us/free/services/machine-learning/).
     - You get credits to spend on Azure services, which will easily cover the cost of running this example notebook. After they're used up, you can keep the account and use [free Azure services](https://azure.microsoft.com/en-us/free/). Your credit card is never charged unless you explicitly change your settings and ask to be charged. Or [activate MSDN subscriber benefits](https://azure.microsoft.com/en-us/pricing/member-offers/credit-for-visual-studio-subscribers/), which give you credits every month that you can use for paid Azure services.
--- 

## Walkthrough

#### Set up the Development Environment
* Initialize the Workspace
* Initialize an Experiment
* Create a directory for Python code
* Create a Compute Cluster
* Environment Setup and Creation

#### Train the Predictive Model
* Create the Training script
* Submit the Training job to the Compute Cluster
* Register the Model

#### Deploy the Model to AMLS
* Create the Scoring script
* Deploy in Azure Container Instance
* Test the Deployed Service
---

## Set up Development Environment

### Initialize the Workspace

* Import base Azure ML packages
* Check the SDK version
* Connect to the workspace

In [1]:
import azureml.core
from azureml.core import Workspace

In [2]:
# check core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.0.55


In [3]:
# load workspace configuration from the config.json file in the current folder.
ws = Workspace.from_config()
print(ws.name, ws.location, ws.resource_group, ws.location, sep='\t')

Performing interactive authentication. Please follow the instructions on the terminal.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code BWYDESS3J to authenticate.




Interactive authentication successfully completed.
aiwebinaramls	westus2	aawebinar-demo-rg	westus2


### Initialize an Experiment

In [4]:
experiment_name = 'recommender'

from azureml.core import Experiment
exp = Experiment(workspace=ws, name=experiment_name)

### Create a directory for the Training script and any custom Python code

In [5]:
#Code Directory
import os
script_folder = os.path.join(os.getcwd(), "MLWebinarDemo")
os.makedirs(script_folder, exist_ok=True)

ds = ws.get_default_datastore()
print(ds.datastore_type, ds.account_name, ds.container_name)

### Create or Attach an existing compute resource

I've added two sets of code to create the Compute Cluster. The first cell is a simple version that uses defaults to create the cluster. The second cell is an example of a more configurable version.

In [7]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# Choose a name for your CPU cluster
cpu_cluster_name = "MLWebinarDemo"

# Verify that cluster does not exist already
try:
    cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
                                                           max_nodes=4)
    cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)

cpu_cluster.wait_for_completion(show_output=True)

Found existing cluster, use it.
Succeeded
AmlCompute wait for completion finished
Minimum number of nodes requested have been provisioned
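
A more configurable cluster definition might look like the sketch below. It is an illustration under assumptions, not the author's original cell: the values in `cluster_settings` and the helper name `get_or_create_cluster` are hypothetical, while the keyword names are parameters of `AmlCompute.provisioning_configuration` in the v1 SDK. The SDK imports sit inside the function so the settings can be inspected without a workspace connection.

```python
# Hypothetical settings for a more configurable cluster; tune them for your workload.
cluster_settings = {
    "vm_size": "STANDARD_D2_V2",            # same VM size as the simple version
    "vm_priority": "dedicated",             # "lowpriority" costs less but can be preempted
    "min_nodes": 0,                         # scale down to zero nodes when idle
    "max_nodes": 4,
    "idle_seconds_before_scaledown": 1800,  # keep idle nodes for 30 minutes
}


def get_or_create_cluster(ws, name, settings):
    """Return the named compute target, creating it from `settings` if it is missing."""
    # Imported lazily so this cell loads even where the Azure ML SDK is not installed.
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        cluster = ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(**settings)
        cluster = ComputeTarget.create(ws, name, config)
        cluster.wait_for_completion(show_output=True)
    return cluster
```

With a connected workspace, `get_or_create_cluster(ws, cpu_cluster_name, cluster_settings)` would return the existing cluster or provision a new one. Setting `min_nodes` to 0 trades a slower first job for no idle compute charges.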


### Environment Setup and Creation

This section defines the environment that the training job and the web service run in. Dependencies fall into two groups, Conda dependencies and pip dependencies, and each package must be listed under the source it is actually installed from.

In [8]:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

myenv = Environment("myenv")

myenv.docker.enabled = True
myenv.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn',
                                                                           'pandas',
                                                                           'numpy',
                                                                           'seaborn',
                                                                           'category_encoders',
                                                                           'lightgbm',
                                                                           'papermill'
                                                                          ])
myenv.python.conda_dependencies.add_pip_package("inference-schema[numpy-support]")
myenv.python.conda_dependencies.save_to_file(".", "myenv.yml")

'myenv.yml'

## Train the Predictive Model

### Create Training Script

This section creates a training script to be used by the Experiment to build the Machine Learning Model. The output is a pickle file that is used to create a web service for making predictions using the ML model.

In [9]:
%%writefile $script_folder/train.py
import argparse
import sys, os
sys.path.append("../../")
import numpy as np
import lightgbm as lgb
import papermill as pm
import pandas as pd
import category_encoders as ce
from tempfile import TemporaryDirectory
from sklearn.metrics import roc_auc_score, log_loss
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

import reco_utils.recommender.lightgbm.lightgbm_utils as lgb_utils
import reco_utils.dataset.criteo as criteo
import pickle

from azureml.core import Run

# let the user pass 2 parameters: the location of the data files (from the datastore) and a regularization rate; neither is used by the LightGBM training code below
parser = argparse.ArgumentParser()
parser.add_argument('--data-folder', type=str, dest='data_folder', help='data folder mounting point')
parser.add_argument('--regularization', type=float, dest='reg', default=0.01, help='regularization rate')
args = parser.parse_args()

MAX_LEAF = 64
MIN_DATA = 20
NUM_OF_TREES = 100
TREE_LEARNING_RATE = 0.15
EARLY_STOPPING_ROUNDS = 20
METRIC = "auc"
SIZE = "sample"

params = {
    'task': 'train',
    'boosting_type': 'gbdt',
    'num_class': 1,
    'objective': "binary",
    'metric': METRIC,
    'num_leaves': MAX_LEAF,
    'min_data': MIN_DATA,
    'boost_from_average': True,
    #set it according to your cpu cores.
    'num_threads': 20,
    'feature_fraction': 0.8,
    'learning_rate': TREE_LEARNING_RATE,
}

nume_cols = ["I" + str(i) for i in range(1, 14)]
cate_cols = ["C" + str(i) for i in range(1, 27)]
label_col = "Label"

header = [label_col] + nume_cols + cate_cols
with TemporaryDirectory() as tmp:
    all_data = criteo.load_pandas_df(size=SIZE, local_cache_path=tmp, header=header)

# split data to 3 sets    
length = len(all_data)
train_data = all_data.loc[:0.8*length-1]
valid_data = all_data.loc[0.8*length:0.9*length-1]
test_data = all_data.loc[0.9*length:]

ord_encoder = ce.ordinal.OrdinalEncoder(cols=cate_cols)

def encode_csv(df, encoder, label_col, typ='fit'):
    if typ == 'fit':
        df = encoder.fit_transform(df)
    else:
        df = encoder.transform(df)
    y = df[label_col].values
    del df[label_col]
    return df, y

train_x, train_y = encode_csv(train_data, ord_encoder, label_col)
valid_x, valid_y = encode_csv(valid_data, ord_encoder, label_col, 'transform')
test_x, test_y = encode_csv(test_data, ord_encoder, label_col, 'transform')

lgb_train = lgb.Dataset(train_x, train_y.reshape(-1), params=params, categorical_feature=cate_cols)
lgb_valid = lgb.Dataset(valid_x, valid_y.reshape(-1), reference=lgb_train, categorical_feature=cate_cols)
lgb_test = lgb.Dataset(test_x, test_y.reshape(-1), reference=lgb_train, categorical_feature=cate_cols)
lgb_model = lgb.train(params,
                      lgb_train,
                      num_boost_round=NUM_OF_TREES,
                      early_stopping_rounds=EARLY_STOPPING_ROUNDS,
                      valid_sets=lgb_valid,
                      categorical_feature=cate_cols)

label_col = 'Label'
num_encoder = lgb_utils.NumEncoder(cate_cols, nume_cols, label_col)
train_x, train_y = num_encoder.fit_transform(train_data)
valid_x, valid_y = num_encoder.transform(valid_data)
test_x, test_y = num_encoder.transform(test_data)
del num_encoder

lgb_train = lgb.Dataset(train_x, train_y.reshape(-1), params=params)
lgb_valid = lgb.Dataset(valid_x, valid_y.reshape(-1), reference=lgb_train)
lgb_model = lgb.train(params,
                      lgb_train,
                      num_boost_round=NUM_OF_TREES,
                      early_stopping_rounds=EARLY_STOPPING_ROUNDS,
                      valid_sets=lgb_valid)

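# Save the trained model to the working directory
with open('recommender.pkl', 'wb') as f:
    pickle.dump(lgb_model, f)

print('Import the model from recommender.pkl')
with open('recommender.pkl', 'rb') as f2:
    clf2 = pickle.load(f2)

# Sanity-check the reloaded model on one encoded test row, which has the feature layout used in training
X_new = test_x[:1]
print('New Sample:', X_new)
print('Predicted click probability:', clf2.predict(X_new))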

os.makedirs('outputs', exist_ok=True)
# note file saved in the outputs folder is automatically uploaded into experiment record
joblib.dump(value=lgb_model, filename='outputs/recommender.pkl')

Overwriting /mnt/azmnt/code/Users/AzureMLWebinar/MLWebinarDemo/train.py
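
The training script splits the data in row order into 80% training, 10% validation, and 10% test, and it fits the ordinal encoder on the training slice only before applying it to the other two. The sketch below reproduces that pattern in plain Python. `split_bounds` and `SimpleOrdinalEncoder` are hypothetical stand-ins written for illustration; they are not the pandas or `category_encoders` APIs, and the `-1` code for unseen categories is this sketch's own choice.

```python
def split_bounds(n_rows):
    """Return (train_end, valid_end) row positions for an ordered 80/10/10 split."""
    return n_rows * 8 // 10, n_rows * 9 // 10


class SimpleOrdinalEncoder:
    """Toy encoder: map each category to 1, 2, 3, ... in first-seen order."""

    UNKNOWN = -1  # code for categories that were not seen during fit

    def __init__(self):
        self.mapping = {}

    def fit(self, values):
        self.mapping = {}
        for value in values:
            if value not in self.mapping:
                self.mapping[value] = len(self.mapping) + 1
        return self

    def transform(self, values):
        return [self.mapping.get(value, self.UNKNOWN) for value in values]


rows = ["a", "b", "a", "c", "b", "a", "b", "a", "d", "c"]
train_end, valid_end = split_bounds(len(rows))
train, valid, test = rows[:train_end], rows[train_end:valid_end], rows[valid_end:]

# Fit on the training slice only, then reuse the same mapping everywhere else.
encoder = SimpleOrdinalEncoder().fit(train)
encoded_train = encoder.transform(train)
encoded_valid = encoder.transform(valid)  # "d" never appeared in training
encoded_test = encoder.transform(test)

print(encoded_train, encoded_valid, encoded_test)
```

Fitting on the training slice alone keeps information from the validation and test rows out of the encoding, which is why `encode_csv` distinguishes between `'fit'` and `'transform'`.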


### Submit the job to the Compute Cluster

Run the experiment by submitting a `ScriptRunConfig` object. You can monitor the run in the Azure portal.

In [10]:
from azureml.core import ScriptRunConfig
from azureml.core.runconfig import DEFAULT_CPU_IMAGE

src = ScriptRunConfig(source_directory=script_folder, script='train.py')

# Set compute target to the one created in previous step
src.run_config.target = cpu_cluster.name

# Set environment
src.run_config.environment = myenv
 
run = exp.submit(config=src)
run

| Experiment | Id | Type | Status | Details Page | Docs Page |
|---|---|---|---|---|---|
| recommender | recommender_1566523359_1e3191c9 | azureml.scriptrun | Starting | Link to Azure Portal | Link to Documentation |


Model training happens in the background. You can use `wait_for_completion` to block and wait until the model has completed training before running more code.

In [11]:
%%time
# specify show_output to True for a verbose log
run.wait_for_completion(show_output=True) 

RunId: recommender_1566523359_1e3191c9
Web View: https://mlworkspace.azure.ai/portal/subscriptions/d33098f5-4a90-4784-be1a-db39e054a7b0/resourceGroups/aawebinar-demo-rg/providers/Microsoft.MachineLearningServices/workspaces/aiwebinaramls/experiments/recommender/runs/recommender_1566523359_1e3191c9

Streaming azureml-logs/20_image_build_log.txt

2019/08/23 01:22:48 Downloading source code...
2019/08/23 01:22:56 Finished downloading source code
2019/08/23 01:22:57 Using acb_vol_d1e55aa5-dec2-4411-93c5-2743fb1fefca as the home volume
2019/08/23 01:22:57 Creating Docker network: acb_default_network, driver: 'bridge'
2019/08/23 01:22:58 Successfully set up Docker network: acb_default_network
2019/08/23 01:22:58 Setting up Docker configuration...
2019/08/23 01:22:59 Successfully set up Docker configuration
2019/08/23 01:22:59 Logging in to registry: aiwebinaramla84350bc.azurecr.io
2019/08/23 01:23:01 Successfully logged into aiwebinaramla84350bc.azurecr.io
2019/08/23 01:23:01 Executing step 

{'endTimeUtc': '2019-08-23T01:57:24.044967Z',
 'logFiles': {'azureml-logs/20_image_build_log.txt': 'https://aiwebinaramls4449883837.blob.core.windows.net/azureml/ExperimentRun/dcid.recommender_1566523359_1e3191c9/azureml-logs/20_image_build_log.txt?sv=2018-11-09&sr=b&sig=sPSY%2B2e7nbsJUUiKc5feAH1a58rQRR22YIPxMHkeVvg%3D&st=2019-08-23T01%3A47%3A24Z&se=2019-08-23T09%3A57%3A24Z&sp=r',
  'azureml-logs/55_batchai_execution.txt': 'https://aiwebinaramls4449883837.blob.core.windows.net/azureml/ExperimentRun/dcid.recommender_1566523359_1e3191c9/azureml-logs/55_batchai_execution.txt?sv=2018-11-09&sr=b&sig=a8rmMpdR6u6%2BWFxaGC9aWfRO%2FSzrkJsqT%2FKl6SsCy3E%3D&st=2019-08-23T01%3A47%3A24Z&se=2019-08-23T09%3A57%3A24Z&sp=r',
  'azureml-logs/55_batchai_stdout-job_post.txt': 'https://aiwebinaramls4449883837.blob.core.windows.net/azureml/ExperimentRun/dcid.recommender_1566523359_1e3191c9/azureml-logs/55_batchai_stdout-job_post.txt?sv=2018-11-09&sr=b&sig=3%2FiR3WWQcSRsdF0Gx406qeaKnBAdTVKrCbhreT6kZLk%3D&st=
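
As an alternative to blocking in `wait_for_completion`, a run can be polled at intervals so other work can happen in between. The following is a minimal sketch under assumptions: `poll_until_done` is a hypothetical helper, the set of terminal states is limited to the three listed, and the status source is simulated here. With a real run you would pass `run.get_status` as the callable.

```python
import time

# Statuses treated as final in this sketch.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def poll_until_done(get_status, interval_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Call get_status() until it returns a terminal state or max_polls is reached."""
    status = get_status()
    for _ in range(max_polls - 1):
        if status in TERMINAL_STATES:
            break
        sleep(interval_seconds)
        status = get_status()
    return status


# Simulated status source standing in for `run.get_status`.
fake_statuses = iter(["Starting", "Running", "Running", "Completed"])
final = poll_until_done(lambda: next(fake_statuses), interval_seconds=0)
print(final)
```

The `max_polls` cap keeps the loop from running forever if the run never reaches a terminal state.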

In [13]:
print(run.get_file_names())

['azureml-logs/20_image_build_log.txt', 'azureml-logs/55_batchai_execution.txt', 'azureml-logs/55_batchai_stdout-job_post.txt', 'azureml-logs/55_batchai_stdout-job_prep.txt', 'azureml-logs/55_batchai_stdout.txt', 'azureml-logs/56_batchai_stderr.txt', 'azureml-logs/70_driver_log.txt', 'logs/azureml/130_azureml.log', 'logs/azureml/azureml.log', 'outputs/recommender.pkl']


### Register model

Register the model in the workspace so that you (or other collaborators) can later query, examine, and deploy this model.

In [14]:
# register model 
model = run.register_model(model_name='recommender', model_path='outputs/recommender.pkl')
print(model.name, model.id, model.version, sep='\t')

recommender	recommender:7	7


In [15]:
from azureml.core import Workspace
from azureml.core.model import Model
import os 
ws = Workspace.from_config()
model=Model(ws, 'recommender')

model.download(target_dir=os.getcwd(), exist_ok=True)

# verify the downloaded model file
file_path = os.path.join(os.getcwd(), "recommender.pkl")

os.stat(file_path)

os.stat_result(st_mode=33279, st_ino=10520424122700267520, st_dev=45, st_nlink=1, st_uid=0, st_gid=0, st_size=213424, st_atime=1564433037, st_mtime=1566525446, st_ctime=1566525446)

## Deploy the Model to AMLS

### Create and write out Scoring script

Create the scoring script, score.py. The web service calls this script to load the model and to produce predictions from incoming requests.

You must include two required functions into the scoring script:
* The `init()` function, which typically loads the model into a global object. This function is run only once when the Docker container is started. 

* The `run(input_data)` function uses the model to predict a value based on the input data. Inputs and outputs to the run typically use JSON for serialization and de-serialization, but other formats are supported.
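
The `init()` and `run()` contract can be tried locally before deploying. The sketch below is a simplified illustration, not the scoring script itself: `StubModel` is a hypothetical stand-in for the unpickled model, and `run` decodes the JSON on its own because the `inference_schema` decorators are left out.

```python
import json

model = None  # populated by init(), mirroring the global used in score.py


class StubModel:
    """Hypothetical stand-in for the unpickled LightGBM model."""

    def predict(self, rows):
        # One fake score per row (the row mean); not a real prediction.
        return [sum(row) / len(row) for row in rows]


def init():
    """Runs once when the container starts: load the model into a global."""
    global model
    model = StubModel()


def run(raw_data):
    """Runs per request: decode JSON, predict, return a JSON-serializable result."""
    try:
        rows = json.loads(raw_data)["data"]
        return model.predict(rows)
    except Exception as e:
        # Return the error text instead of raising, as the scoring script does.
        return str(e)


init()
print(run(json.dumps({"data": [[1.0, 2.0, 3.0]]})))
print(run("not json"))
```

Calling `init()` once and then `run()` repeatedly mimics the container lifecycle, which makes it easy to catch serialization problems before building an image.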

In [16]:
%%writefile score.py
import json
import numpy as np
import os
import pickle
import pandas as pd
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType

from azureml.core.model import Model

def init():
    global model
    model_path = Model.get_model_path('recommender')
    model = joblib.load(model_path)

input_sample = pd.DataFrame(data=[{"GA_visitStartTime": 5.5,
                                    "GA_date": 5,
                                    "GA_bounces": 5.5,
                                    "GA_hits": 5.5,
                                    "GA_newVisits": 5.5,
                                    "GA_pageviews": 5.5,
                                    "GA_screenviews": 5.5,
                                    "GA_timeOnScreen": 5.5,
                                    "GA_timeOnSite": 5.5,
                                    "GA_totalTransactionRevenue": 5.5,
                                    "GA_transactionRevenue": 5.5,
                                    "GA_transactions": 5.5,
                                    "GA_UniqueScreenViews": 5.5,
                                    "GA_TrafficSource": 5,
                                    "GA_NetworkType": 5,
                                    "GA_isVideoAd": 5,
                                    "GA_Campaign": 5,
                                    "GA_Keyword": 5,
                                    "GA_Medium": 5,
                                    "GA_Source": 5,
                                    "GA_Channel": 5,
                                    "CL_HHIncome": 5,
                                    "CL_HHComposition": 5,
                                    "CL_LifestageGroup": 5,
                                    "CL_SocialGroup": 5,
                                    "CL_Urbanicity": 5,
                                    "CL_Homeownership": 5,
                                    "CL_EmploymentLevels": 5,
                                    "CL_Education": 5,
                                    "AC_Product": 5,
                                    "AC_ProductCategory": 5,
                                    "AC_ProductLine": 5,
                                    "AC_Brand": 5,
                                    "AC_Banner": 5,
                                    "AC_Vendor": 5,
                                    "AC_ProductType": 5,
                                    "AC_LineOfBusiness": 5,
                                    "AC_Vertical": 5,
                                    "AC_Merchant": 5
                                  }])

output_sample = np.array([0])

@input_schema('data', PandasParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    try:
        result = model.predict(data)
        # you can return any datatype as long as it is JSON-serializable
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error

Overwriting score.py


### Deploy in Azure Container Instance (ACI)

Configure the image and deploy. The following code goes through these steps:

1. Build an image using:
   * The scoring file
   * The environment file
   * The model file
1. Register that image under the workspace. 
1. Send the image to the ACI container.
1. Start up a container in ACI using the image.
1. Get the web service HTTP endpoint.

In [17]:
from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags={"data": "LightGBM",  "method" : "sklearn"}, 
                                               description='Predict Click Through Rate with LightGBM')

In [18]:
%%time
from azureml.core.webservice import Webservice
from azureml.core.image import ContainerImage
from azureml.exceptions import WebserviceException

# configure the image
image_config = ContainerImage.image_configuration(execution_script="score.py", 
                                                  runtime="python", 
                                                  conda_file="myenv.yml")

service_name = 'recommender'

# delete service if it exists
try:
    service = Webservice(ws, name=service_name)
    if service:
        service.delete()
except WebserviceException as e:
    # Typically raised when no service with this name exists yet, so there is nothing to delete
    print(e)
    
service = Webservice.deploy_from_model(workspace=ws, 
                                       name=service_name, 
                                       deployment_config=aciconfig, 
                                       models=[model], 
                                       image_config=image_config)

service.wait_for_deployment(show_output=True)

Creating image
Running....................................................................
Succeeded
Image creation operation finished for image recommender:7, operation "Succeeded"
Creating service
Running.................
SucceededACI service creation operation finished, operation "Succeeded"
CPU times: user 751 ms, sys: 48.4 ms, total: 799 ms
Wall time: 7min 19s


In [19]:
print(service.scoring_uri)

http://4696fb6a-a1df-42c0-acde-9e8c5eda4459.westus2.azurecontainer.io/score


### Test deployed service

With the web service running, you can now test the deployed model by sending it a sample request.

The following code goes through these steps:
1. Send the data as a JSON array to the web service hosted in ACI.
1. Post the request with the `requests` library. You can also invoke the service through the SDK's `run` API or make raw calls with any HTTP tool such as curl.
1. Print the status code, the elapsed time, and the returned predictions.

In [20]:
import requests
import json

headers = {'Content-Type':'application/json'}

if service.auth_enabled:
    headers['Authorization'] = 'Bearer '+service.get_keys()[0]

print(headers)
    
test_sample = json.dumps({'data': [[1.0,1,5.0,0.0,1382.0,4.0,15.0,2.0,181.0,1.0,2.0,3.0,2.0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]]})

response = requests.post(service.scoring_uri, data=test_sample, headers=headers)
print(response.status_code)
print(response.elapsed)
print(response.json())

{'Content-Type': 'application/json'}
200
0:00:00.095279
[0.5410261685734266]
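
Each row in the request carries one value per feature: 13 numeric plus 26 categorical columns, which matches the 39 entries in `input_sample`. A small helper can check the row length and assemble the headers before anything is sent. This is a sketch under assumptions: `build_request` and `N_FEATURES` are hypothetical names, and the check only counts values without validating their types.

```python
import json

N_FEATURES = 13 + 26  # numeric plus categorical Criteo columns, as in train.py


def build_request(rows, api_key=None, n_features=N_FEATURES):
    """Return (headers, body) for a scoring request, rejecting malformed rows early."""
    for i, row in enumerate(rows):
        if len(row) != n_features:
            raise ValueError(f"row {i} has {len(row)} values, expected {n_features}")
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Key-based authentication, as used when service.auth_enabled is True.
        headers["Authorization"] = "Bearer " + api_key
    return headers, json.dumps({"data": rows})


headers, body = build_request([[1.0] * N_FEATURES])
print(headers)
```

The returned pair can be passed straight to `requests.post(service.scoring_uri, data=body, headers=headers)`. Failing locally on a short row gives a clearer message than an error string returned by the service.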
