# Lab04 - Deploying a real-time scoring endpoint with managed compute

Azure Machine Learning lets you set up a **real-time endpoint** for cases where you need an immediate response from your model (online recommendation, fraud detection, etc.).

A real-time solution usually has to run on a properly sized compute target that is up and running 24/7, so you need to choose carefully which compute to deploy on.

*To illustrate this*, in this lab we will:

 - **Train** a model

 - **Test** this model with a local deployment

 - **Deploy** your model for a real-life scenario.


 *More specifically, the deployment part of this Lab will cover*:
- The notion of Azure ML `Environments`, which allow you to deploy in:
    - 1. Reproducible contexts.
    - 2. Customizable contexts.
        - Via custom Docker images
        - Via a requirements.txt file
- [Which compute](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where?tabs=azcli#choose-a-compute-target) to choose, depending on the purpose of your deployment:
    - Local deployment for development purposes.
    - Deployment on scalable environments. We are setting up a real-time service endpoint here, so your model must be usable remotely and respond immediately.


# Exercise 1 - Train a Classifier

## Task 1 - Let's observe and prepare the data

**Import required libraries**

In [None]:
import subprocess
import sys
import os
import pip
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import requests
import sklearn

def installFromNotebook(package):
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])

def upgradeFromNotebook(package):
    subprocess.check_call([sys.executable, "-m", "pip", "install", "--upgrade", package])

#Verification that the SDK is up to date
upgradeFromNotebook("azureml-sdk")
import azureml.core
from azureml.core import Dataset
from azureml.core import Workspace
ws = Workspace.from_config()

#Installing other dependencies
installFromNotebook("plotly")
import plotly.express as px
import plotly.graph_objs as go
import plotly.figure_factory as ff
plt.style.use("classic")

#Create the outputs/models, sources and data folders if they do not exist yet
for folder in ('./outputs/models/', './sources/', './data/'):
    os.makedirs(folder, exist_ok=True)

#Sanity check on versions and Workspace name
print("Libs version control --", "Azure ML SDK Version:", azureml.core.VERSION, "Pandas:", pd.__version__, "pip:", pip.__version__,
"seaborn:", sns.__version__, 'Scikit-learn:', sklearn.__version__, 'Workspace:', ws.name)

# Dataset overview: Covid-19 Case Surveillance Public Use Dataset

## General considerations

https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf

The COVID-19 case surveillance system database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and states. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020 to clarify the interpretation of antigen detection tests and serologic test results within the case classification. The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and shared voluntarily with CDC.

For more information: wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-covid-19/case-definition/2020/08/05/.

The dataset contains 13.4 million rows of deidentified patient data (more than 3 GB). To run this lab smoothly, we will use an extract of 100,000 data points.

## Data points description

| Variable        | Description      | Source        | Values        | Type	        | Calculation (if applicable)      |
| ------|-----|-----|-----|-----|-----|
| **cdc_report_dt**  	| Date case was first reported to the CDC 	| Calculated 	| YYYY-MM-DD 	| Date 	| Deprecated; CDC recommends researchers use cdc_case_earliest_dt in time series and other analyses. This date was populated using the date at which a case record was first submitted to the database. If missing, then the report date entered on the case report form was used. If missing, then the date at which the case first appeared in the database was used. If none available, then left blank.  	|
| **cdc_case_earliest_dt**  	| The earlier of the Clinical Date (date related to the illness or specimen collection) or the Date Received by CDC 	| Calculated 	| YYYY-MM-DD 	| Date 	| Cdc_case_earliest_dt uses the best available date from the set of dates related to illness/specimen collection and the set of dates related to when a case is reported. It is an option to end-users who need a date variable with optimized completeness. The logic of cdc_case_earliest_dt is to use the non-null date of one variable when the other is null and to use the earliest valid date when both dates are available.  If no date available, then left blank. 	|
| **pos_spec_dt**  	| Date of first positive specimen collection 	| Case Report Form 	| YYYY-MM-DD 	| Date 	|  	|
| **onset_dt**  	| Date of symptom onset 	| Case Report Form 	| YYYY-MM-DD 	| Date 	|  	|
| **current_status**  	| What is the current status of this person? 	| Case Report Form 	| Laboratory-confirmed case Probable case 	| String 	| Please see latest CSTE case definition for more information. 	|
| **sex**  	| Gender 	| Case Report Form 	| [Male - Female - Unknown - Other - Missing - NA] 	| String 	|  	|
| **age_group**  	| Age group categories 	| Calculated 	| [0 - 9 Years - 10 - 19 Years - 20 - 39 Years - 40 - 49 Years - 50 - 59 Years - 60 - 69 Years - 70 - 79 Years - 80 + Years - Missing - NA] 	| String 	| The age group categorizations were populated using the age value that was reported on the case report form. Date of birth was used to fill in missing/unknown age values using the difference in time between date of birth and onset date. 	|
| **race_ethnicity_combined**  	| Race and Ethnicity (combined) 	| Calculated 	| [American Indian/Alaska Native, Non-Hispanic - Asian, Non-Hispanic - Black, Non-Hispanic - Multiple/Other, Non-Hispanic - Native Hawaiian/Other Pacific Islander, Non-Hispanic - White, Non-Hispanic - Hispanic/Latino - Unknown - Missing - NA] 	| String 	| If more than one race was reported, race was categorized into multiple/other races. 	|
| **hosp_yn**  	| Was the patient hospitalized? 	| Case Report Form 	| [Yes - No - Unknown - Missing] 	| Character 	|  	|
| **icu_yn**  	| Was the patient admitted to an intensive care unit (ICU)? 	| Case Report Form 	| [Yes - No - Unknown - Missing] 	| Character 	|  	|
| **death_yn**  	| Did the patient die as a result of this illness? 	| Case Report Form 	| [Yes - No - Unknown - Missing] 	| Character 	|  	|
| **medcond_yn**  	| Pre-existing medical conditions? 	| Case Report Form 	| [Yes - No - Unknown - Missing] 	| Character 	|  	|

## First overview of raw data

Here we will grab some raw data from the dataset, quickly available online, to have a first overview.

In [None]:
#Get a fraction of the dataset (to observe the first rows rapidly)
Covid19_CSPUD_JSON = requests.get('https://data.cdc.gov/resource/vbim-akqf.json')
data = pd.read_json(Covid19_CSPUD_JSON.text)
#Observe top 5 lines
data.head(5)

## Load the full extract of the dataset and observe in more detail

Now we will load the extract of 100,000 rows from the full dataset, and look at:
 - How many null values each variable has, to see which variables have missing data and in what proportion.
 - How many unique values each variable has, to validate which variables should be categorical.

In [None]:
url = 'https://solliancepublicdata.blob.core.windows.net/azure-ml-datascience/COVID-19_Case_Surveillance_Public_Use_Data_shuffled_100000.csv'
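#error_bad_lines is deprecated since pandas 1.3 and removed in pandas 2.0; on recent versions use on_bad_lines='skip' instead
data = pd.read_csv(url, error_bad_lines=False)

#View the columns where a lot of data is missing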
print('>> Null count by variable \r\n')
print(data.isnull().sum())

#Now let's have a summarized look at our data
print('\r\n>> Data general description (including unique values) \r\n')
print(data.describe().transpose())
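
The two checks described above (proportion of missing values and number of unique values per variable) can also be computed directly. The following is a minimal sketch on a small, made-up frame: the `toy` DataFrame and its values are assumptions for illustration and do not come from the CDC extract.

```python
import pandas as pd

# Hypothetical toy frame standing in for the 100,000-row extract
toy = pd.DataFrame({
    "sex": ["Male", "Female", None, "Female"],
    "onset_dt": [None, None, "2020-11-02", None],
})

# Share of missing values per column (0.0 = complete, 1.0 = empty)
null_share = toy.isnull().mean()

# Number of distinct non-null values per column
unique_counts = toy.nunique()

print(null_share)
print(unique_counts)
```

On the real extract, the same two calls applied to `data` should make it easier to spot the columns that are too incomplete to use.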

## Remove incomplete or deprecated columns (variables)

In the data description, we have learned that 'cdc_report_dt' is deprecated, and that 'pos_spec_dt' and 'onset_dt' have too many missing values, relative to a total of 100,000 rows, to be used. So we remove those columns.

In [None]:
data = data.drop(['cdc_report_dt', 'pos_spec_dt', 'onset_dt'], axis=1)

## Observe the distributions among some categorical variables

In an Azure Machine Learning notebook, you can use Python plotting libraries such as Plotly, seaborn and Matplotlib to plot many types of charts and observe your dataset.

We will plot for observation purposes the distributions of:
 - Statuses
 - Genders
 - Ages
 - Ethnicities

We will also have a look at the death status by gender.

Note that those charts also have some interactive features.

![Charts interactivity](./images/LAB04-Charts-Interactivity.png)

In [None]:
##CONFIRMED CASES
values = data['current_status'].value_counts().tolist()
#Print unique status values
print('\r\n\t', '>> Unique cases status values:', data['current_status'].unique())
#Use shorter names for those values for display purposes (listed in the same order as value_counts())
names = ['Confirmed', 'Probable']

#Make pie chart
fig = px.pie(
    names=names,
    values=values,
    title="Case Status Pie Chart",
    color_discrete_sequence=px.colors.sequential.RdBu,
)
#Display pie chart
fig.show()

##GENDER DISTRIBUTION
values = data['sex'].value_counts().tolist()
print('\r\n\t', '>> Unique cases gender values:', data['sex'].unique())
names = ['Female', 'Male', 'Unknown', 'Missing', 'Other']
fig = px.pie(
    names=names,
    values=values,
    title="Gender Status Distribution",
    color_discrete_sequence=px.colors.sequential.Bluyl_r,
)
fig.show()

##AGE GROUP DISTRIBUTION
values = data['age_group'].value_counts().tolist()
print('\r\n\t', '>> Age bins:', data['age_group'].unique())
names = ['20 - 29 Years', '30 - 39 Years', '40 - 49 Years', '50 - 59 Years', '60 - 69 Years', '10 - 19 Years', '70 - 79 Years', '80+ Years', '0 - 9 Years', 'Unknown']
fig = px.bar(
    x=names,
    y=values,
    title="Age Group Distribution",
    labels={
        'x': 'Age Group',
        'y': 'Number of Patients'
    },
    color=values
)
fig.show()

#ETHNICITY
values = data['race_ethnicity_combined'].value_counts().tolist()
print('\r\n\t', '>> Race and ethnicity (combined):', data['race_ethnicity_combined'].unique())
names = ['Unknown', 'White, Non-Hispanic', 'Hispanic/Latino', 'Black, Non-Hispanic', 'Missing', 'Multiple/Other, Non-Hispanic', 'Asian, Non-Hispanic', 'American Indian/Alaska Native, Non-Hispanic', 'Native Hawaiian/Other Pacific Islander, Non-Hispanic']
fig = px.pie(
    names=names,
    values=values,
    title="Distribution of Races and Ethnicities ",
    color_discrete_sequence=px.colors.sequential.Electric,
)
fig.show()

#DEATH STATUS BY GENDER
plt.figure(figsize=(9, 7))
plt.style.use("fivethirtyeight")
sns.countplot(y="death_yn", hue ='sex', data=data[data['sex'].isin(['Male', 'Female', 'Other'])])
plt.xlabel("Count")
plt.ylabel("Death Status")
plt.title('Death Status by Gender')
plt.show()

## Encoding the data for the machine

Now we will encode the categorical data, assigning a numeric value to each element of the categorical data.

For instance, we have 10 distinct age groups, whose values are strings (e.g.: '0 - 9 Years').

To facilitate the use of those variables for some machine learning algorithms, '0 - 9 Years' could be turned to 0 and '10 - 19 Years' could be turned to 1, to use numeric values for those categorical variables, instead of strings.

In [None]:
from sklearn.preprocessing import LabelEncoder
lb_encoder = LabelEncoder()
data['current_status'] = lb_encoder.fit_transform(data['current_status'])
data['age_group'] = lb_encoder.fit_transform(data['age_group'])
data['race_ethnicity_combined'] = lb_encoder.fit_transform(data['race_ethnicity_combined'])
data['sex'] = lb_encoder.fit_transform(data['sex'])
data['hosp_yn'] = lb_encoder.fit_transform(data['hosp_yn'])
data['icu_yn'] = lb_encoder.fit_transform(data['icu_yn'])
data['death_yn'] = lb_encoder.fit_transform(data['death_yn'])
data['medcond_yn'] = lb_encoder.fit_transform(data['medcond_yn'])
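
Because a single `LabelEncoder` is refitted on every column, only the mapping of the last column stays available afterwards. The sketch below shows one possible way to keep one encoder per column and to inspect which code each label received. The `toy` frame, the `encoders` dictionary and the `mappings` dictionary are hypothetical names introduced for this illustration only.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample, for illustration only
toy = pd.DataFrame({
    "death_yn": ["Yes", "No", "Unknown", "Missing", "No"],
    "sex": ["Male", "Female", "Female", "Other", "Male"],
})

# Keep one fitted encoder per column so each mapping can be recovered later
encoders = {}
for column in toy.columns:
    encoders[column] = LabelEncoder()
    toy[column] = encoders[column].fit_transform(toy[column])

# classes_ holds the original labels in sorted order; the position is the code
mappings = {
    column: {label: code for code, label in enumerate(encoder.classes_)}
    for column, encoder in encoders.items()
}
print(mappings)
```

`LabelEncoder` assigns codes in sorted label order, not in order of appearance, so printing the mapping is a cheap way to confirm what a code such as 0 or 1 actually means before filtering on it.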

## Storing the dataset once prepared

### Store in a variable for immediate reuse

In [None]:
#Store the prepared data variable to use it in other notebooks
%store data
data.head(5)

### Register the dataset for more permanent purposes

Registering the dataset allows proper handling and reuse. Once registered, you can find your dataset in the "Datasets" menu of the interface.

![Registered datasets](./images/LAB04-Registered-datasets.png)

In [None]:
datastore = ws.get_default_datastore()

data.to_csv('./data/COVID_19_CSPUD_100k.csv')

datastore.upload(src_dir='./data/', target_path='./data/')

dataset = Dataset.Tabular.from_delimited_files(path = [(datastore, './data/')])

COVID_19_CSPUD_100k = dataset.register(workspace=ws,
                                 name='COVID_19_CSPUD_100k',
                                 description='COVID-19 Case Surveillance Public Use Data')

## Task 2 - Train models and record the best one

### Load models from the Python library Scikit Learn

In [None]:
#Scikit-learn machine learning models
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import Perceptron
from sklearn.linear_model import SGDClassifier
from sklearn.tree import DecisionTreeClassifier
#Scikit-learn data split
from sklearn.model_selection import train_test_split
from sklearn import metrics
#To serialize models
import pickle as pkl

from azureml.core.model import Model

### Load the data and split them in training set and test set

Note that we will not use the rows with an 'Unknown' or 'Missing' death status to train our model.

Also, we will not use ethnicity, which adds a lot of complexity for only 80k rows (i.e. 80k data points).

Once the data is loaded, we will check the dimensions of each set to validate that they have the expected format.

In [None]:
#Get the data variable stored at the end of the data preparation
%store -r data
data.head(5)

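#Keep only the rows whose encoded death status is 0 or 1.
#LabelEncoder assigns codes in sorted label order, so check which original labels
#these two codes stand for in your extract.
known_outcome = data.query('death_yn in [0, 1]')
X = known_outcome[['current_status', 'sex', 'age_group', 'hosp_yn','icu_yn','medcond_yn']]
y = known_outcome['death_yn']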

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

#Dimensions sanity check
print('> X_train dimensions:', X_train.shape, '\r\n> X_test dimensions', X_test.shape, '\r\n> y_train dimensions', y_train.shape, '\r\n> y_test dimensions', y_test.shape)

### Instantiate the ML algorithms we have chosen

In [None]:
#lbfgs is the default solver; it is set explicitly so that it stays the same across versions.
#max_iter is raised to help the solver converge.
log_reg = LogisticRegression(solver='lbfgs', max_iter=1500)
svc = SVC()
lin_svc = LinearSVC(max_iter=10000)
rfc = RandomForestClassifier()
knn = KNeighborsClassifier()
gnb = GaussianNB()
perceptron = Perceptron()
sdg = SGDClassifier()
dtc = DecisionTreeClassifier()

### Let's learn

Each algorithm will now try to determine a function f(X) -> y that best fits the Xs and ys.

*Note*: **It may take around 6 minutes**.

In [None]:
#Algorithms fitting/learning
log_reg.fit(X_train, y_train)
svc.fit(X_train, y_train)
lin_svc.fit(X_train, y_train)
rfc.fit(X_train, y_train)
knn.fit(X_train, y_train)
gnb.fit(X_train, y_train)
perceptron.fit(X_train, y_train)
sdg.fit(X_train, y_train)
dtc.fit(X_train, y_train)

### Evaluate the models

#### Step 1: Make predictions on the test set

Now that each model has determined a function, we will evaluate which function is the most valuable for our needs.

To do so, we will use each model to make predictions on the test dataset, then score those predictions.

We will use the F1-score, the harmonic mean of precision and recall, to choose our "best" model. The classification report from the sklearn metrics module gives it for each class, along with a weighted average over our two relatively balanced classes.

*Note*: Once you use the classifier in a real-case scenario, you can call the .predict_proba() method to get an intuition about the level of risk for a given patient.
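
To give an idea of how the per-class F1-scores and their weighted average relate, here is a minimal sketch using `f1_score` from `sklearn.metrics`. The `y_true` and `y_pred` lists are made-up labels for illustration, not results from this lab.

```python
from sklearn.metrics import f1_score

# Hypothetical labels: four negatives and two positives
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1]

# One F1-score per class, in the order of the sorted class labels
per_class = f1_score(y_true, y_pred, average=None)

# Average of the per-class scores, weighted by the number of true samples in each class
weighted = f1_score(y_true, y_pred, average="weighted")

print(per_class, weighted)
```

The weighted average is the figure shown on the "weighted avg" row of `metrics.classification_report`.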

In [None]:
#Make predictions on the test dataset for a first model
y_pred = log_reg.predict(X_test)
#Then score this model
print('> Classification report Logistic regression\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

#Then we do this for all the models
y_pred = svc.predict(X_test)
print('> Classification report Support Vector Machine (optim 1)\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

y_pred = lin_svc.predict(X_test)
print('> Classification report Support Vector Machine (optim 2)\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

y_pred = rfc.predict(X_test)
print('> Classification report Random Forest\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

y_pred = knn.predict(X_test)
print('> Classification report k-Nearest Neighbours\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

y_pred = gnb.predict(X_test)
print('> Classification report Gaussian Naïve Bayes\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

y_pred = perceptron.predict(X_test)
print('> Classification report Perceptron\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

y_pred = sdg.predict(X_test)
print('> Classification report SGD Classifier\r\n', metrics.classification_report(y_test, y_pred), '\r\n')

y_pred = dtc.predict(X_test)
print('> Classification report Decision Tree\r\n', metrics.classification_report(y_test, y_pred), '\r\n')
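
The `.predict_proba()` method mentioned in the note above returns one probability per class instead of a single label. The following sketch illustrates it on a tiny, made-up dataset: `X_toy`, `y_toy` and `clf` are hypothetical and unrelated to the lab's data and models.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one feature, two classes
X_toy = np.array([[0], [1], [2], [3], [4], [5]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

clf = DecisionTreeClassifier(random_state=0).fit(X_toy, y_toy)

# One row per sample, one column per class (columns follow clf.classes_)
samples = np.array([[1], [4]])
proba = clf.predict_proba(samples)
labels = clf.predict(samples)

print(clf.classes_)
print(proba)
print(labels)
```

Note that a fully grown decision tree with pure leaves returns probabilities of exactly 0 or 1. With only a few categorical features, as in this lab, leaves usually mix both classes, so the probabilities tend to be more informative.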

#### Step 2: Plot the ROC curves for each model

Some of the models have the same F1-score. Since the F1-score is computed at a single decision threshold, we will also look at the ROC curve, which describes the behavior of a model across all thresholds, and keep the model with the biggest AUC (area under the ROC curve).

In [None]:
#Verify our data are balanced
print('> Data balance', data.query('death_yn in [0, 1]')['death_yn'].mean(), '\r\n')

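#Plot the ROC curve (plot_roc_curve was removed in scikit-learn 1.2; on recent versions use metrics.RocCurveDisplay.from_estimator)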
metrics.plot_roc_curve(log_reg, X_test, y_test)
metrics.plot_roc_curve(svc, X_test, y_test)
metrics.plot_roc_curve(lin_svc, X_test, y_test)
metrics.plot_roc_curve(rfc, X_test, y_test)
metrics.plot_roc_curve(knn, X_test, y_test)
metrics.plot_roc_curve(gnb, X_test, y_test)
metrics.plot_roc_curve(perceptron, X_test, y_test)
metrics.plot_roc_curve(sdg, X_test, y_test)
metrics.plot_roc_curve(dtc, X_test, y_test)

#Precision: what proportion of the selected elements are relevant
#Recall: what proportion of the relevant elements have been selected
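
To compare models by a single number rather than by eye, the AUC can also be computed directly with `roc_auc_score` from `sklearn.metrics`. This is a minimal sketch on made-up scores; `y_true` and `y_score` are illustrative values, not outputs of the models trained above.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical true labels and predicted scores for the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# AUC is the probability that a random positive is scored above a random negative
auc = roc_auc_score(y_true, y_score)
print(auc)
```

For the models in this lab, the scores would typically come from `model.predict_proba(X_test)[:, 1]`, or from `model.decision_function(X_test)` for models without `predict_proba`.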


### Store a serialized version of your best model

Once you have chosen a model, you can record a serialized version of it with Python's `pickle` library, and save it as a file you can reuse later on.

In [None]:
#SAVE THE BEST MODEL---------------------------------
#Save a serialized version of your model (the with block closes the file properly)
with open(r'./outputs/models/DecisionTreeClassifier.pkl', 'wb') as model_file:
    pkl.dump(dtc, model_file)
#----------------------------------------------------
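
As a quick sanity check of the serialization step, the sketch below writes an object with `pickle` and reads it back. It is only an illustration: a plain dictionary (`stand_in_model`) stands in for the trained classifier, and a temporary folder is used instead of `./outputs/models/`.

```python
import os
import pickle as pkl
import tempfile

# A plain dictionary stands in for the trained model (assumption for this sketch)
stand_in_model = {"name": "DecisionTreeClassifier", "max_depth": 3}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")

    # Serialize the object to disk
    with open(path, "wb") as model_file:
        pkl.dump(stand_in_model, model_file)

    # Load it back, as the scoring script does at start-up
    with open(path, "rb") as model_file:
        restored = pkl.load(model_file)

print(restored)
```

Keep in mind that a pickled scikit-learn model should be loaded with the same scikit-learn version that produced it, and that pickle files should only be loaded from trusted sources.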

### Register your model

You can now register this model so that it is properly managed and versioned in the "Models" section of the interface.

![Registered models](./images/LAB04-Registered-models.png)

[From Microsoft documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py): Registering a model creates a logical container for the one or more files that make up your model. In addition to the content of the model file itself, a registered model also stores model metadata, including model description, tags, and framework information, that is useful when managing and deploying the model in your workspace. For example, with tags you can categorize your models and apply filters when listing models in your workspace. After registration, you can then download or deploy the registered model and receive all the files and metadata that were registered.

In [None]:
model = Model.register(model_path=r'./outputs/models/DecisionTreeClassifier.pkl',
                       model_name="DecisionTreeClassifier",
                       tags={'area': 'covid19', 'type': 'classification'},
                       description='Survival to Covid-19 classification and estimation',
                       workspace=ws)

# Exercise 2 - Create an Azure ML prebuilt Docker container image for scoring

## Task 1 - Test locally with Local Web service Compute target

#### Script to test the deployment and the model

Note that %%writefile is NOT Python syntax but a Jupyter Notebook cell magic. It must be the first line of the cell, so avoid placing a comment before the %%writefile command: it would break the behavior of this command.

In [None]:
%%writefile ./sources/scoring_script.py

import json
import pickle as pkl
import pandas as pd
import numpy as np

def init():
    from azureml.core.model import Model
    global model
    model_name = 'DecisionTreeClassifier'
    model_path = Model.get_model_path(model_name=model_name)
    with open(model_path, 'rb') as model_file:
        model = pkl.load(model_file)

def run(data):
    test = json.loads(data)
    result = model.predict(np.array(test["X_test"]).reshape(-1, 6)).tolist()
    return result
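
Before deploying, the payload handling of `run()` can be tried without Azure ML. The following sketch reproduces the same JSON parsing and reshaping logic, with a hypothetical `StubModel` in place of the registered classifier. The stub's rule is invented purely for this illustration.

```python
import json
import numpy as np

class StubModel:
    """Hypothetical stand-in for the registered classifier."""
    def predict(self, features):
        # Invented rule: flag a row when age_group (third column) is 7 or more
        return (features[:, 2] >= 7).astype(int)

model = StubModel()

def run(data):
    payload = json.loads(data)
    # reshape(-1, 6) accepts one flat list of 6 values or a list of 6-value rows
    features = np.array(payload["X_test"]).reshape(-1, 6)
    return model.predict(features).tolist()

single = run('{"X_test": [0, 1, 2, 1, 0, 1]}')
batch = run('{"X_test": [[0, 1, 2, 1, 0, 1], [0, 1, 8, 1, 0, 1]]}')
print(single, batch)
```

This shows that the scoring script accepts either a single patient or a batch of patients in one request, as long as each row has the six expected values.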

#### Set an environment for your service

To get reproducible development and testing experiences, Azure Machine Learning lets you define [Environments](https://docs.microsoft.com/en-us/azure/machine-learning/concept-environments) that encapsulate all the elements you need to develop and/or run your applications within a well-defined and reproducible context.

Azure Machine Learning offers plug-and-play curated environments, but you can also create your own and customize everything.

![Environment description](./images/LAB04-Environment-description.png)

In [None]:
#Environment
from azureml.core import Environment
from azureml.core.environment import CondaDependencies

env = Environment(name='lab04_environment')
conda_dep = CondaDependencies()
conda_dep.add_conda_package("scikit-learn")
conda_dep.add_conda_package("pandas")
env.python.conda_dependencies=conda_dep
#Register your custom environment for further use
env.register(workspace=ws)

Now **look at the output field "baseImage"**. You will see that Azure Machine Learning used a default Docker image to create your environment.

You can also open the "Environments" page of the interface and look at the last version created. On the top right side of the screen, you will see the default Docker image used for this environment.

![Environment Docker Image](./images/LAB04-Environment-Docker-Image.png)

#### Inference configuration

Here you will define on which environment your inference service will take place, and what script should be used to run it.

In [None]:
#Inference configuration (docker image, environment of the inference service)
from azureml.core.model import InferenceConfig

lab04_inference_config = InferenceConfig(
    environment=env,
    source_directory='./sources/',
    entry_script='scoring_script.py',
)

#### Compute configuration

Now you have to choose which compute / machine(s) your service will run on.

Be careful to choose a compute that is consistent with, i.e. big enough for, the environment and inference configuration you have chosen before.

Also, depending on whether the service is for testing purposes or for live production, you may want to choose different types and sizes of compute to ensure that your service properly handles the expected workload.

*Note*: If the [compute target](https://docs.microsoft.com/en-us/azure/machine-learning/concept-compute-target) you need is not set up yet and your subscription is properly provisioned, Azure Machine Learning lets you create it right away with just a few lines of Python code.

The purpose right now is to test the service, so we will use our local compute, which will be big enough.

In [None]:
#deployment configuration (machine size, CPU, GPU, i.e. the compute target requested by your Web Service)
from azureml.core.webservice import LocalWebservice

deployment_config = LocalWebservice.deploy_configuration(port=6789)

#### Deploy your service

Here we deploy locally for testing purposes.

In [None]:
#Deployment: generates a Docker build context
service = Model.deploy(
    ws,
    'covid19-survival-pred-local-test',
    [model],
    lab04_inference_config,
    deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)

#Deployment logs
print(service.get_logs())

Now **open the terminal and run the following command**:

`curl -X POST -d '{"X_test":[0, 1, 2, 1, 0, 1]}' -H "Content-Type: application/json" http://localhost:6789/score`

Now try with someone much older (replace the "2", the age bin for 20-29 years, with an "8", the age bin for 80+ years) and observe how the prediction changes.

`curl -X POST -d '{"X_test":[0, 1, 8, 1, 0, 1]}' -H "Content-Type: application/json" http://localhost:6789/score`

*Note*: Just as a reminder, the values passed in the list for the inference are (in this order): \[current_status, sex, age_group, hosp_yn, icu_yn, medcond_yn\]
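
The same request can also be prepared from Python with the standard library. This is a minimal sketch that assumes the local service listens on port 6789 as configured above; `build_scoring_request` is a hypothetical helper name introduced for this example.

```python
import json
import urllib.request

def build_scoring_request(features, url="http://localhost:6789/score"):
    """Build (but do not send) a POST request carrying one feature vector."""
    body = json.dumps({"X_test": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_scoring_request([0, 1, 2, 1, 0, 1])
print(request.get_method(), request.full_url)

# To send it while the local service is running:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The sending part is left commented out so that the cell does not fail when the local service is stopped.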

## Task 2 - Deploy on managed compute and test live

Now that you are satisfied with your model, you can deploy it to production so that the client can start using it.

The recommended compute targets for this are [Azure Kubernetes Service (AKS)](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-kubernetes-service?tabs=python) for high-scale production deployments, or [Azure Container Instances](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-container-instance) (ACI) for low-scale scenarios.

We want to deploy a real-time scoring solution. Machines have to be up and running 24/7 and give immediate responses, so we will choose AKS. Once the deployment from the Python notebook is done, you can observe your endpoint and its settings.

*Note*: **It may take between 10 and 25 minutes**.

In [None]:
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.webservice import AksWebservice

# Use the default configuration (you can also provide parameters to customize this)
prov_config = AksCompute.provisioning_configuration()

aks_name = 'akscovidcluster'
# Create the cluster
aks_target = ComputeTarget.create(workspace = ws,
                                    name = aks_name,
                                    provisioning_configuration = prov_config)

# Wait for the create process to complete
aks_target.wait_for_completion(show_output = True)

aks_target = AksCompute(ws, aks_name)
deployment_config = AksWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)
service = Model.deploy(ws, 'akscovidservice', [model], lab04_inference_config, deployment_config, aks_target)
service.wait_for_deployment(show_output = True)
print(service.state)
print(service.get_logs())

Now **go to "Compute", then open the "Inference clusters" tab** from the interface.

You can see that a new compute named "akscovidcluster", dedicated to your real-time inference service, has been deployed and is now up and running. In the next section, we will test that your service works properly on it.

![Inference Cluster Compute](./images/LAB04-Inference-cluster-compute.png)

#### Test quickly from the interface

Now go to your newly created real-time endpoint, named "akscovidservice", choose the "Test" tab, and copy / paste the following data in the input box: *{"X_test":\[0, 1, 2, 1, 0, 1\]}*

*Note*: Just as a reminder, the values passed in the list for the inference are (in this order): \[current_status, sex, age_group, hosp_yn, icu_yn, medcond_yn\]

![Test real-time endpoint directly from the Azure interface](./images/LAB04-13-Test-real-time-endpoint-AKS.png)

#### Test as an integrated solution: call your Web Service in Python

**BEWARE**: You will have to **replace the api_key with your own**. You can find it by opening your endpoint and selecting the "Consume" tab, where it is listed as "Primary key". For security purposes, you can regenerate it. While the primary key is regenerating, you can temporarily use the secondary key so that your service is not interrupted.

![LAB04 Api key to use endpoint](./images/LAB04-Api-key-to-use-endpoint.png)

In [None]:
import urllib.request
import json
import os
import ssl

def allowSelfSignedHttps(allowed):
    # bypass the server certificate verification on client side
    if allowed and not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None):
        ssl._create_default_https_context = ssl._create_unverified_context

allowSelfSignedHttps(True) # this line is needed if you use self-signed certificate in your scoring service.

# Request data goes here
data = {"X_test":[0, 1, 2, 1, 0, 1]}

body = str.encode(json.dumps(data))

url = 'http://20.61.110.142:80/api/v1/service/akscovidservice/score' # Replace this with the URL for the web service. This one is fake, for illustrative purposes only.
api_key = 'lL6jj02r4L5CVZ2PiLAhbjyAybtNAJ3T' # Replace this with the API key for the web service. This one is fake, for illustrative purposes only.
headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key)}

req = urllib.request.Request(url, body, headers)

try:
    response = urllib.request.urlopen(req)

    result = response.read()
    print(result)
except urllib.error.HTTPError as error:
    print("The request failed with status code: " + str(error.code))

    # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
    print(error.info())
    print(json.loads(error.read().decode("utf8", 'ignore')))
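
Since the key gives access to your endpoint, you may prefer not to hard-code it in the notebook. The sketch below reads it from an environment variable instead. The variable name `AML_ENDPOINT_KEY` and the helper `build_auth_headers` are assumptions made for this example, not names defined by Azure Machine Learning.

```python
import os

def build_auth_headers(env_var="AML_ENDPOINT_KEY"):
    """Build the request headers from a key stored in an environment variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }

# Set here only so that this sketch runs; in practice, define it outside the notebook
os.environ["AML_ENDPOINT_KEY"] = "placeholder-key"

headers = build_auth_headers()
print(sorted(headers))
```

The resulting `headers` dictionary can be passed to `urllib.request.Request` exactly like the one built in the cell above.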

# Exercise 3 - Deploy custom Docker container image with managed compute

## Task 1 - Deploy custom Docker container image with managed compute

We have now seen local deployments for development purposes, and how to deploy your service to production environments.

We will now see that those environments can be easily customized.

### Create a custom Docker container image

*Note* that you can also cherry-pick among the Docker images provided by Azure. We recommend custom Docker images when you need to install non-Python packages as dependencies.

In [None]:
#WRITE YOUR DOCKER FILE------------------------------
#Note that the Dockerfile is written here as a string and not as a file.
#This works as well, but for a more permanent and collaborative process
#we recommend writing it down as a file containing all these steps,
#stored in a more permanent and "portable" location. In any case, it will be
#recorded when you register your environment.
dockerfile = r"""
FROM mcr.microsoft.com/mlops/python:latest
#Install the dependencies
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir azureml-defaults && \
    pip install --no-cache-dir azureml-core==1.27.0 && \
    pip install --no-cache-dir scikit-learn==0.22.2 && \
    pip install --no-cache-dir pandas==0.25.3
"""

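As the comments in the cell above suggest, the Dockerfile content can also live in a real file that is versioned with the project. Here is a minimal sketch, assuming a hypothetical file name `lab04.Dockerfile` and a shortened Dockerfile body; it only writes the file and reads it back, and makes no Azure ML calls.

```python
from pathlib import Path

# Shortened Dockerfile content for illustration; use the full string defined above
dockerfile_text = """FROM mcr.microsoft.com/mlops/python:latest
RUN pip install --no-cache-dir azureml-defaults
"""

# Hypothetical file name; any path tracked by your version control works
dockerfile_path = Path("lab04.Dockerfile")
dockerfile_path.write_text(dockerfile_text, encoding="utf-8")

# Read the file back, for example to reuse its content as the base Dockerfile
dockerfile_from_file = dockerfile_path.read_text(encoding="utf-8")
print(dockerfile_from_file.splitlines()[0])
```
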
### Define your environment

In [None]:
#Environment
from azureml.core import Environment
from azureml.core.runconfig import DockerConfiguration

env = Environment(name='lab04_environment')
#Build the environment inside a Docker container
env.docker.enabled = True #Still works, but deprecated
#Set the base image to None, because the image is defined by the Dockerfile
env.docker.base_image = None
env.docker.base_dockerfile = dockerfile
#Deactivate Conda and only use the packages you need/want
env.python.user_managed_dependencies = True
#With a custom Docker image, you have to specify the version of the inferencing stack added to the image (here, the latest)
env.inferencing_stack_version = 'latest'
#Register your custom environment for further use
env.register(workspace=ws)

Now **look at the output field "baseDockerfile"**. You can observe that your customizations have been taken into account.

### Define your inference configuration

In [None]:
#Inference configuration (docker image, environment of the inference service)
from azureml.core.model import InferenceConfig

lab04_inference_config = InferenceConfig(
    environment=env,
    source_directory='./sources/',
    entry_script='scoring_script.py',
)

### Deploy on your custom environment

#### Deployment configuration

The deployment configuration describes the compute target requested by your web service (machine size, CPU, GPU). Here, the service is deployed locally on port 6789.

In [None]:
from azureml.core.webservice import LocalWebservice

deployment_config = LocalWebservice.deploy_configuration(port=6789)

#### Deployment itself

Generates a Docker build context.

*Note*: **It may take between 1 and 2 minutes**.

*Note*: If you want to store and manage your custom Docker containers, for instance using Docker push and pull commands to store and retrieve them, you will need an additional resource: the [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal).

In [None]:
service = Model.deploy(
    ws,
    'covid19-survival-pred-local-test',
    [model],
    lab04_inference_config,
    deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)

print(service.get_logs())

#### Quick testing of your service in this environment

Now open the terminal and run the following command:

*curl -X POST -d '{"X_test":\[0, 1, 2, 1, 0, 1\]}' -H "Content-Type: application/json" http://localhost:6789/score*

Now try with someone much older (replace the "2", the age bin for 20-30 years old, with an "8", the age bin for 80+ years old) and you will observe that the predicted odds of survival are not the same.

*curl -X POST -d '{"X_test":\[0, 1, 8, 1, 0, 1\]}' -H "Content-Type: application/json" http://localhost:6789/score*

*Note*: Just as a reminder, the values passed in the list for the inference are (in this order): \[current_status, sex, age_group, hosp_yn, icu_yn, medcond_yn\]
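
Because the service only receives a list of numbers, it is easy to pass the features in the wrong order. The following sketch builds the request body from named values, using the order given in the note above; the helper name `make_payload` is hypothetical and only the standard library is used.

```python
import json

# Feature order expected by the model, as listed in the note above
FEATURE_ORDER = ["current_status", "sex", "age_group", "hosp_yn", "icu_yn", "medcond_yn"]


def make_payload(**features):
    """Return the JSON body for the scoring service (hypothetical helper)."""
    missing = [name for name in FEATURE_ORDER if name not in features]
    unknown = [name for name in features if name not in FEATURE_ORDER]
    if missing or unknown:
        raise ValueError(f"missing features: {missing}, unknown features: {unknown}")
    # Emit the values in the fixed order, whatever the keyword order was
    return json.dumps({"X_test": [features[name] for name in FEATURE_ORDER]})


# Same values as the first curl command above
payload = make_payload(current_status=0, sex=1, age_group=2, hosp_yn=1, icu_yn=0, medcond_yn=1)
print(payload)
```

The resulting string can be passed to `curl -d` or used as the body of a `urllib.request.Request`.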

## Task 2 - Pinning packages versions with requirements.txt

### Define packages versions

You can pin the version of each package in a requirements.txt file.

Pinning the exact version of the packages you use allows you to run reproducible experiments, collaborate with others without environment discrepancies, and avoid issues when a library you use releases a new version that breaks your scripts.
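
To check that the environment you are running in actually matches a set of pins, you can compare each pinned version with the installed one. Here is a minimal sketch using only the standard library; the helper names `parse_pins` and `find_mismatches` are hypothetical, and only simple `name==version` lines are handled (no extras, markers or version ranges).

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for every `name==version` line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed)} where the versions differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Example content; unpinned lines such as azureml-defaults are ignored
example = """
azureml-defaults
scikit-learn==0.22.2
pandas==0.25.3
"""
pins = parse_pins(example)
print(pins)
print(find_mismatches(pins))
```

An empty result from `find_mismatches` means that every pinned package is installed at the pinned version.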

In [None]:
%%writefile requirements.txt
azureml-defaults
azureml-core==1.27.0
pip==20.1.1
scikit-learn==0.22.2
pandas==0.25.3

### Now let's use requirements.txt to pin your package versions, build and deploy your environment

In [None]:
#Environment using requirements.txt
from azureml.core import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import LocalWebservice

#Build the environment from the pinned packages and register it
env = Environment.from_pip_requirements(name='lab04_environment', file_path='requirements.txt')
env.register(workspace=ws)

#Inference configuration (environment and scoring script of the inference service)
lab04_inference_config = InferenceConfig(
    environment=env,
    source_directory='./sources/',
    entry_script='scoring_script.py',
)

#Local deployment on port 6789
deployment_config = LocalWebservice.deploy_configuration(port=6789)
service = Model.deploy(
    ws,
    'covid19-survival-pred-local-test',
    [model],
    lab04_inference_config,
    deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)
print(service.get_logs())

#### Quick testing of your service in this environment

Now open the terminal and run the following command:

*curl -X POST -d '{"X_test":\[0, 1, 2, 1, 0, 1\]}' -H "Content-Type: application/json" http://localhost:6789/score*

Now try with someone much older (replace the "2", the age bin for 20-30 years old, with an "8", the age bin for 80+ years old) and you will observe that the predicted odds of survival are not the same.

*curl -X POST -d '{"X_test":\[0, 1, 8, 1, 0, 1\]}' -H "Content-Type: application/json" http://localhost:6789/score*

*Note*: Just as a reminder, the values passed in the list for the inference are (in this order): \[current_status, sex, age_group, hosp_yn, icu_yn, medcond_yn\]