# MSA 2023 Phase 2 - Part 3 (Example)

In this notebook, the well-known Iris dataset is used to build a classifier, which is then deployed onto Azure via the Azure Machine Learning Software Development Kit (SDK) and called via a real-time endpoint.

### 1. Load necessary packages and modules

**Before running the code below for your own model**:
1. Make sure that you're using scikit-learn version 1.1.* (* = any number). You can check this by running `import sklearn; print(sklearn.__version__)`. This is to ensure your model will work on Azure.
    - If you aren't, install version 1.1.* into your virtual environment by running `pip install scikit-learn~=1.1.0`, or recreate your virtual environment using the [requirements.txt](../requirements.txt) file in this repository.
    - After doing the above, restart your kernel.
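
The version check described above can also be scripted. The following is a minimal sketch, not part of the original notebook: the helper name `is_supported_sklearn` is hypothetical, and the installed version is read with the standard library's `importlib.metadata` so that the check does not fail when scikit-learn is missing.

```python
from importlib import metadata


def is_supported_sklearn(version: str) -> bool:
    """Return True when `version` belongs to the 1.1.* release series."""
    parts = version.split(".")
    return parts[:2] == ["1", "1"]


# Look up the installed version without importing scikit-learn itself
try:
    installed = metadata.version("scikit-learn")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("scikit-learn is not installed in this environment")
elif is_supported_sklearn(installed):
    print(f"scikit-learn {installed} is in the 1.1.* series")
else:
    print(f"scikit-learn {installed} found; run: pip install scikit-learn~=1.1.0")
```

Comparing the major and minor components separately avoids treating a release such as 1.10.0 as part of the 1.1.* series, which a plain string prefix check would do.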

In [9]:
# Azure Machine Learning SDK core
from azureml.core import Workspace
from azureml.core.model import Model

# Scikit-learn and others
import pickle

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load the Iris dataset
iris = datasets.load_iris()

### 2. Train and save a model

Model training here is provided only as an example; it assumes you don't have a trained model already.

**Before running the code below for your own model**:
1. Load in your trained model, or copy the model-saving code at the end of the code below into the notebook where your trained model is located.
2. Save your trained model as a .pkl file called model.pkl, as shown below.

In [4]:
# Split data into a training set and a test set (test set not used in this example)
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model
rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)

# Save the model; the with-block closes the file once pickling finishes
with open("model.pkl", "wb") as f:
    pickle.dump(rfc, f)
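
Before uploading the .pkl file, it can be worth confirming that it loads back correctly. The following is a minimal sketch of a save and reload round trip. The helper names `save_model` and `load_model` are hypothetical, and a plain dictionary stands in for the trained model so that the sketch runs on its own.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Pickle `model` to `path`, closing the file even if pickling fails."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled model back from `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in object so the sketch runs without a trained model;
# in the notebook you would pass `rfc` and "model.pkl" instead.
stand_in = {"name": "stand-in for a trained model"}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    save_model(stand_in, path)
    restored = load_model(path)

print(restored == stand_in)
```

In the notebook itself, `load_model("model.pkl").score(X_test, y_test)` would report the accuracy of the reloaded classifier on the held-out test set.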

### 3. Load and connect to workspace

**Before running the code below for your own model**:
1. Replace the path below with the path to your own config.json file that you downloaded in step 5 of [Getting Started with Azure Machine Learning](../0.%20Resources/docs/getting-started-with-azure-ml.md).

After running the code below, a browser window should open, prompting you to sign in to Azure.

In [5]:
ws = Workspace.from_config(path="config.json")
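
If connecting fails, one thing to rule out is an incomplete config.json. A workspace config.json normally contains the keys `subscription_id`, `resource_group` and `workspace_name`. The following is a minimal sketch that reports which of them are missing. The helper names are hypothetical and the example values are placeholders, not a real workspace.

```python
import json

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config: dict) -> list:
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def check_config_file(path: str) -> list:
    """Read a JSON file and report which required keys it lacks."""
    with open(path, encoding="utf-8") as f:
        return missing_config_keys(json.load(f))


# Placeholder values for illustration only, not a real workspace
example = {"subscription_id": "<subscription-id>", "resource_group": "<resource-group>"}
print(missing_config_keys(example))  # ['workspace_name']
```

Calling `check_config_file("config.json")` on your own file should return an empty list when all three keys are present.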

### 4. Register model onto Azure

**Before running the code below for your own model**:
1. Replace the model name below with an appropriate name for your model.
2. Replace the model path below with the path to your own .pkl file.

In [8]:
model = Model.register(ws, model_name="iris-random-forest", model_path="model.pkl")

Registering model iris-random-forest


### 5. Create entry script

Registering a model only uploads it to Azure and doesn't let us run the model yet, so an entry script needs to be created that will run when the model receives data. For the Iris dataset, the entry script is in a file called [score.py](../3.%20Microsoft%20Azure/score.py) provided in this repository.

**Before running the code below for your own model**:
1. Create your own entry script by changing the variables below in [score.py](../3.%20Microsoft%20Azure/score.py):
    - __model_name__: Change this to the name of your own .pkl file (if you've named it something other than model.pkl).
    - __classes__: For classification, change the keys and values of this dictionary to match the way you have encoded your labels. For regression, don't use it and just return the result.

**You don't need to change any other parts of the script** unless you want to do further post-processing of the results. If you receive any errors, please browse the Azure Machine Learning [troubleshooting documentation](https://learn.microsoft.com/en-gb/azure/machine-learning/how-to-troubleshoot-online-endpoints) or ask on Discord.
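
For orientation, an Azure Machine Learning entry script generally defines an `init()` function that loads the model once and a `run()` function that handles each request. The following is a minimal sketch of that shape and is not the score.py provided in this repository. It assumes the registered model folder is exposed through the `AZUREML_MODEL_DIR` environment variable, that the request body is JSON with a `data` key, and that labels are encoded as 0, 1 and 2. The helper `predict_labels` is hypothetical.

```python
import json
import os
import pickle

model_name = "model.pkl"  # name of the registered .pkl file
classes = {0: "setosa", 1: "versicolor", 2: "virginica"}  # assumed Iris label encoding

model = None  # populated by init()


def init():
    """Called once when the deployment starts: load the pickled model."""
    global model
    # Assumption: Azure exposes the registered model's folder via AZUREML_MODEL_DIR
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR", "."), model_name)
    with open(model_path, "rb") as f:
        model = pickle.load(f)


def predict_labels(fitted_model, raw_data):
    """Parse the JSON request body and map predictions to class names."""
    rows = json.loads(raw_data)["data"]
    predictions = fitted_model.predict(rows)
    return [classes[int(p)] for p in predictions]


def run(raw_data):
    """Called for every request: return predictions or an error message."""
    try:
        return predict_labels(model, raw_data)
    except Exception as error:
        return {"error": str(error)}
```

For a regression model, the list comprehension in `predict_labels` could return `float(p)` for each prediction instead of looking it up in `classes`.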

### 6. Create real-time endpoint

After registering your model and creating an entry script, you can configure and deploy a real-time endpoint to run your model over the Internet and allow others to do so as well.

For Phase 2, we will do this in the Machine Learning Studio itself in a simplified manner as described below, but note that this can be done using the Azure Machine Learning SDK libraries as well.

To create a real-time endpoint:

1. Go into your [Model List](https://ml.azure.com/model/list) in the Machine Learning Studio (it should look something similar to [this list](../0.%20Resources/images/model_list.png) if you have registered your model as described above), and select the model you want to deploy.

2. In the menu above, click _Deploy_ then _Real-time endpoint_. A wizard should open with a series of steps to complete.

3. Click _Next_ on the Endpoint step.

4. Click _Next_ on the Model step.

5. Click _Next_ on the Deployment step.

6. On the Environment step, under _Select scoring file and dependencies_, upload your entry script.

7. On the Environment step, under _Dependencies_, search for the environment called _sklearn-1:1:7_ and select it.

8. Click _Next_ on the Environment step.

9. Click _Next_ on the Compute step.

10. On the Traffic step, ensure that 100% of traffic goes to the model you want to deploy (in case you have deployed multiple models to the same endpoint).

11. Click _Next_ on the Traffic step.

12. On the Review step, check that you have configured everything as required (it should look similar to this [configuration](../0.%20Resources/images/model_deployment.png)).

13. Click _Create_, then wait until your model has been deployed (this can take up to 15 minutes). Once it's deployed, you should see something similar to these [endpoint details](../0.%20Resources/images/model_deployment_complete.png). If the deployment fails, check the deployment logs (located in the _Deployment logs_ section, as shown in the previous image) for the exact error that caused the failure. The cause may be in your entry script, your .pkl file, or elsewhere. Fix the error, then re-run the above steps.

### 7. Test and share endpoint for marking

To test and share your real-time endpoint for marking:
1. Go to the _Consume_ tab of your endpoint.
2. Copy the Python code under the _Consumption option_ section into the notebook you plan to submit for this part.
3. In your Python code:
    - Add your secondary key to the `api_key` variable by copying it from the _Consume_ tab.
    - Add input data to the `data` variable.
        - Note that in the [score.py](./score.py) entry script we have provided, the input data is an array of numbers stored under the `data` key of a dictionary, so the format of your input data will depend on how you've written your entry script.
        - Please refer to the code below to see how the input data is formatted.
4. Ensure that your input data and secondary key are present within the notebook so that the MSA team can test it for marking purposes.

The code below is copied from the _Consumption option_ section and updated as described above to call the deployed model for the Iris dataset. If you make the changes to the code below as described above, run it, and it returns a prediction, then you have successfully deployed your model!

In [15]:
import urllib.request
import json
import os
import ssl

def allowSelfSignedHttps(allowed):
    # bypass the server certificate verification on client side
    if allowed and not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None):
        ssl._create_default_https_context = ssl._create_unverified_context

allowSelfSignedHttps(True) # this line is needed if you use self-signed certificate in your scoring service.

# Request data goes here
# The example below assumes JSON formatting which may be updated
# depending on the format your endpoint expects.
# More information can be found here:
# https://docs.microsoft.com/azure/machine-learning/how-to-deploy-advanced-entry-script
data = {
    "data": [[6.1, 2.8, 4.7, 1.2]]
}

body = str.encode(json.dumps(data))

url = 'https://msa2023-phase2-azure-bmftu.australiaeast.inference.ml.azure.com/score'
# Replace this with the primary/secondary key or AMLToken for the endpoint
api_key = ''
if not api_key:
    raise Exception("A key should be provided to invoke the endpoint")

# The azureml-model-deployment header will force the request to go to a specific deployment.
# Remove this header to have the request observe the endpoint traffic rules
headers = {'Content-Type':'application/json', 'Authorization':('Bearer '+ api_key), 'azureml-model-deployment': 'iris-random-forest-1' }

req = urllib.request.Request(url, body, headers)

try:
    response = urllib.request.urlopen(req)

    result = response.read()
    print(result)
except urllib.error.HTTPError as error:
    print("The request failed with status code: " + str(error.code))

    # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
    print(error.info())
    print(error.read().decode("utf8", 'ignore'))

b'["versicolor"]'
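
The output above is a `bytes` object containing JSON, so it can be decoded into ordinary Python objects. The following is a minimal sketch with two hypothetical helpers: `build_body` encodes feature rows in the same way as the request above, and `parse_result` decodes the response. The second input row is illustrative only, and whether several rows are accepted in one request depends on how your entry script is written.

```python
import json


def build_body(rows):
    """Encode feature rows as the JSON bytes sent in the request body."""
    return json.dumps({"data": rows}).encode("utf-8")


def parse_result(result: bytes):
    """Decode the raw bytes returned by response.read() into Python objects."""
    return json.loads(result.decode("utf-8"))


# Two rows of four Iris measurements each (second row is illustrative)
body = build_body([[6.1, 2.8, 4.7, 1.2], [5.1, 3.5, 1.4, 0.2]])

# Decode the response shown above
print(parse_result(b'["versicolor"]'))  # ['versicolor']
```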


### 8. Cleanup

Please do not delete or turn off any resources until the MSA team has completed marking. The end of marking will be announced on Discord. After that, to avoid incurring unnecessary costs:
1. Delete your models and endpoints from [Machine Learning Studio](https://ml.azure.com/).
2. Delete or turn off any resources you created, as explained in [this section](https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-azure-ml-in-a-day?view=azureml-api-2#delete-all-resources) of the Azure Machine Learning documentation.