## Connect to the workspace

Before you dive into the code, you'll need to connect to your Azure ML workspace. The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.

We're using `DefaultAzureCredential` to get access to the workspace.
`DefaultAzureCredential` handles most Azure SDK authentication scenarios.

If it doesn't work for you, see the references for other available credentials: [configure credential example](../../configuration.ipynb), [azure-identity reference doc](https://docs.microsoft.com/python/api/azure-identity/azure.identity?view=azure-python).

In [None]:
# Handle to the workspace
from azure.ai.ml import MLClient

# Authentication package
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

In the next cell, enter your subscription ID, resource group name, and workspace name. To find these values:

1. In the upper-right Azure Machine Learning studio toolbar, select your workspace name.
1. Copy the values for workspace, resource group, and subscription ID into the code.
1. Copy one value at a time: copy a value, close the panel, paste it into the code, then come back for the next one.

![Credentials.png](https://raw.githubusercontent.com/Khaninsi/Azure-MLOps/master/screenshots/Credentials.png)

In [None]:
# Enter details of your AML workspace
subscription_id_ws = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"
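
If you'd rather not hard-code these values, one option is to read them from environment variables and fail early when a placeholder has not been replaced. The sketch below is only an illustration: the helper `resolve_setting` and the environment variable name in the usage comment are assumptions made for this example, not part of the Azure ML SDK.

```python
import os


def resolve_setting(env_var: str, fallback: str) -> str:
    """Return the value of env_var if it is set, otherwise the fallback.

    Raises ValueError if the result is empty or still an unfilled
    "<...>" placeholder, so the mistake surfaces before MLClient is built.
    """
    value = os.environ.get(env_var, fallback).strip()
    if not value or (value.startswith("<") and value.endswith(">")):
        raise ValueError(f"{env_var} is not set and the placeholder was not replaced")
    return value


# Hypothetical usage (the variable name is illustrative, not SDK-defined):
# subscription_id_ws = resolve_setting("AZURE_SUBSCRIPTION_ID", "<SUBSCRIPTION_ID>")
```

Checking the values up front gives a clearer error than a failed request later in the notebook.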

In [None]:
# Get a handle to the workspace
ml_client = MLClient(
    credential=credential,
    subscription_id=subscription_id_ws,
    resource_group_name=resource_group,
    workspace_name=workspace,
)

## Create a compute resource to run your job

You already have a compute instance that you're using to run the notebook. Now you'll add another type of compute, a **compute cluster**, to run your training job. A compute cluster can consist of single-node or multi-node machines running Linux or Windows, or a specific compute fabric like Spark.

You'll provision a Linux compute cluster. See the [full list of VM sizes and prices](https://azure.microsoft.com/pricing/details/machine-learning/).

For this example, you only need a basic cluster, so you'll use the Standard_DS11_v2 VM size with 2 vCPU cores and 14 GB of RAM. If you have already created a compute cluster, specify its name in the **cpu_compute_target** variable.

In [None]:
from azure.ai.ml.entities import AmlCompute

# Name assigned to the compute cluster
cpu_compute_target = "cpu-cluster"

try:
    # let's see if the compute target already exists
    cpu_cluster = ml_client.compute.get(cpu_compute_target)
    print(
        f"You already have a cluster named {cpu_compute_target}, we'll reuse it as is."
    )

except Exception:
    print("Creating a new cpu compute target...")

    # Let's create the Azure ML compute object with the intended parameters
    cpu_cluster = AmlCompute(
        name=cpu_compute_target,
        # Azure ML Compute is the on-demand VM service
        type="amlcompute",
        # VM Family
        size="STANDARD_DS11_V2",
        # Minimum running nodes when there is no job running
        min_instances=0,
        # Nodes in cluster
        max_instances=1,
        # Seconds a node stays idle after a job finishes before the cluster scales down
        idle_time_before_scale_down=180,
        # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination
        tier="Dedicated",
    )
    print(
        f"AMLCompute with name {cpu_cluster.name} will be created, with compute size {cpu_cluster.size}"
    )
    # Pass the object to the compute operations' begin_create_or_update method
    # and wait for the long-running operation to finish
    cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster).result()
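
The cell above treats any exception as "the cluster doesn't exist". A minimal sketch of the same get-or-create pattern that only falls through to creation on one specific error type is shown below. The helper `get_or_create` is hypothetical, and the usage comment assumes that the SDK raises `azure.core.exceptions.ResourceNotFoundError` for a missing cluster.

```python
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def get_or_create(
    get: Callable[[], T],
    create: Callable[[], T],
    not_found: Type[BaseException],
) -> T:
    """Return get(); call create() only if get() raises `not_found`.

    Any other exception (bad credentials, network errors, ...) propagates
    instead of silently triggering a create.
    """
    try:
        return get()
    except not_found:
        return create()


# Hypothetical usage with the objects from the cell above
# (new_cluster stands for the AmlCompute object to create):
# cluster = get_or_create(
#     lambda: ml_client.compute.get(cpu_compute_target),
#     lambda: ml_client.compute.begin_create_or_update(new_cluster).result(),
#     ResourceNotFoundError,
# )
```

Narrowing the exception keeps an authentication or network failure from being mistaken for a missing cluster.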



## What is a command job?

You'll create an Azure ML *command job* to train a model for deposit prediction. The command job is used to run a *training script* in a specified environment on a specified compute resource. You've already created the environment and the compute resource. Next you'll create the training script.

The *training script* handles the data preparation, training and registering of the trained model. In this tutorial, you'll create a Python training script.
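
As a rough sketch of how such a script might receive its settings, the snippet below parses the flags that the command configured later in this tutorial passes to `main.py`. The flag names mirror that command; the defaults, help text, and the function name `parse_args` are assumptions for illustration, not the tutorial's actual script.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command-line flags that the command job passes to the script."""
    parser = argparse.ArgumentParser(description="Train a deposit prediction model")
    parser.add_argument("--data", type=str, required=True, help="Path or URL of the input CSV")
    # Defaults below are illustrative; they mirror the values used in the job definition.
    parser.add_argument("--test_train_ratio", type=float, default=0.2)
    parser.add_argument("--max_depth", type=int, default=10)
    parser.add_argument("--registered_model_name", type=str, required=True)
    return parser.parse_args(argv)


# Example: parse an explicit argument list instead of sys.argv
args = parse_args(
    ["--data", "bank.csv", "--registered_model_name", "deposit-prediction-model"]
)
print(args.test_train_ratio, args.max_depth)
```

Passing an explicit list to `parse_args` makes the parsing step easy to try in a notebook cell before the script runs on the cluster.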

Command jobs can be run from the CLI, the Python SDK, or the studio interface. In this tutorial, you'll use the Azure ML Python SDK v2 to create and run the command job.

After running the training job, you'll deploy the model, then use it to produce a prediction.

As you can see in this script, once the model is trained, the model file is saved and registered to the workspace. Now you can use the registered model in inferencing endpoints.

## Configure the command

Now that you have a script that can perform the desired tasks, you'll use the general-purpose **command** that can run command-line actions. A command-line action can call system commands directly or run a script.

Here, you'll create input variables to specify the input data, the test/train split ratio, the Random Forest hyperparameters (`max_depth` and `n_estimators`), and the registered model name. The command script will:
* Use the compute created earlier to run this command.
* Use the environment created earlier - you can use the `@latest` notation to indicate the latest version of the environment when the command is run.
* Configure some metadata like display name, experiment name etc. An *experiment* is a container for all the iterations you do on a certain project. All the jobs submitted under the same experiment name would be listed next to each other in Azure ML studio.
* Configure the command line action itself - `python main.py` in this case. The inputs/outputs are accessible in the command via the `${{ ... }}` notation.
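
To make the `${{ ... }}` notation concrete, the sketch below mimics the substitution for simple literal inputs. It is only an illustration of the idea, not how Azure ML implements it: `render_command` is a hypothetical helper, and data inputs such as `uri_file` are resolved by the service to a location the script can read rather than pasted in as literal text.

```python
import re

# Matches placeholders of the form ${{inputs.<name>}}
PLACEHOLDER = re.compile(r"\$\{\{\s*inputs\.(\w+)\s*\}\}")


def render_command(template: str, inputs: dict) -> str:
    """Replace each ${{inputs.<name>}} placeholder with the matching input value."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in inputs:
            raise KeyError(f"command references unknown input: {name}")
        return str(inputs[name])

    return PLACEHOLDER.sub(substitute, template)


template = (
    "python main.py --test_train_ratio ${{inputs.test_train_ratio}} "
    "--max_depth ${{inputs.max_depth}}"
)
rendered = render_command(template, {"test_train_ratio": 0.2, "max_depth": 10})
print(rendered)  # python main.py --test_train_ratio 0.2 --max_depth 10
```

Note that an input only reaches the script if the command string references it; declaring it under `inputs` alone is not enough.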

In [None]:
#### TODO:

from azure.ai.ml import command
from azure.ai.ml import Input

registered_model_name = "deposit-prediction-model"

job = command(
    inputs=dict(
        data=Input(
            type="uri_file",
            path="https://raw.githubusercontent.com/Khaninsi/Azure-MLOps/master/data/bank.csv",
        ),
        test_train_ratio=0.2,
        # Specify hyperparameters of Random Forest
        max_depth=10,
        n_estimators=300,
        registered_model_name=registered_model_name,
    ),
    code="./src/",  # location of source code
    command="python main.py --data ${{inputs.data}} --test_train_ratio ${{inputs.test_train_ratio}} --max_depth ${{inputs.max_depth}} --registered_model_name ${{inputs.registered_model_name}}",
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="cpu-cluster",
    experiment_name="train_model_deposit_prediction",
    display_name="deposit-prediction",
)

## Submit the job 

Using the `MLClient` created earlier, you'll now run this command as a job in the workspace.

In [None]:
ml_client.create_or_update(job)

# Deploy model
Once the job is completed, you'll see the model in the **Models** tab.
![Logs](https://raw.githubusercontent.com/Khaninsi/Azure-MLOps/master/screenshots/Models.png)

Click **deposit-prediction-model**, open the **Deploy** tab, and select **Real-time endpoint**.
![Logs](https://raw.githubusercontent.com/Khaninsi/Azure-MLOps/master/screenshots/Realtime_endpoint.png)

Use the following configuration. The **Endpoint name** and **Deployment name** don't have to match the ones shown in the screenshot. The deployment requires a virtual machine to host the endpoint, which uses the environment specified in conda.yml and the Python script main.py.
![Endpoint_configuration](https://raw.githubusercontent.com/Khaninsi/Azure-MLOps/master/screenshots/Endpoint_configuration.png)

Make sure to delete this endpoint once you're finished using it, so that the virtual machine hosting it stops incurring charges.

## Test endpoint
To test the endpoint, go to the **Endpoints** tab and click the endpoint you created.
![Logs](https://raw.githubusercontent.com/Khaninsi/Azure-MLOps/master/screenshots/Endpoint.png)

Click the **Test** tab, enter the following JSON payload, and press the **Test** button.
![Test_endpoint](https://raw.githubusercontent.com/Khaninsi/Azure-MLOps/master/screenshots/Test_endpoint.png)
If no error messages appear, the deployment succeeded. Congrats!

In [3]:
# ### NOTE: this cell cannot run within Azure ML studio, but it can run on your local machine

# # Convert the data to a format that the Azure ML endpoint can consume
# import json
# import pandas as pd

# df_bank = pd.read_csv("bank.csv")

# # Select features
# feature = df_bank.drop('deposit', axis=1)

# # The Azure ML endpoint can consume JSON in orient="split" format
# result = feature.iloc[300:302].to_json(orient="split")
# parsed = json.loads(result)

# # Copy this JSON as the value of the "input_data" key
# print(json.dumps(parsed, indent=4))
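
If pandas isn't available, the same `orient="split"` structure can be assembled with the standard library. The sketch below is a minimal illustration: `to_split_payload` is a hypothetical helper, and the two rows carry only three of the sixteen feature columns to keep the example short, so the printed payload is abbreviated and would not be accepted by the endpoint as is.

```python
import json


def to_split_payload(rows, start_index=0):
    """Build {"input_data": {"columns", "index", "data"}} from a list of dict rows."""
    columns = list(rows[0].keys())
    return {
        "input_data": {
            "columns": columns,
            "index": list(range(start_index, start_index + len(rows))),
            "data": [[row[col] for col in columns] for row in rows],
        }
    }


# Abbreviated rows (values taken from the sample payload in this notebook)
rows = [
    {"age": 36, "job": "blue-collar", "balance": 638},
    {"age": 48, "job": "unemployed", "balance": 3229},
]
payload = to_split_payload(rows, start_index=300)
print(json.dumps(payload, indent=4))
```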

In [4]:
# {
#   "input_data": {
#     "columns": [
#         "age",
#         "job",
#         "marital",
#         "education",
#         "default",
#         "balance",
#         "housing",
#         "loan",
#         "contact",
#         "day",
#         "month",
#         "duration",
#         "campaign",
#         "pdays",
#         "previous",
#         "poutcome"
#     ],
#     "index": [
#         300,
#         301
#     ],
#     "data": [
#         [
#             36,
#             "blue-collar",
#             "divorced",
#             "secondary",
#             "no",
#             638,
#             "yes",
#             "no",
#             "unknown",
#             16,
#             "jun",
#             1395,
#             2,
#             -1,
#             0,
#             "unknown"
#         ],
#         [
#             48,
#             "unemployed",
#             "single",
#             "tertiary",
#             "no",
#             3229,
#             "no",
#             "no",
#             "unknown",
#             16,
#             "jun",
#             1089,
#             1,
#             -1,
#             0,
#             "unknown"
#         ]
#     ]
#   }
# }
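
Besides the **Test** tab, the endpoint can also be called over REST. The sketch below only builds the request and does not send it. The URL and key are placeholders (assumptions for this example), to be replaced with the scoring URI and key of your own endpoint, and `build_scoring_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_scoring_request(
    scoring_uri: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Create (but do not send) a POST request for an online endpoint."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Placeholder values: substitute your endpoint's scoring URI and key.
request = build_scoring_request(
    "https://example.invalid/score",
    "<API_KEY>",
    {"input_data": {"columns": [], "index": [], "data": []}},
)

# To actually send it (requires a live endpoint):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

In practice the payload would be the full `input_data` JSON shown above, with all feature columns present.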