# This is a standalone notebook, used for development purposes

This sample notebook executes a command in an Azure Machine Learning (AML) workspace. Data scientists can use it during development to try out their experiments.

## Pre-requisites

- An AML workspace must already be configured.
- `azure-cli` must be installed on the machine where this notebook is executed.
- `azure-cli` must be logged in, with the default subscription set to the subscription that contains the AML workspace.
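
The last two pre-requisites can be checked from Python before running the rest of the notebook. The following is a minimal sketch, not part of the original notebook: the helper names `azure_cli_installed` and `get_default_subscription_id` are hypothetical, and the check relies on the `az account show` command of `azure-cli`.

```python
import shutil
import subprocess
from typing import Optional


def azure_cli_installed() -> bool:
    """Return True if an `az` executable is found on PATH."""
    return shutil.which("az") is not None


def get_default_subscription_id(timeout_sec: int = 60) -> Optional[str]:
    """Return the default subscription id of the logged-in CLI, or None.

    None means `az` is missing, not logged in, or did not answer in time.
    """
    az_path = shutil.which("az")
    if az_path is None:
        return None
    try:
        result = subprocess.run(
            [az_path, "account", "show", "--query", "id", "--output", "tsv"],
            capture_output=True,
            text=True,
            timeout=timeout_sec,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


print("azure-cli installed:", azure_cli_installed())
print("default subscription:", get_default_subscription_id())
```

If the printed subscription differs from the one that holds the workspace, it can be changed with `az account set --subscription <id>` before continuing.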

## Setup variables

To load secrets, it's recommended to set them as environment variables and read them in the notebook with `os.environ.get('MY_SECRET')`. You can load them using [`python-dotenv`](https://pypi.org/project/python-dotenv/) or a similar library.
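
As an illustration of this pattern, the sketch below reads a secret from the environment and fails early when it is missing. The helper name `get_required_secret` and the dummy value are assumptions made for the example; they do not come from the notebook.

```python
import os


def get_required_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set.")
    return value


# Demonstration only: a real secret would come from the shell or a .env file,
# never from a literal in the notebook.
os.environ.setdefault("MY_SECRET", "dummy-value-for-demo")
my_secret = get_required_secret("MY_SECRET")
print("MY_SECRET loaded, length:", len(my_secret))
```

With `python-dotenv` installed, calling `load_dotenv()` (imported from `dotenv`) before the helper populates `os.environ` from a local `.env` file, so the `setdefault` line is no longer needed.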

In [None]:
subscription_id = "add-your-subscription-id-here"
resource_group_name = "jupyter-notebook-test"
workspace_name = "jupyter-test-2fe8"

# compute related variables
cluster_name = "jupytertest2fe8"
cluster_size = "Standard_DS11_v2"
cluster_region = "centralus"
min_instances = 1
max_instances = 1
idle_time_before_scale_down = 0

# environment related variables
env_base_image_name = "mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04"
conda_path = "path to the conda file"
environment_name = "test2fe8"
description = "test-env-name"

# command related variables
experiment_name = "exp2fe8"
display_name = "Experiment 2fe8"

In [None]:
# install packages

%pip install azure-ai-ml==1.7.2
%pip install azure-identity==1.13.0

## Create an AML Client to interact with the workspace

In [None]:
from azure.ai.ml import MLClient
from azure.identity import AzureCliCredential


client = MLClient(
    AzureCliCredential(),
    subscription_id=subscription_id,
    resource_group_name=resource_group_name,
    workspace_name=workspace_name,
)

## Create a compute cluster, or get it if it already exists, to run the command

The following block of code creates a compute cluster in the AML workspace. If the workspace already has a compute cluster with the configured name, the existing cluster is returned instead.

In [None]:
from azure.ai.ml.entities import AmlCompute
from azure.core.exceptions import ResourceNotFoundError

def create_or_get_compute():
    try:
        # Reuse the cluster if it already exists in the workspace.
        compute_object = client.compute.get(cluster_name)
        print(f"Found existing compute target {cluster_name}, so using it.")
    except ResourceNotFoundError:
        print(f"{cluster_name} is not found! Trying to create a new one.")
        compute_object = AmlCompute(
            name=cluster_name,
            type="amlcompute",
            size=cluster_size,
            location=cluster_region,
            min_instances=min_instances,
            max_instances=max_instances,
            idle_time_before_scale_down=idle_time_before_scale_down,
        )
        # begin_create_or_update returns a poller; result() blocks until it finishes.
        compute_object = client.compute.begin_create_or_update(
            compute_object
        ).result()
        print(f"A new cluster {cluster_name} has been created.")
    except Exception:
        # Any other failure here usually points to credentials or workspace settings.
        print("Oops! Could not reach the workspace. Check your credentials and try again.")
        raise
    return compute_object

compute = create_or_get_compute()

## Create an environment, or get an existing environment from the workspace

The following block of code registers an environment in the AML workspace from the base image and the conda file. Because it calls `create_or_update`, an environment that is already registered under the same name is reused or updated instead of causing an error.
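
The `conda_path` variable is left as a placeholder in the setup cell. As a rough sketch of what it could point to, the snippet below writes a small conda specification to disk. The environment name, channel, and package list are purely illustrative assumptions, and `write_conda_file` is a hypothetical helper.

```python
import tempfile
from pathlib import Path

# Hypothetical conda specification; the package list is only an illustration.
CONDA_SPEC = """\
name: sample-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - pip
  - pip:
      - pandas
"""


def write_conda_file(directory: str, file_name: str = "conda.yml") -> str:
    """Write the sample conda specification and return its path as a string."""
    target = Path(directory) / file_name
    target.write_text(CONDA_SPEC, encoding="utf-8")
    return str(target)


# A temporary directory keeps the demo from touching the working directory.
demo_conda_path = write_conda_file(tempfile.mkdtemp())
print("conda file written to:", demo_conda_path)
```

Assigning the returned path to `conda_path` before the environment cell runs would make the environment use this file.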

In [None]:
from azure.ai.ml.entities import Environment

def create_or_get_environment():
    try:
        print(f"Checking {environment_name} environment.")
        env_docker_conda = Environment(
            image=env_base_image_name,
            conda_file=conda_path,
            name=environment_name,
            description=description,
        )
        environment = client.environments.create_or_update(env_docker_conda)
        print(f"Environment {environment_name} has been created or updated.")
        return environment

    except Exception:
        print(
            "Oops! Invalid credentials or an error while creating the ML environment. Try again."
        )
        raise

environment = create_or_get_environment()

## Define the command to be executed in AML

The section below creates a sample script that the command job will execute on the AML compute. You can change the script to suit your requirements. The script must be written to a file so that the command can reference it.

In [None]:
%%writefile data_prep.py
import argparse

def main(raw_data_path, prep_data_path):
    print(f"Processing raw data from: {raw_data_path}; writing prepared data to: {prep_data_path}")
    # perform the data prep activity

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--raw_data_path", type=str, default="../data/raw_data", help="Path to raw data",
    )
    parser.add_argument(
        "--prep_data_path", type=str, default="../data/prep_data", help="Path to prep data"
    )
    args = parser.parse_args()
    main(args.raw_data_path, args.prep_data_path)
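
Before submitting the job, it can help to check locally how the script's arguments are parsed. The sketch below rebuilds the same two options in a hypothetical `build_parser` helper (not part of `data_prep.py`) and parses a few argument lists without touching any data.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the parser from data_prep.py so it can be exercised in isolation."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--raw_data_path", type=str, default="../data/raw_data")
    parser.add_argument("--prep_data_path", type=str, default="../data/prep_data")
    return parser


parser = build_parser()

# No arguments: the defaults apply.
defaults = parser.parse_args([])
print(defaults.raw_data_path, defaults.prep_data_path)

# Explicit arguments override the defaults.
custom = parser.parse_args(["--raw_data_path", "in", "--prep_data_path", "out"])
print(custom.raw_data_path, custom.prep_data_path)
```

By default `argparse` also accepts an unambiguous prefix such as `--raw_data`, but spelling out the full option name keeps the command robust if more options are added later.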


## Create a command job and submit it to the AML compute

In [None]:
from azure.ai.ml import command

command_job = command(
    experiment_name=experiment_name,
    display_name=display_name,
    code="./",
    # Option names match the ones defined by the parser in data_prep.py.
    command="python ./data_prep.py --raw_data_path ../data/raw_data --prep_data_path ../data/prep_data",
    environment=environment,
    compute=cluster_name,
    environment_variables={
        "ENV_VARIABLES_FOR_COMMAND": ""
    },
)

returned_job = client.jobs.create_or_update(command_job)

## Wait for the job to complete and print the logs - (Optional)

In [None]:
import time

poll_interval_in_sec = 20
total_wait_time_in_sec = 21600
current_wait_time = 0

# States in which the job is still in progress and worth polling.
in_progress_states = [
    "NotStarted",
    "Queued",
    "Starting",
    "Preparing",
    "Running",
    "Finalizing",
    "Provisioning",
]

# Poll until the job leaves the in-progress states or the time budget runs out.
while (
    returned_job.status in in_progress_states
    and current_wait_time <= total_wait_time_in_sec
):
    time.sleep(poll_interval_in_sec)
    returned_job = client.jobs.get(returned_job.name)
    # Count the time actually slept, so the timeout is accurate.
    current_wait_time = current_wait_time + poll_interval_in_sec

if returned_job.status in ("Completed", "Finished"):
    print("job completed")
else:
    print(f"Exiting job with failure, last status: {returned_job.status}")
    raise Exception("Sorry, exiting job with failure..")
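
The polling logic above can also be factored into a small helper that is easy to test without a workspace. This is only a sketch under stated assumptions: `wait_for_job` and its parameters are hypothetical names, and the state sets simply mirror the status strings checked in the loop above.

```python
import time
from typing import Callable

# Terminal states; the names mirror the ones checked in the loop above.
SUCCESS_STATES = {"Completed", "Finished"}
FAILURE_STATES = {"Failed", "Canceled", "CancelRequested", "NotResponding"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_interval_sec: float = 20,
    timeout_sec: float = 21600,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until a terminal state or the timeout; return the last status."""
    waited = 0.0
    status = get_status()
    while status not in SUCCESS_STATES | FAILURE_STATES and waited < timeout_sec:
        sleep(poll_interval_sec)
        waited += poll_interval_sec
        status = get_status()
    return status


# Demo with a fake status source and a no-op sleep, so it finishes instantly.
fake_statuses = iter(["Queued", "Running", "Running", "Completed"])
final_status = wait_for_job(lambda: next(fake_statuses), sleep=lambda _: None)
print(final_status)
```

In the notebook, the status source could be `lambda: client.jobs.get(returned_job.name).status`. To follow the job's logs while waiting, `client.jobs.stream(returned_job.name)` is another option to consider; it blocks until the job finishes.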