# Train remotely in Azure Machine Learning

The Iris dataset is often used in machine learning and data science courses because it is simple to understand and well-defined, yet interesting enough to present real challenges to new learners. In this tutorial, you will train a model that classifies each sample in the Iris dataset as one of three flower species: Setosa, Versicolor, or Virginica.

### What is the Iris dataset?
The Iris dataset consists of 150 samples from three species of Iris. The first column is sepal length, the second is sepal width, the third is petal length, and the fourth is petal width. We will use scikit-learn to classify these samples by species, based on their measurements.
The three species look similar, but differences in their measurements can be used to tell them apart. This dataset is a classic example of supervised learning. The input variables are sepal length, sepal width, petal length, and petal width; each row represents one instance, or observation. The output variable is the species, a class label with one of three values: Iris-setosa, Iris-versicolor, or Iris-virginica.
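
To make the layout concrete, here is a minimal sketch that splits a few rows into input features and class labels using only the standard library. The header names (`sepal_length`, `species`, and so on) and the three-row sample are assumptions for illustration; the actual `./data/iris.csv` file may use different column names.

```python
import csv
import io

# Illustrative three-row sample; the real file has 150 rows.
# The header names are assumptions and may differ in ./data/iris.csv.
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

FEATURE_COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
LABEL_COLUMN = "species"


def split_features_and_labels(csv_text):
    """Return (features, labels): four floats and one class label per row."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append([float(row[name]) for name in FEATURE_COLUMNS])
        labels.append(row[LABEL_COLUMN])
    return features, labels


features, labels = split_features_and_labels(SAMPLE_CSV)
print(features[0], labels[0])
```

Each entry in `features` is one observation (a row), and the matching entry in `labels` is the class the model should learn to predict.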

The three Iris species are pictured below:

![Iris flower](../../media/iris_flowers.png "Iris")

### Tutorial explanation
In this tutorial, we will use Azure ML with MLflow to train a model on the Iris dataset.
We will:
* Create a compute resource in Azure ML
* Upload the dataset to Azure ML
* Run the code in `src/main.py` as an Azure ML job, which trains the model and records the run as an experiment
* Register the trained model as an MLflow model

## Set up the job resources

Azure ML can be used from the CLI, the Python SDK, or the studio interface. In this example, you'll use the Azure ML Python SDK v2 to create and submit a training job.

Before creating the job, you'll set up the resources it will use:

* The compute cluster that runs the job
* The dataset for training
* The software environment in which the job runs

## Connect to the workspace

Before diving into the code, you'll need to connect to your Azure ML workspace. The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.

We use `DefaultAzureCredential` to get access to the workspace.
`DefaultAzureCredential` should be capable of handling most Azure SDK authentication scenarios.

If it does not work for you, see the following references for other available credentials: [configure credential example](../../configuration.ipynb), [azure-identity reference doc](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity?view=azure-python).

In [1]:
from dotenv import load_dotenv
import os

load_dotenv()
SUBSCRIPTION_ID = os.getenv("SUBSCRIPTION_ID") 
RESOURCE_GROUP = os.getenv("RESOURCE_GROUP")
AML_WORKSPACE_NAME = os.getenv("AML_WORKSPACE_NAME")
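
`os.getenv` returns `None` when a variable is not set, so a missing entry in the `.env` file would only surface later as a less obvious error. As a minimal sketch, a check like the one below could be run right after loading the settings. The helper name `missing_settings` is hypothetical and not part of any Azure SDK.

```python
import os

# The three settings this notebook reads from the environment.
REQUIRED_SETTINGS = ("SUBSCRIPTION_ID", "RESOURCE_GROUP", "AML_WORKSPACE_NAME")


def missing_settings(environ=os.environ):
    """Return the names of required settings that are unset or empty."""
    return [name for name in REQUIRED_SETTINGS if not environ.get(name)]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary.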

In [2]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# authenticate
credential = DefaultAzureCredential()

# Get a handle to the workspace
ml_client = MLClient(
    credential=credential,
    subscription_id=SUBSCRIPTION_ID,
    resource_group_name=RESOURCE_GROUP,
    workspace_name=AML_WORKSPACE_NAME,
)


In [8]:
# Create the compute cluster, or reuse it if it already exists
from azure.ai.ml.entities import AmlCompute

cpu_compute_name = "iris-cpu-cluster"

try:
    compute = ml_client.compute.get(cpu_compute_name)
except Exception:
    print("Creating a new cpu compute target...")
    compute = AmlCompute(
        name=cpu_compute_name, size="STANDARD_D2_V2", min_instances=0, max_instances=4
    )
    # Wait for provisioning to finish and keep the entity returned by the service
    compute = ml_client.compute.begin_create_or_update(compute).result()

In [9]:
# create or get dataset
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

# Path to the Iris CSV on the local filesystem; it is uploaded when the asset is created
data_asset_name = "Iris"
my_path = "./data/iris.csv"
# Version number of the data asset
v1 = "1"

my_data = Data(
    name=data_asset_name,
    version=v1,
    description="Iris data set",
    path=my_path,
    type=AssetTypes.URI_FILE,
)

# Create the data asset if it doesn't already exist
try:
    data_asset = ml_client.data.get(name=data_asset_name, version=v1)
    print(
        f"Data asset already exists. Name: {my_data.name}, version: {my_data.version}"
    )
except Exception:
    # Treat a failed lookup as "not found" and upload the local file as a new asset
    data_asset = ml_client.data.create_or_update(my_data)
    print(f"Data asset created. Name: {my_data.name}, version: {my_data.version}")

Data asset already exists. Name: Iris, version: 1


In [10]:
from azure.ai.ml import command, Input

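# Define the command job: the code folder to upload, the command to run,
# the environment to run it in, its inputs, and the compute target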
command_job = command(
    code="./src",
    command="python main.py --iris-csv ${{inputs.iris_csv}} --learning-rate ${{inputs.learning_rate}} --boosting ${{inputs.boosting}}",
    environment="AzureML-lightgbm-3.2-ubuntu18.04-py37-cpu@latest",
    inputs={
        "iris_csv": Input(
            type="uri_file",
            path=data_asset.path,
        ),
        "learning_rate": 0.9,
        "boosting": "gbdt",
    },
    compute=cpu_compute_name,
)
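
The `${{inputs.<name>}}` placeholders in the command string are replaced with the actual input values when the job runs, so the training script receives them as ordinary command-line arguments. The contents of `src/main.py` are not shown in this tutorial; the sketch below is a hypothetical illustration of how such a script could parse the three flags with `argparse`. The function name `parse_args` and the default values are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that the command job passes to the training script."""
    parser = argparse.ArgumentParser(description="Train an Iris classifier")
    # For a uri_file input, Azure ML substitutes a path to the file on the compute node.
    parser.add_argument("--iris-csv", type=str, required=True, help="Path to the Iris CSV file")
    # Default values below are assumptions made for this sketch.
    parser.add_argument("--learning-rate", type=float, default=0.1)
    parser.add_argument("--boosting", type=str, default="gbdt")
    return parser.parse_args(argv)


# Simulate the arguments that the job above would produce.
args = parse_args(
    ["--iris-csv", "data/iris.csv", "--learning-rate", "0.9", "--boosting", "gbdt"]
)
print(args.iris_csv, args.learning_rate, args.boosting)
```

Note that `argparse` converts the dashes in flag names to underscores, so `--iris-csv` becomes `args.iris_csv`.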

In [11]:
# submit the command
returned_job = ml_client.jobs.create_or_update(command_job)
# get a URL for the status of the job
returned_job.studio_url

'https://ml.azure.com/runs/purple_battery_9wt86dfsmq?wsid=/subscriptions/ec967cb5-f2b0-43c2-9ba2-4a2eb94bbacc/resourcegroups/MLOps-demo-azureml/workspaces/modelops-demo-azureml&tid=16b3c013-d300-468d-ac64-7eda0820b6d3'

## Register the trained model

In [None]:
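from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes

# Block until the training job finishes, so that its model artifacts exist
ml_client.jobs.stream(returned_job.name)

# Point at the MLflow model folder in the job's artifact store
run_model = Model(
    path=f"azureml://jobs/{returned_job.name}/outputs/artifacts/paths/model/",
    name="iris-remote-run-model-example",
    description="Model created from run.",
    type=AssetTypes.MLFLOW_MODEL,
)

ml_client.models.create_or_update(run_model)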
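
The `path` above follows the pattern `azureml://jobs/<job-name>/outputs/artifacts/paths/<artifact-path>/`. As a small sketch, the hypothetical helper below builds that URI from a job name. The default artifact path `model` matches the cell above, and it assumes that the training script logged its MLflow model under that folder.

```python
def job_model_uri(job_name, artifact_path="model"):
    """Build the azureml:// URI for a folder in a job's artifact store."""
    # Strip stray slashes so the result always has exactly one trailing slash.
    return f"azureml://jobs/{job_name}/outputs/artifacts/paths/{artifact_path.strip('/')}/"


# Job name taken from the studio URL shown earlier in this notebook.
print(job_model_uri("purple_battery_9wt86dfsmq"))
```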