Dev Environment Setup
---------------------
This is the *one-time* setup. 

1. Create a conda environment named `azure` with the required SDK packages installed, and use that environment to run the notebook.
2. Download the workspace config to `~/.azure/ygong/config.json` so that the workspace can be instantiated from the config without supplying the subscription, resource group, etc. every time.
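
For reference, the workspace config file is a small JSON document. Below is a minimal sketch of writing one by hand, assuming the usual three keys (`subscription_id`, `resource_group`, `workspace_name`). The `write_workspace_config` helper and all values are placeholders for illustration, not the real workspace details.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(config_dir, subscription_id, resource_group, workspace_name):
    """Write an AzureML-style config.json into config_dir and return its path.

    Hypothetical helper: downloading the file from the Azure portal
    achieves the same result.
    """
    config_dir = Path(config_dir).expanduser()  # allow "~/.azure/..." style paths
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    config_path.write_text(json.dumps(config, indent=2))
    return config_path


if __name__ == "__main__":
    # Write to a temporary directory so that the example has no side effects.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_workspace_config(
            tmp, "<subscription-id>", "<resource-group>", "<workspace-name>"
        )
        print(path.read_text())
```

Pointing `config_dir` at `~/.azure/ygong` would produce the file location described in step 2.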

In [2]:
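# Set up the conda environment for the Azure CLI and SDK.
# NOTE: these are shell commands; run them in a terminal, not in a Python
# notebook cell (running them as Python raises a SyntaxError).
conda create -n azure python=3.10
conda activate azure
pip install azure-cli  # 2.55.0 as of the Feb 2024 installation
pip install azure-ai-ml
pip install azure-identity
pip install azureml-core

# pip install azureml-sdk  # fails with a dependency issue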


SyntaxError: invalid syntax (3197167884.py, line 2)

Analysis Goal
-------------
Competitive analysis of the AzureML training CUJ, covering:
* compute resource creation and management
* runtime environment management
* workflow integration
* interactive development experience
* metrics and monitoring

In [9]:
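import os

from azureml.core import Workspace
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# Load the workspace from the directory that holds config.json.
# Expand "~" explicitly rather than relying on from_config to do it.
ws = Workspace.from_config(path=os.path.expanduser("~/.azure/ygong"))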


If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.
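
The warning above points to non-interactive authentication. Below is a minimal sketch of using `ServicePrincipalAuthentication` from `azureml.core.authentication`, assuming the credentials are supplied through environment variables. The variable names (`AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`) and both helper functions are assumptions for illustration, not part of the original notebook.

```python
import os


def load_sp_credentials(env=None):
    """Collect service principal settings from environment variables.

    The variable names are an assumption of this sketch. Raises KeyError
    listing any variables that are missing or empty.
    """
    env = os.environ if env is None else env
    names = {
        "tenant_id": "AZURE_TENANT_ID",
        "service_principal_id": "AZURE_CLIENT_ID",
        "service_principal_password": "AZURE_CLIENT_SECRET",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}


def workspace_from_service_principal(config_path):
    """Load the workspace without an interactive login prompt."""
    # Imported lazily so this module still loads where azureml-core is absent.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**load_sp_credentials())
    return Workspace.from_config(path=os.path.expanduser(config_path), auth=auth)
```

With the three variables set, `workspace_from_service_principal("~/.azure/ygong")` would be used in place of the plain `Workspace.from_config(...)` call.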


In [11]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

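cluster_name = "cpu-test"
try:
    # Reuse the cluster if one with this name already exists in the workspace
    cpu_cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    # Otherwise provision a new cluster that can scale up to 4 nodes
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
                                                           max_nodes=4)
    cpu_cluster = ComputeTarget.create(ws, cluster_name, compute_config)
# Block until the cluster is ready (or provisioning has failed)
cpu_cluster.wait_for_completion(show_output=True)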

Found existing cluster, use it.
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned


In [12]:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

# Add necessary packages to define the environment
myenv = Environment(name="myenv")
conda_dep = CondaDependencies()
conda_dep.add_conda_package("numpy")
conda_dep.add_conda_package("scikit-learn")
conda_dep.add_pip_package("azureml-sdk")
myenv.python.conda_dependencies = conda_dep

# This object specifies the script to run
from azureml.core import ScriptRunConfig
src = ScriptRunConfig(source_directory=".",
                      script="train.py",
                      compute_target=cluster_name,
                      environment=myenv)

# submit the experiment
from azureml.core import Experiment
exp = Experiment(workspace=ws, name="my-experiment")
run = exp.submit(config=src)
run.wait_for_completion(show_output=True)
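
The cell above submits `train.py`, which is not shown in this notebook. Below is a hypothetical sketch of what such a script might look like, with a toy least-squares fit standing in for real training. The only AzureML calls it relies on are `Run.get_context()` and `run.log(name, value)`. The data, the `fit_line` helper, and the metric names are all assumptions for illustration.

```python
def get_metric_logger():
    """Return a log(name, value) callable.

    Uses the AzureML run context when azureml-core is importable and
    falls back to printing otherwise (for example, when testing locally).
    """
    try:
        from azureml.core import Run
        return Run.get_context().log
    except ImportError:
        return lambda name, value: print(f"{name}={value}")


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def main(log=None):
    log = log or get_metric_logger()
    # Toy data standing in for a real training set
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    slope, intercept = fit_line(xs, ys)
    mse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    # Metrics logged here would appear on the run in the workspace
    log("slope", slope)
    log("intercept", intercept)
    log("mse", mse)
    return slope, intercept, mse


if __name__ == "__main__":
    main()
```

After the run completes, the logged values should be retrievable from the notebook with `run.get_metrics()`.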