Set up a Python development environment for Azure Machine Learning <br>
<a href = "https://docs.microsoft.com/en-in/azure/machine-learning/how-to-configure-environment#local">Set up development environment</a>

Train a model with Azure Machine Learning <br><a href = "https://docs.microsoft.com/en-in/azure/machine-learning/tutorial-train-models-with-aml">Train a model with Azure ML</a>

In [19]:
! az login

The default web browser has been opened at https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize. Please continue the login in the web browser. If no web browser is available or if the web browser fails to open, use device code flow with `az login --use-device-code`.
Opening in existing browser session.
[
  {
    "cloudName": "AzureCloud",
    "homeTenantId": "fb44495d-8da7-44d4-b597-6199eb799ccc",
    "id": "b1997ed0-2373-4622-83fc-102035f13ae5",
    "isDefault": true,
    "managedByTenants": [],
    "name": "Azure for Students",
    "state": "Enabled",
    "tenantId": "fb44495d-8da7-44d4-b597-6199eb799ccc",
    "user": {
      "name": "bhavyakumawat99@gmail.com",
      "type": "user"
    }
  }
]
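
The JSON printed by `az login` can also be read programmatically, for example to pick up the default subscription ID instead of copying it by hand. The following is a minimal sketch; `default_subscription_id` is a hypothetical helper and the sample IDs are placeholders, not real subscriptions.

```python
import json


def default_subscription_id(accounts_json: str) -> str:
    """Return the `id` of the account entry whose `isDefault` flag is true."""
    accounts = json.loads(accounts_json)
    for account in accounts:
        if account.get("isDefault"):
            return account["id"]
    raise ValueError("no default subscription found in the account list")


# Placeholder data shaped like the output above (the IDs are made up).
sample = """
[
  {"id": "00000000-0000-0000-0000-000000000000", "isDefault": false, "name": "Other"},
  {"id": "11111111-1111-1111-1111-111111111111", "isDefault": true, "name": "Azure for Students"}
]
"""
subscription_id = default_subscription_id(sample)
print(subscription_id)
```

In a notebook, the same JSON text could come from `az account list --output json`.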


### Import packages

In [20]:
# Import Python packages
import numpy as np
import matplotlib.pyplot as plt
import os 
import azureml.core
from azureml.core import Workspace

# check core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.36.0


### Create workspace resources you need to get started with Azure Machine Learning
<a href = "https://docs.microsoft.com/en-in/azure/machine-learning/quickstart-create-resources">Workspace resources</a>

In [21]:
# Create the workspace

ws = Workspace.create(name='mask-detector',
               subscription_id='b1997ed0-2373-4622-83fc-102035f13ae5',
               resource_group='Mask-Detection',
               create_resource_group=True,
               location='eastus' # VM size - Standard_NC6 is only available in East U.S. and South Central U.S. locations
               )



Deploying AppInsights with name maskdeteinsights9a1cb6c6.
Deployed AppInsights with name maskdeteinsights9a1cb6c6. Took 8.21 seconds.
Deploying StorageAccount with name maskdetestorage15e951a21.
Deploying KeyVault with name maskdetekeyvaultcbf87f24.
Deployed KeyVault with name maskdetekeyvaultcbf87f24. Took 24.85 seconds.
Deployed StorageAccount with name maskdetestorage15e951a21. Took 29.07 seconds.
Deploying Workspace with name mask-detector.
Deployed Workspace with name mask-detector. Took 28.42 seconds.
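
The cell above hardcodes the subscription ID. One possible alternative, sketched below, is to read the workspace settings from environment variables so the notebook can be shared without editing it. The helper `workspace_kwargs_from_env` and the variable names `AZURE_SUBSCRIPTION_ID`, `AML_WORKSPACE_NAME`, `AML_RESOURCE_GROUP` and `AML_LOCATION` are assumptions made for this example, not names defined by the SDK.

```python
import os


def workspace_kwargs_from_env(environ=os.environ):
    """Build keyword arguments for Workspace.create() from environment variables.

    The variable names are hypothetical; the defaults mirror the values used in this notebook.
    """
    subscription_id = environ.get("AZURE_SUBSCRIPTION_ID")
    if not subscription_id:
        raise KeyError("AZURE_SUBSCRIPTION_ID is not set")
    return {
        "name": environ.get("AML_WORKSPACE_NAME", "mask-detector"),
        "subscription_id": subscription_id,
        "resource_group": environ.get("AML_RESOURCE_GROUP", "Mask-Detection"),
        "create_resource_group": True,
        "location": environ.get("AML_LOCATION", "eastus"),
    }


# Demo with an explicit mapping instead of the real environment.
workspace_kwargs = workspace_kwargs_from_env({"AZURE_SUBSCRIPTION_ID": "<your-subscription-id>"})
print(workspace_kwargs)

# In the notebook this would be followed by:
# ws = Workspace.create(**workspace_kwargs)
```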


In [22]:
print(ws.name, ws.location, ws.resource_group, sep='\t')

mask-detector	eastus	Mask-Detection


### Create experiment

In [23]:
# Create an experiment to track the runs in your workspace.
experiment_name = 'mask_detection_experiments'

from azureml.core import Experiment
exp = Experiment(workspace=ws, name=experiment_name)

### Create or Attach existing compute resource

In [24]:
# By using Azure Machine Learning Compute, you can train machine learning models on clusters of Azure virtual machines.
# create Azure Machine Learning Compute as your training environment

from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget

# choose a name for your cluster
compute_name = "cpu-cluster"
compute_min_nodes = 0
compute_max_nodes = 4

# This example uses a GPU VM, so the SKU is set to STANDARD_NC6. For a CPU VM, set the SKU to STANDARD_D2_V2 instead.
vm_size = "STANDARD_NC6"


if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and isinstance(compute_target, AmlCompute):
        print("found compute target: " + compute_name)
else:
    print("creating new compute target...")
    provisioning_config = AmlCompute.provisioning_configuration(vm_size = vm_size,
                                                                min_nodes = compute_min_nodes, 
                                                                max_nodes = compute_max_nodes)
    # create the cluster
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)
    
    # You can poll for a minimum number of nodes and for a specific timeout.
    # If no min node count is provided, the scale settings of the cluster are used.
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
    
    # For a more detailed view of current AmlCompute status, use get_status()
    print(compute_target.get_status().serialize())

creating new compute target...
InProgress...
SucceededProvisioning operation finished, operation "Succeeded"
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Resizing', 'allocationStateTransitionTime': '2021-11-19T06:54:08.022000+00:00', 'errors': None, 'creationTime': '2021-11-19T06:54:07.583227+00:00', 'modifiedTime': '2021-11-19T06:54:23.215307+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT1800S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_NC6'}
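
In the status above, the cluster scales between 0 and 4 nodes, and `nodeIdleTimeBeforeScaleDown` is reported as `PT1800S`, an ISO 8601 duration. The sketch below converts such a value to seconds; `iso8601_duration_to_seconds` is a hypothetical helper that only handles the simple hours/minutes/seconds form, not full ISO 8601.

```python
import re


def iso8601_duration_to_seconds(duration: str) -> int:
    """Convert a simple ISO 8601 time duration such as 'PT1800S' or 'PT1H30M' to seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", duration)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {duration!r}")
    hours, minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Excerpt of the status dictionary printed above.
status = {
    "scaleSettings": {
        "minNodeCount": 0,
        "maxNodeCount": 4,
        "nodeIdleTimeBeforeScaleDown": "PT1800S",
    }
}
idle_seconds = iso8601_duration_to_seconds(status["scaleSettings"]["nodeIdleTimeBeforeScaleDown"])
print(f"Idle nodes are released after {idle_seconds // 60} minutes")
```

With a minimum of 0 nodes, the cluster holds no running VMs once the idle period has passed.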


### Dataset

* A storage account, blob storage containers, and a Datastore in the ML workspace are already created for you by the Azure ML service.


* Two dataset types can be used in Azure Machine Learning training workflows: FileDatasets and TabularDatasets. <br>
<a href= "https://docs.microsoft.com/en-in/azure/machine-learning/how-to-create-register-datasets#create-a-filedataset">Create a FileDataset with the Python SDK or the Azure Machine Learning studio</a> <br>



In [None]:
# Upload all the files from a local directory to the workspace's default datastore with datastore.upload().
# This copies the data to your underlying storage and, as a result, incurs storage costs.
# (Dataset.File.upload_directory() is an alternative that uploads and creates a FileDataset in a single call.)

from azureml.core import Datastore, Dataset
from azureml.data.datapath import DataPath

datastore = ws.get_default_datastore() # Datastore.get(ws, 'workspaceblobstore')
src_dir = os.path.join(os.getcwd(), 'resized_Dataset') 

# upload directory
ds = datastore.upload(src_dir, target_path='/dataset', overwrite=False, show_progress=False) 
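
Before running an upload like the one above, it can help to check what the local directory contains, since every file under it is copied to storage. The following is a small sketch, assuming the images are organized in one subfolder per class; `count_files_by_subfolder` is a hypothetical helper and the folder names in the demo are made up.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_files_by_subfolder(src_dir):
    """Count files under src_dir, grouped by their top-level subfolder."""
    root = Path(src_dir)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            relative = path.relative_to(root)
            # Files that sit directly in src_dir are grouped under ".".
            top_level = relative.parts[0] if len(relative.parts) > 1 else "."
            counts[top_level] += 1
    return dict(counts)


# Demo on a throwaway directory; the class folder names are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    for folder, n_files in (("with_mask", 3), ("without_mask", 2)):
        (Path(tmp) / folder).mkdir()
        for i in range(n_files):
            (Path(tmp) / folder / f"img_{i}.jpg").touch()
    demo_counts = count_files_by_subfolder(tmp)

print(demo_counts)
```

In this notebook the call would be `count_files_by_subfolder(src_dir)`.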

In [26]:
# create FileDataset object
datastore_paths = (datastore, '/dataset/**')
mask_ds = Dataset.File.from_files(path=datastore_paths)

In [27]:
# Register the dataset to your workspace for easy retrieval during training.
mask_dataset = mask_ds.register(workspace=ws,
                                name='mask_dataset',
                                create_new_version=True)

In [28]:
# In a later session, retrieve the registered dataset by name instead of re-creating it:
# mask_dataset = Dataset.get_by_name(ws, name='mask_dataset')

### Train on a remote cluster

For this task, you submit the job to run on the remote training cluster you set up earlier. To submit a job you:
* Create a directory
* Create a training script
* Create a script run configuration
* Submit the job

In [29]:
# Create a directory to deliver the necessary code from your computer to the remote resource.
script_folder = os.path.join(os.getcwd(), "Mask_Detection_Directory")
os.makedirs(script_folder, exist_ok=True)

### Create a training script

In [30]:
script_folder = os.path.join(os.getcwd(), "Mask_Detection_Directory")

In [31]:
%run ./Training_Script.ipynb

Writing /usr/local/bin/machine_learning_projects/iNeuron/Face_Mask_Detector/Mask_Detection_Directory/train.py


### Configure the training job

Create a ScriptRunConfig object to specify the configuration details of your training job, including your training script, environment to use, and the compute target to run on. Configure the ScriptRunConfig by specifying:

* The directory that contains your scripts. All the files in this directory are uploaded into the cluster nodes for execution.
* The compute target. In this case, you use the Azure Machine Learning compute cluster you created.
* The training script name, train.py.
* An environment that contains the libraries needed to run the script.
* Arguments required from the training script.


<a href = "https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-tensorflow#set-up-the-experiment"> More on curated environments</a>

In [38]:
from azureml.core.environment import Environment
from azureml.core.conda_dependencies import CondaDependencies

# Azure ML provides prebuilt, curated environments if you don't want to define your own environment.
curated_env_name = 'AzureML-TensorFlow-2.2-GPU'
env = Environment.get(workspace=ws, name=curated_env_name)


In [44]:
# see the packages included in the curated environment
env.save_to_directory(path=curated_env_name)

If the curated environment does not include all the dependencies required by your training script, you will have to modify the environment to include the missing dependencies. A modified environment must be given a new name, because the 'AzureML' prefix is reserved for curated environments.

In [46]:
env = Environment.from_conda_specification(name='mask-env', file_path='./AzureML-TensorFlow-2.2-GPU/conda_dependencies.yml')

In [47]:
# Create the ScriptRunConfig by specifying the training script, compute target and environment.
from azureml.core import ScriptRunConfig

args = ['--data-folder', mask_dataset.as_mount(), '--training-lr', 1e-04, '--training-epochs', 10,
        '--fine-tuning-lr', 1e-05, '--fine-tuning-epochs', 5]

src = ScriptRunConfig(source_directory=script_folder,
                      script='train.py', 
                      arguments=args,
                      compute_target=compute_target,
                      environment=env)
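
The arguments passed above have to be parsed by the training script. The actual `train.py` is written by `Training_Script.ipynb` and is not shown here, so the following is only a sketch of how a script could read these flags with `argparse`; the function name, defaults and help texts are assumptions.

```python
import argparse


def parse_training_args(argv=None):
    """Parse the command-line flags that the ScriptRunConfig above passes to the script."""
    parser = argparse.ArgumentParser(description="Hypothetical argument handling for train.py")
    parser.add_argument("--data-folder", type=str, required=True,
                        help="path where the dataset is mounted on the compute node")
    parser.add_argument("--training-lr", type=float, default=1e-4,
                        help="learning rate for the initial training phase")
    parser.add_argument("--training-epochs", type=int, default=10,
                        help="number of epochs for the initial training phase")
    parser.add_argument("--fine-tuning-lr", type=float, default=1e-5,
                        help="learning rate for the fine-tuning phase")
    parser.add_argument("--fine-tuning-epochs", type=int, default=5,
                        help="number of epochs for the fine-tuning phase")
    return parser.parse_args(argv)


# Demo: arguments arrive as strings on the compute node; "/mnt/data" is a placeholder path.
parsed = parse_training_args([
    "--data-folder", "/mnt/data",
    "--training-lr", "0.0001", "--training-epochs", "10",
    "--fine-tuning-lr", "1E-05", "--fine-tuning-epochs", "5",
])
print(parsed.data_folder, parsed.training_lr, parsed.fine_tuning_epochs)
```

`argparse` maps `--data-folder` to the attribute `data_folder`, so dashes in flag names become underscores in the script.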

### Submit the job to the cluster

In [48]:
# Run the experiment by submitting the ScriptRunConfig object
# The call is asynchronous: it returns a Preparing or Running state as soon as the job is started.
run = exp.submit(config=src)
run

Experiment,Id,Type,Status,Details Page,Docs Page
mask_detection_experiments,mask_detection_experiments_1637311987_1e2318e8,azureml.scriptrun,Preparing,Link to Azure Machine Learning studio,Link to Documentation


### Monitor a remote run
In total, the first run takes about 10 minutes. For subsequent runs, the same image is reused as long as the script dependencies don't change, so container startup is much faster.

What happens while you wait:

* Image creation: A Docker image is created that matches the Python environment specified by the Azure ML environment. The image is uploaded to the workspace. Image creation and uploading take about five minutes. This stage happens once for each Python environment because the container is cached for subsequent runs. During image creation, logs are streamed to the run history. You can monitor the image creation progress by using these logs.

* Scaling: If the remote cluster requires more nodes to do the run than currently available, additional nodes are added automatically. Scaling typically takes about five minutes.

* Running: In this stage, the necessary scripts and files are sent to the compute target, datastores are mounted or copied, and the entry script (here, train.py) is run. While the job is running, stdout and the ./logs directory are streamed to the run history. You can monitor the run's progress by using these logs.

* Post-processing: The ./outputs directory of the run is copied over to the run history in your workspace, so you can access these results.

You can check the progress of a running job in several ways. This tutorial uses a Jupyter widget and a wait_for_completion method.
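
Outside a notebook, where the widget is not available, another option is to poll the run status yourself. The sketch below is one way to do that under stated assumptions: `poll_until_done` is a hypothetical helper, it takes any zero-argument callable that returns a status string (such as `run.get_status`), and it treats `Completed`, `Failed` and `Canceled` as terminal states. The demo uses a stand-in instead of a real run.

```python
import time

# Run statuses after which no further progress is expected.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def poll_until_done(get_status, poll_seconds=15.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Call get_status() repeatedly until it returns a terminal state or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"run still {status!r} after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for run.get_status so the sketch runs without a workspace.
fake_statuses = iter(["Preparing", "Running", "Running", "Completed"])
final_status = poll_until_done(lambda: next(fake_statuses), poll_seconds=0.0)
print(final_status)
```

With a real run, the call would look like `poll_until_done(run.get_status)`.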

In [49]:
# Jupyter widget
# Watch the progress of the run with a Jupyter widget. Like the run submission, the widget is asynchronous 
# and provides live updates every 10 to 15 seconds until the job finishes

from azureml.widgets import RunDetails
RunDetails(run).show()

_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', '…

In [52]:
run.wait_for_completion(show_output=False)  

{'runId': 'mask_detection_experiments_1637311987_1e2318e8',
 'target': 'cpu-cluster',
 'status': 'Completed',
 'startTimeUtc': '2021-11-19T09:03:24.824647Z',
 'endTimeUtc': '2021-11-19T10:14:22.668278Z',
 'services': {},
 'properties': {'_azureml.ComputeTargetType': 'amlcompute',
  'ContentSnapshotId': 'c7e8af1f-8ba1-4761-813f-d59041fb1624',
  'ProcessInfoFile': 'azureml-logs/process_info.json',
  'ProcessStatusFile': 'azureml-logs/process_status.json'},
 'inputDatasets': [{'dataset': {'id': '3b7f846b-a4f1-44fa-96f4-6a4647f66299'}, 'consumptionDetails': {'type': 'RunInput', 'inputName': 'input__3b7f846b', 'mechanism': 'Mount'}}],
 'outputDatasets': [],
 'runDefinition': {'script': 'train.py',
  'command': '',
  'useAbsolutePath': False,
  'arguments': ['--data-folder',
   'DatasetConsumptionConfig:input__3b7f846b',
   '--training-lr',
   '0.0001',
   '--training-epochs',
   '10',
   '--fine-tuning-lr',
   '1E-05',
   '--fine-tuning-epochs',
   '5'],
  'sourceDirectoryDataStore': None,


In [53]:
# Display run results
print(run.get_metrics())

{'training learning rate': 0.0001, 'training epochs': 10.0, 'fine tuning learning rate': 1e-05, 'fine tuning epochs': 5.0, 'accuracy': 'array([0.96973091, 0.99047083, 0.99019057, 0.9955157 , 0.99327356])', 'val_accuracy': 'array([0.953125 , 0.9776786, 0.9921875, 0.9910714, 0.9921875])', 'loss': 'array([0.07360248, 0.02582102, 0.0216311 , 0.01282537, 0.01763402])', 'val_loss': 'array([0.08866281, 0.06216094, 0.02249909, 0.01635629, 0.0238467 ])'}
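
The per-epoch metrics above come back as strings of the form `array([...])` rather than as lists of numbers. The following sketch turns them back into floats and finds the entry with the lowest validation loss; `parse_logged_array` is a hypothetical helper, and the two strings are copied from the output above.

```python
import re

# Matches plain and scientific-notation numbers such as 0.953125 or 1e-05.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def parse_logged_array(text):
    """Extract the numbers from a string such as 'array([0.1, 0.2])' as a list of floats."""
    return [float(token) for token in _NUMBER.findall(text)]


metrics = {
    "val_accuracy": "array([0.953125 , 0.9776786, 0.9921875, 0.9910714, 0.9921875])",
    "val_loss": "array([0.08866281, 0.06216094, 0.02249909, 0.01635629, 0.0238467 ])",
}
val_accuracy = parse_logged_array(metrics["val_accuracy"])
val_loss = parse_logged_array(metrics["val_loss"])

# Index of the lowest validation loss within the logged array (0-based).
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
print(best_index, val_loss[best_index], val_accuracy[best_index])
```

Logging each epoch as a separate numeric value in the training script would avoid the string parsing altogether.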


### Register model

`outputs` is a special directory: all content written to it is automatically uploaded to your workspace. This content appears in the run record in the experiment under your workspace.

In [54]:
# register model
model = run.register_model(model_name='mask_detection_model',
                           model_path='outputs/1')
print(model.name, model.id, model.version, sep='\t')

mask_detection_model	mask_detection_model:1	1


### Download model locally

In [None]:
# Create a model folder in the current directory
os.makedirs('./model', exist_ok=True)
run.download_files(prefix='outputs/my_model.h5', output_directory='./model', append_prefix=False)
