# Scenario 3: Training a model

In this notebook, we train a machine learning model on remote AzureML compute resources. We use the PadChest (insert link)
dataset, with 1000 images uploaded to an XNAT project called `Padchest1000`.

## XNAT Setup
We have a project named `Padchest1000` to which we've uploaded scans and label data for each scan. We then mounted this project as an AzureML dataset. For each scan in the project, we assume there is a corresponding `LABEL` directory containing a `label.json` file, as shown in Scenario 2.
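
As a rough illustration of the layout assumed above, the sketch below walks a project directory and collects every `label.json` that sits inside a `LABEL` directory. The helper name `collect_labels` and the idea that the directory containing `LABEL` identifies the scan are assumptions made for this example; the real XNAT directory structure and the JSON schema are the ones described in Scenario 2 and may differ.

```python
import json
from pathlib import Path


def collect_labels(project_root):
    """Map each scan directory name to the parsed contents of its LABEL/label.json.

    Assumption: the directory that contains LABEL identifies the scan.
    """
    labels = {}
    # rglob searches recursively, so LABEL directories at any depth are found
    for label_file in sorted(Path(project_root).rglob("LABEL/label.json")):
        scan_name = label_file.parent.parent.name
        with label_file.open() as f:
            labels[scan_name] = json.load(f)
    return labels
```

Scans without a `LABEL/label.json` are simply absent from the returned dictionary, which makes it easy to spot scans that violate the assumption.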

# Import packages
Import the Python packages needed in this session and display the Azure Machine Learning SDK version.


In [8]:
# Import Azure Machine Learning SDK
import azureml
from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig
from azureml.core.compute import ComputeTarget
from azureml.widgets import RunDetails
from azureml.core.dataset import Dataset
from azureml.core.resource_configuration import ResourceConfiguration
from azureml.core.conda_dependencies import CondaDependencies 

# check core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)


Azure ML SDK Version:  1.34.0


# Connect to workspace
Create a workspace object from the existing workspace. `Workspace.from_config()` reads the file `config.json` and loads its details into an object named `ws`.
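
Before calling `Workspace.from_config()`, it can help to confirm that `config.json` is complete. The sketch below assumes the file holds the three keys found in the configuration file downloaded from the Azure portal (`subscription_id`, `resource_group`, `workspace_name`); the helper `missing_config_keys` is hypothetical and not part of the Azure ML SDK.

```python
import json
from pathlib import Path

# Keys expected in a workspace config.json (assumption stated in the text above)
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_path="config.json"):
    """Return the required keys that are absent from the workspace config file."""
    config = json.loads(Path(config_path).read_text())
    return [key for key in REQUIRED_KEYS if key not in config]
```

An empty list means all three keys are present; any other result names the keys to add before loading the workspace.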

In [9]:
# Load workspace from config file
# The workspace is the top-level resource for Azure Machine Learning, 
# providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.
# Documentation: https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace
ws = Workspace.from_config(path='./')
print("Workspace:",ws.name)

# Create environment
# An Environment defines Python packages, environment variables, and Docker settings that are used in machine learning experiments,
# including in data preparation, training, and deployment to a web service.
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment.environment?view=azure-ml-py
env = Environment(name='xnat_scenario_3')

# Install required packages from the 'environment.yaml' file
env.python.conda_dependencies = CondaDependencies(conda_dependencies_file_path="./environment.yaml")
# Register the environment. This lets you track the environment's versions and reuse them in future runs
env.register(workspace=ws)


Workspace: ganesh-xnat-workspace


{
    "databricks": {
        "eggLibraries": [],
        "jarLibraries": [],
        "mavenLibraries": [],
        "pypiLibraries": [],
        "rcranLibraries": []
    },
    "docker": {
        "arguments": [],
        "baseDockerfile": null,
        "baseImage": "mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210806.v1",
        "baseImageRegistry": {
            "address": null,
            "password": null,
            "registryIdentity": null,
            "username": null
        },
        "enabled": false,
        "platform": {
            "architecture": "amd64",
            "os": "Linux"
        },
        "sharedVolumes": true,
        "shmSize": null
    },
    "environmentVariables": {
        "EXAMPLE_ENV_VAR": "EXAMPLE_VALUE"
    },
    "inferencingStackVersion": null,
    "name": "xnat_scenario_3",
    "python": {
        "baseCondaEnvironment": null,
        "condaDependencies": {
            "channels": [
                "defaults",
                "pytorch"
  

# Create an experiment
Create an experiment to track the runs in your workspace. A workspace can have multiple experiments.

In [11]:
experiment = Experiment(workspace=ws, name='xnat-scenario3-padchest-demo')
print("Experiment:",experiment.name)

Experiment: xnat-scenario3-padchest-demo


# Create or attach an existing compute resource
Azure Machine Learning compute is a managed service that allows data scientists to train machine learning models on clusters of Azure virtual machines,
including VMs with GPU support.


In [13]:
# Connect to compute cluster
# The compute cluster is a resource that can be shared with other users in your workspace
# The compute scales up automatically when a job is submitted and shuts down when it is not in use
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.computetarget?view=azure-ml-py#constructor
compute_target_name = 'xnat-gpu-cluster'
compute_target = ComputeTarget(workspace=ws, name=compute_target_name)
print("Compute Target:",compute_target.name)

Compute Target: xnat-gpu-cluster


# Reference XNAT project as an AzureML Dataset
Here we reference the `Padchest1000` project created in XNAT as an AzureML Dataset.
We use XNAT's internal directory structure to mount
only the files pertaining to this project.

Note: Data scientists should not be allowed to create or view other datasets in this workspace.

In [14]:
# A dataset is a named view of data that simply points to, or references, the data you want to use as inputs
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py
padchest_ds = Dataset.get_by_name(ws, name='XNAT PadChest1000 Dataset')
padchest_ds.as_named_input('padchest')

<azureml.data.dataset_consumption_config.DatasetConsumptionConfig at 0x7ff4b28951d0>

# Configure the training job
Create a `ScriptRunConfig` object to specify configuration details for the training job. This includes:
- The directory that contains your script. All files in this directory are uploaded to the cluster nodes for execution
- The compute target
- The training script name
- An environment containing the libraries needed for the script
- Any other arguments required for the training script

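Here we pass the dataset with `as_download()`, so its files are downloaded to the compute target and the training script receives the resulting directory path through `--dataset_dir`. Use `as_mount()` instead to mount the dataset rather than download it.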

In [15]:
project_dir = './scripts'
args = [
    '--dataset_dir', padchest_ds.as_download(),
    '--num_epochs', 25, '--azure_ml', True, '--batch_size_per_gpu', 64,
    '--threads', 0
]
# ScriptRunConfig packages together the configuration information needed to submit a run in Azure ML,
# including the script, compute target, environment, and any distributed job-specific configs
# Once a script run is configured and submitted with Experiment.submit, a ScriptRun is returned
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.scriptrunconfig?view=azure-ml-py
config = ScriptRunConfig(
    source_directory=project_dir,
    script='main.py',
    compute_target=compute_target,
    environment=env,
    arguments=args,
)
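
The training script `main.py` is not shown in this notebook. As a minimal sketch of how it might read the arguments passed above, the parser below accepts the same flags. The function names, defaults, and help texts are assumptions made for illustration. Because every value reaches the script as a string on the command line, `--azure_ml` needs an explicit string-to-boolean conversion (`type=bool` would treat any non-empty string, including `'False'`, as true).

```python
import argparse


def str2bool(value):
    """Convert command-line strings such as 'True' or 'False' to a bool."""
    lowered = str(value).lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got {!r}".format(value))


def parse_args(argv=None):
    """Parse the flags used in the ScriptRunConfig arguments list (hypothetical main.py parser)."""
    parser = argparse.ArgumentParser(description="Train a PadChest classifier")
    parser.add_argument("--dataset_dir", type=str, required=True,
                        help="directory where the dataset was downloaded or mounted")
    parser.add_argument("--num_epochs", type=int, default=25)
    parser.add_argument("--azure_ml", type=str2bool, default=False,
                        help="whether the script runs inside an Azure ML job")
    parser.add_argument("--batch_size_per_gpu", type=int, default=64)
    parser.add_argument("--threads", type=int, default=0)
    return parser.parse_args(argv)
```

Passing `argv=None` makes `argparse` read `sys.argv`, which is what happens when Azure ML launches the script; passing an explicit list is convenient for local testing.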

# Submit the job to the cluster

In [16]:
run = experiment.submit(config)
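
`experiment.submit` returns immediately, while the model file under `outputs/` only exists once training has finished. A minimal sketch of waiting for the run before registering the model is given below. The helper `wait_for_run` is hypothetical; it relies only on the `wait_for_completion()` and `get_status()` methods of the returned run object. The `RunDetails` widget imported earlier (`RunDetails(run).show()`) is another way to monitor progress inside the notebook.

```python
def wait_for_run(run, show_output=True):
    """Block until the run finishes and return its final status string.

    `run` is expected to behave like an azureml.core.Run, for example the
    object returned by experiment.submit(config).
    """
    # Streams the job logs to the notebook when show_output is True
    run.wait_for_completion(show_output=show_output)
    status = run.get_status()
    if status != "Completed":
        raise RuntimeError("Run ended with status {!r}".format(status))
    return status
```

Calling `wait_for_run(run)` in its own cell before the registration step guards against registering a model from an unfinished or failed run.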


# Register the model

In [9]:
model = run.register_model(model_name='padchest1000_xnat_model',
                           model_path='outputs/pc-densenet-densenet-best.pt',
                           model_framework='PyTorch',
                           model_framework_version='1.8.0',
                           description="Padchest XRay Image Classifier (From XNAT project Padchest1000)",
                           tags={"data": "XNAT PadChest1000 Dataset", "model": "classification",
                                 "class_names": ['Pleural_Effusion', 'Consolidation', 'Atelectasis', 'Pleural_Abnormalities',
                                                 'Cardiomegaly', 'No_Finding', 'Pneumonia', 'Opacity'],
                                 "number_classes": "8"},
                           resource_configuration=ResourceConfiguration(cpu=1, memory_in_gb=2))

print("Model '{}' version {} registered ".format(model.name,model.version))

Model 'padchest1000_xnat_model' version 2 registered 
