### Train a model to predict pathologies from [PadChest](https://pubmed.ncbi.nlm.nih.gov/32877839/), a large chest X-ray image dataset

1. Connect to workspace, create and register environment
2. Set up experiment, compute target and data
3. Submit training job to compute cluster
    * Mount directory data
    * Set training script
4. Register the model

In [9]:
# Import Azure Machine Learning SDK
# Please set Python kernel to 'Python 3.8 - AzureML'
# We use 'Azure ML SDK Version:  1.36.0'
import azureml
from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig
from azureml.core.compute import ComputeTarget
from azureml.widgets import RunDetails
from azureml.core.dataset import Dataset
from azureml.core.resource_configuration import ResourceConfiguration
from azureml.core.conda_dependencies import CondaDependencies 

# check core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.36.0
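
Because the notebook is pinned to a specific SDK release, it can help to compare the installed version against the one shown above before going further. The following is a minimal sketch, not part of the original notebook: the helper names `parse_version` and `compare_sdk_version` are hypothetical, and the comparison policy (exact match, same minor release, or mismatch) is an assumption.

```python
import re

# Version reported in the notebook output above.
EXPECTED_SDK_VERSION = "1.36.0"


def parse_version(version):
    """Extract (major, minor, patch) from a dotted version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def compare_sdk_version(installed, expected=EXPECTED_SDK_VERSION):
    """Classify how the installed version relates to the expected one."""
    inst, exp = parse_version(installed), parse_version(expected)
    if inst == exp:
        return "match"
    if inst[:2] == exp[:2]:
        return "patch differs"
    return "mismatch"


# Hypothetical helper, illustrated with literal version strings.
print(compare_sdk_version("1.36.0"))
print(compare_sdk_version("1.48.0"))
```

In the notebook itself, the call would be `compare_sdk_version(azureml.core.VERSION)`.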


### 1. Connect to workspace, create and register environment



In [10]:
# Load workspace from config file
# The workspace is the top-level resource for Azure Machine Learning, 
# providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.
# Documentation: https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace
ws = Workspace.from_config(path='../')
print("Workspace:",ws.name)

# Create environment
# An Environment defines Python packages, environment variables, and Docker settings that are used in machine learning experiments,
# including in data preparation, training, and deployment to a web service.
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment.environment?view=azure-ml-py
env = Environment(name='rsna_2021_demo')

# Install required packages from 'environment.yml' file
env.python.conda_dependencies = CondaDependencies(conda_dependencies_file_path="../environment.yml")
# Register the environment. This lets you track the environment's versions and reuse them in future runs
env.register(workspace = ws)    
print("packages", env.python.conda_dependencies.serialize_to_string())

# Important: This environment variable controls whether the job uses the common runtime; here it is set to "false"
env.environment_variables={"AZUREML_COMPUTE_USE_COMMON_RUNTIME":"false" }
print(env.environment_variables)

# TIP: You can also load workspace outside of AzureML notebooks like so:
# subscription_id = '<>'
# resource_group = '<>'
# workspace_name = '<>'
# ws = Workspace(subscription_id = subscription_id, resource_group = resource_group, workspace_name = workspace_name)
# ws.write_config(path="../", file_name="ws_config.json")

Workspace: RSNA2021AMLDemo
packages name: rsna_2021_demo
channels:
- defaults
- pytorch
dependencies:
- pip=20.1.1
- python=3.7.3
- pytorch=1.8.0
- python-blosc==1.7.0
- torchvision=0.9.0
- pip:
  - azure-mgmt-resource==12.1.0
  - azureml-contrib-dataset==1.34.0
  - azure-mgmt-datafactory==1.1.0
  - azureml-mlflow==1.32.0
  - azureml-sdk==1.32.0
  - azureml-tensorboard==1.32.0
  - azureml-widgets==1.34.0
  - ipykernel==6.3.1
  - jupyter==1.0.0
  - jupyter-client==6.1.5
  - lightning-bolts==0.3.4
  - matplotlib==3.3.0
  - mlflow==1.17.0
  - mypy==0.812
  - numba==0.51.2
  - numpy==1.19.1
  - numba==0.51.2
  - medcam==0.1.21
  - opencv-python-headless==4.5.1.48
  - pandas==1.1.0
  - param==1.9.3
  - pillow==8.2.0
  - pydicom==2.0.0
  - scikit-image==0.17.2
  - scikit-learn==0.23.2
  - scipy==1.5.2
  - seaborn==0.10.1
  - simpleitk==1.2.4
  - six==1.15.0
  - tabulate==0.8.7
  - tensorboard==2.3.0
  - tensorboardX==2.1
  - torchprof==1.3.3
  - torchmetrics==0.4.1
  - torchxrayvision==0.0.3
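
The serialized package list above contains `numba==0.51.2` twice. A quick scan of the specification text can surface such repeats before the environment is registered. This is a minimal sketch that assumes the simple layout printed above (top-level keys, `- ` list entries, one nested `pip:` list); the function name `duplicate_packages` is hypothetical, and a real YAML parser would be more robust.

```python
import re
from collections import Counter


def duplicate_packages(spec_text):
    """Return names that appear more than once under 'dependencies'."""
    names = []
    section = None
    for line in spec_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not stripped.startswith("-"):
            # Top-level key such as 'channels:' or 'dependencies:'.
            section = stripped.rstrip(":")
            continue
        if section != "dependencies":
            continue
        entry = stripped[1:].strip()
        if entry.endswith(":"):
            # Header of the nested pip list, not a package.
            continue
        # Keep only the package name, dropping any version specifier.
        names.append(re.split(r"[=<>!~ ]", entry, maxsplit=1)[0].lower())
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Shortened sample in the same layout as the output above.
SAMPLE = """name: demo
channels:
- defaults
- pytorch
dependencies:
- pip=20.1.1
- pytorch=1.8.0
- pip:
  - numba==0.51.2
  - numpy==1.19.1
  - numba==0.51.2
"""

print(duplicate_packages(SAMPLE))  # → ['numba']
```

In the notebook, the text to check would be `env.python.conda_dependencies.serialize_to_string()`.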

### 2. Experiment, compute cluster and data

In [11]:
# Connect to experiment
# An Experiment is a container of trials that represent multiple model runs
# It is the main entry point for creating and working with experiments in Azure ML
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.experiment.experiment?view=azure-ml-py#constructor
experiment = Experiment(workspace=ws, name='Padchest-PyTorch')
print("Experiment:",experiment.name)

# Connect to compute cluster
# The compute cluster is a resource that can be shared with other users in your workspace
# The compute scales up automatically when a job is submitted and shuts down when it is not in use
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.computetarget?view=azure-ml-py#constructor
compute_target = ComputeTarget(workspace=ws, name="NC24sv3Dedicated")
print("Compute Target:",compute_target.name)

# Connect to dataset
# A dataset is a named view of data that points to (references) the data you want to use as input
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py
padchest_ds = Dataset.get_by_name(ws, name='padchest')
padchest_ds.as_named_input('padchest')

Experiment: Padchest-PyTorch
Compute Target: NC24sv3Dedicated


<azureml.data.dataset_consumption_config.DatasetConsumptionConfig at 0x17928ee1c88>
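
The cell above assumes that a compute target named `NC24sv3Dedicated` already exists in the workspace. A lookup that lists the available targets when the name is missing can make a typo easier to diagnose. The sketch below assumes the workspace object exposes a `compute_targets` dictionary keyed by name, as the `Workspace` class in the v1 SDK does; the function name `get_compute_or_explain` and the stand-in workspace are hypothetical and exist only so the example runs without Azure credentials.

```python
from types import SimpleNamespace


def get_compute_or_explain(workspace, name):
    """Return the named compute target or raise with the available names."""
    targets = workspace.compute_targets  # assumed: dict of name -> target
    if name in targets:
        return targets[name]
    available = ", ".join(sorted(targets)) or "none"
    raise LookupError(
        f"Compute target '{name}' not found. Available: {available}"
    )


# Stand-in for a real Workspace, for illustration only.
fake_ws = SimpleNamespace(compute_targets={"NC24sv3Dedicated": "gpu-cluster"})

found = get_compute_or_explain(fake_ws, "NC24sv3Dedicated")

try:
    get_compute_or_explain(fake_ws, "NC24sv3")
except LookupError as err:
    message = str(err)
    print(message)
```

With a real workspace, `get_compute_or_explain(ws, "NC24sv3Dedicated")` would replace the direct `ComputeTarget(...)` call.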

### 3. Submit Job

In [14]:
# Training scripts are based on: https://github.com/mlmed/torchxrayvision/blob/master/scripts/train_model.py
# The folder './trainingscripts' contains the python scripts that will be submitted to the compute cluster
# the main file is: 'train_model.py', pathologies are defined in 'padchest_config.py', 
# dataset is in file 'padchest_dataset.py' and training/testing loops in: 'train_utils.py'
# Please note: 
#   1. We select "PA" views over 26 pathologies, sampling 91170 images.
#   2. You can customize the csv input file.
#   3. To display performance metrics in Azure ML, we create a Run object.
#      For more details, please see file 'train_utils.py', lines: 20-23 and 143-146
project_folder = "./trainingscripts"

# We set mounting point for dataset and parameters
args = [
    '--dataset_dir', padchest_ds.as_named_input('padchest').as_mount(),
    '--custom_padchest_csv_file','PADCHEST_chest_x_ray_images_labels_160K_01.02.19.csv',
    '--num_epochs', 30, '--azure_ml', True, '--batch_size_per_gpu', 64, '--model', 'densenet'
]


# ScriptRunConfig packages together the configuration information needed to submit a run in Azure ML,
# including the script, compute target, environment, and any distributed job-specific configs
# Once a script run is configured and submitted with Experiment.submit, a ScriptRun is returned
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.scriptrunconfig?view=azure-ml-py
config = ScriptRunConfig(
    source_directory = project_folder, 
    script = 'train_model.py', 
    compute_target=compute_target,
    environment = env,
    arguments=args,
)
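
Every value in `args` reaches the training script as a command-line string, including `30`, `64`, and `True`. The script therefore has to convert them back to the intended types. The sketch below shows one way a script could parse the same flags with `argparse`. It is not the actual contents of `train_model.py`: the parser, the `str2bool` helper, the defaults, and the example mount path `/mnt/padchest` are all assumptions made for illustration.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False' style strings to a real boolean."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"Expected a boolean, got {value!r}")


def build_parser():
    """Hypothetical parser for the flags passed in the notebook's args list."""
    parser = argparse.ArgumentParser(description="PadChest training (sketch)")
    parser.add_argument("--dataset_dir", type=str, required=True)
    parser.add_argument("--custom_padchest_csv_file", type=str, default=None)
    parser.add_argument("--num_epochs", type=int, default=30)
    # type=bool would treat any non-empty string as True, so use str2bool.
    parser.add_argument("--azure_ml", type=str2bool, default=False)
    parser.add_argument("--batch_size_per_gpu", type=int, default=64)
    parser.add_argument("--model", type=str, default="densenet")
    return parser


# The same flags as the notebook, as they would appear in sys.argv.
# '/mnt/padchest' is a placeholder for the resolved mount path.
argv = [
    "--dataset_dir", "/mnt/padchest",
    "--custom_padchest_csv_file",
    "PADCHEST_chest_x_ray_images_labels_160K_01.02.19.csv",
    "--num_epochs", "30", "--azure_ml", "True",
    "--batch_size_per_gpu", "64", "--model", "densenet",
]
cfg = build_parser().parse_args(argv)
print(cfg.num_epochs, cfg.azure_ml, cfg.batch_size_per_gpu)
```

The `str2bool` converter matters because `'--azure_ml', 'False'` would otherwise still evaluate as truthy.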

In [15]:
# Submit the run configuration to the experiment. This returns the active run, which executes the trial on local or remote compute.
# This will automatically prepare your execution environments, execute your code,
# and capture your source code and results in the experiment's run history.
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.experiment.experiment?view=azure-ml-py#submit-config--tags-none----kwargs-
run = experiment.submit(config)

In [None]:
# Show progress
# RunDetails Class is a Jupyter notebook widget used to view the progress of model training.
# A widget is asynchronous and provides updates until training finishes.
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-widgets/azureml.widgets.rundetails?view=azure-ml-py
RunDetails(run).show()
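
The widget only renders inside a notebook. When the same steps run from a plain script, blocking on the run and then reading its status is a common alternative. The sketch below relies on the `wait_for_completion`, `get_status`, and `get_metrics` methods of the SDK's `Run` class; the wrapper `wait_and_summarize` and the stub run are hypothetical, and the stub only exists so the example executes without a submitted job.

```python
def wait_and_summarize(run):
    """Block until the run finishes, then collect its status and metrics."""
    # raise_on_error=False lets us inspect a failed run instead of raising.
    run.wait_for_completion(show_output=False, raise_on_error=False)
    return {
        "id": run.id,
        "status": run.get_status(),
        "metrics": run.get_metrics(),
    }


class _StubRun:
    """Stand-in for azureml.core.Run, for illustration only."""

    id = "example-run-id"

    def wait_for_completion(self, show_output=False, raise_on_error=True):
        return {}

    def get_status(self):
        return "Completed"

    def get_metrics(self):
        return {"example_metric": 0.0}  # placeholder value


summary = wait_and_summarize(_StubRun())
print(summary["status"])
```

In the notebook, `wait_and_summarize(run)` would be called with the object returned by `experiment.submit(config)`.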

In [7]:
# You should wait until you have a trained model before proceeding since the next steps rely on the model you've trained.
# At this point we have a trained model. Let's load a run and retrieve a model that has been trained within this run. 
# We will register the model and use it for inferencing later.
# TIP: 'runID' can be found in the JSON file associated with the run
runID = "Padchest-PyTorch_1639505418_fa0ebd64"
run = [r for r in experiment.get_runs() if r.id == runID][0]
print(run.id)
RunDetails(run).show()

Padchest-PyTorch_1639505418_fa0ebd64


_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', '…
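
The list comprehension above raises a bare `IndexError` when no run matches the given ID. A small helper can stop at the first match and report a clearer error otherwise. This is a sketch under the assumption that `experiment.get_runs()` yields objects with an `id` attribute, as in the cell above; `find_run` and the stub classes are hypothetical names used only to keep the example runnable offline.

```python
def find_run(experiment, run_id):
    """Return the first run whose id matches, or raise a descriptive error."""
    match = next((r for r in experiment.get_runs() if r.id == run_id), None)
    if match is None:
        raise LookupError(f"No run with id '{run_id}' in this experiment")
    return match


class _StubRun:
    """Stand-in for a run object, for illustration only."""

    def __init__(self, run_id):
        self.id = run_id


class _StubExperiment:
    """Stand-in for azureml.core.Experiment, for illustration only."""

    def get_runs(self):
        # A generator, so iteration stops as soon as a match is found.
        yield _StubRun("Padchest-PyTorch_1639505418_fa0ebd64")
        yield _StubRun("some-other-run")


selected = find_run(_StubExperiment(), "Padchest-PyTorch_1639505418_fa0ebd64")
print(selected.id)

try:
    find_run(_StubExperiment(), "missing-id")
except LookupError as err:
    error_text = str(err)
```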

### 4. Register the model

In [8]:
# A registered model is a logical container for one or more files that make up your model
# After you register the files, you can then download or deploy the registered model and receive all the files that you registered.
# The model can come from Azure ML or from somewhere else. 
# When registering a model, you can optionally provide metadata about the model. 
# The tags and properties that you apply to a model registration can then be used to filter models.
# Documentation: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.run(class)?view=azure-ml-py#register-model-model-name--model-path-none--tags-none--properties-none--model-framework-none--model-framework-version-none--description-none--datasets-none--sample-input-dataset-none--sample-output-dataset-none--resource-configuration-none----kwargs-
model = run.register_model(model_name='padchest',
                           model_path='outputs/pc-densenet-densenet-best.pt',
                           model_framework='PyTorch',
                           model_framework_version='1.8.0',
                           description="Padchest XRay Image Classifier (From Jupyter Notebook)",
                           tags={"data": "padchest", "model": "classification", "label_dataset": "rsna2021", 
                           "class_names": ['Granuloma', 'Mass', 'Hemidiaphragm Elevation', 'Hilar Enlargement', 'Hernia', 'Cardiomegaly',
                            'Aortic Atheromatosis','Bronchiectasis', 'Fibrosis', 'Tuberculosis', 'Fracture', 'Atelectasis', 'Pleural_Thickening',
                            'Costophrenic Angle Blunting', 'Pneumothorax', 'Edema', 'Nodule', 'Pneumonia','Flattened Diaphragm', 'Emphysema',
                            'Infiltration', 'Effusion', 'Aortic Elongation', 'Consolidation', 'Air Trapping', 'Scoliosis'],
                            "number_classes": "26" },
                           resource_configuration=ResourceConfiguration(cpu=1, memory_in_gb=2))

print("Model '{}' version {} registered ".format(model.name,model.version))



Model 'padchest' version 7 registered 
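
The registration above stores the class names as a Python list inside `tags`, while the SDK documents tags as a dictionary of string keys and string values. One option is to serialize the list to a JSON string when registering and decode it again at inference time. The helpers `build_tags` and `read_class_names` below are hypothetical, and the class list is shortened to three of the 26 names for brevity.

```python
import json

# First three class names from the registration cell, shortened for brevity.
CLASS_NAMES = ["Granuloma", "Mass", "Hemidiaphragm Elevation"]


def build_tags(class_names):
    """Build a tags dictionary in which every value is a string."""
    return {
        "data": "padchest",
        "model": "classification",
        "class_names": json.dumps(class_names),
        "number_classes": str(len(class_names)),
    }


def read_class_names(tags):
    """Recover the class list from the JSON-encoded tag."""
    return json.loads(tags["class_names"])


tags = build_tags(CLASS_NAMES)
print(tags["number_classes"])          # → 3
print(read_class_names(tags))
```

Deriving `number_classes` from the list keeps the two tags consistent if the list of pathologies changes.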


In [None]:
# You can download the model to the local system
model.download(exist_ok=True)