
Convolutional Neural Network model deployment using Azure Machine Learning and Docker for intraoperative brain tumor classification

Final master's thesis code and results for the Master of Science in Internet of Things taught at Universidad Politécnica de Madrid (2020/21). Developed by Alberto Martín Pérez in collaboration with the GDEM research group and the NEMESIS-3D-CM project.


1. Installation and set-up

1.1. Visual Studio Code extensions:

  • Azure extensions
  • Python extensions
  • Optional extensions

1.2. Microsoft Azure resources:

Use the links to learn how to create these resources.

  • Microsoft Azure account and subscription with credit.
  • Machine Learning workspace - Used to train, test and deploy models. You can create a Resource group during the creation of the workspace. It will also create a Storage account, a Container registry, a Key vault and an Application Insights resource.
  • Compute Clusters - Used to run the experiments to train and test models (see the sketch after this list for provisioning from the Python SDK).
  • Kubernetes service - Used to deploy containerized models as web services.
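
As a rough orientation, the workspace and compute cluster can also be provisioned from the Python SDK. A minimal sketch, assuming azureml-core and placeholder names ("cnn-hsi-ws", "cnn-hsi-rg", "gpu-cluster") plus an illustrative region and VM size; the portal workflows linked above remain the reference route:

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

# Create (or reuse) a workspace; names, resource group and region are placeholders.
ws = Workspace.create(name="cnn-hsi-ws",
                      subscription_id="<subscription-id>",
                      resource_group="cnn-hsi-rg",
                      create_resource_group=True,
                      location="westeurope")
ws.write_config()  # writes config.json for later Workspace.from_config() calls

# Provision a GPU compute cluster for the training experiments.
# The VM size and node counts are illustrative.
compute_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_NC6",
                                                       min_nodes=0,
                                                       max_nodes=4)
cluster = ComputeTarget.create(ws, "gpu-cluster", compute_config)
cluster.wait_for_completion(show_output=True)
```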

1.3. Folder structure and files:

Folders

  • Examples: Folder containing Python scripts with examples of how to use the most basic classes from the hsi_manager.py library.
  • Libraries: Folder containing all the Python files needed to train and measure PyTorch CNNs, and to manage and preprocess HSI data.
  • NEMESIS_images: Folder where all hyperspectral images and ground truth maps are saved (a loading sketch follows this list):
    • datasets: Sub-folder containing files of type 'IDXXXXCYY_dataset.mat'
    • GroundTruthMaps: Sub-folder containing files of type 'SNAPgtIDXXXXCYY_cropped_Pre-processed.mat'
    • preProcessedImages: Sub-folder containing files of type 'SNAPimagesIDXXXXCYY_cropped_Pre-processed.mat'
    • tif: Sub-folder containing files of type 'IDXXXXCYY.tif' (raw HSI), 'IDXXXXCDYY.tif' (dark HSI reference) and 'IDXXXXCWYY.tif' (white HSI reference).
  • Results: Folder where all results will be stored.
    • Classification_maps: Sub-folder containing '.png' images with classification maps. Names follow the pattern: <CNN architecture name>_<classified patient ID>_<whether cross-validation was used during training>_version_<number>.png. Example: Conv2DNet_ID0018C09_noCV_version_1.png
    • Model_deployment: Sub-folder containing information regarding execution times during model deployment and consumption. Measurements were saved manually.
    • Training_metrics: Sub-folder containing information regarding classification and time metrics from the training executions. Measurements were saved after gathering all the data stored in the Azure portal; the notebook '7_azure_read_metrics.ipynb' was used to collect the metrics automatically.
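
The '.mat' files above can be inspected with scipy.io.loadmat (scipy is already in the package list of section 1.4). A minimal sketch; 'IDXXXXCYY' is the repository's patient/capture placeholder, and the variable names stored inside the files are not documented here, so they are printed rather than assumed:

```python
from scipy.io import loadmat

# Paths follow the folder layout described above; replace the placeholder ID.
cube_file = "NEMESIS_images/preProcessedImages/SNAPimagesIDXXXXCYY_cropped_Pre-processed.mat"
gt_file = "NEMESIS_images/GroundTruthMaps/SNAPgtIDXXXXCYY_cropped_Pre-processed.mat"

cube_mat = loadmat(cube_file)  # dict of MATLAB variables -> numpy arrays
gt_mat = loadmat(gt_file)

# Inspect the stored variable names before indexing into the dicts.
print(cube_mat.keys())
print(gt_mat.keys())
```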

Files

  • 1_azure_connection.py: Shows how to establish a connection to an Azure Machine Learning workspace using the config.json file downloaded from the Azure ML studio page (a combined sketch of scripts 1-4 follows this list).
  • 2_azure_data_upload.py: Shows how to upload data to the default Datastore (Azure Blob storage) of the Azure Machine Learning workspace.
  • 3_azure_create_dataset.py: Shows how to create datasets from the default Datastore containing the uploaded files.
  • 4_azure_download_dataset.py: Shows how to download created datasets.
  • 5_azure_control_train.py: Shows how to define an environment to run experiments in an Azure compute cluster (see the training sketch after this list). Uses one of these two training scripts:
    • azure_train_experiments.py: Which trains Conv2DNet models using the 5-fold double cross-validation implementation.
    • azure_train_noCV_experiments.py: Which trains a Conv2DNet model without using the 5-fold double cross-validation implementation.
  • 6_azure_deploy_use_model.ipynb: Shows how to deploy and consume a registered model using Azure Kubernetes Service and the Azure SDK for Python (no HTTP); see the deployment sketch after this list. Uses the following scoring script for the web service:
    • score_brain.py: Scoring script that takes a registered model, preprocesses a hyperspectral cube and returns the predicted classification map as a JSON object.
  • 7_azure_read_metrics.ipynb: Shows how to automatically store the metrics logged by experiments run in Azure Machine Learning into local .csv files (see the metrics sketch after this list).
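
The first four scripts boil down to a handful of documented azureml-core calls. A combined sketch; the datastore path and the dataset name "nemesis_hsi" are placeholders, not necessarily the ones used by the scripts:

```python
from azureml.core import Dataset, Workspace

# 1) Connect using the config.json downloaded from the Azure ML studio page.
ws = Workspace.from_config()

# 2) Upload the local image folders to the workspace's default datastore.
datastore = ws.get_default_datastore()
datastore.upload(src_dir="./NEMESIS_images", target_path="NEMESIS_images",
                 overwrite=True, show_progress=True)

# 3) Create a FileDataset from the uploaded files and register it.
dataset = Dataset.File.from_files(path=(datastore, "NEMESIS_images/**"))
dataset = dataset.register(workspace=ws, name="nemesis_hsi",
                           create_new_version=True)

# 4) Download a registered dataset to a local folder.
Dataset.get_by_name(ws, "nemesis_hsi").download(target_path="./data",
                                                overwrite=True)
```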
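Submitting a training run (script 5) follows the standard Environment/ScriptRunConfig/Experiment pattern. A sketch, assuming a requirements.txt with the pins from section 1.4 and the placeholder names "hsi-train-env", "gpu-cluster" and "cnn-hsi-training":

```python
from azureml.core import Environment, Experiment, ScriptRunConfig, Workspace

ws = Workspace.from_config()

# Build the run environment from pinned pip requirements (see section 1.4).
env = Environment.from_pip_requirements("hsi-train-env", "requirements.txt")

# Point the run at one of the two training scripts and the compute cluster.
src = ScriptRunConfig(source_directory=".",
                      script="azure_train_experiments.py",
                      compute_target="gpu-cluster",
                      environment=env)

run = Experiment(ws, "cnn-hsi-training").submit(src)
run.wait_for_completion(show_output=True)
```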
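Deployment and consumption (notebook 6) go through Model.deploy and Webservice.run; score_brain.py must expose the standard scoring-script pair init() (load the model once) and run(raw_data) (answer each request). A sketch; the registered-model name, AKS compute name, service name and resource sizes are placeholders:

```python
import json

from azureml.core import Environment, Workspace
from azureml.core.compute import AksCompute
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AksWebservice

ws = Workspace.from_config()

model = Model(ws, name="Conv2DNet")         # placeholder registered-model name
aks_target = AksCompute(ws, "aks-cluster")  # placeholder AKS compute name

env = Environment.from_pip_requirements("hsi-deploy-env", "requirements.txt")
inference_config = InferenceConfig(entry_script="score_brain.py",
                                   environment=env)
deploy_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=4)

service = Model.deploy(ws, "brain-tumor-service", [model],
                       inference_config, deploy_config,
                       deployment_target=aks_target)
service.wait_for_deployment(show_output=True)

# Consume through the SDK (no raw HTTP); the payload shape is an assumption,
# the real one is defined by score_brain.py.
response = service.run(json.dumps({"data": "<hyperspectral cube as nested lists>"}))
print(response)
```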
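Collecting the logged metrics (notebook 7) iterates over the runs of an experiment. A sketch, assuming the placeholder experiment name "cnn-hsi-training" and an illustrative output file name; scalar metrics are flattened into one row per run:

```python
import pandas as pd
from azureml.core import Experiment, Workspace

ws = Workspace.from_config()
experiment = Experiment(ws, "cnn-hsi-training")  # placeholder experiment name

rows = []
for run in experiment.get_runs():
    metrics = run.get_metrics()
    # Keep scalar metrics only; per-epoch lists would need their own table.
    scalars = {k: v for k, v in metrics.items() if not isinstance(v, list)}
    rows.append({"run_id": run.id, **scalars})

pd.DataFrame(rows).to_csv("Results/Training_metrics/metrics.csv", index=False)
```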

1.4. Conda and pip packages for Python

Python 3.8.10 was used since, at the time, azureml-core did not support Python versions >= 3.9. The following packages and versions were used during the development of this thesis; installing them will pull in their corresponding dependencies.

Package        Version
azureml-core   1.31.0
matplotlib     3.4.2
numpy          1.19.3
pandas         1.3.0
scikit-learn   0.24.2
scipy          1.7.0
torch          1.9.0+cu111
tqdm           4.61.1
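
The same pins can be reproduced when defining the Azure ML run environment. A sketch using CondaDependencies; the environment name is illustrative, and the CUDA build of torch (1.9.0+cu111) may need an extra pip find-links option depending on how it is resolved, so the plain 1.9.0 pin is used here:

```python
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

# Pin the interpreter and the package versions from the table above.
deps = CondaDependencies.create(python_version="3.8.10",
                                pip_packages=["azureml-core==1.31.0",
                                              "matplotlib==3.4.2",
                                              "numpy==1.19.3",
                                              "pandas==1.3.0",
                                              "scikit-learn==0.24.2",
                                              "scipy==1.7.0",
                                              "torch==1.9.0",
                                              "tqdm==4.61.1"])

env = Environment("hsi-pinned-env")
env.python.conda_dependencies = deps
```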

About

This repository contains the code needed to train PyTorch 2D-CNN models in Azure Machine Learning. Hyperspectral imaging data is managed and preprocessed to feed the CNN models. Once trained, models are registered in an Azure Machine Learning workspace and then deployed as web services on Azure Kubernetes Service. These web services are used to…
