Day1-AzureML.md
AMLWorkshop-IotEdge-DevOps

ML/IoT/DevOps hands-on workshop

Agenda

Day 1

Common

  • 09:30-10:00 Workshop overview, scope, expectations
    • Process flow and architecture (pdf)
    • DevOps pipeline (pdf)

ML Track

  • 10:00-10:50 Dev environment setup: Azure ML service Workspace and Azure Notebooks, authenticate, prepare compute (Azure ML Compute)

    1. Install

    2. Check Azure subscription

      • All attendees should be able to sign in
    3. Create an AML service workspace

      • region: East US
      • resource group: new (one per person for practice)
      • after creation, check Usage + quotas for Standard NC Family vCPUs: the subscription should have 100+ dedicated cores available for this workshop (e.g., 5 people * 6 cores * 4 nodes = 120 cores)
    4. (optional) Add users in Access Control (IAM)

    5. From Azure ML service Workspace Overview tab, click Download config.json, save locally.

    6. Set up Notebook environment. In this workshop, use Option 1 to practice.

      • Option 1: using Azure Notebooks
      • Option 2: using Notebook VMs from Azure ML service Workspace
        • go to Notebook VMs, create a new VM (STANDARD_D3_V2)
        • Click JupyterLab, click Terminal
        • From the current directory /mnt/azmnt/code/Users/, cd into your working folder (mkdir it first if needed), then run git clone https://github.com/Azure/MachineLearningNotebooks
        • Note that config.json is already added to /mnt/azmnt/ automatically, so you do not need to upload it manually.
        • From Notebook VMs, click Jupyter, and you can run notebooks there
    7. Create Azure ML Compute: To do that, open configuration.ipynb

      • Skip creating config.json, because you already have it
      • Instead, add a cell and run the following script to load config.json and authenticate.
        from azureml.core import Workspace
        ws = Workspace.from_config()
      • Proceed to create Azure ML Compute
        • cpucluster STANDARD_D2_V3, 0 to 4 nodes
        • gpucluster STANDARD_NC6, 0 to 4 nodes
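The setup steps above (quota check, authentication, cluster creation) can be sketched in Python as follows. This is a minimal sketch, assuming the Azure ML Python SDK (azureml-core) is installed and config.json is discoverable from the working directory. The helper names `required_cores`, `create_cluster`, and `setup_workshop_compute` are hypothetical and are not taken from configuration.ipynb.

```python
def required_cores(attendees, cores_per_node, max_nodes):
    """Dedicated vCPUs needed if every attendee scales a cluster to max_nodes."""
    return attendees * cores_per_node * max_nodes


def create_cluster(ws, name, vm_size, max_nodes=4):
    """Create an autoscaling Azure ML Compute cluster (0 to max_nodes nodes)."""
    # Imported lazily so the quota helper above works without the SDK installed.
    from azureml.core.compute import AmlCompute, ComputeTarget

    config = AmlCompute.provisioning_configuration(
        vm_size=vm_size, min_nodes=0, max_nodes=max_nodes
    )
    target = ComputeTarget.create(ws, name, config)
    target.wait_for_completion(show_output=True)
    return target


def setup_workshop_compute():
    """Authenticate with config.json, then create the two workshop clusters."""
    from azureml.core import Workspace

    ws = Workspace.from_config()  # reads the downloaded config.json
    cpu = create_cluster(ws, "cpucluster", "STANDARD_D2_V3")
    gpu = create_cluster(ws, "gpucluster", "STANDARD_NC6")
    return ws, cpu, gpu


# Quota example from the text: 5 people * 6 cores (STANDARD_NC6) * 4 nodes
print(required_cores(attendees=5, cores_per_node=6, max_nodes=4))  # → 120
```

Calling `setup_workshop_compute()` from a notebook cell would perform the actual provisioning. Setting `min_nodes=0` lets the clusters scale down to zero when idle, which matches the "0 to 4 nodes" setting above.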
  • 11:00-11:50 Train first DL model on Azure Notebooks using Azure ML Compute

    1. Open sample notebook train-hyperparameter-tune-deploy-with-keras.ipynb under how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras (find this notebook from your notebook environment)
    2. Run the cells up to, but not including, the run.wait_for_completion() cell
    3. Monitor the Jupyter widget, and the Workspace (from Azure Portal - check Experiment and Compute)
    4. Additionally, note that files in ./outputs and ./logs are automatically uploaded to the Workspace. TensorBoard logs should also be saved in ./logs. Refer to how to train models and the TensorBoard integration sample.
    5. Try to understand how the model files move from AML Compute, to the Workspace, to your local environment.
    6. Continue running the notebook and try hyperparameter tuning.
      • Set the max_concurrent_runs parameter to the maximum number of nodes in your Azure ML Compute cluster.
      • Run, monitor the Jupyter widget and Azure Portal (AML service Workspace), evaluate the results

        Note: When you reopen a notebook, the outputs of the last run are shown for ordinary code cells, but Jupyter widget output is not. To review the status of a previous run in the widget without running the experiment again, find and load that run first, then pass it to the widget. A sample notebook showing how to do this is here.

    7. Stop here. You may continue and deploy to ACI from this notebook, but we will cover deployment in the afternoon.
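The file movement in steps 4 and 5 and the widget note in step 6 can be sketched as below. This is a sketch under assumptions, not the sample notebook's code: the helper names, the `target_dir` default, and the idea of capping concurrency at the cluster size are illustrative. The SDK calls used (`Run`, `get_file_names`, `download_file`, `RunDetails`) come from azureml-core and azureml-widgets, which are assumed to be installed.

```python
import os


def uploaded_artifacts(file_names):
    """Keep only run files under outputs/ or logs/ (uploaded automatically)."""
    return [n for n in file_names if n.startswith(("outputs/", "logs/"))]


def concurrent_runs(cluster_max_nodes, max_total_runs):
    """Concurrency for hyperparameter tuning, capped by the cluster size."""
    return min(cluster_max_nodes, max_total_runs)


def reattach_and_download(ws, experiment_name, run_id, target_dir="./downloaded"):
    """Load an earlier run, show it in the widget, and copy its files locally."""
    # Imported lazily so the pure helpers above work without the SDK installed.
    from azureml.core import Experiment, Run
    from azureml.widgets import RunDetails

    run = Run(Experiment(ws, experiment_name), run_id)
    RunDetails(run).show()  # widget for the earlier run, no re-execution needed

    # Workspace -> local environment: download what the run uploaded.
    for name in uploaded_artifacts(run.get_file_names()):
        dest = os.path.join(target_dir, name)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        run.download_file(name, output_file_path=dest)
    return run
```

The value returned by `concurrent_runs` is what would be passed as `max_concurrent_runs` when building the HyperDrive configuration; with a 4-node cluster this is 4.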
  • 13:00-14:50 Distributed training with Horovod on AML Compute, explore AML Workspace

    1. Open sample notebook distributed-pytorch-with-horovod.ipynb under how-to-use-azureml/training-with-deep-learning/distributed-pytorch-with-horovod (find this notebook from your notebook environment)
    2. Run all cells. If 4 nodes are available, consider setting node_count to 4 instead of 2.
    3. Questions and answers, or proceed to the next step.
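The node_count choice in step 2 might look like the sketch below. It assumes an SDK release in which the `PyTorch` estimator accepts `distributed_training=MpiConfiguration()`; older releases used `distributed_backend='mpi'` instead, so check what your copy of the notebook uses. The helper names and their arguments are hypothetical.

```python
def choose_node_count(cluster_max_nodes, preferred=4, fallback=2):
    """Use 4 nodes when the cluster can scale that far, otherwise fall back to 2."""
    return preferred if cluster_max_nodes >= preferred else fallback


def build_horovod_estimator(compute_target, source_directory, entry_script,
                            cluster_max_nodes):
    """Build a PyTorch estimator that launches the script with MPI/Horovod."""
    # Imported lazily so choose_node_count works without the SDK installed.
    from azureml.core.runconfig import MpiConfiguration
    from azureml.train.dnn import PyTorch

    return PyTorch(
        source_directory=source_directory,
        compute_target=compute_target,
        entry_script=entry_script,  # training script from the sample folder
        node_count=choose_node_count(cluster_max_nodes),
        distributed_training=MpiConfiguration(),
        use_gpu=True,
    )
```

With the gpucluster configured for 0 to 4 nodes, `choose_node_count(4)` selects 4, so the job spreads across every node the cluster can provide.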
  • 15:00-16:50 Create container images, deploy to Azure Container Instances (and/or Azure Kubernetes Service)

    1. We will continue from this morning's sample, train-hyperparameter-tune-deploy-with-keras.ipynb. Open the notebook and run the latter part, which creates a container image and deploys it to ACI.
    2. Explore Workspace from Azure Portal.
    3. Review the MLOps concepts in concept-model-management-and-deployment
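The image creation and ACI deployment in step 1 can be sketched roughly as follows. This is only a sketch. It assumes the older image-based SDK flow (`ContainerImage` plus `Webservice.deploy_from_model`), which newer SDK releases replace with `Model.deploy`. The helper names, the `{"data": ...}` payload shape, and the 1 core / 1 GB sizing are assumptions rather than values confirmed by the notebook.

```python
import json


def build_payload(samples):
    """Serialize input samples as JSON (assumed shape: {"data": [...]})."""
    return json.dumps({"data": samples})


def deploy_to_aci(ws, model, service_name, score_script, conda_file):
    """Build a container image for a registered model and deploy it to ACI."""
    # Imported lazily so build_payload works without the SDK installed.
    from azureml.core.image import ContainerImage
    from azureml.core.webservice import AciWebservice, Webservice

    image_config = ContainerImage.image_configuration(
        execution_script=score_script,  # scoring script with init() and run()
        runtime="python",
        conda_file=conda_file,
    )
    aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
    service = Webservice.deploy_from_model(
        workspace=ws,
        name=service_name,
        models=[model],
        image_config=image_config,
        deployment_config=aci_config,
    )
    service.wait_for_deployment(show_output=True)
    return service
```

Once the service is running, `service.run(input_data=build_payload(samples))` would send a scoring request, provided the scoring script parses the same payload shape.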
  • 17:00-17:50 Questions and answers