Shorter version of the AzureML hands-on workshop
- A condensed 4-hour version covering the core content.
- We start from the simplest way to upload data to Azure, then create an Azure ML workspace, run a first experiment, and try out a selection of Azure ML's key features through hands-on labs or demos.
- Due to limited time, the inference part is covered demo-first from an MLOps perspective.
- This version minimizes installation on the client and performs most of the work in the cloud environment.
- This session cannot cover all of Azure ML, but it lets you experience at least some of the core concepts first-hand and lays the groundwork for exploring individual features later.
- Time permitting, we will look at some of the advanced features:
- Experiment e2e tracking
- Automated ML, HyperDrive
- MLOps
- Distributed Training with SR-IOV
- DeepSpeed
- Enterprise Readiness
Prepare
- Check Azure subscription
- All attendees should be able to sign in
- Install
- Internet browser of your choice (Edge is fine, Chrome is also good)
- Azure Storage Explorer
- (optional) Visual Studio Code
- (optional) GitHub Desktop
14:00-14:50 Workshop overview, scope, expectations and getting started
- Create an AML service workspace
- region: Korea Central
- resource group: new (one per person for practice)
- after creation, check `Usage + quotas`, Standard NC Family vCPUs: there should be enough available dedicated cores for this workshop (e.g., 5 people * 6 cores * 4 nodes = 120 cores) if we're trying Deep Learning
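The quota figure above is simple arithmetic; the attendee count, cores per node (6 for STANDARD_NC6), and maximum node count are workshop assumptions, so adjust them to your group:

```python
# Dedicated-core quota needed if every attendee scales a
# STANDARD_NC6 cluster to its maximum node count at the same time.
attendees = 5        # assumed workshop size
cores_per_node = 6   # a STANDARD_NC6 node has 6 vCPU cores
max_nodes = 4        # cluster maximum used in this workshop

required_cores = attendees * cores_per_node * max_nodes
print(required_cores)  # 120
```

If the subscription's available dedicated NC-family cores are below this number, request a quota increase before the workshop.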
- (optional) Add users in `Access Control (IAM)`
- Use Storage Explorer to upload files
- Note that it leverages AzCopy for faster, parallel uploads
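Under the hood the upload is an `azcopy copy` invocation; as a sketch, the command can be assembled like this (the helper name, storage account, container, and SAS token below are placeholders, not workshop values):

```python
def build_azcopy_upload(local_path: str, container_url: str, sas_token: str) -> list[str]:
    """Assemble an `azcopy copy` command that uploads a folder recursively."""
    return [
        "azcopy", "copy",
        local_path,
        f"{container_url}?{sas_token}",  # destination: container URL + SAS token
        "--recursive",                   # walk subdirectories, as Storage Explorer does
    ]

cmd = build_azcopy_upload(
    "./data",                                            # local folder to upload
    "https://myaccount.blob.core.windows.net/workshop",  # placeholder container URL
    "sv=PLACEHOLDER",                                    # placeholder SAS token
)
print(" ".join(cmd))
```

Storage Explorer generates the SAS token for you; this sketch only shows the shape of the resulting command.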
- Key concepts
- Data flow and architecture (pdf)
- (Optional) DevOps pipeline (pdf)
- Azure ML Overview: "What an AI project needs"
- Azure ML conceptual diagram
15:00-15:50 Visit AML studio, create computes and try Notebooks
- Visit https://ml.azure.com/ or https://ml.azure.com?flight=azureNotebooks
- Learn key concepts
- Create a Compute Instance. Alternatively, you can use your local environment.
- Go to `Compute` > `Compute Instance`, create a new VM (STANDARD_D3_V2)
- Clone the sample git repo. Choose one of the following options.
- Option 1: Use GUI
- Option 2: Use code
- You can do this only when your Compute Instance is running.
- Go to `Compute` > `Compute Instance`. Start the Compute Instance if it is not running. Click `JupyterLab` of a running Compute Instance, then click `Terminal`.
- From the home directory `/mnt/azureuser/`, cd into a directory (or mkdir one if needed; the name can be anything you want), then run `git clone https://github.com/Azure/MachineLearningNotebooks`
- Run Notebooks from the cloud
- You have two ways to do this. Choose one of the following options. Both require a running Compute Instance.
- Option 1: Use studio (UI)
- Go to `Notebooks` > `User files`, choose the Jupyter Notebook you want to run.
- If no Compute Instance is running, you can start it from `Compute` > `Compute Instance` and come back here.
- You can specify on which Compute Instance you will run the Notebook.
- You can edit, run and document as you would normally do with a Jupyter Notebook.
- Option 2: Use Compute Instance Jupyter directly
- (Optional) In case you want to skip AD authentication and use SSH tunneling to access services running in the Compute Instance, such as Jupyter:
- Create an SSH key pair using PuTTYgen or any other tool. If you use PuTTYgen:
- Run `bash`
- Install putty-tools if you haven't, by running `sudo apt install putty-tools`.
- Run the following commands to generate the key pair, save the private key, then retrieve the public key in OpenSSH RSA format, which is needed for Compute Instance creation: `puttygen -t rsa -b 2048 -C "azureuser@ci" -o private-key.ppk`, then `puttygen private-key.ppk -O public-openssh -o public-key-openssh`. If you want to set a passphrase, refer to the documentation shared above.
- Create the Compute Instance using the SSH public key above
- Use PuTTY or any other tool to create an SSH tunnel between localhost and the remote Compute Instance
- Now you can browse https://localhost:8000/ to get to Jupyter (and https://localhost:8000/lab for JupyterLab) running remotely in the Compute Instance.
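As a sketch of the tunneling step with plain OpenSSH instead of PuTTY: `build_tunnel_cmd` is a hypothetical helper, port 50000 is the usual Compute Instance SSH port, and the remote Jupyter port may differ on your instance, so verify both in the portal.

```python
def build_tunnel_cmd(host: str, key_file: str,
                     local_port: int = 8000, remote_port: int = 8000,
                     ssh_port: int = 50000, user: str = "azureuser") -> list[str]:
    """Assemble an ssh command that forwards local_port to the Compute Instance."""
    return [
        "ssh", "-i", key_file,
        "-p", str(ssh_port),   # Compute Instance SSH port (commonly 50000)
        "-N",                  # no remote shell; keep the tunnel only
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{host}",
    ]

# Placeholder IP; look up your Compute Instance's public IP in the portal.
print(" ".join(build_tunnel_cmd("203.0.113.10", "private-key")))
```

With the tunnel up, browsing https://localhost:8000/ reaches the remote service as described above.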
- Create Azure ML Compute: To do that, open `configuration.ipynb` under Notebooks
- Proceed to create Azure ML Compute:
- `cpucluster`: STANDARD_D2_V3, 0 to 4 nodes
- `gpucluster`: STANDARD_NC6, 0 to 4 nodes
- Try a Notebook: sample - first ml experiment
16:00-16:50 Try Automated ML
- (Demo) Regression example
- Open the sample notebook `auto-ml-regression-hardware-performance-explanation-and-featurization` under `how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization` (find this notebook from your notebook environment)
- Run the cells (up to the `run.wait_for_completion()` cell)
- Monitor the Jupyter widget, and the Workspace (from the Azure Portal - check Experiment and Compute)
- Additionally, note that files in `./outputs` and `./logs` are automatically uploaded to the Workspace. TensorBoard logs should also be saved in this `./logs`. Refer to how to train models and the TensorBoard integration sample.
- Try to understand how the model files move: from AML Compute, to the Workspace, to the local environment.
- (Your turn) Let's try Regression with the UI
- Upload the sample orange juice CSV
- Go to `Automated ML`, click `New Automated ML run` and try regression.
- Pick the dataset you uploaded, pick the target column.
- Choose the Compute you have created (e.g., CPU cluster)
- Choose `Regression` if that is what you want to run.
- Check out `View featurization settings`
- Check out `View additional configuration settings`
- Primary Metrics
- Automatic Featurization
- Explain best model
- Blocked algorithms
- Exit criterion
- Validation
- Concurrency
- Click Submit
- Check `Automated ML` (Data guardrails, Models), `Experiments`, `Compute`
- Check `Explainer`
- You may try `Deploy best model` from the run.
17:00-17:50 Check out Designer and MLOps
- Designer
- Try starting a new Pipeline Draft
- Try opening a sample
- Check out quick demos
- MLOps
- Revisit the architecture diagram
- Check out quick demos
- Some additional features
- Build 2019 updates: New Azure Machine Learning updates simplify and accelerate the ML lifecycle
- visual-interface (preview)
- automated ml with GUI (preview)
- interpretability-explainability
- onnx
- fpga
- pipelines
- Enterprise Readiness
- Distributed Training with SR-IOV
- DeepSpeed
- Model Inference Optimization (private)
- custom vision
- Further questions