Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Train using Azure Arc-enabled Machine Learning with NFS-mounted data

This example notebook demonstrates how to train a simple scikit-learn model on data stored on an NFS server, using Azure Arc-enabled Machine Learning.
It covers the following steps:

* Set up an NFS Server
* Copy the Iris data to the NFS Server
* Configure NFS Server mounts on your Kubernetes Cluster
* Set up your connection to Azure Machine Learning
* Create the necessary Azure Machine Learning objects
* Submit a Training Run

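## Set up an NFS Server
This notebook assumes that you either have access to an existing NFS server or know how to set one up. Setting up and configuring NFS is beyond
the scope of this example. To complete this notebook you will need to know the address of your NFS server and how to copy files onto it.

The cell below assumes the NFS share is already mounted at `/nfs_share` on the machine running this notebook; change `nfs_mount_path` if your mount point differs.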

In [None]:
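import os

# Path where the NFS share is mounted on the machine running this notebook
nfs_mount_path = '/nfs_share'

# Directory on the share that will hold the Iris training data
iris_dir = os.path.join(nfs_mount_path, 'iris')
os.makedirs(iris_dir, exist_ok=True)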
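
Because the cell above writes directly to `nfs_mount_path`, it can help to confirm that the path exists, is a mount point, and is writable before copying data. The following is a minimal sketch and not part of the original notebook; the helper name `check_share` is hypothetical, and it uses only the Python standard library.

```python
import os
import tempfile


def check_share(path):
    """Return basic facts about a directory that is expected to be an NFS mount."""
    return {
        'exists': os.path.isdir(path),
        # True only if the path itself is a mount point, not a subdirectory of one
        'is_mount': os.path.ismount(path),
        'writable': os.access(path, os.W_OK),
    }


# Demonstration against a temporary directory (assumption: it stands in for the share).
# In the notebook you would call check_share(nfs_mount_path) instead.
with tempfile.TemporaryDirectory() as demo_dir:
    print(check_share(demo_dir))
```

If `is_mount` is `False` for your real share path and the path is not a subdirectory of a mount, files written there may land on the local disk rather than on the NFS server.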

## Copy the Iris data to the NFS Server
The iris.csv file (located in this directory) contains the training data.  The following code copies this file into your NFS share.

In [None]:
import shutil

shutil.copyfile('iris.csv', os.path.join(iris_dir, 'iris.csv'))
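
Copies over a network file system can occasionally be truncated or interrupted, so it may be worth checking that the file on the share matches the source. The sketch below is one possible way to do that, under the assumption that a byte-for-byte comparison is acceptable for a file of this size; `copy_and_verify` is a hypothetical helper, and the CSV content in the demonstration is placeholder data, not the real Iris file.

```python
import filecmp
import os
import shutil
import tempfile


def copy_and_verify(src, dst_dir):
    """Copy src into dst_dir and confirm the copy matches the source byte for byte."""
    os.makedirs(dst_dir, exist_ok=True)
    dst = os.path.join(dst_dir, os.path.basename(src))
    shutil.copyfile(src, dst)
    # shallow=False forces a content comparison instead of comparing os.stat() data
    if not filecmp.cmp(src, dst, shallow=False):
        raise IOError(f'copy of {src} to {dst} does not match the source')
    return dst


# Demonstration in a temporary directory (assumption: stands in for the NFS share).
with tempfile.TemporaryDirectory() as tmp:
    demo_src = os.path.join(tmp, 'iris.csv')
    with open(demo_src, 'w') as f:
        f.write('col_a,col_b,label\n1.0,2.0,x\n')  # placeholder rows
    demo_dst = copy_and_verify(demo_src, os.path.join(tmp, 'nfs_share', 'iris'))
    print('verified copy at', demo_dst)
```

In the notebook, the equivalent call would be `copy_and_verify('iris.csv', iris_dir)`.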

## Configure NFS Server mounts on your Kubernetes Cluster

Follow the instructions [here](../amlarc-nfs-setup/README.md) to configure your Azure Arc-enabled Machine Learning cluster to mount your NFS server.

## Set up your connection to Azure Machine Learning

In [1]:
# Check core SDK version number
import azureml.core

print("SDK version:", azureml.core.VERSION)

Failure while loading azureml_run_type_providers. Failed to load entrypoint hyperdrive = azureml.train.hyperdrive:HyperDriveRun._from_run_dto with exception (azureml-core 1.32.0 (/disks/4TB/code/e2e/reinforcement-learning/lib/python3.6/site-packages), Requirement.parse('azureml-core~=1.30.0')).


SDK version: 1.32.0


In [2]:
# Connect to the Workspace described by local configuration
from azureml.core import Workspace

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.


dasommer-ml-eus
dasommer
eastus
4aaa645c-5ae2-4ae9-a17a-84b9023bc56a


## Create the necessary Azure Machine Learning objects

In [4]:
# Create an Experiment
from azureml.core import Experiment
experiment_name = 'train-on-amlarc-with-nfs'
experiment = Experiment(workspace=ws, name=experiment_name)

In [5]:
# Create a Docker-based environment with scikit-learn installed
from azureml.core import Environment
from azureml.core.runconfig import DockerConfiguration
from azureml.core.conda_dependencies import CondaDependencies

myenv = Environment("myenv")
myenv.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn', 'packaging'])

# Enable Docker
docker_config = DockerConfiguration(use_docker=True)

In [12]:
# Specify the name of an existing Azure Arc-enabled Machine Learning compute target
amlarc_cluster = 'amlarc'

## Submit a Training Run

In [16]:
from azureml.core import ScriptRunConfig

# Configure the run. The script receives the NFS data path set above via --data-dir,
# so that path needs to resolve to the NFS share inside the training container.
src = ScriptRunConfig(source_directory='scripts',
                      script='train.py',
                      compute_target=amlarc_cluster,
                      environment=myenv,
                      arguments=['--data-dir', iris_dir],
                      docker_runtime_config=docker_config)

run = experiment.submit(config=src)
run

| Experiment | Id | Type | Status | Details Page | Docs Page |
|---|---|---|---|---|---|
| train-on-amlarc-with-nfs | train-on-amlarc-with-nfs_1629765768_2475dc17 | azureml.scriptrun | Starting | Link to Azure Machine Learning studio | Link to Documentation |


Note: if you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run).
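
The run passes `--data-dir` to `scripts/train.py`, which is not shown in this notebook. As a rough illustration only, a script like that might parse the argument and load the CSV along the following lines. Everything here is an assumption: the function names `parse_args` and `load_rows` are hypothetical, the file name `iris.csv` is taken from the copy step, the demonstration rows are placeholder data, and the real script presumably goes on to fit a scikit-learn model, which this sketch leaves out.

```python
import argparse
import csv
import os
import tempfile


def parse_args(argv=None):
    """Parse the --data-dir argument that the run configuration passes in."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--data-dir', type=str, required=True,
                        help='directory on the NFS mount that contains iris.csv')
    return parser.parse_args(argv)


def load_rows(data_dir, filename='iris.csv'):
    """Read the CSV file and return (header, rows), skipping empty lines."""
    path = os.path.join(data_dir, filename)
    with open(path, newline='') as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = [row for row in reader if row]
    return header, rows


# Demonstration with placeholder data in a temporary directory.
# A real train.py would call parse_args() with no arguments to read sys.argv.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'iris.csv'), 'w', newline='') as f:
        f.write('col_a,col_b,label\n1.0,2.0,x\n3.0,4.0,y\n')
    args = parse_args(['--data-dir', tmp])
    header, rows = load_rows(args.data_dir)
    print(f'Loaded {len(rows)} rows with columns {header}')
```

If the script fails with a file-not-found error on the cluster, the first thing to check is whether the path in `--data-dir` exists inside the training container.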

In [None]:
# Stream the run's output to stdout and block until the run completes.
run.wait_for_completion(show_output=True)

In [None]:
# Retrieve the metrics logged by the run
run.get_metrics()