In [1]:
## SET UP AZUREML DETAILS
# imports
from azureml.core.authentication import InteractiveLoginAuthentication
from azureml.core import Workspace, Environment, Experiment, Dataset, ScriptRunConfig

# set up workspace
config_path = '../../utils/config_GPU.json'
tenant_id = '72f988bf-86f1-41af-91ab-2d7cd011db47'  # tenant ID, as printed by `az login`
interactive_auth = InteractiveLoginAuthentication(tenant_id=tenant_id)  # create log-in object
ws = Workspace.from_config(path=config_path, auth=interactive_auth)  # link workspace

# set up environment
# - obtain environment.yml from `conda env export > environment.yml`
env_name = 'SampleEnv'
env_path = '../../utils/environment_case_study_cuml.yml'
env = Environment.from_conda_specification(name=env_name, file_path=env_path)
# - enable Docker and set the base image (an AzureML-maintained CUDA image)
env.docker.enabled = True
env.docker.base_image = 'mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.0.3-cudnn8-ubuntu18.04'

# set up experiment
experiment_name = 'AnomalyDetection'
exp = Experiment(workspace=ws, name=experiment_name)

# set up dataset
dataset_path = 'http://kdd.ics.uci.edu/databases/kddcup99/kddcup.data_10_percent.gz'
ds = Dataset.File.from_files(dataset_path)

# set up run
src_dir = '../../src/case_study_cuml'
src_name = 'azure_cuml_case_B.py'
compute_name = 'EastUS2gpu112'
arguments = ['--data-path', ds.as_mount()]
src = ScriptRunConfig(source_directory=src_dir, script=src_name, compute_target=compute_name,
                      environment=env, arguments=arguments)


If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.
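
The warning above recommends ServicePrincipalAuthentication for unattended runs. The following is a minimal sketch of that route, not the setup used in this notebook. The environment variable names (`AML_TENANT_ID`, `AML_SP_CLIENT_ID`, `AML_SP_CLIENT_SECRET`) and both helper functions are hypothetical; it assumes a service principal with access to the workspace already exists.

```python
import os
from typing import Mapping

# Hypothetical environment variable names; adapt them to your own secret store.
REQUIRED_VARS = ("AML_TENANT_ID", "AML_SP_CLIENT_ID", "AML_SP_CLIENT_SECRET")


def load_sp_credentials(env: Mapping[str, str] = os.environ) -> dict:
    """Collect service principal settings, failing early if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "tenant_id": env["AML_TENANT_ID"],
        "service_principal_id": env["AML_SP_CLIENT_ID"],
        "service_principal_password": env["AML_SP_CLIENT_SECRET"],
    }


def connect_unattended(config_path: str):
    """Link the workspace without any interactive prompt (hypothetical helper)."""
    # Imported lazily so load_sp_credentials works without azureml-core installed.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**load_sp_credentials())
    return Workspace.from_config(path=config_path, auth=auth)
```

With the three variables exported, `ws = connect_unattended('../../utils/config_GPU.json')` would stand in for the interactive login in cell 1. Reading the secret from the environment keeps it out of the notebook and out of version control.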


In [2]:
## SUBMIT THE RUN
from azureml.widgets import RunDetails

run = exp.submit(src)  # submit it to the azureml platform
RunDetails(run).show()  # monitor the steps


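
The submitted run starts the script with `--data-path` followed by the location where the dataset is mounted on the compute target. The contents of `azure_cuml_case_B.py` are not shown here, so the following is only a sketch of how such a script might read that argument; `parse_args` and the example path are hypothetical.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the arguments that ScriptRunConfig passes to the script."""
    parser = argparse.ArgumentParser(description="Anomaly detection run (sketch)")
    parser.add_argument(
        "--data-path",
        dest="data_path",
        required=True,
        help="Directory or file where the dataset is mounted",
    )
    # argv=None makes argparse fall back to sys.argv[1:], as in a real run.
    return parser.parse_args(argv)


# Simulate the call made on the compute target with a made-up mount point.
args = parse_args(["--data-path", "/tmp/mounted_dataset"])
print(f"Reading data from {args.data_path}")
```

Accepting an explicit `argv` list keeps the parser easy to exercise locally before submitting to the cluster.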

In [3]:
# show the outputs
run.wait_for_completion(show_output=True)


RunId: AnomalyDetection_1624182604_623dc294
Web View: https://ml.azure.com/experiments/AnomalyDetection/runs/AnomalyDetection_1624182604_623dc294?wsid=/subscriptions/92c76a2f-0e1c-4216-b65e-abf7a3f34c1e/resourcegroups/AzureML_UW_NLP/workspaces/East_AZ_ML

Streaming azureml-logs/55_azureml-execution-tvmps_64f253e32d1df8c6020c6ce52c177f5896c1590aab9587bf64b194586af11b6c_p.txt

2021-06-20T09:54:05Z Successfully mounted a/an Blobfuse File System at /mnt/batch/tasks/shared/LS_root/jobs/east_az_ml/azureml/anomalydetection_1624182604_623dc294/mounts/workspaceblobstore
2021-06-20T09:54:05Z Failed to start nvidia-fabricmanager due to exit status 5 with output Failed to start nvidia-fabricmanager.service: Unit nvidia-fabricmanager.service not found.
. Please ignore this if the GPUs don't utilize NVIDIA® NVLink® switches.
2021-06-20T09:54:05Z Starting output-watcher...
2021-06-20T09:54:05Z IsDedicatedCompute == False, starting polling for Low-Pri Preemption
2021-06-20T09:54:05Z Executing 'Copy AC

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  labels[mask_anomaly] = 'anomaly.'  # concatenate to create binary tree
Splitting data with 748 NORMAL and 2893 ANOMALY
Data post numerical variable filtering of shape (3641, 38)
Data post categorical variable filtering of shape (3641, 45)
Data post encoding with numerical of shape (3641, 83)
[W] [09:58:09.643568] Using experimental backend for growing trees

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  labels[mask_anomaly] = 'anomaly.'  # concatenate to create binary tree
Splitting data with 741 NORMAL and 2900 ANOMALY
Data post numerical variable filtering of shape (3641, 38)
Data post categorical variable filtering of s

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  labels[mask_anomaly] = 'anomaly.'  # concatenate to create binary tree
Splitting data with 749 NORMAL and 2892 ANOMALY
Data post numerical variable filtering of shape (3641, 38)
Data post categorical variable filtering of shape (3641, 44)
Data post encoding with numerical of shape (3641, 82)
[W] [09:58:40.177557] Using experimental backend for growing trees

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  labels[mask_anomaly] = 'anomaly.'  # concatenate to create binary tree
Splitting data with 760 NORMAL and 2881 ANOMALY
Data post numerical variable filtering of shape (3641, 38)
Data post categorical variable filtering of s

Data post categorical variable filtering of shape (3641, 47)
Data post encoding with numerical of shape (3641, 85)
[W] [09:59:04.482448] Using experimental backend for growing trees

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  labels[mask_anomaly] = 'anomaly.'  # concatenate to create binary tree
Splitting data with 772 NORMAL and 2869 ANOMALY
Data post numerical variable filtering of shape (3641, 38)
Data post categorical variable filtering of shape (3641, 49)
Data post encoding with numerical of shape (3641, 87)
[W] [09:59:04.572287] Using experimental backend for growing trees

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  labels[mask_anomaly] = 'anomaly.'  # con

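
The repeated message in the log above is pandas' SettingWithCopyWarning, raised by `labels[mask_anomaly] = 'anomaly.'`. One common way to avoid it is to take an explicit copy of the label column and assign through `.loc`. This is a sketch on toy data, not the code from the submitted script: `binarize_labels` is a hypothetical helper, and the `'normal.'` label and the sample attack names are assumed to follow the KDD Cup 99 label format.

```python
import pandas as pd


def binarize_labels(labels: pd.Series, normal_label: str = "normal.") -> pd.Series:
    """Return a new Series in which every non-normal label becomes 'anomaly.'."""
    out = labels.copy()  # explicit copy, so the write never targets a slice of the parent
    mask_anomaly = out != normal_label
    out.loc[mask_anomaly] = "anomaly."
    return out


# Toy frame standing in for the real dataset.
df = pd.DataFrame({"label": ["normal.", "smurf.", "neptune.", "normal."]})
binary = binarize_labels(df["label"])
print(binary.tolist())
```

The original `df["label"]` column is left untouched, which also makes the binarization step safe to call more than once on the same data.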

Streaming azureml-logs/75_job_post-tvmps_64f253e32d1df8c6020c6ce52c177f5896c1590aab9587bf64b194586af11b6c_p.txt

[2021-06-20T09:59:48.904174] Entering job release
[2021-06-20T09:59:49.659541] Starting job release
[2021-06-20T09:59:49.659963] Logging experiment finalizing status in history service.
[2021-06-20T09:59:49.660606] job release stage : upload_datastore starting...
[2021-06-20T09:59:49.660812] job release stage : start importing azureml.history._tracking in run_history_release.
Starting the daemon thread to refresh tokens in background for process with pid = 398
[2021-06-20T09:59:49.661107] job release stage : copy_batchai_cached_logs starting...
[2021-06-20T09:59:49.661285] job release stage : copy_batchai_cached_logs completed...
[2021-06-20T09:59:49.661462] job release stage : execute_job_release starting...

[2021-06-20T09:59:49.674149] Entering context manager injector.
[2021-06-20T09:59:49.676565] job release stage : upload_datastore completed...
[2021-06-20T09:59:49.778

{'runId': 'AnomalyDetection_1624182604_623dc294',
 'target': 'EastUS2gpu112',
 'status': 'Completed',
 'startTimeUtc': '2021-06-20T09:54:08.133649Z',
 'endTimeUtc': '2021-06-20T10:00:07.536444Z',
 'properties': {'_azureml.ComputeTargetType': 'amlcompute',
  'ContentSnapshotId': 'b61b6258-c1d0-462a-a479-ac76476cd25a',
  'ProcessInfoFile': 'azureml-logs/process_info.json',
  'ProcessStatusFile': 'azureml-logs/process_status.json',
  'azureml.git.repository_uri': 'https://github.com/danielgchen/MS_AZML_Anomaly_Detection.git',
  'mlflow.source.git.repoURL': 'https://github.com/danielgchen/MS_AZML_Anomaly_Detection.git',
  'azureml.git.branch': 'master',
  'mlflow.source.git.branch': 'master',
  'azureml.git.commit': '24ecf4a5fa38ef73391d758b91b826b698a8d7bb',
  'mlflow.source.git.commit': '24ecf4a5fa38ef73391d758b91b826b698a8d7bb',
  'azureml.git.dirty': 'True',
  'mlflow.param.key.random_state': '0',
  'mlflow.param.key.n_estimators': '25',
  'mlflow.param.key.max_depth': '20',
  'mlflow.