### Prerequisite
The user should have completed the Azure Machine Learning Tutorial: [Get started creating your first ML experiment with the Python SDK](https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-setup). You will need to make sure that you have a valid subscription ID, a resource group, and an Azure Machine Learning workspace. All datastores and datasets you use should be associated with your workspace.

## Set up Development Environment
The following subsections show typical steps to set up your development environment. Setup includes:

* Connecting to a workspace to enable communication between your local machine and remote resources
* Creating an experiment to track all your runs
* Creating a remote compute target to use for training

### Azure Machine Learning SDK 
Display the Azure Machine Learning SDK version.

In [1]:
import azureml.core

print("Azure Machine Learning SDK Version:", azureml.core.VERSION)

Azure Machine Learning SDK Version: 1.36.0


### Get Azure Machine Learning workspace
Get a reference to an existing Azure Machine Learning workspace.

In [2]:
from azureml.core import Workspace

ws = Workspace.from_config()
print(ws.name, ws.location, ws.resource_group, sep = ' | ')

ml-workspace | eastus | ml-rg
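
`Workspace.from_config()` reads the workspace details from a `config.json` file, which it looks for in the current directory, a `.azureml` subfolder, or a parent directory. As a minimal sketch, assuming you prefer to generate that file yourself rather than download it from the portal, the helper below writes the three fields the SDK looks up. The function name `write_workspace_config` is hypothetical, and the values you pass are placeholders for your own subscription, resource group, and workspace.

```python
import json
from pathlib import Path


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write .azureml/config.json under `directory` and return its path.

    Hypothetical helper: the three keys are the ones read by
    Workspace.from_config(); everything else here is illustrative.
    """
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    config_dir = Path(directory) / ".azureml"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config_path.write_text(json.dumps(config, indent=4))
    return config_path
```

For the workspace printed above, a call might look like `write_workspace_config(".", "<your-subscription-id>", "ml-rg", "ml-workspace")`.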


### Create a new compute resource or attach an existing one

A compute target is a designated compute resource where you run your training and simulation scripts. This location may be your local machine or a cloud-based compute resource. The code below shows how to create a cloud-based compute target. For more information, see [What are compute targets in Azure Machine Learning?](https://docs.microsoft.com/en-us/azure/machine-learning/concept-compute-target).

> Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.

**Note: Creation of a compute resource can take several minutes**. Please make sure to change `STANDARD_D2_V2` to a [size available in your region](https://azure.microsoft.com/en-us/global-infrastructure/services/?products=virtual-machines).

In [3]:
from azureml.core.compute import AmlCompute, ComputeTarget
import os

# Choose a name and maximum size for your cluster
compute_name = "cpu-cluster-d2"
compute_min_nodes = 0
compute_max_nodes = 4
vm_size = "STANDARD_D2_V2"

if compute_name in ws.compute_targets:
    print("Found an existing compute target of name: " + compute_name)
    compute_target = ws.compute_targets[compute_name]
    # Note: you may want to make sure compute_target is of type AmlCompute        
else:
    print("Creating new compute target...")
    provisioning_config = AmlCompute.provisioning_configuration(
        vm_size=vm_size,
        min_nodes=compute_min_nodes, 
        max_nodes=compute_max_nodes)
        
    # Create the cluster
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)

print(compute_target.get_status().serialize())

Creating new compute target...
InProgress..
SucceededProvisioning operation finished, operation "Succeeded"
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Resizing', 'allocationStateTransitionTime': '2022-01-10T21:26:57.336000+00:00', 'errors': None, 'creationTime': '2022-01-10T21:26:56.570879+00:00', 'modifiedTime': '2022-01-10T21:27:03.531588+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT1800S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_D2_V2'}
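
The serialized status can be hard to read at a glance. Note that it reports zero nodes even though provisioning succeeded: with `min_nodes=0` the cluster stays scaled down until a run is submitted. The sketch below pulls out the fields that matter for this tutorial. The helper name `summarize_cluster_status` is hypothetical, and it assumes the idle time is expressed in whole seconds as in the output above (`PT1800S`).

```python
import re


def summarize_cluster_status(status):
    """Condense the dict returned by compute_target.get_status().serialize().

    Hypothetical helper: only handles idle times of the form 'PT<seconds>S'.
    """
    scale = status["scaleSettings"]
    match = re.fullmatch(r"PT(\d+)S", scale["nodeIdleTimeBeforeScaleDown"])
    idle_seconds = int(match.group(1)) if match else None
    return {
        "provisioned": status["provisioningState"] == "Succeeded",
        "current_nodes": status["currentNodeCount"],
        "node_range": (scale["minNodeCount"], scale["maxNodeCount"]),
        "idle_seconds_before_scale_down": idle_seconds,
    }
```

You could call it as `summarize_cluster_status(compute_target.get_status().serialize())`.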


### Create Azure Machine Learning experiment
Create an experiment to track the runs in your workspace. 

In [3]:
from azureml.core.experiment import Experiment

from files import BCTrade_utils
env = 'Btc_with_indicators'
experiment_name = BCTrade_utils.registerBCTrade(env)
exp = Experiment(workspace=ws, name=experiment_name)

## Train Cartpole Agent
To facilitate reinforcement learning, the Azure Machine Learning Python SDK provides a high-level abstraction, the _ReinforcementLearningEstimator_ class, which allows users to easily construct reinforcement learning run configurations for the underlying reinforcement learning framework. Reinforcement Learning in Azure Machine Learning supports the open-source [Ray framework](https://ray.io/) and its highly customizable [RLlib](https://ray.readthedocs.io/en/latest/rllib.html#rllib-scalable-reinforcement-learning). In this section we show how to use _ReinforcementLearningEstimator_ and the Ray/RLlib framework to train an agent; the code cells in this notebook use the `Btc_with_indicators` environment registered above in place of the gym cartpole environment.

### Create reinforcement learning estimator

The code below creates an instance of *ReinforcementLearningEstimator*, `training_estimator`, which is then used to submit a job to Azure Machine Learning to start the Ray experiment run.

Note that this example is purposely simplified to the minimum. Here is a short description of the parameters we are passing into the constructor:

- `source_directory`, local directory containing your training script(s) and helper modules,
- `entry_script`, path to your entry script relative to the source directory,
- `script_params`, constant parameters to be passed to each run of training script,
- `compute_target`, reference to the compute target in which the trainer and worker(s) jobs will be executed,
- `rl_framework`, the reinforcement learning framework to be used (currently must be Ray).

We use the `script_params` parameter to pass in general and algorithm-specific parameters to the training script.
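
To make the role of `script_params` concrete, the sketch below shows one way such a dictionary could be flattened into a command line for the entry script. This is an illustration only: the function `script_params_to_argv` is hypothetical, and the conversion the SDK actually performs is internal and may differ. The one convention it mirrors from this notebook is that an empty string marks a flag that takes no value.

```python
def script_params_to_argv(script_params):
    """Flatten {"--flag": value} pairs into an argv-style list.

    Hypothetical helper: an empty-string value means "flag with no value".
    """
    argv = []
    for flag, value in script_params.items():
        argv.append(flag)
        if value != "":
            argv.append(str(value))
    return argv
```

For example, `{"--run": "PPO", "--checkpoint-freq": 2, "--checkpoint-at-end": ""}` becomes `["--run", "PPO", "--checkpoint-freq", "2", "--checkpoint-at-end"]`.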


In [6]:
from azureml.contrib.train.rl import ReinforcementLearningEstimator, Ray
from azureml.core.environment import Environment

training_algorithm = "PPO"
rl_environment = env
video_capture = False

algorithm_config = '\'{"num_gpus": 0, "num_workers": 1}\''
#if video_capture:
#    algorithm_config = '\'{"num_gpus": 0, "num_workers": 1, "monitor": true}\''
#else:
#    algorithm_config = '\'{"num_gpus": 0, "num_workers": 1, "monitor": false}\''

script_params = {

    # Training algorithm
    "--run": training_algorithm,
    
    # Training environment
    "--env": rl_environment,
    
    # Algorithm-specific parameters
    "--config": algorithm_config,
    
    # Stop conditions
    "--stop": '\'{"episode_reward_mean": 20, "time_total_s": 36000}\'',
    
    # Frequency of taking checkpoints
    "--checkpoint-freq": 2,
    
    # If a checkpoint should be taken at the end - optional argument with no value
    "--checkpoint-at-end": "",
    
    # Log directory
    "--local-dir": './logs'
}

xvfb_env = None
#if video_capture:
    # Ray's video capture support requires running everything under a headless display driver (xvfb).
    # There are two parts to this:
    # 1. Use a custom docker file with proper instructions to install xvfb, ffmpeg, python-opengl
    # and other dependencies.
   
#    with open("files/docker/Dockerfile", "r") as f:
#        dockerfile=f.read()

#    xvfb_env = Environment(name='xvfb-vdisplay')
#    xvfb_env.docker.base_image = None
#    xvfb_env.docker.base_dockerfile = dockerfile
    
    # 2.  Execute the Python process via the xvfb-run command to set up the headless display driver.
#    xvfb_env.python.user_managed_dependencies = True
#    xvfb_env.python.interpreter_path = "xvfb-run -s '-screen 0 640x480x16 -ac +extension GLX +render' python"


training_estimator = ReinforcementLearningEstimator(

    # Location of source files
    source_directory='files',
    
    # Python script file
    entry_script='BCTrade_training.py',
    
    # A dictionary of arguments to pass to the training script specified in ``entry_script``
    script_params=script_params,
    
    # The Azure Machine Learning compute target set up for Ray head nodes
    compute_target=compute_target,
    
    # Reinforcement learning framework. Currently must be Ray.
    rl_framework=Ray(),
    
    # Custom environment for Xvfb
    environment=xvfb_env
)

NameError: name 'compute_target' is not defined

### Training script

As recommended in the RLlib documentation, we use the Ray Tune API to run the training algorithm. All the RLlib built-in trainers are compatible with the Tune API. Here we use `tune.run()` to execute a built-in training algorithm. For convenience, the part of the entry script that makes this call is shown below.

This is the list of parameters we are passing into `tune.run()` via the `script_params` parameter:

- `run_or_experiment`: name of the [built-in algorithm](https://ray.readthedocs.io/en/latest/rllib-algorithms.html#rllib-algorithms), 'PPO' in our example,
- `config`: Algorithm-specific configuration. This includes specifying the environment, `env`, which in the code above is the `Btc_with_indicators` environment passed via `--env`,
- `stop`: stopping conditions, which could be any of the metrics returned by the trainer. Here we use "mean of episode reward", and "total training time in seconds" as stop conditions, and
- `checkpoint_freq` and `checkpoint_at_end`: Frequency of taking checkpoints (number of training iterations between checkpoints), and if a checkpoint should be taken at the end.

We also specify `local_dir`, the directory in which the training logs, checkpoints and other training artifacts will be recorded.

See [RLlib Training APIs](https://ray.readthedocs.io/en/latest/rllib-training.html#rllib-training-apis) for more details, and also [Training (tune.run, tune.Experiment)](https://ray.readthedocs.io/en/latest/tune/api_docs/execution.html#training-tune-run-tune-experiment) for the complete list of parameters.

```python
import ray
import ray.tune as tune

if __name__ == "__main__":

    # parse arguments ...
    
    # Initialize Ray
    ray.init(address=args.ray_address)

    # Run training task using tune.run
    tune.run(
        run_or_experiment=args.run,
        config=dict(args.config, env=args.env),
        stop=args.stop,
        checkpoint_freq=args.checkpoint_freq,
        checkpoint_at_end=args.checkpoint_at_end,
        local_dir=args.local_dir
    )
```
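
The excerpt above elides the argument parsing. As a rough sketch of what that step could look like, the parser below accepts the flags used in `script_params` and decodes the JSON-valued ones into dictionaries. It is an assumption built on the standard-library `argparse` module, not the actual contents of the entry script; the function name `parse_training_args` and the default values are hypothetical. It also assumes the outer single quotes wrapped around the JSON strings in `script_params` have been removed by the shell before the script sees them.

```python
import argparse
import json


def parse_training_args(argv=None):
    """Parse the training flags; hypothetical stand-in for the elided step."""
    parser = argparse.ArgumentParser(description="Illustrative training argument parser")
    parser.add_argument("--run", required=True, help="Name of the built-in algorithm")
    parser.add_argument("--env", required=True, help="Name of the registered environment")
    # JSON strings are decoded into dicts so they can be passed to tune.run()
    parser.add_argument("--config", type=json.loads, default={})
    parser.add_argument("--stop", type=json.loads, default={})
    parser.add_argument("--checkpoint-freq", type=int, default=0)
    # Flag without a value: present means True
    parser.add_argument("--checkpoint-at-end", action="store_true")
    parser.add_argument("--local-dir", default="./logs")
    parser.add_argument("--ray-address", default=None)
    return parser.parse_args(argv)
```

With arguments parsed this way, `dict(args.config, env=args.env)` yields the algorithm configuration with the environment name merged in, which is the value passed as `config` in the excerpt above.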

### Submit the estimator to start experiment
Now we use the *training_estimator* to submit a run.

In [6]:
training_run = exp.submit(training_estimator)

### Monitor experiment

Azure Machine Learning provides a Jupyter widget to show the status of an experiment run. You could use this widget to monitor the status of the runs.

Note that _ReinforcementLearningEstimator_ creates at least two runs: (a) A parent run, i.e. the run returned above, and (b) a collection of child runs. The number of the child runs depends on the configuration of the reinforcement learning estimator. In our simple scenario, configured above, only one child run will be created.

The widget will show a list of the child runs as well. You can click on the link under **Status** to see the details of a child run. It will also show the metrics being logged.

In [7]:
from azureml.widgets import RunDetails

RunDetails(training_run).show()

_RLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 'sdk_v…

### Stop the run
To stop the run, call `training_run.cancel()`.

In [19]:
# Cancel the run (comment out the line below to let the run continue)
training_run.cancel()

### Wait for completion
Wait for the run to complete before proceeding.

**Note: The length of the run depends on the provisioning time of the compute target and it may take several minutes to complete.**

In [9]:
training_run.wait_for_completion()

{'runId': 'BCTrade_1640902113_16a527ae',
 'status': 'Canceled',
 'startTimeUtc': '2021-12-30T22:12:38.723146Z',
 'endTimeUtc': '2021-12-30T22:39:29.407457Z',
 'services': {},
 'properties': {},
 'inputDatasets': [],
 'outputDatasets': [],
 'logFiles': {'azureml-logs/reinforcementlearning.txt': 'https://rnikolaus.blob.core.windows.net/azureml/ExperimentRun/dcid.BCTrade_1640902113_16a527ae/azureml-logs/reinforcementlearning.txt?sv=2019-07-07&sr=b&sig=WU%2Bsv43vLY46T5273bVk%2BQFgOfDgHwLC%2FTHYTup%2B2cE%3D&skoid=e2479b93-fc73-41fb-89ce-5cfc3e486410&sktid=3ffbf2df-6ac2-41dc-95c5-2f64a8024caf&skt=2021-12-30T12%3A24%3A57Z&ske=2021-12-31T20%3A34%3A57Z&sks=b&skv=2019-07-07&st=2021-12-30T22%3A25%3A05Z&se=2021-12-31T06%3A35%3A05Z&sp=r'},
 'submittedBy': 'Raphael Nikolaus'}

### Get a handle to the child run
You can obtain a handle to the child run as follows. In our scenario there is only one child run, which we call `child_run_0`.

In [14]:
import time

child_run_0 = None
timeout = 30
while timeout > 0 and not child_run_0:
    child_runs = list(training_run.get_children())
    print('Number of child runs:', len(child_runs))
    if len(child_runs) > 0:
        child_run_0 = child_runs[0]
        break
    time.sleep(2) # Wait for 2 seconds
    timeout -= 2

print('Child run info:')
print(child_run_0)

Number of child runs: 1
Child run info:
Run(Experiment: BCTrade-Btc_with_indicators,
Id: BCTrade-Btc_with_indicators_1641752441_311b6d75_head,
Type: azureml.scriptrun,
Status: Canceled)
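
One caveat with the loop above: if the timeout expires before a child run appears, `child_run_0` stays `None` and the cell still prints it without signalling a problem. A possible variant, sketched here with a hypothetical helper named `wait_for_first`, raises an error instead. It takes any zero-argument callable that returns an iterable, so it does not depend on the Azure ML SDK itself.

```python
import time


def wait_for_first(fetch, timeout_s=30.0, interval_s=2.0):
    """Poll `fetch()` until it yields at least one item, then return the first.

    Hypothetical helper: raises TimeoutError if nothing shows up in time.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        items = list(fetch())
        if items:
            return items[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no items returned within {timeout_s} seconds")
        time.sleep(interval_s)
```

In this notebook it could be used as `child_run_0 = wait_for_first(training_run.get_children)`.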


### Get access to training artifacts
We can use the run ID to get a handle to an in-progress or previously concluded run.

In [4]:
from azureml.core import Run

#run_id = child_run_0.id # Or set to run id of a completed run (e.g. 'rl-cartpole-v0_1587572312_06e04ace_head')
run_id = 'BCTrade-Btc_with_indicators_1641850079_8e229836_head'
child_run_0 = Run(exp, run_id=run_id)
#child_run_0.get_environment()

Now we can use the Run API to download policy training artifacts (saved model and checkpoints) to local compute.

In [7]:
from os import path
from distutils import dir_util

training_artifacts_path = path.join("logs", training_algorithm)
print("Training artifacts path:", training_artifacts_path)

if path.exists(training_artifacts_path):
    dir_util.remove_tree(training_artifacts_path)

# Download run artifacts to local compute
child_run_0.download_files(training_artifacts_path)

Training artifacts path: logs/PPO
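
The cell above relies on `distutils.dir_util`, which works on the Python 3.6 kernel used here, but `distutils` was deprecated by PEP 632 and removed in Python 3.12. If you run this notebook on a newer interpreter, a standard-library replacement could look like the following sketch. The helper name `clear_directory` is hypothetical.

```python
import shutil
from pathlib import Path


def clear_directory(directory):
    """Delete `directory` and its contents if it exists.

    Returns True if something was removed, False if the path did not exist.
    """
    target = Path(directory)
    if target.exists():
        shutil.rmtree(target)
        return True
    return False
```

Calling `clear_directory(training_artifacts_path)` before `download_files` would then take the place of the `dir_util.remove_tree` call.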


## Evaluate Trained Agent and See Results

We can evaluate a previously trained policy using the `rollout.py` helper script provided by RLlib (see [Evaluating Trained Policies](https://ray.readthedocs.io/en/latest/rllib-training.html#evaluating-trained-policies) for more details). Here we use an adaptation of this script to reconstruct a policy from a checkpoint taken and saved during training. We took these checkpoints by setting `checkpoint-freq` and `checkpoint-at-end` parameters above.
In this section we show how to use these checkpoints to evaluate the trained policy.

### Evaluate a trained policy
We need to configure another reinforcement learning estimator, `rollout_estimator`, and then use it to submit another run. Note that the entry script for this estimator now points to the `BCTrade_rollout.py` script.
Also note how we pass the checkpoints dataset to this script using the `inputs` parameter of the _ReinforcementLearningEstimator_.

We use script parameters to pass in the same algorithm and the same environment used during training. We also specify the number of the checkpoint we wish to evaluate, `checkpoint-number`, and the number of steps to run the rollout for, `steps`.

The training artifacts dataset will be accessible to the rollout script as a mounted folder. The mounted folder and the checkpoint number, passed in via `checkpoint-number`, will be used to create a path to the checkpoint we are going to evaluate. The created checkpoint path then will be passed into RLlib rollout script for evaluation.
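
As an illustration of how a mounted folder and a checkpoint number can be combined into a checkpoint path, the sketch below searches the folder for a file named `checkpoint-<number>`. It is an assumption, not the logic of the rollout script in this notebook, which may build the path differently; the helper name `find_checkpoint_file` is hypothetical. Searching rather than joining fixed path segments keeps the sketch independent of how the uploaded files are laid out inside the dataset.

```python
import os


def find_checkpoint_file(artifacts_dir, checkpoint_number):
    """Return the path of 'checkpoint-<number>' somewhere under artifacts_dir.

    Hypothetical helper: raises FileNotFoundError if no such file exists.
    """
    wanted = f"checkpoint-{checkpoint_number}"
    for root, _, files in os.walk(artifacts_dir):
        if wanted in files:
            return os.path.join(root, wanted)
    raise FileNotFoundError(f"{wanted} not found under {artifacts_dir}")
```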

Let's find the checkpoints and the last checkpoint number first.

In [8]:
# A helper function to find checkpoint files in a directory
def find_checkpoints(file_path):
    print("Looking in path:", file_path)
    checkpoints = []
    for root, _, files in os.walk(file_path):
        for name in files:
            if os.path.basename(root).startswith('checkpoint_'):
                checkpoints.append(path.join(root, name))
    return checkpoints

checkpoint_files = find_checkpoints(training_artifacts_path)

Looking in path: logs/PPO


In [9]:
# Find checkpoints and last checkpoint number
checkpoint_numbers = []
for file in checkpoint_files:
    file = os.path.basename(file)
    if file.startswith('checkpoint-') and not file.endswith('.tune_metadata'):
        checkpoint_numbers.append(int(file.split('-')[-1]))

print("Checkpoints:", checkpoint_numbers)

last_checkpoint_number = max(checkpoint_numbers)
print("Last checkpoint number:", last_checkpoint_number)

Checkpoints: [10, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 12, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 14, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 16, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 18, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 2, 20, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 22, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 24, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 26, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 28, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 30, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 32, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 34, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 36, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 38, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 4, 40, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 42, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 44, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 46, 460,
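
The printed list is not in numeric order (10, 100, 102, ..., 12, 120, ...) because it follows the order in which the files were discovered on disk. `max()` still picks the right value, but it raises `ValueError` when no checkpoints are found. The sketch below, using a hypothetical helper named `checkpoint_numbers_from`, returns the numbers sorted and skips the `.tune_metadata` files with a regular expression.

```python
import os
import re

# Matches file names such as "checkpoint-120" but not "checkpoint-120.tune_metadata"
_CHECKPOINT_NAME = re.compile(r"checkpoint-(\d+)")


def checkpoint_numbers_from(paths):
    """Return the sorted, de-duplicated checkpoint numbers found in `paths`."""
    numbers = set()
    for file_path in paths:
        match = _CHECKPOINT_NAME.fullmatch(os.path.basename(file_path))
        if match:
            numbers.add(int(match.group(1)))
    return sorted(numbers)
```

With this helper, `numbers = checkpoint_numbers_from(checkpoint_files)` followed by `numbers[-1] if numbers else None` gives the last checkpoint number without failing on an empty download.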

In [12]:
# Upload the checkpoint files and create a DataSet
from azureml.core import Dataset

datastore = ws.get_default_datastore()
checkpoint_dataref = datastore.upload_files(checkpoint_files, target_path='checkpoints_' + run_id, overwrite=True)
checkpoint_ds = Dataset.File.from_files(checkpoint_dataref)

Uploading an estimated of 866 files
Uploading logs/PPO/PPO_BCTrade-Btc_with_indicators_78880bdc_2022-01-08_18-11-59g5a4da7x/checkpoint_10/checkpoint-10.tune_metadata
Uploaded logs/PPO/PPO_BCTrade-Btc_with_indicators_78880bdc_2022-01-08_18-11-59g5a4da7x/checkpoint_10/checkpoint-10.tune_metadata, 1 files out of an estimated total of 866
Uploading logs/PPO/PPO_BCTrade-Btc_with_indicators_78880bdc_2022-01-08_18-11-59g5a4da7x/checkpoint_100/checkpoint-100
Uploaded logs/PPO/PPO_BCTrade-Btc_with_indicators_78880bdc_2022-01-08_18-11-59g5a4da7x/checkpoint_100/checkpoint-100, 2 files out of an estimated total of 866
Uploading logs/PPO/PPO_BCTrade-Btc_with_indicators_78880bdc_2022-01-08_18-11-59g5a4da7x/checkpoint_100/checkpoint-100.tune_metadata
Uploaded logs/PPO/PPO_BCTrade-Btc_with_indicators_78880bdc_2022-01-08_18-11-59g5a4da7x/checkpoint_100/checkpoint-100.tune_metadata, 3 files out of an estimated total of 866
Uploading logs/PPO/PPO_BCTrade-Btc_with_indicators_78880bdc_2022-01-08_18-11-59g5

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/dataprep/api/engineapi/engine.py", line 275, in send_message
    raise message['error']
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/dataprep/api/engineapi/engine.py", line 223, in process_responses
    response = self._read_response(caller='MultiThreadMessageChannel.process_responses')
  File "/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/dataprep/api/engineapi/engine.py", line 148, in _read_response
    raise error
RuntimeError: Engine process terminated. Please try running again. |session_id=f018534c-2eb2-4051-beff-4a75ca6caf99


Now let's configure the rollout estimator. Note that we use the last checkpoint for evaluation. The assumption is that the last checkpoint points to our best trained agent. You may change this to any of the checkpoint numbers printed above and observe the effect.

In [20]:
script_params = {    
    # Checkpoint number of the checkpoint from which to roll out
    "--checkpoint-number": last_checkpoint_number,

    # Training algorithm
    "--run": training_algorithm,
    
    # Training environment
    "--env": rl_environment,
    
    # Algorithm-specific parameters
    "--config": '{}',
    
    # Number of rollout steps 
    "--steps": 2000,
    
    # Suppress rendering of the environment - optional argument with no value
    "--no-render": "",
    
    # The place where recorded videos will be stored
    "--video-dir": "./logs/video"
}

if video_capture:
    script_params.pop("--no-render")
else:
    script_params.pop("--video-dir")


# Ray's video capture support requires running everything under a headless display driver (xvfb).
# There are two parts to this:

# 1. Use a custom docker file with proper instructions to install xvfb, ffmpeg, python-opengl
# and other dependencies.
# Note: Even when rendering is off, python-opengl is needed.

#with open("files/docker/Dockerfile", "r") as f:
#    dockerfile=f.read()

#xvfb_env = Environment(name='xvfb-vdisplay')
#xvfb_env.docker.base_image = None
#xvfb_env.docker.base_dockerfile = dockerfile
    
# 2.  Execute the Python process via the xvfb-run command to set up the headless display driver.
#xvfb_env.python.user_managed_dependencies = True
#if video_capture:
#    xvfb_env.python.interpreter_path = "xvfb-run -s '-screen 0 640x480x16 -ac +extension GLX +render' python"


rollout_estimator = ReinforcementLearningEstimator(
    # Location of source files
    source_directory='files',
    
    # Python script file
    entry_script='BCTrade_rollout.py',
    
    # A dictionary of arguments to pass to the rollout script specified in ``entry_script``
    script_params = script_params,
    
    # Data inputs
    inputs=[
        checkpoint_ds.as_named_input('artifacts_dataset'),
        checkpoint_ds.as_named_input('artifacts_path').as_mount()],
    
    # The Azure Machine Learning compute target set up for Ray head nodes
    compute_target=compute_target,
    
    # Reinforcement learning framework. Currently must be Ray.
    rl_framework=Ray(),
    
    # Custom environment for Xvfb
    environment=xvfb_env)

Same as before, we use the *rollout_estimator* to submit a run.

In [21]:
rollout_run = exp.submit(rollout_estimator)

Then, as in the training section, we can monitor the real-time progress of the rollout run and its child as follows. If you browse the logs of the child run, you can see the evaluation results recorded in the `driver_log.txt` file. Note that you may need to wait several minutes before these results become available.

In [24]:
RunDetails(rollout_run).show()

_RLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 'sdk_v…

KeyError: 'log_files'

Wait for completion of the rollout run before moving to the next section, or you may cancel the run.

In [19]:
# Uncomment line below to cancel the run
#rollout_run.cancel()
rollout_run.wait_for_completion()

{'runId': 'BCTrade_1640904104_262fc15a',
 'status': 'Finalizing',
 'startTimeUtc': '2021-12-30T22:42:23.225849Z',
 'services': {},
 'properties': {},
 'inputDatasets': [],
 'outputDatasets': [],
 'logFiles': {'azureml-logs/reinforcementlearning.txt': 'https://rnikolaus.blob.core.windows.net/azureml/ExperimentRun/dcid.BCTrade_1640904104_262fc15a/azureml-logs/reinforcementlearning.txt?sv=2019-07-07&sr=b&sig=NmkQMKN%2B7cgcPyuy0rRIsiohVyuPLutuSshX%2F7b1CQs%3D&skoid=e2479b93-fc73-41fb-89ce-5cfc3e486410&sktid=3ffbf2df-6ac2-41dc-95c5-2f64a8024caf&skt=2021-12-30T12%3A18%3A27Z&ske=2021-12-31T20%3A28%3A27Z&sks=b&skv=2019-07-07&st=2021-12-30T22%3A32%3A05Z&se=2021-12-31T06%3A42%3A05Z&sp=r'},
 'submittedBy': 'Raphael Nikolaus'}

### Display movies of selected rollout episodes

To display the recorded movies, we first download the recorded videos to the local machine. The cell below gets a handle to the child run of the rollout and downloads everything it saved under `logs/video`.

In [None]:
# Get a handle to child run
child_runs = list(rollout_run.get_children())
print('Number of child runs:', len(child_runs))
child_run_0 = child_runs[0]

# Download rollout artifacts
rollout_artifacts_path = path.join("logs", "rollout")
print("Rollout artifacts path:", rollout_artifacts_path)

if path.exists(rollout_artifacts_path):
    dir_util.remove_tree(rollout_artifacts_path)

# Download videos to local compute
child_run_0.download_files("logs/video", output_directory = rollout_artifacts_path)
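
Once the download has finished, the video files still need to be located before they can be displayed. The sketch below collects them recursively. It assumes the recordings are saved with an `.mp4` extension, which this notebook does not confirm; the helper name `list_videos` is hypothetical.

```python
from pathlib import Path


def list_videos(directory, suffix=".mp4"):
    """Return the video files under `directory`, sorted by path.

    Hypothetical helper: the '.mp4' default is an assumption about the recorder.
    """
    return sorted(p for p in Path(directory).rglob(f"*{suffix}") if p.is_file())
```

In a notebook, each path returned by `list_videos(rollout_artifacts_path)` could then be shown with `IPython.display.Video(str(video_path), embed=True)`.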

## Cleaning up
For your convenience, below you can find code snippets to clean up any resources created as part of this tutorial that you don't wish to retain.

In [None]:
# To archive the created experiment:
#exp.archive()

# To delete the compute target:
#compute_target.delete()

# To delete downloaded training artifacts
#if os.path.exists(training_artifacts_path):
#    dir_util.remove_tree(training_artifacts_path)

# To delete downloaded rollout videos
#if path.exists(rollout_artifacts_path):
#    dir_util.remove_tree(rollout_artifacts_path)

## Next
This example showed how to run reinforcement learning in Azure Machine Learning (Ray/RLlib framework) on a single compute target. See the [Pong Problem](../atari-on-distributed-compute/pong_rllib.ipynb)
example, which uses Ray RLlib to train a Pong-playing agent on a multi-node cluster.