## Azure IDs used for this workshop
```python
subscription_id='4feb84f6-2c10-4536-9c8a-0a2360eabfc5'
resource_group='azureml'
container_registry='/subscriptions/4feb84f6-2c10-4536-9c8a-0a2360eabfc5/resourcegroups/azureml/providers/microsoft.containerregistry/registries/danielscacrcvsldmti'
location='westeurope'
```
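
The `container_registry` value is a long resource ID that repeats the subscription and resource group. A minimal sketch of how the string is laid out, assuming the registry name is simply the last path segment; the helper `registry_resource_id` is hypothetical and not part of the SDK:

```python
# Values copied from the settings above; the registry name is the last path segment.
subscription_id = '4feb84f6-2c10-4536-9c8a-0a2360eabfc5'
resource_group = 'azureml'
registry_name = 'danielscacrcvsldmti'


def registry_resource_id(subscription_id, resource_group, registry_name):
    """Build the resource ID string in the same layout as container_registry above."""
    return (
        f'/subscriptions/{subscription_id}'
        f'/resourcegroups/{resource_group}'
        f'/providers/microsoft.containerregistry/registries/{registry_name}'
    )


container_registry = registry_resource_id(subscription_id, resource_group, registry_name)
print(container_registry)
```

Building the ID from its parts makes it harder to paste a registry that belongs to a different subscription by accident.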

## Slides

https://aka.ms/aifetalk

## Setup

1. Install the VS Code extension from https://msasg.visualstudio.com/OpenMind%20Studio/OpenMind%20Studio%20Team/_build/results?buildId=3354388
Make sure it has a recent version of the Python SDK. If there are issues, remove the folder ~/.azureml/envs; a current SDK will be installed when you first use AML from VS Code.
2. Install the Python SDK: https://github.com/Azure/ViennaDocs/tree/master/PrivatePreview
    Make sure to install the automl, notebooks, and contrib extras.

```shell
source activate py36
pip install --upgrade azureml-sdk[notebooks,contrib,automl] --user
jupyter nbextension install --py --user azureml.train.widgets
jupyter nbextension enable azureml.train.widgets --user --py

```
You will need to restart Jupyter after this.
3. Clone this repository
```shell
git clone https://github.com/danielsc/aml-workshop
jupyter notebook
```

## Before each demo
* Create a new, empty notebook in the ignite directory of the repo.
* If you have run this before, make sure you delete mnist_with_summaries.py from the folder ignite_demo.
* Open VS Code at the ignite_demo directory.
* Go to the ignite directory and start Jupyter Notebook there.
* Make sure to bring up the cluster from the Azure portal.
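
The clean-up step for a previous run can be scripted. This is a minimal sketch, not part of the workshop code: the helper `remove_stale_script` is hypothetical, and the demonstration uses a throwaway directory in place of `ignite_demo`.

```python
import os
import tempfile


def remove_stale_script(folder, name='mnist_with_summaries.py'):
    """Delete folder/name if present; return True when a file was removed."""
    path = os.path.join(folder, name)
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False


# Demonstration on a throwaway directory, so nothing in the repo is touched.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, 'mnist_with_summaries.py'), 'w').close()
    first = remove_stale_script(demo_dir)   # file existed, so it is removed
    second = remove_stale_script(demo_dir)  # nothing left to remove
    print(first, second)
```

In the real setup you would call `remove_stale_script('ignite_demo')` from the repo root before starting the demo.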


## F1 Download TensorFlow demo code (we will just do this in the browser)


In [6]:
source_folder = './demo'
import requests
import os
tf_code = requests.get("https://raw.githubusercontent.com/tensorflow/tensorflow/r1.8/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py")
with open(os.path.join(source_folder, "mnist_with_summaries.py"), "w") as file:
    file.write(tf_code.text)

## run the script locally 

In [None]:
!/anaconda/envs/py36/bin/python demo/mnist_with_summaries.py
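
The interpreter path in the cell above is specific to one machine. As a hedged alternative, the sketch below runs a script with whatever interpreter is executing the notebook, via `sys.executable`. The helper `run_script` is hypothetical, and a tiny stand-in script replaces `mnist_with_summaries.py` so the example runs without TensorFlow.

```python
import os
import subprocess
import sys
import tempfile


def run_script(script_path, *args):
    """Run script_path with the interpreter that is running this code."""
    cmd = [sys.executable, script_path] + list(args)
    return subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True, check=True)


# Stand-in script so the sketch runs without TensorFlow installed.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, 'hello.py')
    with open(script, 'w') as f:
        f.write("import sys\nprint('args:', sys.argv[1:])\n")
    result = run_script(script, '--max_steps', '10')
    print(result.stdout.strip())
```

With `check=True`, a non-zero exit code raises `subprocess.CalledProcessError`, so a failing local run is hard to miss.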


## F2 make sure we import everything -- makes for faster typing and intellisense

In [1]:
from azureml.train.dnn import TensorFlow
from azureml.contrib.tensorboard import Tensorboard
from azureml.core import Workspace, Experiment
from azureml.core.compute import ComputeTarget, BatchAiCompute
from azureml.train.widgets import RunDetails
from azureml.train.hyperdrive import *
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.constants import Metric
from demo.get_data import get_data
import time
import azureml.core
import math, os

source_folder = './demo'
print("SDK version:", azureml.core.VERSION)

SDK version: 0.1.68


## F3 This is how you create a workspace
We are explicitly setting the container registry to an existing one to avoid
image creation during the workshop (image creation takes 3-8 minutes depending on size).

In [None]:
ws = Workspace.create(name='<<your_alias_here>>', 
                      subscription_id='4feb84f6-2c10-4536-9c8a-0a2360eabfc5', 
                      resource_group='azureml', 
                      container_registry='/subscriptions/4feb84f6-2c10-4536-9c8a-0a2360eabfc5/resourcegroups/azureml/providers/microsoft.containerregistry/registries/danielscacrcvsldmti',
                      location='westeurope',
                      exist_ok=True)
ws.name

In [2]:
ws = Workspace.get(name='<<your_alias_here>>', subscription_id='4feb84f6-2c10-4536-9c8a-0a2360eabfc5', resource_group='azureml')

## F4 This is how you create a batch AI cluster
We are setting `cluster_min_nodes` to 2 to save us the preparation time for each job. **Make sure to dial this down after the workshop.**

In [None]:
# alternative vm_size: "Standard_NC6s_v2"
provisioning_config = BatchAiCompute.provisioning_configuration(vm_size = "Standard_NC6",
                                                                autoscale_enabled = True,
                                                                cluster_min_nodes = 2, 
                                                                cluster_max_nodes = 20)

p100cluster = ComputeTarget.create(ws, 
                                   name = "p100cluster", 
                                   provisioning_configuration=provisioning_config)


In [None]:
p100cluster.refresh_state()
p100cluster.provisioning_state


In [None]:
p100cluster.refresh_state()
p100cluster.status.node_state_counts.serialize()
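
Instead of re-running the cells above by hand, the state can be polled in a loop. The sketch below is a generic helper with a stand-in state source. The name `wait_for_state` and the state strings `'creating'` and `'succeeded'` are assumptions for illustration, not values confirmed for this SDK version.

```python
import time


def wait_for_state(get_state, wanted, timeout_s=600, interval_s=10, sleep=time.sleep):
    """Call get_state() until it returns `wanted` or timeout_s has been waited."""
    waited = 0
    while True:
        state = get_state()
        if state == wanted:
            return state
        if waited >= timeout_s:
            raise TimeoutError(f'still {state!r} after {waited}s')
        sleep(interval_s)
        waited += interval_s


# Stand-in for the cluster: reports 'creating' twice, then 'succeeded'.
fake_states = iter(['creating', 'creating', 'succeeded'])
final_state = wait_for_state(lambda: next(fake_states), 'succeeded',
                             interval_s=0, sleep=lambda s: None)
print(final_state)
```

In the notebook, `get_state` could be a small function that calls `p100cluster.refresh_state()` and returns `p100cluster.provisioning_state`, as in the cells above.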

In [3]:
p100cluster = ws.compute_targets()['p100cluster']

## F5 let's create an experiment in VSCode and then retrieve it here

In [4]:
mnist = Experiment(ws, 'linkedin')

## F6 now we run the script on our cluster, tracked in the mnist Experiment

(Talking points: what is an estimator, how is TensorFlow different, what other estimators are there?)

In [5]:
tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=p100cluster,
                          entry_script='mnist_with_summaries.py')

run = mnist.submit(tf_estimator)

## F7 How can we monitor the experiment?

If you don't see the widget, you will need to

1. stop jupyter
2. run the following
```
    jupyter nbextension install --py --user azureml.train.widgets
    jupyter nbextension enable azureml.train.widgets --user --py
```
3. start jupyter

In [6]:
RunDetails(run).show()

_UserRun(widget_settings={'childWidgetDisplay': 'popup'})

## Let's visualize the Accuracy so we can better see the progress of the training run

1. add this at the top of the file
```python
    from azureml.core.run import Run
```
2. add the `Run.get_submitted_run().log(...)` call near line 165 of the script, so you get this
```python
    print('Accuracy at step %s: %s' % (i, acc))
    if i % 50 == 0:
        Run.get_submitted_run().log('Accuracy_test', acc)
```
3. add the same line with different log name further down
```python
    test_writer.close()
    Run.get_submitted_run().log('Accuracy', acc)
```

4. and add this at the end of the file to log the chosen parameters
```python
    FLAGS, unparsed = parser.parse_known_args()
    Run.get_submitted_run().log('learning_rate', FLAGS.learning_rate)
    Run.get_submitted_run().log('dropout', FLAGS.dropout)
```
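
To see what these edits end up recording, here is a small simulation with a stand-in for the run object. `FakeRun` is hypothetical, the accuracy values are made up, and the loop assumes the tutorial script evaluates test accuracy on every 10th step.

```python
class FakeRun:
    """Stand-in for the submitted run; records what would be logged."""

    def __init__(self):
        self.metrics = {}

    def log(self, name, value):
        # Repeated calls with the same name build up a series of values.
        self.metrics.setdefault(name, []).append(value)


run = FakeRun()
max_steps = 200
acc = 0.0
for i in range(max_steps):
    if i % 10 == 0:                        # test accuracy is evaluated here
        acc = round(i / max_steps, 3)      # made-up value for illustration
        if i % 50 == 0:
            run.log('Accuracy_test', acc)  # logged on steps 0, 50, 100, 150

run.log('Accuracy', acc)                   # final value, logged once after the loop
print(run.metrics)
```

Logging only every 50th step keeps the metric series short while still showing the trend of the run.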

## F10 copy the script over

In [8]:
# Or just run this cell
from shutil import copyfile
copyfile('../misc/mnist_with_summaries.py', os.path.join(source_folder, 'mnist_with_summaries.py'))

'./demo/mnist_with_summaries.py'

## F11 run the same way as above

In [11]:
tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=p100cluster,
                          entry_script='mnist_with_summaries.py')

run = mnist.submit(tf_estimator)
RunDetails(run).show()

_UserRun(widget_settings={'childWidgetDisplay': 'popup'})

## F12 wasn't there another way to visualize the metrics of a model? Tensorboard.

In [7]:
# just add the log_dir parameter
script_params={
    '--log_dir': './logs',
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=p100cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = mnist.submit(tf_estimator)
RunDetails(run).show()

_UserRun(widget_settings={'childWidgetDisplay': 'popup'})
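
The keys in `script_params` are command-line flags for the training script. How the estimator builds the command line is not shown in this workshop. The sketch below assumes each key/value pair becomes a flag followed by its value, and uses a reduced parser with placeholder defaults to show how the script would read them.

```python
import argparse

script_params = {
    '--log_dir': './logs',
    '--max_steps': 5000,
}

# Flatten the dictionary into an argument list, as a command line would carry it.
argv = []
for flag, value in script_params.items():
    argv += [flag, str(value)]

# Reduced version of the parser in the training script (defaults are placeholders).
parser = argparse.ArgumentParser()
parser.add_argument('--log_dir', type=str, default='/tmp/logs')
parser.add_argument('--max_steps', type=int, default=1000)
parser.add_argument('--learning_rate', type=float, default=0.001)
FLAGS, unparsed = parser.parse_known_args(argv)

print(argv)
print(FLAGS.log_dir, FLAGS.max_steps, FLAGS.learning_rate, unparsed)
```

Flags that are not passed keep the script's defaults, which is why adding `--log_dir` alone does not change the training behaviour.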

## 1. Start Tensorboard

Now, while the run is in progress, we just need to start Tensorboard with the run as its target, and it will begin streaming logs.

In [8]:
tb = Tensorboard(run)
tb.start()

http://danielsc:6006


'http://danielsc:6006'

## Stop Tensorboard

When you're done, make sure to call the `stop()` method of the Tensorboard object, or it will stay running even after your job completes.

In [None]:
tb.stop()

## F8 But what if my data is not sitting somewhere on the internet?

In [None]:
ds = ws.get_default_datastore()
mnist_data = ds.upload(src_dir = '/Users/danielsc/data/mnist', target_path = 'mnist', show_progress = True)


## F9 Run the same as above but with script_param pointing to the data

In [None]:
# run the same way as above
script_params={
    '--log_dir': './logs',
    '--data_dir': mnist_data,
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=p100cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = mnist.submit(tf_estimator)
RunDetails(run).show()

## 4. Hyperparameter tuning

In [None]:
# same as above but increase the max_steps and remove the parameters
script_params={
    '--log_dir': './logs',
    '--data_dir': mnist_data,
    '--max_steps': 5000
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=p100cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)


## 5. Kicking off the job

In [None]:
ps = RandomParameterSampling(
    {
        '--learning_rate': loguniform(-15, -3),
        '--dropout': uniform(0.5, 0.95)
    }
)

early_termination_policy = BanditPolicy(slack_factor = 0.15, evaluation_interval=2)

hyperdrive_run_config = HyperDriveRunConfig(estimator = tf_estimator, 
                                            hyperparameter_sampling = ps, 
                                            policy = early_termination_policy,
                                            primary_metric_name = "Accuracy_test",
                                            primary_metric_goal = PrimaryMetricGoal.MAXIMIZE,
                                            max_total_runs = 20,
                                            max_concurrent_runs = 5)

hd_run = mnist.submit(hyperdrive_run_config)

RunDetails(hd_run).show()
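
To get a feel for the search space and the early-termination rule, here is a plain-Python sketch that does not use the SDK. It assumes `loguniform(a, b)` means exp(uniform(a, b)) and that the bandit rule for a maximized metric keeps a run only if its metric is at least best / (1 + slack_factor). The helper names are hypothetical.

```python
import math
import random


def sample_loguniform(low, high, rng):
    """exp(uniform(low, high)): the logarithm of the result is uniform in [low, high]."""
    return math.exp(rng.uniform(low, high))


def sample_config(rng):
    """Draw one set of hyperparameters from the same ranges as the cell above."""
    return {
        '--learning_rate': sample_loguniform(-15, -3, rng),
        '--dropout': rng.uniform(0.5, 0.95),
    }


def bandit_keeps(run_metric, best_metric, slack_factor=0.15):
    """For a maximized metric: keep runs within the slack of the best run so far."""
    return run_metric >= best_metric / (1 + slack_factor)


rng = random.Random(0)
configs = [sample_config(rng) for _ in range(20)]  # one draw per run, max_total_runs = 20
rates = [c['--learning_rate'] for c in configs]
print(min(rates), max(rates))

# With the best run at 0.92 the cut-off is about 0.92 / 1.15 = 0.8.
print(bandit_keeps(0.85, 0.92), bandit_keeps(0.75, 0.92))
```

Under these assumptions the learning rate ranges from roughly 3e-7 (e^-15) to about 0.05 (e^-3), so most of the sampled values are very small.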

## 7. What if I don't know what type of model to choose?

In [None]:
automl_config = AutoMLConfig(task = 'classification',
                             path=source_folder,
                             compute_target = p100cluster,
                             data_script = source_folder + "/get_data.py",
                             max_time_sec = 600,
                             iterations = 20,
                             n_cross_validations = 5,
                             primary_metric = Metric.AUCWeighted,
                             concurrent_iterations = 10)

remote_run = mnist.submit(automl_config)

RunDetails(remote_run).show()

In [None]:
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.constants import Metric
from ignite_demo.get_data import get_data
import time, logging


automl_config = AutoMLConfig(task = 'classification',
                             debug_log = 'automl_errors.log',
                             primary_metric = 'AUC_weighted',
                             iterations = 10,
                             n_cross_validations = 3,
                             verbosity = logging.INFO,
                             X = get_data()['X'], 
                             y = get_data()['y'],
                             max_cores_per_iteration = 1,
                             enforce_time_on_windows = False,
                             path=source_folder)

local_run = mnist.submit(automl_config, show_output=True)
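
The cell above only shows that `get_data()` returns a mapping with the keys `'X'` and `'y'`. The sketch below is a hypothetical stand-in for such a function, using small made-up lists (the real `get_data.py` is not shown here and presumably returns arrays), and loads the data once so that both parts come from the same call.

```python
import random


def get_data():
    """Hypothetical stand-in for get_data.py: returns features and labels."""
    rng = random.Random(0)
    X = [[rng.random() for _ in range(4)] for _ in range(12)]  # 12 rows, 4 features
    y = [i % 3 for i in range(12)]                             # 3 classes
    return {'X': X, 'y': y}


data = get_data()            # load once, then reuse both parts
X, y = data['X'], data['y']
print(len(X), len(X[0]), sorted(set(y)))
```

Calling the function once avoids loading the data twice, which matters when `get_data()` downloads or preprocesses a large dataset.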