## Slides

Slides:

* https://microsoft.sharepoint.com/:p:/t/Vienna/EQR04ke7teRLmRtCaghJ4QwBnFrIqu6sIo4LqAogPdMtTw?e=cXd9cZ
* https://microsoft-my.sharepoint.com/:p:/p/sejuare/EXAS0v9XvOtMuqttDUItuqoBXzwt78ITVNqUfATM0ffgaA?e=WTHaRK

## Setup

1. Install the VS Code extension from https://msasg.visualstudio.com/OpenMind%20Studio/OpenMind%20Studio%20Team/_build/results?buildId=3354388
   Make sure it uses a recent version of the Python SDK: remove the folder `~/.azureml/envs`, and a current SDK will be installed when you first use AML from VSCode.
2. Install the Python SDK: https://github.com/Azure/ViennaDocs/tree/master/PrivatePreview
   Make sure to install the `automl`, `notebooks`, and `contrib` extras:
```shell
pip install --upgrade azureml-sdk[notebooks,contrib,automl]
```
3. Clone this repository
```shell
git clone https://github.com/Azure/AzureMLUsabilityStudy
cd AzureMLUsabilityStudy/ignite
jupyter notebook
```

## Before each demo
* Create a new, empty notebook in the `ignite` directory of the repo.
* If you have run the demo before, delete `mnist_with_summaries.py` from the `ignite_demo` folder.
* Open VSCode at the `ignite_demo` directory.
* Go to the `ignite` directory and start Jupyter Notebook there.
* Bring up the cluster by running:

```python
provisioning_config = BatchAiCompute.provisioning_configuration(vm_size = "STANDARD_NC6",
                                                                autoscale_enabled = True,
                                                                cluster_min_nodes = 10, 
                                                                cluster_max_nodes = 20)

nc6_cluster = ComputeTarget.create(ws, 
                                   name = "nc6_cluster", 
                                   provisioning_configuration=provisioning_config)
```
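
The cleanup bullet in the checklist above is easy to forget, so it can be scripted. This is a minimal sketch, assuming it runs from the `ignite` directory so that `./ignite_demo` is the right relative path. `remove_stale_script` is a helper name made up for this example, not part of the repo.

```python
import os


def remove_stale_script(folder="./ignite_demo", filename="mnist_with_summaries.py"):
    """Delete a leftover copy of the demo script; return True if one was removed."""
    path = os.path.join(folder, filename)
    if os.path.isfile(path):
        os.remove(path)
        return True
    # Nothing to clean up (first run, or the folder does not exist yet).
    return False


print("removed stale script:", remove_stale_script())
```

Running it twice is harmless: the second call finds nothing and returns `False`.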

## F1 Download TensorFlow demo code (we will just do this in the browser)


In [1]:
source_folder = './ignite_demo'
import requests
import os
tf_code = requests.get("https://raw.githubusercontent.com/tensorflow/tensorflow/r1.8/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py")
with open(os.path.join(source_folder, "mnist_with_summaries.py"), "w") as file:
    file.write(tf_code.text)
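
The cell above writes whatever the server returns, even an error page, and fails if `ignite_demo` does not exist. Below is a slightly more defensive sketch. It swaps `requests` for the standard library's `urllib.request` to avoid a third-party dependency. The helper name `download_script` and the 30-second timeout are assumptions for illustration.

```python
import os
import urllib.request


def download_script(url, folder, filename, timeout=30):
    """Fetch url and write its text to folder/filename; return the path written."""
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    # so an error page is never written to disk.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    # Create the target folder if this is the first run.
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, filename)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path


# Example call (needs network access), using the URL from the cell above:
# download_script(
#     "https://raw.githubusercontent.com/tensorflow/tensorflow/r1.8/"
#     "tensorflow/examples/tutorials/mnist/mnist_with_summaries.py",
#     "./ignite_demo",
#     "mnist_with_summaries.py",
# )
```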

## run the script locally (in VSCode)

In [2]:
## running the script in VSCode


## F2 make sure we import everything -- makes for faster typing and intellisense

In [3]:
from azureml.train.dnn import TensorFlow
from azureml.contrib.tensorboard import Tensorboard
from azureml.core import Workspace, Experiment
from azureml.core.compute import ComputeTarget, BatchAiCompute
from azureml.train.widgets import RunDetails
from azureml.train.hyperdrive import *
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.constants import Metric
from ignite_demo.get_data import get_data
import time
import azureml.core
import math

source_folder = './ignite_demo'
print("SDK version:", azureml.core.VERSION)

SDK version: 0.1.59


## F3 This is how you create a workspace (we have already done this before the talk)


In [4]:
ws = Workspace.create(name='DanielSc', 
                      subscription_id='15ae9cb6-95c1-483d-a0e3-b1a1a3b06324', 
                      resource_group='DanielSc', 
                      location='eastus2',
                      exist_ok=True)
ws.name

'DanielSc'

## F4 This is how you create a Batch AI cluster (we have already done this before the talk)

In [5]:
provisioning_config = BatchAiCompute.provisioning_configuration(vm_size = "STANDARD_NC6",
                                                                autoscale_enabled = True,
                                                                cluster_min_nodes = 10, 
                                                                cluster_max_nodes = 20)

nc6_cluster = ComputeTarget.create(ws, 
                                   name = "nc6_cluster", 
                                   provisioning_configuration=provisioning_config)


## F5 let's create an experiment in VSCode and then retrieve it here

In [6]:

#mnist = ws.experiments()['ignite']
mnist = Experiment(ws, 'ignite2')

## F6 now we run the script on our cluster, tracked in the mnist Experiment

(Talking points: what is an estimator, how is the TensorFlow estimator different, and what other estimators are there?)

In [7]:
tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py')

run = mnist.submit(tf_estimator)

## F7 How can we monitor the experiment?

In [8]:
RunDetails(run).show()

_UserRun()

## Let's visualize the Accuracy so we can better see the progress of the training run

1. Add this import at the top of `mnist_with_summaries.py`:
```python
from azureml.core.run import Run
```
2. Add the `Run.get_submitted_run().log(...)` call near line 165, so that you get this:
```python
    print('Accuracy at step %s: %s' % (i, acc))
    if i % 50 == 0:
        Run.get_submitted_run().log('Accuracy_test', acc)
```
3. Add the same call with a different metric name further down:
```python
    test_writer.close()
    Run.get_submitted_run().log('Accuracy', acc)
```

4. Add this at the end of the file to log the chosen parameters:
```python
    FLAGS, unparsed = parser.parse_known_args()
    Run.get_submitted_run().log('learning_rate', FLAGS.learning_rate)
    Run.get_submitted_run().log('dropout', FLAGS.dropout)
```
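
To see the logging pattern from the steps above in isolation, here is a small sketch that runs without the SDK. `RecordingRun` is a hypothetical stand-in for the object returned by `Run.get_submitted_run()`, and the accuracy value is a placeholder rather than a real evaluation. Only the `log(name, value)` call shape comes from the snippets above.

```python
class RecordingRun:
    """Hypothetical stand-in for the run object; it only records log() calls."""

    def __init__(self):
        self.metrics = {}

    def log(self, name, value):
        self.metrics.setdefault(name, []).append(value)


def train(run, max_steps=200, log_every=50):
    """Mimic the training loop: evaluate every 10 steps, log every log_every steps."""
    acc = 0.0
    for i in range(max_steps):
        if i % 10 == 0:
            # Placeholder for the real test-accuracy evaluation.
            acc = i / max_steps
            if i % log_every == 0:
                run.log('Accuracy_test', acc)
    # One final value after training, under a different metric name.
    run.log('Accuracy', acc)
    return acc


run = RecordingRun()
train(run)
print(run.metrics)
```

Fetching the run object once and passing it around, as here, also avoids calling `Run.get_submitted_run()` on every logged step.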

## F10 copy the script over

In [117]:
# Or just run this cell
from shutil import copyfile
copyfile('../ignite_misc/save/mnist_with_summaries.py', os.path.join(source_folder, 'mnist_with_summaries.py'))

'./ignite_demo/mnist_with_summaries.py'

## F11 run the same way as above

In [118]:
tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py')

run = mnist.submit(tf_estimator)
RunDetails(run).show()

_UserRun()

## F12 wasn't there another way to visualize the metrics of a model? Tensorboard.

In [119]:
# just add the log_dir parameter
script_params={
    '--log_dir': './logs',
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = mnist.submit(tf_estimator)
RunDetails(run).show()

_UserRun()

## 1. Start Tensorboard

Now, while the run is in progress, we just need to start Tensorboard with the run as its target, and it will begin streaming logs.

In [121]:
tb = Tensorboard(run)
tb.start()

http://danielscMBP.local:6006


'http://danielscMBP.local:6006'

In [122]:
run

Experiment,Id,Type,Status,Details Page,Docs Page
ignite2,ignite2_1537677517789,azureml.scriptrun,Finalizing,Link to Azure Portal,Link to Documentation


## Stop Tensorboard

When you're done, make sure to call the `stop()` method of the Tensorboard object, or it will stay running even after your job completes.

In [123]:
tb.stop()
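
Since a forgotten `stop()` leaves Tensorboard running, one option is to wrap the start/stop pair in a context manager. This is a sketch, not part of the SDK: `running` is a name made up here, and `FakeTensorboard` is a hypothetical stand-in so the example runs without Azure ML. It relies only on the object having `start()` (returning the URL, as in the output above) and `stop()`.

```python
from contextlib import contextmanager


@contextmanager
def running(tb):
    """Start tb, yield whatever start() returns, and always call stop()."""
    url = tb.start()
    try:
        yield url
    finally:
        # Runs even if the body raises or the cell is interrupted.
        tb.stop()


class FakeTensorboard:
    """Hypothetical stand-in exposing the same start()/stop() methods."""

    def __init__(self):
        self.started = False
        self.stopped = False

    def start(self):
        self.started = True
        return "http://localhost:6006"

    def stop(self):
        self.stopped = True


tb = FakeTensorboard()
with running(tb) as url:
    print("Tensorboard at", url)
```

In the notebook, `running(Tensorboard(run))` would be used the same way, with the body of the `with` block waiting on the run.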

## F8 But what if my data is not sitting somewhere on the internet?

In [124]:
ds = ws.get_default_datastore()
mnist_data = ds.upload(src_dir = '/Users/danielsc/data/mnist', target_path = 'mnist', show_progress = True)


Target already exists. Skipping upload for mnist/t10k-images-idx3-ubyte.gz
Target already exists. Skipping upload for mnist/train-images-idx3-ubyte.gz
Target already exists. Skipping upload for mnist/train-labels-idx1-ubyte.gz
Target already exists. Skipping upload for mnist/t10k-labels-idx1-ubyte.gz


## F9 Run the same as above but with script_param pointing to the data

In [125]:
# run the same way as above
script_params={
    '--log_dir': './logs',
    '--data_dir': mnist_data,
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = mnist.submit(tf_estimator)
RunDetails(run).show()

_UserRun()

## 2. Now let's try some different hyperparameter combinations

In [126]:
# just adding the hyperparameters
script_params={
    '--log_dir': './logs',
    '--data_dir': mnist_data,
    '--learning_rate': 0.0005,
    '--dropout': 0.85
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = mnist.submit(tf_estimator)

RunDetails(run).show()

_UserRun()

In [127]:
# same as above but slightly different values
script_params={
    '--log_dir': './logs',
    '--data_dir': mnist_data,
    '--learning_rate': 0.01,
    '--dropout': 0.5
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = mnist.submit(tf_estimator)

RunDetails(run).show()

_UserRun()

In [128]:
# same as above but slightly different values
script_params={
    '--log_dir': './logs',
    '--data_dir': mnist_data,
    '--learning_rate': 0.001,
    '--dropout': 0.8
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = mnist.submit(tf_estimator)

RunDetails(run).show()

_UserRun()
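
The three cells above differ only in two values, so the parameter dictionaries could be generated in a loop. This sketch builds only the dictionaries. The `'--data_dir': mnist_data` entry is left out because `mnist_data` only exists in the notebook session, and `build_script_params` is a helper name made up for this example.

```python
# Settings shared by every run.
base_params = {'--log_dir': './logs'}

# (learning_rate, dropout) pairs taken from the three cells above.
combinations = [
    (0.0005, 0.85),
    (0.01, 0.5),
    (0.001, 0.8),
]


def build_script_params(base, learning_rate, dropout):
    """Return a new dict with the hyperparameters added; base is not modified."""
    params = dict(base)
    params['--learning_rate'] = learning_rate
    params['--dropout'] = dropout
    return params


all_params = [build_script_params(base_params, lr, do) for lr, do in combinations]
for params in all_params:
    print(params)
```

In the notebook, each dictionary would be passed as `script_params` to `TensorFlow(...)` and submitted with `mnist.submit(...)`, exactly as in the cells above.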

## That's nice, but all that starting of different runs was a lot of work
## 4. Hyperparameter tuning

In [129]:
# same as above, but increase max_steps and drop the fixed hyperparameters (HyperDrive will sample them)
script_params={
    '--log_dir': './logs',
    '--data_dir': mnist_data,
    '--max_steps': 5000
}

tf_estimator = TensorFlow(source_directory=source_folder,
                          compute_target=nc6_cluster,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)


## 5. Kicking off the job

In [130]:
ps = RandomParameterSampling(
    {
        '--learning_rate': loguniform(-15, -3),
        '--dropout': uniform(0.5, 0.95)
    }
)

early_termination_policy = BanditPolicy(slack_factor = 0.15, evaluation_interval=2)

hyperdrive_run_config = HyperDriveRunConfig(estimator = tf_estimator, 
                                            hyperparameter_sampling = ps, 
                                            policy = early_termination_policy,
                                            primary_metric_name = "Accuracy_test",
                                            primary_metric_goal = PrimaryMetricGoal.MAXIMIZE,
                                            max_total_runs = 20,
                                            max_concurrent_runs = 5)

hd_run = mnist.submit(hyperdrive_run_config)

RunDetails(hd_run).show()

_HyperDrive(widget_settings={'childWidgetDisplay': 'popup'})
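
The numbers in the cell above are easier to read with a little arithmetic. The sketch below rests on two assumptions, which are my reading of the HyperDrive documentation rather than anything shown in this notebook. First, `loguniform(a, b)` draws `exp(uniform(a, b))`. Second, for a maximized metric, `BanditPolicy` cancels runs whose metric falls below `best / (1 + slack_factor)`. The helper names are made up for illustration.

```python
import math
import random


def sample_loguniform(low, high, rng=random):
    """Assumed semantics of loguniform(low, high): exp of a uniform draw."""
    return math.exp(rng.uniform(low, high))


def bandit_cutoff(best_metric, slack_factor):
    """Assumed cancellation threshold for a maximized metric."""
    return best_metric / (1.0 + slack_factor)


# Range of learning rates implied by loguniform(-15, -3): roughly 3e-7 to 0.05.
lr_min, lr_max = math.exp(-15), math.exp(-3)
print("learning rate range: %.3g to %.3g" % (lr_min, lr_max))

# A few draws with a fixed seed, to see how they spread across orders of magnitude.
rng = random.Random(0)
samples = [sample_loguniform(-15, -3, rng) for _ in range(5)]
print(["%.3g" % s for s in samples])

# With slack_factor=0.15, a run must stay within about 87% of the best metric so far.
print("cutoff when best accuracy is 0.98:", bandit_cutoff(0.98, 0.15))
```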

## 7. What if I don't know what type of model to choose?

In [9]:
automl_config = AutoMLConfig(task = 'classification',
                             path=source_folder,
                             compute_target = nc6_cluster,
                             data_script = source_folder + "/get_data.py",
                             max_time_sec = 600,
                             iterations = 20,
                             n_cross_validations = 5,
                             primary_metric = Metric.AUCWeighted,
                             concurrent_iterations = 10)

remote_run = mnist.submit(automl_config)

RunDetails(remote_run).show()

_AutoML(widget_settings={'childWidgetDisplay': 'popup'})

In [30]:
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.constants import Metric
from ignite_demo.get_data import get_data
import time, logging


automl_config = AutoMLConfig(task = 'classification',
                             debug_log = 'automl_errors.log',
                             primary_metric = 'AUC_weighted',
                             iterations = 10,
                             n_cross_validations = 3,
                             verbosity = logging.INFO,
                             X = get_data()['X'], 
                             y = get_data()['y'],
                             max_cores_per_iteration = 1,
                             enforce_time_on_windows = False,
                             path=source_folder)

local_run = mnist.submit(automl_config, show_output=True)

Parent Run ID: AutoML_bc505304-1e3d-4fc6-ba34-3b112e0e44a7
***********************************************************************************************
ITERATION: The iteration being evaluated.
PIPELINE: A summary description of the pipeline being evaluated.
DURATION: Time taken for the current iteration.
METRIC: The result of computing score on the fitted pipeline.
BEST: The best observed score thus far.
***********************************************************************************************

 ITERATION     PIPELINE                               DURATION                METRIC      BEST
         0     

Your function call closed the pipe prematurely -> Subprocess probably got an uncatchable signal.
Traceback (most recent call last):
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/site-packages/azureml/train/automl/_limit_function_call.py", line 304, in __call__
    self2.result, self2.exit_status = parent_conn.recv()
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError


                                       0:00:06.367580           0.000     0.000
ERROR:                                         
         1      StandardScalerWrapper KNeighborsClassi0:00:20.744270           0.998     0.998
         2      MaxAbsScaler LightGBMClassifier       0:00:23.332491           0.998     0.998
         3      MaxAbsScaler DecisionTreeClassifier   0:00:17.269414           0.832     0.998
         4      SparseNormalizer LightGBMClassifier   0:00:28.490736           0.998     0.998
         5      StandardScalerWrapper KNeighborsClassi0:00:20.767038           0.998     0.998
         6      StandardScalerWrapper LightGBMClassifi0:00:26.035480           0.999     0.999
         7     

Your function call closed the pipe prematurely -> Subprocess probably got an uncatchable signal.
Traceback (most recent call last):
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/site-packages/azureml/train/automl/_limit_function_call.py", line 304, in __call__
    self2.result, self2.exit_status = parent_conn.recv()
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/Users/danielsc/miniconda3/envs/preview/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError


                                       0:00:05.778395           0.000     0.999
ERROR:                                         
         8                                            0:14:18.179034           0.000     0.999
ERROR: local variable 'dependencies' referenced before assignment
Received interrupt. Returning now.
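
One small thing about the local AutoML cell: it calls `get_data()` twice, once for `X` and once for `y`, so the data is loaded twice. A sketch of loading it once follows. The `get_data` here is a hypothetical stand-in with made-up toy values, which only mirrors the `{'X': ..., 'y': ...}` shape used in the cell.

```python
def get_data():
    """Hypothetical stand-in for ignite_demo.get_data.get_data; counts its calls."""
    get_data.calls += 1
    return {'X': [[0.0, 1.0], [1.0, 0.0]], 'y': [0, 1]}


get_data.calls = 0

# Load once, then reuse both parts.
data = get_data()
X, y = data['X'], data['y']

print("get_data was called", get_data.calls, "time(s)")
```

`X` and `y` would then be passed to `AutoMLConfig(..., X = X, y = y, ...)` in place of the two `get_data()` calls.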

In [37]:
for m in ws.models():
    print(m.name)

mnist_tf_model


In [45]:
m = ws.models('mnist_tf_model')[0]


In [49]:
w = ws.webservices(model_name='mnist_tf_model')[0]

In [55]:
i = ws.images(model_id="mnist_tf_model:1")[0]

In [57]:
m.id

'mnist_tf_model:1'

In [58]:
i.id

'tf-mnist:1'

In [60]:
w.name

'aci-service-mnist'

In [61]:
w.cname

In [71]:
list(ws.experiments().values())[0]

Name,Workspace,Report Page,Docs Page
tensorflow-hyperdrive,DanielSc,Link to Azure Portal,Link to Documentation


In [72]:
e = ws.experiments()['test']

In [81]:
next(e.get_runs(properties={'Id':'AutoML_865262da-3cbc-41d9-8c9c-90e28a1df6bd_19'}))

StopIteration: 
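
The `StopIteration` above is what `next()` raises when the iterator yields nothing, here presumably because no run matched the filter. Passing a default to `next()` avoids the exception. A minimal sketch in plain Python, with `first_or_none` as a made-up helper name:

```python
def first_or_none(iterable):
    """Return the first item of iterable, or None if it is empty."""
    return next(iter(iterable), None)


# An empty generator stands in for a query that matches no runs.
no_matches = (r for r in [])
print(first_or_none(no_matches))

# With at least one item, the first one is returned.
print(first_or_none(["run_a", "run_b"]))
```

In the notebook this would be `first_or_none(e.get_runs(...))`, followed by a check for `None`.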

In [77]:
next(e.get_runs())

Experiment,Id,Type,Status,Details Page,Docs Page
test,AutoML_865262da-3cbc-41d9-8c9c-90e28a1df6bd_19,azureml.scriptrun,Completed,Link to Azure Portal,Link to Documentation
