# Object Detection with PyTorch and Mask R-CNN 

In this tutorial, you will fine-tune a pre-trained [Mask R-CNN](https://arxiv.org/abs/1703.06870) model on images from the [Penn-Fudan Database for Pedestrian Detection and Segmentation](https://www.cis.upenn.edu/~jshi/ped_html/). The dataset has 170 images with 345 instances of pedestrians.

## Prerequisites

- If you are using an Azure Machine Learning Notebook VM, your environment already meets these prerequisites. Otherwise, go through the [Configuration](https://docs.microsoft.com/azure/machine-learning/how-to-configure-environment) steps to install the Azure Machine Learning Python SDK and [create an Azure ML Workspace](https://docs.microsoft.com/azure/machine-learning/how-to-manage-workspace#create-a-workspace).


In [32]:
# Check core SDK version number
import azureml.core

print("SDK version:", azureml.core.VERSION)

SDK version: 1.0.65


## Diagnostics

Opt in to diagnostics collection to help improve the experience, quality, and security of future releases.

In [33]:
from azureml.telemetry import set_diagnostics_collection

set_diagnostics_collection(send_diagnostics=True)

Turning diagnostics collection on. 


## Initialize a workspace

Initialize a [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) object from the existing workspace you created in the Prerequisites step. [`Workspace.from_config()`](https://docs.microsoft.com/python/api/azureml-core/azureml.core.workspace(class)?view=azure-ml-py#from-config-path-none--auth-none---logger-none---file-name-none-) creates a workspace object from the details stored in `config.json`.

In [34]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep='\n')

Workspace name: gopalv-ws
Azure region: westus2
Subscription id: 15ae9cb6-95c1-483d-a0e3-b1a1a3b06324
Resource group: aifxdemo
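
As a rough illustration of what `Workspace.from_config()` reads, the sketch below writes and re-reads a `config.json` with the three keys that identify a workspace. The values are placeholders (assumptions, not real identifiers), and the file is written to a temporary directory purely so the example has no side effects.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values (assumptions): substitute your own workspace details.
config = {
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "workspace_name": "<your-workspace-name>",
}

# Write the file to a scratch directory so the sketch leaves no files behind
# in your project.
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(config, indent=4))

# Read it back and confirm the keys that identify a workspace are present.
loaded = json.loads(config_path.read_text())
missing = {"subscription_id", "resource_group", "workspace_name"} - loaded.keys()
print("missing keys:", sorted(missing))
```

In a real project, this file would typically sit next to the notebook or in a parent directory so that `from_config()` can find it.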


## Create or attach existing Azure ML Managed Compute

You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/concept-compute-target) for training your model. In this tutorial, we use [Azure ML managed compute](https://docs.microsoft.com/azure/machine-learning/how-to-set-up-training-targets#amlcompute) for our remote training compute resource. Specifically, the code below creates a `STANDARD_NC6` GPU cluster that autoscales from 0 to 4 nodes.

**Creation of Compute takes approximately 5 minutes.** If an Azure ML Compute with that name is already in your workspace, this code will skip the creation process.

As with other Azure services, there are limits on certain resources associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/azure/machine-learning/how-to-manage-quotas) on the default limits and how to request more quota.

> Note that the code below creates GPU compute. If you instead want to create CPU compute, provide a different VM size to the `vm_size` parameter, such as `STANDARD_D2_V2`.

In [35]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException


# choose a name for your cluster
cluster_name = 'gpu-cluster'

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target.')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', 
                                                           max_nodes=4)

    # create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

    compute_target.wait_for_completion(show_output=True)

# use get_status() to get a detailed status for the current cluster. 
print(compute_target.get_status().serialize())

Found existing compute target.
{'currentNodeCount': 1, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 1, 'preemptedNodeCount': 0}, 'allocationState': 'Resizing', 'allocationStateTransitionTime': '2020-03-06T18:53:17.364000+00:00', 'errors': None, 'creationTime': '2020-03-05T14:55:28.441267+00:00', 'modifiedTime': '2020-03-05T15:05:30.544256+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT120S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_NC6'}
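
The serialized status above is a plain dictionary, so it can be condensed into something easier to scan. The helper below is a hypothetical convenience function (not part of the SDK) applied to an abridged copy of the status shown above.

```python
def summarize_cluster_status(status: dict) -> str:
    """Return a one-line summary of a serialized AmlCompute status dict."""
    scale = status["scaleSettings"]
    running = status["nodeStateCounts"]["runningNodeCount"]
    return (
        f"{status['vmSize']}: {status['currentNodeCount']} node(s) now, "
        f"{running} running, "
        f"autoscale {scale['minNodeCount']}-{scale['maxNodeCount']}, "
        f"state={status['allocationState']}"
    )


# Abridged copy of the status printed above.
example_status = {
    "currentNodeCount": 1,
    "nodeStateCounts": {"runningNodeCount": 0, "leavingNodeCount": 1},
    "allocationState": "Resizing",
    "scaleSettings": {"minNodeCount": 0, "maxNodeCount": 4},
    "vmSize": "STANDARD_NC6",
}

summary = summarize_cluster_status(example_status)
print(summary)
```

Here the cluster is scaling back down toward its minimum of 0 nodes, which is why one node is reported as leaving.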


## Train model on the remote compute

### Create a project directory
Create a directory that will contain all the code from your local machine that you will need access to on the remote resource. This includes the training script and any additional files your training script depends on.

In [36]:
import os

project_folder = './pytorch-peds'

try:
    os.makedirs(project_folder, exist_ok=False)
except FileExistsError:
    print(f'project folder {project_folder} exists, moving on...')

project folder ./pytorch-peds exists, moving on...


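For reference, see [this setup notebook](https://github.com/drabastomek/GTC/blob/master/SJ_2020/workshop/1_Setup/Setup.ipynb) and the following sample Dockerfile from Jordan. Note that the sample installs TensorFlow and Horovod rather than PyTorch, so treat it as a template: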

```
FROM mcr.microsoft.com/azureml/base-gpu:intelmpi2018.3-cuda9.0-cudnn7-ubuntu16.04

# Install Horovod, temporarily using CUDA stubs
RUN ldconfig /usr/local/cuda/lib64/stubs && \
    # Install AzureML SDK
    pip install --no-cache-dir azureml-defaults && \
    # Install TensorFlow, Keras, and supporting Python packages
    pip install --no-cache-dir tensorflow==2.0.0b1 tensorflow-gpu==2.0.0b1 keras==2.0.8 matplotlib==3.0.3 seaborn==0.9.0 requests==2.21.0 bs4==0.0.1 imageio==2.5.0 sklearn pandas==0.24.2 numpy==1.16.2 hickle==3.4.3 && \
    # Install Horovod
    pip install --no-cache-dir horovod==0.13.5 && \
    ldconfig
```

### Copy training script and dependencies into project directory

In [37]:
import shutil

shutil.copy('data.py', project_folder)
shutil.copy('model.py', project_folder)
shutil.copy('script.py', project_folder)

files_to_copy = ['utils', 'transforms', 'coco_eval', 'engine', 'coco_utils']
for file in files_to_copy:
    shutil.copy('./'+ file + '.py', project_folder)

In [38]:
# Uncomment to fetch the torchvision detection helper scripts copied above
# !git clone https://github.com/pytorch/vision.git

# %cd vision
# !git checkout v0.3.0
# !cp references/detection/utils.py ../
# !cp references/detection/transforms.py ../
# !cp references/detection/coco_eval.py ../
# !cp references/detection/engine.py ../
# !cp references/detection/coco_utils.py ../
# %cd ..



### Download data and upload to Azure blob storage

First, we download the sample dataset and extract the images to local storage.

In [39]:
import urllib.request

from zipfile import ZipFile

data_file = './test.zip'

urllib.request.urlretrieve('https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip', data_file)
# extract the archive; the context manager closes the file afterwards
with ZipFile(file=data_file) as zip_file:
    zip_file.extractall()
!ls PennFudanPed/

Annotation  PNGImages  PedMasks  added-object-list.txt	readme.txt
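
Each image in `PNGImages` has a matching segmentation mask in `PedMasks` whose file name adds a `_mask` suffix (for example, `FudanPed00018_mask.png`). The sketch below shows one way to pair them up. The helper name `pair_images_with_masks` is hypothetical, and the demo runs on a small stand-in directory of empty files rather than the real dataset.

```python
import tempfile
from pathlib import Path


def pair_images_with_masks(root):
    """Return (image_path, mask_path) pairs matched by file name stem."""
    root = Path(root)
    pairs = []
    for image_path in sorted((root / "PNGImages").glob("*.png")):
        mask_path = root / "PedMasks" / f"{image_path.stem}_mask.png"
        # Skip images that have no corresponding mask file.
        if mask_path.exists():
            pairs.append((image_path, mask_path))
    return pairs


# Build a tiny stand-in directory tree with empty files for the demo.
demo_root = Path(tempfile.mkdtemp()) / "PennFudanPed"
for sub in ("PNGImages", "PedMasks"):
    (demo_root / sub).mkdir(parents=True)
for stem in ("FudanPed00001", "PennPed00001"):
    (demo_root / "PNGImages" / f"{stem}.png").touch()
    (demo_root / "PedMasks" / f"{stem}_mask.png").touch()

pairs = pair_images_with_masks(demo_root)
print([(image.name, mask.name) for image, mask in pairs])
```

To try it on the real data, pass `'./PennFudanPed'` as `root` after the extraction step above.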


Then, we upload the data files to the datastore associated with this workspace, so that we can access them during training.

In [40]:
# get the default datastore
ds = ws.get_default_datastore()
print(ds.name, ds.datastore_type, ds.account_name, ds.container_name)

ds.upload('./PennFudanPed', target_path=None, overwrite=False)

workspaceblobstore AzureBlob gopalvws3790775563 azureml-blobstore-e47496c6-9688-4277-a05b-ceb722514b9d
Uploading an estimated of 512 files
Target already exists. Skipping upload for added-object-list.txt
Target already exists. Skipping upload for readme.txt
Target already exists. Skipping upload for Annotation/FudanPed00001.txt
Target already exists. Skipping upload for Annotation/FudanPed00002.txt
Target already exists. Skipping upload for Annotation/FudanPed00003.txt
Target already exists. Skipping upload for Annotation/FudanPed00004.txt
Target already exists. Skipping upload for Annotation/FudanPed00005.txt
Target already exists. Skipping upload for Annotation/FudanPed00006.txt
Target already exists. Skipping upload for Annotation/FudanPed00007.txt
Target already exists. Skipping upload for Annotation/FudanPed00008.txt
Target already exists. Skipping upload for Annotation/FudanPed00009.txt
Target already exists. Skipping upload for Annotation/FudanPed00010.txt
...

Target already exists. Skipping upload for Annotation/PennPed00038.txt
Target already exists. Skipping upload for Annotation/PennPed00039.txt
Target already exists. Skipping upload for Annotation/PennPed00040.txt
Target already exists. Skipping upload for Annotation/PennPed00041.txt
Target already exists. Skipping upload for Annotation/PennPed00042.txt
Target already exists. Skipping upload for Annotation/PennPed00043.txt
Target already exists. Skipping upload for Annotation/PennPed00044.txt
Target already exists. Skipping upload for Annotation/PennPed00045.txt
Target already exists. Skipping upload for Annotation/PennPed00046.txt
Target already exists. Skipping upload for Annotation/PennPed00047.txt
Target already exists. Skipping upload for Annotation/PennPed00048.txt
Target already exists. Skipping upload for Annotation/PennPed00049.txt
Target already exists. Skipping upload for Annotation/PennPed00050.txt
Target already exists. Skipping upload for Annotation/PennPed00051.txt
...

Target already exists. Skipping upload for PNGImages/FudanPed00064.png
Target already exists. Skipping upload for PNGImages/FudanPed00065.png
Target already exists. Skipping upload for PNGImages/FudanPed00066.png
Target already exists. Skipping upload for PNGImages/FudanPed00067.png
Target already exists. Skipping upload for PNGImages/FudanPed00068.png
Target already exists. Skipping upload for PNGImages/FudanPed00069.png
Target already exists. Skipping upload for PNGImages/FudanPed00070.png
Target already exists. Skipping upload for PNGImages/FudanPed00071.png
Target already exists. Skipping upload for PNGImages/FudanPed00072.png
Target already exists. Skipping upload for PNGImages/FudanPed00073.png
Target already exists. Skipping upload for PNGImages/FudanPed00074.png
Target already exists. Skipping upload for PNGImages/PennPed00001.png
Target already exists. Skipping upload for PNGImages/PennPed00002.png
Target already exists. Skipping upload for PNGImages/PennPed00003.png
...

Target already exists. Skipping upload for PedMasks/FudanPed00018_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00019_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00020_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00021_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00022_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00023_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00024_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00025_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00026_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00027_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00028_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00029_mask.png
Target already exists. Skipping upload for PedMasks/FudanPed00030_mask.png
...

Target already exists. Skipping upload for PedMasks/PennPed00054_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00055_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00056_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00057_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00058_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00059_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00060_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00061_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00062_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00063_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00064_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00065_mask.png
Target already exists. Skipping upload for PedMasks/PennPed00066_mask.png
...

$AZUREML_DATAREFERENCE_workspaceblobstore

### Register a dataset


In [41]:
from azureml.core import Dataset

dataset_name = 'penn_ds'
datastore_paths = [(ds, 'data')]
penn_ds = Dataset.File.from_files(path=datastore_paths)
penn_ds.register(workspace=ws,
                 name=dataset_name,
                 description='Penn Fudan pedestrian data')

{
  "source": [
    "('workspaceblobstore', 'data')"
  ],
  "definition": [
    "GetDatastoreFiles"
  ],
  "registration": {
    "name": "penn_ds",
    "version": 1,
    "description": "Penn Fudan pedestrian data",
    "workspace": "Workspace.create(name='gopalv-ws', subscription_id='15ae9cb6-95c1-483d-a0e3-b1a1a3b06324', resource_group='aifxdemo')"
  }
}

### Create an experiment

In [42]:
from azureml.core import Experiment

experiment_name = 'pytorch-peds'
experiment = Experiment(ws, name=experiment_name)

### Specify dependencies with a custom Dockerfile

There are a number of ways to [use environments](https://docs.microsoft.com/azure/machine-learning/how-to-use-environments) for specifying dependencies during model training. In this case, we use a custom Dockerfile.

In [43]:
from azureml.core import Environment

my_env = Environment(name='maskr-docker')
my_env.docker.enabled = True
with open("dockerfiles/Dockerfile1", "r") as f:
    dockerfile_contents_of_your_base_image = f.read()
my_env.docker.base_dockerfile = dockerfile_contents_of_your_base_image
my_env.docker.base_image = None
my_env.docker.gpu_support = True
my_env.python.interpreter_path = '/opt/miniconda/bin/python'
my_env.python.user_managed_dependencies = True





### Create a ScriptRunConfig

Use the [ScriptRunConfig](https://docs.microsoft.com/python/api/azureml-core/azureml.core.scriptrunconfig?view=azure-ml-py) class to define your run. Specify the source directory, compute target, and environment.

In [44]:
from azureml.core import ScriptRunConfig

model_name = 'pytorch-peds'
output_dir = './outputs/test'
n_epochs = 10

script_args = [
    '--dataset_name', dataset_name,
    '--model_name', model_name,
    '--output_dir', output_dir,
    '--n_epochs', n_epochs
]
# Add training script to run config
runconfig = ScriptRunConfig(
    source_directory=project_folder,
    script="script.py",
    arguments=script_args)

# Attach compute target to run config
runconfig.run_config.target = cluster_name

# Uncomment the line below if you want to try this locally first
#runconfig.run_config.target = "local"

# Attach environment to run config
runconfig.run_config.environment = my_env
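
The entries in `script_args` reach `script.py` as command-line strings. The training script itself is not shown in this notebook, so the following is only a sketch of how such a script might parse them with `argparse`. The function name `parse_args` and the default values are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags passed via script_args (hypothetical entry point)."""
    parser = argparse.ArgumentParser(description="Train Mask R-CNN on Penn-Fudan")
    parser.add_argument("--dataset_name", type=str, required=True)
    parser.add_argument("--model_name", type=str, required=True)
    # Defaults below are illustrative assumptions, not taken from script.py.
    parser.add_argument("--output_dir", type=str, default="./outputs")
    # Arguments arrive as strings, so convert the epoch count to an int here.
    parser.add_argument("--n_epochs", type=int, default=10)
    return parser.parse_args(argv)


# Simulate the command line that the run configuration above would produce.
args = parse_args([
    "--dataset_name", "penn_ds",
    "--model_name", "pytorch-peds",
    "--output_dir", "./outputs/test",
    "--n_epochs", "10",
])
print(args.dataset_name, args.model_name, args.output_dir, args.n_epochs)
```

Declaring `type=int` matters because the integer `n_epochs` in `script_args` is delivered to the script as the string `'10'`.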

### Submit your run

In [45]:
# Submit run 
run = experiment.submit(runconfig)

# to get more details of your run
print(run.get_details())

{'runId': 'pytorch-peds_1583520975_43211984', 'target': 'gpu-cluster', 'status': 'Starting', 'properties': {'_azureml.ComputeTargetType': 'amlcompute', 'ContentSnapshotId': '72aa4d9c-801c-4b7a-92ec-dfeb69f48ceb', 'azureml.git.repository_uri': 'git@github.com:gvashishtha/pytorch-object.git', 'mlflow.source.git.repoURL': 'git@github.com:gvashishtha/pytorch-object.git', 'azureml.git.branch': 'register-model', 'mlflow.source.git.branch': 'register-model', 'azureml.git.commit': '164fbede8c31e67bdc3b9e310a39e6d2765f3046', 'mlflow.source.git.commit': '164fbede8c31e67bdc3b9e310a39e6d2765f3046', 'azureml.git.dirty': 'False', 'AzureML.DerivedImageName': 'azureml/azureml_9125fd9b495cfdec8f7bf56c6d28d91d'}, 'inputDatasets': [], 'runDefinition': {'script': 'script.py', 'useAbsolutePath': False, 'arguments': ['--dataset_name', 'penn_ds', '--model_name', 'pytorch-peds', '--output_dir', './outputs/test', '--n_epochs', '10'], 'sourceDirectoryDataStore': None, 'framework': 'Python', 'communicator': 'Non

### Monitor your run

In [46]:
from azureml.widgets import RunDetails

RunDetails(run).show()
run.wait_for_completion(show_output=True)

_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': True, 'log_level': 'INFO', 's…

RunId: pytorch-peds_1583520975_43211984
Web View: https://mlworkspace.azure.ai/portal/subscriptions/15ae9cb6-95c1-483d-a0e3-b1a1a3b06324/resourceGroups/aifxdemo/providers/Microsoft.MachineLearningServices/workspaces/gopalv-ws/experiments/pytorch-peds/runs/pytorch-peds_1583520975_43211984

Streaming azureml-logs/55_azureml-execution-tvmps_b3d6525af5ca16b5531c7c8df206f75e7a1399e0c12532e1ede88b949915b934_d.txt

2020-03-06T19:01:19Z Starting output-watcher...
2020-03-06T19:01:19Z IsDedicatedCompute == True, won't poll for Low Pri Preemption
Login Succeeded
Using default tag: latest
latest: Pulling from azureml/azureml_9125fd9b495cfdec8f7bf56c6d28d91d
7ddbc47eeb70: Pulling fs layer
c1bbdc448b72: Pulling fs layer
8c3b70e39044: Pulling fs layer
45d437916d57: Pulling fs layer
d8f1569ddae6: Pulling fs layer
85386706b020: Pulling fs layer
ee9b457b77d0: Pulling fs layer
be4f3343ecd3: Pulling fs layer
30b4effda4fd: Pulling fs layer
b398e882f414: Pulling fs layer
f2e1f2321196: Pulling fs layer
1e87

imgs is ['FudanPed00001.png', 'FudanPed00002.png', 'FudanPed00003.png', 'FudanPed00004.png', 'FudanPed00005.png', 'FudanPed00006.png', 'FudanPed00007.png', 'FudanPed00008.png', 'FudanPed00009.png', 'FudanPed00010.png', 'FudanPed00011.png', 'FudanPed00012.png', 'FudanPed00013.png', 'FudanPed00014.png', 'FudanPed00015.png', 'FudanPed00016.png', 'FudanPed00017.png', 'FudanPed00018.png', 'FudanPed00019.png', 'FudanPed00020.png', 'FudanPed00021.png', 'FudanPed00022.png', 'FudanPed00023.png', 'FudanPed00024.png', 'FudanPed00025.png', 'FudanPed00026.png', 'FudanPed00027.png', 'FudanPed00028.png', 'FudanPed00029.png', 'FudanPed00030.png', 'FudanPed00031.png', 'FudanPed00032.png', 'FudanPed00033.png', 'FudanPed00034.png', 'FudanPed00035.png', 'FudanPed00036.png', 'FudanPed00037.png', 'FudanPed00038.png', 'FudanPed00039.png', 'FudanPed00040.png', 'FudanPed00041.png', 'FudanPed00042.png', 'FudanPed00043.png', 'FudanPed00044.png', 'FudanPed00045.png', 'FudanPed00046.png', 'FudanPed00047.png', 'Fud

Epoch: [0]  [ 0/60]  eta: 0:02:44  lr: 0.000090  loss: 3.5827 (3.5827)  loss_classifier: 0.7385 (0.7385)  loss_box_reg: 0.1523 (0.1523)  loss_mask: 2.6620 (2.6620)  loss_objectness: 0.0224 (0.0224)  loss_rpn_box_reg: 0.0076 (0.0076)  time: 2.7464  data: 0.1213  max mem: 2303
Epoch: [0]  [10/60]  eta: 0:01:21  lr: 0.000936  loss: 1.5610 (2.1212)  loss_classifier: 0.4484 (0.4976)  loss_box_reg: 0.1826 (0.1906)  loss_mask: 0.9259 (1.4017)  loss_objectness: 0.0224 (0.0208)  loss_rpn_box_reg: 0.0090 (0.0105)  time: 1.6329  data: 0.0198  max mem: 2862
Epoch: [0]  [20/60]  eta: 0:01:02  lr: 0.001783  loss: 0.8701 (1.4318)  loss_classifier: 0.2394 (0.3411)  loss_box_reg: 0.1579 (0.1732)  loss_mask: 0.4010 (0.8839)  loss_objectness: 0.0191 (0.0216)  loss_rpn_box_reg: 0.0099 (0.0120)  time: 1.5068  data: 0.0080  max mem: 2862
Epoch: [0]  [30/60]  eta: 0:00:47  lr: 0.002629  loss: 0.5360 (1.1214)  loss_classifier: 0.0940 (0.2570)  loss_box_reg: 0.1155 (0.1599)  loss_mask: 0.2490 (0.6752)  loss_ob

Epoch: [2]  [10/60]  eta: 0:01:18  lr: 0.005000  loss: 0.1848 (0.1741)  loss_classifier: 0.0291 (0.0291)  loss_box_reg: 0.0137 (0.0158)  loss_mask: 0.1111 (0.1199)  loss_objectness: 0.0006 (0.0011)  loss_rpn_box_reg: 0.0063 (0.0082)  time: 1.5780  data: 0.0150  max mem: 3597
Epoch: [2]  [20/60]  eta: 0:01:00  lr: 0.005000  loss: 0.1434 (0.1592)  loss_classifier: 0.0195 (0.0235)  loss_box_reg: 0.0094 (0.0123)  loss_mask: 0.1075 (0.1156)  loss_objectness: 0.0004 (0.0008)  loss_rpn_box_reg: 0.0045 (0.0070)  time: 1.5153  data: 0.0061  max mem: 3597
Epoch: [2]  [30/60]  eta: 0:00:46  lr: 0.005000  loss: 0.1586 (0.1813)  loss_classifier: 0.0271 (0.0297)  loss_box_reg: 0.0092 (0.0169)  loss_mask: 0.1153 (0.1248)  loss_objectness: 0.0003 (0.0012)  loss_rpn_box_reg: 0.0078 (0.0087)  time: 1.5495  data: 0.0063  max mem: 3597
Epoch: [2]  [40/60]  eta: 0:00:31  lr: 0.005000  loss: 0.1961 (0.1826)  loss_classifier: 0.0331 (0.0293)  loss_box_reg: 0.0153 (0.0166)  loss_mask: 0.1266 (0.1269)  loss_ob

Epoch: [4]  [10/60]  eta: 0:01:18  lr: 0.000500  loss: 0.1588 (0.1612)  loss_classifier: 0.0245 (0.0223)  loss_box_reg: 0.0109 (0.0131)  loss_mask: 0.1094 (0.1175)  loss_objectness: 0.0004 (0.0007)  loss_rpn_box_reg: 0.0074 (0.0075)  time: 1.5720  data: 0.0142  max mem: 3597
Epoch: [4]  [20/60]  eta: 0:01:00  lr: 0.000500  loss: 0.1588 (0.1640)  loss_classifier: 0.0245 (0.0246)  loss_box_reg: 0.0074 (0.0118)  loss_mask: 0.1094 (0.1198)  loss_objectness: 0.0004 (0.0012)  loss_rpn_box_reg: 0.0059 (0.0066)  time: 1.5186  data: 0.0065  max mem: 3597
Epoch: [4]  [30/60]  eta: 0:00:45  lr: 0.000500  loss: 0.1458 (0.1668)  loss_classifier: 0.0254 (0.0270)  loss_box_reg: 0.0078 (0.0124)  loss_mask: 0.1012 (0.1185)  loss_objectness: 0.0005 (0.0014)  loss_rpn_box_reg: 0.0059 (0.0074)  time: 1.4981  data: 0.0067  max mem: 3597
Epoch: [4]  [40/60]  eta: 0:00:31  lr: 0.000500  loss: 0.1489 (0.1642)  loss_classifier: 0.0258 (0.0269)  loss_box_reg: 0.0090 (0.0121)  loss_mask: 0.1073 (0.1168)  loss_ob

Epoch: [6]  [10/60]  eta: 0:01:23  lr: 0.000050  loss: 0.1387 (0.1445)  loss_classifier: 0.0206 (0.0221)  loss_box_reg: 0.0060 (0.0074)  loss_mask: 0.1018 (0.1091)  loss_objectness: 0.0002 (0.0005)  loss_rpn_box_reg: 0.0059 (0.0054)  time: 1.6753  data: 0.0158  max mem: 3597
Epoch: [6]  [20/60]  eta: 0:01:06  lr: 0.000050  loss: 0.1387 (0.1525)  loss_classifier: 0.0192 (0.0236)  loss_box_reg: 0.0064 (0.0106)  loss_mask: 0.1011 (0.1112)  loss_objectness: 0.0002 (0.0005)  loss_rpn_box_reg: 0.0059 (0.0067)  time: 1.6492  data: 0.0063  max mem: 3597
Epoch: [6]  [30/60]  eta: 0:00:49  lr: 0.000050  loss: 0.1449 (0.1536)  loss_classifier: 0.0252 (0.0254)  loss_box_reg: 0.0078 (0.0101)  loss_mask: 0.1028 (0.1108)  loss_objectness: 0.0003 (0.0006)  loss_rpn_box_reg: 0.0077 (0.0068)  time: 1.6225  data: 0.0063  max mem: 3597
Epoch: [6]  [40/60]  eta: 0:00:32  lr: 0.000050  loss: 0.1516 (0.1606)  loss_classifier: 0.0254 (0.0267)  loss_box_reg: 0.0078 (0.0113)  loss_mask: 0.1046 (0.1147)  loss_ob

Epoch: [8]  [10/60]  eta: 0:01:19  lr: 0.000050  loss: 0.1523 (0.1525)  loss_classifier: 0.0254 (0.0232)  loss_box_reg: 0.0066 (0.0106)  loss_mask: 0.1112 (0.1119)  loss_objectness: 0.0003 (0.0006)  loss_rpn_box_reg: 0.0045 (0.0061)  time: 1.5929  data: 0.0265  max mem: 3597
Epoch: [8]  [20/60]  eta: 0:01:03  lr: 0.000050  loss: 0.1489 (0.1610)  loss_classifier: 0.0217 (0.0228)  loss_box_reg: 0.0084 (0.0119)  loss_mask: 0.1101 (0.1181)  loss_objectness: 0.0004 (0.0010)  loss_rpn_box_reg: 0.0067 (0.0072)  time: 1.5820  data: 0.0054  max mem: 3597
Epoch: [8]  [30/60]  eta: 0:00:47  lr: 0.000050  loss: 0.1395 (0.1570)  loss_classifier: 0.0217 (0.0229)  loss_box_reg: 0.0093 (0.0120)  loss_mask: 0.1005 (0.1141)  loss_objectness: 0.0003 (0.0009)  loss_rpn_box_reg: 0.0078 (0.0071)  time: 1.5616  data: 0.0070  max mem: 3597
Epoch: [8]  [40/60]  eta: 0:00:31  lr: 0.000050  loss: 0.1451 (0.1638)  loss_classifier: 0.0232 (0.0247)  loss_box_reg: 0.0099 (0.0123)  loss_mask: 0.1117 (0.1181)  loss_ob


Streaming azureml-logs/75_job_post-tvmps_b3d6525af5ca16b5531c7c8df206f75e7a1399e0c12532e1ede88b949915b934_d.txt

Starting job release. Current time:2020-03-06T19:24:25.307059
Logging experiment finalizing status in history service.
Starting the daemon thread to refresh tokens in background for process with pid = 815
Job release is complete. Current time:2020-03-06T19:24:27.834744

Execution Summary
RunId: pytorch-peds_1583520975_43211984
Web View: https://mlworkspace.azure.ai/portal/subscriptions/15ae9cb6-95c1-483d-a0e3-b1a1a3b06324/resourceGroups/aifxdemo/providers/Microsoft.MachineLearningServices/workspaces/gopalv-ws/experiments/pytorch-peds/runs/pytorch-peds_1583520975_43211984



{'runId': 'pytorch-peds_1583520975_43211984',
 'target': 'gpu-cluster',
 'status': 'Completed',
 'startTimeUtc': '2020-03-06T19:01:19.232987Z',
 'endTimeUtc': '2020-03-06T19:25:07.235836Z',
 'properties': {'_azureml.ComputeTargetType': 'amlcompute',
  'ContentSnapshotId': '72aa4d9c-801c-4b7a-92ec-dfeb69f48ceb',
  'azureml.git.repository_uri': 'git@github.com:gvashishtha/pytorch-object.git',
  'mlflow.source.git.repoURL': 'git@github.com:gvashishtha/pytorch-object.git',
  'azureml.git.branch': 'register-model',
  'mlflow.source.git.branch': 'register-model',
  'azureml.git.commit': '164fbede8c31e67bdc3b9e310a39e6d2765f3046',
  'mlflow.source.git.commit': '164fbede8c31e67bdc3b9e310a39e6d2765f3046',
  'azureml.git.dirty': 'False',
  'AzureML.DerivedImageName': 'azureml/azureml_9125fd9b495cfdec8f7bf56c6d28d91d',
  'ProcessInfoFile': 'azureml-logs/process_info.json',
  'ProcessStatusFile': 'azureml-logs/process_status.json'},
 'inputDatasets': [{'dataset': {'id': '1cd2dd50-ae6d-44f9-81e2-f0
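
The streamed training log above follows a regular pattern, so the learning rate and loss can be pulled out for plotting or comparison. The sketch below assumes the line layout shown in the log. The names `LOG_PATTERN` and `parse_log_line` are hypothetical, the sample lines are abridged copies of lines from the output, and the parenthesized number is treated as a running average, which is an assumption about the logger.

```python
import re

# Matches lines such as:
# "Epoch: [0]  [10/60]  eta: 0:01:21  lr: 0.000936  loss: 1.5610 (2.1212) ..."
LOG_PATTERN = re.compile(
    r"Epoch: \[(?P<epoch>\d+)\]\s+"
    r"\[\s*(?P<step>\d+)/(?P<total>\d+)\]\s+"
    r"eta: \S+\s+"
    r"lr: (?P<lr>[\d.]+)\s+"
    r"loss: (?P<loss>[\d.]+) \((?P<loss_avg>[\d.]+)\)"
)


def parse_log_line(line):
    """Return a dict of metrics for a training log line, or None otherwise."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "step": int(match["step"]),
        "total_steps": int(match["total"]),
        "lr": float(match["lr"]),
        "loss": float(match["loss"]),
        "loss_avg": float(match["loss_avg"]),
    }


# Abridged lines copied from the log above, plus one non-training line.
sample_lines = [
    "Epoch: [0]  [ 0/60]  eta: 0:02:44  lr: 0.000090  loss: 3.5827 (3.5827)",
    "Epoch: [2]  [10/60]  eta: 0:01:18  lr: 0.005000  loss: 0.1848 (0.1741)",
    "Login Succeeded",
]

records = [r for r in map(parse_log_line, sample_lines) if r is not None]
for record in records:
    print(record)
```

Lines that do not match, such as the Docker pull messages, are skipped.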

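### Register the model from your latest run

In [47]:
# get_runs() yields runs in reverse chronological order, so this is the most
# recent run; it is assumed here to have completed successfully
last_successful = next(experiment.get_runs())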


In [48]:
last_successful.get_properties()

{'_azureml.ComputeTargetType': 'amlcompute',
 'ContentSnapshotId': '72aa4d9c-801c-4b7a-92ec-dfeb69f48ceb',
 'azureml.git.repository_uri': 'git@github.com:gvashishtha/pytorch-object.git',
 'mlflow.source.git.repoURL': 'git@github.com:gvashishtha/pytorch-object.git',
 'azureml.git.branch': 'register-model',
 'mlflow.source.git.branch': 'register-model',
 'azureml.git.commit': '164fbede8c31e67bdc3b9e310a39e6d2765f3046',
 'mlflow.source.git.commit': '164fbede8c31e67bdc3b9e310a39e6d2765f3046',
 'azureml.git.dirty': 'False',
 'AzureML.DerivedImageName': 'azureml/azureml_9125fd9b495cfdec8f7bf56c6d28d91d',
 'ProcessInfoFile': 'azureml-logs/process_info.json',
 'ProcessStatusFile': 'azureml-logs/process_status.json'}

In [51]:
last_successful.get_file_names()
last_successful.register_model(model_name=model_name, model_path=model_name)

Model(workspace=Workspace.create(name='gopalv-ws', subscription_id='15ae9cb6-95c1-483d-a0e3-b1a1a3b06324', resource_group='aifxdemo'), name=pytorch-peds, id=pytorch-peds:4, version=4, tags={}, properties={})

In [None]:
# Alternative: register the model from an explicit path in the run's outputs
last_successful.register_model(model_name=model_name, model_path='outputs/model.pt')

### Retrieve your registered model

In [4]:
from azureml.core import Model

model = Model(workspace=ws, name=model_name)

### Download your model and run predictions

We download the model parameters that were registered after the training run above and use them to initialize a model for inference. We then run inference on a single test image and display the results.

In [5]:
import torch
from azureml.core import Dataset
from data import PennFudanDataset
from model import get_instance_segmentation_model
from script import NUM_CLASSES, get_transform

# download the registered model weights to the current directory
path = model.download(target_dir='.', exist_ok=True)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# rebuild the architecture and load the trained weights
predict_model = get_instance_segmentation_model(NUM_CLASSES)
predict_model.to(device)
predict_model.load_state_dict(torch.load(path, map_location=device))

penn_ds = Dataset.get_by_name(workspace=ws, name='penn_ds')
dataset_test = PennFudanDataset(penn_ds, get_transform(train=False))

# pick one image from the test set
img, _ = dataset_test[0]
# put the model in evaluation mode and run inference without tracking gradients
predict_model.eval()
with torch.no_grad():
    prediction = predict_model([img.to(device)])

imgs is ['FudanPed00001.png', 'FudanPed00002.png', 'FudanPed00003.png', 'FudanPed00004.png', 'FudanPed00005.png', 'FudanPed00006.png', 'FudanPed00007.png', 'FudanPed00008.png', 'FudanPed00009.png', 'FudanPed00010.png', 'FudanPed00011.png', 'FudanPed00012.png', 'FudanPed00013.png', 'FudanPed00014.png', 'FudanPed00015.png', 'FudanPed00016.png', 'FudanPed00017.png', 'FudanPed00018.png', 'FudanPed00019.png', 'FudanPed00020.png', 'FudanPed00021.png', 'FudanPed00022.png', 'FudanPed00023.png', 'FudanPed00024.png', 'FudanPed00025.png', 'FudanPed00026.png', 'FudanPed00027.png', 'FudanPed00028.png', 'FudanPed00029.png', 'FudanPed00030.png', 'FudanPed00031.png', 'FudanPed00032.png', 'FudanPed00033.png', 'FudanPed00034.png', 'FudanPed00035.png', 'FudanPed00036.png', 'FudanPed00037.png', 'FudanPed00038.png', 'FudanPed00039.png', 'FudanPed00040.png', 'FudanPed00041.png', 'FudanPed00042.png', 'FudanPed00043.png', 'FudanPed00044.png', 'FudanPed00045.png', 'FudanPed00046.png', 'FudanPed00047.png', 'Fud

ImportError: /home/gopalv/miniconda3/envs/azureml/lib/python3.6/site-packages/torchvision/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c1012CUDATensorIdEv

### Display the input image

In [None]:
from PIL import Image

# convert the CxHxW float tensor in [0, 1] to an HxWxC uint8 image
Image.fromarray(img.mul(255).permute(1, 2, 0).byte().numpy())

### Display the predicted mask

In [None]:
Image.fromarray(prediction[0]['masks'][0, 0].mul(255).byte().cpu().numpy())
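
The cell above displays only the first predicted mask. In evaluation mode, torchvision's Mask R-CNN returns a confidence score for each detection and soft masks with values between 0 and 1, so a common follow-up is to keep only confident detections and binarize their masks. The sketch below uses plain Python lists as stand-ins for `prediction[0]['scores'].tolist()` and one mask's values. The helper names and the 0.5 thresholds are assumptions.

```python
def select_confident(scores, score_threshold=0.5):
    """Return indices of detections whose score meets the threshold."""
    return [i for i, score in enumerate(scores) if score >= score_threshold]


def binarize_mask(mask, mask_threshold=0.5):
    """Turn a 2-D grid of mask probabilities into 0/1 values."""
    return [[1 if p >= mask_threshold else 0 for p in row] for row in mask]


# Hypothetical scores standing in for prediction[0]['scores'].tolist().
example_scores = [0.99, 0.87, 0.12]
# Hypothetical 2x3 soft mask standing in for one entry of prediction[0]['masks'].
example_mask = [
    [0.02, 0.61, 0.97],
    [0.10, 0.49, 0.88],
]

keep = select_confident(example_scores)
binary = binarize_mask(example_mask)
print(keep)
print(binary)
```

With real tensors, the same filtering can be expressed directly as boolean indexing on `prediction[0]['scores']`.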

In [None]:
penn_ds = Dataset.get_by_name(workspace=ws, name='penn_ds')
penn_ds.to_path()
penn_ds.download('./PennFudan', overwrite=True)


In [None]:
import os

sorted(os.listdir(os.path.join('./PennFudan', 'data', 'PNGImages')))