# TABLE OF CONTENTS:
---
* [Notebook Summary](#Notebook-Summary)
* [Setup](#Setup)
    * [Connect to Workspace](#Connect-to-Workspace)
* [Data](#Data)
    * [Overview](#Overview)
    * [Download & Extract Data](#Download-&-Extract-Data)
    * [Upload Data](#Upload-Data)
    * [Explore Data](#Explore-Data)
    * [Create and Register AML Dataset](#Create-and-Register-AML-Dataset)
---

# Notebook Summary

This notebook will download the [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) from the Stanford Vision website to the local compute and then upload it to the Azure Machine Learning (AML) workspace default blob storage. It will also create an AML file dataset that can be used for easy data access during the ML lifecycle.

Run this notebook from the Jupyter kernel that has been created in `00_environment_setup`.

# Setup

Append the parent directory to `sys.path` so that the modules in the `src` directory can be imported.

In [1]:
import os
import sys

sys.path.append(os.path.dirname(os.path.abspath("")))

Automatically reload modules when changes are made.

In [2]:
%load_ext autoreload
%autoreload 2

Import libraries and modules.

In [3]:
# Import libraries
import azureml.core
import torchvision
from azureml.core import Dataset, Workspace

# Import created modules
from src.utils import download_stanford_dogs_archives, extract_stanford_dogs_archives, load_data, show_image, show_batch_of_images

print(f"azureml.core version: {azureml.core.VERSION}")

azureml.core version: 1.20.0


### Connect to Workspace

In order to connect and communicate with the AML workspace, a workspace object needs to be instantiated using the AML Python SDK.

In [4]:
# Connect to the AML workspace using interactive authentication
ws = Workspace.from_config()

# Data

### Overview

The [Stanford Dogs Dataset](http://vision.stanford.edu/aditya86/ImageNetDogs/) is an image dataset that will be used to train a multiclass dog breed classification model. It contains 20,580 images across 120 dog breeds (classes) and was built from ImageNet images and annotations for the task of fine-grained image categorization. The images are three-channel color images of varying size. The website provides a file with a predefined train/test split; here, the training set is further split into training and validation sets (80:20). This results in the following data distribution:
- 9600 training images (46.65%)
- 2400 validation images (11.66%)
- 8580 test images (41.69%)
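
The split sizes and percentages follow from simple arithmetic. The sketch below is only an illustration, assuming the predefined split provides 12,000 training images (inferred from 9,600 + 2,400) and 8,580 test images, and that the 80:20 ratio means training:validation. The constant and variable names are hypothetical.

```python
# Assumed counts: 12,000 predefined training images (9,600 + 2,400) and 8,580 test images.
TOTAL_IMAGES = 20_580
OFFICIAL_TRAIN = 12_000
OFFICIAL_TEST = 8_580
TRAIN_FRACTION = 0.8  # 80:20 training/validation split of the predefined training set

n_train = round(OFFICIAL_TRAIN * TRAIN_FRACTION)
n_val = OFFICIAL_TRAIN - n_train

split_sizes = {"train": n_train, "val": n_val, "test": OFFICIAL_TEST}

# The three splits should account for every image in the dataset.
assert sum(split_sizes.values()) == TOTAL_IMAGES

for name, size in split_sizes.items():
    print(f"{name}: {size} images ({size / TOTAL_IMAGES:.2%})")
```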

### Download & Extract Data

Download the data to the local compute.

The utility file `<PROJECT_ROOT>/src/utils/data_utils.py` provides functions that download the dogs dataset archive files from the Stanford Vision website and extract them into the folder layout expected by [torchvision.datasets.ImageFolder](https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder).
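
For orientation, `ImageFolder` expects one sub-folder per class and assigns class indices in sorted folder-name order. The following is a minimal standard-library sketch of that convention, not the project's extraction code. It assumes `train`, `val` and `test` split folders (only `val` appears elsewhere in this notebook), and the `find_classes` helper here is a hypothetical stand-in written for illustration.

```python
import tempfile
from pathlib import Path


def find_classes(split_dir: Path) -> dict:
    """Map each class sub-folder to an index, in sorted order (the ImageFolder convention)."""
    class_names = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(class_names)}


# Build a tiny stand-in for the assumed layout: <root>/<split>/<class>/<image>.jpg
root = Path(tempfile.mkdtemp())
for split in ("train", "val", "test"):
    for breed in ("n02085782-Japanese_spaniel", "n02085620-Chihuahua"):
        class_dir = root / split / breed
        class_dir.mkdir(parents=True)
        (class_dir / "example.jpg").touch()  # empty placeholder, not a real image

class_to_idx = find_classes(root / "train")
print(class_to_idx)
```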

In [5]:
# Download the dataset archive files
download_stanford_dogs_archives()

Downloading http://vision.stanford.edu/aditya86/ImageNetDogs/images.tar to /mnt/batch/tasks/shared/LS_root/mounts/clusters/brikse1-cpu/code/Users/BRIKSE1-CPU/pytorch-use-cases-azure-ml/stanford_dogs/data/archives/images/images.tar.

Downloading http://vision.stanford.edu/aditya86/ImageNetDogs/lists.tar to /mnt/batch/tasks/shared/LS_root/mounts/clusters/brikse1-cpu/code/Users/BRIKSE1-CPU/pytorch-use-cases-azure-ml/stanford_dogs/data/archives/lists/lists.tar.


In [None]:
# Extract the dataset archives and remove them after extraction
extract_stanford_dogs_archives()

Lists.tar archive has been extracted successfully.
File lists have been read successfully.
Extracting images.tar archive...



### Upload Data

Upload the data to the default AML datastore.

In [None]:
datastore = ws.get_default_datastore()
datastore.upload(src_dir="../data", target_path="data/stanford_dogs", overwrite=True)

### Explore Data

Load the data into memory. A utility function to create dataloaders has been created as part of the `<PROJECT_ROOT>/src/utils/data_utils.py` script.

In [None]:
# Load data
dataloaders, dataset_sizes, class_names = load_data("../data")

Display an example image. Note that image dimensions vary across the dataset.

In [None]:
show_image(image_path="../data/val/n02085620-Chihuahua/n02085620_1152.jpg")

Display the first batch of 4 images.

In [None]:
# Get the first batch of validation images
dataiter = iter(dataloaders["val"])
images, labels = next(dataiter)

# Show images
show_batch_of_images(torchvision.utils.make_grid(images))
# Print the breed names (split on the first hyphen only, since some breed names contain hyphens)
print("\n".join(class_names[label].split("-", 1)[1] for label in labels))
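
The class folder names combine an ImageNet ID and a breed name, as in `n02085620-Chihuahua`. A small helper along the following lines could turn them into readable labels. `breed_label` is a hypothetical name, not part of `src.utils`, and the second folder name is used only as an example of a breed name that itself contains a hyphen.

```python
def breed_label(class_name: str) -> str:
    """Turn a folder name such as 'n02085620-Chihuahua' into a readable breed label."""
    # Split only on the first hyphen so hyphenated breed names stay intact.
    _, breed = class_name.split("-", 1)
    return breed.replace("_", " ")


print(breed_label("n02085620-Chihuahua"))
print(breed_label("n02099267-flat-coated_retriever"))
```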

### Create and Register AML Dataset

Register the data as a file dataset in the AML workspace for easy accessibility throughout the ML lifecycle.

In [None]:
# Create a dataset object from the datastore location
dataset = Dataset.File.from_files(path=(datastore, "data/stanford_dogs"))

In [None]:
# Register the dataset
dataset = dataset.register(workspace=ws,
                           name="stanford-dogs-dataset",
                           description="Stanford Dogs Dataset containing training, validation and test data",
                           create_new_version=True)
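
Once registered, the dataset can be retrieved by name in later notebooks or scripts. The sketch below assumes the `azureml-core` SDK used in this notebook and a workspace object such as `ws`. The function name and the default `target_path` are hypothetical choices for illustration.

```python
def download_registered_dataset(workspace, target_path="./stanford_dogs_download",
                                name="stanford-dogs-dataset"):
    """Fetch the registered file dataset by name and download its files locally."""
    # Imported lazily so the function can be defined without the SDK installed.
    from azureml.core import Dataset

    # Retrieves the latest registered version by default.
    dataset = Dataset.get_by_name(workspace, name=name)
    # Returns the local paths of the downloaded files.
    return dataset.download(target_path=target_path, overwrite=True)


# Example usage (requires a configured workspace):
# file_paths = download_registered_dataset(ws)
```

For large datasets on remote compute, mounting the dataset instead of downloading it may be preferable; which option fits best depends on the training setup.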