<img align="right" style="max-width: 200px; height: auto" src="https://github.com/HSG-AIML/LabAIML/blob/main/challenge/hsg_logo.png?raw=1">

###  AI:ML Mini Project Coding Challenge - "Tiny ImageNet Visual Classification"

Introduction to ML and DL, University of St. Gallen, Fall Term 2021

This helper notebook for the AI:ML Coding Challenge shows how to load and inspect the **Stanford CS231N Tiny ImageNet** dataset used in the challenge. Further details on the dataset, such as dataset statistics and label information, are available at: https://tiny-imagenet.herokuapp.com.

Let's import the `torchvision` library used throughout this notebook:

In [None]:
import torchvision

Enable inline plotting:

In [None]:
%matplotlib inline

## 0. Using your Google Drive

If you are using Google Colab (which we highly recommend), you can connect your Google Drive to your Colab runtime environment, thus providing a way to save your notebooks and data permanently:

In [None]:
import os
# mount Google Drive and switch to the notebook folder (uncomment to run on Google Colab)
#from google.colab import drive
#drive.mount('/content/drive')
#os.chdir('/content/drive/MyDrive/Colab Notebooks/')
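The commented-out lines above only work inside a Colab runtime. A minimal sketch of how the mount could be guarded so that the same cell also runs on a local machine; the helper name `running_in_colab` is hypothetical, and the Drive path is the one used above:

```python
import os


def running_in_colab():
    """Return True if the google.colab package can be imported (hypothetical helper)."""
    try:
        import google.colab  # noqa: F401  # only available inside a Colab runtime
        return True
    except ImportError:
        return False


if running_in_colab():
    # same steps as the commented-out lines above
    from google.colab import drive
    drive.mount('/content/drive')
    os.chdir('/content/drive/MyDrive/Colab Notebooks/')
else:
    print('Not running in Colab, keeping the current working directory.')
```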

###  A. Load the Train Image Data as Plain JPEG

Load the training images of the Tiny ImageNet dataset:

In [None]:
# download image data (uncomment to run if necessary)
#!wget http://cs231n.stanford.edu/tiny-imagenet-200.zip
#!unzip -q tiny-imagenet-200.zip && ls tiny-imagenet-200

# define the directory of the tiny imagenet training images
train_images_dir = './tiny-imagenet-200/train'

# load the tiny imagenet training images
train_dataset = torchvision.datasets.ImageFolder(train_images_dir)
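The commented-out `wget` and `unzip` commands can also be expressed in plain Python, which avoids downloading the archive again on every run. The following is a sketch using only the standard library; the helper name `ensure_tiny_imagenet` is hypothetical, and the URL and folder name are taken from the commands above:

```python
import os
import urllib.request
import zipfile

# URL taken from the commented-out wget command above
DATASET_URL = 'http://cs231n.stanford.edu/tiny-imagenet-200.zip'


def ensure_tiny_imagenet(root='.', url=DATASET_URL):
    """Download and unpack the dataset into `root` unless it is already there."""
    dataset_dir = os.path.join(root, 'tiny-imagenet-200')

    # skip everything if the extracted folder already exists
    if os.path.isdir(dataset_dir):
        return dataset_dir

    # download the archive only if it is not present yet
    archive_path = os.path.join(root, 'tiny-imagenet-200.zip')
    if not os.path.isfile(archive_path):
        urllib.request.urlretrieve(url, archive_path)

    # unpack the archive, which holds a top-level 'tiny-imagenet-200' folder
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(root)

    return dataset_dir
```

Calling `ensure_tiny_imagenet()` once before creating the `ImageFolder` dataset would then stand in for the two shell commands.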

Show the details of the training dataset, such as the number of images and the root directory:

In [None]:
# show train dataset details
train_dataset

Inspect the content and the label of an example image (here the image with id 100):

In [None]:
# set the example image id
image_id = 100

# determine image content and label
train_image_content, train_image_label = train_dataset[image_id]

Display the image (a `PIL` image, which the notebook renders inline):

In [None]:
train_image_content

Show the corresponding image label:

In [None]:
train_image_label

###  B. Load the Train Image Data as 3D-Tensor

Define image to tensor transformation:

In [None]:
data_transformation = torchvision.transforms.Compose([torchvision.transforms.ToTensor()])

Load the training images of the Tiny ImageNet dataset:

In [None]:
# define the directory of the tiny imagenet training images
train_images_dir = './tiny-imagenet-200/train'

# load the tiny imagenet training images
train_dataset_tensor = torchvision.datasets.ImageFolder(train_images_dir, transform=data_transformation)

Show the details of the tensor training dataset, such as the number of images and the applied transformation:

In [None]:
# show train dataset details
train_dataset_tensor

Inspect the content and the label of an example image (here the image with id 100):

In [None]:
# set the example image id
image_id = 100

# determine image content and label
train_image_content, train_image_label = train_dataset_tensor[image_id]

Reminder of how image data is stored in Python:

<img align="center" style="max-width: 800px; height: auto" src="https://github.com/HSG-AIML/LabAIML/blob/main/challenge/python_image_processing.png?raw=1">

Show the image content as a tensor of pixel values:

In [None]:
train_image_content

Show the image shape (channels, height, width):

In [None]:
train_image_content.shape
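The shape reflects that `ToTensor` stores an image channel-first (channels, height, width) with values scaled to the range [0, 1], whereas plotting libraries such as matplotlib expect the channel last (height, width, channels). The dependency-free sketch below illustrates both layouts on nested lists; the helper names are hypothetical. With a real tensor, `train_image_content.permute(1, 2, 0)` performs the same reordering as `chw_to_hwc`.

```python
def hwc_to_chw(pixels, scale=255.0):
    """Reorder height x width x channel pixels to channel-first and scale them to [0, 1]."""
    height, width, channels = len(pixels), len(pixels[0]), len(pixels[0][0])
    return [[[pixels[h][w][c] / scale for w in range(width)]
             for h in range(height)]
            for c in range(channels)]


def chw_to_hwc(tensor):
    """Reorder channel-first data back to height x width x channel, e.g. for plotting."""
    channels, height, width = len(tensor), len(tensor[0]), len(tensor[0][0])
    return [[[tensor[c][h][w] for c in range(channels)]
             for w in range(width)]
            for h in range(height)]


# a tiny 1 x 2 RGB "image": one red and one white pixel
image_hwc = [[[255, 0, 0], [255, 255, 255]]]

image_chw = hwc_to_chw(image_hwc)
print(image_chw)              # [[[1.0, 1.0]], [[0.0, 1.0]], [[0.0, 1.0]]]
print(chw_to_hwc(image_chw))  # [[[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]]
```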

Show the corresponding image label:

In [None]:
train_image_label

Determine the mapping from image class (folder name) to label:

In [None]:
class_to_label_mapping = train_dataset_tensor.class_to_idx

Show the class (folder name) to label mapping:

In [None]:
class_to_label_mapping
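When inspecting predictions it is often handy to go the other way, from a numeric label back to its class folder. A small sketch of inverting such a mapping; the three entries are illustrative stand-ins, while the real mapping comes from `train_dataset_tensor.class_to_idx`:

```python
# illustrative excerpt in the style of a class-to-label mapping (assumed entries)
example_class_to_label = {'n01443537': 0, 'n01629819': 1, 'n01641577': 2}

# invert the mapping: label -> class folder name
label_to_class = {label: class_name for class_name, label in example_class_to_label.items()}

print(label_to_class[1])  # n01629819
```

The same dictionary comprehension applied to `class_to_label_mapping` would give the lookup for all classes.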

###  C. Load the Validation Image Data as 3D-Tensor

Define image to tensor transformation:

In [None]:
data_transformation = torchvision.transforms.Compose([torchvision.transforms.ToTensor()])

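Load the validation images of the Tiny ImageNet dataset. The `val` directory keeps all images in a single `images` subfolder, so `ImageFolder` treats them as one class and assigns every image the placeholder label 0; the true labels are restored in section D: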

In [None]:
# define the directory of the tiny imagenet validation images
validation_images_dir = './tiny-imagenet-200/val'

# load the tiny imagenet validation images
validation_dataset_tensor = torchvision.datasets.ImageFolder(validation_images_dir, transform=data_transformation)

Show the details of the tensor validation dataset, such as the number of images and the applied transformation:

In [None]:
# show validation dataset details
validation_dataset_tensor
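`ImageFolder` derives its classes from the names of the subfolders directly below the root directory. The sketch below mimics that step with the standard library on a miniature, made-up copy of the validation layout, to show why the labels loaded here cannot yet be the true ones; `list_class_folders` is a hypothetical helper, not a torchvision function:

```python
import os
import tempfile


def list_class_folders(directory):
    """List the subfolder names in sorted order, as class names are derived from folders."""
    return sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())


# build a miniature copy of the validation layout in a temporary folder (illustrative)
with tempfile.TemporaryDirectory() as tmp:
    val_dir = os.path.join(tmp, 'val')
    os.makedirs(os.path.join(val_dir, 'images'))
    open(os.path.join(val_dir, 'val_annotations.txt'), 'w').close()

    class_folders = list_class_folders(val_dir)
    class_to_label = {name: label for label, name in enumerate(class_folders)}

print(class_folders)   # ['images']
print(class_to_label)  # {'images': 0}
```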

Inspect the content and the label of an example image (here the image with id 100):

In [None]:
# set the example image id
image_id = 100

# determine image content and label
validation_image_content, validation_image_label = validation_dataset_tensor[image_id]

Show the image content as a tensor of pixel values:

In [None]:
validation_image_content

Show the image shape (channels, height, width):

In [None]:
validation_image_content.shape

Show the corresponding image label:

In [None]:
validation_image_label

###  D. Annotate the Validation Image Data

Read the validation dataset annotation file and parse the file name and class label of each image:

In [None]:
# init the validation labels and the file name to label lookup
validation_labels = []
validation_label_by_file = {}

# open the validation label file
with open('./tiny-imagenet-200/val/val_annotations.txt', 'r') as fp:
    
    # iterate over each line in validation label file
    for line in fp:
        
        # split each line into tab-separated terms
        terms = line.split('\t')
        
        # obtain the file name and label text
        file_name, label_text = terms[0], terms[1]
        
        # convert class to label
        label = class_to_label_mapping[label_text]
        
        # collect the label and remember the file it belongs to
        validation_labels.append(label)
        validation_label_by_file[file_name] = label

Show parsed validation dataset label annotations:

In [None]:
validation_labels

Reset the labels of the validation dataset. Each image is matched to its annotation by file name, because `ImageFolder` lists the files sorted by name (e.g. `val_10.JPEG` before `val_2.JPEG`), which need not be the order of the annotation file:

In [None]:
# iterate over validation images
for i in range(0, len(validation_dataset_tensor)):
    
    # determine the file path of the current image
    image_path, _ = validation_dataset_tensor.imgs[i]
    
    # look up the "true" validation label by file name
    validation_label = validation_label_by_file[os.path.basename(image_path)]
    
    # reset the image-label pair
    validation_dataset_tensor.imgs[i] = (image_path, validation_label)
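One point worth keeping in mind when pairing annotation lines with dataset samples: a folder listing sorted by file name compares names as strings, so it can differ from the order of the annotation file. The sketch below uses made-up annotation lines and a made-up class mapping to compare pairing by position with pairing by file name:

```python
import os

# made-up annotation lines in the tab-separated layout parsed above:
# file name, class id, then further columns that are ignored here
annotation_text = (
    'val_0.JPEG\tclass_b\t0\t0\t63\t63\n'
    'val_1.JPEG\tclass_a\t0\t0\t63\t63\n'
    'val_2.JPEG\tclass_b\t0\t0\t63\t63\n'
    'val_10.JPEG\tclass_a\t0\t0\t63\t63\n'
)
class_to_label = {'class_a': 0, 'class_b': 1}  # made-up mapping

# parse the lines into a file name -> label lookup, keeping the file order
label_by_file = {}
file_order = []
for line in annotation_text.splitlines():
    file_name, label_text = line.split('\t')[:2]
    label_by_file[file_name] = class_to_label[label_text]
    file_order.append(file_name)

# samples as a listing sorted by name would produce them, with placeholder label 0
samples = [(os.path.join('val', 'images', name), 0) for name in sorted(file_order)]

# labels taken by position follow the annotation order
by_position = [label_by_file[name] for name in file_order]
# labels looked up by file name follow the sample order
by_name = [label_by_file[os.path.basename(path)] for path, _ in samples]

print([os.path.basename(path) for path, _ in samples])
# ['val_0.JPEG', 'val_1.JPEG', 'val_10.JPEG', 'val_2.JPEG']
print(by_position)  # [1, 0, 1, 0]
print(by_name)      # [1, 0, 0, 1]
```

In this toy case the last two samples would receive the wrong label if the pairing were done by position.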

Inspect the content and the label of an example validation image (here the image with id 100):

In [None]:
# set the example image id
image_id = 100

# determine image content and label
validation_image_content, validation_image_label = validation_dataset_tensor[image_id]

Show the corresponding image label:

In [None]:
validation_image_label
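Looking at a single label is a weak check, so it can help to summarise all labels at once. A minimal sketch with a hypothetical helper `summarize_labels` and made-up example labels:

```python
from collections import Counter


def summarize_labels(labels):
    """Return (number of distinct labels, smallest class size, largest class size)."""
    counts = Counter(labels)
    return len(counts), min(counts.values()), max(counts.values())


# made-up labels for illustration: 3 classes with 2 images each
example_labels = [0, 1, 2, 2, 0, 1]
print(summarize_labels(example_labels))  # (3, 2, 2)
```

Applied to the validation set via `summarize_labels([label for _, label in validation_dataset_tensor.imgs])`, the result should show 200 classes with 50 images each once the labels have been reset, compared with a single class before.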