## Libraries


In [54]:
import os
from os import path, listdir
from torchvision import transforms
from tqdm import tqdm
import torch
import pandas as pd
from skimage import transform
from skimage.io import imread
from sklearn.model_selection import train_test_split

## Configuration


In [48]:
CONFIG = {
    "DEVICE": torch.device("cuda" if torch.cuda.is_available() else "cpu"),
    "IMAGE_SIZE": (64, 64),
    "DATA_PATH": path.join(path.abspath(path.pardir), "data"),
    "SEED": 27,
    "MODEL": {"EPOCH": 1000, "LEARNING_RATE": 1e-6},
}

CONFIG["DATA"] = {
    "TRANSFORMATION": transforms.Compose(
        [
            transforms.Resize(CONFIG.get("IMAGE_SIZE")),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ]
    ),
}

## Data

The dataset states the following:

_All images with 1st letter as captial are cat images while images with small first letter are dog images._

As this project evaluates both the species and the breed of each animal, we'll transform the label dataset into one with more semantic meaning, replacing the species with `cat` or `dog` and the breed with its respective name. These names will be replaced by a label encoder when training the model; their only advantage is making the results easier to explore, keeping [magic numbers](<https://en.wikipedia.org/wiki/Magic_number_(programming)>) out of the paper.
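
The label-encoding step mentioned above could look like the following minimal sketch. It assumes scikit-learn's `LabelEncoder` (the notebook does not say which encoder is used), and the `breeds` list is a hypothetical sample rather than the real label column.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample of breed names, in the same style as the label dataset.
breeds = ["abyssinian", "yorkshire_terrier", "abyssinian", "beagle"]

encoder = LabelEncoder()
# fit_transform learns the sorted set of class names and maps each to an integer id.
encoded = encoder.fit_transform(breeds)
# inverse_transform recovers the readable names, e.g. when exploring results.
decoded = encoder.inverse_transform(encoded)

print(list(encoder.classes_))
print(list(encoded))
print(list(decoded))
```

Keeping the readable names in the DataFrame and encoding only at training time means the integer ids never need to be interpreted by hand.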

We'll be using a custom train/validation/test split of 70/15/15. As such, we'll merge the label files and create the splits from them.


In [49]:
image_files = [
    file
    for file in listdir(path.join(CONFIG.get("DATA_PATH"), "images"))
    if file.endswith(".jpg")
]

labels = pd.DataFrame(
    {
        "image_name": image_files,
        "species": ["cat" if file[0].isupper() else "dog" for file in image_files],
        "breed": ["_".join(file.split("_")[:-1]).lower() for file in image_files],
    }
)
labels

| | image_name | species | breed |
|---|---|---|---|
| 0 | Abyssinian_1.jpg | cat | abyssinian |
| 1 | Abyssinian_10.jpg | cat | abyssinian |
| 2 | Abyssinian_100.jpg | cat | abyssinian |
| 3 | Abyssinian_101.jpg | cat | abyssinian |
| 4 | Abyssinian_102.jpg | cat | abyssinian |
| ... | ... | ... | ... |
| 7385 | yorkshire_terrier_95.jpg | dog | yorkshire_terrier |
| 7386 | yorkshire_terrier_96.jpg | dog | yorkshire_terrier |
| 7387 | yorkshire_terrier_97.jpg | dog | yorkshire_terrier |
| 7388 | yorkshire_terrier_98.jpg | dog | yorkshire_terrier |
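
The 70/15/15 split described earlier can be sketched by calling `train_test_split` twice. This is only an illustration under stated assumptions: `toy_labels` and `split_70_15_15` are hypothetical names, the toy rows stand in for the real `labels` frame, and stratifying by breed is an assumption rather than something the notebook specifies.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

SEED = 27  # same value as CONFIG["SEED"]

# Toy stand-in for the `labels` frame (hypothetical rows, two breeds only).
toy_labels = pd.DataFrame(
    {
        "image_name": [
            f"{breed}_{i}.jpg" for breed in ("Abyssinian", "beagle") for i in range(100)
        ],
        "species": ["cat"] * 100 + ["dog"] * 100,
        "breed": ["abyssinian"] * 100 + ["beagle"] * 100,
    }
)


def split_70_15_15(frame, seed=SEED):
    """Return (train, validation, test) frames in roughly 70/15/15 proportions."""
    # First cut: keep 70% for training, hold out 30%, stratified by breed.
    train, held_out = train_test_split(
        frame, test_size=0.30, stratify=frame["breed"], random_state=seed
    )
    # Second cut: halve the held-out part into validation and test.
    validation, test = train_test_split(
        held_out, test_size=0.50, stratify=held_out["breed"], random_state=seed
    )
    return train, validation, test


train, validation, test = split_70_15_15(toy_labels)
print(len(train), len(validation), len(test))
```

Fixing `random_state` to the configured seed keeps the three subsets reproducible across runs.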


Finally, we can load all the images and apply the necessary transformations to standardize them. In this case, as we want the base transformations to leave some flexibility for each problem tackled, we'll be downscaling the images:

In [55]:
images = [
    transform.resize(
        imread(path.join(CONFIG.get("DATA_PATH"), "images", name)),
        CONFIG.get("IMAGE_SIZE"),
    )
    for name in tqdm(labels["image_name"], desc="Loading image data", unit=" images")
]

Loading image data: 100%|██████████| 7390/7390 [01:15<00:00, 98.29 images/s] 
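
To make explicit what the downscaling step produces, here is a small sketch on synthetic arrays instead of the real files. The `fake_images` dictionary is hypothetical; it only illustrates that `skimage.transform.resize` rescales integer input to floats in the range 0 to 1 and keeps the channel axis unchanged, so any grayscale or RGBA files in a dataset would end up with a different shape than RGB ones.

```python
import numpy as np
from skimage import transform

IMAGE_SIZE = (64, 64)  # same value as CONFIG["IMAGE_SIZE"]

rng = np.random.default_rng(27)
# Hypothetical stand-ins for decoded images: RGB, grayscale, and RGBA.
fake_images = {
    "rgb": rng.integers(0, 256, size=(120, 90, 3), dtype=np.uint8),
    "gray": rng.integers(0, 256, size=(120, 90), dtype=np.uint8),
    "rgba": rng.integers(0, 256, size=(120, 90, 4), dtype=np.uint8),
}

# Only the two spatial axes are resized; a trailing channel axis is preserved.
resized = {
    name: transform.resize(img, IMAGE_SIZE) for name, img in fake_images.items()
}

for name, img in resized.items():
    print(name, img.shape, img.dtype)
```

If a uniform shape is needed before stacking the images into a single tensor, a channel check at this point is one place to enforce it.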
