# Convolutional Neural Networks

## Dog breed classification using transfer learning



## Step 1: import dataset

Make sure that you've downloaded the required dog dataset:
* Download the dataset, unzip the folder and place it in this project's home directory, at the location `/dogs`.

In [9]:
import numpy as np
from glob import glob

# load filenames
dog_files = np.array(glob("dogs/*/*/*"))

# print the number of images in the dataset
print('There are %d total dog images.' % len(dog_files))

There are 8351 total dog images.


## Step 2: detect dogs

In this section, we will use a Pre-trained resnet50 Model to detect dogs.

The following code cell downloads the resnet50 model, along with weights that have been trained on [ImageNet](http://www.image-net.org/), a large, popular dataset used for image classification with 1000 categories.

In [58]:
import torch
import torchvision.models as models

# define resnet50 model
resnet50 = models.resnet50(pretrained=True)

# check if CUDA is available and move model to GPU
use_cuda = torch.cuda.is_available()
if use_cuda:
    resnet50 = resnet50.cuda()

### Using the pre-trained model

Next, we will write a function that accepts as input a path to an image, and returns an integer between 0 and 999 (inclusive), representing one of the 1000 possible classes.

In [61]:
from PIL import Image
import torchvision.transforms as transforms

# Set PIL to be tolerant of image files that are truncated.
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

def resnet50_predict(img_path):
    '''
    Use pre-trained resnet50 model to obtain index corresponding to
    predicted class for image at specified path

    Args:
        img_path: path to an image

    Returns:
        Index corresponding to resnet50 model's prediction
    '''

    image = Image.open(img_path).convert('RGB')
    normalize = transforms.Compose([
            transforms.Resize(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.25, 0.25, 0.25])])

    image = normalize(image)[:3,:,:].unsqueeze(0)

    if use_cuda:
        image = image.cuda()

    classify=resnet50(image)

    index = classify.argmax().item() #index of the class with highest proba returned as scalar

    return index # predicted class index

Among output classes, categories corresponding to dogs appear in an uninterrupted sequence and correspond to dictionary keys 151-268 (inclusive).  Therefore, to verify whether a dog is detected, we only need to check if the pre-trained model predicts an index between 151 and 268.

The `dog_detector` function below returns `True` if a dog is detected in an image, and `False` if not.

In [62]:
def dog_detector(img_path):
    res_ = resnet50_predict(img_path)
    if  (res_ >= 151 and res_ <= 268):
        return True
    else:
        return False

### Test your dog detector

In [63]:
# Test the performance of the dog_detector function
# on the first 100 images.

dog_files_short = dog_files[:100]
count_dog_dog = sum(dog_detector(img) for img in dog_files_short)
print('Dogs detected: {} %'.format(count_dog_dog))

Dogs detected: 0 %


## Step 3: create a CNN to classify dog breeds using transfer learning
### Specify data loaders

In the following code cell, we write three separate [data loaders](http://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader) for the training, validation, and test datasets (located at `dogs/train`, `dogs/valid`, and `dogs/test`, respectively).  You may find [this documentation on custom datasets](http://pytorch.org/docs/stable/torchvision/datasets.html) to be a useful resource.  If you are interested in augmenting your training and/or validation data, check out the wide variety of [transforms](http://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transform)!

In [64]:
import os
from torchvision import datasets

# Writing data loaders for training, validation, and test sets
# Specifying appropriate transforms, and batch_sizes
batch_size  = 20
num_workers = 0

img_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomResizedCrop(224),
    transforms.CenterCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

train_data=datasets.ImageFolder('dogs/train', transform = img_transform)
validation_data=datasets.ImageFolder('dogs/valid', transform = img_transform)
test_data=datasets.ImageFolder('dogs/test', transform = img_transform)

train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, num_workers=num_workers, shuffle=True)
valid_loader = torch.utils.data.DataLoader(validation_data, batch_size=batch_size, num_workers=num_workers, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, num_workers=num_workers)

loaders_transfer = {
    'train': train_loader,
    'valid': valid_loader,
    'test': test_loader
}


### Model architecture

Transfer learning is used to create a CNN to classify dog breed.  The initialized model will be saved as the variable `model_transfer`.


In [65]:
import torchvision.models as models
import torch.nn as nn

# model architecture
model_transfer = models.resnet50(pretrained=True)
print(model_transfer)


ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

Since resnet50 isn't functional like VGG16, it does not have a classifier and will use it's own functions

In [66]:
model_transfer = models.resnet50(pretrained=True)
for param in model_transfer.parameters():
    param.requires_grad = False
model_transfer.fc = nn.Linear(model_transfer.fc.in_features, 133)
if use_cuda:
    model_transfer = model_transfer.cuda()

### Specify loss function and optimizer

The following code cell is used to specify a [loss function](http://pytorch.org/docs/master/nn.html#loss-functions) and an [optimizer](http://pytorch.org/docs/master/optim.html).  The chosen loss function is saved as `criterion_transfer`, and the optimizer as `optimizer_transfer`.

In [67]:
import torch.optim as optim
criterion_transfer = torch.nn.CrossEntropyLoss()
optimizer_transfer = optim.SGD(model_transfer.fc.parameters(), lr=0.01)

### Train and validate the model

Now we will train and validate our model, and we will save the model parameters as save_path `'model_transfer.pt'`.

In [35]:
# @title Default title text
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.Inf

    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0

        ###################
        # train the model #
        ###################
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            # find the loss and update the model parameters accordingly
            # record the average training loss

            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.item() - train_loss))

        ######################
        # validate the model #
        ######################
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## update the average validation loss
            output = model(data)
            loss = criterion(output, target)
            valid_loss = valid_loss + ((1 / (batch_idx + 1)) * (loss.item() - valid_loss))


        # print training/validation statistics
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch,
            train_loss,
            valid_loss
            ))

        # save the model if validation loss has decreased
        if valid_loss <= valid_loss_min:
            print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
            valid_loss_min,
            valid_loss))
            torch.save(model.state_dict(), save_path)
            valid_loss_min = valid_loss
    # return trained model
    return model


# train the model
n_epochs = 10
model_transfer = train(n_epochs, loaders_transfer, model_transfer, optimizer_transfer, criterion_transfer, use_cuda, 'model_transfer.pt')

# load the model that got the best validation accuracy (uncomment the line below)
model_transfer.load_state_dict(torch.load('model_transfer.pt'))

Epoch: 1 	Training Loss: 2.950188 	Validation Loss: 2.399353
Validation loss decreased (inf --> 2.399353).  Saving model ...
Epoch: 2 	Training Loss: 2.355038 	Validation Loss: 1.946906
Validation loss decreased (2.399353 --> 1.946906).  Saving model ...
Epoch: 3 	Training Loss: 1.987067 	Validation Loss: 1.686647
Validation loss decreased (1.946906 --> 1.686647).  Saving model ...
Epoch: 4 	Training Loss: 1.792759 	Validation Loss: 1.507887
Validation loss decreased (1.686647 --> 1.507887).  Saving model ...
Epoch: 5 	Training Loss: 1.649313 	Validation Loss: 1.416335
Validation loss decreased (1.507887 --> 1.416335).  Saving model ...
Epoch: 6 	Training Loss: 1.521599 	Validation Loss: 1.396135
Validation loss decreased (1.416335 --> 1.396135).  Saving model ...
Epoch: 7 	Training Loss: 1.451331 	Validation Loss: 1.298116
Validation loss decreased (1.396135 --> 1.298116).  Saving model ...
Epoch: 8 	Training Loss: 1.422777 	Validation Loss: 1.326404
Epoch: 9 	Training Loss: 1.324414 

<All keys matched successfully>

### Test the model

Let's try out our model on the test dataset. We will calculate and print the test loss and accuracy.  

In [68]:
def test(loaders, model, criterion, use_cuda):

    # monitor test loss and accuracy
    test_loss = 0.
    correct = 0.
    total = 0.

    model.eval()
    for batch_idx, (data, target) in enumerate(loaders['test']):
        # move to GPU
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the loss
        loss = criterion(output, target)
        # update average test loss
        test_loss = test_loss + ((1 / (batch_idx + 1)) * (loss.data - test_loss))
        # convert output probabilities to predicted class
        pred = output.data.max(1, keepdim=True)[1]
        # compare predictions to true label
        correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
        total += data.size(0)

    print('Test Loss: {:.6f}\n'.format(test_loss))

    print('\nTest Accuracy: %2d%% (%2d/%2d)' % (
        100. * correct / total, correct, total))

test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)

Test Loss: 4.949788


Test Accuracy:  0% ( 6/836)


### Predict dog breed with the model

Let's write a function that takes an image path as input and returns the dog breed that is predicted by our model.  

In [69]:
# list of class names by index, i.e. a name can be accessed like class_names[0]
class_names = [item[4:].replace("_", " ") for item in loaders_transfer['train'].dataset.classes]

def predict_breed_transfer(img_path):
    # load the image and return the predicted breed
    image = Image.open(img_path).convert('RGB')
    normalize = transforms.Compose([
            transforms.Resize(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.25, 0.25, 0.25])])

    image = normalize(image)[:3,:,:].unsqueeze(0)

    if use_cuda:
        image = image.cuda()

    classify=model_transfer(image)
    index = classify.argmax().item() #index of the class with highest proba returned as scalar


    return class_names[index]

Now let us try with our own images. Create a folder named myImages and put inside some dog images that you obtain
from the internet (or from your smartphone camera if you own one or more dogs...)

In [70]:
import os
import cv2
import matplotlib.pyplot as plt
from PIL import Image

imgsFolder='myImages/'
for imgFile in os.listdir(imgsFolder):
    img_path=os.path.join(imgsFolder,imgFile)
    img = cv2.imread(img_path)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.show()
    print("This dog breed is likely a ", predict_breed_transfer(img_path))



This dog breed is likely a  Dachshund
This dog breed is likely a  Lowchen
This dog breed is likely a  Keeshond
This dog breed is likely a  Clumber spaniel
This dog breed is likely a  Lowchen
This dog breed is likely a  Clumber spaniel


In [52]:
import torch
from glob import glob
from PIL import Image
import os
import cv2
import matplotlib.pyplot as plt
import torchvision.transforms as transforms

# Load YOLOv5 model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Set device to GPU if available
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)

# Define function to detect dogs using YOLOv5
def detect_dogs_yolov5(img_path):
    """
    Detects dogs using YOLOv5, predicts breeds using a transfer learning model,
    and visualizes the results.

    Args:
        img_path (str): Path to the image file.

    Returns:
        list: List of predicted dog breeds for each detected dog (or None if no dogs detected).
        ndarray: Annotated image with bounding boxes drawn around detected dogs.
    """

    # Perform inference
    results = model(img_path)

    # Get labels and coordinates of detected objects
    labels = results.names
    boxes = results.xyxy[0].cpu().numpy()

    # Filter out only "dog" class detections
    dog_boxes = [box for box in boxes if labels[int(box[-1])] == 'dog']

    # If no dogs detected, return None
    if len(dog_boxes) == 0:
        return None, None

    # Load the image
    if isinstance(img_path, str):
        img = cv2.imread(img_path)
    else:
        img = img_path

    # Initialize list to store detected dog breeds
    dog_breeds = []

    # Process each detected dog
    for box in dog_boxes:
        # Extract coordinates of the detected dog
        x1, y1, x2, y2, conf, _ = box.astype(int)

        # Crop the image to contain only the detected dog
        dog_img = img[y1:y2, x1:x2]

        # Resize the image to 224x224 (required input size for resnet50)
        transform = transforms.Compose([
            transforms.ToPILImage(),
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.25, 0.25, 0.25])
        ])
        dog_img = transform(dog_img).unsqueeze(0).to(device)

        # Predict the breed of the dog using the transfer learning model
        breed_index = model_transfer(dog_img).argmax().item()
        breed = class_names[breed_index]

        # Append the predicted breed to the list
        dog_breeds.append(breed)

        # Draw a bounding box around the detected dog (optional)
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

    # Return the list of dog breeds and annotated image
    return dog_breeds, img

# Test the dog detection and breed prediction function
imgsFolder = 'myImages/'
for imgFile in os.listdir(imgsFolder):
    img_path = os.path.join(imgsFolder, imgFile)

    # Detect dogs and predict breeds
    dog_breeds, annotated_img = detect_dogs_yolov5(img_path)

    # Display the image with bounding boxes (if applicable)
    if annotated_img is not None:
        plt.imshow(cv2.cvtColor(annotated_img, cv2.COLOR_BGR2RGB))
        plt.title('Detected Dogs with Bounding Boxes')
        plt.show()

    # If dogs detected, print their breeds
    if dog_breeds:
        for i, breed in enumerate(dog_breeds):
            print(f"Dog {i+1} breed: {breed}")
    else:
        print("No dogs detected in the image.")


Using cache found in C:\Users\m_a_g/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5  2024-4-1 Python-3.11.0 torch-2.2.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 


Dog 1 breed: Border collie
Dog 1 breed: Japanese chin
Dog 2 breed: American foxhound
Dog 3 breed: Australian cattle dog
Dog 4 breed: Australian cattle dog
Dog 5 breed: Australian cattle dog
Dog 6 breed: Pekingese
Dog 1 breed: Norwegian lundehund
No dogs detected in the image.
Dog 1 breed: Labrador retriever
Dog 1 breed: Pointer
