# Dog Model Detector
## By: Jon Barker
## Date: 15-January-2024

This is a model I am working through for my February Machine Learning YouTube series. I will be using the Stanford Dogs dataset to train a Convolutional Neural Network (CNN) to recognize dog faces and dog breeds. Once the model is trained, the goal is to run it on a live webcam stream and use its recognition results to trigger some type of hardware.

Libraries:

This model will use sklearn and keras for the machine learning processing.

In [14]:
import numpy as np
import random
import cv2                
import matplotlib.pyplot as plt   
import sys
import os
import dlib

from skimage import io
from tqdm import tqdm
from glob import glob
from sklearn.datasets import load_files


from keras.callbacks import ModelCheckpoint  
from keras.preprocessing import image                  
from keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
import np_utils

from IPython.display import Image, display

import xml.etree.ElementTree as ET
from sklearn.preprocessing import LabelEncoder, OneHotEncoder



Tensor breakdown:

The functions below extract features from images using different pre-trained deep learning models from the Keras library. Each function is designed to work with a specific model: VGG16, VGG19, ResNet50, Xception, or InceptionV3.

In [6]:
def extract_VGG16(tensor):
	from keras.applications.vgg16 import VGG16, preprocess_input
	return VGG16(weights='imagenet', include_top=False).predict(preprocess_input(tensor))

def extract_VGG19(tensor):
	from keras.applications.vgg19 import VGG19, preprocess_input
	return VGG19(weights='imagenet', include_top=False).predict(preprocess_input(tensor))

def extract_Resnet50(tensor):
	from keras.applications.resnet50 import ResNet50, preprocess_input
	return ResNet50(weights='imagenet', include_top=False).predict(preprocess_input(tensor))

def extract_Xception(tensor):
	from keras.applications.xception import Xception, preprocess_input
	return Xception(weights='imagenet', include_top=False).predict(preprocess_input(tensor))

def extract_InceptionV3(tensor):
	from keras.applications.inception_v3 import InceptionV3, preprocess_input
	return InceptionV3(weights='imagenet', include_top=False).predict(preprocess_input(tensor))
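
Each helper above builds its network on every call, so the ImageNet weights are loaded again each time. One possible way around that, sketched here, is to cache the constructed model per name. `MODEL_SPECS`, `get_model`, and `extract_features` are illustrative names that do not appear in the original notebook; the module paths are the ones imported in the functions above.

```python
import importlib
from functools import lru_cache

# Short name -> (module path, class name), copied from the imports used above.
MODEL_SPECS = {
    "VGG16": ("keras.applications.vgg16", "VGG16"),
    "VGG19": ("keras.applications.vgg19", "VGG19"),
    "ResNet50": ("keras.applications.resnet50", "ResNet50"),
    "Xception": ("keras.applications.xception", "Xception"),
    "InceptionV3": ("keras.applications.inception_v3", "InceptionV3"),
}


@lru_cache(maxsize=None)
def get_model(name):
    """Build the headless network once per name and reuse it afterwards."""
    module_path, class_name = MODEL_SPECS[name]
    module = importlib.import_module(module_path)  # imported lazily, on first use
    model = getattr(module, class_name)(weights="imagenet", include_top=False)
    return model, module.preprocess_input


def extract_features(name, tensor):
    """Apply the matching preprocess_input, then the cached model, to an image tensor."""
    model, preprocess_input = get_model(name)
    return model.predict(preprocess_input(tensor))
```

With this sketch, `extract_features("ResNet50", tensor)` would play the role of `extract_Resnet50(tensor)`, and a second call with the same name reuses the already-built model.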

Dataset Loading:

This function, `load_dataset`, processes the image and annotation directories to create the required datasets. It uses a label encoder to convert breed names into numerical labels and then onehot-encodes these labels. The function returns the image file paths, the onehot-encoded targets, and the list of dog breed names. Only the training call keeps that list (as `dog_names`); the validation and test calls discard it, since the breed names are the same across all sets.

When you run this code, ensure that the paths ('dogImages/train', 'dogImages/valid', 'dogImages/test', and 'annotations/Annotation') match the structure of your dataset. Also, make sure all necessary Python packages (like `glob`, `numpy`, `xml.etree.ElementTree`, `sklearn.preprocessing`, etc.) are installed and imported.
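
Before running the loader, it can help to confirm that a single annotation parses and exposes the breed tag. The sketch below reads `./object/name` the same way `load_dataset` does, but returns `None` instead of raising when the tag is absent. `read_breed` and `SAMPLE_ANNOTATION` are hypothetical, and the sample XML is a trimmed stand-in rather than a real file from the dataset.

```python
import io
import xml.etree.ElementTree as ET


def read_breed(annotation_file):
    """Return the text of ./object/name, or None if the tag is missing."""
    root = ET.parse(annotation_file).getroot()
    name_node = root.find("./object/name")
    return name_node.text if name_node is not None else None


# Trimmed, made-up annotation used only to exercise the function.
SAMPLE_ANNOTATION = """<annotation>
  <object>
    <name>Chihuahua</name>
  </object>
</annotation>"""

print(read_breed(io.StringIO(SAMPLE_ANNOTATION)))  # Chihuahua
```

`ET.parse` accepts either a path or a file object, so the same function can be pointed at a real annotation path once the directory layout is confirmed.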


Currently, I am looking at only one specific breed folder (Chihuahua). I will need to add functionality to parse through the rest of the folders.
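
One way the remaining folders could be covered is to walk every breed sub-folder and pair each image with its annotation. This is only a sketch: `collect_pairs` is a hypothetical helper, and it assumes one sub-folder per breed under both roots, with each annotation file named like its image but without an extension.

```python
import os
from glob import glob


def collect_pairs(images_root, annotations_root):
    """Pair each image with the annotation file that shares its base name."""
    pairs = []
    for breed_dir in sorted(glob(os.path.join(images_root, "*"))):
        if not os.path.isdir(breed_dir):
            continue  # ignore stray files next to the breed folders
        breed_folder = os.path.basename(breed_dir)
        for image_file in sorted(glob(os.path.join(breed_dir, "*"))):
            stem = os.path.splitext(os.path.basename(image_file))[0]
            annotation_file = os.path.join(annotations_root, breed_folder, stem)
            if os.path.isfile(annotation_file):  # skip images without an annotation
                pairs.append((image_file, annotation_file))
    return pairs
```

The returned list of `(image_file, annotation_file)` tuples could then feed the same breed-extraction loop that `load_dataset` runs for a single folder.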

In [78]:
def load_dataset(image_path, annotation_path):
    # Gather image file paths
    # (use "*/*" instead of "*" to descend into per-breed sub-folders)
    image_files = np.array(glob(os.path.join(image_path, "*")))
    
    # Check if any files are found
    if len(image_files) == 0:
        print("No files found in the directory:", image_path)
        return np.array([]), np.array([]), []

    
    # Extract breed names and encode labels
    breeds = []
    for file in image_files:
        #print(f"file = {file}")
        base = os.path.basename(file)
        #print(f"base = {base}")
        #print(f"Working Directory = {os.getcwd()}")
        xml_file = os.path.join(annotation_path, os.path.splitext(base)[0])
        tree = ET.parse(xml_file)
        root = tree.getroot()
        breed = root.find("./object/name").text
        breeds.append(breed)

    # Convert breed names into numerical labels
    label_encoder = LabelEncoder()
    integer_encoded = label_encoder.fit_transform(breeds)

    # Onehot-encode the labels
    # (scikit-learn >= 1.2 names this argument sparse_output; older releases use sparse)
    onehot_encoder = OneHotEncoder(sparse_output=False)
    integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
    onehot_encoded = onehot_encoder.fit_transform(integer_encoded)

    return image_files, onehot_encoded, label_encoder.classes_

# Load the training dataset (validation and test loading is disabled for now)
train_files, train_targets, dog_names = load_dataset('dogImages/images/Images/n02085620-Chihuahua/', './dogImages/annotations/Annotation/n02085620-Chihuahua')
#valid_files, valid_targets, _ = load_dataset('dogImages/valid', 'annotations/Annotation')
#test_files, test_targets, _ = load_dataset('dogImages/test', 'annotations/Annotation')

# Print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
print('There are %s total dog images.\n' % len(np.hstack([train_files])))
#print('There are %d training dog images.' % len(train_files))
#print('There are %d validation dog images.' % len(valid_files))
#print('There are %d test dog images.' % len(test_files))


There are 1 total dog categories.
There are 152 total dog images.
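
Because only one breed folder was loaded, the encoder sees a single class, so each one-hot row has just one column. The plain-Python sketch below mirrors the two encoding steps to make that visible. `one_hot` and the sample breed lists are made up for illustration; this is not the scikit-learn implementation.

```python
def one_hot(labels):
    """Mimic LabelEncoder followed by OneHotEncoder for a list of strings."""
    classes = sorted(set(labels))  # LabelEncoder also stores its classes sorted
    index = {name: i for i, name in enumerate(classes)}
    rows = [
        [1.0 if index[label] == col else 0.0 for col in range(len(classes))]
        for label in labels
    ]
    return rows, classes


rows, classes = one_hot(["Chihuahua", "Beagle", "Chihuahua", "Pug"])
print(classes)  # ['Beagle', 'Chihuahua', 'Pug']
print(rows[0])  # [0.0, 1.0, 0.0]

# A single breed collapses to one column of ones.
single_rows, single_classes = one_hot(["Chihuahua"] * 3)
print(single_rows)  # [[1.0], [1.0], [1.0]]
```

Targets with more than one column only appear once several breed folders are loaded together.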






In [72]:
cd ../

/Users/jon/Documents/Python/dog_detector/dog_detector


In [None]:
# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    # Use the number of class folders actually found instead of a hard-coded 133
    dog_targets = np_utils.to_categorical(np.array(data['target']), len(data['target_names']))
    return dog_files, dog_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/images')
#valid_files, valid_targets = load_dataset('dogImages/valid')
#test_files, test_targets = load_dataset('dogImages/test')

# load list of dog names
dog_names = [item[20:-1] for item in sorted(glob("dogImages/train/*/"))]

# print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
# valid_files and test_files are not loaded yet, so only the training files are counted
print('There are %s total dog images.\n' % len(np.hstack([train_files])))
print('There are %d training dog images.' % len(train_files))
#print('There are %d validation dog images.' % len(valid_files))
#print('There are %d test dog images.'% len(test_files))
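
The slice `item[20:-1]` in the cell above depends on the exact length of the path prefix, so it breaks when the folder layout changes. A less brittle option, sketched here under the assumption that breed folders are named like `n02085620-Chihuahua` (as in the path used earlier), is to split the folder name on its first hyphen. `breed_from_dir` is a hypothetical helper.

```python
import os


def breed_from_dir(breed_dir):
    """Turn '.../n02085620-Chihuahua/' into 'Chihuahua'."""
    folder = os.path.basename(os.path.normpath(breed_dir))  # drop any trailing slash
    prefix, sep, breed = folder.partition("-")  # split on the first hyphen only
    return breed if sep else folder  # no hyphen: fall back to the folder name


print(breed_from_dir("dogImages/images/Images/n02085620-Chihuahua/"))  # Chihuahua
```

A list comprehension over `sorted(glob(...))` could then call this helper in place of the fixed-offset slice.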