# Introduction

Dog Breed Identification is the task of predicting a dog's breed from a photo alone. Transfer learning is a natural choice for this problem, because there are few images per breed and pre-trained models that have already seen dogs are widely available.

Process:

1. Use pre-trained models as feature extractors to obtain the most informative features of each image.
2. Concatenate the outputs of several feature extractors to obtain a larger feature vector.
3. Create a classification model whose input is the concatenated output of the feature extractors.

This project is based on: https://www.kaggle.com/phylake1337/0-18-loss-simple-feature-extractors

Data source: https://www.kaggle.com/c/dog-breed-identification/data

Libraries:

* Python: 3.7
* TensorFlow: 2.9
* Pandas: 1.3.5
* NumPy: 1.21.6
* scikit-learn: 1.0.2

# Prepare environment

## Load data

Download the data from the data source.

In [None]:
# Get the most recent version of the kaggle library
!pip install kaggle --upgrade --force

In [None]:
# Get kaggle config file
from google.colab import files
uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

# Then move kaggle.json into the folder where the API expects to find it.
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json

Saving kaggle.json to kaggle.json
User uploaded file "kaggle.json" with length 68 bytes


In [None]:
!kaggle competitions download -c dog-breed-identification

Downloading dog-breed-identification.zip to /content
 99% 684M/691M [00:05<00:00, 141MB/s]
100% 691M/691M [00:05<00:00, 127MB/s]


In [None]:
!unzip /content/dog-breed-identification.zip

## Libraries

In [None]:
import os
import time
from dataclasses import dataclass

# Libraries for exploring data
import pandas as pd
import numpy as np

# Libraries for preparing data
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

# Libraries for training model
import tensorflow as tf  # TF 2.9

In [None]:
# Constants
PATH_DATA = '/content/'
SEED = 1
TOTAL_CLASS = 0

# Exploratory data analysis

## Training data

### Labels

The labels dataset has one row per observation, containing the id used to locate the image and the breed of the dog.

Variables:

* **id**: identifier of each observation.
* **breed**: target variable.

In [None]:
dsLabel = pd.read_csv(os.path.join(PATH_DATA, 'labels.csv'))
print('Shape:', dsLabel.shape)
dsLabel.head()

Shape: (10222, 2)


Unnamed: 0,id,breed
0,000bec180eb18c7604dcecc8fe0dba07,boston_bull
1,001513dfcb2ffafc82cccf4d8bbaba97,dingo
2,001cdf01b096e06d78e9e5112d419397,pekinese
3,00214f311d5d2247d5dfe4fe24b2303d,bluetick
4,0021f9ceb3235effd7fcde7f7538ed62,golden_retriever


In [None]:
dsLabel.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10222 entries, 0 to 10221
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   id      10222 non-null  object
 1   breed   10222 non-null  object
dtypes: object(2)
memory usage: 159.8+ KB


In [None]:
TOTAL_CLASS = len(dsLabel['breed'].unique())
print('Number of breeds:', TOTAL_CLASS)

Number of breeds: 120


Count the number of observations per breed.

In [None]:
dsLabelGroup = dsLabel.groupby('breed').count().reset_index()
dsLabelGroup = dsLabelGroup.rename(columns={'id': 'count'})
print('Shape:', dsLabelGroup.shape)
dsLabelGroup.head()

Shape: (120, 2)


Unnamed: 0,breed,count
0,affenpinscher,80
1,afghan_hound,116
2,african_hunting_dog,86
3,airedale,107
4,american_staffordshire_terrier,74


In [None]:
# Get the breed(s) with the most observations
dsLabelGroup[dsLabelGroup['count'] == dsLabelGroup['count'].max()]

Unnamed: 0,breed,count
97,scottish_deerhound,126


In [None]:
# Get the breed(s) with the fewest observations
dsLabelGroup[dsLabelGroup['count'] == dsLabelGroup['count'].min()]

Unnamed: 0,breed,count
23,briard,66
43,eskimo_dog,66


Conclusions:

* There are no null values.
* Few observations per class (breed): between 66 and 126 images each.
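
To put "few observations" in numbers, the short sketch below uses only the figures printed by the cells above (10222 rows, 120 breeds, a maximum of 126 and a minimum of 66 images per breed). The variable names are illustrative and not part of the notebook.

```python
# Figures copied from the notebook outputs above.
total_images = 10222
total_breeds = 120
max_per_breed = 126   # scottish_deerhound
min_per_breed = 66    # briard, eskimo_dog

# Average number of images available per breed.
mean_per_breed = total_images / total_breeds

# Ratio between the largest and the smallest class.
imbalance_ratio = max_per_breed / min_per_breed

print(f'Mean images per breed: {mean_per_breed:.1f}')
print(f'Max/min ratio: {imbalance_ratio:.2f}')
```

With roughly 85 images per breed on average and a max/min ratio below 2, the classes are small but not severely imbalanced.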

#### Encode target variable

In [None]:
le = preprocessing.LabelEncoder()
y = le.fit_transform(dsLabel['breed'].values)
dsLabel['breed_enconded'] = y

In [None]:
dsLabel.head()

Unnamed: 0,id,breed,breed_enconded
0,000bec180eb18c7604dcecc8fe0dba07,boston_bull,19
1,001513dfcb2ffafc82cccf4d8bbaba97,dingo,37
2,001cdf01b096e06d78e9e5112d419397,pekinese,85
3,00214f311d5d2247d5dfe4fe24b2303d,bluetick,15
4,0021f9ceb3235effd7fcde7f7538ed62,golden_retriever,49
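
`LabelEncoder` assigns integers to the distinct labels in sorted order. The pure-Python sketch below mimics that behaviour on the five breeds shown above. The helper `fit_label_encoder` is hypothetical, and the resulting codes differ from the notebook's (19, 37, ...) because only five of the 120 breeds are used here.

```python
def fit_label_encoder(labels):
    """Map each distinct label to an integer, following sorted order."""
    classes = sorted(set(labels))
    to_index = {name: i for i, name in enumerate(classes)}
    return classes, to_index

sample = ['boston_bull', 'dingo', 'pekinese', 'bluetick', 'golden_retriever']
classes, to_index = fit_label_encoder(sample)

# Encode the labels, then decode them again to check the round trip.
encoded = [to_index[name] for name in sample]
decoded = [classes[i] for i in encoded]

print(encoded)  # [1, 2, 4, 0, 3]
print(decoded == sample)
```

In the notebook itself, `le.inverse_transform` plays the role of the decoding step.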


# Modeling

## Configurations

In [None]:
@dataclass
class TrainingConfiguration:
    '''
        Describes configuration of the training process
    '''
    epochs                : int = 10            # number of passes over the training data
    learningRate          : float = 0.001       # determines the speed of network's weights update
    numClasses            : int = TOTAL_CLASS   # total of classes
    xConcatFeaturesLength : int = 0             # number of input features of classification model

@dataclass
class DataConfiguration:
    '''
        Describes configuration of data loader and transformation
    '''
    batchSize         : int = 16
    pathData          : str = PATH_DATA # base path where there is the input data.
    seed              : int = SEED
    numberChannels    : int = 3         # number of channels of an image
    imgSize           : int = 224       # size of an image
    applyAugmentation : bool = True     # if apply augmentation transformation

## Create dataset

In [None]:
def getDataAugmentation():
    '''
      Objective: build the data augmentation pipeline
    '''
    
    dataAugmentation = tf.keras.Sequential([
        tf.keras.layers.RandomFlip('horizontal_and_vertical'),
        tf.keras.layers.RandomRotation(0.2)
    ])
    
    return dataAugmentation
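
In Keras, a float factor passed to `RandomRotation` is a fraction of a full turn (2π), and the rotation is sampled from the symmetric range around zero. The arithmetic below converts the factor used in `getDataAugmentation` into an angle. The variable names are illustrative.

```python
import math

factor = 0.2  # value passed to RandomRotation in getDataAugmentation

# The layer samples an angle in [-factor * 2*pi, +factor * 2*pi].
max_angle_rad = factor * 2 * math.pi
max_angle_deg = factor * 360

print(f'Rotation range: +/-{max_angle_deg:.0f} degrees')
```

Together with `'horizontal_and_vertical'` flips, this means training images can appear rotated by up to 72 degrees or upside down.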

In [None]:
def getImage(filePath, dataConfiguration):
    '''
      Objective: read images
      Parameters:
        filePath: path of image
        dataConfiguration: DataConfiguration instance
    '''
    
    img = tf.io.read_file(filePath)
    img = tf.image.decode_jpeg(img, channels = dataConfiguration.numberChannels)
    img = tf.image.resize(img, [dataConfiguration.imgSize, dataConfiguration.imgSize])
    return img

In [None]:
def getDataLoader(x, y, dataConfiguration):
    '''
      Objective: build datasets of images and labels
      Parameters:
        x: image ids (predictor variable)
        y: encoded breeds (target variable)
        dataConfiguration: DataConfiguration instance
    '''

    # Image
    dsTX = tf.data.Dataset.from_tensor_slices(x)
    dsTX = dsTX.map(lambda x: getImage(tf.strings.join([dataConfiguration.pathData, 'train/', x, '.jpg']), dataConfiguration), 
                    num_parallel_calls = tf.data.experimental.AUTOTUNE)
    
    # Data Augmentation
    if dataConfiguration.applyAugmentation:
        dataAugmentation = getDataAugmentation()
        dsTX = dsTX.map(lambda x: dataAugmentation(x), num_parallel_calls = tf.data.AUTOTUNE)
        print('Augmentation applied')

    # Target variable
    dsTY = tf.data.Dataset.from_tensor_slices(y)

    # Tuple (predictor, target)
    return (dsTX, dsTY)

In [None]:
def getData(ds, dataConfiguration):
    '''
      Objective: create dataset
      Parameters:
        ds: dsLabels
        dataConfiguration: DataConfiguration instance
    '''

    # Split data
    dsTrain, dsValidation = train_test_split(ds, test_size = 0.2, random_state = dataConfiguration.seed)

    # Load loaders
    (dsTX_Train, dsTY_Train) = getDataLoader(dsTrain['id'], dsTrain['breed_enconded'], dataConfiguration)

    # Augmentation is not needed for validation data
    dataConfiguration.applyAugmentation = False
    (dsTX_Val, dsTY_Val) = getDataLoader(dsValidation['id'], dsValidation['breed_enconded'], dataConfiguration)

    # Tuple (predictor_train, target_train), (predictor_val, target_val)
    return (dsTX_Train, dsTY_Train), (dsTX_Val, dsTY_Val)
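
The split sizes follow from the 10222 rows in `labels.csv` and `test_size = 0.2`. The sketch below assumes that the validation size is rounded up and the remainder goes to training, which is my understanding of how scikit-learn handles a fractional `test_size`. The result agrees with the feature shapes printed later in the notebook.

```python
import math

n_rows = 10222     # rows in labels.csv
test_size = 0.2    # value passed to train_test_split

# Assumption: validation size is rounded up, the rest is used for training.
n_val = math.ceil(test_size * n_rows)
n_train = n_rows - n_val

print(n_train, n_val)  # 8177 2045
```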

## Extract features
In this step, I extract features using transfer learning from three different pre-trained models. These features become the input of the classification model.


In [None]:
dataConfiguration = DataConfiguration()
(dsTX_Train, dsTY_Train), (dsTX_Val, dsTY_Val) = getData(dsLabel, dataConfiguration)

Augmentation applied


In [None]:
# Batch the data, because the feature extractors expect batched input
dsTX_Train = dsTX_Train.batch(dataConfiguration.batchSize)
dsTX_Val = dsTX_Val.batch(dataConfiguration.batchSize)

In [None]:
def extractFeatures(modelName, dataConfiguration):
    '''
      Objective: extract features from images
      Parameters:
        modelName: name of extractor
        dataConfiguration: DataConfiguration instance
    '''
    modelExtractor = None
    extractor = None
    if modelName == 'InceptionV3':
        modelExtractor = tf.keras.applications.inception_v3.InceptionV3(weights = 'imagenet', 
                                                                        include_top = False, 
                                                                        input_shape = (dataConfiguration.imgSize, dataConfiguration.imgSize, dataConfiguration.numberChannels))
        
        extractor = tf.keras.Sequential([
            tf.keras.Input(shape = (dataConfiguration.imgSize, dataConfiguration.imgSize, dataConfiguration.numberChannels)),
            tf.keras.layers.Lambda(tf.keras.applications.inception_v3.preprocess_input),
            modelExtractor,
            tf.keras.layers.GlobalAveragePooling2D()
        ])
    elif modelName == 'Xception':
        modelExtractor = tf.keras.applications.Xception(weights = 'imagenet', 
                                                        include_top = False, 
                                                        input_shape=(dataConfiguration.imgSize, dataConfiguration.imgSize, dataConfiguration.numberChannels))
        
        
        extractor = tf.keras.Sequential([
            tf.keras.Input(shape = (dataConfiguration.imgSize, dataConfiguration.imgSize, dataConfiguration.numberChannels)),
            tf.keras.layers.Lambda(tf.keras.applications.xception.preprocess_input),
            modelExtractor,
            tf.keras.layers.GlobalAveragePooling2D()
        ])
    else:
        modelExtractor = tf.keras.applications.InceptionResNetV2(weights = 'imagenet', 
                                                                 include_top = False, 
                                                                 input_shape=(dataConfiguration.imgSize, dataConfiguration.imgSize, dataConfiguration.numberChannels))
        
        extractor = tf.keras.Sequential([
          tf.keras.Input(shape = (dataConfiguration.imgSize, dataConfiguration.imgSize, dataConfiguration.numberChannels)),
          tf.keras.layers.Lambda(tf.keras.applications.inception_resnet_v2.preprocess_input),
          modelExtractor,
          tf.keras.layers.GlobalAveragePooling2D()
        ])

    return extractor
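
Each extractor ends with `GlobalAveragePooling2D`, which collapses a height x width x channels feature map into one value per channel by averaging over the spatial positions. The pure-Python sketch below shows the operation on a toy 2x2 map with 3 channels. The function `global_average_pool` and the toy data are illustrative only.

```python
def global_average_pool(feature_map):
    """Average an (H, W, C) nested list over H and W, returning C values."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])

    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value

    return [total / (height * width) for total in totals]

# Toy 2x2 map with 3 channels.
toy_map = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 0.0, 2.0], [7.0, 4.0, 2.0]],
]
pooled = global_average_pool(toy_map)
print(pooled)  # [4.0, 1.0, 2.0]
```

The same averaging turns each image's feature map into a single flat vector, which is what the classifier later consumes.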

### InceptionV3

In [None]:
dataConfiguration = DataConfiguration()
extractor1 = extractFeatures('InceptionV3', dataConfiguration)

In [None]:
extractor1.summary()

Model: "sequential_18"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lambda_4 (Lambda)           (None, 224, 224, 3)       0         
                                                                 
 inception_v3 (Functional)   (None, 5, 5, 2048)        21802784  
                                                                 
 global_average_pooling2d_4   (None, 2048)             0         
 (GlobalAveragePooling2D)                                        
                                                                 
Total params: 21,802,784
Trainable params: 21,768,352
Non-trainable params: 34,432
_________________________________________________________________


In [None]:
# Extract features of training data
t = time.time()

featuresTrain1 = extractor1.predict(dsTX_Train)
featuresTrain1.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 13.385615110397339


In [None]:
# Extract features of validation data
t = time.time()

featuresVal1 = extractor1.predict(dsTX_Val)
featuresVal1.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 3.1861430644989013


### Xception

In [None]:
dataConfiguration = DataConfiguration()
extractor2 = extractFeatures('Xception', dataConfiguration)

In [None]:
extractor2.summary()

Model: "sequential_19"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lambda_5 (Lambda)           (None, 224, 224, 3)       0         
                                                                 
 xception (Functional)       (None, 7, 7, 2048)        20861480  
                                                                 
 global_average_pooling2d_5   (None, 2048)             0         
 (GlobalAveragePooling2D)                                        
                                                                 
Total params: 20,861,480
Trainable params: 20,806,952
Non-trainable params: 54,528
_________________________________________________________________


In [None]:
# Extract features of training data
t = time.time()

featuresTrain2 = extractor2.predict(dsTX_Train)
featuresTrain2.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 22.23886743783951


In [None]:
# Extract features of validation data
t = time.time()

featuresVal2 = extractor2.predict(dsTX_Val)
featuresVal2.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 6.365489224592845


### InceptionResNetV2

In [None]:
dataConfiguration = DataConfiguration()
extractor3 = extractFeatures('InceptionResNetV2', dataConfiguration)

In [None]:
extractor3.summary()

Model: "sequential_20"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lambda_6 (Lambda)           (None, 224, 224, 3)       0         
                                                                 
 inception_resnet_v2 (Functi  (None, 5, 5, 1536)       54336736  
 onal)                                                           
                                                                 
 global_average_pooling2d_6   (None, 1536)             0         
 (GlobalAveragePooling2D)                                        
                                                                 
Total params: 54,336,736
Trainable params: 54,276,192
Non-trainable params: 60,544
_________________________________________________________________


In [None]:
# Extract features of training data
t = time.time()

featuresTrain3 = extractor3.predict(dsTX_Train)
featuresTrain3.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 28.121075932184855


In [None]:
# Extract features of validation data
t = time.time()

featuresVal3 = extractor3.predict(dsTX_Val)
featuresVal3.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 7.365653963883718


## Integration
To create the input for the classification model, I concatenate the features extracted above.
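
Concatenating along the last axis places the three feature vectors of an image end to end. The sketch below uses zero-filled stand-ins with the pooled widths reported in the extractor summaries above to show where the total width comes from. The dictionary and list names are illustrative.

```python
# Pooled feature widths, read from the extractor summaries above.
feature_widths = {
    'InceptionV3': 2048,
    'Xception': 2048,
    'InceptionResNetV2': 1536,
}

# Zero-filled stand-ins for one image's feature vectors.
per_model_features = [[0.0] * width for width in feature_widths.values()]

# Concatenation appends the vectors end to end.
concatenated = [value for vector in per_model_features for value in vector]

print(len(concatenated))  # 5632
```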

In [None]:
def getConcatInputData(XConcat, y, dataConfiguration):
    '''
      Objective: create batch of input data
      Parameters:
        XConcat: features
        y: target variable
        dataConfiguration: DataConfiguration instance
    '''
    dsX = tf.data.Dataset.from_tensor_slices(XConcat)
    dsT = tf.data.Dataset.zip((dsX, y))

    # Batch first, then prefetch, so that whole batches are prepared ahead of time
    dsT = dsT.batch(dataConfiguration.batchSize)
    dsT = dsT.prefetch(tf.data.AUTOTUNE)

    return dsT

In [None]:
XtrainConcat = tf.concat([featuresTrain1, featuresTrain2, featuresTrain3], -1)
print('XtrainConcat:', XtrainConcat.shape)

XtrainConcat: (8177, 5632)


In [None]:
XvalConcat = tf.concat([featuresVal1, featuresVal2, featuresVal3], -1)
print('XvalConcat:', XvalConcat.shape)

XvalConcat: (2045, 5632)


In [None]:
dataConfiguration = DataConfiguration()
trainLoader = getConcatInputData(XtrainConcat, dsTY_Train, dataConfiguration)
valLoader = getConcatInputData(XvalConcat, dsTY_Val, dataConfiguration)

## Create classification model

In [None]:
def getClassificationModel(trainingConfiguration):
    '''
      Objective: create classification model
      Parameters:
        trainingConfiguration: TrainingConfiguration instance
    '''
    
    model = tf.keras.Sequential([
        tf.keras.Input(shape = (trainingConfiguration.xConcatFeaturesLength, )),
        tf.keras.layers.Dropout(0.7),
        tf.keras.layers.Dense(1024, activation = 'relu'),
        tf.keras.layers.Dense(trainingConfiguration.numClasses, activation = 'softmax')
    ])
    return model
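
The first layer, `Dropout(0.7)`, zeroes each input feature with probability 0.7 during training and scales the surviving ones by 1 / (1 - 0.7), so the expected sum stays the same. At prediction time the layer passes its input through unchanged. A minimal pure-Python sketch of the training-time behaviour, with a hypothetical `dropout` helper:

```python
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, scale the rest."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(1)   # fixed seed so the run is repeatable
features = [1.0] * 10
dropped = dropout(features, rate=0.7, rng=rng)

# Every entry is either 0.0 or the original value scaled by 1 / 0.3.
print(dropped)
```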

In [None]:
trainingConfiguration = TrainingConfiguration()
trainingConfiguration.xConcatFeaturesLength = XtrainConcat.shape[1]

model = getClassificationModel(trainingConfiguration)
model.summary()

Model: "sequential_21"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dropout_8 (Dropout)         (None, 5632)              0         
                                                                 
 dense_16 (Dense)            (None, 1024)              5768192   
                                                                 
 dense_17 (Dense)            (None, 120)               123000    
                                                                 
Total params: 5,891,192
Trainable params: 5,891,192
Non-trainable params: 0
_________________________________________________________________
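
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is illustrative.

```python
def dense_params(n_inputs, n_units):
    """Number of weights plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(5632, 1024)   # concatenated features -> hidden layer
output = dense_params(1024, 120)    # hidden layer -> 120 breeds

print(hidden, output, hidden + output)  # 5768192 123000 5891192
```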


In [None]:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = trainingConfiguration.learningRate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits = False),
              metrics=['accuracy'])
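
The loss fits this setup because the targets are integer class ids from the label encoder, and the final layer already outputs softmax probabilities (hence `from_logits = False`). For a single sample, the loss is the negative log of the probability assigned to the true class. The sketch below is a simplified version that omits the small clipping Keras applies for numerical stability.

```python
import math

def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one sample: negative log of the true class probability."""
    return -math.log(probabilities[label])

probs = [0.7, 0.2, 0.1]  # toy softmax output for three classes

loss_confident = sparse_categorical_crossentropy(0, probs)  # true class got 0.7
loss_wrong = sparse_categorical_crossentropy(2, probs)      # true class got 0.1

print(loss_confident, loss_wrong)
```

The loss grows quickly as the probability of the true class shrinks, which is why a few confident mistakes on validation data can dominate the validation loss.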


In [None]:
t = time.time()

history = model.fit(trainLoader, 
                    validation_data = valLoader, 
                    epochs = trainingConfiguration.epochs)

print('Duration minutes:', (time.time() - t)/60)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Duration minutes: 2.6306493719418844


Conclusions:

* Validation loss is too high, which suggests overfitting. Possible remedies are reducing model complexity (fewer features) or changing the feature extractors.
* Training loss is acceptable. Changing the feature extractors could improve the results.

## Testing models

### Create dataset

In [None]:
def getDataTest(dataConfiguration):
    '''
      Objective: create dataset
      Parameters:
        dataConfiguration: DataConfiguration instance
    '''

    # Load path files
    listFileTest = tf.data.Dataset.list_files(os.path.join(dataConfiguration.pathData, 'test', '*'), shuffle = False, seed = dataConfiguration.seed)

    print('Number of observations:', tf.data.experimental.cardinality(listFileTest).numpy())

    dsTest = listFileTest.map(lambda x: getImage(x, dataConfiguration), num_parallel_calls = tf.data.experimental.AUTOTUNE)

    dsTest = dsTest.batch(dataConfiguration.batchSize)

    return dsTest

In [None]:
dataConfiguration = DataConfiguration()
dsTest = getDataTest(dataConfiguration)

Number of observations: 10357


### Extract features

In [None]:
t = time.time()

dataConfiguration = DataConfiguration()
extractor1 = extractFeatures('InceptionV3', dataConfiguration)

featuresTest1 = extractor1.predict(dsTest)
featuresTest1.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 16.297970457871756


In [None]:
t = time.time()

dataConfiguration = DataConfiguration()
extractor2 = extractFeatures('Xception', dataConfiguration)

featuresTest2 = extractor2.predict(dsTest)
featuresTest2.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 28.40865908463796


In [None]:
t = time.time()

dataConfiguration = DataConfiguration()
extractor3 = extractFeatures('InceptionResNetV2', dataConfiguration)

featuresTest3 = extractor3.predict(dsTest)
featuresTest3.shape

print('Duration minutes:', (time.time() - t)/60)

Duration minutes: 36.556178696950276


Concatenate the extracted features

In [None]:
dataConfiguration = DataConfiguration()

XtestConcat = tf.data.Dataset.from_tensor_slices(tf.concat([featuresTest1, featuresTest2, featuresTest3], -1))
XtestConcat = XtestConcat.batch(dataConfiguration.batchSize)

In [None]:
predicted = model.predict(XtestConcat)
print('Predicted shape:', predicted.shape)

Predicted shape: (10357, 120)
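
Each row of `predicted` holds 120 probabilities, one per encoded breed, in the order of the label encoder's classes. A minimal sketch of turning one row into a breed name follows. The helper `top_class`, the three-class list, and the probabilities are hypothetical stand-ins for `le.classes_` and a real prediction row.

```python
def top_class(probabilities, class_names):
    """Return the name and probability of the highest-scoring class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]

# Hypothetical three-class example; the notebook has 120 classes.
class_names = ['affenpinscher', 'afghan_hound', 'airedale']
predicted_row = [0.1, 0.7, 0.2]

best_name, best_prob = top_class(predicted_row, class_names)
print(best_name, best_prob)  # afghan_hound 0.7
```

In the notebook, the same result can likely be obtained with `np.argmax(predicted, axis=1)` followed by `le.inverse_transform`.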
