# Predicting Pawpularity

#### Objective
PetFinder.my is Malaysia’s leading animal welfare platform, featuring over 180,000 animals with 54,000 happily adopted. PetFinder collaborates closely with animal lovers, media, corporations, and global organizations to improve animal welfare.

Currently, PetFinder.my uses a basic Cuteness Meter to rank pet photos. It analyzes picture composition and other factors compared to the performance of thousands of pet profiles. While this basic tool is helpful, it's still in an experimental stage and the algorithm could be improved.

In this competition, you’ll analyze raw images and metadata to predict the “Pawpularity” of pet photos. You'll train and test your model on PetFinder.my's thousands of pet profiles.

#### Description of data
- CSV file with 9912 rows and 14 columns of metadata (no nulls)
- Folder with 9912 jpeg files linked to metadata via id in file name

#### Notes:
- Selection method of data is unclear
- Unclear whether photos are profile photos
- Pawpularity score is based on web traffic to the pet profile, not on the metadata
- Metadata does not indicate whether the featured pet is a dog or a cat
- No data concerning pet location (which may have a significant impact on website traffic)

#### Procedure:
Test four deep learning models:
* Multilayer Perceptron (MLP) model using only the metadata
* two different Convolutional Neural Networks (CNNs) with a varying number of dense layers, using only the images as input
* one mixed model using both the metadata and the images as input


## Load and review Metadata

In [1]:
import pandas as pd

In [2]:
data = pd.read_csv('/Users/arnet/Desktop/Ironhack/GitHub/Deep_Learning_Final_Project/MetaData.csv')

In [3]:
data.head()

Unnamed: 0,Id,Subject Focus,Eyes,Face,Near,Action,Accessory,Group,Collage,Human,Occlusion,Info,Blur,Pawpularity
0,0007de18844b0dbbb5e1f607da0606e0,0,1,1,1,0,0,1,0,0,0,0,0,63
1,0009c66b9439883ba2750fb825e1d7db,0,1,1,0,0,0,0,0,0,0,0,0,42
2,0013fd999caf9a3efe1352ca1b0d937e,0,1,1,1,0,0,0,0,1,1,0,0,28
3,0018df346ac9c1d8413cfcc888ca8246,0,1,1,1,0,0,0,0,0,0,0,0,15
4,001dc955e10590d3ca4673f034feeef2,0,0,0,1,0,0,1,0,0,0,0,0,72


In [4]:
data = data.set_index('Id')
data = data.rename_axis(None)

In [5]:
data.head()

Unnamed: 0,Subject Focus,Eyes,Face,Near,Action,Accessory,Group,Collage,Human,Occlusion,Info,Blur,Pawpularity
0007de18844b0dbbb5e1f607da0606e0,0,1,1,1,0,0,1,0,0,0,0,0,63
0009c66b9439883ba2750fb825e1d7db,0,1,1,0,0,0,0,0,0,0,0,0,42
0013fd999caf9a3efe1352ca1b0d937e,0,1,1,1,0,0,0,0,1,1,0,0,28
0018df346ac9c1d8413cfcc888ca8246,0,1,1,1,0,0,0,0,0,0,0,0,15
001dc955e10590d3ca4673f034feeef2,0,0,0,1,0,0,1,0,0,0,0,0,72


In [6]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 9912 entries, 0007de18844b0dbbb5e1f607da0606e0 to fff8e47c766799c9e12f3cb3d66ad228
Data columns (total 13 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Subject Focus  9912 non-null   int64
 1   Eyes           9912 non-null   int64
 2   Face           9912 non-null   int64
 3   Near           9912 non-null   int64
 4   Action         9912 non-null   int64
 5   Accessory      9912 non-null   int64
 6   Group          9912 non-null   int64
 7   Collage        9912 non-null   int64
 8   Human          9912 non-null   int64
 9   Occlusion      9912 non-null   int64
 10  Info           9912 non-null   int64
 11  Blur           9912 non-null   int64
 12  Pawpularity    9912 non-null   int64
dtypes: int64(13)
memory usage: 1.1+ MB


In [7]:
for column in data:
    print(column.upper())
    print(data[column].value_counts())

SUBJECT FOCUS
0    9638
1     274
Name: Subject Focus, dtype: int64
EYES
1    7658
0    2254
Name: Eyes, dtype: int64
FACE
1    8960
0     952
Name: Face, dtype: int64
NEAR
1    8540
0    1372
Name: Near, dtype: int64
ACTION
0    9813
1      99
Name: Action, dtype: int64
ACCESSORY
0    9240
1     672
Name: Accessory, dtype: int64
GROUP
0    8630
1    1282
Name: Group, dtype: int64
COLLAGE
0    9420
1     492
Name: Collage, dtype: int64
HUMAN
0    8264
1    1648
Name: Human, dtype: int64
OCCLUSION
0    8207
1    1705
Name: Occlusion, dtype: int64
INFO
0    9305
1     607
Name: Info, dtype: int64
BLUR
0    9214
1     698
Name: Blur, dtype: int64
PAWPULARITY
28    318
30    318
26    316
31    312
29    304
     ... 
98     10
97      8
90      7
1       4
99      4
Name: Pawpularity, Length: 100, dtype: int64
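
The counts above show that most flags are heavily imbalanced (Action, for example, is set on only 99 of the 9912 photos). A natural follow-up is to compare the mean score of photos with and without a given flag. The sketch below does this in plain Python on a few made-up rows; the helper name `mean_score_by_flag` and the sample rows are illustrative assumptions, not part of the notebook.

```python
from collections import defaultdict
from statistics import mean

def mean_score_by_flag(rows, flag, target="Pawpularity"):
    """Return {flag_value: mean target} for one binary metadata column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[flag]].append(row[target])
    return {value: mean(scores) for value, scores in groups.items()}

# Made-up rows in the same layout as the metadata (not real data)
sample_rows = [
    {"Eyes": 1, "Blur": 0, "Pawpularity": 63},
    {"Eyes": 1, "Blur": 0, "Pawpularity": 42},
    {"Eyes": 0, "Blur": 1, "Pawpularity": 28},
    {"Eyes": 0, "Blur": 0, "Pawpularity": 72},
]

eyes_means = mean_score_by_flag(sample_rows, "Eyes")
print(eyes_means)
```

On the loaded DataFrame, `data.groupby('Eyes')['Pawpularity'].mean()` should give the same per-group means, as long as it is run before the target column is dropped.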


## Process metadata

### Scale pawpularity score

In [8]:
# get target score
y = data["Pawpularity"]

In [9]:
# Divide by the maximum score so that the target lies between 0 and 1
maxScore = y.max()
y_scaled = y / maxScore

In [10]:
y_scaled

0007de18844b0dbbb5e1f607da0606e0    0.63
0009c66b9439883ba2750fb825e1d7db    0.42
0013fd999caf9a3efe1352ca1b0d937e    0.28
0018df346ac9c1d8413cfcc888ca8246    0.15
001dc955e10590d3ca4673f034feeef2    0.72
                                    ... 
ffbfa0383c34dc513c95560d6e1fdb57    0.15
ffcc8532d76436fc79e50eb2e5238e45    0.70
ffdf2e8673a1da6fb80342fa3b119a20    0.20
fff19e2ce11718548fa1c5d039a5192a    0.20
fff8e47c766799c9e12f3cb3d66ad228    0.30
Name: Pawpularity, Length: 9912, dtype: float64
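
Dividing by the maximum score is easy to undo, which matters if a model is trained on the scaled target and its predictions have to be reported as scores again. A minimal sketch, assuming plain Python lists; the function names `scale_scores` and `unscale_scores` are hypothetical and do not appear in the notebook.

```python
def scale_scores(scores):
    """Divide by the maximum so every positive score lies in (0, 1]."""
    max_score = max(scores)
    return [s / max_score for s in scores], max_score

def unscale_scores(scaled, max_score):
    """Map scaled values (or predictions) back to the original score range."""
    return [s * max_score for s in scaled]

# First five scores from the table above, plus the maximum of 100
scores = [63, 42, 28, 15, 72, 100]
scaled, max_score = scale_scores(scores)
restored = unscale_scores(scaled, max_score)
print(scaled[:3], max_score)
```

The round trip is exact only up to floating-point error, so comparisons against the original scores should use a small tolerance.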

In [11]:
# Drop 'y'
data = data.drop('Pawpularity', axis=1)
data.head()

Unnamed: 0,Subject Focus,Eyes,Face,Near,Action,Accessory,Group,Collage,Human,Occlusion,Info,Blur
0007de18844b0dbbb5e1f607da0606e0,0,1,1,1,0,0,1,0,0,0,0,0
0009c66b9439883ba2750fb825e1d7db,0,1,1,0,0,0,0,0,0,0,0,0
0013fd999caf9a3efe1352ca1b0d937e,0,1,1,1,0,0,0,0,1,1,0,0
0018df346ac9c1d8413cfcc888ca8246,0,1,1,1,0,0,0,0,0,0,0,0
001dc955e10590d3ca4673f034feeef2,0,0,0,1,0,0,1,0,0,0,0,0


## Load image data

In [12]:
import cv2
import os
import numpy as np

In [13]:
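def load_pet_images(data, inputPath):
    # initialize images list (i.e., the pet images themselves)
    images = []

    # get image files based on id in dataframe index
    for i in data.index.values:
        basePath = os.path.join(inputPath, f'{i}.jpg').replace('\\', '/')
        # cv2.imread returns None (no exception) if the file is missing or
        # unreadable; it loads pixels in BGR channel order
        image = cv2.imread(basePath)
        if image is None:
            raise FileNotFoundError(f'Could not read image: {basePath}')

        # Resize every image to 64x64 pixels
        image = cv2.resize(image, (64, 64))
        images.append(image)

    # return the images as one array of shape (n_images, 64, 64, 3)
    return np.array(images)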


In [14]:
# provide input path for loading images
inputPath = '/Users/arnet/Desktop/Ironhack/GitHub/Deep_Learning_Final_Project/Images'

# make input path the current directory
os.chdir(inputPath)

images = load_pet_images(data, inputPath)

In [15]:
type(images)

numpy.ndarray

In [16]:
len(images)

9912

In [17]:
# scale pixel values from [0, 255] to [0, 1]
images = images / 255.0

In [18]:
images[0]

array([[[0.76078431, 0.70588235, 0.72156863],
        [0.69803922, 0.64313725, 0.65882353],
        [0.78431373, 0.73333333, 0.7372549 ],
        ...,
        [0.69411765, 0.69019608, 0.69803922],
        [0.72941176, 0.73333333, 0.7254902 ],
        [0.74901961, 0.74117647, 0.7372549 ]],

       [[0.75294118, 0.70588235, 0.69803922],
        [0.8       , 0.75294118, 0.74509804],
        [0.77647059, 0.72941176, 0.72156863],
        ...,
        [0.56078431, 0.65882353, 0.6       ],
        [0.56470588, 0.62352941, 0.58039216],
        [0.59607843, 0.61960784, 0.59607843]],

       [[0.7254902 , 0.67843137, 0.67058824],
        [0.74509804, 0.69803922, 0.69019608],
        [0.7372549 , 0.69019608, 0.68235294],
        ...,
        [0.57647059, 0.54509804, 0.54509804],
        [0.59215686, 0.57647059, 0.58431373],
        [0.52941176, 0.54117647, 0.54509804]],

       ...,

       [[0.63137255, 0.63529412, 0.61568627],
        [0.55686275, 0.61176471, 0.62352941],
        [0.41960784, ...

## Split into train and test data

In [42]:
from sklearn.model_selection import train_test_split

In [43]:
# Split data for METADATA, IMAGES AND PREDICTION VALUE 'y'
split = train_test_split(data, images, y, test_size=0.20, random_state=42)
(X_train_full, X_test, XImages_train_full, XImages_test, y_train_full, y_test) = split

# experiments were initially run with 'y_scaled'; the unscaled 'y' keeps the error in score points, which is easier to interpret

In [44]:
# Split training set AGAIN to obtain VALIDATION SET
split_val = train_test_split(X_train_full, XImages_train_full, y_train_full, test_size=0.2, random_state=42)
(X_train, X_val, XImages_train, XImages_val, y_train, y_val) = split_val

In [68]:
for x in split_val:
    print(len(x))

6343
1586
6343
1586
6343
1586
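
The sizes printed above follow from applying a 20% split twice. The sketch below reproduces the arithmetic, assuming that `train_test_split` rounds the test portion up when `test_size` is a fraction, which is consistent with the 6343 and 1586 shown; `split_sizes` is a hypothetical helper, not a scikit-learn function.

```python
import math

def split_sizes(n_samples, test_size):
    """Sizes for a fractional test_size: the held-out part is rounded up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

# First split: full data into train_full and test
n_train_full, n_test = split_sizes(9912, 0.20)
# Second split: train_full into train and validation
n_train, n_val = split_sizes(n_train_full, 0.2)
print(n_train, n_val, n_test)  # 6343 1586 1983
```

The effective proportions are therefore roughly 64% training, 16% validation and 20% test.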


In [45]:
X_train.head()

Unnamed: 0,Subject Focus,Eyes,Face,Near,Action,Accessory,Group,Collage,Human,Occlusion,Info,Blur
e887fce17c574c0d0b24a611d4c17c9e,0,0,1,1,0,0,1,0,0,0,0,0
4b249023b0234766daa3258ac27577ac,0,1,1,1,0,0,0,0,0,0,0,0
7e761f47cc1e3038a431f9f196234ab9,0,1,1,1,0,0,0,0,0,1,0,0
5c736b9e97fce993dc1c8b9f451618cb,0,1,1,1,0,0,0,0,1,1,0,0
80764b6de6f6554b8ac0363e88b4d064,0,1,1,0,0,0,0,0,0,1,1,0


# Models

In [46]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
import tensorflow as tf

In [47]:
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import concatenate

## Simple MLP model for Metadata

In [48]:
# function for creating 3-layer MLP model
def create_mlp(dim, regress=False):
    # define MLP network
    model = Sequential()
    model.add(Dense(8, input_dim=dim, activation="relu"))
    model.add(Dense(4, activation="relu"))
    # check to see if the regression node should be added
    if regress:
        model.add(Dense(1, activation="linear"))
    # return model
    return model

In [49]:
# Create and compile model
model = create_mlp(X_train.shape[1], regress=True)

model.compile(loss="mse", 
              optimizer='adam',
              metrics = ['mse', 
                         'mean_absolute_error', 
                         tf.keras.metrics.RootMeanSquaredError()] 
             ) 

In [50]:
# Train model
model.fit(x=X_train, y=y_train, 
    validation_data=(X_val, y_val),
    epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x26d20137d00>

In [66]:
# Check accuracy on TEST DATA
preds = model.predict(X_test)

In [67]:
# Get difference
diff = preds.flatten() - y_test

abs_error = np.abs(diff)
mean_abs_error = np.mean(abs_error)
std_abs_error = np.std(abs_error)

print("Mean Absolute Error: ", mean_abs_error)

Mean Absolute Error:  16.109952986270446


#### Notes:
* Training without an explicit batch_size (Keras then uses its default of 32) produced slightly better scores
* The number of epochs was reduced from an initial 100
* Training on y_scaled gave approximately the same error in percentage terms
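
A mean absolute error of about 16 points is hard to judge without a reference. One common check is a constant baseline: always predict the median of the training scores, which is the constant that minimizes MAE on the data it is computed from. The sketch below is an assumption about how such a check could look; the function names and the toy scores are made up, and no baseline value for the real data is claimed here.

```python
from statistics import mean, median

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def constant_baseline_mae(y_train, y_test):
    """MAE on y_test when always predicting the median of y_train."""
    guess = median(y_train)
    return mean_absolute_error(y_test, [guess] * len(y_test))

# Made-up scores, only to show the call (not the competition data)
toy_train = [28, 30, 26, 31, 29, 72, 63]
toy_test = [15, 42, 30]
baseline = constant_baseline_mae(toy_train, toy_test)
print(baseline)
```

With the notebook's variables the call would be `constant_baseline_mae(list(y_train), list(y_test))`; a model is only informative to the extent that its test MAE falls below that figure.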

## Single-layer CNN model for Images

In [82]:
# Create model
# One convolutional layer followed by max-pooling, flattening and dense layers
model2 = Sequential([
    Conv2D(32, (3, 3), padding='same', input_shape=(64, 64, 3)),
    Activation('relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(1),           
    Activation('linear')    
])

model2.compile(loss="mse",
              optimizer='adam',
              metrics=['mse', 
                      tf.keras.metrics.RootMeanSquaredError(),
                      'mean_absolute_error'])

In [83]:
# train model
model2.fit(x=XImages_train, y=y_train,
           validation_data=(XImages_val, y_val),
           epochs=5, 
           batch_size=8)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x26d20203670>

In [84]:
# Check accuracy on TEST DATA
preds2 = model2.predict(XImages_test)

In [91]:
diff2 = preds2.flatten() - y_test
abs_error = np.abs(diff2)
mean_abs_error = np.mean(abs_error)
std_abs_error = np.std(abs_error)

print("Mean Absolute Error: ", mean_abs_error)

Mean Absolute Error:  15.940919692385275


#### Notes:
* Tests were run with different numbers of neurons in a Dense layer; 25 gave the best results
* Signs of overfitting remained nonetheless
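
Overfitting can be read off the training history. `model.fit` returns a `History` object whose `history` attribute maps each metric name to a list of per-epoch values; the cells above discard it. The sketch below summarizes such a dictionary. The function name `summarize_history` and the numbers are illustrative assumptions, not results from this notebook.

```python
def summarize_history(history):
    """Find the epoch with the lowest validation loss and the final train/val gap."""
    val_loss = history["val_loss"]
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    final_gap = val_loss[-1] - history["loss"][-1]
    return {"best_epoch": best_index + 1, "final_gap": final_gap}

# Illustrative values only; a real run would pass model2.fit(...).history
fake_history = {
    "loss": [520.0, 430.0, 390.0, 350.0, 310.0],
    "val_loss": [470.0, 440.0, 445.0, 460.0, 480.0],
}
summary = summarize_history(fake_history)
print(summary)
```

A training loss that keeps falling while the validation loss rises after an early epoch, as in the made-up numbers, is the typical overfitting pattern.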

## Triple-layer CNN model for Images

In [92]:
model3 = Sequential([
    Conv2D(32, (3, 3), padding='same', input_shape=(64, 64, 3)),
    Activation('relu'),
    BatchNormalization(), 
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(32, activation= 'relu'),
    BatchNormalization(),
    Dense(8, activation= 'relu'),
    BatchNormalization(),
    Dense(1),           
    Activation('linear') 
])

model3.compile(loss="mse",
              optimizer='adam', 
              metrics=['mse', 
                      tf.keras.metrics.RootMeanSquaredError(),
                      'mean_absolute_error'])

In [93]:
# train model
model3.fit(x=XImages_train, y=y_train,
           validation_data=(XImages_val, y_val),
           epochs=5, 
           batch_size=8)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x26d1fc1d700>

In [94]:
# Check accuracy on TEST DATA
preds3 = model3.predict(XImages_test)

In [95]:
diff3 = preds3.flatten() - y_test
abs_error = np.abs(diff3)
mean_abs_error = np.mean(abs_error)
std_abs_error = np.std(abs_error)

print("Mean Absolute Error: ", mean_abs_error)

Mean Absolute Error:  16.102777896117395


#### Notes:
* Adding batch normalization layers significantly speeds up training, but this model is still much slower than the single-layer model
* Once again, tests were done with different numbers of neurons in the first 2 dense layers (e.g. 32 and 16, 20 and 10). The configuration of 32 and 8 provides reasonable results in terms of speed and accuracy.
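
Part of the speed difference can be explained by counting trainable weights. The sketch below does the arithmetic for the layers defined above, using the standard counts for convolutional and dense layers (weights plus one bias per filter or unit). BatchNormalization parameters are left out for brevity, and the helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit."""
    return (n_inputs + 1) * n_units

conv = conv2d_params(3, 3, 3, 32)        # Conv2D(32, (3, 3)) on a 3-channel input
flat = (64 // 2) * (64 // 2) * 32        # 'same' padding, then 2x2 max-pooling
single_head = dense_params(flat, 1)      # model2: Flatten -> Dense(1)
triple_head = (dense_params(flat, 32)    # model3: Dense(32) -> Dense(8) -> Dense(1)
               + dense_params(32, 8)
               + dense_params(8, 1))
print(conv, flat, single_head, triple_head)  # 896 32768 32769 1048881
```

Almost all of the roughly one million weights in the triple-layer model sit in the first dense layer, which sees 32768 flattened inputs.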

## Combined Model - Images and Metadata

In [96]:
# CNN for regression prediction FOR IMAGE DATA
def create_cnn(width, height, depth, filters=(16, 32, 64)):
    # initialize the input shape and channel dimension, assuming
    # TensorFlow/channels-last ordering
    inputShape = (height, width, depth)
    chanDim = -1  # axis to normalise in BatchNormalization (last axis = channels)

    # define the model input
    inputs = Input(shape=inputShape)
    
    # loop over the three filters
    for (i, f) in enumerate(filters):
        # if this is the first CONV layer then set the input
        # appropriately
        if i == 0:
            x = inputs
        # CONV => RELU => BN => POOL
        x = Conv2D(f, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=chanDim)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
    
    # flatten, then FC => RELU => BN => DROPOUT
    x = Flatten()(x)
    x = Dense(16)(x)
    x = Activation("relu")(x)
    x = BatchNormalization(axis=chanDim)(x)
    x = Dropout(0.5)(x)
    
    # apply another FC layer, this one to match the number of nodes
    # coming out of the MLP
    x = Dense(4)(x)
    x = Activation("relu")(x)
    
    # construct the CNN
    model = Model(inputs, x)
    # return the CNN
    return model

In [97]:
# CREATE the MLP and CNN models
mlp = create_mlp(X_train.shape[1], regress=False) # X_train.shape[1] = number of METADATA columns i.e. inputs
cnn = create_cnn(64, 64, 3)

# create the input to our final set of layers as the *output* of both
# the MLP and CNN
combinedInput = concatenate([mlp.output, cnn.output])

# our final FC layer head will have two dense layers, the final one
# being our regression head
x = Dense(4, activation="relu")(combinedInput)
x = Dense(1, activation="linear")(x)

# final model accepts categorical/numerical data on the MLP
# input and images on the CNN input, outputting the Pawpularity score
model4 = Model(inputs=[mlp.input, cnn.input], outputs=x)
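
The shapes in the combined model can be checked by hand. The sketch below traces the image branch through its three CONV => POOL stages and adds up the widths of the two branches at the concatenation. It is plain arithmetic under the settings used above (64x64 input, 'same' padding, 2x2 pooling); `cnn_flatten_size` is a hypothetical helper.

```python
def cnn_flatten_size(height, width, filters=(16, 32, 64)):
    """Feature count after one 'same' conv + 2x2 max-pool per filter entry."""
    for _ in filters:
        # 'same' padding keeps the size; each 2x2 pooling halves it
        height, width = height // 2, width // 2
    return height * width * filters[-1]

flat = cnn_flatten_size(64, 64)   # 8 * 8 * 64 features enter Dense(16)
mlp_out, cnn_out = 4, 4           # each branch ends in Dense(4)
combined = mlp_out + cnn_out      # width of the concatenated vector
print(flat, combined)  # 4096 8
```

Both branches contribute four features each, so the final dense layers see an 8-dimensional input in which metadata and image features carry equal width.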

In [98]:
# compile model
model4.compile(loss="mse", 
              metrics = ['mse', 
                      tf.keras.metrics.RootMeanSquaredError(),
                      'mean_absolute_error'],
              optimizer='adam')


# train the model
model4.fit(x=[X_train, XImages_train], y=y_train, 
          validation_data=([X_val, XImages_val], y_val), 
          epochs=5, 
          batch_size=8)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x26d20ba90a0>

In [99]:
# Check accuracy on TEST DATA
preds4 = model4.predict([X_test, XImages_test])

In [100]:
diff4 = preds4.flatten() - y_test
abs_error = np.abs(diff4)
mean_abs_error = np.mean(abs_error)
std_abs_error = np.std(abs_error)

print("Mean Absolute Error: ", mean_abs_error)

Mean Absolute Error:  15.796010349714209


#### Notes:
* By far the slowest model to train
* No improvement in overfitting on the validation set
* Best score on the test set (MAE 15.80, versus 15.94 to 16.11 for the other models)
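
To compare the four runs side by side, the test MAE values printed in this notebook can be collected and ranked. The sketch below only rearranges those reported numbers; the dictionary labels are chosen here for readability.

```python
# Test MAE values as printed in the evaluation cells above
reported_mae = {
    "MLP (metadata)": 16.109952986270446,
    "Single-layer CNN (images)": 15.940919692385275,
    "Triple-layer CNN (images)": 16.102777896117395,
    "Combined (metadata + images)": 15.796010349714209,
}

# Rank from best (lowest MAE) to worst
ranking = sorted(reported_mae.items(), key=lambda item: item[1])
spread = ranking[-1][1] - ranking[0][1]

for name, mae in ranking:
    print(f"{name}: {mae:.2f}")
print(f"spread: {spread:.2f}")
```

The gap between the best and worst model is about 0.31 points on a 1 to 100 scale, so the ranking should be read with caution given a single train/test split.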