Hi! The purpose of this project is to detect when Pépito (https://twitter.com/PepitoTheCat) is leaving or when Pépito is back at home.

I've downloaded all the images posted from his Twitter account (up to this date: 2018/06/06). There are 10,041 images, plus another 227 images that I'm not gonna use since they have a different resolution (I want to keep this simple, at least for now, while I'm getting started). I'm not uploading the images to the GitHub repo, but if you want them, ask me and I'll find a way to publish them!

## Data exploration

In [1]:
# Especially check how unbalanced the classes are
import pandas as pd

In [2]:
labelled_images = pd.read_csv('./data/labeled_images.csv')

In [3]:
label_count = labelled_images.groupby('label').count()[['img_name']]
label_count

       img_name
label
home       5762
out        4279


In [4]:
# Total is 5762 + 4279 = 10041 labelled images
print('prct home labels: %.3f' % ( (5762 / (5762+4279)) * 100) )

prct home labels: 57.385


In [5]:
print('prct out labels: %.3f' % ( (4279 / (5762+4279)) * 100) )

prct out labels: 42.615
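
The same shares can also be derived straight from the label column, which avoids retyping the counts by hand. This is only a sketch: the `labels` Series below is a toy stand-in built from the counts in the table above, not the real CSV.

```python
import pandas as pd

# Toy stand-in for the `label` column of labeled_images.csv (assumption:
# built from the counts shown in the table, not read from the file).
labels = pd.Series(['home'] * 5762 + ['out'] * 4279, name='label')

# normalize=True returns fractions that sum to 1; scale to percentages.
label_pct = labels.value_counts(normalize=True) * 100

for label, pct in label_pct.items():
    print('prct %s labels: %.3f' % (label, pct))
```

With the real data, `labelled_images['label']` would take the place of `labels`.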


The classes are not too unbalanced. Still, I think the gap points to possible bot problems: if every 'out' picture were followed by a 'home' picture, the two counts would differ by at most one, so a gap of almost 1,500 means the same label shows up several times in a row. That shouldn't be possible, because once Pépito has left he can't leave again without first coming back home. Anyway, I'm sure there are other reasons I don't know about.
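
One way to check the repeated-label suspicion is to count the rows whose label equals the label of the row just before it. This is a minimal sketch, assuming the rows are sorted chronologically (the CSV may not be); `count_repeated_labels` and the toy list are made up for illustration.

```python
import pandas as pd

def count_repeated_labels(labels):
    """Count rows whose label is identical to the previous row's label."""
    labels = pd.Series(labels).reset_index(drop=True)
    # shift() moves every label down one row, so each row is compared
    # with its predecessor; the first row compares against NaN (False).
    return int((labels == labels.shift()).sum())

# Toy timeline: 'home' is repeated once and 'out' is repeated once.
toy_timeline = ['out', 'home', 'home', 'out', 'home', 'out', 'out']
repeats = count_repeated_labels(toy_timeline)
print('repeated labels:', repeats)
```

A count of zero would mean the events alternate perfectly; anything above zero marks tweets worth inspecting by hand.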

## Data splitting

In [3]:
from sklearn.utils import shuffle
# Seed the shuffle too; otherwise the splits below change on every run
labelled_images = shuffle(labelled_images, random_state=16121993)

In [4]:
from sklearn.model_selection import train_test_split

data_x = labelled_images[['img_name']]
data_y = labelled_images[['label']]

# 80% train; the remaining 20% is then halved into dev and test (10% each).
# `get_values()` was removed in pandas 1.0; the label DataFrame can be passed to `stratify` directly.
train_x, test_x, train_y, test_y = train_test_split(data_x, data_y, train_size=0.8, random_state=16121993, stratify=data_y)
dev_x, test_x, dev_y, test_y = train_test_split(test_x, test_y, train_size=0.5, random_state=16121993, stratify=test_y)
# Hold out 10% of train as train-dev (same distribution as train)
train_x, train_dev_x, train_y, train_dev_y = train_test_split(train_x, train_y, train_size=0.9, random_state=16121993, stratify=train_y)
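
To see what these three chained splits produce, here is a sketch on a toy table of 1,000 rows with roughly the same home/out ratio. The toy frame, the `SEED` value and the file names are assumptions for illustration only; they are not the real data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

SEED = 0  # arbitrary seed for this toy example
toy = pd.DataFrame({
    'img_name': ['img_%04d.jpg' % i for i in range(1000)],
    'label': ['home'] * 570 + ['out'] * 430,
})

# Same chain as above: 80/20, then 20 -> 10/10, then train -> 90/10.
train, rest = train_test_split(toy, train_size=0.8, random_state=SEED, stratify=toy['label'])
dev, test = train_test_split(rest, train_size=0.5, random_state=SEED, stratify=rest['label'])
train, train_dev = train_test_split(train, train_size=0.9, random_state=SEED, stratify=train['label'])

for name, part in [('train', train), ('train_dev', train_dev), ('dev', dev), ('test', test)]:
    home_share = (part['label'] == 'home').mean()
    print('%-9s n=%4d  home share=%.3f' % (name, len(part), home_share))
```

The final proportions are 72% train, 8% train-dev, 10% dev and 10% test, and stratifying keeps the home share close to the overall one in every split.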



After thinking about this, I'm gonna mix day and night pics and see how the model does. If it doesn't work well, this mix is one of the (many) things worth a closer look, making sure the train and dev splits are as similar as possible.

I assume (I haven't checked every picture, sorry) the camera is fixed and its angle doesn't change.

In [8]:
# Create images train/train-dev/dev/test splits

In [9]:
# import auto_labelling
# import os, errno
# from shutil import copyfile
# import logging

# def create_folder(path):
#     try:
#         os.makedirs(path)
#     except OSError as e:
#         if e.errno != errno.EEXIST:
#             raise        

# def create_folder_structures(folder_name, data_x, data_y):
#     data = pd.concat([data_x, data_y], axis=1)
    
#     folder_name_full_path = os.path.join(auto_labelling.FILES_FOLDER_PATH, folder_name)
    
#     create_folder(folder_name_full_path)
    
#     home_imgs_full_path = os.path.join(folder_name_full_path, 'home')
#     out_imgs_full_path = os.path.join(folder_name_full_path, 'out')
    
#     create_folder(home_imgs_full_path)
#     create_folder(out_imgs_full_path)
    
#     logger = logging.getLogger('info_logger')
#     logger.setLevel(logging.INFO)
    
#     for row in data.itertuples(index=True):      
#         file_full_path = os.path.join(auto_labelling.FILES_FOLDER_PATH, getattr(row, 'img_name'))
        
#         if os.path.isfile(file_full_path): 
#             if getattr(row, 'label') == 'home':            
#                 copyfile(file_full_path, os.path.join(home_imgs_full_path, getattr(row, 'img_name')))
#             elif getattr(row, 'label') == 'out':
#                 copyfile(file_full_path, os.path.join(out_imgs_full_path, getattr(row, 'img_name')))          
#         else:
#             logger.info('%s is an old pic and it won\'t be used' % getattr(row, 'img_name'))

In [5]:
train_folder_name = 'train'
train_dev_folder_name = 'train_dev'
dev_folder_name = 'dev'
test_folder_name = 'test'

In [11]:
# create_folder_structures(train_folder_name, train_x, train_y)
# create_folder_structures(train_dev_folder_name, train_dev_x, train_dev_y)
# create_folder_structures(dev_folder_name, dev_x, dev_y)
# create_folder_structures(test_folder_name, test_x, test_y)

In [6]:
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
import os
import auto_labelling

datagen = ImageDataGenerator(rescale=1./255, data_format = 'channels_first')

BATCH_SIZE = 32

train_generator = datagen.flow_from_directory(
                    os.path.join(auto_labelling.FILES_FOLDER_PATH, train_folder_name),
                    target_size=(640, 480),  # Keras reads this as (height, width)
                    batch_size=BATCH_SIZE,
                    class_mode='binary',
                    classes = ['home', 'out'])

Using TensorFlow backend.


Found 7223 images belonging to 2 classes.


## Modelling

I'm gonna use Convolutional Neural Networks. I'm gonna start with very simple networks and then try different architectures.

In [7]:
from keras.layers import Conv2D, Activation, Input, concatenate, Dropout, Flatten, LeakyReLU, AveragePooling2D, Dense
from keras.layers import BatchNormalization  # also exported from keras.layers in Keras 2.x
from keras.models import Model
from keras import backend as K
from keras.initializers import glorot_normal
K.set_image_data_format('channels_first')

In [8]:
def f1(y_true, y_pred):
    def recall(y_true, y_pred):
        """Recall metric.

        Only computes a batch-wise average of recall.

        Computes the recall, a metric for multi-label classification of
        how many relevant items are selected.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
        return recall

    def precision(y_true, y_pred):
        """Precision metric.

        Only computes a batch-wise average of precision.

        Computes the precision, a metric for multi-label classification of
        how many selected items are relevant.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision
    # Harmonic mean of the batch-wise precision and recall
    # (p and r avoid shadowing the helper functions above)
    p = precision(y_true, y_pred)
    r = recall(y_true, y_pred)
    return 2*((p*r)/(p+r+K.epsilon()))
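
To make it clearer what this metric computes on a single batch, here is a NumPy sketch that mirrors the same steps. `f1_numpy`, `EPS` and the toy batch are made up for illustration; `EPS` assumes the default Keras epsilon of 1e-7. Since the generator is created with `classes = ['home', 'out']`, label 1 stands for 'out', so the score is the F1 of the 'out' class.

```python
import numpy as np

EPS = 1e-7  # stand-in for K.epsilon() (assumed default value)

def f1_numpy(y_true, y_pred):
    """Batch-wise F1 for binary labels, mirroring the Keras metric above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Rounding turns sigmoid outputs into hard 0/1 predictions.
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    possible_positives = np.sum(np.round(np.clip(y_true, 0, 1)))
    predicted_positives = np.sum(np.round(np.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + EPS)
    recall = true_positives / (possible_positives + EPS)
    return 2 * precision * recall / (precision + recall + EPS)

# Toy batch: 3 real positives, 3 predicted positives, 2 of them correct.
batch_true = [1, 0, 1, 1]
batch_pred = [0.9, 0.6, 0.4, 0.8]
score = f1_numpy(batch_true, batch_pred)
print('batch F1: %.3f' % score)
```

Because Keras averages this value over batches, the number reported per epoch is a mean of batch-level F1 scores, which is not the same as the F1 computed once over the whole dataset.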


In [9]:
SEED = 0

In [None]:
# Architecture 1: one conv block (conv, batch norm, dropout, LeakyReLU, average pooling) followed by a sigmoid output layer

In [10]:
# channels_first order is (channels, height, width), so this is 640 rows by 480 columns
X_input = Input((3, 640, 480))
X = Conv2D(16, (4, 4), strides = (1, 1), padding = 'valid', name = 'conv1', kernel_initializer = glorot_normal(seed=SEED))(X_input)
X = BatchNormalization(axis = 1, name = 'bn1')(X)
X = Dropout(0.2)(X)
X = LeakyReLU(alpha=0.3)(X)
X = AveragePooling2D(pool_size=(4, 4), strides=None, padding='valid')(X)

X = Flatten()(X)
X = Dense(1, activation='sigmoid', name='output_layer', kernel_initializer = glorot_normal(seed=SEED))(X)


model = Model(inputs = X_input, outputs = X, name='Model1')

In [11]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 3, 640, 480)       0         
_________________________________________________________________
conv1 (Conv2D)               (None, 16, 637, 477)      784       
_________________________________________________________________
bn1 (BatchNormalization)     (None, 16, 637, 477)      64        
_________________________________________________________________
dropout_1 (Dropout)          (None, 16, 637, 477)      0         
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 16, 637, 477)      0         
_________________________________________________________________
average_pooling2d_1 (Average (None, 16, 159, 119)      0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 302736)            0         
__________
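
The shapes and parameter counts in the summary can be checked by hand. The sketch below redoes the arithmetic for 'valid' padding; `valid_out` is a helper made up for illustration, and the dense-layer count is computed here rather than read from the summary.

```python
def valid_out(size, kernel, stride=1):
    """Output length along one axis for a 'valid' convolution or pooling."""
    return (size - kernel) // stride + 1

channels, rows, cols = 3, 640, 480   # Input((3, 640, 480))
filters, kernel = 16, 4              # Conv2D(16, (4, 4), strides=(1, 1))

conv_rows, conv_cols = valid_out(rows, kernel), valid_out(cols, kernel)
# AveragePooling2D with strides=None uses the pool size (4) as the stride.
pool_rows = valid_out(conv_rows, 4, stride=4)
pool_cols = valid_out(conv_cols, 4, stride=4)
flat = filters * pool_rows * pool_cols

conv_params = kernel * kernel * channels * filters + filters  # weights + biases
bn_params = 4 * filters   # gamma, beta, moving mean, moving variance per channel
dense_params = flat + 1   # one weight per flattened input, plus one bias

print((conv_rows, conv_cols), (pool_rows, pool_cols), flat)
print(conv_params, bn_params, dense_params)
```

Almost all the trainable weights sit in the output layer, because the flattened feature map is so large.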

In [12]:
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[f1])

In [None]:
# batch_size is specified at train_generator.
# steps_per_epoch should be an integer, so use floor division instead of np.floor (which returns a float).
model.fit_generator(train_generator, steps_per_epoch = len(train_x) // BATCH_SIZE,
                    epochs = 1, workers = 4, verbose = 2,
                    max_queue_size = 16,
                    use_multiprocessing = True
                   )

Epoch 1/1
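
A quick note on the number of steps: rounding down drops the last partial batch in each epoch, while rounding up covers every image. This is a small sketch of that arithmetic, using the 7223 images reported by `flow_from_directory` above; the variable names are made up for illustration.

```python
import math

BATCH_SIZE = 32
n_images = 7223  # "Found 7223 images belonging to 2 classes."

full_batches = n_images // BATCH_SIZE                   # steps when rounding down
leftover = n_images % BATCH_SIZE                        # images left for a partial batch
steps_covering_all = math.ceil(n_images / BATCH_SIZE)   # steps when rounding up

print(full_batches, leftover, steps_covering_all)
```

Rounding down is harmless here because the generator keeps cycling through the folder, so the leftover images show up at the start of the next epoch.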
