# Recognition and detection of characters in The Simpsons using artificial neural network methods

**Pierre-Edouard GUERIN**

18/12/2019
__________________

# Introduction


## Contexte

A complete dataset of annotated images of characters from the animated series The Simpsons is available here: https://www.kaggle.com/alexattia/the-simpsons-characters-dataset

In machine learning, a convolutional neural network (CNN) is a type of artificial neural network in which the connectivity pattern between neurons is inspired by the animal visual cortex. The idea is to use this method to build a program that detects the different characters in a video of The Simpsons and quantifies how long each one appears on screen.
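
As a rough illustration of the presence-time part of this idea: once a classifier has assigned a character to each sampled video frame, screen time can be estimated by counting frames per character and dividing by the sampling rate. The sketch below assumes a hypothetical list `frame_labels` (one predicted name per sampled frame, `None` when no character is found) and a hypothetical `frames_per_second` sampling rate; neither comes from the notebook.

```python
from collections import Counter

def presence_time(frame_labels, frames_per_second):
    """Estimate on-screen time in seconds for each predicted character.

    frame_labels: one predicted character name per sampled frame,
                  or None when no character was detected (hypothetical input).
    frames_per_second: number of frames sampled per second of video (assumed).
    """
    # Count how many sampled frames each character appears in
    counts = Counter(label for label in frame_labels if label is not None)
    # Convert frame counts to seconds
    return {character: n / frames_per_second for character, n in counts.items()}

# Toy example: 6 frames sampled at 2 frames per second
frame_labels = ["homer_simpson", "homer_simpson", None,
                "bart_simpson", "homer_simpson", "bart_simpson"]
screen_time = presence_time(frame_labels, frames_per_second=2)
print(screen_time)
```

This counts at most one character per frame; frames with several characters would need a list of labels per frame instead.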


## Mission

The objective is to


## Script

To be completed.

# Prerequisites

In [8]:
import os
import random
import numpy as np
from PIL import Image

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.font_manager import FontProperties
import scipy
import cv2
%matplotlib inline

np.random.seed(2)

from sklearn.metrics import confusion_matrix
import itertools

from keras.utils import to_categorical # convert to one-hot encoding
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D, BatchNormalization
from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau, ModelCheckpoint
from keras.models import model_from_json


from skimage.transform import resize
from skimage import data

import warnings

warnings.filterwarnings('ignore')

# Load, format, and explore the dataset

In [9]:
## prepare data
dict_characters = {0: 'abraham_grampa_simpson', 1: 'apu_nahasapeemapetilon', 2: 'bart_simpson', 
        3: 'charles_montgomery_burns', 4: 'chief_wiggum', 5: 'comic_book_guy', 6: 'edna_krabappel', 
        7: 'homer_simpson', 8: 'kent_brockman', 9: 'krusty_the_clown', 10: 'lenny_leonard', 11:'lisa_simpson',
        12: 'marge_simpson', 13: 'mayor_quimby',14:'milhouse_van_houten', 15: 'moe_szyslak', 
        16: 'ned_flanders', 17: 'nelson_muntz', 18: 'principal_skinner', 19: 'sideshow_bob'}

In [13]:
def load_train_set(dirname, dict_characters):
    """Load training images (one sub-directory per character), resized to 64x64."""
    X_train = []
    Y_train = []
    for label, character in dict_characters.items():
        character_dir = os.path.join(dirname, character)
        for image_name in os.listdir(character_dir):
            image = plt.imread(os.path.join(character_dir, image_name))
            # resize() also converts uint8 pixels to floats in [0, 1]
            X_train.append(resize(image, (64, 64)))
            Y_train.append(label)
    return np.array(X_train), np.array(Y_train)

In [17]:
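def load_test_set(dirname, dict_characters):
    """Load test images named '<character>_<index>.<ext>', resized to 64x64."""
    # Reverse mapping: character name -> integer label
    name_to_label = {character: label for label, character in dict_characters.items()}
    X_test = []
    Y_test = []
    for image_name in os.listdir(dirname):
        # Drop the trailing '_<index>.<ext>' part to recover the character name
        character_name = "_".join(image_name.split('_')[:-1])
        label = name_to_label[character_name]
        image = plt.imread(os.path.join(dirname, image_name))
        X_test.append(resize(image, (64, 64)))
        Y_test.append(label)
    return np.array(X_test), np.array(Y_test)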
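
The label lookup in `load_test_set` relies on the test files being named `<character>_<index>.<ext>`, since that is how the function parses them. The sketch below isolates that parsing step with a reverse (name to label) dictionary; `label_from_filename` and the shortened `characters` dictionary are illustrative names, not part of the notebook.

```python
def label_from_filename(image_name, name_to_label):
    """Return the integer label encoded in a test file name.

    Assumes names of the form '<character_name>_<index>.<ext>',
    e.g. 'homer_simpson_12.jpg' (the convention load_test_set relies on).
    """
    # Everything before the last underscore is the character name
    character_name = "_".join(image_name.split("_")[:-1])
    if character_name not in name_to_label:
        raise KeyError(f"unknown character in file name: {image_name!r}")
    return name_to_label[character_name]

# Shortened, illustrative subset of dict_characters
characters = {2: "bart_simpson", 3: "charles_montgomery_burns", 7: "homer_simpson"}
# Reverse mapping: character name -> label
name_to_label = {name: label for label, name in characters.items()}

print(label_from_filename("homer_simpson_12.jpg", name_to_label))
print(label_from_filename("charles_montgomery_burns_3.jpg", name_to_label))
```

Raising an explicit `KeyError` makes a misnamed file easier to spot than an `IndexError` from an empty list.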

In [14]:
## load train data
X_train, Y_train = load_train_set("data/simpsons_characters_recognition_detection/the-simpsons-characters-dataset/simpsons_dataset/", dict_characters)       


In [18]:
## load test data
X_test, Y_test = load_test_set("data/simpsons_characters_recognition_detection/the-simpsons-characters-dataset/kaggle_simpson_testset/", dict_characters)
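
Since this section is also meant to explore the data, one quick check is the number of images per character, because a strongly imbalanced training set can bias a classifier. The sketch below uses a small made-up label list in place of `Y_train` so that it runs on its own; `labels` and `characters` are illustrative stand-ins.

```python
from collections import Counter

# Illustrative stand-ins for Y_train and dict_characters
labels = [7, 7, 2, 7, 11, 2]
characters = {2: "bart_simpson", 7: "homer_simpson", 11: "lisa_simpson"}

# Count images per label and list them, most frequent first
counts = Counter(labels)
for label, n_images in counts.most_common():
    print(f"{characters[label]:<15} {n_images}")
```

With the real data, `Counter(Y_train.tolist())` should give the same kind of summary.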

# Normalization


To reduce the contribution of brightness variation to the CNN model (we want a model that recognizes a character, not a luminosity value), the images need to be converted to grayscale. The cell below handles only the scaling of pixel values.

In [20]:
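## Scale data to [0, 1]. skimage's resize() already returns floats in [0, 1]
## for uint8 input, so divide only if values are still on the 0-255 scale.
if X_train.max() > 1.0:
    X_train = X_train / 255.0
if X_test.max() > 1.0:
    X_test = X_test / 255.0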
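
For the grayscale conversion mentioned above, one common approach is a weighted sum of the RGB channels. The sketch below uses the channel weights documented for `skimage.color.rgb2gray` on a random array standing in for `X_train`; the helper name `to_grayscale` and the array shape are assumptions for illustration, not the notebook's own code.

```python
import numpy as np

def to_grayscale(images):
    """Convert a batch of RGB images (N, H, W, 3) to grayscale (N, H, W, 1)."""
    # Channel weights documented for skimage.color.rgb2gray
    weights = np.array([0.2125, 0.7154, 0.0721])
    # Weighted sum over the channel axis; [..., :3] ignores any alpha channel
    gray = images[..., :3] @ weights
    # Keep a trailing channel axis so the result still fits Conv2D layers
    return gray[..., np.newaxis]

# Random data standing in for X_train (values already in [0, 1])
rng = np.random.default_rng(0)
fake_images = rng.random((4, 64, 64, 3))
gray_images = to_grayscale(fake_images)
print(gray_images.shape)
```

If this were applied, the input shape of the first convolutional layer would need one channel instead of three.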