## Assignment

1. Engineer your features. Here they do not come for free: you need to think of ways to transform the collected data into meaningful features. For ideas, consider traditional features such as texture features, color features, and bags of visual words, or more powerful ones involving CNNs. If you cannot think of anything, ask the professor for ideas.

2. Propose classification techniques to solve the problem. Suggestions here are the CNN directly, or SVMs/Random Forests allied with CNNs through the use of transfer learning.

3. Consider using data augmentation in training (and perhaps at test time as well?).

4. Observation: You are free to use any solution to help you extract the features at this point.

5. Report all of your results for the validation and test data. The labels for the test will be released one week before the deadline.
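
Item 3 asks whether augmentation could also be used at test time. One common form, sketched here under stated assumptions, averages the predicted probabilities over the original image and a mirrored copy. `predict_fn` and `toy_predict` are hypothetical names introduced for illustration; in the notebook, the role of `predict_fn` would presumably be played by `model.predict`.

```python
import numpy as np


def predict_with_tta(predict_fn, images):
    """Average class probabilities over the original and mirrored images.

    predict_fn is a hypothetical callable mapping an array of shape
    (n, height, width, channels) to probabilities of shape (n, classes).
    """
    views = [images, images[:, :, ::-1, :]]  # original, horizontal flip
    probs = [predict_fn(view) for view in views]
    return np.mean(probs, axis=0)


# Toy stand-in for a model: the score of class 0 is the mean brightness
# of the left half of the image, class 1 gets the remainder.
def toy_predict(batch):
    half = batch.shape[2] // 2
    left = batch[:, :, :half, :].mean(axis=(1, 2, 3))
    return np.stack([left, 1.0 - left], axis=1)


images = np.zeros((1, 4, 4, 3))
images[:, :, :2, :] = 1.0  # bright left half, dark right half

print(toy_predict(images))                    # single view
print(predict_with_tta(toy_predict, images))  # averaged over both views
```

The toy model is deliberately sensitive to mirroring, so the averaged prediction differs from the single-view one. Whether this helps on the dog images would have to be checked on the validation set.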

## Results

1. LeNet-like
    - Without data augmentation, 50 epochs
        - Top-1 accuracy = 0.05463301346611636
        - Top-5 accuracy = 0.19096645689592612
        - Normalized accuracy = 0.055267239601399634
        - F1 (macro average) = 0.0528899260412019
    - Without data augmentation, 35 epochs
        - Top-1 accuracy = 0.05596147537013693
        - Top-5 accuracy = 0.18764530092091783
        - Normalized accuracy = 0.05572421304755998
        - F1 (macro average) = 0.049322308806100265
    - With data augmentation, 35 epochs
        - Top-1 accuracy = 0.056127533
        - Top-5 accuracy = 0.18067087
        - Normalized accuracy = 0.05669239630135797
        - F1 (macro average) = 0.05433258611366018
2. InceptionV3
    - Training only the FC layers (20 epochs)
        - Top-1 accuracy = 0.7789771
        - Top-5 accuracy = 0.95034873
        - Normalized accuracy = 0.7765120613992883
        - F1 (macro average) = 0.7714184908455786
    - Test
        - Top-1 accuracy = 0.8501845
        - Top-5 accuracy = 0.98357934
        - Normalized accuracy = 0.8441153359252538
        - F1 (macro average) = 0.8422229263930219

## Experiments

In [1]:
#imports
import numpy as np
from glob import glob
from skimage import io
from skimage.transform import resize
import sklearn.metrics 
from keras import backend as K
from keras.applications.inception_v3 import InceptionV3
from keras.models import Sequential, Model
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Flatten, Dense, Activation
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import SGD
from keras.metrics import categorical_accuracy, top_k_categorical_accuracy
from keras.utils import np_utils, Sequence
from keras.utils.vis_utils import model_to_dot
from keras.utils import plot_model
from IPython.display import SVG

Using TensorFlow backend.


In [2]:
# Definitions
dataFolder = 'data/'
trainFolder = dataFolder + 'train/'
trainAugFolder = dataFolder + 'train_aug/'
validationFolder = dataFolder + 'val/'
testFolder = dataFolder + 'test/'

numberOfClasses = 83
largura = 200
altura = 200
profundidade = 3

In [3]:
# Helper functions

# Return the image array given its file name
def le_imagem(name):
    return io.imread(name, plugin='matplotlib')

# Return the resized image array and its one-hot class given the file path
# (the class index is the file-name prefix before the first underscore)
def le_imagem_e_classe(name):
    img = le_imagem(name)
    img = resize(img, (altura, largura))
    classe = np_utils.to_categorical(int(name.split('/')[2].split('_')[0]), numberOfClasses)
    return img, classe

# Same as above for the augmented images, which were saved already resized
def le_imagem_e_classe_aug(name):
    img = le_imagem(name)
    classe = np_utils.to_categorical(int(name.split('/')[2].split('_')[0]), numberOfClasses)
    return img, classe

# Variant for test images; the `lista` argument is currently unused
def le_imagem_e_classe_test(name, lista):
    img = le_imagem(name)
    img = resize(img, (altura, largura))
    classe = np_utils.to_categorical(int(name.split('/')[2].split('_')[0]), numberOfClasses)
    return img, classe

# Return the paths of all images in a folder
def nome_das_imagens(pasta):
    return glob(pasta + '*')

# Return the class index corresponding to a prediction vector (equivalent to argmax)
def categorical_to_number(vector):
    maior = 0
    for i in range(len(vector)):
        if vector[i] > vector[maior]:
            maior = i
    return maior

def normalized_accuracy(y_true, y_pred):
    acc_by_class = 0
    ind_true = y_true.argmax(axis=1)
    ind_pred = y_pred.argmax(axis=1)
    for clss in range(numberOfClasses):
        mask = (ind_true == clss)
        acc_by_class += np.equal(ind_true[mask], ind_pred[mask]).mean()
    
    return acc_by_class/numberOfClasses

def f1(y_true, y_pred):
    y_true = y_true.argmax(axis=1)
    y_pred = y_pred.argmax(axis=1)
    return sklearn.metrics.f1_score(y_true, y_pred, average='macro')

def metricas(y_true, y_pred):
    sess = K.get_session()
    # Print top-1 and top-5 accuracy
    print("Acuracia-1: " + str(sess.run(K.mean(categorical_accuracy(y_true, y_pred)))))
    print("Acuracia-5: " + str(sess.run(top_k_categorical_accuracy(y_true, y_pred))))

    # Print normalized accuracy and macro-averaged F1
    print("Acuracia normalizada: " + str(normalized_accuracy(y_true, y_pred)))
    print("F1 score: " + str(f1(y_true, y_pred)))
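
The normalized accuracy above is the mean of the per-class accuracies, so a classifier cannot score well by favouring frequent classes. The sketch below illustrates the idea on made-up labels in plain Python. `balanced_accuracy` is a hypothetical name, and unlike `normalized_accuracy` it averages only over classes that actually occur in the true labels, which avoids the `nan` that NumPy returns for the mean of an empty selection.

```python
from collections import defaultdict


def balanced_accuracy(true_labels, predicted_labels):
    """Mean per-class accuracy over the classes present in true_labels."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] += 1
        hits[truth] += int(truth == guess)
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# Made-up, imbalanced example: eight samples of class 0, two of class 1,
# and a classifier that always answers 0.
true_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [0] * 10

plain = sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)
print(plain)                                      # plain accuracy
print(balanced_accuracy(true_labels, predicted))  # mean per-class accuracy
```

Plain accuracy rewards the majority guess, while the per-class mean drops to one half because class 1 is never recognised.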

In [4]:
# Read the training images
X_train = []
y_train = []
for file in nome_das_imagens(trainFolder):
    img, classe = le_imagem_e_classe(file)
    X_train.append(img)
    y_train.append(classe)

# Also read the augmented images? (set to False to train without them)
if True:
    for file in nome_das_imagens(trainAugFolder):
        img, classe = le_imagem_e_classe_aug(file)
        X_train.append(img)
        y_train.append(classe)
    
print(X_train[0].shape)
X_train = np.array(X_train)
print(X_train.shape)
y_train = np.array(y_train)
print(y_train.shape)





(200, 200, 3)
(33200, 200, 200, 3)
(33200, 83)
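
The shape printed above implies a large in-memory array. The arithmetic below estimates its size for two element types. It assumes the array is stored as `float64`, which is what `skimage.transform.resize` typically returns; the notebook does not print the dtype, so treat this as an estimate.

```python
# Shape taken from the output above: (33200, 200, 200, 3)
samples, height, width, channels = 33200, 200, 200, 3
elements = samples * height * width * channels

sizes_gib = {}
for dtype_name, bytes_per_element in (("float64", 8), ("float32", 4)):
    sizes_gib[dtype_name] = elements * bytes_per_element / 2**30
    print(f"{dtype_name}: {sizes_gib[dtype_name]:.1f} GiB")
```

Under that assumption the training set needs close to 30 GiB, and converting to `float32` would halve it.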


In [11]:
# Read the validation images
X_val = []
y_val = []
for file in nome_das_imagens(validationFolder):
    img, classe = le_imagem_e_classe(file)
    X_val.append(img)
    y_val.append(classe)

print(X_val[0].shape)
X_val = np.array(X_val)
print(X_val.shape)
y_val = np.array(y_val)
print(y_val.shape)



(200, 200, 3)
(6022, 200, 200, 3)
(6022, 83)


In [12]:
# Read the test images
X_test = []
y_test = []
with open('MO444_dogs_test.txt', 'r') as f:
    lista = f.readlines()

# Keep only (file name, label) for each line of the list
for line in range(len(lista)):
    lista[line] = (lista[line].split(' ')[1].split('/')[-1], lista[line].split(' ')[2])
#print(lista)
for file in lista:
    img = le_imagem(testFolder + file[0])
    img = resize(img, (200, 200))
    X_test.append(img)
    y_test.append(np_utils.to_categorical(int(file[1]), numberOfClasses))

print(X_test[0].shape)
X_test = np.array(X_test)
print(X_test.shape)
y_test = np.array(y_test)
print(y_test.shape)



(200, 200, 3)
(5420, 200, 200, 3)
(5420, 83)
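
The cell above takes the second field of each line as the image path and the third as the class index. A slightly more defensive version of that parsing is sketched below. `parse_test_line` is a hypothetical helper, and the sample line is made up, since the real contents of `MO444_dogs_test.txt` are not shown.

```python
import os


def parse_test_line(line):
    """Return (file name, integer label) from one line of the list file.

    Assumes whitespace-separated fields where the second is the image path
    and the third is the class index, mirroring the indexing used above.
    """
    fields = line.split()  # split() also drops the trailing newline
    return os.path.basename(fields[1]), int(fields[2])


sample = "0 some/folder/00_0001.jpg 17\n"  # made-up line for illustration
print(parse_test_line(sample))
```

Using `split()` without an argument tolerates repeated spaces and the trailing newline, and `os.path.basename` replaces the manual `split('/')[-1]`.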


- ### LeNet

In [5]:
#Lenet

class LeNet:
    @staticmethod
    def build(width, height, depth, classes, weightsPath=None):
        # Initialize the model
        model = Sequential()
        initial_shape = (height, width, depth)

        # 1st layer: CONV => RELU => POOL
        model.add(Conv2D(filters=6, kernel_size=5, strides=1, input_shape=initial_shape))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=2, strides=2))

        # 2nd layer: CONV => RELU => POOL
        model.add(Conv2D(filters=16, kernel_size=5, strides=1))
        model.add(Activation("relu"))
        model.add(MaxPooling2D(pool_size=2, strides=2))

        # FC layers
        model.add(Flatten())
        model.add(Dense(120))
        model.add(Activation("relu"))
        model.add(Dense(84))
        model.add(Activation("relu"))

        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))
                  
        if weightsPath is not None:
            model.load_weights(weightsPath)
 
        return model
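
To get a feel for the size of this network on 200x200x3 inputs, the feature-map sizes and parameter counts can be worked out by hand. The sketch below does that arithmetic, assuming the Keras defaults of unpadded ("valid") convolutions and pooling. `conv_out` is a hypothetical helper name.

```python
def conv_out(size, kernel, stride=1):
    """Output length of an unpadded convolution or pooling along one axis."""
    return (size - kernel) // stride + 1


side = 200
side = conv_out(side, 5)     # Conv2D 5x5, stride 1
side = conv_out(side, 2, 2)  # MaxPooling2D 2x2, stride 2
side = conv_out(side, 5)     # Conv2D 5x5, stride 1
side = conv_out(side, 2, 2)  # MaxPooling2D 2x2, stride 2
flat = side * side * 16      # length of the Flatten output

# weights + biases for each trainable layer
params = {
    "conv1": 5 * 5 * 3 * 6 + 6,
    "conv2": 5 * 5 * 6 * 16 + 16,
    "dense120": flat * 120 + 120,
    "dense84": 120 * 84 + 84,
    "softmax83": 84 * 83 + 83,
}
print(side, flat)
print(params)
print(sum(params.values()))
```

Almost all of the roughly 4.26 million parameters sit in the first dense layer, because the flattened feature map has 35344 entries.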

- ### LeNet without data augmentation

In [34]:
# Training without data augmentation
model = LeNet.build(width=largura, height=altura, depth=profundidade, classes=numberOfClasses, \
                    weightsPath=None)
opt = SGD(lr=0.01)
model.compile(loss="categorical_crossentropy", optimizer=opt, \
              metrics=[categorical_accuracy, top_k_categorical_accuracy])

In [8]:
# Generate images of the network architecture
plot_model(model, to_file='lenet.png', show_shapes=True, show_layer_names=False)
SVG(model_to_dot(model).create(prog='dot', format='svg'))

In [35]:
model.fit(X_train,y_train, batch_size=100, epochs=50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x7f1f58a08198>

In [35]:
# Evaluate the trained model on the validation set

pred = model.predict(X_val, batch_size=500)
metricas(y_val, pred)

Acuracia-1: 0.054633014
Acuracia-5: 0.19096646
Acuracia normalizada: 0.055267239601399634
F1 score: 0.0528899260412019
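
To put these numbers in context, it helps to compare them with uniform random guessing over the 83 classes. The sketch below does so, assuming roughly balanced classes (an assumption; the class distribution is not shown in the notebook).

```python
num_classes = 83

# Expected accuracy of uniform random guessing
chance_top1 = 1 / num_classes
chance_top5 = 5 / num_classes

# Validation metrics printed above for LeNet without augmentation, 50 epochs
reported_top1 = 0.054633014
reported_top5 = 0.19096646

print(f"chance top-1: {chance_top1:.4f}, reported/chance: {reported_top1 / chance_top1:.2f}")
print(f"chance top-5: {chance_top5:.4f}, reported/chance: {reported_top5 / chance_top5:.2f}")
```

Under that assumption the LeNet-like model is several times better than chance, yet still far from usable accuracy.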


In [15]:
# Save the trained weights
model.save_weights('lenet_nda_50.h5')

- ### Data augmentation

In [42]:
# Data augmentation
data_aug_gen = ImageDataGenerator(rotation_range=30, width_shift_range=0.2, height_shift_range=0.2, \
                                  shear_range=0.2, zoom_range=0.2, vertical_flip=True)

iterator = data_aug_gen.flow(X_train, y_train, batch_size=100, seed=1)


# Generate augmented images until at least 24900 are saved as '<class>_<index>.png'
i = 0
for batch in iterator:
    # the iterator loops forever, so stop once enough images have been saved
    if i >= 24900:
        break
    imgs = batch[0]
    labels = batch[1].argmax(axis=1)
    for ind in range(imgs.shape[0]):
        io.imsave(trainAugFolder + str(labels[ind]) + '_' + str(i) + '.png', imgs[ind,:,:,:])
        i += 1
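
The generation loop depends on a stop condition because the Keras iterator yields batches indefinitely. A minimal sketch of the same pattern, with an exact sample limit that also works when the limit is not a multiple of the batch size, is shown below. `take_samples` and `fake_batches` are hypothetical names; `fake_batches` merely stands in for the real iterator.

```python
from itertools import count, islice


def take_samples(batches, limit):
    """Yield at most `limit` individual (image, label) pairs from `batches`."""
    def flatten():
        for images, labels in batches:
            for pair in zip(images, labels):
                yield pair
    return islice(flatten(), limit)


# Endless toy batch source: batches of 3 integer "images" with labels 0 or 1
def fake_batches(batch_size=3):
    for start in count(step=batch_size):
        ids = list(range(start, start + batch_size))
        yield ids, [i % 2 for i in ids]


saved = list(take_samples(fake_batches(), 10))
print(len(saved))
print(saved[-1])
```

Because `islice` cuts the flattened stream, the last batch is truncated rather than written in full.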



- ### LeNet with data augmentation

In [6]:
# Training with data augmentation
# (weightsPath reloads the weights that a previous run of this notebook saved)
model = LeNet.build(width=largura, height=altura, depth=profundidade, classes=numberOfClasses, \
                    weightsPath='lenet_cda_35.h5')
opt = SGD(lr=0.01)
model.compile(loss="categorical_crossentropy", optimizer=opt, \
              metrics=[categorical_accuracy, top_k_categorical_accuracy])

In [7]:
model.fit(X_train,y_train, batch_size=100, epochs=35)

Epoch 1/35
...
Epoch 35/35


<keras.callbacks.History at 0x7ff31983b080>

In [7]:
# Evaluate the trained model on the validation set

pred = model.predict(X_val, batch_size=500)
metricas(y_val, pred)

Acuracia-1: 0.056127533
Acuracia-5: 0.18067087
Acuracia normalizada: 0.05669239630135797
F1 score: 0.05433258611366018


In [8]:
# Save the trained weights
model.save_weights('lenet_cda_35.h5')

- ### InceptionV3 with all convolutional layers frozen

In [13]:
# Build the model
base_model = InceptionV3(include_top=False, weights='imagenet', input_shape=(200,200,3))

# add a global spatial average pooling layer
x = base_model.output

x = GlobalAveragePooling2D()(x)

# let's add a fully-connected layer
x = Dense(2048, activation='relu')(x)
predictions = Dense(numberOfClasses, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

#Freezing layers
for layer in base_model.layers:
    layer.trainable = False
    
model.load_weights('inceptionv3_cda_20.h5')
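
With the base network frozen, only the two dense layers added on top are trained. The arithmetic below counts their parameters, assuming the global average pooling output has 2048 features, which is the channel count of InceptionV3's final block.

```python
features = 2048  # assumed size of the GlobalAveragePooling2D output
hidden = 2048    # units of the added fully connected layer
classes = 83     # numberOfClasses in the notebook

dense_hidden = features * hidden + hidden  # weights + biases
dense_out = hidden * classes + classes     # weights + biases

print(dense_hidden, dense_out, dense_hidden + dense_out)
```

Under that assumption the trainable head has about 4.4 million parameters, comparable to the whole LeNet-like model.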

In [19]:
# Generate an image of the network architecture
plot_model(model, to_file='inceptionv3.png', show_shapes=True, show_layer_names=False)

In [14]:
# Compile
opt = SGD(lr=0.01)
model.compile(loss="categorical_crossentropy", optimizer=opt, \
              metrics=[categorical_accuracy, top_k_categorical_accuracy])

In [7]:
# Train the model
model.fit(X_train,y_train, batch_size=32, epochs=20)

Epoch 1/20
...
Epoch 20/20


<keras.callbacks.History at 0x7f837e5bbb38>

In [8]:
# Save the weights
model.save_weights('inceptionv3_cda_20.h5')

In [16]:
# Evaluate the trained model on the test set

pred = model.predict(X_test, batch_size=500)
metricas(y_test, pred)

Acuracia-1: 0.8501845
Acuracia-5: 0.98357934
Acuracia normalizada: 0.8441153359252538
F1 score: 0.8422229263930219


- ### InceptionV3 with all convolutional layers frozen except the top 2 Inception blocks

In [5]:
# Build the model
base_model = InceptionV3(include_top=False, weights='imagenet', input_shape=(200,200,3))

# add a global spatial average pooling layer
x = base_model.output

x = GlobalAveragePooling2D()(x)

# let's add a fully-connected layer
x = Dense(2048, activation='relu')(x)
predictions = Dense(numberOfClasses, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# Freeze the first 249 layers and train the rest (in the Keras fine-tuning
# example this cut leaves the top two Inception blocks trainable)
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True
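
The slicing used for partial freezing can be illustrated without Keras. In the sketch below, `ToyLayer` and `freeze_up_to` are hypothetical stand-ins: the first `cut` layers are frozen and all later ones stay trainable, which is what the two loops above do with `cut = 249`.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras layer; only tracks `trainable`."""

    def __init__(self, name):
        self.name = name
        self.trainable = True


def freeze_up_to(layers, cut):
    """Freeze layers[:cut] and leave layers[cut:] trainable."""
    for index, layer in enumerate(layers):
        layer.trainable = index >= cut


layers = [ToyLayer(f"layer_{i}") for i in range(6)]
freeze_up_to(layers, 4)
print([layer.trainable for layer in layers])
```

In Keras, changes to `trainable` only take effect once the model is compiled, which the notebook does in the next cell.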

In [6]:
# Compile
opt = SGD(lr=0.01)
model.compile(loss="categorical_crossentropy", optimizer=opt, \
              metrics=[categorical_accuracy, top_k_categorical_accuracy])

In [7]:
# Train the model
model.fit(X_train,y_train, batch_size=100, epochs=20)

Epoch 1/20
...
Epoch 20/20


<keras.callbacks.History at 0x7f972dcf1a90>

In [8]:
# Save the weights
model.save_weights('inceptionv3_fino_cda_20.h5')

In [12]:
# Evaluate the trained model on the validation set

pred = model.predict(X_val, batch_size=500)
metricas(y_val, pred)

Acuracia-1: 0.74958485
Acuracia-5: 0.9402192
Acuracia normalizada: 0.744914899793834
F1 score: 0.7413158812959674
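
One simple way to choose among the trained models is to compare the validation top-1 accuracies reported in this notebook. The sketch below does that. The dictionary keys are descriptive labels chosen here for illustration, and it assumes the figures listed under Results without an explicit "teste" label were measured on the validation set.

```python
# Top-1 accuracies as reported in this notebook
validation_top1 = {
    "LeNet, no augmentation, 50 epochs": 0.05463301346611636,
    "LeNet, no augmentation, 35 epochs": 0.05596147537013693,
    "LeNet, augmentation, 35 epochs": 0.056127533,
    "InceptionV3, FC only, 20 epochs": 0.7789771,
    "InceptionV3, top blocks fine-tuned, 20 epochs": 0.74958485,
}

best = max(validation_top1, key=validation_top1.get)
print(best, validation_top1[best])
```

Selecting on validation data keeps the test set as an unbiased final check.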


- ### Best model