<a href="https://colab.research.google.com/github/fipaniagua/IIC3697-Deep-Learning/blob/develop/CNN%20Inception%20with%20Keras/HW1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Homework 1: Convolutional Neural Networks (CNNs)
Francisco Paniagua

## Part 1: GoogLeNet Inception V1

In this section we implement the GoogLeNet Inception architecture using the Keras library.
Specifically, we follow the architecture described in the table below:


![GoogLeNet architecture table](https://miro.medium.com/max/1400/1*lRN3h9a_qJdT6NIy0VOu3Q.png)

In [0]:
# Keras modules
import keras
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Dense, Activation, Dropout, Flatten, concatenate, AveragePooling2D

### Activity 1


In [24]:
Inputs = Input(shape=(224,224,3))

x = Conv2D(64, (7, 7), strides=(2,2), padding='same', activation='relu', use_bias=False, name='Conv2d_1a_7x7_conv')(Inputs)
print(x.shape)  # Shape of tensor x: (batch size, reported as None, height, width, channels), where channels equals the number of filters
x = MaxPooling2D((3, 3), strides=(2, 2), padding='same', name='MaxPool_2a_3x3')(x)

model = Model(Inputs, x, name="Inception model 1")
model.summary()

(None, 112, 112, 64)
Model: "Inception model 1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_8 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
Conv2d_1a_7x7_conv (Conv2D)  (None, 112, 112, 64)      9408      
_________________________________________________________________
MaxPool_2a_3x3 (MaxPooling2D (None, 56, 56, 64)        0         
Total params: 9,408
Trainable params: 9,408
Non-trainable params: 0
_________________________________________________________________


As the summary shows, the output of the MaxPool_2a_3x3 layer of "Inception model 1" has shape (56, 56, 64), which matches the second row of the GoogLeNet architecture table.
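The shapes and the parameter count in the summary can be checked by hand. With `padding='same'`, the output side length is the input side length divided by the stride, rounded up, and a bias-free convolution has kernel height x kernel width x input channels x filters weights. A minimal sketch in plain Python; the helper names `same_out` and `conv_params` are hypothetical and not part of Keras:

```python
import math

def same_out(size, stride):
    """Output side length of a conv/pool layer with padding='same'."""
    return math.ceil(size / stride)

def conv_params(kernel_h, kernel_w, in_channels, filters, use_bias=False):
    """Number of trainable weights in a 2D convolution."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)

after_conv = same_out(224, 2)         # 7x7 convolution, stride 2
after_pool = same_out(after_conv, 2)  # 3x3 max pool, stride 2
print(after_conv, after_pool)         # 112 56
print(conv_params(7, 7, 3, 64))       # 9408
```

The same rule gives 4,096 weights for a bias-free 1x1 convolution from 64 to 64 channels.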


### Activity 2

We now add the following layers to the model:
1. Convolution with 64 filters, a 1x1 window and stride 1. No bias.
1. Convolution with 192 filters, a 3x3 window and stride 1. No bias.
1. Max pooling with a 3x3 window and stride 2.


In [25]:
l2a = Conv2D(64, (1, 1), strides=(1,1), padding='same', activation='relu', use_bias=False, name='Conv2d_2a_1x1')(x) 
l2b = Conv2D(192, (3, 3), strides=(1,1), padding='same', activation='relu', use_bias=False, name='Conv2d_2b_3x3')(l2a)
l2c = MaxPooling2D((3,3), strides=(2,2), padding="same", name = "MaxPool_2c_3x3")(l2b)
print(l2c.shape)

(None, 28, 28, 192)


The GoogLeNet architecture makes repeated use of 1x1 convolutions with stride 1, because without them the computational cost would be too high. The two tables below compare the number of operations needed to perform two convolutions (one 3x3 and one 5x5) with and without a preceding 1x1 convolution.


> Without preceding 1x1 convolutions:

Input size | Operation type | Filter size | Output size | Operations required (output size x filter size)
--- | --- | --- | --- | ---
28x28x192 | Conv2d 5x5 (32 filters) | 5x5x192 | 28x28x32 | 120,422,400
28x28x192 | Conv2d 3x3 (128 filters) | 3x3x192 | 28x28x128 | 173,408,256
Total | | | | ≈ 293.8 million

> With preceding 1x1 convolutions:

Input size | Operation type | Filter size | Output size | Operations required (output size x filter size)
--- | --- | --- | --- | ---
28x28x192 | Conv2d 1x1 (96 filters) | 1x1x192 | 28x28x96 | 14,450,688
28x28x192 | Conv2d 1x1 (16 filters) | 1x1x192 | 28x28x16 | 2,408,448
28x28x96 | Conv2d 3x3 (128 filters) | 3x3x96 | 28x28x128 | 86,704,128
28x28x16 | Conv2d 5x5 (32 filters) | 5x5x16 | 28x28x32 | 10,035,200
Total | | | | ≈ 113.6 million

In this example, the 1x1 convolutions reduce the cost from about 293.8 million to about 113.6 million operations.
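The figures in both tables can be reproduced with a few lines of arithmetic. The sketch below counts operations the same way the tables do (output size x filter size); the helper name `conv_ops` is hypothetical:

```python
def conv_ops(out_h, out_w, filters, kernel, in_channels):
    """Operations for one conv layer: output size x filter size."""
    return out_h * out_w * filters * kernel * kernel * in_channels

# Direct 5x5 and 3x3 convolutions on the 28x28x192 input.
direct = conv_ops(28, 28, 32, 5, 192) + conv_ops(28, 28, 128, 3, 192)

# Same two branches, each preceded by a 1x1 reduction.
reduced = (
    conv_ops(28, 28, 96, 1, 192)    # 1x1 reduction before the 3x3
    + conv_ops(28, 28, 16, 1, 192)  # 1x1 reduction before the 5x5
    + conv_ops(28, 28, 128, 3, 96)  # 3x3 on the reduced 96 channels
    + conv_ops(28, 28, 32, 5, 16)   # 5x5 on the reduced 16 channels
)

print(direct)   # 293830656
print(reduced)  # 113598464
```

Most of the saving comes from the 5x5 branch, whose input shrinks from 192 to 16 channels.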


### Activity 4

In [0]:
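def generate_inception_layer(input_tensor, size_1x1, reduce_3x3_size, size_3x3, reduce_5x5_size, size_5x5, pool_proj_size, module_name):
    """Build one Inception module on input_tensor and return the concatenated output.

    Four parallel branches (1x1, 1x1 -> 3x3, 1x1 -> 5x5, max pool -> 1x1) are
    concatenated along the channel axis.
    """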
    branch_0 = Conv2D(size_1x1, (1,1), strides=(1,1), padding="same", activation="relu", name = "{0}_branch_0_a_1x1".format(module_name))(input_tensor)
    branch_1a = Conv2D(reduce_3x3_size, (1,1), strides=(1,1), padding="same", activation="relu", name="{0}_branch_1a_1x1".format(module_name))(input_tensor)
    branch_1b = Conv2D(size_3x3, (3,3), strides=(1,1), padding="same", activation="relu", name="{0}_branch_1b_3x3".format(module_name))(branch_1a)
    branch_2a = Conv2D(reduce_5x5_size, (1,1), strides=(1,1), padding="same", activation="relu", name="{0}_branch_2a_1x1".format(module_name))(input_tensor)
    branch_2b = Conv2D(size_5x5, (5,5), strides=(1,1), padding="same", activation="relu", name="{0}_branch_2b_5x5".format(module_name))(branch_2a)
    branch_3a = MaxPooling2D((3,3),strides=(1,1),padding="same", name="{0}_branch_1c_maxpool".format(module_name))(input_tensor)
    branch_3b = Conv2D(pool_proj_size, (1,1), strides=(1,1), padding="same", activation="relu", name="{0}_branch_2c_1x1".format(module_name))(branch_3a)
    concat_layer = concatenate([branch_0, branch_1b, branch_2b, branch_3b], axis=3, name="{0}_concatenated_layer".format(module_name))
    return concat_layer


In [28]:
l3a = generate_inception_layer(l2c, 64, 96, 128, 16, 32, 32, "3a")
l3a.shape

TensorShape([None, 28, 28, 256])

As we can see, the module function returns the correct dimensions: the layer after 2c is an *inception* module that should produce a 28x28x256 output (row 5 of the GoogLeNet architecture table).
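The 256 output channels are simply the sum of the filter counts of the four concatenated branches; the two reduction sizes are internal to the module and do not appear in the output. A small sketch of that check, with a hypothetical helper name:

```python
def inception_out_channels(size_1x1, size_3x3, size_5x5, pool_proj_size):
    """Channels after concatenating the four branches along the last axis."""
    return size_1x1 + size_3x3 + size_5x5 + pool_proj_size

# Module 3a: 64 (1x1) + 128 (3x3) + 32 (5x5) + 32 (pool projection)
print(inception_out_channels(64, 128, 32, 32))  # 256
```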

### Activity 5

In this section we implement the full network up to, but not including, the *AveragePool* layer.


<img src="https://i.kym-cdn.com/photos/images/facebook/000/531/557/a88.jpg" alt="drawing" width="500"/>


In [0]:
l3b = generate_inception_layer(l3a, 128, 128, 192, 32, 96, 64, "3b")
l3c = MaxPooling2D((3,3), strides=(2,2), padding="same", name="Max_pool_3c_3x3")(l3b)
l4a = generate_inception_layer(l3c, 192, 96, 208, 16, 48, 64, "4a")
l4b = generate_inception_layer(l4a, 160, 112, 224, 24, 64, 64, "4b")
l4c = generate_inception_layer(l4b, 128, 128, 256, 24, 64, 64, "4c")
l4d = generate_inception_layer(l4c, 112, 144, 288, 32, 64, 64, "4d")
l4e = generate_inception_layer(l4d, 256, 160, 320, 32, 128, 128, "4e")
l4f = MaxPooling2D((3,3), strides=(2,2), padding="same", name="Max_pool_4f_3x3")(l4e)
l5a = generate_inception_layer(l4f, 256, 160, 320, 32, 128, 128, "5a")
l5b = generate_inception_layer(l5a, 384, 192, 384, 48, 128, 128, "5b")

The simplest way to check that our layers have the correct dimensions is to verify that the last layer, "*Inception 5b*", produces a 7x7x1024 output (the inception (5b) row of the architecture table).

In [31]:
l5b.shape

TensorShape([None, 7, 7, 1024])
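The same result can be traced without building the model. The sketch below walks through the module configurations used above, halving the spatial size at each stride-2 max pool (`padding='same'`) and summing branch filters at each Inception module. The `layers` list is a hypothetical structure written for this check:

```python
import math

# (size_1x1, reduce_3x3, size_3x3, reduce_5x5, size_5x5, pool_proj) per module;
# None marks a stride-2 max pool.
layers = [
    ("3b", (128, 128, 192, 32, 96, 64)),
    ("Max_pool_3c", None),
    ("4a", (192, 96, 208, 16, 48, 64)),
    ("4b", (160, 112, 224, 24, 64, 64)),
    ("4c", (128, 128, 256, 24, 64, 64)),
    ("4d", (112, 144, 288, 32, 64, 64)),
    ("4e", (256, 160, 320, 32, 128, 128)),
    ("Max_pool_4f", None),
    ("5a", (256, 160, 320, 32, 128, 128)),
    ("5b", (384, 192, 384, 48, 128, 128)),
]

side, channels = 28, 256  # output of module 3a
for name, cfg in layers:
    if cfg is None:
        side = math.ceil(side / 2)  # max pool: spatial size halves
    else:
        channels = cfg[0] + cfg[2] + cfg[4] + cfg[5]  # concatenated branches
    print(name, (side, side, channels))
```

The last line printed is `5b (7, 7, 1024)`, in agreement with the tensor shape reported by Keras.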

### Activity 6

Finally, we add the last layers and instantiate the model.

In [36]:
lavgpool = AveragePooling2D((7,7), strides=(1,1), name="Avarage_pool_7x7")(l5b)
print(lavgpool.shape)
ldrop = Dropout(0.4)(lavgpool)
lflat = Flatten()(ldrop)
llinear = Dense(1000, use_bias=False, name="linear")(lflat)
lfinal = Activation("softmax")(llinear)

(None, 1, 1, 1024)
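The (1, 1, 1024) shape follows from the default `padding='valid'` of `AveragePooling2D`: a 7x7 window fits exactly once in a 7x7 feature map. The sketch below, with a hypothetical helper name, also derives the number of weights in the bias-free `Dense(1000)` layer that follows:

```python
def valid_out(size, window, stride):
    """Output side length of a pooling layer with padding='valid'."""
    return (size - window) // stride + 1

side = valid_out(7, 7, 1)        # 7x7 average pool on a 7x7 map
flat = side * side * 1024        # features after Flatten
dense_params = flat * 1000       # bias-free Dense(1000)
print(side, flat, dense_params)  # 1 1024 1024000
```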


In [38]:
final_model = Model(Inputs, lfinal, name="inceptionNet")
final_model.summary()

Model: "inceptionNet"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_8 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
Conv2d_1a_7x7_conv (Conv2D)     (None, 112, 112, 64) 9408        input_8[0][0]                    
__________________________________________________________________________________________________
MaxPool_2a_3x3 (MaxPooling2D)   (None, 56, 56, 64)   0           Conv2d_1a_7x7_conv[0][0]         
__________________________________________________________________________________________________
Conv2d_2a_1x1 (Conv2D)          (None, 56, 56, 64)   4096        MaxPool_2a_3x3[0][0]             
_______________________________________________________________________________________