### Section 1 - Adapting Architectures for Resolution Change

Q-1.1 – Please refer to the “Aggregated Residual Transformations for Deep Neural Networks” (ResNeXt) architecture (paper and code linked below). The task is to adapt this architecture to work with 64x64 images for training and prediction. Please use the down-sampled ImageNet (dataset link below) as the train/test dataset. You can either directly adapt the Lua reference implementation from FAIR or re-implement it in your preferred framework.
 

Paper Link : https://arxiv.org/pdf/1611.05431.pdf

Dataset Link : http://image-net.org/small/download.php

Git Link : https://github.com/facebookresearch/ResNeXt

### Objective

To adapt this architecture to work with 64x64 images for training and prediction.


## Keras implementation

* Source 
* https://github.com/nitish11/Kaggle-submissions-image-classification/tree/master/cifar-10-keras
* https://github.com/fchollet/keras/blob/master/keras/applications/resnet50.py

In [1]:
import os
import numpy as np
import glob
import cv2
import six

from keras import layers
from keras.utils.data_utils import get_file
from keras.models import Model
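from keras.layers import Input, Activation, Dense, Flatten
# Needed by the `include_top=False` pooling branch of ResNet50 below
from keras.layers import GlobalAveragePooling2D, GlobalMaxPooling2D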
from keras.layers.convolutional import Conv2D, MaxPooling2D, AveragePooling2D,ZeroPadding2D
from keras.layers.merge import add
from keras.layers.normalization import BatchNormalization
from keras.regularizers import l2
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import np_utils
from keras.callbacks import ReduceLROnPlateau, CSVLogger, EarlyStopping
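from keras.applications.imagenet_utils import _obtain_input_shape
# Needed by ResNet50 below when an `input_tensor` is passed in
from keras.engine.topology import get_source_inputs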

Using TensorFlow backend.


In [2]:
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'


In [3]:
def identity_block(input_tensor, kernel_size, filters, stage, block):
    """The identity block is the block that has no conv layer at shortcut.
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer on the main path
        filters: list of integers, the filters of the 3 conv layers on the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
    # Returns
        Output tensor for the block.
    """
    filters1, filters2, filters3 = filters
    if K.image_data_format() == 'channels_last':
        bn_axis = 3
    else:
        bn_axis = 1
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size,
               padding='same', name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

    x = layers.add([x, input_tensor])
    x = Activation('relu')(x)
    return x
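
The block above is the plain ResNet bottleneck; the ResNeXt paper replaces its middle path with `C` parallel low-width paths (cardinality `C`, width `d`) at roughly the same cost. As a rough check, the sketch below counts convolution weights only (biases and batch normalization are ignored) for the first-stage block used here, `[64, 64, 256]` on a 256-channel input, and for the 32x4d ResNeXt counterpart described in the paper. The helper names are hypothetical and are not part of the notebook.

```python
def bottleneck_weights(in_ch, mid_ch, out_ch, kernel=3):
    """Conv weights of a 1x1 -> kxk -> 1x1 bottleneck (no biases, no BN)."""
    return in_ch * mid_ch + kernel * kernel * mid_ch * mid_ch + mid_ch * out_ch


def resnext_weights(in_ch, cardinality, width, out_ch, kernel=3):
    """Conv weights of `cardinality` parallel bottleneck paths of width `width`."""
    per_path = in_ch * width + kernel * kernel * width * width + width * out_ch
    return cardinality * per_path


# ResNet-50 style block: 256 -> 64 -> 64 -> 256
resnet_count = bottleneck_weights(256, 64, 256)
# ResNeXt 32x4d style block: 32 paths of 256 -> 4 -> 4 -> 256
resnext_count = resnext_weights(256, 32, 4, 256)

print("ResNet bottleneck weights:  %d" % resnet_count)
print("ResNeXt 32x4d block weights: %d" % resnext_count)
```

The two counts are close, which is the design point of the paper: cardinality is increased while the parameter budget stays about the same.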


In [4]:
def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    """conv_block is the block that has a conv layer at shortcut
    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of the middle conv layer on the main path
        filters: list of integers, the filters of the 3 conv layers on the main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
    # Returns
        Output tensor for the block.
    Note that from stage 3, the first conv layer at main path is with strides=(2,2)
    And the shortcut should have strides=(2,2) as well
    """
    filters1, filters2, filters3 = filters
    if K.image_data_format() == 'channels_last':
        bn_axis = 3
    else:
        bn_axis = 1
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Conv2D(filters1, (1, 1), strides=strides,
               name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters2, kernel_size, padding='same',
               name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

    shortcut = Conv2D(filters3, (1, 1), strides=strides,
                      name=conv_name_base + '1')(input_tensor)
    shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)

    x = layers.add([x, shortcut])
    x = Activation('relu')(x)
    return x



In [5]:
def ResNet50(include_top=True, weights='imagenet',
             input_tensor=None, input_shape=None,
             pooling=None,
             classes=1000):
    """Instantiates the ResNet50 architecture.
    Optionally loads weights pre-trained
    on ImageNet. Note that when using TensorFlow,
    for best performance you should set
    `image_data_format="channels_last"` in your Keras config
    at ~/.keras/keras.json.
    The model and the weights are compatible with both
    TensorFlow and Theano. The data format
    convention used by the model is the one
    specified in your Keras config file.
    # Arguments
        include_top: whether to include the fully-connected
            layer at the top of the network.
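        weights: one of `None` (random initialization)
            or "imagenet". Note that this adapted version does not
            download or load any pre-trained weights, so both values
            leave the model randomly initialized.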
        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
            to use as image input for the model.
        input_shape: shape tuple of the input images, e.g.
            `(64, 64, 3)` (with `channels_last` data format)
            or `(3, 64, 64)` (with `channels_first` data format).
            It should have exactly 3 input channels.
            The upstream `_obtain_input_shape` check (minimum size 197)
            is disabled in this version, so 64x64 inputs are accepted.
        pooling: Optional pooling mode for feature extraction
            when `include_top` is `False`.
            - `None` means that the output of the model will be
                the 4D tensor output of the
                last convolutional layer.
            - `avg` means that global average pooling
                will be applied to the output of the
                last convolutional layer, and thus
                the output of the model will be a 2D tensor.
            - `max` means that global max pooling will
                be applied.
        classes: optional number of classes to classify images
            into, only to be specified if `include_top` is True, and
            if no `weights` argument is specified.
    # Returns
        A Keras model instance.
    # Raises
        ValueError: in case of invalid argument for `weights`,
            or invalid input shape.
    """
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')

    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top`'
                         ' as true, `classes` should be 1000')

    # Determine proper input shape.
    # The upstream size check is disabled so that 64x64 inputs are accepted:
    # input_shape = _obtain_input_shape(input_shape,
    #                                   default_size=224,
    #                                   min_size=197,
    #                                   data_format=K.image_data_format(),
    #                                   include_top=include_top)

    if input_tensor is None:
        img_input = Input(shape=input_shape)
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
            
    if K.image_data_format() == 'channels_last':
        bn_axis = 3
    else:
        bn_axis = 1

    x = ZeroPadding2D((1, 1))(img_input)
    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)
    x = BatchNormalization(axis=bn_axis, name='bn_conv1')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

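    # Upstream ResNet50 pools a 7x7 feature map here (224x224 input):
    #     x = AveragePooling2D((7, 7), name='avg_pool')(x)
    # A 64x64 input arrives as a 2x2 feature map, so the window is reduced to (2, 2).
    x = AveragePooling2D((2, 2), name='avg_pool')(x)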

    if include_top:
        x = Flatten()(x)
        x = Dense(classes, activation='softmax', name='fc1000')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
    else:
        inputs = img_input
    
    # Create model.
    model = Model(inputs, x, name='resnet50')

    return model
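
To see why the final pooling window becomes `(2, 2)`, the spatial size can be traced through the layers of `ResNet50` above. This is a minimal sketch using the standard "valid" convolution arithmetic, `floor((size - kernel) / stride) + 1`; the function names are hypothetical, and the layer settings are read off the cell above (padding 1, a 7x7 stride-2 convolution, a 3x3 stride-2 max pool, and a stride-2 1x1 convolution at the start of stages 3 to 5).

```python
def valid_out(size, kernel, stride):
    """Output size of an unpadded ('valid') convolution or pooling layer."""
    return (size - kernel) // stride + 1


def trace_spatial_size(size, pad=1, final_pool=2):
    """Return (layer, size) pairs for the adapted ResNet50 defined above."""
    steps = [("input", size)]
    size += 2 * pad
    steps.append(("zero_padding", size))
    size = valid_out(size, 7, 2)
    steps.append(("conv1", size))
    size = valid_out(size, 3, 2)
    steps.append(("max_pool", size))
    steps.append(("stage2", size))  # first conv_block uses strides=(1, 1)
    for stage in (3, 4, 5):
        size = valid_out(size, 1, 2)  # 1x1 conv with strides=(2, 2)
        steps.append(("stage%d" % stage, size))
    size = valid_out(size, final_pool, final_pool)
    steps.append(("avg_pool", size))
    return steps


sizes = dict(trace_spatial_size(64))
for name, value in trace_spatial_size(64):
    print("%-12s %d" % (name, value))
```

For a 64x64 input the trace gives 66 after padding and 30 after `conv1`, matching the model summary further down, and a 2x2 map entering `avg_pool`, so a `(7, 7)` window would not fit.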

In [6]:
# Read the training images from disk and build the arrays for the Keras model
def prepare_data(no_classes, dataset_path, num_of_channels, image_width, image_height):
    print("===Setting the data ===")
    class_names = os.listdir(dataset_path)

    #Calculate the number of the images in the dataset
    N=0
    for p_name in class_names:
        i_path = dataset_path + "/"+p_name 
        i_count = len(os.listdir(i_path))
        N += i_count

    # Training Data
    X_train = np.zeros((N, image_width, image_height, num_of_channels), dtype=np.uint8)
    y_train = np.zeros((N,no_classes), dtype=np.int64)

    index = 0
    data_count =0
    data_class = 0

    #class list and Images to read from directory
    for class_index,class_name in enumerate(class_names):

        images_path = dataset_path + "/"+class_name 
        images_count = len(os.listdir(images_path))
        label_data = np.zeros((no_classes), dtype=np.uint8)
        data_class += 1
        data_count += images_count

        print("= class name " + class_name)
        images_filenames = glob.glob(images_path + '/*.png')

        train_images = [np.array(cv2.imread(f)) for f in images_filenames]
        num_of_images = len(train_images) 
        train_index = index+num_of_images 

        X_train[index:train_index] = train_images
        #Labelling data
        label_data[data_class-1] = 1
        y_train[index:train_index] = label_data

        index += num_of_images
        if data_class == no_classes:
            break

    print("Xtrain shape :- " + str(np.shape(X_train)))
    print("Ytrain shape :- " + str(np.shape(y_train)))
    print("==Number of classs taken into data :" + str(data_class))

    return X_train,y_train
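
`prepare_data` builds one one-hot row per class directory and copies it to every image of that class. The same labels can be produced in one step from integer class indices; the sketch below uses `np.eye` for that. The helper `one_hot_labels` is hypothetical and only illustrates the label layout, it is not used by the notebook.

```python
import numpy as np


def one_hot_labels(class_indices, num_classes):
    """Turn integer class indices into one-hot rows, one row per image."""
    identity = np.eye(num_classes, dtype=np.int64)
    return identity[np.asarray(class_indices)]


# Four images: two of class 0, one of class 1, one of class 2
labels = one_hot_labels([0, 0, 1, 2], num_classes=3)
print(labels)
```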


In [7]:
lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1), cooldown=0, patience=5, min_lr=0.5e-6)
early_stopper = EarlyStopping(min_delta=0.001, patience=10)
csv_logger = CSVLogger('imagenet.csv')

batch_size = 32
nb_classes = 10
nb_epoch = 3
data_augmentation = True

# input image dimensions
img_rows, img_cols = 64, 64
img_channels = 3

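# Loading the ImageNet dataset.
# Note: train and test are both read from './test_data' here, so the
# validation numbers reported during training are measured on the training images.
X_train,Y_train = prepare_data(nb_classes,'./test_data',img_channels,img_rows,img_cols)
X_test,Y_test = prepare_data(nb_classes,'./test_data',img_channels,img_rows,img_cols)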

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# subtract mean and normalize
mean_image = np.mean(X_train, axis=0)
X_train -= mean_image
X_test -= mean_image
X_train /= 128.
X_test /= 128.


===Setting the data ===
= class name n02130308
= class name n02165456
= class name n01632458
= class name n01798484
= class name n01968897
= class name n01582220
= class name n02112350
= class name n01532829
= class name n02091831
= class name n02102040
Xtrain shape :- (500, 64, 64, 3)
Ytrain shape :- (500, 10)
==Number of classs taken into data :10
===Setting the data ===
= class name n02130308
= class name n02165456
= class name n01632458
= class name n01798484
= class name n01968897
= class name n01582220
= class name n02112350
= class name n01532829
= class name n02091831
= class name n02102040
Xtrain shape :- (500, 64, 64, 3)
Ytrain shape :- (500, 10)
==Number of classs taken into data :10
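
The preprocessing in the cell above subtracts the per-pixel mean of the training set from both splits and divides by 128. A minimal sketch on synthetic data (random arrays standing in for the real images, with hypothetical `_demo` names) shows the effect: the training set ends up centered at zero and all values fall roughly within [-2, 2].

```python
import numpy as np

# Synthetic stand-ins for the real image arrays (assumption: uint8-range pixels)
rng = np.random.RandomState(0)
X_train_demo = rng.randint(0, 256, size=(8, 64, 64, 3)).astype('float32')
X_test_demo = rng.randint(0, 256, size=(4, 64, 64, 3)).astype('float32')

# The mean image is computed on the training split only and reused for the test split
mean_image = np.mean(X_train_demo, axis=0)
X_train_demo = (X_train_demo - mean_image) / 128.
X_test_demo = (X_test_demo - mean_image) / 128.

print("mean image shape: " + str(mean_image.shape))
print("train value range: %.3f to %.3f" % (X_train_demo.min(), X_train_demo.max()))
```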


In [8]:
#Compiling the defined models
model=ResNet50(include_top=True,weights=None,input_tensor=None,input_shape=(64,64,3),pooling=None,classes=10)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

#Summary of model
model.summary()

#Training the model
model.fit(X_train, Y_train, batch_size=batch_size, epochs=nb_epoch, validation_data=(X_test, Y_test),
          shuffle=True, callbacks=[lr_reducer, early_stopper, csv_logger])

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_1 (InputLayer)             (None, 64, 64, 3)     0                                            
____________________________________________________________________________________________________
zero_padding2d_1 (ZeroPadding2D) (None, 66, 66, 3)     0                                            
____________________________________________________________________________________________________
conv1 (Conv2D)                   (None, 30, 30, 64)    9472                                         
____________________________________________________________________________________________________
bn_conv1 (BatchNormalization)    (None, 30, 30, 64)    256                                          
___________________________________________________________________________________________



Train on 500 samples, validate on 500 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7f0870ae2250>
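
The `ReduceLROnPlateau` callback configured earlier multiplies the learning rate by `sqrt(0.1)` each time the monitored loss stalls, down to a floor of `0.5e-6`. The sketch below reproduces only that arithmetic, assuming the Keras default Adam learning rate of 0.001 (the notebook passes `optimizer='adam'` without an explicit rate); `plateau_schedule` is a hypothetical helper, not a Keras function.

```python
import math


def plateau_schedule(initial_lr, factor, min_lr, reductions):
    """Learning rate after each successive plateau-triggered reduction."""
    lrs = [initial_lr]
    for _ in range(reductions):
        lrs.append(max(lrs[-1] * factor, min_lr))
    return lrs


lrs = plateau_schedule(1e-3, math.sqrt(0.1), 0.5e-6, reductions=8)
for step, lr in enumerate(lrs):
    print("after %d reductions: %.3g" % (step, lr))
```

Two reductions lower the rate by a factor of 10. With `nb_epoch = 3` and patience values of 5 and 10, neither this callback nor `EarlyStopping` has enough epochs to trigger in the run shown above.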