This problem is taken from the Analytics Vidhya practice problem at [https://datahack.analyticsvidhya.com/contest/practice-problem-age-detection/](https://datahack.analyticsvidhya.com/contest/practice-problem-age-detection/).

>The task is to predict the age of a person from his or her facial attributes. For simplicity, the problem has been converted to a multi-class problem with classes as Young, Middle and Old.

I attempted this problem almost 1.5 years ago, knowing only about convolutions.

Today, I am at least in a position to apply deep neural networks to problems like these, thanks to the community and great learning resources. The code presented here has been taken from many places. I think that, for a programmer, it is equally important to be able to read other people's code and reuse it. (Opinions are mine.)

I would like to thank the following people specifically:
- Adrian Rosebrock, for putting together a tutorial on [Deep Learning and Medical Image Analysis with Keras](https://www.pyimagesearch.com/2018/12/03/deep-learning-and-medical-image-analysis-with-keras/), which is one of the classiest tutorials I have ever read.
- Faizan Shaikh, for the blog [Hands on with Deep Learning – Solution for Age Detection Practice Problem](https://www.analyticsvidhya.com/blog/2017/06/hands-on-with-deep-learning-solution-for-age-detection-practice-problem/), which served me tremendously as a reference.

In [None]:
# Dependencies

import os
import random
import numpy as np 

import pandas as pd
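# Note: scipy.misc.imread (and imresize, used later) were removed in SciPy 1.2,
# so this notebook needs an older SciPy release.
from scipy.misc import imread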
from subprocess import check_output

import warnings
warnings.filterwarnings("ignore")

In [None]:
# I used Adrian's custom ResNet model. Importing modules from custom scripts in Kaggle
# Kernels requires some hacks. Starting from the working directory ensures the initial
# directory structure is retained if anything goes wrong while loading the custom script.
os.chdir("/kaggle/working/")

In [None]:
train_csv = pd.read_csv('../input/traincsv/train.csv')
test_csv = pd.read_csv('../input/testcsv/test.csv')

In [None]:
print(check_output(["ls", "../input/traintestzip/train/Train"]).decode("UTF-8"))

In [None]:
print(check_output(["ls", "../input/traintestzip/test/Test"]).decode("UTF-8"))

In [None]:
# Load a random image and display it, along with the age class of the person in it.
# Thanks, Faizan.
import matplotlib.pyplot as plt

i = random.choice(train_csv.index)

img_name = train_csv.ID[i]
img = imread(os.path.join('../input/traintestzip/train', 'Train', img_name))

plt.imshow(img)
plt.show()
print("Age: " + train_csv.Class[i])

# Resizing the images from the train and test sets to 64 * 64 and half precision (float16).
> In my experiments, the half-precision policy gave a tremendous speed boost and an improved performance score.
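
To make the memory side of the half-precision choice concrete, here is a minimal sketch that compares the storage needed for a stack of 64 * 64 RGB images at different precisions. The image count is made up for illustration and is not the real dataset size. The speed and score effects mentioned above are the author's observation and are not reproduced here.

```python
import numpy as np

# Hypothetical stack: 1,000 RGB images of 64 x 64 pixels (not the real dataset size).
n_images = 1000
n_values = n_images * 64 * 64 * 3

# Bytes needed to hold the stack at each floating-point precision.
sizes_bytes = {}
for name in ("float16", "float32", "float64"):
    sizes_bytes[name] = n_values * np.dtype(name).itemsize
    print(f"{name}: {sizes_bytes[name] / 1e6:.1f} MB")
```

Raw 8-bit pixel values (0 to 255) are whole numbers that float16 can represent exactly, so the cast itself loses nothing before normalization.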

In [None]:
from scipy.misc import imresize

# Read every training image, resize it to 64 x 64 and cast it to float16.
temp = []
for img_name in train_csv.ID:
    img_path = os.path.join('../input/traintestzip/train', 'Train', img_name)
    img = imread(img_path)  # pass flatten=True here to load as greyscale instead
    img = imresize(img, (64, 64))
    img = img.astype('float16')  # half precision keeps the memory footprint small
    temp.append(img)

train_x = np.stack(temp)

In [None]:
temp = []
for img_name in test_csv.ID:
    img_path = os.path.join('../input/traintestzip/test', 'Test', img_name)
    img = imread(img_path)#, flatten = True)
    img = imresize(img, (64, 64))
    temp.append(img.astype('float16'))

test_x = np.stack(temp)

In [None]:
# # Required if greyscaling is applied
# train_x = np.expand_dims(train_x, axis=3)
# test_x = np.expand_dims(test_x, axis=3)
# train_x.shape, test_x.shape

# PyImageSearch Patches

In [None]:
!pip install --upgrade imutils

**imutils** is a utility package written by Adrian that provides many convenience functions for image processing tasks.

In [None]:
from imutils import paths
# determine the total number of image paths in training, validation,
# and testing directories
totalTrain = len(list(paths.list_images(os.path.join('../input/traintestzip/train', 'Train'))))
#totalVal = len(list(paths.list_images(config.VAL_PATH)))
totalTest = len(list(paths.list_images(os.path.join('../input/traintestzip/test', 'Test'))))

In [None]:
totalTrain, totalTest
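
For readers without `imutils`, the count above can be approximated with the standard library. This is a rough sketch, not the `imutils` implementation: the helper name `count_images` and the extension list are assumptions made for illustration.

```python
import os

# Extensions treated as images here; this tuple is an assumption for the sketch.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff")


def count_images(root):
    """Count files under `root` (recursively) whose extension looks like an image."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(name.lower().endswith(IMAGE_EXTS) for name in filenames)
    return total
```

Calling `count_images(os.path.join('../input/traintestzip/train', 'Train'))` should then give a number comparable to `totalTrain`, provided the directory only holds files with these extensions.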

## Normalization of the pixels

In [None]:
train_x = train_x / 255.
test_x = test_x / 255.
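
As a quick check of what this scaling does, here is a small sketch on a made-up array of pixel values (not real image data). It shows that the values land in the range 0 to 1 and that the array stays in half precision.

```python
import numpy as np

# Made-up 8-bit pixel values stored as float16, as in the resizing step.
pixels = np.array([0, 64, 128, 255], dtype="float16")
scaled = pixels / 255.

print(scaled.dtype)                # dividing by a Python float keeps float16
print(scaled.min(), scaled.max())  # smallest and largest scaled value
```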

## Label encode the classes

In [None]:
import keras
from sklearn.preprocessing import LabelEncoder

lb = LabelEncoder()
train_y = lb.fit_transform(train_csv.Class)
train_y = keras.utils.to_categorical(train_y)  # one-hot encode the integer labels
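
The two steps above (integer encoding, then one-hot encoding) can be sketched with NumPy alone. The labels below are made up and use the three class names from the problem statement. The sketch mirrors the sorted-order encoding that `LabelEncoder` produces, but it is not the scikit-learn or Keras code itself.

```python
import numpy as np

# Made-up labels using the three class names from the problem statement.
labels = np.array(["Young", "Old", "Middle", "Young"])

# np.unique returns the sorted class names plus an integer code per sample.
classes, codes = np.unique(labels, return_inverse=True)

# One-hot encode: row i of the identity matrix is the vector for class i.
one_hot = np.eye(len(classes), dtype="float32")[codes]

print(classes.tolist())
print(codes.tolist())
print(one_hot)
```

Because the classes are sorted alphabetically, the column order of the one-hot matrix is fixed by the class names, not by the order in which they appear in the data.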

## Split the training set further into training and held-out (validation) sets

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(train_x, train_y, test_size=0.2, random_state=42)

In [None]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape
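
To illustrate what an 80/20 split does to the array shapes, here is a minimal NumPy sketch on a tiny made-up dataset. It is not scikit-learn's implementation and will not reproduce the same row assignment as `train_test_split`; all variable names are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed so the split is repeatable
n_samples = 10                    # tiny made-up dataset
X = np.arange(n_samples * 2).reshape(n_samples, 2)
y = np.arange(n_samples)

idx = rng.permutation(n_samples)  # shuffle the row indices once
n_val = int(np.ceil(0.2 * n_samples))
val_idx, train_idx = idx[:n_val], idx[n_val:]

# Index features and labels with the same indices so rows stay aligned.
X_tr, X_val = X[train_idx], X[val_idx]
y_tr, y_val = y[train_idx], y[val_idx]

print(X_tr.shape, X_val.shape, y_tr.shape, y_val.shape)
```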

## Data augmentation for the training set (it did not improve performance much, so the code is commented out)

In [None]:
# from keras.preprocessing.image import ImageDataGenerator

# # initialize the training data augmentation object
# trainAug = ImageDataGenerator(
#     featurewise_center=True,
#     featurewise_std_normalization=True,
#     rotation_range=20,
#     width_shift_range=0.2,
#     height_shift_range=0.2,
#     horizontal_flip=True)

In [None]:
# trainAug.fit(X_train)
# trainAug.fit(X_test)

## Setting up the ResNet model using Adrian's code

In [None]:
os.chdir('../input/pyimagesearch/')
from resnet import ResNet

**Defining the learning rate schedule: a small hack that speeds up training. This could be enhanced further with adaptive learning rates, since adaptive LRs eliminate many constraints of LR schedulers. We also define two of the most important training hyperparameters: the batch size (`BS`) and the number of epochs (`NUM_EPOCHS`).**

In [None]:
# define the total number of epochs to train for along with the
# initial learning rate and batch size
NUM_EPOCHS = 50
INIT_LR = 1e-1
BS = 64 #32 is good for CPU

def poly_decay(epoch):
    # initialize the maximum number of epochs, base learning rate,
    # and power of the polynomial
    maxEpochs = NUM_EPOCHS
    baseLR = INIT_LR
    power = 1.0

    # compute the new learning rate based on polynomial decay
    alpha = baseLR * (1 - (epoch / float(maxEpochs))) ** power

    # return the new learning rate
    return alpha
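
To see what this schedule produces, here is a standalone sketch that evaluates the same formula at a few epochs. The function name `poly_decay_demo` and the lowercase constants are introduced only for this example, so the definitions above are left untouched.

```python
# Standalone copy of the schedule so the numbers can be checked in isolation.
num_epochs = 50
init_lr = 1e-1


def poly_decay_demo(epoch, power=1.0):
    """Polynomial decay from init_lr towards 0 over num_epochs epochs."""
    return init_lr * (1 - epoch / float(num_epochs)) ** power


# Epochs are 0-indexed, so the last epoch of a 50-epoch run is 49.
for epoch in (0, 10, 25, 49):
    print(epoch, round(poly_decay_demo(epoch), 4))
```

With `power = 1.0` the decay is linear: the rate starts at the initial value, is halved at the midpoint, and is still slightly above zero at the final epoch.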

### Model initialization

In [None]:
from keras.optimizers import SGD
#from keras.optimizers import Adam
# initialize our ResNet model and compile it

#model = ResNet.build(64, 64, 3, 3, (3, 4, 6), (16, 32, 64, 128), reg=0.0001) # 0.0005
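# The build above was used earlier. The active build below keeps the same stages and
# filters but uses stronger regularization (reg=0.0005); wider variants are commented out.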

model = ResNet.build(64, 64, 3, 3, (3, 4, 6), (16, 32, 64, 128), reg=0.0005)

# model = ResNet.build(64, 64, 3, 3, (3, 4, 6),
# (64, 128, 256, 512), reg=0.0005) # Overfits >_<

# model = ResNet.build(64, 64, 3, 3, (3, 4, 6),
# (32, 64, 128, 256), reg=0.0005)

opt = SGD(lr=INIT_LR, momentum=0.9)
#opt = Adam(lr=0.01) # Trying out Adam 0.1
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

In [None]:
# # Set up CLR
# os.chdir('../input/clr-scripts/')

# from clr_callback import *
# clr_triangular = CyclicLR(mode='triangular')

### Finally, set up the scheduler as a Keras callback and fit the model with it

In [None]:
# Define our set of callbacks and fit the model
from keras.callbacks import LearningRateScheduler
callbacks = [LearningRateScheduler(poly_decay)]



%time H = model.fit(X_train, y_train, batch_size=BS,epochs=NUM_EPOCHS,verbose=1,\
                    validation_data = (X_test, y_test), callbacks=callbacks)

# BS - Epoch tradeoff

# %time H = model.fit(X_train, y_train, batch_size=32,epochs=(len(X_train)//32),verbose=1,\
#                     validation_data = (X_test, y_test)) #callbacks=callbacks) # No callbacks since Adam

# %time H = model.fit_generator(trainAug.flow(X_train, y_train, batch_size=BS),\
#                               steps_per_epoch=len(X_train) // BS, \
#                               validation_data=trainAug.flow(X_test, y_test, batch_size=BS),\
#                               validation_steps=len(X_test) // BS,\
#                               epochs=NUM_EPOCHS,\
#                               callbacks=callbacks)

# Let's try a larger batch size while keeping the number of epochs small and constant
# %time H = model.fit(X_train, y_train, batch_size=2000,epochs=20,verbose=1,\
#                     validation_data = (X_test, y_test), callbacks=[clr_triangular])
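
The batch size versus epoch tradeoff mentioned in the comments above can be put into numbers by counting gradient updates. This is a back-of-the-envelope sketch: the helper `total_updates` and the training-set size are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def total_updates(n_samples, batch_size, epochs):
    """Number of gradient updates when every epoch sees all samples once."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs


n_samples = 16000  # hypothetical training-set size, not the real one
print(total_updates(n_samples, batch_size=64, epochs=50))
print(total_updates(n_samples, batch_size=2000, epochs=20))
```

A much larger batch size with fewer epochs means far fewer weight updates in total, which is one reason such runs are often paired with a different learning rate policy.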

Image augmentation did not help much in this case, so the model is trained without it. The problem itself also suggests that image augmentation is not going to help much here.

### The model takes approximately 51.5 minutes to train on Kaggle Kernels. Let's serialize this model in .h5 format for later use.

In [None]:
os.chdir("/kaggle/working/")
model.save('model_resnet.h5')

In [None]:
!ls -l --block-size=M

### A chart for judging the model.

In [None]:
# Plot the training/validation loss and accuracy per epoch.
import matplotlib.pyplot as plt

# Older Keras versions log accuracy as "acc", newer ones as "accuracy".
acc_key = "acc" if "acc" in H.history else "accuracy"

# Use the number of epochs actually recorded in the history.
N = len(H.history["loss"])
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history[acc_key], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_" + acc_key], label="val_acc")
plt.title("Training Loss and Accuracy on Dataset")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.show()

**The model overfits because of its increased complexity, so let's try a less complex model. Accuracy is not improving much with increased complexity. Let's get back to the baseline and try with greyscaled images.**
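
Overfitting of this kind typically shows up as validation loss rising again while training loss keeps falling. The sketch below reads that pattern from a history dictionary shaped like Keras's `H.history`. The loss values are made up for illustration and are not results from this notebook.

```python
# Made-up per-epoch losses, shaped like the `H.history` dict Keras returns.
history = {
    "loss":     [1.10, 0.80, 0.60, 0.45, 0.33, 0.25],
    "val_loss": [1.05, 0.85, 0.72, 0.70, 0.78, 0.90],
}

# Epoch (0-indexed) where validation loss was lowest.
val_loss = history["val_loss"]
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Gap between validation and training loss at the final epoch; a large
# positive gap after best_epoch is a typical sign of overfitting.
final_gap = history["val_loss"][-1] - history["loss"][-1]
print(best_epoch, round(final_gap, 2))
```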

### Make predictions on the original test data, i.e. `test_x`, and prepare the submission file

In [None]:
pred = model.predict_on_batch(test_x)

In [None]:
pred

In [None]:
# for each image in the testing set we need to find the index of the
# label with corresponding largest predicted probability
predIdxs = np.argmax(pred, axis=1)

In [None]:
pred_transform = lb.inverse_transform(predIdxs) # Transform labels back to original encoding.
test_csv['Class'] = pred_transform
test_csv.to_csv('submission_sayak_resnet.csv', index=False)
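
The prediction steps above (class probabilities, then `argmax`, then mapping indices back to class names) can be traced on a tiny made-up example. The probabilities and the class array below are illustrative assumptions, not output from the trained model.

```python
import numpy as np

# Class names in the sorted order a label encoder would store them (illustrative).
classes = np.array(["Middle", "Old", "Young"])

# Made-up softmax outputs for three images; each row sums to 1.
pred = np.array([
    [0.10, 0.05, 0.85],
    [0.70, 0.20, 0.10],
    [0.25, 0.60, 0.15],
])

pred_idxs = np.argmax(pred, axis=1)  # index of the largest probability per row
pred_labels = classes[pred_idxs]     # map indices back to class names
print(pred_labels.tolist())
```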

In [None]:
!head -5 submission_sayak_resnet.csv