# <center><b>HackerEarth Deep Learning challenge: Identify the dance form</b></center>

<center><img src="https://media-fastly.hackerearth.com/media/hackathon/hackerearth-deep-learning-challenge-identify-dance-form/images/b163aaca99-DanceForm_FB.jpg" height=400 width=700/></center>

<center>Timeline - May 21, 07:30 AM IST - Jul 05, 07:30 AM IST</center>



## Problem statement

This International Dance Day, an event management company organized an evening of Indian classical dance performances to celebrate the rich, eloquent, and elegant art of dance. Post the event, the company planned to create a microsite to promote and raise awareness among the public about these dance forms. However, identifying them from images is a tough nut to crack.


You have been appointed as a Machine Learning Engineer for this project. 
- <font color='red'><b>Build an image tagging Deep Learning model that can help the company classify these images into eight categories of Indian classical dance</b></font>.

## Dataset

The dataset consists of __364 images__ belonging to 8 categories, namely 
- manipuri, 
- bharatanatyam, 
- odissi, 
- kathakali, 
- kathak, 
- sattriya, 
- kuchipudi, and 
- mohiniyattam.



<center><img src="https://qph.fs.quoracdn.net/main-qimg-2ca0fa1346eccd87a882bc1c873e6001.webp"/></center>

## Evaluation Metric
- The evaluation metric for this competition is ```Accuracy```.
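
As a quick illustration of the metric, accuracy is the fraction of images whose predicted label matches the true label. The sketch below uses made-up labels (not competition data), and `accuracy_fraction` is a hypothetical helper name introduced only for this example.

```python
def accuracy_fraction(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels for illustration only.
y_true = ["kathak", "odissi", "manipuri", "kathak"]
y_pred = ["kathak", "odissi", "sattriya", "kathak"]
print(accuracy_fraction(y_true, y_pred))  # 3 of 4 correct -> 0.75
```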

## My approach 

Because the data for this hackathon is very limited, heavy training will cause the model to overfit.

- This is where [pretrained models](https://docs.fast.ai/vision.learner.html) come to the rescue.

- For this dataset I used a [ResNet152 pretrained model](https://www.kaggle.com/pytorch/resnet152).

- The library used is [fastai](https://www.fast.ai/), which is built on top of PyTorch.

## <center><font color='brown'>Using Progressive Resizing Technique</font></center>

<center><img src="https://www.wisdomrobot.com/wp-content/uploads/2017/02/Diagram-Coins-Business-Coin-Bar-Achievement-Chart-18134-960x675.jpg" height=400 width=400/></center>

Basic imports

In [None]:
# To print multiple output in a cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
# pin these versions to avoid repeated warnings
!pip install "torch==1.4" "torchvision==0.5.0"

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

#from albumentations import *
import cv2
import copy
import os
import torch
print(os.listdir("../input"))

import seaborn as sns
import matplotlib.pyplot as plt
import glob

#!pip install pretrainedmodels
from tqdm import tqdm_notebook as tqdm
from torchvision.models import *
#import pretrainedmodels

from pathlib import Path
from fastai.vision import *
from fastai.vision.models import *
from fastai.vision.learner import model_meta
from fastai.callbacks import * 

#from utils import *
import sys

from sklearn.metrics import f1_score, accuracy_score


## Data Driven tasks.

In [None]:
## set the data folder
data_folder=Path('../input/identify-the-dance-form')

print(os.listdir(data_folder))

In [None]:
recompute_scale_factor=True
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

In [None]:
train_data_path = "../input/identify-the-dance-form/train"
train_path = os.path.join(train_data_path, "*jpg")

In [None]:
test_data_path = "../input/identify-the-dance-form/test"
test_path = os.path.join(test_data_path, "*jpg")

## For train data

In [None]:
train_files = glob.glob(train_path)
train_images=[]
for file in train_files:
    image = cv2.imread(file)
#     print(image.shape)
    train_images.append(image)

In [None]:
print(len(train_images))

## For test data

In [None]:
test_files = glob.glob(test_path)
test_images=[]
for file in test_files:
    image = cv2.imread(file)
    print(image.shape)
    test_images.append(image)

In [None]:
print(len(test_images))

The shapes printed by the cell above show that:

- The images come in different sizes, so they need to be resized to a common size before training.
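
One way to confirm this without reading every printed shape is to tally the distinct `(height, width)` pairs. A minimal sketch, assuming the shapes are available as tuples like those returned by `image.shape`; `summarize_shapes` is a hypothetical helper and the shapes below are made up:

```python
from collections import Counter


def summarize_shapes(shapes):
    """Count how often each (height, width) pair occurs, ignoring channels."""
    return Counter((h, w) for h, w, *_ in shapes)


# Made-up shapes in OpenCV's (height, width, channels) order.
example_shapes = [(183, 275, 3), (168, 300, 3), (183, 275, 3), (225, 225, 3)]
counts = summarize_shapes(example_shapes)
print(len(counts), "distinct sizes")
print(counts.most_common(1))
```

With the notebook's lists this could be called as `summarize_shapes(img.shape for img in train_images)`, assuming every file was read successfully.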


In [None]:
## read the csv data files
train_df = pd.read_csv('../input/identify-the-dance-form/train.csv')
test_df = pd.read_csv('../input/identify-the-dance-form/test.csv')

In [None]:
train_df.head(3)
test_df.head(3)

### Encode the target variable

In [None]:
train_df['target']=train_df['target'].map({'mohiniyattam':0,'odissi':1,'kathakali':2,
                                           'bharatanatyam':3,'kuchipudi':4,'sattriya':5,
                                           'kathak':6,'manipuri':7})

In [None]:
train_df.target.value_counts()
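
The submission step later needs the reverse mapping, from integer back to class name. A small sketch of deriving it from a single dictionary so the two directions cannot drift apart; the names `CLASS_TO_ID` and `ID_TO_CLASS` are introduced here for illustration and are not part of the original notebook:

```python
# Same class-to-integer assignment as in the encoding cell above.
CLASS_TO_ID = {
    "mohiniyattam": 0, "odissi": 1, "kathakali": 2, "bharatanatyam": 3,
    "kuchipudi": 4, "sattriya": 5, "kathak": 6, "manipuri": 7,
}

# Invert once instead of typing the reverse dictionary by hand.
ID_TO_CLASS = {idx: name for name, idx in CLASS_TO_ID.items()}

# Sanity checks: ids are unique and the round trip is lossless.
assert len(ID_TO_CLASS) == len(CLASS_TO_ID)
assert all(ID_TO_CLASS[CLASS_TO_ID[name]] == name for name in CLASS_TO_ID)

print(ID_TO_CLASS[3])
```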

- As we can see, there is very little training data.

Training a model from scratch is therefore not feasible. This is where pretrained models come into the picture, so we are going to use different pretrained models with the ```fastai``` library.

## <center>Progressive Resizing</center>

> *Progressive resizing is a technique for building CNNs that can be very helpful during the training and optimization phases of a machine learning project. The technique appears most prominently in Jeremy Howard’s work, and he uses it to good effect throughout his terrific fast.ai course. (Course Part 1, Lecture 3)*

- It is the technique of sequentially resizing the images while training the CNN, going from smaller to larger image sizes.

- The best way to use this technique is to train a model on a small image size, say ```128x128```, then use the weights of this model to train another model on images of size ```256x256```, ```512x512```, and so on.

- Each larger-scale model incorporates the layers and weights of the previous smaller-scale model in its architecture.
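
The bullets above can be sketched as a simple stage schedule. This is only an illustration of the idea, not fastai code: `Stage` and `progressive_schedule` are hypothetical names, and halving the batch size each time the image side doubles is an assumption made to keep GPU memory in check (it happens to match the 32/16/8 batch sizes used later in this notebook).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    size: int        # square image side in pixels
    batch_size: int  # images per batch at this size


def progressive_schedule(start_size=128, n_stages=3, start_bs=32):
    """Double the image side and halve the batch size at every stage."""
    stages = []
    size, bs = start_size, start_bs
    for _ in range(n_stages):
        stages.append(Stage(size=size, batch_size=max(1, bs)))
        size, bs = size * 2, bs // 2
    return stages


for stage in progressive_schedule():
    # In a real run, each stage would rebuild the data at `stage.size`,
    # reuse the weights learned so far, train the head, then fine-tune.
    print(stage)
```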



## Data Transformation

We will apply the same transformations at every stage of progressive resizing.

In [None]:
##transformations to be done to images

tfms = get_transforms(do_flip=True,flip_vert=False ,max_rotate=10.0, max_zoom=1.22, max_lighting=0.22, max_warp=0.4, p_affine=0.75,
                      p_lighting=0.75)


test_img = ImageList.from_df(test_df, path=data_folder, folder='test')

In [None]:
## create source of train image databunch
np.random.seed(45)

src = (ImageList.from_df(train_df, path=data_folder, folder='train')
       .split_by_rand_pct(0.2)
       #.split_none()
       .label_from_df()
       .add_test(test_img))

In [None]:
# considering image size of 128

data = (src.transform(tfms, size=128,padding_mode='reflection',resize_method=ResizeMethod.SQUISH)
        .databunch(path='.', bs=32, device= torch.device('cuda:0')).normalize(imagenet_stats));

In [None]:
print(data.classes)
data.show_batch(rows=3, figsize=(7,7))

### Create a ```Learner```.

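Previously I used ```resnet152``` as the base architecture and it performed well, so I will try it here too. Note that the cell below actually instantiates ```models.resnet50```, while the saved checkpoint names keep the ```152``` label.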

In [None]:
# acc_02 = partial(accuracy_thresh, thresh=0.2)
# f_score = partial(fbeta, thresh=0.2)

In [None]:
#lets create learner. tried with resnet152, densenet201, resnet101
# learn = cnn_learner(data=data, base_arch=models.resnet152, metrics=[FBeta(beta=1, average='macro'), accuracy],
#                     callback_fns=ShowGraph).mixup()

# will train first without mixup

learn = cnn_learner(data=data, base_arch=models.resnet50, metrics=[FBeta(beta=1, average='macro'), accuracy],
                    callback_fns=ShowGraph)



- We use the __LR Finder__ to pick a good ```learning rate```.
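
For intuition, an LR range test trains for a few iterations while the learning rate grows exponentially, and a good value is then read off the loss curve. The sketch below is not fastai's implementation: `lr_sweep` and `steepest_descent_lr` are hypothetical helpers, the loss values are synthetic, and picking the point of steepest loss decrease is just one common heuristic.

```python
import math


def lr_sweep(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Exponentially spaced learning rates from start_lr to end_lr."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]


def steepest_descent_lr(lrs, losses):
    """Return the learning rate where the loss falls fastest per log-lr step."""
    best_i, best_slope = 0, 0.0
    for i in range(1, len(lrs)):
        slope = (losses[i] - losses[i - 1]) / (math.log(lrs[i]) - math.log(lrs[i - 1]))
        if slope < best_slope:
            best_i, best_slope = i, slope
    return lrs[best_i]


lrs = lr_sweep()
# Synthetic curve: flat at tiny rates, dropping around 1e-3, exploding past 1e-1.
losses = [2.0 - 1.0 / (1 + math.exp(-3 * (math.log10(lr) + 3)))
          + max(0.0, math.log10(lr) + 1) ** 2 for lr in lrs]
print(f"suggested lr ~ {steepest_descent_lr(lrs, losses):.1e}")
```

On a real plot from `learn.recorder.plot()`, the same reading applies: choose a rate on the steep downward part of the curve, well before the loss starts to climb.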

In [None]:
learn.fit_one_cycle(5)

In [None]:
learn.lr_find()
learn.recorder.plot()

Now we can fit the head of our network.

In [None]:
# lr=1e-03

In [None]:
learn.fit_one_cycle(10, max_lr=1e-03)

# learn.fit_one_cycle(5, slice(lr))

In [None]:
learn.fit_one_cycle(10, max_lr=1e-04)

In [None]:
learn.save('stage-1-resnet-152-img_size-128')

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(dpi=120)

In [None]:
interp.plot_top_losses(9, figsize=(15,11))


Next, fine-tune the whole model:

In [None]:
learn.unfreeze()

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
# previously it was trained with 1e-4
learn.fit_one_cycle(10, slice(1e-4),wd=0.1)
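
The `slice(...)` argument sets a learning rate per layer group. The plain-Python sketch below mirrors my understanding of how fastai v1 expands it and is not the library's code: `lrs_from_slice` is a hypothetical name, and three layer groups is an assumption about how `cnn_learner` splits a ResNet.

```python
def lrs_from_slice(lr, n_groups=3):
    """Expand slice(stop) or slice(start, stop) into per-group learning rates."""
    if lr.start is None:
        # slice(stop): the head gets `stop`, earlier groups get stop / 10.
        return [lr.stop / 10] * (n_groups - 1) + [lr.stop]
    # slice(start, stop): geometric spacing from the first group to the head.
    step = (lr.stop / lr.start) ** (1 / (n_groups - 1))
    return [lr.start * step ** i for i in range(n_groups)]


print(lrs_from_slice(slice(1e-4)))        # earlier groups ~1e-5, head 1e-4
print(lrs_from_slice(slice(1e-5, 1e-3)))  # roughly 1e-5, 1e-4, 1e-3
```

The usual pattern is therefore `slice(small_lr, larger_lr)`, so that the pretrained early layers change slowly while the head learns faster.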

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(dpi=120)

In [None]:
learn.save('stage-2-rn152')

Using the weights of the previous model, we now train again at the new image size of ```256x256```.

In [None]:
# considering image size of 256
data = (src.transform(tfms, size=256,padding_mode='reflection',resize_method=ResizeMethod.SQUISH)
        .databunch(path='.', bs=16, device= torch.device('cuda:0')).normalize(imagenet_stats));

learn.data = data
data.train_ds[0][0].shape

In [None]:
# The previous stage unfroze the whole model, so freeze it again
# to train only the last layers first
learn.freeze()

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
lr=8e-5

# lr=3e-06

In [None]:
# the model seems to overfit, so try weight decay (wd=0.1)
learn.fit_one_cycle(10, slice(lr),wd=0.1)

In [None]:
learn.save('stage-1-256-rn152')

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(dpi=120)

In [None]:
torch.cuda.empty_cache()

In [None]:
learn.unfreeze()

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
# lr=1e-05

# lr=1e-04

In [None]:
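# note: lr is still 8e-5 here, so lr/5 = 1.6e-5 is below the start value 1e-4
# and the head gets the smaller rate; the usual pattern is slice(small_lr, larger_lr)
learn.fit_one_cycle(10, slice(1e-4, lr/5),wd=0.2)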

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(dpi=120)

In [None]:
learn.save('stage-2-256-rn152')

In [None]:
# considering image size of 512
data = (src.transform(tfms, size=512,padding_mode='reflection',resize_method=ResizeMethod.SQUISH)
        .databunch(path='.', bs=8, device= torch.device('cuda:0')).normalize(imagenet_stats));

learn.data = data
data.train_ds[0][0].shape

In [None]:
learn.freeze()

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
# lr=1e-03

lr=3e-04

In [None]:
# learn.fit_one_cycle(15, slice(5e-4, lr/5))

learn.fit_one_cycle(10, slice(lr),wd=0.1)

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(dpi=120)

In [None]:
learn.save('stage-1-512-rn152')

In [None]:
learn.unfreeze()

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
# lr=1e-05
lr=3e-06

In [None]:
# note: lr/10 = 3e-7 is below the start value 3e-6, so the head gets the smaller rate
learn.fit_one_cycle(10, slice(3e-06, lr/10),wd=0.1)

# As a next step, you can try running for only 10 epochs to avoid overfitting.

In [None]:
learn.save('stage-2-512-rn152')

### Confusion Matrix Check

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(dpi=120)

## Accuracy Check

In [None]:
## learn.TTA can improve the score further; let's check it on the validation set
pred_val, y = learn.TTA(ds_type=DatasetType.Valid)
valid_preds = [np.argmax(pred_val[i]) for i in range(len(pred_val))]
valid_preds = np.array(valid_preds)
y = np.array(y)
# scikit-learn metrics expect (y_true, y_pred); the order matters for the weighted F1
accuracy_score(y, valid_preds), f1_score(y, valid_preds, average='weighted')
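
Conceptually, test-time augmentation predicts on several augmented copies of each image and combines the results. A minimal sketch of that combination step, assuming class probabilities are already available as plain lists; `blend_tta` is a hypothetical helper, the numbers are made up, and the `beta` weighting reflects my understanding of fastai v1's default rather than its actual code:

```python
def blend_tta(plain, augmented, beta=0.4):
    """Blend plain predictions with the mean over augmented predictions."""
    n_aug = len(augmented)
    blended = []
    for c, p in enumerate(plain):
        avg = sum(aug[c] for aug in augmented) / n_aug
        blended.append(beta * p + (1 - beta) * avg)
    return blended


# One image, three classes, two augmented views (made-up probabilities).
plain = [0.50, 0.30, 0.20]
augmented = [[0.20, 0.60, 0.20], [0.30, 0.50, 0.20]]
probs = blend_tta(plain, augmented)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted)
```

In this toy case the plain prediction favours class 0, while the blended prediction favours class 1, which is the kind of correction TTA can provide.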

In [None]:
# preds,y = learn.TTA(ds_type=DatasetType.Test)
preds,_ = learn.get_preds(ds_type = DatasetType.Test)
labelled_preds = [np.argmax(preds[i]) for i in range(len(preds))]

labelled_preds = np.array(labelled_preds)

## Create final submissions.

In [None]:
#create submission file
df = pd.DataFrame({'Image':test_df['Image'], 'target':labelled_preds}, columns=['Image', 'target'])

df['target']=df['target'].map({0:'mohiniyattam',1:'odissi',2:'kathakali',
                                           3:'bharatanatyam',4:'kuchipudi',5:'sattriya',
                                           6:'kathak',7:'manipuri'})

df.head()

df.to_csv('submission_mode_resnet-Stage2_512_new.csv', index=False)

Key points:

- The final submission above uses image_size=512.
- The model trained at image_size=512 is not necessarily the best one: models trained at other sizes also performed well, and sometimes they outperformed the model trained at the larger image size.

- While training you will try multiple techniques; the most important thing is to keep track of all your parameters and models.
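
One lightweight way to keep track is to append every run's settings and score to a CSV file. A sketch using only the standard library; the field names, the file name, and the example values are placeholders, not results from this notebook:

```python
import csv
import os

FIELDS = ["checkpoint", "arch", "img_size", "batch_size", "max_lr", "wd", "val_accuracy"]


def log_run(path, **run):
    """Append one training run to a CSV file, writing the header once."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(run)


# Placeholder values for illustration only.
log_run("runs.csv", checkpoint="stage-1-128", arch="resnet50", img_size=128,
        batch_size=32, max_lr=1e-3, wd=0.1,
        val_accuracy=None)  # fill in after validation
```

Logging the checkpoint name alongside the hyperparameters makes it easy to reload the best model later with `learn.load(...)`.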

## Note: 

<center>
 - If this kernel helped you:

    - Do upvote
    - Do follow
    - If you have any questions, use the comment section.
</center>

<img src='https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTkN7ooAwGVuRCg_9axVg1XzVLLvb_e28PR_w&usqp=CAU/'>