## TOC
- [Imports](#Imports)
- [Preparations](#Preparations)
- [Training](#Training)
- [Visualization](#Visualization)
- [Finishing](#Finishing)

## Imports
> In this section we import the required libraries:  
> `json`: to parse the JSON file.  
> `pandas`: to load the parsed JSON into a dataframe.  
> `fastai`: the library used to train our model.

## Preparations
> In this section we open and parse the JSON file that contains the entries to train our model on, and load them into a dataframe. We then prepare a `DataBlock`, a high-level API that tells FastAI how to handle our data, including how to split it into training and validation sets.

In [None]:
import json
import pandas as pd
from fastai.vision.all import *

In [None]:
dataset_root = '/kaggle/input/0x2ahack-data'

# Load the entries (image path plus label) to train on.
with open(dataset_root + '/cleanResult.json') as f:
    data = json.load(f)
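
> As a rough sketch of the shape the rest of the notebook relies on, the snippet below parses a small in-memory sample and checks that each entry has the two keys used later (`image_path` and `is_student`). The sample entries, their file names, and the boolean label type are assumptions made for illustration; they are not taken from the real `cleanResult.json`.

```python
import json

# Hypothetical sample mirroring the assumed layout of cleanResult.json:
# a list of objects, each with an image path and a label.
sample_json = '''
[
  {"image_path": "images/a.jpg", "is_student": true},
  {"image_path": "images/b.jpg", "is_student": false}
]
'''

sample = json.loads(sample_json)

# Every entry must provide the two keys read later by the DataBlock.
required_keys = {"image_path", "is_student"}
missing = [entry for entry in sample if not required_keys.issubset(entry)]

print(len(sample), "entries,", len(missing), "missing required keys")
```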

> Before we pass the entries to FastAI, we need to clean the data further: the `verify_image` function checks that each file is a valid image that FastAI can open.

In [None]:
# Keep only the entries whose image file can be opened.
verifiedData = [x for x in data if verify_image(x['image_path'])]
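
> It can help to know how many entries the filter removes. The sketch below partitions entries into kept and dropped lists using any predicate. The helper `split_valid` and the stand-in predicate `looks_like_jpeg` are hypothetical names used for illustration; in the notebook the predicate would be `verify_image`.

```python
def split_valid(entries, is_valid):
    """Partition entries into (kept, dropped) using the is_valid predicate."""
    kept, dropped = [], []
    for entry in entries:
        (kept if is_valid(entry["image_path"]) else dropped).append(entry)
    return kept, dropped

# Stand-in predicate for illustration only; it just checks the extension,
# whereas verify_image actually tries to open the file as an image.
def looks_like_jpeg(path):
    return path.lower().endswith((".jpg", ".jpeg"))

entries = [
    {"image_path": "a.jpg", "is_student": True},
    {"image_path": "notes.txt", "is_student": False},
    {"image_path": "b.JPEG", "is_student": False},
]

kept, dropped = split_valid(entries, looks_like_jpeg)
print(f"kept {len(kept)} of {len(entries)}; dropped: {[e['image_path'] for e in dropped]}")
```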

> Then we load the verified entries into a dataframe and describe them to FastAI with a `DataBlock`:

In [None]:
df = pd.DataFrame(verifiedData)
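dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),               # image input, categorical label
    get_x=ColReader('image_path'),                    # column holding the image path
    get_y=ColReader('is_student'),                    # column holding the label
    splitter=RandomSplitter(valid_pct=0.2, seed=42),  # 80/20 train/validation split
    item_tfms=Resize(128),                            # resize each image to 128x128
    batch_tfms=aug_transforms(mult=2)                 # batch-level data augmentation
)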
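
> To make the idea behind the seeded 80/20 split concrete, here is a minimal sketch using only the standard library. It is not FastAI's implementation of `RandomSplitter` (which will pick different indices); the function `random_split` is a hypothetical helper that only illustrates why a fixed seed gives the same split on every run.

```python
import random

def random_split(n_items, valid_pct=0.2, seed=42):
    """Return (train_idxs, valid_idxs) for a seeded random split."""
    idxs = list(range(n_items))
    random.Random(seed).shuffle(idxs)   # same seed, same shuffle
    n_valid = int(n_items * valid_pct)
    return idxs[n_valid:], idxs[:n_valid]

train_idxs, valid_idxs = random_split(10)
print(len(train_idxs), "training items,", len(valid_idxs), "validation items")
```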

> Now we load our dataframe into FastAI, which splits it into batches for more efficient access during training. Using `vision_learner`, we then create a learner from a pretrained resnet34, a convolutional network commonly used for image classification, together with the data provided.

In [None]:
dls = dblock.dataloaders(df, bs=32)
dls.show_batch(max_n=6, nrows=2, unique=False)
learner = vision_learner(dls, models.resnet34, pretrained=True)
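
> As a small illustration of what splitting into batches of 32 means, the sketch below chunks a list of stand-in samples. The generator `batches` is a hypothetical helper; a real training loader also shuffles the data and may treat the final partial batch differently.

```python
def batches(items, bs=32):
    """Yield consecutive chunks of at most bs items."""
    for start in range(0, len(items), bs):
        yield items[start:start + bs]

items = list(range(100))   # stand-in for 100 training samples
sizes = [len(batch) for batch in batches(items)]
print(sizes)
```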

In [None]:
learner.summary()

> The model learns by stepping through batches of data and adjusting its parameters after each one. How far the parameters move on each step is set by the learning rate, which affects the quality of the final model, so it needs to be chosen carefully. Finding a good value by hand takes a lot of trial and error; thankfully FastAI has a function called `lr_find`, which briefly trains with increasing learning rates, plots the loss against them, and suggests a reasonable value.

In [None]:
learner.lr_find()
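
> To see why the step size matters, here is a toy sketch that runs plain gradient descent on `f(w) = w**2` with three learning rates. It is unrelated to resnet34 or to how `lr_find` works internally; the function `descend` is a hypothetical helper that only shows that a rate that is too small barely moves, a moderate one converges, and one that is too large diverges.

```python
def descend(lr, steps=20, w=1.0):
    """Run gradient descent on f(w) = w**2 and return the final w."""
    for _ in range(steps):
        grad = 2 * w        # derivative of w**2
        w = w - lr * grad   # the learning rate scales each step
    return w

for lr in (0.001, 0.1, 1.1):
    print(f"lr={lr}: final w = {descend(lr):.4f}")
```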

## Training
> Now we fine-tune the pretrained resnet34 model on our data so that it learns the features in our images and adjusts its parameters, which are then used to make predictions.

In [None]:
learner.fine_tune(4, base_lr=0.0030199517495930195)

> After training is done, we show a few of the trained model's predictions on the validation data.

In [None]:
learner.show_results()

## Visualization
> Now, to assess the accuracy of our trained model, we plot a confusion matrix, which shows the counts of correct and incorrect predictions on our validation set for each class.

In [None]:
interp = ClassificationInterpretation.from_learner(learner)
interp.plot_confusion_matrix()
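
> The numbers in a confusion matrix are simply counts of (actual, predicted) pairs. The sketch below computes them by hand, along with the overall accuracy. The labels and predictions are made up for illustration and are not results from this model.

```python
from collections import Counter

# Made-up validation labels and predictions, for illustration only.
actual    = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

# Count each (actual, predicted) pair; these are the cells of the matrix.
cells = Counter(zip(actual, predicted))

# Predictions on the diagonal (actual == predicted) are the correct ones.
correct = sum(n for (a, p), n in cells.items() if a == p)
accuracy = correct / len(actual)

for a in (True, False):
    print("actual", a, [cells[(a, p)] for p in (True, False)])
print("accuracy:", round(accuracy, 3))
```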

In [None]:
interp.plot_top_losses(6, figsize=(25, 5))
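
> Top losses are the validation samples the model got most wrong. As a minimal sketch, assuming a cross-entropy loss, each sample's loss is the negative log of the probability the model assigned to the true class, so confident mistakes score highest. The file names and probabilities below are made up.

```python
import math

# Made-up samples: (image name, probability the model gave to the true class).
samples = [("a.jpg", 0.95), ("b.jpg", 0.10), ("c.jpg", 0.60), ("d.jpg", 0.02)]

# Cross-entropy loss for one sample is -log(probability of the true class).
losses = [(name, -math.log(p)) for name, p in samples]

# Highest loss first: the samples the model was most wrong about.
top = sorted(losses, key=lambda item: item[1], reverse=True)[:2]
for name, loss in top:
    print(f"{name}: loss = {loss:.2f}")
```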

## Finishing
> The last step is to export the model to a pickle file, which includes everything necessary to make predictions offline.

In [None]:
learner.export('trained_model.pkl')
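
> Conceptually, exporting serializes an object to bytes and loading restores it. The sketch below shows that round trip with the standard `pickle` module on a plain dictionary. The dictionary `model_state` is a made-up stand-in, not the actual contents of `trained_model.pkl`; with FastAI, the exported file would be read back with `load_learner`.

```python
import io
import pickle

# Stand-in for a trained model: any picklable Python object.
model_state = {"classes": [False, True], "weights": [0.12, -0.7, 1.5]}

buffer = io.BytesIO()
pickle.dump(model_state, buffer)   # serialize, as exporting does conceptually

buffer.seek(0)
restored = pickle.load(buffer)     # deserialize, as loading does conceptually
print(restored == model_state)
```

> Because unpickling can execute arbitrary code, only load pickle files from sources you trust.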