**Thanks for checking out this notebook.**

If you find it helpful, please show your support by upvoting it.

# Understanding the dataset
**Let us list the contents of the dataset**

In [None]:
import os
print("The dataset contains the following directories and files:")
print(*os.listdir("../input/dogs-vs-cats-redux-kernels-edition"), sep="\n")

# 1. sample_submission.csv
**This csv file is an example of the predictions file that your model must produce and upload for this competition's evaluation.**

# 2. train.zip
**This zip file contains the training data which will be used for training our model.**

# 3. test.zip
**This zip file contains the test data on which our trained model will make its predictions.**

# Importing necessary libraries
**Import Pandas for data analysis and Fastai for training the deep learning model and making predictions.**

**Note: We are using Fastai2 (course version 4), which is the latest version, not the previous version of Fastai (version 1).**

In [None]:
import glob
import numpy as np
import pandas as pd
import fastai
from fastai.vision.all import *

# Defining the variables and assigning the paths for this notebook


In [None]:
path = Path('../input/dogs-vs-cats-redux-kernels-edition')
image_path = Path('./train')

sample_submission_file = Path('../input/dogs-vs-cats-redux-kernels-edition/sample_submission.csv')

# Checking the data for model training
**We will now extract the training images from "train.zip" so that we can check them:**

In [None]:
import zipfile
with zipfile.ZipFile("../input/dogs-vs-cats-redux-kernels-edition/train.zip","r") as x:
    x.extractall("./")
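Before extracting a large archive, it can help to peek at what is inside. The sketch below uses only the standard library. `summarize_zip` is a hypothetical helper, and the tiny in-memory archive merely stands in for "train.zip" so that the snippet runs on its own.

```python
import io
import zipfile
from collections import Counter

def summarize_zip(zip_source):
    """Count archive members by file extension without extracting them."""
    with zipfile.ZipFile(zip_source, "r") as archive:
        # Skip directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    # The extension is whatever follows the last dot, lower-cased.
    return Counter(n.rsplit(".", 1)[-1].lower() for n in names)

# Stand-in archive with three empty "images"; pass the real train.zip path instead.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ["train/cat.0.jpg", "train/dog.0.jpg", "train/dog.1.jpg"]:
        archive.writestr(name, b"")
buffer.seek(0)

zip_summary = summarize_zip(buffer)
print(zip_summary)
```

Passing the path of the real archive in place of `buffer` should report how many ".jpg" files it holds before anything is written to disk.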

# Training data
**Let us look at the training data extracted from "train.zip" into the train folder**

In [None]:
print("Number of images in the train folder:")
len(glob.glob('./train/*.jpg'))

In [None]:
print("The first few training samples are as follows: \n")
Path('./train').ls(5)

# Creating the Pandas Dataframe to be used for creating an ImageDataLoader

In [None]:
train_img_files = get_image_files('./train')

In [None]:
img_file_names = np.array([f'{x}' for x in sorted(train_img_files)])
img_file_names

In [None]:
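# Label 1 for dogs and 0 for cats, based on the file name prefix (e.g. "dog.0.jpg")
label_names = np.array([(1 if Path(file_name).name.startswith('dog') else 0) for file_name in img_file_names])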
label_names
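The labelling rule used here (dog = 1, cat = 0, read from the file name) can be sketched in plain Python. `label_from_path` is a hypothetical helper, and the example paths are made up. It looks only at the file name, so a directory that happens to contain "dog" does not affect the result.

```python
from pathlib import PurePosixPath

def label_from_path(file_path):
    """Return 1 for a dog image and 0 for a cat image, using only the file name."""
    prefix = PurePosixPath(file_path).name.split(".")[0]
    if prefix not in ("dog", "cat"):
        raise ValueError(f"Unexpected file name: {file_path}")
    return 1 if prefix == "dog" else 0

example_paths = ["train/cat.0.jpg", "train/dog.0.jpg", "dogs-vs-cats/train/cat.7.jpg"]
example_labels = [label_from_path(p) for p in example_paths]
print(example_labels)  # [0, 1, 0]
```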

In [None]:
train_df = pd.DataFrame(img_file_names, columns=['id'])
train_df['label'] = label_names
print("Size of Training data \n", train_df.shape)
print("----------------------------------------------------------")
print("\nFirst few samples of data are \n",train_df.head())
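It is worth confirming that the two classes are roughly balanced before training. A minimal sketch with made-up labels follows. `class_balance` is a hypothetical helper, and on the real data `train_df['label'].value_counts(normalize=True)` should give the same information.

```python
from collections import Counter

def class_balance(labels):
    """Return the fraction of samples that belong to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

toy_labels = [0, 1, 1, 0, 1, 0, 0, 1]  # made-up labels: four cats, four dogs
balance = class_balance(toy_labels)
print(balance)  # {0: 0.5, 1: 0.5}
```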

# Creating the image data loader

In [None]:
dls = ImageDataLoaders.from_df(
    train_df, path='.',
    valid_pct=0.2, seed=42,   # hold out 20% of the images for validation
    fn_col='id',
    label_col='label',
    label_delim=None,
    item_tfms=Resize(128), batch_tfms=None,
    bs=64,
)
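The `valid_pct=0.2` and `seed=42` arguments ask for a reproducible random 80/20 split. The sketch below illustrates the idea with the standard library only. `random_split` is a hypothetical helper, 25000 is an assumed training-set size, and fastai uses its own random number generator, so the actual indices will differ.

```python
import random

def random_split(n_items, valid_pct=0.2, seed=None):
    """Shuffle the indices 0..n_items-1 and hold out a valid_pct share for validation."""
    rng = random.Random(seed)  # a fixed seed makes the shuffle repeatable
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = int(valid_pct * n_items)
    return indices[cut:], indices[:cut]  # (train indices, validation indices)

train_idx, valid_idx = random_split(25000, valid_pct=0.2, seed=42)
print(len(train_idx), len(valid_idx))  # 20000 5000
```

Calling the function twice with the same seed returns the same split, which is what makes validation scores comparable between runs.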

**Let us check the device type of our "ImageDataLoader" to make sure that we are using "GPU"**

In [None]:
dls.device

**Let us check a few random images from a batch of our ImageDataLoader to make sure that the images and labels appear correctly.**


In [None]:
dls.show_batch()

In [None]:
len(dls.train_ds), len(dls.valid_ds)

# Training the image recognizer model

We create a CNN (convolutional neural network) with the following details:

What data do we want to train it on? <br> Our training data is "dls"

Which architecture should we use? <br> We are using Resnet34

Which metric should we use to evaluate training? <br> We have specified it as "error_rate"
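The error rate is the fraction of validation images that the model classifies incorrectly, i.e. 1 minus the accuracy. A plain-Python sketch follows. `error_rate_sketch` is a hypothetical stand-in for illustration, not fastai's implementation, which works on tensors.

```python
def error_rate_sketch(predicted_labels, true_labels):
    """Fraction of predictions that differ from the true labels (1 - accuracy)."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("Both sequences must have the same length")
    wrong = sum(p != t for p, t in zip(predicted_labels, true_labels))
    return wrong / len(true_labels)

toy_error = error_rate_sketch([1, 0, 1, 1], [1, 0, 0, 1])  # one mistake out of four
print(toy_error)  # 0.25
```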



In [None]:
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(4,freeze_epochs = 4)
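As far as I understand it, `fine_tune` trains in two phases: first with the pretrained body frozen for `freeze_epochs`, then with all layers unfrozen for `epochs` at lower learning rates. The sketch below only describes that plan and does not train anything. `fine_tune_plan` is a hypothetical helper, and the defaults `base_lr=2e-3` and `lr_mult=100` are my recollection of fastai's defaults, so treat them as assumptions.

```python
def fine_tune_plan(epochs, freeze_epochs=1, base_lr=2e-3, lr_mult=100):
    """Describe the two phases of a fine_tune call as plain dictionaries."""
    frozen = {"phase": "frozen (head only)", "epochs": freeze_epochs, "lr_max": base_lr}
    base_lr = base_lr / 2  # the unfrozen phase starts from half the base rate
    unfrozen = {
        "phase": "unfrozen (all layers)",
        "epochs": epochs,
        # range of rates, from the earliest layer group to the last one
        "lr_max": (base_lr / lr_mult, base_lr),
    }
    return [frozen, unfrozen]

plan = fine_tune_plan(4, freeze_epochs=4)
for step in plan:
    print(step)
```

Under these assumptions, the call in this notebook would run 4 frozen epochs followed by 4 unfrozen epochs, for 8 epochs in total.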

In [None]:
learn.fit_one_cycle(6,lr_max=slice(1e-6,1e-4))
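Passing `slice(1e-6,1e-4)` as `lr_max` requests discriminative learning rates: the earliest layers get the smallest rate and the final layers the largest. To my knowledge, fastai spaces the rates between the two ends by a constant multiplier. The sketch below reproduces that spacing. `spread_learning_rates` is a hypothetical helper, and the choice of three parameter groups is an assumption.

```python
def spread_learning_rates(start, stop, n_groups):
    """Return n_groups rates from start to stop, each a constant multiple of the previous one."""
    if n_groups == 1:
        return [stop]
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]

group_lrs = spread_learning_rates(1e-6, 1e-4, 3)
print(group_lrs)  # roughly 1e-6, 1e-5 and 1e-4, up to floating-point rounding
```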

# Bring it on - Test data !!

# Test data
**Let us look at the test data extracted from "test.zip" into the test folder**

In [None]:
!rm -r './train'
with zipfile.ZipFile("../input/dogs-vs-cats-redux-kernels-edition/test.zip","r") as y:
    y.extractall("./")

In [None]:
print("Number of images in the test folder:")
len(glob.glob('./test/*.jpg'))

In [None]:
print("The first few test samples are as follows: \n")
Path('./test').ls(5)

**Defining the variables and assigning the paths for test dataset**

In [None]:
test_image_files = get_image_files('./test')

**Let us create a test DataLoader from our test data set**

In [None]:
fnames = [f.name for f in test_image_files]
tst_dl = dls.test_dl(test_image_files)

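Make the predictions using our trained model, "learn".

With "with_decoded=True", "get_preds" returns three outputs: the predicted probabilities, the targets (there are none for test data), and the decoded class predictions. We ignore the first two and keep the decoded predictions in the variable "preds".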


In [None]:
_,_,preds = learn.get_preds(dl=tst_dl, with_decoded=True)

sub = pd.DataFrame(fnames, columns=['id'])
sub['label'] = preds
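If the competition scores submissions with log loss (an assumption here, so check the competition's evaluation page), then hard 0/1 labels are penalised far more heavily for a mistake than probabilities are. The sketch below uses made-up numbers to show the effect. `binary_log_loss` is a hypothetical helper, not a library function.

```python
import math

def binary_log_loss(true_labels, probabilities, eps=1e-15):
    """Mean negative log-likelihood of the true labels under the predicted probabilities."""
    total = 0.0
    for y, p in zip(true_labels, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(true_labels)

truth = [1, 0, 1, 0]
soft = binary_log_loss(truth, [0.9, 0.1, 0.4, 0.2])  # third image is a soft mistake
hard = binary_log_loss(truth, [1, 0, 0, 0])          # same mistake as a hard label
print(round(soft, 3), round(hard, 3))
```

In that case, submitting the first output of `get_preds` (the probabilities) rather than the decoded labels may be worth trying.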

In [None]:
!rm -r './test'

In [None]:
sample_sub = pd.read_csv(sample_submission_file)

In [None]:
sub.to_csv('submission.csv', index=False)
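Before uploading, it is worth comparing "submission.csv" with the sample submission loaded above. The sketch below assumes that the sample file lists plain integer ids (inspect `sample_sub` to confirm this) and shows one way to turn file names such as "10.jpg" into numeric ids sorted in ascending order. `submission_rows` is a hypothetical helper, and the file names and labels are made up.

```python
import csv
import io
from pathlib import PurePosixPath

def submission_rows(file_names, labels):
    """Pair each numeric file stem with its prediction, sorted by id."""
    rows = [(int(PurePosixPath(name).stem), label) for name, label in zip(file_names, labels)]
    return sorted(rows)

rows = submission_rows(["10.jpg", "2.jpg", "1.jpg"], [1, 0, 1])

# Write to an in-memory buffer here; a real run would open 'submission.csv' instead.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["id", "label"])
writer.writerows(rows)
print(out.getvalue())
```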