<a href="https://colab.research.google.com/github/SupreethRao99/Kaggle/blob/master/DeepCAPTCHA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DeepCAPTCHA
DeepCAPTCHA is a convolutional neural network (CNN) based on the ResNet architecture and trained on the [Chars74K-Fonts](http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/#download) dataset. It was built as part of a larger project that attempts to defeat simple CAPTCHAs. The network trained in this notebook achieves a training accuracy of 88% and a validation accuracy of 87%.

In [2]:
# importing the required libraries
from google.colab import drive
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import zipfile
import os
import random
from shutil import copyfile
import datetime

In [1]:
!nvidia-smi #displays the GPU allocated by google colab

Sun Apr  4 16:36:55 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   34C    P0    25W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

The dataset is stored on Google Drive, then loaded into Colab and unzipped. Training and testing directories are created for each class present in the dataset.

In [4]:
drive.mount('/content/drive') 

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [5]:
# Extract the dataset archive from Drive into /tmp; the context manager closes the file
with zipfile.ZipFile("/content/drive/MyDrive/English.zip", 'r') as zip_ref:
    zip_ref.extractall("/tmp")

In [6]:
import string

# One class directory per character: digits 0-9, uppercase A-Z, lowercase a-z (62 classes)
class_names = string.digits + string.ascii_uppercase + string.ascii_lowercase
for split in ('training', 'testing'):
  for name in class_names:
    # makedirs also creates /tmp/CAPTCHA and the split directory;
    # exist_ok=True keeps the cell safe to re-run
    os.makedirs(os.path.join('/tmp/CAPTCHA', split, name), exist_ok=True)

The `split_data` function randomly splits the dataset into training and testing sets. The `SPLIT_SIZE` parameter sets the fraction of images assigned to the training set; the remainder goes to the testing set.

In [7]:
from random import shuffle
import shutil

def split_data(SOURCE, TRAINING, TESTING, SPLIT_SIZE):
    all_images = os.listdir(SOURCE)
    shuffle(all_images)
    splitting_index = round(SPLIT_SIZE*len(all_images))
    train_images = all_images[:splitting_index]
    test_images = all_images[splitting_index:]

    for img in train_images:
        src = os.path.join(SOURCE, img)
        dst = os.path.join(TRAINING, img)
        if os.path.getsize(src) <= 0:
            print(img+" is zero length, so ignoring!!")
        else:
            shutil.copyfile(src, dst)

    for img in test_images:
        src = os.path.join(SOURCE, img)
        dst = os.path.join(TESTING, img)
        if os.path.getsize(src) <= 0:
            print(img+" is zero length, so ignoring!!")
        else:
            shutil.copyfile(src, dst)


In [8]:
split_size = 0.90
for i in range(0,62):
  if i>=0 and i<10:
    split_data('/tmp/English/Fnt/'+chr(i+48),
               '/tmp/CAPTCHA/training/'+chr(i+48),
               '/tmp/CAPTCHA/testing/'+chr(i+48),
               split_size)
  if i>=10 and i<36:
    split_data('/tmp/English/Fnt/'+chr(i-10+65)+"-1",
               '/tmp/CAPTCHA/training/'+chr(i-10+65),
               '/tmp/CAPTCHA/testing/'+chr(i-10+65),
               split_size)
  if i>=36 and i<62:
    split_data('/tmp/English/Fnt/'+chr(i-36+97)+"-2",
               '/tmp/CAPTCHA/training/'+chr(i-36+97),
               '/tmp/CAPTCHA/testing/'+chr(i-36+97),
               split_size)
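
As a rough check of the split arithmetic, the sketch below applies the same shuffle-and-slice logic to a list of hypothetical file names. It assumes 1016 images per class, which is consistent with the totals reported when the generators are created later in the notebook (56668 training and 6324 testing images across 62 classes). The helper `split_names`, the file names, and the seed are illustrative and not part of the notebook.

```python
import random


def split_names(names, split_size, seed=0):
    """Shuffle a copy of names and cut it at round(split_size * len(names))."""
    names = list(names)
    random.Random(seed).shuffle(names)  # seeded only to make the sketch repeatable
    cut = round(split_size * len(names))
    return names[:cut], names[cut:]


# Hypothetical file names standing in for one class directory (assumed 1016 images).
names = [f"img{i:04d}.png" for i in range(1016)]
train, test = split_names(names, 0.90)

print(len(train), len(test))            # 914 102
print(len(train) * 62, len(test) * 62)  # 56668 6324
```

Because the cut index is rounded per class, the overall ratio is close to, but not exactly, 90/10.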

# Creation of Model



In [None]:
%load_ext tensorboard
%tensorboard --logdir logs

Training images are rescaled to the [0, 1] range and augmented by random rotation, shearing, zooming and horizontal flipping. Augmentation is a cheap and effective way to give the model more varied data to learn from.
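
To make two of these operations concrete, here is a minimal sketch of rescaling and horizontal flipping on a tiny grayscale image stored as nested lists. The helpers `rescale` and `horizontal_flip` are hypothetical stand-ins written for illustration; the notebook itself delegates this work to `ImageDataGenerator`, which also applies the random rotation, shear and zoom.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by factor, mapping 0..255 into roughly 0..1."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


# A 2x3 toy image with 8-bit pixel values.
image = [[0, 128, 255],
         [255, 0, 0]]

scaled = rescale(image)
flipped = horizontal_flip(scaled)

for row in flipped:
    print([round(pixel, 3) for pixel in row])
```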

In [10]:
TRAINING_DIR = '/tmp/CAPTCHA/training'
train_datagen = ImageDataGenerator(rescale=1. / 255,
                                   rotation_range=30,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest',
                                   preprocessing_function=tf.image.rgb_to_grayscale)
train_generator = train_datagen.flow_from_directory(
    TRAINING_DIR,
    target_size = (32,32),
    batch_size = 16,
    class_mode = 'categorical'
)

VALIDATION_DIR = '/tmp/CAPTCHA/testing'
validation_datagen = ImageDataGenerator(rescale=1. / 255,
                                        preprocessing_function=tf.image.rgb_to_grayscale)
validation_generator = validation_datagen.flow_from_directory(
    VALIDATION_DIR,
    target_size = (32,32),
    batch_size = 16,
    class_mode = 'categorical'
)

Found 56668 images belonging to 62 classes.
Found 6324 images belonging to 62 classes.
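
With `class_mode = 'categorical'`, each label is a one-hot vector over the 62 classes, and `flow_from_directory` assigns class indices to the subdirectory names in sorted order. The sketch below shows how a predicted probability vector could be mapped back to a character under that assumption. The `decode` helper and the example vector are hypothetical; in the notebook, `train_generator.class_indices` is the authoritative mapping.

```python
import string

# Sorted directory names: digits, then uppercase, then lowercase (ASCII order).
CLASS_NAMES = sorted(string.digits + string.ascii_uppercase + string.ascii_lowercase)


def decode(probabilities):
    """Return the character whose class index has the highest probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_NAMES[best]


# Hypothetical model output that puts all probability mass on class index 10.
probs = [0.0] * 62
probs[10] = 1.0
print(decode(probs))  # A
```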


## The ResNet Model
The model below is a custom network based on the [ResNet architecture](https://arxiv.org/pdf/1512.03385.pdf).

In [11]:
import keras
from functools import partial
DefaultConv2D = partial(keras.layers.Conv2D, kernel_size=3, strides=1,
                        padding="SAME", use_bias=False)

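class ResidualUnit(keras.layers.Layer):
    def __init__(self, filters, strides=1, activation="relu", **kwargs):
        super().__init__(**kwargs)
        # Keep the constructor arguments so get_config can report them
        self.filters = filters
        self.strides = strides
        self.activation = keras.activations.get(activation)
        # Main path: conv -> BN -> activation -> conv -> BN
        self.main_layers = [
            DefaultConv2D(filters, strides=strides),
            keras.layers.BatchNormalization(),
            self.activation,
            DefaultConv2D(filters),
            keras.layers.BatchNormalization()]
        # Skip path: identity, or a strided 1x1 conv when the main path downsamples
        self.skip_layers = []
        if strides > 1:
            self.skip_layers = [
                DefaultConv2D(filters, kernel_size=1, strides=strides),
                keras.layers.BatchNormalization()]

    def get_config(self):
        # Include the constructor arguments so the layer can be rebuilt from its config
        cfg = super().get_config()
        cfg.update({"filters": self.filters,
                    "strides": self.strides,
                    "activation": keras.activations.serialize(self.activation)})
        return cfg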

    def call(self, inputs):
        Z = inputs
        for layer in self.main_layers:
            Z = layer(Z)
        skip_Z = inputs
        for layer in self.skip_layers:
            skip_Z = layer(skip_Z)
        return self.activation(Z + skip_Z)

In [12]:
keras.backend.clear_session()
tf.random.set_seed(42)
np.random.seed(42)

model = keras.models.Sequential()
model.add(DefaultConv2D(64, kernel_size=4, strides=2,
                        input_shape=[32, 32, 3]))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Activation("relu"))
model.add(keras.layers.MaxPool2D(pool_size=3, strides=2, padding="SAME"))
prev_filters = 64
for filters in [64] * 2 + [128] * 2 + [256] * 1 :
    strides = 1 if filters == prev_filters else 2
    model.add(ResidualUnit(filters, strides=strides))
    prev_filters = filters
model.add(keras.layers.GlobalAvgPool2D())
model.add(keras.layers.Flatten())
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(62, activation="softmax"))

model.compile(loss="categorical_crossentropy", optimizer="nadam",
              metrics=["accuracy"])
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 16, 16, 64)        3072      
_________________________________________________________________
batch_normalization (BatchNo (None, 16, 16, 64)        256       
_________________________________________________________________
activation (Activation)      (None, 16, 16, 64)        0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 8, 8, 64)          0         
_________________________________________________________________
residual_unit (ResidualUnit) (None, 8, 8, 64)          74240     
_________________________________________________________________
residual_unit_1 (ResidualUni (None, 8, 8, 64)          74240     
_________________________________________________________________
residual_unit_2 (ResidualUni (None, 4, 4, 128)         2
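
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below is a back-of-the-envelope check, assuming the layer settings used above: `"SAME"` padding (output size is the input size divided by the stride, rounded up), no convolution biases, and four parameters per channel for each batch normalization layer. The helper names are hypothetical.

```python
import math


def same_out(size, stride):
    """Spatial output size of a 'SAME'-padded layer: ceil(size / stride)."""
    return math.ceil(size / stride)


def conv_params(kernel, c_in, c_out):
    """Weights of a bias-free convolution: kernel * kernel * c_in * c_out."""
    return kernel * kernel * c_in * c_out


def bn_params(channels):
    """Batch norm keeps gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels


def residual_unit_params(c_in, c_out, strides):
    """Two 3x3 conv + BN pairs, plus a 1x1 conv + BN on the skip path if strided."""
    total = conv_params(3, c_in, c_out) + bn_params(c_out)
    total += conv_params(3, c_out, c_out) + bn_params(c_out)
    if strides > 1:
        total += conv_params(1, c_in, c_out) + bn_params(c_out)
    return total


size = same_out(32, 2)    # stem convolution: 32 -> 16
size = same_out(size, 2)  # max pooling: 16 -> 8
print(size)                              # 8

print(conv_params(4, 3, 64))             # 3072  (conv2d)
print(bn_params(64))                     # 256   (batch_normalization)
print(residual_unit_params(64, 64, 1))   # 74240 (residual_unit)
```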

In [13]:
logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)

# Performance scheduling: halve the learning rate whenever validation
# accuracy has not improved for 2 consecutive epochs, down to a floor of 1e-5

learning_rate_reduction = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_accuracy', factor=0.5, patience=2, min_lr=0.00001, mode='auto')
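
To see how this schedule behaves, here is a simplified simulation of the plateau rule with the same `factor`, `patience` and `min_lr`. It is a sketch rather than the Keras implementation: it ignores the cooldown option, assumes a starting rate of 1e-3 (the Keras default for Nadam, since the notebook does not set one explicitly), and uses a made-up validation accuracy history.

```python
def simulate_plateau(val_acc, lr=1e-3, factor=0.5, patience=2,
                     min_lr=1e-5, min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    best = float("-inf")
    wait = 0
    rates = []
    for acc in val_acc:
        if acc - min_delta > best:
            # Validation accuracy improved: remember it and reset the counter.
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau detected: shrink the rate, but never below min_lr.
                lr = max(lr * factor, min_lr)
                wait = 0
        rates.append(lr)
    return rates


# Hypothetical validation accuracies, one per epoch.
history = [0.50, 0.60, 0.60, 0.59, 0.61, 0.61, 0.61]
rates = simulate_plateau(history)
print(rates)

# With no improvement at all, the rate settles at the min_lr floor.
stalled = simulate_plateau([0.5] * 30)
print(stalled[-1])
```

In this toy history the rate is halved after epochs 4 and 7, each time following two epochs without improvement.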

In [14]:
history = model.fit(train_generator, epochs=50,
                    validation_data=validation_generator,
                    callbacks=[tensorboard_callback, learning_rate_reduction])

Epoch 1/50
...
Epoch 50/50


The model is saved in TensorFlow's SavedModel format, zipped, and downloaded for use in the larger project.

In [None]:
model.save('CAPTCHA-Model')

In [None]:
!zip -r /content/CAPTCHA.zip /content/CAPTCHA-Model

In [18]:
from google.colab import files
files.download("/content/CAPTCHA.zip")
