In [59]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import models, layers
from IPython.display import clear_output
import cv2
import random

In [39]:
new_width = new_height = 300

Train images

In [75]:
a1 = cv2.imread("/content/a1.jpg")
a1 = cv2.resize(a1, (new_width, new_height))
a2 = cv2.imread("/content/a2.jpg")
a2 = cv2.resize(a2, (new_width, new_height))
a3 = cv2.imread("/content/a3.jpg")
a3 = cv2.resize(a3, (new_width, new_height))
a4 = cv2.imread("/content/a4.jpg")
a4 = cv2.resize(a4, (new_width, new_height))
a5 = cv2.imread("/content/a5.jpg")
a5 = cv2.resize(a5, (new_width, new_height))
a6 = cv2.imread("/content/a6.jpg")
a6 = cv2.resize(a6, (new_width, new_height))
af = cv2.imread("/content/af.jpg")
af = cv2.resize(af, (new_width, new_height))

In [103]:
p1 = cv2.imread("/content/p1.jpg")
p1 = cv2.resize(p1, (new_width, new_height))
p2 = cv2.imread("/content/p2.jpg")
p2 = cv2.resize(p2, (new_width, new_height))
p4 = cv2.imread("/content/p4.jpg")
p4 = cv2.resize(p4, (new_width, new_height))
p7 = cv2.imread("/content/p7.jpg")
p7 = cv2.resize(p7, (new_width, new_height))
p8 = cv2.imread("/content/p8.jpg")
p8 = cv2.resize(p8, (new_width, new_height))

In [88]:
train_imgs = np.array([a1,p1,a2,p2,a3,p4,a4,p7,a5,p8,a6,af])
train_lbls = np.array([0,1,0,1,0,1,0,1,0,1,0,0])

Test images

In [124]:
my_a = cv2.imread("/content/my_aadhar.jpg")
my_a = cv2.resize(my_a, (new_width, new_height))
my_p = cv2.imread("/content/my_pan.jpg")
my_p = cv2.resize(my_p, (new_width, new_height))

In [125]:
test_imgs = np.array([my_a, my_p])
test_lbls = np.array([0,1])

In [43]:
class_names = np.array(['Aadhar','Pancard'])

# Convolutional Neural Network

A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns features by itself through filter (kernel) optimization. Vanishing and exploding gradients, seen during backpropagation in earlier neural networks, are mitigated by using regularized weights over fewer connections. [Learn more](https://en.wikipedia.org/wiki/Convolutional_neural_network)

In [70]:
model = tf.keras.Sequential([
    layers.Conv2D(32, (3,3), activation = 'relu', input_shape = (300,300,3)), # if gray scale image is used then number of channels will be 1 not 3.
    layers.BatchNormalization(),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),

    layers.Conv2D(64, (3,3), activation = 'relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),

    layers.Conv2D(128, (3,3), activation = 'relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2,2)),


    layers.Flatten(),
    layers.Dense(64, activation = 'relu'),
    layers.Dense(2, activation = 'softmax')
])

**Convolutional Layer**:


*   The core building block of a CNN
*   Applies convolutional filters (also called kernels) to the input image to extract features.
*   Captures local patterns and spatial relationships in the data.
*   Each filter slides across the image and, at every position, multiplies its weights element-wise with the pixels under it and sums the result (a dot product), producing a feature map.
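
The sliding-filter idea can be sketched in plain NumPy. This is a minimal illustration, not how Keras implements `Conv2D` internally: it assumes a single-channel image, stride 1, and no padding, and the `conv2d_valid` helper and the edge-detecting kernel are hypothetical names and values chosen for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2-D `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            # Multiply element-wise, then sum: one dot product per position.
            feature_map[i, j] = np.sum(window * kernel)
    return feature_map

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.array([[1.0, 0.0, -1.0]] * 3)          # hypothetical vertical-edge filter
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
```

A 3x3 filter shrinks each spatial dimension by 2, which is why a 5x5 input gives a 3x3 feature map here and a 300x300 input gives 298x298 in the first layer of the model above.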





**Batch Normalization**


*   Batch normalization (BN) is a technique used to normalize the activations of a neural network's layer across a mini-batch of training examples.
*   It ensures that the mean activation is close to 0 and the standard deviation is close to 1.


*   Benefits
    1.   Faster convergence: BN helps training converge faster by mitigating the vanishing/exploding gradient problem.
    2.   Higher learning rates: It keeps optimization stable at learning rates that would otherwise be too large.
    3.   Regularization effect: BN has a slight regularization effect, reducing the risk of overfitting.
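
The normalization step can be sketched in NumPy as follows. This is a simplified, training-time view only: it assumes a 2-D batch of shape (samples, features), and the `batch_norm` helper with its `gamma`, `beta`, and `eps` parameters is a hypothetical stand-in for the learned scale, learned shift, and numerical-stability constant. The Keras layer additionally tracks moving averages for use at inference time.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # roughly zero mean, unit variance
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
activations = rng.normal(loc=50.0, scale=10.0, size=(8, 4))  # toy mini-batch
normalized = batch_norm(activations)
print(normalized.mean(axis=0).round(3), normalized.std(axis=0).round(3))
```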










**Max pooling layer**


*   Downsampling operation: Max pooling is used to downsample feature maps in CNNs.
*   Sliding window: A small window slides over the input feature map.
*   Pooling operation: Maximum value within each window is extracted.
*   Downsampled Feature Map: Extracted max values form a new downsized feature map.
*   Benefits: Translation invariance, reduces computation, and retains important features.
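
A small NumPy sketch of 2x2 max pooling with stride 2, the same window size used by `MaxPooling2D((2,2))` in the model. The `max_pool_2x2` helper and the toy feature map are hypothetical and handle a single 2-D map only.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 window."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]   # drop an odd trailing row/column
    # Split into (h2, 2, w2, 2) blocks and take the max inside each block.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 7, 2],
               [3, 6, 4, 8]])
pooled = max_pool_2x2(fm)
print(pooled.tolist())  # [[4, 5], [6, 8]]
```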







**Dropout layer**


*   Regularization Technique: Dropout is a regularization technique in neural networks.
*   Random Deactivation: During training, randomly selected neurons are "dropped out" or deactivated with a certain probability.
*   Preventing Overfitting: Dropout helps prevent overfitting by reducing reliance on specific neurons.
*   Network Variability: It forces the network to learn more robust and generalized features.
*   Ensemble Effect: Dropout can be seen as training multiple models with different subsets of neurons, creating an ensemble effect.
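
The random deactivation can be sketched with a NumPy mask. This assumes the "inverted dropout" convention, where the surviving activations are scaled up by 1 / (1 - rate) during training so that nothing needs to change at inference time. The `dropout` helper is a hypothetical illustration, not the Keras implementation.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Zero a random fraction `rate` of the inputs and rescale the rest."""
    if not training:
        return x                              # dropout is disabled at inference
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob    # True for neurons that survive
    return x * mask / keep_prob               # rescale to keep the expected sum

rng = np.random.default_rng(0)
activations = np.ones(1000)
dropped = dropout(activations, rate=0.25, rng=rng)
print((dropped == 0).mean())                  # fraction dropped, roughly 0.25
```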








**Flatten layer**


*   The Flatten layer converts a multidimensional tensor into a one-dimensional vector.
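
To see how long that vector is for the model above, the spatial size can be traced through the three Conv2D + MaxPooling2D blocks. The sketch assumes the Keras defaults used in the model (3x3 kernels with no padding, 2x2 pooling with stride 2). The helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output size of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping pooling (odd remainders are dropped)."""
    return size // pool

size = 300                       # input images are 300x300
for _ in range(3):               # three Conv2D + MaxPooling2D blocks
    size = pool_out(conv_out(size))

flat_length = size * size * 128  # the last Conv2D layer has 128 filters
print(size, flat_length)  # 35 156800
```

Batch normalization and dropout do not change the shape, so they are left out of the trace.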



**Dense layer**

*   It consists of multiple neurons, each connected to every neuron from the previous layer.
*   The Dense layer is responsible for capturing complex patterns and relationships in the data.
*   It's often used as the final layer in a neural network for classification or regression tasks.
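
A dense layer boils down to a matrix multiplication plus a bias, followed by the activation function. The sketch below shows only the linear part, with hypothetical toy sizes (5 inputs, 3 neurons) and random weights standing in for learned ones.

```python
import numpy as np

def dense(x, weights, bias):
    """Every output neuron is a weighted sum of all inputs plus a bias."""
    return x @ weights + bias

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5))        # batch of 2 samples with 5 features each
weights = rng.normal(size=(5, 3))  # 5 inputs fully connected to 3 neurons
bias = np.zeros(3)
out = dense(x, weights, bias)
print(out.shape)  # (2, 3)
```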



**Activation functions**


*   ReLU: Rectified Linear Activation is an activation function that outputs the input if it's positive, otherwise outputs zero, enhancing learning speed and mitigating the vanishing gradient problem.
*   Softmax: Softmax is an activation function that converts a vector of arbitrary values into a probability distribution, with each element indicating the likelihood of a class, suitable for multi-class classification tasks.
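
Both functions are short enough to write out in NumPy. This is an illustrative sketch with hypothetical input scores. The softmax subtracts the maximum before exponentiating, a common trick for numerical stability that does not change the result.

```python
import numpy as np

def relu(x):
    """Pass positive values through, clamp negatives to zero."""
    return np.maximum(0.0, x)

def softmax(logits):
    """Turn arbitrary scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.array([2.0, -1.0])   # hypothetical raw scores for two classes
relu_out = relu(scores)          # negative score becomes 0
probs = softmax(scores)          # the larger score gets the larger probability
print(relu_out, probs.round(3))
```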



In [71]:
model.compile(optimizer = 'adam',loss = 'sparse_categorical_crossentropy',metrics = ['accuracy'])

**model.compile()** sets up the model for training by specifying the optimizer, loss function, and metrics for evaluation. The choice of these parameters impacts how the model's weights are updated and how its performance is assessed.

**optimizer**: Specifies the optimization algorithm used to update the model weights during training. Common choices are **adam** and **sgd**.

**loss**: Defines the loss function that measures the discrepancy between predicted and actual labels during training.
*   **sparse_categorical_crossentropy**: A loss for multi-class classification, appropriate when the labels are integers (as in `train_lbls`) rather than one-hot encoded.

**metrics**: List of evaluation metrics to monitor during training and testing.
*   **accuracy**: Measures the proportion of instances that are predicted correctly.
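
As a rough sketch of what the loss and metric compute, the NumPy code below evaluates them by hand. The probabilities are made-up values rather than outputs of the model, and the `eps` clipping constant is an assumption used here to avoid `log(0)`.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Negative log of the probability assigned to the true (integer) class."""
    probs = np.clip(probs, eps, 1.0)
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked)

labels = np.array([0, 1])                 # integer labels, not one-hot
probs = np.array([[0.9, 0.1],             # hypothetical softmax outputs
                  [0.2, 0.8]])
losses = sparse_categorical_crossentropy(labels, probs)
mean_loss = losses.mean()
accuracy = np.mean(np.argmax(probs, axis=1) == labels)
print(losses.round(3), accuracy)
```

The more confident the model is in the correct class, the closer the loss gets to 0.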


In [None]:
model.fit(train_imgs, train_lbls, epochs = 5)

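**model.fit()** trains the model on the training set. **epochs** is the number of complete passes over the training data, 5 in this case. The right number depends on the model and the dataset, so it is usually tuned by watching the training and validation metrics.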
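
One epoch is one full pass over the training data, split into mini-batches. The short calculation below shows how many weight updates this `fit()` call performs, assuming the Keras default `batch_size` of 32 (the call above does not set it).

```python
import math

num_train_images = 12   # length of train_imgs in this notebook
batch_size = 32         # Keras default when batch_size is not passed to fit()
epochs = 5

steps_per_epoch = math.ceil(num_train_images / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 1 5
```

With only 12 images, every epoch is a single batch, so the whole training run consists of 5 weight updates.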

In [130]:
loss, acc = model.evaluate(test_imgs, test_lbls, verbose = 1)
print(acc)

1.0


Evaluating the model's accuracy on the test set. The test set contains only two images, so an accuracy of 1.0 is a very rough estimate.

In [91]:
predictions = model.predict(train_imgs)



Making predictions on the training images

In [126]:
pred = model.predict(test_imgs)



Making predictions on the test images

In [131]:
def run():
  # Print the predicted and the actual class name for every test image.
  for n in range(test_imgs.shape[0]):
    print(f"Predicted: {class_names[np.argmax(pred[n])]}")
    print(f"Actual: {class_names[test_lbls[n]]}")
    print("="*20)

**np.argmax()** returns the index of the largest element in the array. Each row of `pred` holds one probability per class, so the index of the largest probability is the predicted class.
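
A standalone illustration of that lookup, using made-up probabilities in place of real model output (`example_pred` is a hypothetical array with the same shape as `pred` for two images):

```python
import numpy as np

class_names = np.array(['Aadhar', 'Pancard'])
example_pred = np.array([[0.93, 0.07],    # hypothetical softmax outputs
                         [0.12, 0.88]])

predicted_idx = np.argmax(example_pred, axis=1)   # best class index per row
predicted_names = class_names[predicted_idx]      # map indices to names
print(predicted_names)  # ['Aadhar' 'Pancard']
```

Passing `axis=1` handles all rows at once, which avoids calling `np.argmax` inside a loop.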

In [132]:
run()

Predicted: Aadhar
Actual: Aadhar
Predicted: Pancard
Actual: Pancard
