<a href="https://colab.research.google.com/github/Twinkle-gawri/CNN/blob/main/NLP_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Filters (kernels) are used, for example of size 3x3
# filter -- detects patterns --> generates a new, smaller matrix (feature map) --> its weights are learned
# padding -- keeps the output the same size as the input -- not used here
# pooling -- keeps the main features and reduces the dimensions
# batch normalisation -- normalises values; reason: different filters produce widely varying outputs, so it brings the outputs into the same range

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers,Sequential
from tensorflow.keras.layers import Dense,Flatten,Conv2D,MaxPooling2D,BatchNormalization
from tensorflow.keras.datasets import cifar10

In [2]:
(x_train,y_train),(x_test,y_test) = cifar10.load_data()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 5s 0us/step


In [3]:
x_train.shape # 32x32 images with 3 channels (RGB)

(50000, 32, 32, 3)

In [4]:
x_test.shape

(10000, 32, 32, 3)

In [5]:
x_train = x_train/255.0
x_test = x_test/255.0

In [6]:
from sklearn.model_selection import train_test_split
x_train,x_val,y_train,y_val = train_test_split(x_train,y_train,test_size=0.2,random_state=42)

# 1. **Conv2D (Convolutional Layer)**

#### What it does:
* The Conv2D layer applies a set of convolutional filters (kernels) to the input image or feature maps to extract features such as edges, textures, and patterns.
* Each filter slides (convolves) over the input, performing element-wise multiplication and summing up the results to create a feature map.

#### Key Parameters:
* Filters: Number of filters (kernels) to use (e.g., 32 or 64). Each filter learns to detect a specific feature.
* Kernel Size: Size of the filters (e.g., 3x3 or 5x5).
* Strides: Step size for moving the filter across the input.
* Padding: Whether to zero-pad the input so that, at stride 1, the output keeps the same height and width ('same'), or to use no padding so the output shrinks ('valid').
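
The sliding-window operation described above can be sketched in plain Python. This is only an illustration, not how Keras implements `Conv2D`: the helper name `conv2d_valid` is hypothetical, it assumes a square single-channel input, a single filter, no bias, and 'valid' padding, and the filter weights are written by hand here whereas a CNN learns them.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over a square `image` (lists of lists) with no padding."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Element-wise multiply the current window by the kernel and sum.
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# A 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written vertical-edge filter (assumption: in a CNN these weights are learned).
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[3, 3], [3, 3]]
```

With a 3x3 filter and no padding, the 4x4 input shrinks to a 2x2 feature map, which is the size reduction that 'valid' padding refers to.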



# 2. **MaxPooling2D (Pooling Layer)**

#### What it does:
* The MaxPooling2D layer performs downsampling by taking the maximum value in a window (e.g., 2x2) within each feature map.
* Reduces the spatial dimensions (height and width) while retaining the most important features.

#### Key Parameters:
* Pool Size: Size of the pooling window (e.g., 2x2 or 3x3).
* Strides: Step size for moving the pooling window.
* Padding: Whether or not to add padding around the input.
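
As a rough sketch of the downsampling step, the snippet below applies 2x2 max pooling to a small feature map in plain Python. The helper name `max_pool2d` is hypothetical; it assumes a single channel and no padding, and it defaults the stride to the pool size, which is also the default behaviour of `MaxPooling2D()`.

```python
def max_pool2d(feature_map, pool_size=2, stride=None):
    """Take the maximum of each pool_size x pool_size window (no padding)."""
    if stride is None:
        stride = pool_size  # assumption: stride defaults to the pool size
    out_h = (len(feature_map) - pool_size) // stride + 1
    out_w = (len(feature_map[0]) - pool_size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values inside the current window and keep the largest.
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 5], [7, 9]]
```

The 4x4 map becomes 2x2: height and width are halved while the strongest response in each window is kept.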



In [7]:
# Sequential API: the most basic way to build a model
model = Sequential()
model.add(Conv2D(32,(3,3),activation='relu'))  # 32 output channels = number of filters (a hyperparameter), each of size 3x3; early layers use fewer filters because they detect low-level features
model.add(MaxPooling2D())
model.add(Conv2D(64,(3,3),activation='relu'))
model.add(MaxPooling2D())
model.add(Conv2D(128,(3,3),activation='relu'))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(64,activation='relu'))  # so the network can learn additional non-linear combinations of features
model.add(Dense(10,activation='softmax'))
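
To see how the layers above change the tensor shape, the following sketch traces the spatial size and counts trainable parameters by hand for the 32x32x3 CIFAR-10 input. The helper names `conv_out` and `pool_out` are hypothetical; the arithmetic assumes 3x3 kernels, stride 1, no padding, 2x2 pooling, and one bias per filter or unit, which matches the defaults used in the cell above.

```python
def conv_out(size, kernel=3, stride=1):
    """Output height/width of a conv layer with no padding ('valid')."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output height/width of max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

size, channels = 32, 3  # CIFAR-10 images are 32x32 with 3 channels
total_params = 0
for filters in (32, 64, 128):
    # Each filter has 3*3*channels weights plus one bias.
    total_params += (3 * 3 * channels + 1) * filters
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{channels}")

flat = size * size * channels       # Flatten
total_params += (flat + 1) * 64     # Dense(64)
total_params += (64 + 1) * 10       # Dense(10)
print("flattened features:", flat)        # 512
print("trainable parameters:", total_params)  # 126730
```

The spatial size goes 32 → 30 → 15 → 13 → 6 → 4 → 2, so `Flatten` produces 2 * 2 * 128 = 512 features.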

In [8]:
model.compile(optimizer=keras.optimizers.Adam(3e-4),loss=keras.losses.SparseCategoricalCrossentropy(),metrics=['accuracy'])

In [9]:
model.fit(x_train,y_train,batch_size=64,epochs=10,validation_data=(x_val,y_val))

Epoch 1/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 9s 6ms/step - accuracy: 0.2374 - loss: 2.0375 - val_accuracy: 0.4131 - val_loss: 1.5902
Epoch 2/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.4354 - loss: 1.5526 - val_accuracy: 0.4879 - val_loss: 1.4328
Epoch 3/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.4998 - loss: 1.3951 - val_accuracy: 0.5231 - val_loss: 1.3400
Epoch 4/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.5425 - loss: 1.2941 - val_accuracy: 0.5562 - val_loss: 1.2532
Epoch 5/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.5712 - loss: 1.2190 - val_accuracy: 0.5729 - val_loss: 1.2165
Epoch 6/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.5930 - loss: 1.1569 - val_accuracy: 0.5934 - val_loss: 1.1590
Epoch 7/10
625/625

<keras.src.callbacks.history.History at 0x7bdcdf584210>

# 3. **Batch Normalization**
Batch Normalization keeps the distribution of feature maps in a CNN stable from batch to batch, which speeds up training and makes the model more robust.

# **Conv Layer → BatchNorm → Activation (ReLU) → Pooling**

Why is batch normalization applied before ReLU?

BatchNorm normalizes the activations of each channel to zero mean and unit variance over the batch (before a learned scale and shift), which keeps extremely large or small values from reaching the activation function.

* If BatchNorm were applied after ReLU, the negative values would already have been set to zero, so the mean and variance would be computed on a skewed, non-negative distribution.
* When applied before ReLU, it standardizes the inputs to the activation, which makes training more stable.
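
The normalization step can be illustrated on a handful of numbers. This is a simplified sketch, not the Keras layer: the helper names `batch_norm` and `relu` are hypothetical, the values stand in for one channel's conv outputs across a batch, and the learned scale/shift and the moving averages used at inference time are left out. The `eps` value of 1e-3 is assumed to match the `BatchNormalization` default.

```python
import statistics

def batch_norm(values, eps=1e-3):
    """Normalize a batch of values to roughly zero mean and unit variance."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]

def relu(values):
    """Set negative values to zero."""
    return [max(0.0, v) for v in values]

# Made-up conv outputs for one channel; note the wide range.
conv_outputs = [-50.0, 0.0, 10.0, 200.0]

normalized = batch_norm(conv_outputs)
print("mean after BatchNorm:", round(statistics.fmean(normalized), 6))
print("variance after BatchNorm:", round(statistics.pvariance(normalized), 6))

# Conv -> BatchNorm -> ReLU: ReLU sees standardized inputs.
print("BN then ReLU:", [round(v, 3) for v in relu(normalized)])

# Conv -> ReLU -> BatchNorm: statistics are computed after negatives were zeroed.
print("ReLU then BN:", [round(v, 3) for v in batch_norm(relu(conv_outputs))])
```

In the second ordering the value -50.0 and the value 0.0 become indistinguishable before the statistics are computed, which is the distortion the first bullet describes.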

In [10]:
# Functional API
inputs = keras.Input(shape=(32,32,3))
x=Conv2D(32,(3,3))(inputs)
x=BatchNormalization()(x)
x=keras.activations.relu(x)
x=MaxPooling2D()(x)
x=Conv2D(64,(3,3))(x)
x=BatchNormalization()(x)
x=keras.activations.relu(x)
x=MaxPooling2D()(x)
x=Conv2D(128,(3,3))(x)
x=BatchNormalization()(x)
x=keras.activations.relu(x)
x=MaxPooling2D()(x)
x=Flatten()(x)
x=Dense(64,activation='relu')(x)
outputs=Dense(10,activation='softmax')(x)
model=keras.Model(inputs=inputs,outputs=outputs)  # the model is defined here, from its inputs and outputs

In [11]:
model.compile(optimizer=keras.optimizers.Adam(3e-4),loss=keras.losses.SparseCategoricalCrossentropy(),metrics=['accuracy'])

In [12]:
model.fit(x_train,y_train,batch_size=64,epochs=10,validation_data=(x_val,y_val))

Epoch 1/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 9s 8ms/step - accuracy: 0.3983 - loss: 1.6874 - val_accuracy: 0.4394 - val_loss: 1.6037
Epoch 2/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.6151 - loss: 1.1006 - val_accuracy: 0.6042 - val_loss: 1.1299
Epoch 3/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.6789 - loss: 0.9236 - val_accuracy: 0.5478 - val_loss: 1.3117
Epoch 4/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.7197 - loss: 0.8184 - val_accuracy: 0.6238 - val_loss: 1.1457
Epoch 5/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.7561 - loss: 0.7171 - val_accuracy: 0.6537 - val_loss: 1.0267
Epoch 6/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.7821 - loss: 0.6399 - val_accuracy: 0.6257 - val_loss: 1.1286
Epoch 7/10
625/625

<keras.src.callbacks.history.History at 0x7bdcdad17190>

In [None]:
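# The model is overfitting: training accuracy keeps rising while validation accuracy lags behind.
# Note: x_val was split from the already-scaled x_train, so both are scaled the same way; BatchNormalization, however, uses batch statistics during training and moving averages during validation.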
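
One simple way to quantify the overfitting is to look at the gap between training and validation accuracy per epoch. The sketch below copies the accuracies for epochs 1 to 6 from the training log of the BatchNorm model above; the variable names are only illustrative. Keep in mind that the training accuracy Keras prints is averaged over the epoch while the weights are still changing, so the comparison is approximate.

```python
# Accuracies copied from the BatchNorm model's log above (epochs 1-6).
train_acc = [0.3983, 0.6151, 0.6789, 0.7197, 0.7561, 0.7821]
val_acc = [0.4394, 0.6042, 0.5478, 0.6238, 0.6537, 0.6257]

# A growing positive gap means the model fits the training set
# better than it generalizes to the validation set.
gaps = [round(t - v, 4) for t, v in zip(train_acc, val_acc)]
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: train - val accuracy = {gap:+.4f}")
```

By epoch 6 the gap is about 0.16, compared with almost no gap for the first model without batch normalization at the same epoch.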

In [13]:
# Functional API: overfitting is still present, so regularization is added to reduce it
inputs = keras.Input(shape=(32,32,3))
x=Conv2D(32,(3,3),kernel_regularizer=keras.regularizers.l2(0.001))(inputs)  # L2 Regularization (0.001) → Helps reduce overfitting by penalizing large weights.
x=BatchNormalization()(x)
x=keras.activations.relu(x)
x=MaxPooling2D()(x)

x=Conv2D(64,(3,3),kernel_regularizer=keras.regularizers.l2(0.001))(x)
x=BatchNormalization()(x)
x=keras.activations.relu(x)
x=MaxPooling2D()(x)

x=Conv2D(128,(3,3),kernel_regularizer=keras.regularizers.l2(0.001))(x)
x=BatchNormalization()(x)
x=keras.activations.relu(x)
x=MaxPooling2D()(x)

x=Flatten()(x)
x=Dense(64,activation='relu')(x)
x=keras.layers.Dropout(0.5)(x)  # Dropout (0.5) → helps prevent overfitting by randomly dropping 50% of the units during training.
outputs=Dense(10,activation='softmax')(x)
model=keras.Model(inputs=inputs,outputs=outputs)
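
The two regularizers used above can be sketched in a few lines of plain Python. The helper names `l2_penalty` and `dropout` are hypothetical and the weights are made up; the sketch assumes the L2 term is the factor times the sum of squared weights, and that dropout rescales the surviving activations by 1 / (1 - rate) during training, which is my understanding of how the Keras versions behave.

```python
import random

def l2_penalty(weights, factor=0.001):
    """Extra loss term: factor * sum of squared weights."""
    return factor * sum(w * w for w in weights)

def dropout(activations, rate=0.5, rng=random):
    """Training-time dropout: zero each unit with probability `rate`,
    and scale the survivors so the expected sum stays the same."""
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]

# Made-up weights: large weights are penalized much more than small ones.
print(l2_penalty([1.0, -2.0, 3.0]))   # about 0.014
print(l2_penalty([0.1, -0.2, 0.3]))   # about 0.00014

# Seeded generator so the run is repeatable.
rng = random.Random(0)
dropped = dropout([1.0] * 8, rate=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

At inference time dropout is switched off and the activations pass through unchanged, which is why the rescaling is done during training.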

In [14]:
model.compile(optimizer=keras.optimizers.Adam(3e-4),loss=keras.losses.SparseCategoricalCrossentropy(),metrics=['accuracy'])

In [15]:
model.fit(x_train,y_train,batch_size=64,epochs=10,validation_data=(x_val,y_val))

Epoch 1/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 9s 7ms/step - accuracy: 0.2737 - loss: 2.2394 - val_accuracy: 0.4579 - val_loss: 1.6574
Epoch 2/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.4521 - loss: 1.6513 - val_accuracy: 0.5359 - val_loss: 1.4004
Epoch 3/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.5328 - loss: 1.4442 - val_accuracy: 0.5946 - val_loss: 1.2771
Epoch 4/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.5770 - loss: 1.3115 - val_accuracy: 0.5572 - val_loss: 1.3287
Epoch 5/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.6146 - loss: 1.2193 - val_accuracy: 0.6172 - val_loss: 1.1886
Epoch 6/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.6379 - loss: 1.1486 - val_accuracy: 0.6485 - val_loss: 1.1063
Epoch 7/10
625/625

<keras.src.callbacks.history.History at 0x7bdcc03de390>

In [16]:
class CNNBlock(layers.Layer):   # the base class is layers.Layer, not keras.Model: this is a reusable building block (like Dense or Flatten), not a full network
  def __init__(self,num_filters,kernel_size=3):  # ReLU is a plain mathematical function, not a layer, so it is not created here
    super(CNNBlock,self).__init__()
    self.conv = layers.Conv2D(num_filters,kernel_size)
    self.bn = layers.BatchNormalization()
    self.pool = layers.MaxPooling2D()
  def call(self,inputs):
    x = self.conv(inputs)
    x = self.bn(x)
    x = keras.activations.relu(x)
    x = self.pool(x)
    return x

In [17]:
model=Sequential(
    [
       keras.Input(shape=(32,32,3)),
       CNNBlock(32),
       CNNBlock(64),
       CNNBlock(128),
       Flatten(),
       Dense(64,activation='relu'),
       Dense(10,activation='softmax')
    ]
)

In [18]:
model.compile(optimizer=keras.optimizers.Adam(3e-4),loss=keras.losses.SparseCategoricalCrossentropy(),metrics=['accuracy'])

In [19]:
model.fit(x_train,y_train,batch_size=64,epochs=10,validation_data=(x_val,y_val))

Epoch 1/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 8s 7ms/step - accuracy: 0.4004 - loss: 1.6647 - val_accuracy: 0.5144 - val_loss: 1.3896
Epoch 2/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 7s 5ms/step - accuracy: 0.6241 - loss: 1.0648 - val_accuracy: 0.6069 - val_loss: 1.1248
Epoch 3/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.6911 - loss: 0.8971 - val_accuracy: 0.6574 - val_loss: 1.0087
Epoch 4/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.7312 - loss: 0.7792 - val_accuracy: 0.6739 - val_loss: 0.9358
Epoch 5/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.7701 - loss: 0.6836 - val_accuracy: 0.6826 - val_loss: 0.9246
Epoch 6/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 5s 5ms/step - accuracy: 0.7929 - loss: 0.6134 - val_accuracy: 0.6971 - val_loss: 0.8951
Epoch 7/10
625/625

<keras.src.callbacks.history.History at 0x7bdc8d6ee1d0>