

### Part 1: Understanding Batch Normalization

# Q1a. Explain the Concept of Batch Normalization

# Batch normalization is a technique for improving the training of deep neural networks by normalizing the inputs to each layer. For every mini-batch, it standardizes each feature to roughly zero mean and unit standard deviation, then applies a learnable scale and shift. This stabilizes the learning process and often results in faster convergence.
'''
Q1b. Describe the Benefits of Using Batch Normalization During Training

- **Stabilizes Training**: Keeping the distribution of each layer's inputs roughly fixed was originally motivated as reducing internal covariate shift, and in practice it makes training more stable.
- **Accelerates Convergence**: Normalized inputs allow higher learning rates with less risk of divergence.
- **Regularization**: The noise in mini-batch statistics acts as a mild regularizer, which can reduce the need for other methods such as dropout.
- **Reduces Dependency on Initialization**: The network is less sensitive to the initial weights, so it can be trained from different initializations and still converge effectively.
'''
# Q1c. Discuss the Working Principle of Batch Normalization
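
# The working principle can be illustrated with a minimal NumPy sketch. It is not how Keras implements the layer internally. The function names `batch_norm_train` and `batch_norm_infer` are hypothetical, and `momentum=0.99` and `eps=1e-3` are assumed here to mirror the Keras `BatchNormalization` defaults. During training, each feature is standardized with the statistics of the current mini-batch and then scaled by `gamma` and shifted by `beta`. At inference time, moving averages collected during training replace the batch statistics.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, running_mean, running_var,
                     momentum=0.99, eps=1e-3):
    """Training-mode forward pass for an array of shape (batch, features)."""
    mu = x.mean(axis=0)                      # per-feature mini-batch mean
    var = x.var(axis=0)                      # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # standardize each feature
    out = gamma * x_hat + beta               # learnable scale and shift
    # Update the moving averages that are used later at inference time.
    new_mean = momentum * running_mean + (1 - momentum) * mu
    new_var = momentum * running_var + (1 - momentum) * var
    return out, new_mean, new_var

def batch_norm_infer(x, gamma, beta, running_mean, running_var, eps=1e-3):
    """Inference-mode forward pass: uses moving averages, not batch statistics."""
    x_hat = (x - running_mean) / np.sqrt(running_var + eps)
    return gamma * x_hat + beta

# Synthetic activations with mean 5 and standard deviation 3 (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))
gamma, beta = np.ones(4), np.zeros(4)

out, run_mean, run_var = batch_norm_train(x, gamma, beta, np.zeros(4), np.ones(4))
print(out.mean(axis=0))  # close to 0 for every feature
print(out.std(axis=0))   # close to 1 for every feature
```

# With `gamma = 1` and `beta = 0` the output is simply the standardized input. During training the network learns `gamma` and `beta`, so it can undo the normalization where that helps.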

### Part 2: Implementation

# Q2a. Choose a Dataset and Preprocess It

import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

# Load and preprocess the CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Normalize the input data
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Convert labels to one-hot encoding
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)


# Q2b. Implement a Simple Feedforward Neural Network Without Batch Normalization


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Define the model
model_without_bn = Sequential([
    Flatten(input_shape=(32, 32, 3)),
    Dense(512, activation='relu'),
    Dense(256, activation='relu'),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile the model
model_without_bn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history_without_bn = model_without_bn.fit(X_train, y_train, validation_split=0.2, epochs=20, batch_size=64)


# Q2c. Implement Batch Normalization Layers in the Neural Network


from tensorflow.keras.layers import BatchNormalization

# Define the model with batch normalization
model_with_bn = Sequential([
    Flatten(input_shape=(32, 32, 3)),
    Dense(512, activation='relu'),
    BatchNormalization(),
    Dense(256, activation='relu'),
    BatchNormalization(),
    Dense(128, activation='relu'),
    BatchNormalization(),
    Dense(10, activation='softmax')
])

# Compile the model
model_with_bn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history_with_bn = model_with_bn.fit(X_train, y_train, validation_split=0.2, epochs=20, batch_size=64)
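
# To see how much the batch normalization layers add to the model, the parameter counts can be worked out by hand. This is a small sketch with hypothetical helper names (`dense_params`, `batch_norm_params`). It assumes the layer sizes used above and that each batch normalization layer holds four vectors per feature: trainable `gamma` and `beta`, plus a non-trainable moving mean and moving variance. The totals can be cross-checked against `model_with_bn.summary()`.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def batch_norm_params(n_features):
    """Return (trainable, non_trainable) parameter counts for one BN layer."""
    trainable = 2 * n_features       # gamma and beta
    non_trainable = 2 * n_features   # moving mean and moving variance
    return trainable, non_trainable

# Flattened CIFAR-10 image, three hidden layers, ten output classes.
layer_sizes = [32 * 32 * 3, 512, 256, 128, 10]

dense_total = sum(dense_params(a, b)
                  for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
bn_trainable = sum(batch_norm_params(n)[0] for n in layer_sizes[1:-1])
bn_non_trainable = sum(batch_norm_params(n)[1] for n in layer_sizes[1:-1])

print(dense_total)       # 1738890
print(bn_trainable)      # 1792
print(bn_non_trainable)  # 1792
```

# The batch normalization layers add only a few thousand parameters to a model with about 1.7 million, so their cost is mainly the extra computation per mini-batch rather than model size.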

# Q2d. Compare Training and Validation Performance


import matplotlib.pyplot as plt

# Plot training & validation accuracy values
plt.figure(figsize=(12, 6))

plt.subplot(1, 2, 1)
plt.plot(history_without_bn.history['accuracy'], label='Train Accuracy without BN')
plt.plot(history_without_bn.history['val_accuracy'], label='Val Accuracy without BN')
plt.plot(history_with_bn.history['accuracy'], label='Train Accuracy with BN')
plt.plot(history_with_bn.history['val_accuracy'], label='Val Accuracy with BN')
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='upper left')

# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history_without_bn.history['loss'], label='Train Loss without BN')
plt.plot(history_without_bn.history['val_loss'], label='Val Loss without BN')
plt.plot(history_with_bn.history['loss'], label='Train Loss with BN')
plt.plot(history_with_bn.history['val_loss'], label='Val Loss with BN')
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper left')

plt.show()


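# Q2e. Discuss the Impact of Batch Normalization on Training Process and Performance

# Batch normalization often speeds up training and can improve accuracy, but the effect should be read off the plots above rather than assumed. Faster convergence shows up as the curves with batch normalization reaching a given accuracy in fewer epochs. Better generalization shows up as higher validation accuracy, not only higher training accuracy. A growing gap between the training and validation curves indicates overfitting, which batch normalization alone does not prevent.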
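
# The comparison can also be put into numbers instead of being judged by eye. The sketch below uses hypothetical helpers (`epochs_to_reach`, `summarize`) that operate on a dictionary shaped like `history.history`. The values in `example` are made-up placeholders that only demonstrate the call. They are not results from the models above.

```python
def epochs_to_reach(values, threshold):
    """Return the first 1-based epoch whose value meets the threshold, else None."""
    for epoch, value in enumerate(values, start=1):
        if value >= threshold:
            return epoch
    return None

def summarize(history_dict, threshold):
    """Summarize convergence speed, best validation accuracy, and final gap."""
    val_acc = history_dict["val_accuracy"]
    return {
        "epochs_to_threshold": epochs_to_reach(val_acc, threshold),
        "best_val_accuracy": max(val_acc),
        # Training minus validation accuracy in the last epoch.
        "final_gap": history_dict["accuracy"][-1] - val_acc[-1],
    }

# Placeholder numbers, NOT measured results.
example = {"accuracy": [0.30, 0.45, 0.60], "val_accuracy": [0.28, 0.41, 0.50]}
summary = summarize(example, threshold=0.40)
print(summary)
```

# With the real runs, call `summarize(history_without_bn.history, threshold)` and `summarize(history_with_bn.history, threshold)` using the same threshold, then compare the two dictionaries.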

### Part 3: Experimentation and Analysis

# Q3a. Experiment with Different Batch Sizes


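# Experiment with different batch sizes.
# Clone the architecture for every run so that each batch size starts from
# freshly initialized weights. Recompiling model_with_bn would keep its trained
# weights, and every run would continue from the previous one.
batch_sizes = [32, 64, 128]
histories = []

for batch_size in batch_sizes:
    model = tf.keras.models.clone_model(model_with_bn)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    history = model.fit(X_train, y_train, validation_split=0.2, epochs=20, batch_size=batch_size)
    histories.append(history)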
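
# When comparing batch sizes epoch by epoch, keep in mind that a smaller batch size means more weight updates per epoch. A short sketch of the arithmetic follows. It assumes the 50,000 CIFAR-10 training images and the `validation_split=0.2` used above. The helper name `updates_per_epoch` is hypothetical.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of gradient updates in one epoch (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

num_total = 50_000               # CIFAR-10 training images
num_val = num_total // 5         # validation_split=0.2 holds out 20%
num_train = num_total - num_val  # 40,000 samples are actually trained on

updates = {b: updates_per_epoch(num_train, b) for b in (32, 64, 128)}
print(updates)  # {32: 1250, 64: 625, 128: 313}
```

# After the same number of epochs, the run with batch size 32 has made four times as many updates as the run with batch size 128, so part of any difference between the curves comes from the update count rather than from batch normalization.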

'''
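Q3b. Discuss Advantages and Potential Limitations of Batch Normalization

**Advantages:**
- Often speeds up training by allowing higher learning rates.
- Can improve final accuracy.
- Reduces sensitivity to weight initialization.
- Provides mild regularization through the noise in mini-batch statistics.

**Potential Limitations:**
- Adds computation and extra parameters to every normalized layer.
- Can interact poorly with dropout, because dropout changes the activation variance between training and inference.
- Becomes unreliable for very small mini-batches, where the batch statistics are noisy estimates.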
'''
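
# The small-batch limitation can be illustrated with a short simulation. This sketch uses synthetic standard-normal "activations" rather than values from a real network, and the helper `batch_mean_spread` is hypothetical. It measures how much the mini-batch mean fluctuates from batch to batch for several batch sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic activations with true mean 0 and true standard deviation 1.
activations = rng.normal(loc=0.0, scale=1.0, size=100_000)

def batch_mean_spread(values, batch_size, num_batches=1000):
    """Standard deviation of the mini-batch mean across many random batches."""
    batches = rng.choice(values, size=(num_batches, batch_size))
    return batches.mean(axis=1).std()

spread = {b: batch_mean_spread(activations, b) for b in (2, 8, 32, 128)}
for batch_size, value in spread.items():
    print(batch_size, round(float(value), 3))
```

# The spread shrinks roughly as 1 / sqrt(batch_size). With only a few samples per batch, the mean and variance that batch normalization divides by are themselves noisy, which is why the technique tends to degrade for very small mini-batches.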