""" 
Group C: Assignment No. 12
Assignment Title: TensorFlow/Pytorch implementation of CNN
"""
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

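# Load MNIST: 28x28 grayscale digit images with integer labels 0-9
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Add a channel axis and scale pixel values from [0, 255] to [0, 1]
X_train = X_train.reshape(-1, 28, 28, 1) / 255.0
X_test = X_test.reshape(-1, 28, 28, 1) / 255.0
# One-hot encode the labels to match the categorical cross-entropy loss
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)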

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train on the training set, then measure loss and accuracy on the held-out test set
model.fit(X_train, y_train, batch_size=64, epochs=10, verbose=1)
loss, accuracy = model.evaluate(X_test, y_test)

print(f"Test Loss: {loss}")
print(f"Test Accuracy: {accuracy}")

''' Viva preparation for this assignment, a **Convolutional Neural Network (CNN)** implemented in **TensorFlow**. The questions below cover **conceptual, technical, and practical** points, each with a concise answer:

---

### **1. What is the main advantage of using CNNs over traditional neural networks for image data?**

**Answer:**
CNNs are designed for image data: convolutional layers operate directly on the 2D grid of pixels, so neighboring pixels are processed together. A fully connected network must flatten the image first, which discards these spatial relationships. Because each filter's weights are shared across all positions, a CNN also needs far fewer parameters and can detect a feature such as an edge, texture, or shape wherever it appears in the image.

---

### **2. What is the role of the convolutional layer in a CNN?**

**Answer:**
The convolutional layer applies filters (kernels) to the input image to extract important features such as edges, corners, and textures. Each filter scans over the image and produces a **feature map** that highlights specific patterns present in different regions of the image.
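
The scanning step described above can be sketched in NumPy. This is a minimal illustration, not how TensorFlow implements `Conv2D` internally; the toy image and the hand-picked vertical-edge kernel are assumptions made for the example (a CNN learns its kernels during training). Like deep learning libraries, it slides the kernel without flipping it.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over every position where it fully fits ("valid" padding)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Toy 5x5 image: dark left columns (0), bright right columns (1)
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
# Hand-picked vertical-edge filter (hypothetical; a CNN learns its filters)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The feature map is large where the 3x3 window straddles the dark-to-bright edge and zero where the window covers a uniform region.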

---

### **3. What does the pooling layer do in CNNs?**

**Answer:**
Pooling layers reduce the spatial dimensions (width and height) of feature maps, helping to **reduce computation**, **control overfitting**, and **extract dominant features**. Common pooling methods include **max pooling** (which takes the maximum value) and **average pooling**.
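
As a rough sketch of what 2x2 max pooling does to a single feature map, assuming non-overlapping windows and a made-up 4x4 input:

```python
import numpy as np

def max_pool_2x2(feature_map):
    h, w = feature_map.shape
    # Drop a trailing odd row/column, as non-overlapping pooling without padding does
    h, w = h - h % 2, w - w % 2
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    # Take the maximum inside each 2x2 block
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 0, 5, 6],
                        [1, 2, 7, 8]])
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4 2] [2 8]]
```

Each side of the map is halved, and only the strongest response in each block survives.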

---

### **4. Why do we use activation functions like ReLU in CNNs?**

**Answer:**
Activation functions like **ReLU (Rectified Linear Unit)**, defined as `max(0, x)`, introduce non-linearity into the network, allowing CNNs to learn complex patterns. Without a non-linearity between them, stacked convolutional and dense layers would collapse into a single linear transformation, no matter how deep the network is.
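
The effect of the non-linearity can be checked numerically. The small matrices below are arbitrary values chosen for illustration and stand in for two dense layers without biases.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

x = np.array([[1.0, -1.0],
              [2.0, 0.5]])      # two sample inputs
W1 = np.array([[1.0, -1.0],
               [0.5, 2.0]])     # first layer weights
W2 = np.array([[1.0],
               [1.0]])          # second layer weights

# Two stacked linear layers equal one linear layer with weights W1 @ W2
stacked_linear = (x @ W1) @ W2
single_linear = x @ (W1 @ W2)
linear_collapses = np.allclose(stacked_linear, single_linear)

# With ReLU in between, the single merged layer no longer reproduces the output
stacked_relu = relu(x @ W1) @ W2
relu_collapses = np.allclose(stacked_relu, single_linear)

print(linear_collapses, relu_collapses)  # True False
```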

---

### **5. What is the purpose of flattening in CNN?**

**Answer:**
Flattening converts the 3D output of the last convolutional or pooling layer (height × width × channels) into a 1D vector per image. This vector can then be fed into fully connected (Dense) layers for classification.
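
For the model in this notebook, the length of the flattened vector can be worked out by hand. The sketch below assumes the Keras defaults used in the code (stride 1 and no padding for `Conv2D`, non-overlapping windows for `MaxPooling2D`); the helper names are made up for the example.

```python
def conv_out(size, kernel=3):
    # A convolution without padding and with stride 1 shrinks each side by kernel - 1
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling divides each side by the pool size (rounded down)
    return size // pool

side = 28
side = conv_out(side)   # Conv2D(32, (3, 3))   -> 26
side = pool_out(side)   # MaxPooling2D((2, 2)) -> 13
side = conv_out(side)   # Conv2D(64, (3, 3))   -> 11
side = pool_out(side)   # MaxPooling2D((2, 2)) -> 5
side = conv_out(side)   # Conv2D(64, (3, 3))   -> 3

channels = 64           # filters in the last Conv2D layer
flattened_length = side * side * channels
print(side, flattened_length)  # 3 576
```

So `Flatten()` hands a 576-element vector to `Dense(64)`.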

---

### **6. Which loss function and optimizer did you use and why?**

**Answer:**
We used **categorical cross-entropy** as the loss function because the MNIST labels are one-hot encoded with `to_categorical`. With integer labels, **sparse categorical cross-entropy** would compute the same loss without the encoding step. We used the **Adam optimizer** because it adapts the learning rate for each parameter and usually converges quickly with its default settings.
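
The relationship between the two loss functions can be illustrated in NumPy. The probabilities below are made-up softmax outputs, and the formulas are written out by hand rather than taken from TensorFlow.

```python
import numpy as np

# Hypothetical softmax outputs for two samples over three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
labels = np.array([0, 2])        # integer labels
one_hot = np.eye(3)[labels]      # the same labels, one-hot encoded

# Categorical cross-entropy: expects one-hot targets
categorical = -np.sum(one_hot * np.log(probs), axis=1).mean()
# Sparse categorical cross-entropy: expects integer targets
sparse = -np.log(probs[np.arange(len(labels)), labels]).mean()

print(categorical, sparse)
```

Both expressions give the same value; only the label format differs.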

---

### **7. What dataset did you use and why is it suitable?**

**Answer:**
We used the **MNIST** dataset, which contains 60,000 training and 10,000 test images of handwritten digits (0-9), each a 28×28 grayscale image. It is small, clean, and balanced enough to test CNN basics quickly. A dataset such as **CIFAR-10**, which contains 32×32 color images from 10 categories, would be a natural next step for evaluating performance on a more complex image classification task.

---

### **8. How do you evaluate the performance of your CNN model?**

**Answer:**
We evaluate performance using metrics like **accuracy** and **loss** on the test set via `model.evaluate()`. A **confusion matrix** and **classification report** can also be used to analyze class-wise performance.
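
A confusion matrix can be built in a few lines. The sketch below uses made-up labels for a 3-class problem and a hypothetical helper named `confusion_matrix`.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    # Rows are true classes, columns are predicted classes
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix

# Hypothetical labels for a 3-class problem
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy = np.trace(cm) / cm.sum()                # correct predictions / all predictions
per_class_recall = np.diag(cm) / cm.sum(axis=1)   # correct / true count, per class
print(cm)
print(accuracy, per_class_recall)
```

For the model in this notebook, `y_true` could be obtained as `np.argmax(y_test, axis=1)` and `y_pred` as `np.argmax(model.predict(X_test), axis=1)`, since the labels are one-hot encoded.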

---

### **9. What techniques can help prevent overfitting in CNNs?**

**Answer:**

* **Dropout layers** (randomly deactivate neurons during training)
* **L2 regularization** (adds penalty to weights)
* **Data augmentation** (flips, rotates, crops images to create variability)
* **Early stopping** (stops training when validation loss stops improving)
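
The early-stopping rule can be sketched in plain Python. The validation losses are invented for illustration, and `early_stopping_epoch` is a hypothetical helper, not a TensorFlow function.

```python
def early_stopping_epoch(val_losses, patience=2):
    # Return the index of the epoch at which training would stop
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1

# Hypothetical validation losses: improving at first, then rising (overfitting)
val_losses = [0.30, 0.12, 0.08, 0.09, 0.10, 0.15]
stop_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch)  # 4
```

In Keras, the `tf.keras.callbacks.EarlyStopping` callback with `monitor='val_loss'` and a `patience` value applies the same idea. It needs validation data, for example via the `validation_split` argument of `model.fit()`.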

---

### **10. What is data augmentation and why is it used?**

**Answer:**
Data augmentation artificially increases the size and diversity of the training dataset by applying transformations such as flipping, rotating, zooming, etc. It helps the model **generalize better** and reduces overfitting.
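
One simple transformation is a small random shift. The sketch below is a NumPy illustration with a synthetic stand-in image and a hypothetical helper named `random_shift`. For digit images, small shifts are a safer choice than flips, since a mirrored digit is generally no longer a valid example of its class.

```python
import numpy as np

def random_shift(image, max_shift, rng):
    # Pad with zeros, then crop a window offset by a random number of pixels
    h, w = image.shape
    padded = np.pad(image, max_shift, mode="constant")
    dy, dx = rng.integers(0, 2 * max_shift + 1, size=2)
    return padded[dy:dy + h, dx:dx + w]

rng = np.random.default_rng(42)
image = np.zeros((28, 28))
image[10:18, 12:16] = 1.0   # stand-in for a digit stroke

# Four shifted variants of the same image, all with the same label
augmented = [random_shift(image, max_shift=2, rng=rng) for _ in range(4)]
print(augmented[0].shape)  # (28, 28)
```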

---

### **11. What is the difference between training accuracy and validation accuracy?**

**Answer:**

* **Training accuracy** measures performance on the data the model was trained on.
* **Validation accuracy** measures performance on held-out data that is not used to update the weights, so it estimates how well the model generalizes.

A training accuracy well above the validation accuracy indicates overfitting.

---

### **12. What is the purpose of using `model.save()` in TensorFlow?**

**Answer:**
`model.save()` **serializes the trained model** to disk, including its architecture, its weights, and, for a compiled model, the optimizer state. It can be reloaded later with `tf.keras.models.load_model()` for inference or further training without retraining from scratch.
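
A minimal save-and-reload round trip might look like the sketch below. The tiny untrained model and the file name `demo_model.keras` are placeholders for illustration, not the CNN from this assignment.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Tiny untrained stand-in model (hypothetical; not the CNN from this assignment)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Save to a temporary directory, then load the model back
path = os.path.join(tempfile.mkdtemp(), "demo_model.keras")
model.save(path)
restored = tf.keras.models.load_model(path)

# The restored model should produce the same predictions as the original
x = np.ones((2, 4), dtype="float32")
same_predictions = np.allclose(model.predict(x, verbose=0),
                               restored.predict(x, verbose=0))
print(same_predictions)
```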

---

### **13. What happens during forward and backward propagation in CNN?**

**Answer:**

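* **Forward Propagation**: Inputs pass through the convolutional, pooling, and dense layers to compute predictions, and the loss compares these predictions with the true labels.
* **Backward Propagation**: The gradient of the loss with respect to every weight is computed layer by layer using the chain rule, and the optimizer (Adam in this assignment, a variant of gradient descent) uses these gradients to update the weights.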
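
The two phases can be seen in a deliberately tiny example: one weight, a mean squared error loss, and plain gradient descent. This is a simplification chosen for illustration; the CNN in this notebook has many layers and uses Adam, but the loop has the same structure.

```python
import numpy as np

# Toy data: learn y = 2 * x with a single weight (hypothetical example)
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
w = 0.0
learning_rate = 0.05

losses = []
for step in range(20):
    prediction = w * x                           # forward pass
    loss = np.mean((prediction - y) ** 2)        # mean squared error
    grad_w = np.mean(2 * (prediction - y) * x)   # backward pass: dLoss/dw
    w -= learning_rate * grad_w                  # gradient-descent update
    losses.append(loss)

print(w, losses[0], losses[-1])
```

The loss shrinks at every step and `w` approaches 2.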

---

### **14. What is the kernel size in convolutional layers, and how does it affect the model?**

**Answer:**
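Kernel size (e.g., 3x3, 5x5) determines the **region of the input** that each filter covers at one position. Larger kernels see more context per layer but need more parameters. Small 3x3 kernels are the common choice in deeper networks: two stacked 3x3 layers cover the same 5x5 region as a single 5x5 layer, with fewer parameters and an extra non-linearity in between.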
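
The parameter cost of different kernel sizes is simple arithmetic. The sketch below assumes 64 input and 64 output channels for the comparison; `conv_params` is a hypothetical helper.

```python
def conv_params(kernel_size, in_channels, out_channels):
    # Weights (k * k * in * out) plus one bias per output channel
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

channels = 64
one_5x5 = conv_params(5, channels, channels)
two_3x3 = 2 * conv_params(3, channels, channels)
print(one_5x5, two_3x3)  # 102464 73856

# Second Conv2D layer of this notebook: 32 input channels, 64 filters of size 3x3
second_layer = conv_params(3, 32, 64)
print(second_layer)      # 18496
```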

---

### **15. What is the role of batch size and number of epochs in training?**

**Answer:**

* **Batch size** determines how many samples are processed before updating weights.
* **Epochs** represent how many times the model sees the entire dataset.
  Both impact training speed and convergence.
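
For the settings used in this notebook, the number of weight updates follows directly from the two values. The arithmetic below assumes the full 60,000-image MNIST training set is passed to `model.fit()`, as in the code above.

```python
import math

num_samples = 60000   # MNIST training set size
batch_size = 64       # as passed to model.fit
epochs = 10           # as passed to model.fit

# The last batch of each epoch is smaller, so round up
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 938 9380
```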

---


'''