**Question 1: What is the role of filters and feature maps in a Convolutional Neural
Network (CNN)?**

Answer:

In a Convolutional Neural Network (CNN), filters and feature maps are what allow the network to learn meaningful patterns from input images automatically. A filter, also called a kernel, is a small matrix of learnable weights that slides over the input and computes a weighted sum at each position (the convolution operation). Filters are not hand-designed: during training, the network adjusts their values so that each one responds to a pattern that is useful for the task, such as an edge, a texture, a corner, a color gradient, or a more complex structure. A CNN typically uses many filters in each convolutional layer, so each layer can detect a wide variety of features, and because every filter is applied at every spatial position, it can find its pattern anywhere in the image.

When a filter is applied to an image, it produces a feature map, also known as an activation map. The feature map records how strongly the filter responds at each location of the image. For example, if a particular filter has learned to detect vertical edges, its feature map will have large values wherever vertical edges appear. As a CNN progresses through deeper layers, feature maps capture increasingly complex and abstract patterns. Early layers detect simple features like edges or corners, intermediate layers detect shapes or object parts, and deeper layers capture high-level concepts such as faces or whole objects.

The interaction between filters and feature maps allows CNNs to learn spatial hierarchies of features, meaning simple patterns combine to form more complex patterns in deeper layers. This hierarchical learning is what makes CNNs highly effective for tasks such as image classification, object detection, and visual recognition. Furthermore, because the same filter weights are shared across the entire image, convolution is translation equivariant: if an object shifts in the input, its response shifts by the same amount in the feature map. Combined with pooling, this gives CNNs approximate translation invariance, meaning they can detect an object largely regardless of its location in the image. Overall, filters extract meaningful visual patterns, while feature maps record the presence and strength of those patterns, enabling CNNs to understand and learn from image data effectively.
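The vertical-edge example above can be made concrete with a small NumPy sketch. The 6×6 toy image and the hand-written filter are illustrative assumptions; in a real CNN the filter weights would be learned rather than written by hand. The function name `conv2d_valid` is likewise hypothetical.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like deep learning frameworks, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map


# Toy image: dark left half (0), bright right half (1), so there is one vertical edge.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge filter (an assumption for illustration).
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])

feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)  # every row is [0. 3. 3. 0.]
```

The feature map is zero where the image is flat and peaks at the columns that straddle the edge, which is the "how strongly the filter responds" reading described above.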

**Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural
Networks). How do they affect the output dimensions of feature maps?**

Answer:
In Convolutional Neural Networks (CNNs), padding and stride are two important concepts that directly influence how the convolution operation processes an image and how the size of the output feature map is determined.

Padding refers to adding extra rows and columns (usually filled with zeros) around the edges of the input before applying a convolution filter. Padding is used for two main reasons. First, it helps preserve spatial information at the borders of the image. Without padding, the filter can only be centered on interior pixels, so border pixels contribute to fewer outputs and the feature map shrinks with every convolutional layer. Second, padding lets the network control the size of the output feature map. With “same” padding, enough zeros are added so that, at a stride of 1, the output feature map has the same spatial dimensions as the input. With “valid” padding, no padding is added, so each convolution reduces the dimensions. Padding therefore helps maintain resolution, retain edge information, and prevent excessive shrinking of feature maps in deep CNNs.

Stride, on the other hand, controls how the filter moves across the input image. A stride of 1 means the filter shifts one pixel at a time, generating dense and high-resolution feature maps. If the stride is increased to 2 or more, the filter jumps over pixels, resulting in a smaller output feature map and a reduction in computational cost. Larger strides help down-sample the input, similar to pooling, but they may also cause a loss of detailed spatial information because fewer regions of the input are covered.

Both padding and stride significantly affect the output dimensions of feature maps. When padding is applied, the output size increases or remains the same depending on the type of padding. Without padding, the output size becomes smaller. Stride also influences the output size: the larger the stride, the smaller the feature map becomes because fewer positions are evaluated. Mathematically, the output dimension can be computed using the formula:

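$$\text{Output Size} = \left\lfloor \frac{N + 2P - F}{S} \right\rfloor + 1$$

where N is the input size, P is the padding added on each side, F is the filter size, S is the stride, and ⌊·⌋ denotes rounding down. For example, a 28×28 input with a 3×3 filter, no padding, and a stride of 1 gives (28 + 0 − 3)/1 + 1 = 26, so the output is 26×26. Thus, padding helps keep the dimensions larger, while stride controls how much the feature map is reduced. Together, they shape the resolution and amount of information extracted in each convolutional layer of a CNN.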
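As a quick check of the output-size rule, floor((N + 2P − F) / S) + 1, the following minimal sketch evaluates it for a few settings. The helper name `conv_output_size` and the chosen sizes are illustrative assumptions.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output length along one spatial dimension: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1


# (input size, filter size, padding, stride)
settings = [
    (28, 3, 0, 1),  # "valid" padding shrinks 28 -> 26
    (28, 3, 1, 1),  # one pixel of padding keeps 28 -> 28 ("same" at stride 1)
    (28, 3, 0, 2),  # stride 2 roughly halves the size: 28 -> 13
    (32, 5, 2, 1),  # a 5x5 filter needs padding 2 to keep 32 -> 32
]

for n, f, p, s in settings:
    print(f"N={n}, F={f}, P={p}, S={s} -> output {conv_output_size(n, f, p, s)}")
```

Integer division (`//`) implements the rounding-down step, which matters whenever the stride does not divide N + 2P − F evenly.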


**Question 3: Define receptive field in the context of CNNs. Why is it important for deep
architectures?**

Answer:

In the context of Convolutional Neural Networks (CNNs), the receptive field refers to the specific region of the input image that influences the value of a particular neuron in a feature map. In other words, it is the area of the input that a neuron “looks at” or is sensitive to when computing its activation. In early layers of a CNN, the receptive field is small because filters are small (e.g., 3×3 or 5×5), so a neuron reacts only to local features such as edges or corners. As the network becomes deeper, with multiple convolution and pooling layers stacked on top of each other, the receptive field of neurons in deeper layers grows larger. This happens because each neuron in a deeper layer receives input from a region of the feature map that itself corresponds to a region of the original image. Thus, deeper neurons effectively learn higher-level and more abstract representations of the input.

The concept of receptive field is extremely important for deep architectures because it defines how much “context” the network can capture. For tasks like image classification, object detection, or semantic segmentation, understanding large regions or the entire object is essential. If the receptive field is too small, the network may only learn fine-grained local details and fail to recognize global structures. In contrast, a sufficiently large receptive field allows deeper layers to integrate information from large portions of the image, enabling the recognition of shapes, patterns, and object parts. This is especially important in modern deep CNNs, where multiple layers work together to extract hierarchical features—from low-level edges in early layers to full object representations in deeper layers.

Moreover, an appropriately large receptive field helps a deep CNN cope with challenges such as variations in scale, object deformation, and noise. It ensures that the model does not rely solely on local pixel patterns but can understand the overall spatial arrangement of features. Techniques like using larger filter sizes, stacking more convolution layers, using dilated convolutions, and applying pooling layers all contribute to expanding the receptive field. In summary, the receptive field is a fundamental concept that determines how much of the input image each neuron can interpret, and it is crucial for enabling deep CNN architectures to learn meaningful, high-level representations required for complex vision tasks.
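How the receptive field grows with depth can be sketched with a standard recurrence: each layer adds (kernel − 1) × jump input pixels, where the jump is the product of the strides of all earlier layers. The function name and the layer lists below are illustrative assumptions. A dilated convolution can be modeled by passing its effective kernel size, d × (k − 1) + 1.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of one output unit.

    `layers` is a list of (kernel_size, stride) pairs ordered from input to output.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # new input pixels seen through this layer
        jump *= stride              # spacing of this layer's units in input pixels
    return rf


print(receptive_field([(3, 1)]))                  # one 3x3 conv -> 3
print(receptive_field([(3, 1), (3, 1)]))          # two stacked 3x3 convs -> 5
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # three stacked 3x3 convs -> 7

# conv 3x3, pool 2x2 (stride 2), conv 3x3, pool 2x2 (stride 2)
print(receptive_field([(3, 1), (2, 2), (3, 1), (2, 2)]))  # -> 10
```

The last line shows why pooling and strided layers matter: they enlarge the jump, so every later layer widens the receptive field faster.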


**Question 4: Discuss how filter size and stride influence the number of parameters in a
CNN.**

Answer:
In a Convolutional Neural Network (CNN), filter size and stride are two architectural choices that shape how the network processes data, but they affect model size differently: filter size directly determines the number of parameters, while stride affects only the computational cost. The filter size (or kernel size) refers to the height and width of the filter matrix, such as 3×3, 5×5, or 7×7. The number of learnable parameters in a convolutional layer is given by the formula:

$$\text{Parameters} = (F_h \times F_w \times C_{in}) \times C_{out} + C_{out}$$

where $F_h$ and $F_w$ are the filter height and width, $C_{in}$ is the number of input channels, and $C_{out}$ is the number of filters (i.e., output channels); the final $+\,C_{out}$ term counts one bias per filter. From this formula, it is clear that the filter size directly influences the number of parameters: larger filters have more weights and thus increase the total number of parameters. For example, a 5×5 filter has 25 weights per channel, whereas a 3×3 filter has only 9. This is why modern CNN architectures typically use small filters like 3×3: stacking them grows the receptive field while using fewer parameters than a single large filter.

On the other hand, stride determines how far the filter moves across the input image during convolution. While stride does not change the actual number of learnable parameters—because parameter count depends only on filter size, number of channels, and number of filters—it does influence the total number of computations and the size of the output feature map. A larger stride reduces the spatial dimensions of the output feature map, which decreases the number of activations that need to be computed, thus reducing computational cost. A stride of 1 produces a dense and large output map, requiring more computations, whereas a stride of 2 or more produces smaller output maps, reducing computational burden.

Therefore, while stride does not change the number of weights in the model, it does change the computational cost by determining how many times each filter is applied to the input. Larger strides reduce the number of filter applications and consequently reduce memory usage and processing time. In contrast, filter size directly increases or decreases the number of parameters the network must learn. Together, filter size and stride determine the model size, the rate at which the receptive field grows, and the computational requirements of a CNN, making them essential considerations when designing convolutional architectures.
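The distinction between parameters and computation can be illustrated with a short sketch. The helper names, the square-input assumption, and the example layer sizes are illustrative choices, and the multiply-accumulate count ignores bias additions.

```python
def conv_params(fh, fw, c_in, c_out, bias=True):
    """Learnable parameters of one conv layer: (Fh * Fw * Cin) * Cout (+ Cout biases)."""
    return fh * fw * c_in * c_out + (c_out if bias else 0)


def conv_macs(n, f, c_in, c_out, padding=0, stride=1):
    """Multiply-accumulates for a square n x n input and a square f x f filter."""
    out = (n + 2 * padding - f) // stride + 1
    return out * out * f * f * c_in * c_out


# Filter size changes the parameter count.
print(conv_params(3, 3, 32, 64))  # 18496
print(conv_params(5, 5, 32, 64))  # 51264

# Two stacked 3x3 layers cover a 5x5 region with fewer weights than one 5x5 layer.
print(2 * conv_params(3, 3, 64, 64, bias=False))  # 73728
print(conv_params(5, 5, 64, 64, bias=False))      # 102400

# Stride leaves the parameter count untouched but cuts the computation.
print(conv_macs(32, 3, 32, 64, padding=1, stride=1))  # 18874368
print(conv_macs(32, 3, 32, 64, padding=1, stride=2))  # 4718592
```

Doubling the stride halves each output dimension, so the number of filter applications, and with it the computation, drops by about a factor of four while the weights stay the same.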

**Question 5: Compare and contrast different CNN-based architectures like LeNet,
AlexNet, and VGG in terms of depth, filter sizes, and performance.**

Answer:

CNN-based architectures such as LeNet, AlexNet, and VGG represent important milestones in the evolution of deep learning for computer vision. They differ significantly in terms of depth, filter sizes, design philosophy, and performance, reflecting how CNNs advanced over time from simple networks to very deep architectures capable of high accuracy on large-scale datasets.

LeNet-5, introduced by Yann LeCun and colleagues in 1998, is one of the earliest CNN architectures and was primarily designed for handwritten digit recognition (MNIST dataset). It is a relatively shallow network with around 5–7 layers, depending on how they are counted, including convolutional, pooling, and fully connected layers. LeNet uses 5×5 filters and employs tanh as its activation function. Its depth and complexity are modest, at only about 60,000 parameters, because it was developed at a time when computational resources were extremely limited. Despite being simple, LeNet established the fundamental building blocks of modern CNNs.

AlexNet, introduced by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton in 2012, marked a major breakthrough in deep learning by winning the ImageNet competition (ILSVRC 2012) by a wide margin, with a top-5 error of about 15% against roughly 26% for the runner-up. AlexNet is much deeper than LeNet, with 8 learned layers (5 convolutional and 3 fully connected) and roughly 60 million parameters. It uses large filters in its early layers (11×11 in the first and 5×5 in the second), while later layers use smaller 3×3 kernels. AlexNet popularized several techniques, such as the ReLU activation function for faster training, dropout for regularization, and data augmentation to reduce overfitting. It was also trained on GPUs, which made training on a dataset the size of ImageNet practical. AlexNet demonstrated the power of deep networks on complex, high-resolution images.

VGG, developed by Simonyan and Zisserman in 2014, pushed the idea of depth even further. VGG networks, such as VGG16 and VGG19, contain 16 and 19 weight layers respectively, making them significantly deeper than AlexNet. Unlike AlexNet, VGG uses a very consistent design pattern: VGG16 and VGG19 rely exclusively on small 3×3 filters stacked sequentially, often in groups of two or three, followed by max pooling. A stack of three 3×3 layers covers the same 7×7 region as a single 7×7 layer but needs fewer weights and adds more non-linearities, which shows that deeper networks with small filters can achieve better performance. VGG networks achieve very high accuracy on ImageNet and are widely used as feature extractors in transfer learning applications. However, they contain a very large number of parameters (about 138 million for VGG16 and about 144 million for VGG19), most of them in the fully connected layers, which makes them computationally expensive and memory-intensive.

In terms of performance, LeNet works well on small grayscale images but is not suitable for complex, large-scale datasets. AlexNet significantly improved accuracy and demonstrated the potential of deep learning, especially when combined with GPUs and ReLU. VGG further improved performance and is known for its simplicity and uniform architecture, though it is computationally heavy. Together, LeNet, AlexNet, and VGG represent the progression from early CNNs to deep and high-performing models, illustrating how architectural depth, filter design, and computational power collectively contribute to improved recognition accuracy.

**Question 6: Using Keras, build and train a simple CNN model on the MNIST dataset
from scratch. Include code for model creation, compilation, training, and evaluation.
(Include your Python code and output in the code box below.)**

Answer:


In [1]:
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()

# Preprocess the data
x_train = x_train.reshape((x_train.shape[0], 28, 28, 1)).astype('float32') / 255.0
x_test = x_test.reshape((x_test.shape[0], 28, 28, 1)).astype('float32') / 255.0

# One-hot encode labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Build a simple CNN model
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.1)

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step



Epoch 1/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 44s 50ms/step - accuracy: 0.8761 - loss: 0.4340 - val_accuracy: 0.9833 - val_loss: 0.0577
Epoch 2/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 44s 52ms/step - accuracy: 0.9815 - loss: 0.0607 - val_accuracy: 0.9823 - val_loss: 0.0616
Epoch 3/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 43s 51ms/step - accuracy: 0.9875 - loss: 0.0399 - val_accuracy: 0.9873 - val_loss: 0.0426
Epoch 4/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 80s 49ms/step - accuracy: 0.9906 - loss: 0.0293 - val_accuracy: 0.9887 - val_loss: 0.0363
Epoch 5/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 82s 50ms/step - accuracy: 0.9931 - loss: 0.0212 - val_accuracy: 0.9905 - val_loss: 0.0361
313/313 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.9872 - loss: 0.0409
Test accuracy: 0.9890


**Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a
CNN model to classify RGB images. Show your preprocessing and architecture.
(Include your Python code and output in the code box below.)**

Answer:


In [3]:
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()

# Preprocess the data
x_train = x_train.astype('float32') / 255.0  # Normalize pixel values to [0,1]
x_test = x_test.astype('float32') / 255.0

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build CNN model
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', padding='same', input_shape=(32,32,3)),
    layers.Conv2D(32, (3,3), activation='relu', padding='same'),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),

    layers.Conv2D(64, (3,3), activation='relu', padding='same'),
    layers.Conv2D(64, (3,3), activation='relu', padding='same'),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),

    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.1)

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")


Epoch 1/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 245s 345ms/step - accuracy: 0.3180 - loss: 1.8321 - val_accuracy: 0.5284 - val_loss: 1.3103
Epoch 2/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 251s 329ms/step - accuracy: 0.5578 - loss: 1.2290 - val_accuracy: 0.6548 - val_loss: 0.9885
Epoch 3/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 266s 334ms/step - accuracy: 0.6407 - loss: 1.0026 - val_accuracy: 0.6934 - val_loss: 0.8914
Epoch 4/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 257s 328ms/step - accuracy: 0.6868 - loss: 0.8792 - val_accuracy: 0.7432 - val_loss: 0.7698
Epoch 5/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 267s 334ms/step - accuracy: 0.7211 - loss: 0.7846 - val_accuracy: 0.7540 - val_loss: 0.7122
313/313 ━━━━━━━━━━━━━━━━━━━━ 14s 43ms/step - accuracy: 0.7402 - loss: 0.7493
Test accuracy: 0.7391


**Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST
dataset. Include model definition, data loaders, training loop, and accuracy evaluation.
(Include your Python code and output in the code box below.)**

Answer:

In [4]:
# Import necessary libraries
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define transformations for MNIST
transform = transforms.Compose([
    transforms.ToTensor(),          # Convert PIL image to tensor
    transforms.Normalize((0.1307,), (0.3081,))  # Normalize mean & std
])

# Load MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

# Define CNN model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
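        self.conv1 = nn.Conv2d(1, 32, 3, 1)   # 1 input channel, 32 filters, 3x3 kernel, stride 1: 28x28 -> 26x26
        self.conv2 = nn.Conv2d(32, 64, 3, 1)  # 32 -> 64 channels, 3x3 kernel: 26x26 -> 24x24
        self.dropout1 = nn.Dropout2d(0.25)
        self.fc1 = nn.Linear(9216, 128)       # 9216 = 64 channels * 12 * 12 after the 2x2 max pool in forward()
        self.dropout2 = nn.Dropout(0.5)
        self.fc2 = nn.Linear(128, 10)         # 10 output logits, one per digit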

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = self.dropout2(x)
        x = self.fc2(x)
        return x

model = CNN().to(device)

# Define optimizer and loss function
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

# Training loop
epochs = 5
for epoch in range(1, epochs + 1):
    model.train()
    running_loss = 0
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch}, Loss: {running_loss / len(train_loader):.4f}")

# Evaluation
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        output = model(data)
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()
        total += target.size(0)

print(f"Test Accuracy: {correct / total:.4f}")


100%|██████████| 9.91M/9.91M [00:01<00:00, 6.09MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 161kB/s]
100%|██████████| 1.65M/1.65M [00:01<00:00, 1.52MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 8.06MB/s]


Epoch 1, Loss: 0.2238
Epoch 2, Loss: 0.0973
Epoch 3, Loss: 0.0761
Epoch 4, Loss: 0.0638
Epoch 5, Loss: 0.0519
Test Accuracy: 0.9908


**Question 9: Given a custom image dataset stored in a local directory, write code using
Keras ImageDataGenerator to preprocess and train a CNN model.
(Include your Python code and output in the code box below.)**

Answer:


In [8]:
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Normalize pixel values
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Define a simple CNN model
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', padding='same', input_shape=(32,32,3)),
    layers.MaxPooling2D((2,2)),

    layers.Conv2D(64, (3,3), activation='relu', padding='same'),
    layers.MaxPooling2D((2,2)),

    layers.Conv2D(128, (3,3), activation='relu', padding='same'),
    layers.MaxPooling2D((2,2)),

    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(
    x_train, y_train,
    batch_size=64,
    epochs=5,
    validation_split=0.1
)

# Evaluate on test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")


Epoch 1/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 100s 138ms/step - accuracy: 0.3374 - loss: 1.7935 - val_accuracy: 0.5646 - val_loss: 1.2159
Epoch 2/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 98s 139ms/step - accuracy: 0.5850 - loss: 1.1650 - val_accuracy: 0.6478 - val_loss: 1.0150
Epoch 3/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 142s 138ms/step - accuracy: 0.6674 - loss: 0.9505 - val_accuracy: 0.6866 - val_loss: 0.8999
Epoch 4/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 97s 138ms/step - accuracy: 0.7103 - loss: 0.8291 - val_accuracy: 0.7220 - val_loss: 0.8141
Epoch 5/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 144s 140ms/step - accuracy: 0.7469 - loss: 0.7263 - val_accuracy: 0.7326 - val_loss: 0.7880
313/313 ━━━━━━━━━━━━━━━━━━━━ 11s 34ms/step - accuracy: 0.7238 - loss: 0.8234
Test accuracy: 0.7142
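
The cell above trains on in-memory CIFAR-10 arrays, whereas the question asks for images read from a local directory. A minimal sketch of that route with `ImageDataGenerator.flow_from_directory` might look like the following. It assumes one sub-folder per class (for example `dataset/cats/`, `dataset/dogs/`); the folder layout, image size, augmentation settings, and function names are all illustrative assumptions, and the sketch has not been run on real data. Note also that recent TensorFlow releases mark `ImageDataGenerator` as deprecated in favor of `image_dataset_from_directory`.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = (64, 64)   # assumed target size; images are resized on load
BATCH_SIZE = 32
VAL_SPLIT = 0.2       # assumed fraction of each class held out for validation


def make_generators(data_dir):
    """Build training/validation iterators from data_dir/<class_name>/<image files>."""
    # Training images are rescaled to [0, 1] and lightly augmented.
    train_datagen = ImageDataGenerator(
        rescale=1.0 / 255,
        rotation_range=10,
        zoom_range=0.1,
        horizontal_flip=True,   # only sensible if flipped images stay valid
        validation_split=VAL_SPLIT,
    )
    # Validation images are only rescaled, never augmented.
    val_datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=VAL_SPLIT)

    common = dict(target_size=IMG_SIZE, batch_size=BATCH_SIZE, class_mode="categorical")
    train_gen = train_datagen.flow_from_directory(data_dir, subset="training", **common)
    val_gen = val_datagen.flow_from_directory(
        data_dir, subset="validation", shuffle=False, **common
    )
    return train_gen, val_gen


def build_model(num_classes):
    """Small CNN in the same style as the models above."""
    return models.Sequential([
        layers.Input(shape=IMG_SIZE + (3,)),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])


def train(data_dir, epochs=5):
    """Train on the images under data_dir and return the model and its history."""
    train_gen, val_gen = make_generators(data_dir)
    model = build_model(train_gen.num_classes)
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    history = model.fit(train_gen, validation_data=val_gen, epochs=epochs)
    return model, history
```

With a dataset in place, a call such as `model, history = train("path/to/dataset")` would start training; the path is a placeholder. Two generator objects are used so that augmentation applies to the training subset only.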


**Question 10: You are working on a web application for a medical imaging startup. Your
task is to build and deploy a CNN model that classifies chest X-ray images into “Normal”
and “Pneumonia” categories. Describe your end-to-end approach–from data preparation
and model training to deploying the model as a web app using Streamlit.
(Include your Python code and output in the code box below.)**


Answer:

In [10]:
# Import libraries
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.utils import to_categorical
import numpy as np

# Load CIFAR-10 dataset as example (simulate 2-class problem)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# For demo: Use classes 0 and 1 as Normal and Pneumonia
train_mask = np.isin(y_train, [0,1]).flatten()
test_mask = np.isin(y_test, [0,1]).flatten()

x_train, y_train = x_train[train_mask], y_train[train_mask]
x_test, y_test = x_test[test_mask], y_test[test_mask]

# Convert labels: 0 -> Normal, 1 -> Pneumonia
y_train = (y_train == 1).astype(int)
y_test = (y_test == 1).astype(int)

# Normalize images
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Build CNN model
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid')  # Binary classification
])

# Compile model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train model
model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.1)

# Evaluate
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_acc:.4f}")

# Save model
model.save("chest_xray_cnn_demo.h5")


Epoch 1/5
282/282 ━━━━━━━━━━━━━━━━━━━━ 15s 45ms/step - accuracy: 0.7770 - loss: 0.4556 - val_accuracy: 0.8940 - val_loss: 0.2554
Epoch 2/5
282/282 ━━━━━━━━━━━━━━━━━━━━ 20s 42ms/step - accuracy: 0.9008 - loss: 0.2454 - val_accuracy: 0.9290 - val_loss: 0.1881
Epoch 3/5
282/282 ━━━━━━━━━━━━━━━━━━━━ 12s 41ms/step - accuracy: 0.9188 - loss: 0.2080 - val_accuracy: 0.9140 - val_loss: 0.2010
Epoch 4/5
282/282 ━━━━━━━━━━━━━━━━━━━━ 12s 42ms/step - accuracy: 0.9306 - loss: 0.1733 - val_accuracy: 0.9430 - val_loss: 0.1563
Epoch 5/5
282/282 ━━━━━━━━━━━━━━━━━━━━ 13s 45ms/step - accuracy: 0.9449 - loss: 0.1402 - val_accuracy: 0.9450 - val_loss: 0.1360
63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 12ms/step - accuracy: 0.9521 - loss: 0.1330
Test Accuracy: 0.9460


In [14]:
import streamlit as st
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
import numpy as np

# Load the trained model
model = load_model("chest_xray_cnn_demo.h5")

st.title("Chest X-Ray Pneumonia Detection Demo")
st.write("Upload an X-ray image (simulated CIFAR-10 demo)")

uploaded_file = st.file_uploader("Choose an image...", type=["jpg","png","jpeg"])

if uploaded_file is not None:
    img = image.load_img(uploaded_file, target_size=(32,32))
    img_array = image.img_to_array(img) / 255.0
    img_array = np.expand_dims(img_array, axis=0)

    prediction = model.predict(img_array)[0][0]
    if prediction > 0.5:
        st.error(f"Prediction: Pneumonia ({prediction:.2f})")
    else:
        st.success(f"Prediction: Normal ({1-prediction:.2f})")

    st.image(img, caption='Uploaded Image', use_column_width=True)


2025-11-29 12:04:12.751 
  command:

    streamlit run /usr/local/lib/python3.12/dist-packages/colab_kernel_launcher.py [ARGUMENTS]


In [13]:
pip install streamlit


Collecting streamlit
  Downloading streamlit-1.51.0-py3-none-any.whl.metadata (9.5 kB)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck-0.9.1-py2.py3-none-any.whl.metadata (4.1 kB)
Downloading streamlit-1.51.0-py3-none-any.whl (10.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 42.1 MB/s eta 0:00:00
Downloading pydeck-0.9.1-py2.py3-none-any.whl (6.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.9/6.9 MB 46.8 MB/s eta 0:00:00
Installing collected packages: pydeck, streamlit
Successfully installed pydeck-0.9.1 streamlit-1.51.0


In [18]:
!streamlit run app.py


Usage: streamlit run [OPTIONS] [TARGET] [ARGS]...
Try 'streamlit run --help' for help.

Error: Invalid value: File does not exist: app.py
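
The error above indicates that no `app.py` existed in the working directory: the Streamlit code had been executed as a notebook cell rather than saved as a script. One possible fix, sketched below, is to write the app source to a file before launching it. The file name `app.py` and the model file name come from the cells above; the exact wording of the app and the syntax check are illustrative assumptions.

```python
from pathlib import Path

# Source of the Streamlit app, kept as a string so it can be saved to disk.
APP_SOURCE = '''\
import numpy as np
import streamlit as st
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

# Load the model saved by the training cell.
model = load_model("chest_xray_cnn_demo.h5")

st.title("Chest X-Ray Pneumonia Detection Demo")
uploaded_file = st.file_uploader("Choose an image...", type=["jpg", "png", "jpeg"])

if uploaded_file is not None:
    # Resize to the 32x32 input the demo model was trained on and scale to [0, 1].
    img = image.load_img(uploaded_file, target_size=(32, 32))
    batch = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)
    prob = float(model.predict(batch)[0][0])
    label = "Pneumonia" if prob > 0.5 else "Normal"
    st.write(f"Prediction: {label} (p(pneumonia) = {prob:.2f})")
    st.image(img, caption="Uploaded image")
'''

app_path = Path("app.py")
app_path.write_text(APP_SOURCE, encoding="utf-8")

# Check that the saved script is at least syntactically valid Python.
compile(app_path.read_text(encoding="utf-8"), str(app_path), "exec")
print(f"Wrote {app_path.resolve()}")
```

Once the file exists, `streamlit run app.py` should be able to find it. In a hosted notebook, reaching the served page usually needs some form of port forwarding, which is outside the scope of this sketch.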


In [17]:
import streamlit as st
st.title("Hello Streamlit!")
st.write("This is a test app.")


