
In [None]:
# 🩺 AI Pneumonia Detector: ResNet-18 with Regularization
# Author: Pranav Kharade
# Objective: Fine-tune a pre-trained ResNet-18 model for pneumonia detection.
# Techniques Used:
# - Transfer Learning
# - Data Augmentation
# - Dropout
# - L2 Regularization (Weight Decay)
# - Weighted Cross-Entropy

In [3]:
# 1. Install required libraries
!pip install -q gradio

# 2. Consolidate all imports
import os
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models
from torch.utils.data import DataLoader
from google.colab import files
import gradio as gr
from PIL import Image

# 3. Kaggle API Setup & Data Download
print("Please upload your kaggle.json file:")
files.upload()

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
print("✅ Kaggle API Key configured!")

!kaggle datasets download -d paultimothymooney/chest-xray-pneumonia
!unzip -q chest-xray-pneumonia.zip -d chest_xray_data
print("✅ Medical Data Downloaded and Extracted!")

# Use the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"✅ Compute Device: {device}")

Please upload your kaggle.json file:


Saving kaggle.json to kaggle.json
✅ Kaggle API Key configured!
Dataset URL: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia
License(s): other
Downloading chest-xray-pneumonia.zip to /content
 99% 2.26G/2.29G [00:16<00:00, 157MB/s]
100% 2.29G/2.29G [00:16<00:00, 150MB/s]
✅ Medical Data Downloaded and Extracted!
✅ Compute Device: cuda


In [6]:
## 1. Data Augmentation & Preprocessing
# To combat overfitting and push the model to learn the actual pathology (rather than memorizing the image orientation), we apply random horizontal flips and rotations of up to 10 degrees to the training data. Test data is only resized and converted to a 3-channel tensor, with no augmentation.

In [4]:
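# 1. Training Pipeline (With Data Augmentation)
train_transforms = transforms.Compose([
    transforms.Resize((224, 224)),                # input size used by ResNet-18
    transforms.RandomHorizontalFlip(),            # mirror each image with probability 0.5
    transforms.RandomRotation(10),                # rotate by a random angle in [-10, 10] degrees
    transforms.Grayscale(num_output_channels=3),  # replicate the gray channel for the 3-channel backbone
    transforms.ToTensor()                         # PIL image -> float tensor in [0, 1]
])

# 2. Testing Pipeline (resizing and channel conversion only, no augmentation)
test_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),
    transforms.ToTensor()
])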

# 3. Load Datasets and create DataLoaders
train_dir = "chest_xray_data/chest_xray/train"
test_dir = "chest_xray_data/chest_xray/test"

train_dataset = datasets.ImageFolder(root=train_dir, transform=train_transforms)
test_dataset = datasets.ImageFolder(root=test_dir, transform=test_transforms)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

print(f"✅ Total training images: {len(train_dataset)}")
print(f"✅ Total test images: {len(test_dataset)}")

✅ Total training images: 5216
✅ Total test images: 624
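
Both pipelines above stop at `ToTensor()`, which scales pixels to [0, 1]. torchvision's pre-trained ResNet weights were trained on inputs that were additionally standardized with the ImageNet channel statistics, so appending `transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD)` to both pipelines may be worth testing. The sketch below only illustrates the per-channel arithmetic in plain Python. `normalize_pixel` is a hypothetical helper that is not part of this notebook, and the results reported here were obtained without this step.

```python
# ImageNet channel statistics commonly used with torchvision's pre-trained models.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Standardize one RGB pixel (values in [0, 1]) channel by channel.

    Hypothetical helper: mirrors the (x - mean) / std arithmetic that
    transforms.Normalize(mean, std) applies to every pixel of a tensor.
    """
    return tuple((x - m) / s for x, m, s in zip(rgb, mean, std))


# A mid-gray pixel, as produced by Grayscale(num_output_channels=3) + ToTensor().
print(normalize_pixel((0.5, 0.5, 0.5)))
```

If normalization is added, it would need to go into `train_transforms` and `test_transforms` alike so that training and inference see the same input distribution.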


In [5]:
## 2. Model Architecture: Anti-Overfitting ResNet-18
# The dataset is imbalanced (about 3x more Pneumonia cases than Normal cases). We counter this by applying class weights to the loss function. We also freeze the pre-trained convolutional layers and replace the final classification head with a `Sequential` block containing a `Dropout` layer (p=0.5) to discourage memorization.

In [7]:
# 1. Download Pre-trained ResNet-18 and freeze core layers
resnet_model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in resnet_model.parameters():
    param.requires_grad = False

# 2. Build the Regularized Classification Head
num_features = resnet_model.fc.in_features
resnet_model.fc = nn.Sequential(
    nn.Dropout(p=0.5), # 50% Dropout to prevent overfitting
    nn.Linear(num_features, 2)
)

resnet_model = resnet_model.to(device)

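# 3. Setup Loss (Addressing Class Imbalance)
# Normal=0, Pneumonia=1. The minority Normal class gets 3x the weight,
# so misclassifying a Normal scan costs 3x more than misclassifying a Pneumonia scan.
weights = torch.tensor([3.0, 1.0], device=device)
criterion = nn.CrossEntropyLoss(weight=weights)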

# 4. Setup Optimizer (Addressing Memorization)
# L2 Regularization via weight_decay=1e-4.
# Only the new head's parameters are passed in; the frozen backbone is not updated.
optimizer = optim.Adam(resnet_model.fc.parameters(), lr=0.001, weight_decay=1e-4)

print("✅ Model Architecture & Regularization Configured!")

Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth


100%|██████████| 44.7M/44.7M [00:00<00:00, 149MB/s]


✅ Model Architecture & Regularization Configured!
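
The fixed `[3.0, 1.0]` weights above encode the roughly 3:1 imbalance by hand. As a hedged alternative, the weights can be derived from the label counts with the common inverse-frequency rule `total / (num_classes * class_count)`. `inverse_frequency_weights` is a hypothetical helper, and the example labels are made up to mimic a 1:3 ratio rather than read from the dataset.

```python
from collections import Counter


def inverse_frequency_weights(targets, num_classes):
    """Return one weight per class: total / (num_classes * class_count).

    Hypothetical helper; `targets` is any iterable of integer class indices,
    for example `train_dataset.targets` from an ImageFolder dataset.
    Every class must appear at least once, otherwise the division fails.
    """
    counts = Counter(targets)
    total = sum(counts.values())
    return [total / (num_classes * counts[c]) for c in range(num_classes)]


# Illustrative labels only (assumed, not read from the dataset):
# 1 Normal (class 0) for every 3 Pneumonia (class 1) scans.
example_targets = [0] * 100 + [1] * 300
class_weights = inverse_frequency_weights(example_targets, num_classes=2)
print(class_weights)                         # per-class weights
print(class_weights[0] / class_weights[1])   # ratio comparable to 3.0 : 1.0
```

The resulting list could be wrapped as `torch.tensor(class_weights, device=device)` and passed to `nn.CrossEntropyLoss(weight=...)` in place of the hand-set values.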


In [8]:
## 3. Training & Evaluation

In [9]:
print(f"Starting Training on {device}...")

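for epoch in range(3):
    # train() enables Dropout in the new head. It also switches the frozen
    # backbone's BatchNorm layers to training mode, so their running
    # statistics keep updating even though their weights do not.
    resnet_model.train()
    running_loss = 0.0

    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()              # clear gradients from the previous batch
        outputs = resnet_model(inputs)     # forward pass
        loss = criterion(outputs, labels)  # class-weighted cross-entropy
        loss.backward()                    # backpropagate (only the head is trainable)
        optimizer.step()                   # update the head's weights

        running_loss += loss.item()

    # Average loss per batch over the epoch
    print(f'[Epoch {epoch + 1}] Training Loss: {running_loss / len(train_loader):.4f}')

print("✅ Training Complete!")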

Starting Training on cuda...
[Epoch 1] Training Loss: 0.3886
[Epoch 2] Training Loss: 0.2774
[Epoch 3] Training Loss: 0.2798
✅ Training Complete!
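
The `train()` and `eval()` calls in this section matter mainly because of the `Dropout` layer in the head. The plain-Python sketch below illustrates the "inverted dropout" behavior that `nn.Dropout` follows: in training mode each activation is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`, while in evaluation mode the input passes through unchanged. The `dropout` function and its arguments are hypothetical and exist only for illustration.

```python
import random


def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout on a list of floats (hypothetical, for illustration).

    Training mode: zero each value with probability p (p must be below 1)
    and scale the survivors by 1 / (1 - p).
    Evaluation mode: return the values unchanged.
    """
    if not training or p == 0.0:
        return list(values)
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]


features = [1.0] * 8
rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(dropout(features, p=0.5, training=True, rng=rng))  # each entry is 0.0 or 2.0
print(dropout(features, p=0.5, training=False))          # identical to the input
```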


In [10]:
resnet_model.eval() # Switch to Testing Mode (Turns Dropout OFF)
correct = 0
total = 0

print("Evaluating model on unseen medical scans...")

with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)

        outputs = resnet_model(images)
        predicted = outputs.argmax(dim=1)  # index of the highest logit per image

        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f'✅ Final Diagnostic Accuracy: {accuracy:.2f}%')

Evaluating model on unseen medical scans...
✅ Final Diagnostic Accuracy: 89.42%
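
Because the classes are imbalanced, a single accuracy figure can hide how the errors are split between Normal and Pneumonia scans. A minimal sketch of a per-class breakdown follows. `binary_report` is a hypothetical helper, and the labels and predictions are toy values, not output from this model.

```python
def binary_report(labels, predictions, positive=1):
    """Confusion counts plus sensitivity and specificity for a binary task.

    Hypothetical helper; positive=1 follows the notebook's Pneumonia=1 convention.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    tn = sum(1 for y, p in pairs if y != positive and p != positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    # Sensitivity: share of Pneumonia scans that were caught.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of Normal scans that were correctly cleared.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "specificity": specificity}


# Toy labels and predictions (made up for illustration, not model output).
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 1, 0, 1]
print(binary_report(y_true, y_pred))
```

In the evaluation loop, the same function could be fed lists collected with `labels.tolist()` and `predicted.tolist()` for each batch.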


In [11]:
## 4. Gradio Web Application Deployment
# Wrapping the trained PyTorch model in an interactive UI for live inference.

In [12]:
def predict_xray(img):
    # Make sure Dropout is disabled even if the evaluation cell was skipped
    resnet_model.eval()

    # Preprocess the PIL image and add a batch dimension
    tensor_img = test_transforms(img).unsqueeze(0).to(device)

    # Inference without gradient tracking
    with torch.no_grad():
        outputs = resnet_model(tensor_img)
        probabilities = F.softmax(outputs[0], dim=0)

    # Map each class label to its probability for gr.Label
    labels = ['Healthy Lungs (Normal)', 'Pneumonia Detected']
    return {labels[i]: float(probabilities[i]) for i in range(2)}

# Build and launch interface
demo = gr.Interface(
    fn=predict_xray,
    inputs=gr.Image(type="pil", label="Upload Chest X-Ray"),
    outputs=gr.Label(num_top_classes=2, label="AI Diagnosis"),
    title="🩺 AI Pneumonia Detector",
    description="Upload a chest X-ray image and our fine-tuned ResNet-18 Deep Learning model will diagnose it in real-time.",
    theme="default"
)

demo.launch(share=True)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://a7c7316999ec892c13.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
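
For readers who want to see what `predict_xray` hands to `gr.Label`, the sketch below reproduces the softmax step in plain Python and builds the same kind of label-to-probability dictionary. The logits are hypothetical values chosen for illustration, and `softmax` here is a stand-in for `F.softmax`, not the notebook's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one image: index 0 = Normal, index 1 = Pneumonia.
logits = [0.4, 2.1]
class_names = ["Healthy Lungs (Normal)", "Pneumonia Detected"]
probabilities = softmax(logits)
print(dict(zip(class_names, probabilities)))
```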


