# Comprehensive Research: Pneumonia CNN (Deep Learning)

## 1. Environment & Data Augmentation
**Objective**: Detect Pneumonia from Chest X-Rays.
**Metric**: Recall (Don't miss a sick patient) + AUC.
**Strategy**: Transfer Learning (EfficientNet) + Heavy Augmentation.
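
To make the two metrics concrete, here is a minimal pure-Python sketch of recall and ROC AUC. The labels and scores are hypothetical values chosen for illustration, and the helper names `recall` and `roc_auc` are not part of this notebook's pipeline. It also shows why the decision threshold matters when the goal is not to miss sick patients.

```python
def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = pneumonia) and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.7, 0.3, 0.6, 0.8]

# Lowering the threshold trades precision for recall.
for threshold in (0.5, 0.35):
    y_pred = [1 if s >= threshold else 0 for s in scores]
    print(f"threshold={threshold}: recall={recall(y_true, y_pred):.2f}")
print(f"AUC={roc_auc(y_true, scores):.3f}")
```

Recall depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two are tracked together.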


In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
import matplotlib.pyplot as plt
import numpy as np

# --- 1. DATA GENERATORS (SIMULATED) ---
# Visualizing augmentations is critical: a transform must not distort the medical meaning of the image.
# e.g., horizontal flip is commonly accepted (though it mirrors the heart to the right side),
# but vertical flip produces an anatomically implausible chest X-ray.

datagen = ImageDataGenerator(
    rotation_range=20,
    zoom_range=0.2,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True, # Lungs are roughly symmetric
    validation_split=0.2
)

# Visualize the Augmentation
# Mock Image (Random Noise for demo)
mock_img = np.random.rand(1, 224, 224, 3)

plt.figure(figsize=(12, 3))
# datagen.flow() yields batches indefinitely, so zip with range(4) to take four samples.
for i, batch in zip(range(4), datagen.flow(mock_img, batch_size=1)):
    plt.subplot(1, 4, i + 1)
    plt.imshow(batch[0])
    plt.axis('off')
plt.suptitle("Augmented Samples")
plt.show()

## 2. Model Architecture (Transfer Learning)
We use **EfficientNetB0**. On ImageNet it reports slightly higher top-1 accuracy than ResNet50 (about 77% vs. 76%) with roughly 5x fewer parameters (about 5.3M vs. 26M).

In [None]:
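# Keras EfficientNet models rescale inputs internally and expect raw pixel values in [0, 255].
base_model = EfficientNetB0(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
base_model.trainable = False # Freeze the base so only the new head is trained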

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dropout(0.5)(x) # Strong regularization
output = Dense(1, activation='sigmoid')(x)

model = Model(inputs=base_model.input, outputs=output)
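# Track Recall explicitly, since it is the primary metric alongside AUC.
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy', tf.keras.metrics.AUC(name='auc'), tf.keras.metrics.Recall(name='recall')]
)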

model.summary()

## 3. Training Dynamics (Callbacks)
We define the guardrails for training: stop when validation loss stops improving (a sign of overfitting), and reduce the learning rate when validation loss plateaus.
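
The interaction between the two guardrails can be sketched in plain Python. This is a simplified illustration, not the Keras implementation: it assumes no `min_delta` and no cooldown, and `simulate_callbacks` and the loss sequence are hypothetical names and values introduced here.

```python
def simulate_callbacks(val_losses, lr=1e-3, stop_patience=10,
                       lr_patience=5, factor=0.2, min_lr=1e-6):
    """Replay a val_loss sequence through simplified early-stop and LR-plateau rules.

    Returns (stop_epoch or None, best_epoch, learning rate used at each epoch).
    """
    best = float("inf")
    best_epoch = None
    stop_wait = 0  # epochs since the best val_loss (early stopping)
    lr_wait = 0    # epochs since the last improvement or LR cut
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)  # learning rate in effect during this epoch
        if loss < best:
            best, best_epoch = loss, epoch
            stop_wait = lr_wait = 0
            continue
        stop_wait += 1
        lr_wait += 1
        if stop_wait >= stop_patience:
            return epoch, best_epoch, lrs  # training would halt here
        if lr_wait >= lr_patience:
            lr = max(lr * factor, min_lr)  # cut the LR, never below min_lr
            lr_wait = 0
    return None, best_epoch, lrs


# Hypothetical validation losses: three improving epochs, then a plateau.
val_losses = [0.70, 0.60, 0.55] + [0.58] * 12
stop_epoch, best_epoch, lrs = simulate_callbacks(val_losses)
print(f"stopped at epoch {stop_epoch}, best epoch {best_epoch}, final lr {lrs[-1]:.0e}")
```

Because the LR patience (5) is shorter than the stopping patience (10), the model gets one attempt at a lower learning rate before training is halted; with `restore_best_weights=True` the weights from the best epoch would then be restored.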

In [None]:
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=1e-6)

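# Mock Training
mock_label = np.array([[1.0]])  # One positive label, shaped to match the (batch, 1) model output
history = model.fit(
    mock_img, mock_label, # Mock data
    validation_data=(mock_img, mock_label), # Required so the callbacks can monitor val_loss
    epochs=5,      # Short run for demo
    callbacks=[early_stop, reduce_lr],
    verbose=1
)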

## 4. Diagnostics: Loss Curves
Plot training and validation loss together to check for overfitting: a validation loss that rises while the training loss keeps falling (divergence) is the warning sign.
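
The same check can be done numerically. The sketch below assumes two equal-length lists shaped like `history.history['loss']` and `history.history['val_loss']`; the helper `diagnose_curves` and the curve values are hypothetical, for illustration only.

```python
def diagnose_curves(train_loss, val_loss):
    """Return the best validation epoch, the per-epoch gap, and a divergence flag."""
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    # Generalization gap: how much worse validation loss is than training loss.
    gaps = [v - t for t, v in zip(train_loss, val_loss)]
    # Divergence: since the best epoch, val loss went up while train loss went down.
    diverged = (
        best_epoch < len(val_loss) - 1
        and val_loss[-1] > val_loss[best_epoch]
        and train_loss[-1] < train_loss[best_epoch]
    )
    return best_epoch, gaps, diverged


# Hypothetical curves: training keeps improving, validation turns upward after epoch 2.
train_loss = [0.70, 0.50, 0.40, 0.30, 0.22, 0.15]
val_loss = [0.72, 0.55, 0.48, 0.50, 0.56, 0.63]

best_epoch, gaps, diverged = diagnose_curves(train_loss, val_loss)
print(f"best epoch: {best_epoch}, final gap: {gaps[-1]:.2f}, diverged: {diverged}")
```

A widening gap after the best epoch suggests the extra epochs are memorizing the training set rather than improving generalization.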

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(history.history['loss'], label='Train Loss')
if 'val_loss' in history.history:  # Present only when validation data was passed to fit()
    plt.plot(history.history['val_loss'], label='Val Loss')
plt.title("Training Dynamics")
plt.xlabel("Epoch")
plt.ylabel("Log Loss")
plt.legend()
plt.show()