# Chest X-Ray Pneumonia Detection with fastai

This notebook reproduces the training workflow contributed by **PLGaultier** using the [fastai](https://docs.fast.ai/) API on top of PyTorch. The code expects the dataset to be laid out under `../data/chest_xray` with the `train`, `val`, and `test` sub-folders containing `NORMAL` and `PNEUMONIA` classes.

In [None]:
from pathlib import Path

from fastai.vision.all import (ImageDataLoaders, Resize, aug_transforms,
                               cnn_learner, error_rate, resnet34)

DATA_ROOT = Path('..') / 'data' / 'chest_xray'
BATCH_SIZE = 32
IMG_SIZE = 256

## Create the data loaders

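The `ImageDataLoaders.from_folder` factory infers each image's label from the name of its parent folder (`NORMAL` or `PNEUMONIA`). Rather than holding out a random fraction of the training images, we validate on the `val` folder that ships with the published dataset.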

In [None]:
# Point fastai at the dataset root and name the split folders explicitly.
# Labels come from the class sub-folders (NORMAL / PNEUMONIA), and because
# a validation folder is given, no random valid_pct split is needed.
dls = ImageDataLoaders.from_folder(
    DATA_ROOT,
    train='train',
    valid='val',
    bs=BATCH_SIZE,
    item_tfms=Resize(IMG_SIZE),
    batch_tfms=aug_transforms(do_flip=True, flip_vert=True, max_rotate=20)
)

print(dls.vocab)
dls.show_batch(max_n=9, figsize=(8, 8))
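
To make the folder convention concrete, here is a minimal plain-Python sketch of how a label and a split can be read off a file path. It is an illustration only, not fastai's implementation; the helper names `label_from_path` and `split_from_path` and the file names are hypothetical.

```python
from pathlib import PurePosixPath


def label_from_path(path):
    """Return the class label: the folder that directly contains the file."""
    return PurePosixPath(path).parent.name


def split_from_path(path):
    """Return the split name: the folder one level above the class folder."""
    return PurePosixPath(path).parent.parent.name


# Hypothetical file names, used only to illustrate the expected layout.
examples = [
    "chest_xray/train/NORMAL/img_0001.jpeg",
    "chest_xray/train/PNEUMONIA/img_0002.jpeg",
    "chest_xray/val/NORMAL/img_0003.jpeg",
]

for p in examples:
    print(split_from_path(p), label_from_path(p))
```

Any file that sits outside this `split/class/file` layout would get a wrong label or split under this scheme, so it is worth checking `dls.vocab` after building the loaders.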

## Transfer learning with ResNet-34

The learner is configured with `error_rate` as its metric: the fraction of validation images that are misclassified (that is, 1 - accuracy), reported after each epoch.

In [None]:
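# Note: recent fastai releases rename `cnn_learner` to `vision_learner`;
# the old name is expected to keep working but may emit a deprecation warning.
learn = cnn_learner(dls, resnet34, metrics=[error_rate], pretrained=True)
# Display the architecture (pretrained ResNet-34 body plus a new head).
learn.model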
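
The relationship between error rate and accuracy can be checked with a small plain-Python sketch. The function names and the toy predictions below are hypothetical and stand in for the tensors fastai works with.

```python
def accuracy_sketch(preds, targets):
    """Fraction of predictions that match the targets."""
    correct = sum(1 for p, t in zip(preds, targets) if p == t)
    return correct / len(targets)


def error_rate_sketch(preds, targets):
    """Fraction of predictions that do not match: 1 - accuracy."""
    return 1.0 - accuracy_sketch(preds, targets)


# Toy class indices, made up for illustration (0 = NORMAL, 1 = PNEUMONIA).
toy_preds = [0, 1, 1, 0]
toy_targets = [0, 1, 0, 0]

print(accuracy_sketch(toy_preds, toy_targets))
print(error_rate_sketch(toy_preds, toy_targets))
```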

## Train with discriminative learning rates

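fastai's `fine_tune` runs two phases. It first trains only the newly added head while the pretrained body stays frozen (one epoch by default), then unfreezes the whole network and trains for the requested number of epochs with discriminative learning rates, so earlier layers receive smaller rates than later ones. Feel free to experiment with the number of epochs when trying to match the original paper's results.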

In [None]:
learn.fine_tune(5)
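
As a rough sketch of what "discriminative learning rates" means, the snippet below spreads rates geometrically across layer groups. The defaults used here (`base_lr=2e-3`, `lr_mult=100`, halving the base rate after the frozen phase, three parameter groups) are assumptions based on my understanding of fastai's `fine_tune` and should be verified against the installed version; the function names are hypothetical.

```python
def geometric_lrs(start, stop, n):
    """Return n learning rates spaced by a constant factor from start to stop."""
    if n == 1:
        return [stop]
    step = (stop / start) ** (1 / (n - 1))
    return [start * step ** i for i in range(n)]


def unfrozen_phase_lrs(base_lr=2e-3, lr_mult=100, n_groups=3):
    """Per-group rates for the unfrozen phase (assumed fine_tune behaviour)."""
    lr = base_lr / 2  # assumption: the base rate is halved after the frozen phase
    # Earliest group gets lr / lr_mult, the head gets lr.
    return geometric_lrs(lr / lr_mult, lr, n_groups)


for group, lr in enumerate(unfrozen_phase_lrs()):
    print(f"layer group {group}: lr = {lr:.1e}")
```

The idea is that pretrained early layers already encode generic features and need only small updates, while the new head has to change the most.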

## Evaluate on the test set

Once the learner is trained we can create a dataloader that points to the held-out test directory and compute accuracy metrics.

In [None]:
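# Collect the test images; their labels come from the parent folder names.
test_files = sorted((DATA_ROOT / 'test').rglob('*.jpeg'))

# with_labels=True keeps the targets; without it get_preds returns None for them.
test_dl = learn.dls.test_dl(test_files, with_labels=True)
test_preds, test_targets = learn.get_preds(dl=test_dl)
accuracy = (test_preds.argmax(dim=1) == test_targets).float().mean().item()
print(f"Test accuracy: {accuracy:.4f}")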
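
Accuracy alone can hide how the errors are distributed between the two classes. The following plain-Python sketch computes sensitivity and specificity from class indices; it assumes `PNEUMONIA` maps to index 1 (as it would if the vocabulary is sorted alphabetically), and the function names and toy data are hypothetical.

```python
def binary_counts(preds, targets, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for p, t in zip(preds, targets):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(preds, targets, positive=1):
    """Return (sensitivity, specificity); NaN when a class is absent."""
    tp, fp, tn, fn = binary_counts(preds, targets, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy class indices, made up for illustration (0 = NORMAL, 1 = PNEUMONIA).
toy_targets = [1, 1, 1, 1, 0, 0]
toy_preds = [1, 1, 1, 0, 0, 1]
print(sensitivity_specificity(toy_preds, toy_targets))
```

The same helper can be fed with predicted and true class indices from the notebook once they are converted to plain Python lists.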

## Export the trained learner

Saving the learner allows the model to be re-used for inference without retraining.

In [None]:
learn.export('plgaultier_fastai_pneumonia.pkl')
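
For completeness, here is a hedged sketch of how the exported file might be reloaded for single-image inference with fastai's `load_learner`. The helper name `predict_image` is hypothetical, and both paths are left as parameters: to my understanding `export` writes relative to the learner's `path` attribute, so the exact location of the `.pkl` file depends on how the data loaders were built.

```python
from pathlib import Path


def predict_image(export_path, image_path):
    """Load an exported learner and classify one image.

    Returns the predicted label and the probability assigned to it.
    """
    # Imported lazily so this helper can be defined without fastai installed.
    from fastai.vision.all import PILImage, load_learner

    learn = load_learner(Path(export_path), cpu=True)
    label, label_idx, probs = learn.predict(PILImage.create(image_path))
    return str(label), float(probs[label_idx])
```

Calling `predict_image` with the path to the exported `.pkl` and the path to a chest X-ray image should return a pair such as a class name and its probability; the learner applies the same item transforms that were used during training.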