# Chest X-Ray Pneumonia Detection with fastai

This notebook mirrors the training workflow from **PLGaultier**'s project. It fine-tunes a pre-trained ResNet-34 model using the [fastai](https://docs.fast.ai/) library on the canonical chest X-ray dataset prepared in this repository.


## 1. Environment setup

Import the required packages and report whether a GPU is visible. Without one, fastai falls back to the CPU and training will be much slower.


In [None]:
from pathlib import Path
import torch
from fastai.vision.all import (
    ImageDataLoaders,
    Resize,
    aug_transforms,
    vision_learner,
    resnet34,
    error_rate,
    accuracy,
    ClassificationInterpretation
)

import fastai
print(f'fastai version: {fastai.__version__}')
print(f'Using device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU"}')


## 2. Dataset organisation

The dataset is expected at `../data/chest_xray` with `train`, `val`, and `test` folders each containing `NORMAL` and `PNEUMONIA` sub-directories.


In [None]:
data_root = Path('..') / 'data' / 'chest_xray'
train_path = data_root / 'train'
valid_path = data_root / 'val'
test_path = data_root / 'test'

for split in ('train', 'val', 'test'):
    for label in ('NORMAL', 'PNEUMONIA'):
        path = data_root / split / label
        if not path.exists():
            raise FileNotFoundError(f'Missing expected directory: {path}')

print('Dataset folders located successfully.')
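
Beyond checking that the folders exist, it can help to count the images in each split and class, since class imbalance affects how the accuracy reported later should be read. The following is a minimal sketch using only the standard library. The `count_images` helper and the list of file extensions are assumptions made for illustration and are not part of the original project.

```python
from pathlib import Path

# Assumption: images are stored with one of these extensions.
IMAGE_SUFFIXES = {'.jpeg', '.jpg', '.png'}

def count_images(root, splits=('train', 'val', 'test'), labels=('NORMAL', 'PNEUMONIA')):
    """Return {(split, label): number of image files} for the layout described above."""
    counts = {}
    for split in splits:
        for label in labels:
            folder = Path(root) / split / label
            if not folder.is_dir():
                # Missing folders are reported as empty rather than raising.
                counts[(split, label)] = 0
                continue
            counts[(split, label)] = sum(
                1 for p in folder.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

Calling `count_images(data_root)` in the notebook would return one count per split and class, which makes any imbalance between `NORMAL` and `PNEUMONIA` visible before training.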


## 3. Data loaders

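Build the data loaders with `ImageDataLoaders.from_folder`. Each image is resized to 460 px, and the default `aug_transforms` augmentations (flips, small rotations, zoom, and lighting changes) are applied to the training split, with warping disabled.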


In [None]:
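# Resize (with cropping) each image to 460x460 so images can be collated into batches.
item_tfms = Resize(460)
# Default batch augmentations (flip, rotate, zoom, lighting) with warping disabled.
batch_tfms = aug_transforms(mult=1.0, max_warp=0.)

dls = ImageDataLoaders.from_folder(
    train_path.parent,
    train='train',
    valid='val',
    seed=42,  # only used when valid_pct is set; the split here comes from the folders
    item_tfms=item_tfms,
    batch_tfms=batch_tfms,
    bs=32
)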

dls.show_batch(max_n=9, figsize=(6, 6))
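
With this folder layout, the split is taken from the grandparent folder of each image and the class label from its parent folder. The sketch below reproduces that convention in plain Python to make it concrete. `split_and_label` is a hypothetical helper and `example.jpeg` is a made-up file name.

```python
from pathlib import Path

def split_and_label(image_path):
    """Mimic the folder convention: grandparent folder = split, parent folder = label."""
    p = Path(image_path)
    return p.parent.parent.name, p.parent.name

# Hypothetical file name, used only to illustrate the convention.
print(split_and_label('../data/chest_xray/train/PNEUMONIA/example.jpeg'))
# → ('train', 'PNEUMONIA')
```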


## 4. Model definition

Instantiate a learner backed by a ResNet-34 encoder pre-trained on ImageNet.


In [None]:
learn = vision_learner(dls, resnet34, metrics=[accuracy, error_rate])
learn.model


## 5. Training

Train in two stages: first fit only the new head while the pre-trained body stays frozen, then unfreeze and fine-tune the entire network with discriminative learning rates (smaller rates for the earlier layers).


In [None]:
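# Stage 1: train only the head. vision_learner already freezes the body; freeze() makes it explicit.
learn.freeze()
learn.fit_one_cycle(3, 1e-3)

# Stage 2: unfreeze and train all layer groups with discriminative learning rates.
# fit_one_cycle is used because fine_tune() re-freezes the model first, which would undo unfreeze().
learn.unfreeze()
learn.fit_one_cycle(5, lr_max=slice(1e-6, 1e-4))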
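
Discriminative learning rates mean that the earliest layer groups train with the smallest rate and the head with the largest. When fastai receives a range such as `slice(lo, hi)`, it spaces the rates geometrically across the parameter groups. The sketch below reproduces that spacing in plain Python. The helper name, the three groups, and the rate values are illustrative assumptions rather than settings taken from the project.

```python
def spread_learning_rates(lo, hi, n_groups):
    """Geometrically spaced learning rates from lo (earliest group) to hi (head)."""
    if n_groups == 1:
        return [hi]
    # Constant multiplicative step between consecutive groups.
    step = (hi / lo) ** (1 / (n_groups - 1))
    return [lo * step ** i for i in range(n_groups)]

# Illustrative values: three groups spanning two orders of magnitude.
for group, lr in enumerate(spread_learning_rates(1e-6, 1e-4, 3)):
    print(f'group {group}: lr = {lr:.1e}')
```

The pre-trained early layers encode generic features that need little adjustment, which is why they get the smallest steps.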


## 6. Evaluation

Inspect predictions on the validation split, plot the confusion matrix, and print a classification report.


In [None]:
learn.show_results(max_n=6, figsize=(8, 8))
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(4, 4))
interp.print_classification_report()
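
The figures in a classification report follow directly from the confusion matrix. As a minimal sketch, the function below derives precision, recall, F1, and accuracy for the positive class (`PNEUMONIA` here) from the four cell counts. The helper name is hypothetical, and the counts in the usage line are made up for illustration and are not results from this notebook.

```python
def binary_metrics(tn, fp, fn, tp):
    """Precision, recall and F1 for the positive class, plus overall accuracy."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {'precision': precision, 'recall': recall, 'f1': f1, 'accuracy': accuracy}

# Made-up counts for illustration only.
print(binary_metrics(tn=40, fp=10, fn=5, tp=45))
```

For a screening task, recall on the pneumonia class is often the number to watch, because a false negative is a missed case.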


## 7. Test set performance

Load the held-out test split, keeping its labels, and measure loss, accuracy, and error rate.


In [None]:
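from fastai.vision.all import get_image_files

# test_dl expects a list of items, and validate() needs labels,
# so collect the image files and keep their parent-folder labels.
test_files = get_image_files(test_path)
test_dl = learn.dls.test_dl(test_files, with_labels=True)
loss, acc, err = learn.validate(dl=test_dl)
print(f'Test loss: {loss:.4f}')
print(f'Test accuracy: {acc:.4f}')
print(f'Test error rate: {err:.4f}')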


## 8. Exporting the learner

Persist the trained model for later inference.


In [None]:
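export_name = 'plgaultier_fastai_pneumonia.pkl'
# export() writes relative to learn.path (the dataset root here), not the working directory.
learn.export(export_name)
print(f'Model exported to {learn.path / export_name}')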
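
For later inference, the exported file can be reloaded with fastai's `load_learner` and applied to a single image. The following is a sketch only: `predict_image` is a hypothetical wrapper, and `export_path` must point to wherever the `.pkl` file was actually written. fastai is imported inside the function so that the definition itself does not require it.

```python
from pathlib import Path

def predict_image(export_path, image_path):
    """Load an exported learner on the CPU and classify a single image."""
    # Imported lazily so this function can be defined without fastai installed.
    from fastai.vision.all import load_learner

    learn_inf = load_learner(Path(export_path), cpu=True)
    # predict() returns the decoded label, its index, and the class probabilities.
    label, label_idx, probs = learn_inf.predict(Path(image_path))
    return str(label), float(probs[label_idx])
```

The returned pair is the predicted class name and the probability the model assigns to it.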
