A minimal, reproducible BERT baseline for extractive question answering over ASR-transcribed passages (Spoken-SQuAD). Trains a span-prediction head on top of bert-base-uncased, logs EM/F1/WER, and saves figures/tables for the report.
## Repository Structure

- Bert.py — main training + evaluation script
- run_outputs/ — generated artifacts after running
  - figures/ — plots (training loss, EM/F1, WER, length histograms)
  - tables/ — history.json, preds_epoch_*.json (sample predictions vs. gold)
  - base_model_wer.txt — WER per epoch (plain text)
- data/ — dataset folder containing the Spoken-SQuAD JSON files:
  spoken_train-v1.1.json and spoken_test-v1.1.json
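The EM/F1 values logged to history.json follow the standard SQuAD answer-level metrics: normalized exact match and token-overlap F1. A minimal sketch of how these are typically computed (function names here are illustrative, not taken from Bert.py):

```python
import re
import string
from collections import Counter

def normalize(s: str) -> str:
    """SQuAD-style normalization: lowercase, drop punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(pred) == normalize(gold))

def f1_score(pred: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_toks, gold_toks = normalize(pred).split(), normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

In the official SQuAD evaluation, both metrics are taken as the maximum over all gold answers for a question, then averaged over the dataset.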
## Requirements

Python 3.10+, CUDA (optional), and:

```bash
pip install torch transformers tqdm matplotlib
# if using evaluate/jiwer variants:
pip install evaluate jiwer
```
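jiwer computes WER out of the box, but the metric itself is just word-level edit distance divided by reference length. A self-contained pure-Python sketch (for illustration only, not the repo's implementation):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("the cat sat", "the cat sit")` gives one substitution over three reference words, i.e. 1/3.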
## Training and Evaluation
```bash
python3 Bert.py \
  --train_json data/spoken_train-v1.1.json \
  --valid_json data/spoken_test-v1.1.json
```
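Both JSON files follow the SQuAD v1.1 schema (data → paragraphs → qas → answers). A hedged sketch of flattening that structure into (question, context, answer, answer_start) training examples — helper names are illustrative, not from Bert.py:

```python
import json

def flatten_squad(squad: dict) -> list:
    """Walk the nested SQuAD v1.1 structure and collect flat QA examples."""
    examples = []
    for article in squad["data"]:
        for para in article["paragraphs"]:
            context = para["context"]
            for qa in para["qas"]:
                for ans in qa["answers"]:
                    examples.append(
                        (qa["question"], context, ans["text"], ans["answer_start"])
                    )
    return examples

def load_squad_examples(path: str) -> list:
    """Load a Spoken-SQuAD JSON file and flatten it."""
    with open(path, encoding="utf-8") as f:
        return flatten_squad(json.load(f))
```

The `answer_start` character offset is what the span-prediction head's start/end labels are derived from after tokenization.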