A minimal, reproducible BERT baseline for extractive question answering over ASR-transcribed passages (Spoken-SQuAD). Trains a span-prediction head on top of bert-base-uncased, logs EM/F1/WER, and saves figures/tables for the report.
## Repository Structure

- Bert.py — main training + evaluation script
- run_outputs/ — generated artifacts after running
  - figures/ — plots (training loss, EM/F1, WER, length histograms)
  - tables/ — history.json, preds_epoch_*.json (sample predictions vs. gold)
  - base_model_wer.txt — WER per epoch (plain text)
- data/ — dataset folder containing the Spoken-SQuAD JSON files:
  spoken_train-v1.1.json and spoken_test-v1.1.json
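The EM/F1 values logged to history.json follow the standard SQuAD answer-level metrics: normalized exact match and token-overlap F1. A minimal sketch of how these are typically computed (function names here are illustrative, not taken from Bert.py):

```python
import re
import string
from collections import Counter

def normalize(s: str) -> str:
    """SQuAD-style normalization: lowercase, drop punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(pred) == normalize(gold))

def f1_score(pred: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_toks, gold_toks = normalize(pred).split(), normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

In the official SQuAD evaluation, both metrics are taken as the maximum over all gold answers for a question, then averaged over the dataset.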
## Requirements

Python 3.10+, CUDA (optional), and:

```bash
pip install torch transformers tqdm matplotlib
# if using evaluate/jiwer variants:
pip install evaluate jiwer
```
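jiwer computes WER out of the box, but the metric itself is just word-level edit distance divided by reference length. A self-contained pure-Python sketch (for illustration only, not the repo's implementation):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("the cat sat", "the cat sit")` gives one substitution over three reference words, i.e. 1/3.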
## Training and Evaluation
```bash
python3 Bert.py \
  --train_json data/spoken_train-v1.1.json \
  --valid_json data/spoken_test-v1.1.json
```
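Both JSON files follow the SQuAD v1.1 schema (data → paragraphs → qas → answers). A hedged sketch of flattening that structure into (question, context, answer, answer_start) training examples — helper names are illustrative, not from Bert.py:

```python
import json

def flatten_squad(squad: dict) -> list:
    """Walk the nested SQuAD v1.1 structure and collect flat QA examples."""
    examples = []
    for article in squad["data"]:
        for para in article["paragraphs"]:
            context = para["context"]
            for qa in para["qas"]:
                for ans in qa["answers"]:
                    examples.append(
                        (qa["question"], context, ans["text"], ans["answer_start"])
                    )
    return examples

def load_squad_examples(path: str) -> list:
    """Load a Spoken-SQuAD JSON file and flatten it."""
    with open(path, encoding="utf-8") as f:
        return flatten_squad(json.load(f))
```

The `answer_start` character offset is what the span-prediction head's start/end labels are derived from after tokenization.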