Jason Dury — Eridos AI, Perth, Australia
Dense retrieval ranks passages by embedding similarity to a query, but multi-hop questions require passages that are associatively related through shared reasoning chains rather than semantically similar. Association-Augmented Retrieval (AAR) trains a lightweight MLP (4.2M parameters) to learn passage-to-passage associations from co-occurrence annotations using CLIP-style contrastive learning, then reranks dense retrieval results via bi-directional association scoring. AAR is transductive by design: it learns associations over the target corpus, mirroring how RAG systems are deployed in practice. On HotpotQA, AAR improves Recall@5 by +8.6 points without evaluation-set tuning, with a +28.5 point gain on the hardest questions. On MuSiQue it achieves +10.1 points. An inductive variant shows no significant improvement, consistent with corpus-specific co-occurrence learning. The method trains in under two minutes on a single GPU, adds 3.7ms per query, and requires no LLM-based indexing.
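The CLIP-style contrastive objective mentioned above can be sketched as follows. The MLP width, batch construction, and temperature here are illustrative assumptions, not the paper's exact configuration (though a 1024→2048→1024 MLP does land near 4.2M parameters for BGE-large's 1024-dim embeddings):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AssociationMLP(nn.Module):
    """Maps a passage embedding to its predicted associate.

    Assumed architecture: with dim=1024, hidden=2048 this is roughly
    4.2M parameters; the paper's actual layer sizes may differ.
    """
    def __init__(self, dim=1024, hidden=2048):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def clip_style_loss(model, src, tgt, tau=0.07):
    """Symmetric InfoNCE over a batch of co-occurring passage pairs:
    each source should score highest against its own target (diagonal)."""
    logits = model(src) @ tgt.T / tau      # (B, B) similarity matrix
    labels = torch.arange(src.size(0))     # matched pair sits on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels)
                  + F.cross_entropy(logits.T, labels))

# Toy batch of unit-norm "passage embeddings" standing in for BGE vectors
src = F.normalize(torch.randn(8, 1024), dim=-1)
tgt = F.normalize(torch.randn(8, 1024), dim=-1)
model = AssociationMLP()
loss = clip_style_loss(model, src, tgt)
```

In-batch negatives make every non-matching passage in the batch a negative example, which is what lets such a small model train in minutes.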
Paper: arXiv (forthcoming) | PAM framework: Zenodo
| Setting | Dataset | R@5 | Delta R@5 | 95% CI |
|---|---|---|---|---|
| Dense baseline | HotpotQA | 0.831 | — | — |
| AAR transductive | HotpotQA | 0.916 | +8.6 | [+8.1, +9.0] |
| AAR inductive | HotpotQA | 0.832 | +0.1 | [-0.3, +0.5] |
| Dense baseline | MuSiQue | 0.387 | — | — |
| AAR transductive | MuSiQue | 0.488 | +10.1 | — |
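The confidence intervals above come from bootstrap resampling over queries (`results/bootstrap_ci.csv`). A minimal paired-bootstrap sketch, using synthetic per-query recall indicators rather than the real evaluation output:

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-query Recall@5 indicators (1 = gold passages retrieved in top-5).
# Synthetic data for illustration; real values come from src.evaluate.
n = 1000
baseline = rng.random(n) < 0.83
aar      = rng.random(n) < 0.92

# Paired bootstrap: resample the same query indices for both systems,
# recompute the delta each time, take the 2.5th/97.5th percentiles.
deltas = []
for _ in range(10_000):
    idx = rng.integers(0, n, size=n)
    deltas.append(aar[idx].mean() - baseline[idx].mean())
lo, hi = np.percentile(deltas, [2.5, 97.5])
print(f"delta R@5 = {aar.mean() - baseline.mean():+.3f}, "
      f"95% CI [{lo:+.3f}, {hi:+.3f}]")
```

Resampling queries (not systems independently) preserves the per-query pairing, which tightens the interval on the delta.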
- Python 3.10+
- PyTorch 2.0+ (CUDA recommended)
- FAISS (`faiss-gpu` or `faiss-cpu`)
- sentence-transformers (for BGE-large-en-v1.5 embeddings)
- datasets (HuggingFace)
Install dependencies:

```bash
pip install torch faiss-gpu sentence-transformers datasets numpy
```

Prepare the data:

```bash
# Download HotpotQA and MuSiQue, embed passages, build FAISS index
python -c "from src.utils import prepare_data; prepare_data()"
```

This downloads the datasets, extracts ~66K unique passages from HotpotQA, embeds them with BGE-large-en-v1.5, and builds a FAISS index. Takes ~15 minutes (mostly embedding).

Train the association MLP:

```bash
python -m src.train
```

Trains on the combined train+validation association pairs (~20,742 pairs). Completes in ~2 minutes on an RTX 4080 Super. Saves to `models/association_mlp.pt`.

Evaluate:

```bash
python -m src.evaluate --model models/association_mlp.pt --alpha 0.50
python -m src.evaluate --model models/association_mlp.pt --alpha-sweep
```

Train the inductive variant:

```bash
python -m src.train_true_inductive
```

Trains on training-split pairs only (~8,758 pairs) and evaluates on the validation set.
```
AAR/
├── README.md
├── LICENSE
├── paper/
│   └── aar_paper_submission.md           # Full paper (markdown)
├── src/
│   ├── model.py                          # AssociationMLP architecture
│   ├── train.py                          # Main training script (transductive)
│   ├── train_true_inductive.py           # Inductive training script
│   ├── evaluate.py                       # Evaluation / retrieval pipeline
│   └── utils.py                          # Data loading, metrics, retrieval
├── results/
│   ├── retrieval_matched_hp.csv          # Main results (Table 2)
│   ├── true_inductive_evaluation.csv     # Inductive evaluation (Table 4)
│   ├── scoring_ablation.csv              # Scoring method ablation (Table 1)
│   ├── bm25_baseline.csv                 # BM25 comparison (Table 7)
│   ├── qa_sanity_check.csv               # Downstream QA (Table 8)
│   ├── answer_coverage_matched_hp.csv    # Answer coverage (Table 9)
│   ├── candidate_pool_sensitivity.csv    # FAISS expansion depth (Table B1)
│   ├── latency_breakdown.csv             # Latency (Table 10)
│   ├── iteration_log.csv                 # Development iteration log
│   └── bootstrap_ci.csv                  # Bootstrap confidence intervals
├── data/
│   └── README.md                         # Data acquisition instructions
└── models/
    └── README.md                         # Model reproduction instructions
```
- Candidate retrieval: FAISS top-100 by cosine similarity.
- Association reranking: for each candidate, compute the blended score

  `score(q, p) = (1 - lambda) * cos(q, p) + lambda * a(q, p)`

  where `a(q, p) = 0.5 * [f(q) . p + f(p) . q]` is the bi-directional association score and `f` is the trained MLP.
- Return the top-k passages by blended score.
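The two-stage pipeline above can be sketched end-to-end. This is a toy illustration: exact NumPy search stands in for the FAISS index, a normalizing identity stands in for the trained MLP, and all embeddings are assumed unit-norm so the inner product equals cosine similarity:

```python
import numpy as np
import torch
import torch.nn.functional as F

def rerank(query_emb, passage_embs, mlp, lam=0.5, n_cand=100, k=5):
    """Stage 1: top-n_cand candidates by cosine (NumPy exact search
    stands in for FAISS here). Stage 2: blended rerank with the
    bi-directional association score."""
    q = query_emb / np.linalg.norm(query_emb)
    sims = passage_embs @ q                     # cosine (unit-norm vectors)
    ids = np.argsort(-sims)[:n_cand]
    cands = passage_embs[ids]

    with torch.no_grad():
        qt = torch.from_numpy(q).float()
        ct = torch.from_numpy(cands).float()
        # a(q, p) = 0.5 * [f(q) . p + f(p) . q]
        a = 0.5 * (ct @ mlp(qt) + mlp(ct) @ qt)

    # score(q, p) = (1 - lambda) * cos(q, p) + lambda * a(q, p)
    blended = (1 - lam) * sims[ids] + lam * a.numpy()
    order = np.argsort(-blended)[:k]
    return ids[order], blended[order]

# Toy corpus of unit-norm embeddings; the normalizing identity below is
# a stand-in for the trained AssociationMLP, not the real model.
rng = np.random.default_rng(0)
P = rng.standard_normal((500, 64))
P /= np.linalg.norm(P, axis=1, keepdims=True)
mlp = lambda x: F.normalize(x, dim=-1)
top_ids, top_scores = rerank(P[0], P, mlp, k=5)
```

Because the association scores are only computed over the 100 FAISS candidates, the reranking cost is a single small batched MLP forward pass per query, consistent with the ~3.7ms overhead reported above.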
```bibtex
@misc{dury2026aar,
  author = {Dury, Jason},
  title  = {Association $\neq$ Similarity: Learning Corpus-Specific
            Associations for Multi-Hop Retrieval},
  year   = {2026},
  note   = {arXiv preprint (forthcoming)}
}

@misc{dury2026pam,
  author    = {Dury, Jason},
  title     = {Predictive Associative Memory: Unified Retrieval, Imagination,
               and Creative Recombination Through Predictive Traversal of
               Meaning Space},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18595537}
}
```

License: MIT