RAG Tutorials

This repository contains a small tutorial-style codebase for several retrieval-augmented generation (RAG) fusion strategies. The goal for now is not completeness but getting each method into a minimal runnable form that can be extended later.

What is implemented

none: no retrieval augmentation, plain base model evaluation.
query: BM25 retrieval + prompt template insertion via {context}.
logits: BM25 retrieval + neighbor next-token aggregation + blending of the base-model and neighbor distributions as lambda * p + (1 - lambda) * q.
latent: FAISS retrieval + embedding-weighted latent vector + trainable projection injected into QKV modules.
parametric: document-level LoRA adapters trained from retrieved neighbor documents and fused at inference time.

Repository layout

src/main.py: unified inference entry point for none/query/logits/latent/parametric.
src/train_latent.py: training script for latent fusion adapters.
src/train_parametric.py: training script for document-level parametric LoRA adapters.
src/fusion/: fusion implementations.
src/retriever/: BM25 and FAISS retrievers.
scripts/: ready-to-run shell scripts for training and inference.

Environment

The code is expected to run inside your existing ragdemo environment.

conda activate ragdemo

Inference scripts

Plain baseline:

bash scripts/run.sh

Query fusion:

bash scripts/run_query.sh

Logits fusion:

bash scripts/run_logits.sh

Latent fusion inference:

bash scripts/run_latent.sh

Parametric fusion inference:

bash scripts/run_parametric.sh

Training scripts

Train the latent fusion adapter:

bash scripts/train_latent.sh

Train the parametric document-level LoRA bank:

bash scripts/train_parametric.sh

Main CLI

The unified entrypoint is:

python -m src.main \
  --dataset hotpotqa/hotpot_qa \
  --config distractor \
  --split validation \
  --model-name Qwen/Qwen2.5-1.5B \
  --fusion none

Important arguments:

--fusion: one of none, query, logits, latent, parametric
--retriever: bm25 or faiss
--encoder-model-name: required for faiss
--user-prompt: prompt template; for query-style text insertion, use {context}
--top-k: number of retrieved neighbors
--max-samples: evaluate only a small subset for smoke tests
--logits-lambda: blend weight used by logits fusion
--latent-checkpoint: trained latent adapter checkpoint
--parametric-checkpoint: trained parametric adapter bank
--lora-rank, --lora-alpha: LoRA configuration for parametric fusion
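
The rule that --encoder-model-name is required for faiss can be enforced at parse time. The following is a minimal sketch using argparse, not the actual parser in src/main.py: only a subset of the flags listed above is shown, and the argument types, the defaults, and the helper names build_parser and parse_args are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a subset of the flags listed above."""
    parser = argparse.ArgumentParser(prog="src.main")
    parser.add_argument(
        "--fusion",
        choices=["none", "query", "logits", "latent", "parametric"],
        default="none",  # assumed default
    )
    parser.add_argument("--retriever", choices=["bm25", "faiss"], default="bm25")  # assumed default
    parser.add_argument("--encoder-model-name", default=None)
    parser.add_argument("--top-k", type=int, default=None)  # type is an assumption
    parser.add_argument("--logits-lambda", type=float, default=None)  # type is an assumption
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # FAISS retrieval needs an embedding model, so fail early if it is missing.
    if args.retriever == "faiss" and not args.encoder_model_name:
        parser.error("--encoder-model-name is required when --retriever is faiss")
    return args
```

Calling parser.error prints the usage message and exits with a non-zero status, so a misconfigured run stops before any model is loaded.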

Method notes

Query fusion

This path currently implements the simplest form: retrieve with BM25, insert retrieved text into the reserved {context} slot in the user prompt, and then run generation.

Example prompt template:

Use the retrieved context to answer the question.

Question: {question}

{context}

Answer:
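
Filling the template above can be sketched with plain string formatting. This is an illustrative sketch rather than the repository's code: the helper name build_prompt and the blank-line separator between passages are assumptions.

```python
PROMPT_TEMPLATE = (
    "Use the retrieved context to answer the question.\n\n"
    "Question: {question}\n\n"
    "{context}\n\n"
    "Answer:"
)


def build_prompt(template: str, question: str, passages: list[str]) -> str:
    """Insert the question and retrieved passages into the template slots."""
    # Joining passages with a blank line is an assumption, not a documented choice.
    context = "\n\n".join(passages)
    # str.format only parses the template, so braces inside the passages are safe.
    return template.format(question=question, context=context)


prompt = build_prompt(
    PROMPT_TEMPLATE,
    "Which city hosted the event?",
    ["First retrieved passage.", "Second retrieved passage."],
)
```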

Logits fusion

This is a minimal version. For each query, the pipeline:

retrieves neighbors with BM25
builds one augmented prompt per neighbor
reads the next-token distribution from each neighbor prompt
weights each neighbor's distribution by its retrieval score
blends the neighbor distribution with the base model distribution
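
The last two steps can be sketched in pure Python. Two points here are assumptions, since the README does not state them: retrieval scores are turned into weights with a softmax, and p is taken to be the base model distribution while q is the aggregated neighbor distribution. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_next_token(base_probs, neighbor_probs, retrieval_scores, lam):
    """Blend the base distribution with a score-weighted neighbor mixture.

    base_probs: next-token probabilities from the plain prompt (assumed to be p).
    neighbor_probs: one next-token distribution per neighbor-augmented prompt.
    retrieval_scores: one retrieval score per neighbor.
    lam: blend weight, as in lambda * p + (1 - lambda) * q.
    """
    weights = softmax(retrieval_scores)  # assumed normalization
    vocab_size = len(base_probs)
    # q: weighted average of the neighbor distributions, token by token.
    neighbor_mix = [
        sum(w * probs[t] for w, probs in zip(weights, neighbor_probs))
        for t in range(vocab_size)
    ]
    return [lam * p + (1.0 - lam) * q for p, q in zip(base_probs, neighbor_mix)]
```

Because both inputs are probability distributions and the weights sum to one, the fused vector is again a distribution for any lam between 0 and 1.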

Latent fusion

This path currently assumes:

retrieval uses FAISS with sentence-transformer embeddings
the base model is frozen
only the latent projection layers are trained
the weighted retrieval embedding is injected into QKV projection outputs
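
The injection step can be sketched with plain lists to show the data flow. This is a simplified illustration under stated assumptions, not the code in src/fusion/: weights are the retrieval scores divided by their sum (which assumes positive scores), the injection is an additive offset broadcast over all token positions, and all function names are hypothetical.

```python
def weighted_latent(embeddings, scores):
    """Average neighbor embeddings, weighted by normalized retrieval scores."""
    total = sum(scores)  # assumes positive scores with a non-zero sum
    weights = [s / total for s in scores]
    dim = len(embeddings[0])
    return [sum(w * e[i] for w, e in zip(weights, embeddings)) for i in range(dim)]


def project(matrix, vector):
    """Matrix-vector product; matrix stands in for the trainable projection."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inject(qkv_output, projection, latent):
    """Add the projected latent to every token position of a Q, K, or V output."""
    offset = project(projection, latent)
    return [[h + o for h, o in zip(token, offset)] for token in qkv_output]


latent = weighted_latent([[1.0, 0.0], [0.0, 1.0]], [1.0, 3.0])
projection = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # maps embedding dim 2 to hidden dim 3
shifted = inject([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]], projection, latent)
```

In a real model the projection would be a trainable layer and the base weights would stay frozen, as described in the list above.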

Parametric fusion

This path currently assumes:

each document owns one LoRA adapter
training uses retrieved neighbor documents to help reconstruct the target document
inference retrieves relevant documents, loads their adapters, and computes a weighted average adapter before generation
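
The weighted average at inference time can be sketched as follows. The README does not say how the adapters are stored or how the weights are derived, so this sketch assumes each adapter is a dict from parameter name to a flat list of floats and that weights are the retrieval scores divided by their sum. The name average_adapters is hypothetical.

```python
def average_adapters(adapters, scores):
    """Weighted average of per-document adapters, parameter by parameter.

    adapters: list of dicts mapping a parameter name to a flat list of floats.
    scores: one retrieval score per adapter (assumed positive).
    """
    total = sum(scores)
    weights = [s / total for s in scores]
    merged = {}
    for name, reference in adapters[0].items():
        merged[name] = [
            sum(w * adapter[name][i] for w, adapter in zip(weights, adapters))
            for i in range(len(reference))
        ]
    return merged


doc_adapters = [
    {"lora_A": [1.0, 2.0], "lora_B": [0.0, 4.0]},
    {"lora_A": [3.0, 4.0], "lora_B": [2.0, 0.0]},
]
fused_adapter = average_adapters(doc_adapters, [1.0, 1.0])
```

Averaging the two LoRA factors separately is not the same as averaging their products, so the two options give different fused updates; which one the repository uses is not stated here.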

Status

This repository is still tutorial code. The implementations are intentionally simple and are meant to be iterated on method by method.
