luffy06/RAG-Tutorials

RAG Tutorials

This repository contains a small tutorial-style codebase for several RAG fusion strategies. The current focus is not on completeness but on getting each method into a minimal runnable form that can be extended later.

What is implemented

  • none: no retrieval augmentation, plain base model evaluation.
  • query: BM25 retrieval + prompt template insertion via {context}.
  • logits: BM25 retrieval + neighbor next-token aggregation, blended with the base model distribution as lambda * p + (1 - lambda) * q.
  • latent: FAISS retrieval + embedding-weighted latent vector + trainable projection injected into QKV modules.
  • parametric: document-level LoRA adapters trained from retrieved neighbor documents and fused at inference time.

Repository layout

  • src/main.py: unified inference entry point for none/query/logits/latent/parametric.
  • src/train_latent.py: training script for latent fusion adapters.
  • src/train_parametric.py: training script for document-level parametric LoRA adapters.
  • src/fusion/: fusion implementations.
  • src/retriever/: BM25 and FAISS retrievers.
  • scripts/: ready-to-run shell scripts for training and inference.

Environment

The code is expected to run inside your existing ragdemo environment.

conda activate ragdemo

Inference scripts

Plain baseline:

bash scripts/run.sh

Query fusion:

bash scripts/run_query.sh

Logits fusion:

bash scripts/run_logits.sh

Latent fusion inference:

bash scripts/run_latent.sh

Parametric fusion inference:

bash scripts/run_parametric.sh

Training scripts

Train the latent fusion adapter:

bash scripts/train_latent.sh

Train the parametric document-level LoRA bank:

bash scripts/train_parametric.sh

Main CLI

The unified entrypoint is:

python -m src.main \
  --dataset hotpotqa/hotpot_qa \
  --config distractor \
  --split validation \
  --model-name Qwen/Qwen2.5-1.5B \
  --fusion none

Important arguments:

  • --fusion: one of none, query, logits, latent, parametric
  • --retriever: bm25 or faiss
  • --encoder-model-name: required for faiss
  • --user-prompt: prompt template; for query-style text insertion, use {context}
  • --top-k: number of retrieved neighbors
  • --max-samples: evaluate only a small subset for smoke tests
  • --logits-lambda: blend weight used by logits fusion
  • --latent-checkpoint: trained latent adapter checkpoint
  • --parametric-checkpoint: trained parametric adapter bank
  • --lora-rank, --lora-alpha: LoRA configuration for parametric fusion
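
The constraint that --encoder-model-name is required for faiss can be checked right after argument parsing. The sketch below is not the code in src/main.py; it only mirrors a few of the flags listed above, and the defaults (bm25, top-k of 5) and the encoder name in the usage line are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names follow the list above; defaults are assumed, not taken from the repo.
    parser = argparse.ArgumentParser(prog="src.main")
    parser.add_argument(
        "--fusion",
        choices=["none", "query", "logits", "latent", "parametric"],
        default="none",
    )
    parser.add_argument("--retriever", choices=["bm25", "faiss"], default="bm25")
    parser.add_argument("--encoder-model-name", default=None)
    parser.add_argument("--top-k", type=int, default=5)
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # FAISS needs a dense encoder, so fail early with a readable message.
    if args.retriever == "faiss" and not args.encoder_model_name:
        parser.error("--encoder-model-name is required when --retriever is faiss")
    return args


# Illustrative encoder name; any sentence-transformer checkpoint could be used here.
args = parse_args(
    [
        "--fusion", "latent",
        "--retriever", "faiss",
        "--encoder-model-name", "sentence-transformers/all-MiniLM-L6-v2",
    ]
)
print(args.fusion, args.retriever, args.top_k)
```

Calling parser.error prints the usage string and exits with a non-zero status, which keeps a misconfigured run from reaching model loading.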

Method notes

Query fusion

This path currently implements the simplest form: retrieve with BM25, insert retrieved text into the reserved {context} slot in the user prompt, and then run generation.

Example prompt template:

Use the retrieved context to answer the question.

Question: {question}

{context}

Answer:
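
A minimal sketch of how such a template could be filled is shown below. The helper name build_prompt and the "[1] ..." passage numbering are assumptions for illustration and may differ from what src/fusion/ does.

```python
import re

PROMPT_TEMPLATE = (
    "Use the retrieved context to answer the question.\n\n"
    "Question: {question}\n\n"
    "{context}\n\n"
    "Answer:"
)


def build_prompt(template: str, question: str, passages: list[str]) -> str:
    """Join retrieved passages and substitute them into the template."""
    if "{context}" not in template:
        raise ValueError("template must contain a {context} slot")
    # Assumed formatting: one numbered passage per line.
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    values = {"question": question, "context": context}
    # A single regex pass only touches the two known slots, so braces
    # inside the question or the passages are left untouched.
    return re.sub(r"\{(question|context)\}", lambda m: values[m.group(1)], template)


prompt = build_prompt(
    PROMPT_TEMPLATE,
    "Which city hosted the 1900 Summer Olympics?",
    [
        "Paris hosted the 1900 Summer Olympics.",
        "The 1900 Games were held alongside the World's Fair.",
    ],
)
print(prompt)
```

Using a targeted substitution instead of str.format avoids a KeyError when a retrieved passage happens to contain curly braces.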

Logits fusion

This is a minimal version. For each query, the pipeline:

  1. retrieves neighbors with BM25
  2. builds one augmented prompt per neighbor
  3. reads the next-token distribution from each neighbor prompt
  4. weights neighbor targets by retrieval score
  5. blends the neighbor distribution with the base model distribution
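
Steps 4 and 5 can be sketched with plain Python lists standing in for vocabulary-sized distributions. Two assumptions are made here that the notes above do not settle: retrieval scores are normalized with a softmax, and lambda weights the neighbor aggregate (p) while 1 - lambda weights the base model (q).

```python
import math


def softmax(xs: list[float]) -> list[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_next_token(
    base_probs: list[float],
    neighbor_probs: list[list[float]],
    retrieval_scores: list[float],
    lam: float,
) -> list[float]:
    """Blend base and neighbor next-token distributions."""
    # Assumption: retrieval scores become weights through a softmax.
    weights = softmax(retrieval_scores)
    vocab = len(base_probs)
    aggregated = [
        sum(w * probs[t] for w, probs in zip(weights, neighbor_probs))
        for t in range(vocab)
    ]
    # Assumption: p is the neighbor aggregate and q is the base distribution.
    return [lam * p + (1.0 - lam) * q for p, q in zip(aggregated, base_probs)]


base = [0.7, 0.2, 0.1]  # base model next-token distribution
neighbors = [[0.1, 0.8, 0.1], [0.2, 0.6, 0.2]]  # one distribution per neighbor prompt
fused = fuse_next_token(base, neighbors, retrieval_scores=[2.0, 1.0], lam=0.5)
print(fused)
```

Because both inputs are probability distributions and the weights sum to one, the fused result is also a valid distribution. Setting lam to 0 recovers the base model.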

Latent fusion

This path currently assumes:

  • retrieval uses FAISS with sentence-transformer embeddings
  • the base model is frozen
  • only the latent projection layers are trained
  • the weighted retrieval embedding is injected into QKV projection outputs
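
The data flow described above can be sketched without any deep learning framework. In this sketch the softmax over retrieval scores, the additive injection, and all function names are assumptions; the trained code in src/train_latent.py may combine the pieces differently.

```python
import math


def softmax(xs: list[float]) -> list[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_embedding(embeddings: list[list[float]], scores: list[float]) -> list[float]:
    """Collapse the neighbor embeddings into one score-weighted vector."""
    weights = softmax(scores)
    dim = len(embeddings[0])
    return [sum(w * e[d] for w, e in zip(weights, embeddings)) for d in range(dim)]


def project(vector: list[float], matrix: list[list[float]]) -> list[float]:
    """Multiply a d_in vector by a d_in x d_out matrix (the trainable projection)."""
    d_out = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vector, matrix)) for j in range(d_out)]


def inject(projection_output: list[list[float]], latent: list[float]) -> list[list[float]]:
    """Assumed injection rule: add the latent vector at every token position."""
    return [[h + l for h, l in zip(token, latent)] for token in projection_output]


neighbor_embeddings = [[1.0, 0.0], [0.0, 1.0]]  # encoder outputs for 2 neighbors
latent = weighted_embedding(neighbor_embeddings, scores=[0.0, 0.0])
projection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 2 x 3, stands in for trained weights
q_out = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]  # frozen Q projection output, 2 tokens
fused_q = inject(q_out, project(latent, projection))
print(fused_q)
```

Only the projection matrix would receive gradients in this setup, which matches the frozen base model described in the list above.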

Parametric fusion

This path currently assumes:

  • each document owns one LoRA adapter
  • training uses retrieved neighbor documents to help reconstruct the target document
  • inference retrieves relevant documents, loads their adapters, and computes a weighted average adapter before generation
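
The inference-time averaging can be sketched as below, with each adapter represented as a dict of flattened parameter lists. The parameter names lora_A and lora_B, the score normalization, and the choice to average the factors separately are assumptions. Averaging A and B independently is not the same as averaging their products, and the notes above do not say which variant the repository uses.

```python
def average_adapters(
    adapters: list[dict[str, list[float]]],
    scores: list[float],
) -> dict[str, list[float]]:
    """Weighted average of adapters that share the same parameter names."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("retrieval scores must sum to a positive value")
    # Assumption: weights are the retrieval scores normalized to sum to one.
    weights = [s / total for s in scores]
    merged = {}
    for name in adapters[0]:
        size = len(adapters[0][name])
        merged[name] = [
            sum(w * adapter[name][i] for w, adapter in zip(weights, adapters))
            for i in range(size)
        ]
    return merged


# Two toy document-level adapters with flattened parameters.
doc_adapters = [
    {"lora_A": [1.0, 0.0], "lora_B": [2.0]},
    {"lora_A": [0.0, 1.0], "lora_B": [4.0]},
]
merged = average_adapters(doc_adapters, scores=[3.0, 1.0])
print(merged)
```

The merged parameters would then be loaded into a single adapter on the base model before generation.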

Status

This repository is still tutorial code. The implementations are intentionally simple and are meant to be iterated on method by method.
