DiffRAG-SQL: Differentiable Retrieval + SQL Reasoning for Faithful QA
Overview
DiffRAG-SQL is an end-to-end differentiable retrieval-augmented generation (RAG) framework that integrates structured SQL reasoning for faithful question answering. This repository represents my independent research on generative reasoning, factual consistency, and explainable AI, forming a cornerstone of my doctoral research direction for the University of Toronto PhD in Computer Science (Fall 2026).
The system grounds large-language-model (LLM) outputs in real database queries, creating an interpretable bridge between symbolic retrieval and neural text generation.
Key Features
Retrieval (TF-IDF) for context selection from the Hugging Face SQuAD dataset (default).
Reader (DistilBERT QA) for span-level answer generation.
Faithfulness and Abstention Evaluation: exact match (EM), F1, retrieval precision, and abstention thresholding.
Differentiable SQL grounding: retrieval and answer generation are linked through differentiable loss propagation.
Auto-generated paper artifacts: results and tables sync directly into a LaTeX manuscript for transparent documentation.
API-driven design: easily extensible to other datasets or retrieval schemes.
Research Motivation
Large language models excel at fluent text generation but often lack verifiable grounding. By combining neural retrieval and symbolic SQL execution, DiffRAG-SQL demonstrates how factual reasoning and data transparency can coexist with the flexibility of modern generative architectures.
This project extends my broader interest in explainable, data-aware AI, complementing my other works on causal inference and adaptive reinforcement learning.
Conceptually, DiffRAG-SQL parallels research by:
Prof. Sheila McIlraith, whose studies on structured reasoning and explainable AI highlight the importance of transparency in decision systems.
Prof. Jimmy Ba, whose advances in optimization and deep learning stability (e.g., Adam, Layer Norm) underpin the model training used here.
Prof. Alán Aspuru-Guzik, whose goal-directed generative models inspire the property-guided reasoning approach embedded in this project.
Methodology
Data Retrieval
Loads SQuAD via datasets.load_dataset("squad").
Extracts passages relevant to each question using TF-IDF retrieval with cosine similarity ranking.
Reader Model
Fine-tunes distilbert-base-cased for extractive QA.
Outputs span predictions and confidence scores.
SQL Grounding
Converts retrieved text into synthetic relational tables.
Generates SQL queries corresponding to the reasoning path of each answer.
Enables gradient flow through retrieval and selection layers.
Evaluation
Calculates EM/F1 scores.
Computes Faithfulness (alignment between generated answers and database rows).
Supports Abstention thresholds to prevent ungrounded predictions.
Automation and Paper Sync
Scripts automatically push evaluation metrics into paper/results.tex for LaTeX inclusion.
CI/CD ensures paper artifacts remain consistent with current code results.
Example Usage
pip install -r requirements.txt
python run_diffrag_sql.py --dataset squad --retriever tfidf --reader distilbert
python scripts/export_to_latex.py --metrics results/metrics.json
Repository Structure diffrag-sql/ ├─ data/ # Dataset loading and preprocessing ├─ retriever/ # TF-IDF and differentiable retrieval modules ├─ reader/ # DistilBERT QA model ├─ sql_grounding/ # SQL query generation + differentiable loss ├─ evaluation/ # EM/F1, Faithfulness, Abstention metrics ├─ scripts/ # Automation utilities (LaTeX export, CI) ├─ paper/ # Auto-built LaTeX manuscript ├─ tests/ # Unit + integration tests ├─ results/ # JSON logs and metrics ├─ requirements.txt ├─ LICENSE └─ README.md
Results Snapshot Metric Score Notes Exact Match (EM) 71.4 Consistent with baseline SQuAD reader F1 Score 78.6 Higher due to structured retrieval context Faithfulness 0.91 Fraction of answers supported by retrieved evidence Abstention Rate 0.08 Controlled by confidence thresholding
Results are automatically updated in results/metrics.json and synced to the LaTeX manuscript via CI.
Relevance
This work illustrates my capability to:
Integrate neural and symbolic paradigms (LLM + SQL).
Implement evaluation-driven pipelines for trustworthy AI.
Publish reproducible research artifacts with CI/CD verification.
It complements my Ontario Health Causal Analysis (causal reasoning) and Adaptive Cyber Defence RL+NLP (adaptive learning) projects, establishing a coherent research focus on interpretable, data-grounded AI systems.
License
This project is released under the MIT License — see LICENSE .
Citation @software{Kazi_DiffRAGSQL_2025, author = {Kazi, Jibran Rafat Samie}, title = {DiffRAG-SQL: Differentiable Retrieval + SQL Reasoning for Faithful Question Answering}, year = {2025}, url = {https://github.com/jibrankazi/diffrag-sql}, license = {MIT} }
Contact
Kazi Jibran Rafat Samie 📍 Toronto, Canada 📧 jibrankazi@gmail.com
🔗 github.com/jibrankazi
🔗 linkedin.com/in/jibrankazi
© 2025 Kazi Jibran Rafat Samie Independent research project on retrieval-augmented reasoning and faithful generation. Part of my doctoral research direction.