Code examples for building safer RAG (Retrieval-Augmented Generation) systems. Learn techniques to prevent hallucinations, improve retrieval quality, and add traceability to your AI applications.
This repository contains implementations in Python and TypeScript for:
- Semantic Stress (ΔS) - Measure relevance between queries and retrieved chunks
- Semantic Filtering - Filter out low-quality retrievals before they reach your LLM
- Threshold Tuning - Find optimal filtering thresholds for your specific use case
- Citation Tracking - Add traceability to know which sources influenced responses
- Reranking - Optimize chunk ordering for better context
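As a rough illustration of the ΔS and reranking ideas listed above, the sketch below assumes ΔS is defined as 1 minus the cosine similarity between a query embedding and a chunk embedding, so lower values mean a closer match. That definition, the function names (`cosine_similarity`, `delta_s`, `rerank`), and the plain-list vectors are assumptions made for illustration; the repository's `SemanticStressCalculator` may compute ΔS differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def delta_s(query_vec, chunk_vec):
    """Assumed definition: semantic stress = 1 - cosine similarity."""
    return 1.0 - cosine_similarity(query_vec, chunk_vec)


def rerank(query_vec, chunks):
    """Return chunks ordered from lowest ΔS (closest match) to highest.

    Each chunk is a dict with an "embedding" key (a hypothetical shape).
    """
    return sorted(chunks, key=lambda c: delta_s(query_vec, c["embedding"]))
```

In practice the vectors would come from an embedding model rather than being written by hand; the toy lists only keep the sketch free of external dependencies.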
Choose your language:
- Python Examples - Uses sentence-transformers for embeddings
- TypeScript Examples - Uses HuggingFace Inference API
Each directory has its own README with setup instructions and runnable examples.
Complete Semantic Firewall Implementation
- Production-ready filtering with configurable thresholds
- Batch processing support for efficiency
- Metadata preservation for downstream processing
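The filtering step described above might look roughly like the following. This is a minimal sketch, not the repository's implementation: the `ScoredChunk` dataclass and `filter_chunks` function are hypothetical names, the chunks are assumed to carry a precomputed ΔS score, and a chunk is assumed to pass when its ΔS is strictly below the threshold (0.60 is the course-recommended value mentioned in this README).

```python
from dataclasses import dataclass, field


@dataclass
class ScoredChunk:
    """A retrieved chunk with its ΔS score and arbitrary metadata (hypothetical shape)."""
    text: str
    delta_s: float
    metadata: dict = field(default_factory=dict)


def filter_chunks(chunks, threshold=0.60):
    """Split a batch of chunks into (passed, blocked) lists.

    Assumption: lower ΔS means more relevant, so a chunk passes only when
    its ΔS is strictly below the threshold. Chunk objects are returned
    untouched, so their metadata stays available downstream.
    """
    passed, blocked = [], []
    for chunk in chunks:
        if chunk.delta_s < threshold:
            passed.append(chunk)
        else:
            blocked.append(chunk)
    return passed, blocked
```

Returning the blocked chunks alongside the passed ones keeps the rejected material available for logging or later inspection, instead of silently discarding it.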
Before/After Comparison Scripts
- Visualize filtering impact with side-by-side comparisons
- Statistics on pass rates, average ΔS, and quality improvements
- Impact analysis showing hallucination prevention
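To give a sense of what before/after statistics could be computed, here is a small sketch that summarises a list of ΔS scores. The function name `filtering_report`, the returned keys, and the rule that scores below the threshold pass are all assumptions for illustration; the comparison scripts in the repository may report different figures.

```python
from statistics import mean


def filtering_report(delta_s_scores, threshold=0.60):
    """Summarise the effect of filtering a list of ΔS scores.

    Assumption: a score passes when it is strictly below the threshold.
    Averages are None when there is nothing to average.
    """
    kept = [s for s in delta_s_scores if s < threshold]
    total = len(delta_s_scores)
    return {
        "total": total,
        "passed": len(kept),
        "pass_rate": len(kept) / total if total else 0.0,
        "avg_delta_s_before": mean(delta_s_scores) if total else None,
        "avg_delta_s_after": mean(kept) if kept else None,
    }
```

Comparing `avg_delta_s_before` with `avg_delta_s_after` shows how much the filter tightens the context, while `pass_rate` shows how much of the retrieval it discards.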
Threshold Tuning Utilities
- Sweep multiple thresholds to find optimal values
- Precision/recall/F1 optimization with ground truth
- Quick-tune mode for exploration without labels
- Compare the course-recommended threshold of 0.60 against your data
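A threshold sweep with ground-truth labels can be sketched as follows. The function names (`sweep_thresholds`, `best_threshold`) and the input shape, a list of `(delta_s, is_relevant)` pairs, are hypothetical; the sketch also assumes that a chunk passes when its ΔS is strictly below the threshold, which may not match the repository's tuning utilities.

```python
def sweep_thresholds(scored, thresholds):
    """Compute precision, recall, and F1 for each candidate threshold.

    `scored` is a list of (delta_s, is_relevant) pairs, where is_relevant
    is the ground-truth label. Assumption: a chunk passes the filter when
    its ΔS is strictly below the threshold.
    """
    results = []
    for t in thresholds:
        tp = sum(1 for ds, rel in scored if ds < t and rel)       # relevant, passed
        fp = sum(1 for ds, rel in scored if ds < t and not rel)   # irrelevant, passed
        fn = sum(1 for ds, rel in scored if ds >= t and rel)      # relevant, blocked
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append(
            {"threshold": t, "precision": precision, "recall": recall, "f1": f1}
        )
    return results


def best_threshold(results):
    """Return the sweep entry with the highest F1 (first one wins on ties)."""
    return max(results, key=lambda r: r["f1"])
```

Including 0.60 among the candidate thresholds makes it easy to see whether the course-recommended value is close to the best F1 on your own labelled data.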
Real-World Test Cases
- 15+ production scenarios across 5 domains
- Customer Support, Technical Docs, E-commerce, Healthcare, Legal
- Edge cases: multi-intent, temporal context, jargon
- Failure mode detection: topic drift, keyword traps, context pollution
- See python/tests/README.md for details
This repository accompanies my Gumroad course, which adds detailed explanations and best practices for building a production-ready RAG firewall.
Python:

from src.core.semantic_stress import SemanticStressCalculator

calculator = SemanticStressCalculator()
result = calculator.calculate_delta_s(
    question="How do I cancel?",
    chunk="To cancel, visit settings."
)
print(f"ΔS: {result['delta_s']:.3f}")

TypeScript:

import { SemanticStressCalculator } from "./src/core/semanticStress";

const calculator = new SemanticStressCalculator();
const result = await calculator.calculateDeltaS(
  "How do I cancel?",
  "To cancel, visit settings."
);
console.log(`ΔS: ${result.delta_s.toFixed(3)}`);

See LICENSE for details.