This project implements a fully local, agentic RAG (Retrieval-Augmented Generation) pipeline using LangGraph. It is designed to:
- Run entirely on macOS, offline
- Use local embeddings with llama.cpp
- Perform iterative query refinement when initial retrieval fails
- Be clean, minimal, and beginner-friendly
The goal is to empower the LLM not just to answer, but to reason, evaluate context, and rewrite queries when needed.
flowchart TD
A[User Query Input] --> B[Initial Document Retrieval]
B --> C[LLM Grades Retrieved Docs]
C -->|Relevant| D[Generate Final Answer]
C -->|Not Relevant| E[LLM Rewrites Query]
E --> F[New Retrieval from Alternate Source]
F --> G[LLM Generates Final Answer]
D --> H[Return Answer to User]
G --> H
This shows the agentic loop: query → retrieve → grade → optionally rewrite → final answer.
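The control flow in the diagram can be sketched in plain Python. This is a dependency-free illustration, not the project's LangGraph implementation: `agentic_rag` and the `toy_*` helpers are hypothetical names, and the stand-ins replace the real retriever and LLM calls. Answering the original question from the re-retrieved documents is also an assumption, since the diagram does not say which query the final generation step uses.

```python
from typing import Callable, List


def agentic_rag(
    query: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, List[str]], bool],
    rewrite: Callable[[str], str],
    fallback_retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """One pass of the loop: retrieve, grade, optionally rewrite, answer."""
    docs = retrieve(query)                    # B: initial document retrieval
    if grade(query, docs):                    # C: grader judges relevance
        return generate(query, docs)          # D: answer from relevant docs
    new_query = rewrite(query)                # E: rewrite the query
    new_docs = fallback_retrieve(new_query)   # F: retrieve from alternate source
    # G: assumption: answer the original question using the new documents
    return generate(query, new_docs)


# Toy stand-ins so the sketch runs without a model or a vector store.
CORPUS = {"langgraph": ["LangGraph builds stateful agent graphs."]}


def toy_retrieve(q: str) -> List[str]:
    return CORPUS.get(q.lower(), [])


def toy_grade(q: str, docs: List[str]) -> bool:
    return bool(docs)  # "relevant" here just means something was found


def toy_rewrite(q: str) -> str:
    return "langgraph"  # a real system would ask the LLM for a better query


def toy_generate(q: str, docs: List[str]) -> str:
    return f"{q} -> {docs[0]}" if docs else f"{q} -> no context found"


answer = agentic_rag(
    "what is lg?", toy_retrieve, toy_grade, toy_rewrite, toy_retrieve, toy_generate
)
print(answer)
```

Passing each step as a callable keeps the branching logic separate from the model and the retriever, which is roughly the separation a graph of nodes and conditional edges provides.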
This project builds on the excellent LangChain tutorial: 🔗 Agentic RAG with LangGraph
Compared to the original tutorial, this project includes:
- Local Embeddings via llama.cpp, with no cloud dependency
- Cleaner Parsing with trafilatura for robust HTML extraction
- Smarter Chunking using sentence-aware segmentation
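To illustrate what sentence-aware chunking means, here is a minimal sketch that packs whole sentences into size-limited chunks. It swaps in a naive regex splitter so it runs with the standard library alone; the project itself downloads a spaCy model, which would presumably handle sentence boundaries more robustly. The function name `sentence_chunks` and the `max_chars` parameter are hypothetical.

```python
import re
from typing import List


def sentence_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the chunk at a sentence boundary
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "RAG retrieves documents. It then grades them! Should the query change? Sometimes."
chunks = sentence_chunks(text, max_chars=50)
for chunk in chunks:
    print(repr(chunk))
```

A chunk never ends mid-sentence. The trade-off is that a single sentence longer than `max_chars` is kept whole and therefore exceeds the limit.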
Download and install the latest stable Python 3.13.x for macOS:
- Under the Python 3.13.x section, download:
  - macOS 64-bit universal2 installer (python-3.13.x-macos11.pkg)
- Run the .pkg installer and follow the GUI instructions.
- Confirm installation:
python3.13 --version
uv is a high-performance, Rust-based tool for managing Python packages and environments.
- Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
- Add uv to your shell’s PATH (if not already):
export PATH="$HOME/.local/bin:$PATH"
- Verify installation:
which uv
uv --version
uv venv --python $(which python3.13)
source .venv/bin/activate
uv pip install -r requirements.txt
touch .env
Add the following:
WATSONX_PROJECT=
WATSONX_APIKEY=
Replace values with your Watsonx credentials.
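Before running the notebook, it can help to confirm that both variables are actually set. The sketch below reads `.env` with a small standard-library parser; this is an assumption made for self-containment, and the project may load the file differently (for example with a dotenv library). The helpers `parse_env` and `missing_keys` are hypothetical.

```python
import os
from pathlib import Path
from typing import Dict, List

REQUIRED_KEYS = ("WATSONX_PROJECT", "WATSONX_APIKEY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


env_path = Path(".env")
values = parse_env(env_path.read_text()) if env_path.exists() else {}
# Variables already exported in the shell take precedence over the file.
values.update({k: os.environ[k] for k in REQUIRED_KEYS if k in os.environ})
missing = missing_keys(values)
print("Missing:", ", ".join(missing) if missing else "none")
```

An empty value such as `WATSONX_APIKEY=` counts as missing, which catches the case where the template was copied but never filled in.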
python -m ensurepip --upgrade
python -m spacy download en_core_web_sm
wget -O granite-embedding-30m-english-Q6_K.gguf \
https://huggingface.co/lmstudio-community/granite-embedding-30m-english-GGUF/resolve/main/granite-embedding-30m-english-Q6_K.gguf
uv pip install jupyter ipykernel
python -m ipykernel install --user --name=myenv --display-name "Python (.venv)"
- Cmd + Shift + P → Python: Select Interpreter
- Press Cmd + Shift + . to show hidden .venv folder
- Choose .venv/bin/python
- Cmd + Shift + P → Jupyter: Select Interpreter to Start Jupyter Server
- Choose the same .venv Python
- If kernel doesn’t show:
  - Temporarily select a different one
  - Re-select .venv
  - Run Developer: Reload Window
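Once a kernel is selected, a quick check in the first notebook cell can confirm that it really runs from the project environment. This is a small sketch using only the standard library; the helper names are hypothetical, and the `.venv` directory name is taken from the setup steps above.

```python
import sys
from pathlib import Path


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a virtual environment."""
    return sys.prefix != sys.base_prefix


def uses_project_venv(executable: str, venv_dir: str = ".venv") -> bool:
    """True when the interpreter path contains a directory named venv_dir."""
    return venv_dir in Path(executable).parts


executable = sys.executable or ""
print("Interpreter:", executable)
print("Inside a virtual environment:", in_virtualenv())
print("Path contains .venv:", uses_project_venv(executable))
```

If the last line prints `False`, the notebook is probably attached to a different interpreter, and re-selecting the kernel as described above is the next thing to try.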