RAG chatbot for question answering over administrative documents, supporting two document sources:
- System documents: administrative documents and procedures that ship with the system.
- User-uploaded documents: Word/PDF files from the current session or from earlier sessions.
The backend uses FastAPI, MongoDB, Chroma, LangGraph, LangChain/OpenRouter/OpenAI, and LangSmith tracing. The frontend uses Next.js.
frontend/
  Next.js UI for chat, sessions, and document upload
backend/
  app/api/            FastAPI routes
  app/services/       business logic: chat, session, document
  app/repositories/   MongoDB repositories
  app/models/         data models
  app/schemas/        API schemas
  app/rag/            RAG pipeline
    graph/            LangGraph workflow
    rewrite/          rewrite gate + query rewriter
    query/            intent router
    retrieval/        filters, resolvers, retriever, strategy, validation
    generation/       LLM answer + source formatter
    vectorstore/      Chroma adapter
QAPipeline.run() remains the facade for the legacy API, but internally it calls RAGGraphRunner.
User Query
-> ChatService
-> QAPipeline.run
-> RAGGraphRunner
-> load_context_node
-> rewrite_detector_node
-> rewrite_query_node if a rewrite is needed
-> use_original_query_node if no rewrite is needed
-> intent_router_node
-> scope_resolver_node
-> if should_reuse_last_filter=true: build_filter_node
-> if should_reuse_last_filter=false: document_resolver_node
   -> candidate_selector_node
   -> build_filter_node
-> retrieval_strategy_node
-> retrieval_node
-> evidence_validation_node
-> answer_node / no_context_node / direct_answer_node / clarification_node / unsupported_node
-> update_state_node
-> Return API Response
Details of the flow and the node design are in pipeline.md.
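The branch after scope_resolver_node can be sketched as a plain routing function. This is a minimal illustration in pure Python, not the project's LangGraph code: the `GraphState` type, the function names, and the default of `False` for a missing flag are assumptions; only the node names and `should_reuse_last_filter` come from the flow above.

```python
from typing import TypedDict


class GraphState(TypedDict, total=False):
    # Hypothetical state shape; only this key is named in the README.
    should_reuse_last_filter: bool


def route_after_scope(state: GraphState) -> str:
    """Pick the node that follows scope_resolver_node."""
    if state.get("should_reuse_last_filter", False):
        # The previous filter still applies, so document resolution is skipped.
        return "build_filter_node"
    return "document_resolver_node"


def planned_path(state: GraphState) -> list[str]:
    """List the nodes visited from scope resolution down to retrieval."""
    path = ["scope_resolver_node"]
    if route_after_scope(state) == "document_resolver_node":
        path += ["document_resolver_node", "candidate_selector_node"]
    path += ["build_filter_node", "retrieval_strategy_node", "retrieval_node"]
    return path
```

Both branches converge on build_filter_node, which is why the filter rules only need to be enforced in one place.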
- rewrite_detector_node: decides whether the query needs to be rewritten.
- rewrite_query_node: rewrites a follow-up or ambiguous query into a standalone query.
- intent_router_node: classifies what the user wants to do, for example ask_question, compare_documents, general_query.
- scope_resolver_node: state-aware logic plus LLM structured output to confirm the final scope.
- document_resolver_node: resolves the document/procedure using MongoDB metadata; for system docs, it pre-filters by procedure_title + summary before searching chunks.
- candidate_selector_node: selects a document candidate when confident enough, and asks the user to clarify when the match is ambiguous.
- build_filter_node: builds the final metadata filter in code.
- retrieval_node: searches Chroma within the metadata filter.
- evidence_validation_node: rejects chunks with wrong metadata or a low score, and falls back when there is not enough evidence.
- answer_node: generates the answer from the retrieved context.
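As an illustration of what evidence_validation_node does, the sketch below drops chunks whose metadata does not match the filter or whose score is too low, then chooses the next node. The `Chunk` class, the thresholds, the "higher score is better" convention, and the choice of no_context_node as the fallback are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    score: float                      # assumed: higher means more relevant
    metadata: dict = field(default_factory=dict)


def validate_evidence(chunks, required_metadata, min_score=0.5, min_chunks=1):
    """Keep chunks that match the filter metadata and clear the score threshold."""
    kept = [
        chunk for chunk in chunks
        if chunk.score >= min_score
        and all(chunk.metadata.get(key) == value
                for key, value in required_metadata.items())
    ]
    # With too little evidence, route to a fallback node instead of answering.
    next_node = "answer_node" if len(kept) >= min_chunks else "no_context_node"
    return kept, next_node
```

Re-checking metadata after retrieval is a second line of defence: even if the vector store returned something outside the filter, it would not reach the answer prompt.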
The LLM must not build the metadata filter itself and must not decide access permissions.
The final filter is built deterministically in build_filter_node.
Rules:
- User uploads must be filtered by owner_user_id.
- Current-session uploads must be filtered by owner_user_id + session_id.
- Questions that name a file must be filtered by owner_user_id + filename.
- System docs must be filtered by source_type=system; public system docs also carry visibility=global.
- selected_document_ids coming from the UI/request must be permission-checked document by document before they enter the filter.
- Mixed retrieval must be split into separate branches: system_chunks and user_upload_chunks.
- Never search the whole Chroma collection when there is no valid scope/filter.
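The rules above can be sketched as a small deterministic builder. This is an illustrative version, not the code in build_filter_node: the function signature is hypothetical, the scope names are taken from this README, and the output uses Chroma's `where` syntax (`$and` over equality clauses).

```python
def build_filter(scope, *, user_id=None, session_id=None, filename=None):
    """Build a Chroma-style `where` filter for a resolved scope, in code only."""

    def require(**fields):
        # Refuse to build a filter when a mandatory field is missing.
        missing = [name for name, value in fields.items() if not value]
        if missing:
            raise ValueError(f"scope {scope!r} requires {', '.join(missing)}")
        return [{name: value} for name, value in fields.items()]

    if scope == "system_docs":
        clauses = [{"source_type": "system"}, {"visibility": "global"}]
    elif scope == "user_all_uploads":
        clauses = require(owner_user_id=user_id)
    elif scope == "current_session_uploads":
        clauses = require(owner_user_id=user_id, session_id=session_id)
    elif scope == "user_file_name":
        clauses = require(owner_user_id=user_id, filename=filename)
    else:
        # No valid scope means no search at all.
        raise ValueError(f"no filter can be built for scope {scope!r}")

    # Chroma expects `$and` to combine at least two clauses.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}
```

Because the builder raises instead of returning an empty filter, a missing `owner_user_id` can never silently widen the search to other users' chunks.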
- system_docs: system documents in general.
- system_procedure: one specific system procedure, identified by procedure_title.
- current_session_uploads: files the user uploaded in the current session.
- user_all_uploads: files the user uploaded in earlier sessions.
- user_file_name: a user-uploaded file identified by filename.
- hybrid_system_and_user: comparing/cross-checking system docs against user uploads.
- general_query: a question that needs no retrieval.
- need_clarification: the user must be asked to clarify.
The system follows a controlled agentic RAG approach:
- The LLM is used for:
  - rewrite gate
  - query rewrite
  - intent router, if enabled
  - scope analyzer structured output, if enabled
  - answer generation
- Rules/code/metadata are used for:
  - retrieval planner fast path
  - document resolver
  - system document pre-filter by procedure_title + summary
  - metadata filter
  - permission check
  - evidence validation
Prompt techniques in use:
- Structured JSON output.
- Enum constraints for intent/scope/resolution mode.
- Negative instructions: do not build filters, do not decide permissions.
- Short few-shot examples for query rewrite.
- Temperature 0.
- Deterministic fallback when the LLM fails.
- Security guard in code after the LLM.
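Structured output, enum constraints, and a deterministic fallback combine naturally when parsing the model's reply. The sketch below assumes the intent router answers with JSON containing an `intent` key and that ask_question is the fallback; both are assumptions for illustration, and only the three intent values are taken from this README.

```python
import json
from enum import Enum


class Intent(str, Enum):
    ASK_QUESTION = "ask_question"
    COMPARE_DOCUMENTS = "compare_documents"
    GENERAL_QUERY = "general_query"


def parse_intent(raw_llm_output: str,
                 fallback: Intent = Intent.ASK_QUESTION) -> Intent:
    """Parse the model's JSON reply; fall back deterministically on any problem."""
    try:
        payload = json.loads(raw_llm_output)
        # Intent(...) raises ValueError for values outside the enum.
        return Intent(payload["intent"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return fallback
```

Anything the model invents outside the enum is discarded here, so downstream code only ever sees a known intent.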
Tracing has been added for:
- rag_qa_pipeline
- rag_langgraph_run
- orchestration nodes such as rewrite, intent, planner, scope, document resolver, candidate selector, build filter, retrieval strategy.
Full retrieved chunks are not traced directly, so that oversized context does not end up in the logs.
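One way to keep traces small is to log a summary of the retrieved chunks rather than their full text. The helper below is a hypothetical sketch; the chunk dictionary keys (`chunk_id`, `score`, `text`) and the preview length are assumptions, not the project's actual trace payload.

```python
def summarize_chunks_for_trace(chunks, preview_chars=80):
    """Reduce retrieved chunks to ids, scores, and a short preview for tracing."""
    return {
        "count": len(chunks),
        "items": [
            {
                "chunk_id": chunk["chunk_id"],
                "score": chunk.get("score"),
                # Only a short prefix of the text is kept.
                "preview": chunk.get("text", "")[:preview_chars],
            }
            for chunk in chunks
        ],
    }
```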
Enable tracing through environment variables:
LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY=...
LANGSMITH_PROJECT=RAGMultiDocs
The backend needs these main variables:
MONGODB_URI=mongodb://localhost:27017
MONGODB_DB_NAME=rag_chatbot
CHROMA_PERSIST_DIR=chroma
CHROMA_COLLECTION_NAME=rag_chunks
OPENAI_API_KEY=...
OPENAI_MODEL=gpt-4o-mini
LLM_PROVIDER=openrouter
OPENROUTER_API_KEY=...
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
OPENROUTER_MODEL=google/gemini-2.5-flash-lite
OPENROUTER_QUERY_REWRITE_MODEL=google/gemini-2.5-flash-lite
OPENROUTER_REWRITE_GATE_MODEL=google/gemini-2.5-flash-lite
OPENROUTER_INTENT_MODEL=google/gemini-2.5-flash-lite
OPENROUTER_SCOPE_MODEL=google/gemini-2.5-flash-lite
OPENROUTER_SCOPE_MAX_TOKENS=128
INTENT_ROUTER_USE_LLM=true
SCOPE_RESOLVER_USE_LLM=true
UPLOAD_DIR=storage/raw
MARKDOWN_DIR=storage/markdown
CORS_ORIGINS=http://localhost:3000
Frontend:
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload
By default the backend runs at:
http://localhost:8000
cd frontend
npm install
npm run dev
By default the frontend runs at:
http://localhost:3000
docker compose up --build
- POST /auth/...: login/registration, depending on the current implementation.
- POST /sessions: create a session.
- GET /sessions: list the user's sessions.
- POST /documents/upload: upload PDF/DOC/DOCX.
- POST /chat: RAG question answering.
The frontend currently calls:
POST /chat
The chat payload contains:
{
"question": "...",
"session_id": "...",
"scope": "auto",
"selected_document_ids": []
}
Document flow:
Upload PDF/DOC/DOCX
-> convert to Markdown
-> chunking
-> create metadata
-> embedding
-> store vectors in Chroma
-> store document/chunk metadata in MongoDB
Important metadata:
- document_id
- chunk_id
- source_type
- owner_user_id
- session_id
- filename
- procedure_title
- summary (on system documents, generated by the metadata summary script)
- visibility
- page_number
- section_title
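The chunking and metadata steps of the document flow can be sketched as follows. This is a simplified stand-in, not the project's chunker: splitting on blank lines, the `max_chars` limit, and the `document_id:index` format for chunk_id are assumptions; the metadata keys are the ones listed above.

```python
def chunk_markdown(markdown, *, document_id, source_type, filename,
                   owner_user_id=None, session_id=None, max_chars=500):
    """Split Markdown on blank lines and attach metadata to every chunk."""
    texts, buffer = [], ""
    for paragraph in markdown.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if buffer and len(buffer) + len(paragraph) + 2 > max_chars:
            # The buffer is full: close it and start a new chunk.
            texts.append(buffer)
            buffer = paragraph
        else:
            buffer = f"{buffer}\n\n{paragraph}" if buffer else paragraph
    if buffer:
        texts.append(buffer)

    base = {
        "document_id": document_id,
        "source_type": source_type,
        "filename": filename,
        "owner_user_id": owner_user_id,
        "session_id": session_id,
    }
    # Drop unset fields so that no None values end up in the vector store metadata.
    base = {key: value for key, value in base.items() if value is not None}
    return [
        {"text": text, "metadata": {**base, "chunk_id": f"{document_id}:{index}"}}
        for index, text in enumerate(texts)
    ]
```

Every chunk carries the ownership fields of its document, which is what lets the retrieval filter enforce permissions at chunk level.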
System documents have a summary field stored directly on the document in MongoDB. DocumentResolver uses this summary together with procedure_title to pre-filter system documents before retrieval goes down to the chunk level.
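To make the pre-filter idea concrete, here is a toy version that ranks documents by word overlap between the query and procedure_title + summary. The README does not say how DocumentResolver scores documents, so the overlap scoring, the function names, and `top_k` are assumptions for illustration only.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def prefilter_documents(query, documents, top_k=3):
    """Rank system documents by word overlap with procedure_title + summary."""
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        doc_tokens = tokenize(f"{doc['procedure_title']} {doc.get('summary', '')}")
        overlap = len(query_tokens & doc_tokens)
        if overlap:
            scored.append((overlap, doc["document_id"]))
    # Highest overlap first; ties are broken by document_id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [document_id for _, document_id in scored[:top_k]]
```

The returned document_ids would then be added to the metadata filter, so the chunk search only runs inside the shortlisted documents.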
Create or update summaries through OpenRouter:
python backend/scripts/generate_system_document_metadata_summaries.py --overwrite
Dry run without writing to the DB:
python backend/scripts/generate_system_document_metadata_summaries.py --limit 1 --dry-run
By default the script uses this model:
google/gemini-2.5-flash-lite
Field written to MongoDB:
summary
Auxiliary fields such as keyword/procedure type are not stored.
The retrieval benchmark for system docs is located at:
backend/tests/fixtures/system_retrieval_benchmark.jsonl
Each case contains:
- query
- expected_document.document_id
- expected_chunk_ids
- evidence_chunks[].chunk_id
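A benchmark case with these fields can be loaded and checked with a few lines of Python. The sample line below is made up for illustration (the ids and the query are placeholders, not real fixture data), and the validation helper is a hypothetical sketch rather than part of the repository.

```python
import json

REQUIRED_KEYS = ("query", "expected_document", "expected_chunk_ids", "evidence_chunks")


def parse_benchmark_line(line):
    """Parse one JSONL line and check that the documented fields are present."""
    case = json.loads(line)
    missing = [key for key in REQUIRED_KEYS if key not in case]
    if missing:
        raise ValueError(f"benchmark case is missing: {missing}")
    if "document_id" not in case["expected_document"]:
        raise ValueError("expected_document.document_id is missing")
    evidence_ids = {chunk["chunk_id"] for chunk in case["evidence_chunks"]}
    return case, evidence_ids


# Placeholder case that only mirrors the documented shape.
sample_line = json.dumps({
    "query": "example question",
    "expected_document": {"document_id": "doc-1"},
    "expected_chunk_ids": ["doc-1:0"],
    "evidence_chunks": [{"chunk_id": "doc-1:0"}],
})
case, evidence_ids = parse_benchmark_line(sample_line)
```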
Generate the benchmark from the system Markdown and the chunks stored in MongoDB:
python backend/scripts/generate_system_retrieval_benchmark.py --limit 10 --queries-per-doc 2 --chunk-source stored
If MongoDB has no chunks yet and you want to fall back temporarily to Markdown chunks:
python backend/scripts/generate_system_retrieval_benchmark.py --limit 10 --queries-per-doc 2 --chunk-source stored --allow-markdown-fallback
Latest benchmark results after enabling the procedure_title + summary pre-filter:
total: 20
doc hit: 20/20 = 100%
chunk hit: 19/20 = 95%
chunk top-1: 10/20
chunk top-3: 19/20
MRR: 0.7167
The one remaining miss is a case where several chunks of the same document all contain the correct information about the time limit; the top-1 result is a chunk with the right content but a different id from expected_chunk_id.
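For readers unfamiliar with the metrics, the sketch below shows how hit rate at k and MRR are computed from the rank of the first expected chunk per query. The rank distribution used here is inferred, not taken from the benchmark output: it is one distribution (10 hits at rank 1, 8 at rank 2, 1 at rank 3, 1 miss) that is consistent with the reported numbers.

```python
def hit_rate_at_k(ranks, k):
    """Share of queries whose first expected chunk appears within the top k."""
    return sum(1 for rank in ranks if rank is not None and rank <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; a miss (None) contributes 0."""
    return sum(1.0 / rank for rank in ranks if rank is not None) / len(ranks)


# Assumed distribution, chosen to match the reported totals.
ranks = [1] * 10 + [2] * 8 + [3] * 1 + [None] * 1

print(hit_rate_at_k(ranks, 1))                # 0.5
print(hit_rate_at_k(ranks, 3))                # 0.95
print(round(mean_reciprocal_rank(ranks), 4))  # 0.7167
```

Under this reading, most of the gap between the 95% hit rate and the MRR comes from correct chunks landing at rank 2 rather than rank 1.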
Run the full backend test suite (PowerShell):
cd backend
$env:PYTHONPATH='.'
$env:INTENT_ROUTER_USE_LLM='false'
$env:SCOPE_RESOLVER_USE_LLM='false'
pytest -q
Latest result:
56 passed, 17 warnings
The current warnings are mainly:
- FastAPI on_event deprecated.
- datetime.utcnow() deprecated.
Already in place:
- LangGraph RAG pipeline.
- Rewrite gate + query rewriter.
- Intent router.
- Scope analyzer with LLM structured output and fallback.
- OpenRouter Gemini 2.5 Flash Lite for answer, rewrite gate, query rewrite, intent router, scope analyzer.
- Deterministic metadata filter.
- Permission check by user/session.
- System document summary and document pre-filter by procedure_title + summary.
- Candidate selector.
- Evidence validation.
- Source formatter.
- LangSmith trace.
- Chat API from frontend to backend.
To improve next:
- Move the actual state-persistence logic into update_state_node, so that ChatService only persists the result.
- Improve the candidate selector with confidence scoring or an LLM selector over the top 3-5 metadata candidates.
- Add BM25/hybrid retrieval/reranking if needed.
- Switch FastAPI startup to lifespan.
- Replace datetime.utcnow() with a timezone-aware datetime.
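The last two items could look roughly like the sketch below. It is written without importing FastAPI so that it stays self-contained; in the application module the context manager would be passed as `FastAPI(lifespan=lifespan)`. The `utc_now` helper and the `started_at`/`stopped_at` attributes are hypothetical names used only for illustration.

```python
from contextlib import asynccontextmanager
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


@asynccontextmanager
async def lifespan(app):
    # Startup work that previously lived in @app.on_event("startup").
    app.state.started_at = utc_now()
    yield
    # Shutdown work that previously lived in @app.on_event("shutdown").
    app.state.stopped_at = utc_now()
```

Unlike `datetime.utcnow()`, the value returned by `utc_now()` carries `tzinfo`, so timestamps written to MongoDB are not ambiguous when they are compared or serialized.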