bpnace/Documentation-RAG

Documentation RAG

Sanitized case study for a documentation assistant built on retrieval-augmented generation (RAG).

This repository is a public portfolio version of a private automation concept. It explains the architecture and engineering decisions without publishing a usable n8n export, live endpoints, server details, provider configuration, credentials, prompts, account identifiers, vector table names, or production payloads.

Case Study

Problem

Documentation grows faster than teams can read it. Search alone often returns pages, not answers. A useful assistant needs four things: source content that is refreshed regularly, knowledge split into retrievable chunks, a retrieval step, and answer generation that receives enough context to stay grounded.

Solution

The private system indexes selected documentation pages, cleans and chunks the content, stores embeddings, and exposes a chat interface that retrieves relevant context before answering. This public case study keeps the architecture visible without publishing the importable n8n workflow.

flowchart LR
  A["Documentation Source"] --> B["Content Collector"]
  B --> C["Cleaner"]
  C --> D["Chunker"]
  D --> E["Embedding Model"]
  E --> F["Knowledge Store"]
  G["User Question"] --> H["Retriever"]
  F --> H
  H --> I["Answer Agent"]
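The query-time half of the diagram (User Question, Retriever, Answer Agent) can be sketched in a few lines. This is an illustration only, not the private implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and `KnowledgeStore`, `retrieve`, and `build_prompt` are hypothetical names chosen for this example.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KnowledgeStore:
    """In-memory stand-in for a vector store (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._rows = []  # list of (chunk_text, vector) pairs

    def add(self, chunk: str) -> None:
        self._rows.append((chunk, embed(chunk)))

    def retrieve(self, question: str, k: int = 2) -> List[str]:
        """Return the k chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Assemble the text handed to the answer step; the wording is made up."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


store = KnowledgeStore()
for chunk in [
    "Webhooks retry failed deliveries up to five times.",
    "API keys are rotated from the account settings page.",
    "Rate limits reset every sixty seconds.",
]:
    store.add(chunk)

top = store.retrieve("How often do rate limits reset?", k=1)
prompt = build_prompt("How often do rate limits reset?", top)
```

The point of the shape is that retrieval runs before generation: the answer step only ever sees the question plus the retrieved chunks.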

Engineering Decisions

  • Split ingestion from query-time answering.
  • Deduplicate pages before indexing to avoid repeated context.
  • Keep chunks small enough to retrieve precisely, but large enough to keep a section's context intact.
  • Refresh the knowledge store on a schedule instead of relying on stale exports.
  • Keep provider settings and storage schema outside public documentation.
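Two of the decisions above, deduplicating pages and keeping chunks bounded without cutting through sections, lend themselves to a short sketch. Everything here is an assumption made for illustration: the function names, the whitespace and case normalization used to detect duplicates, the paragraph-based packing, and the `max_chars` budget are not taken from the private workflow.

```python
import hashlib
import re
from typing import Dict, List


def normalize(text: str) -> str:
    """Collapse whitespace and case so trivially different copies hash alike."""
    return re.sub(r"\s+", " ", text).strip().lower()


def dedupe_pages(pages: Dict[str, str]) -> Dict[str, str]:
    """Keep the first page seen for each distinct normalized body."""
    seen = set()
    unique = {}
    for page_id, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique[page_id] = body
    return unique


def chunk_page(body: str, max_chars: int = 200) -> List[str]:
    """Pack whole paragraphs into chunks of at most max_chars where possible."""
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in body.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A paragraph longer than max_chars is kept whole as its own chunk.
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Packing by paragraph rather than by fixed character offsets is one way to avoid splitting a sentence across two chunks; a real pipeline might split on headings or token counts instead.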

Outcome

The private workflow turns a documentation set into a practical support assistant for implementation questions. The public version demonstrates RAG architecture, indexing strategy, and operational hygiene without exposing a reproducible setup.

What This Shows

  • Documentation ingestion and cleaning
  • Scheduled re-indexing
  • Chunking and embedding strategy
  • Vector retrieval before answer generation
  • Chat assistant architecture for technical documentation

What Is Intentionally Missing

  • No importable n8n workflow
  • No node parameters
  • No server URL or source URL list
  • No vector database schema
  • No prompts, credentials, or provider configuration
  • No production examples or execution payloads

Repository Structure

docs/architecture.md                     Sanitized architecture notes
scripts/validate-public-case-study.mjs   Leak and export-shape validation
SECURITY.md                              Public handling policy
package.json                             Validation command

Validate

npm test

The validator blocks common secret formats, URLs, webhook paths, credential blocks, local paths, n8n export shapes, and private terms supplied through SHOWCASE_PRIVATE_TERMS.
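The actual validator is a Node script and is not reproduced here. As a rough illustration of the kind of checks described above, a leak scan could look like the Python sketch below. The patterns are a small, assumed subset, the comma-separated format for `SHOWCASE_PRIVATE_TERMS` is a guess, and treating a JSON object with both `nodes` and `connections` keys as a workflow export is an assumption about what "export shape" means.

```python
import json
import os
import re
from typing import List

# Assumed sample patterns; a real validator would likely cover more formats.
PATTERNS = {
    "url": re.compile(r"https?://", re.IGNORECASE),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "local_path": re.compile(r"(/Users/|/home/|[A-Za-z]:\\Users\\)"),
}


def private_terms() -> List[str]:
    """Read extra blocked terms from the environment (comma-separated is assumed)."""
    raw = os.environ.get("SHOWCASE_PRIVATE_TERMS", "")
    return [term.strip().lower() for term in raw.split(",") if term.strip()]


def looks_like_workflow_export(text: str) -> bool:
    """Flag JSON objects that carry both 'nodes' and 'connections' keys."""
    try:
        data = json.loads(text)
    except ValueError:
        return False
    return isinstance(data, dict) and {"nodes", "connections"}.issubset(data)


def scan(text: str) -> List[str]:
    """Return the names of every check that the text trips."""
    findings = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    lowered = text.lower()
    findings += [f"private_term:{term}" for term in private_terms() if term in lowered]
    if looks_like_workflow_export(text):
        findings.append("workflow_export")
    return findings
```

A test runner would call `scan` on each tracked file and fail the build if any list comes back non-empty.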

Note

This is a case study, not a template. It is designed to show retrieval architecture and automation judgment without giving away a working implementation.
