[codex] Add RAG knowledge UI and vector backend scaffold#12

Merged: 2002yy merged 1 commit into main from codex/rag-ui-backends on Jun 5, 2026

@2002yy (Owner) commented Jun 5, 2026

What changed

  • Added P4-C Streamlit RAG debug/knowledge panel: index summary, indexed document table, chunk preview, retrieval controls, vector backend status, and score breakdown table.
  • Added P5 embedding/vector backend scaffolding: EmbeddingProvider, LocalHashEmbeddingProvider, VectorBackend, LocalVectorBackend, env-driven backend selection, and optional Chroma adapter with lazy import.
  • Added backend_vector retrieval mode wired through indexing, querying, evaluation, and UI.
  • Updated RAG/TECH_STACK/README/TESTING docs and .env.example with local/chroma backend configuration.
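
The P5 scaffolding listed above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the four class names come from the list, but the method names (`embed`, `upsert`, `query`), the 64-dimension default, and the token-hashing scheme are assumptions chosen for illustration.

```python
import hashlib
import math
from typing import Protocol


class EmbeddingProvider(Protocol):
    """Anything that turns text into a fixed-length vector (assumed interface)."""

    def embed(self, text: str) -> list[float]: ...


class VectorBackend(Protocol):
    """Stores chunk vectors and answers similarity queries (assumed interface)."""

    def upsert(self, chunk_id: str, text: str) -> None: ...

    def query(
        self, text: str, top_k: int = 5, min_score: float = 0.05
    ) -> list[tuple[str, float]]: ...


class LocalHashEmbeddingProvider:
    """Deterministic bag-of-words hashing embedding (hypothetical scheme)."""

    def __init__(self, dim: int = 64) -> None:
        self.dim = dim

    def embed(self, text: str) -> list[float]:
        vec = [0.0] * self.dim
        for token in text.lower().split():
            # Hash each token into one of `dim` buckets and count it.
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vec[int.from_bytes(digest[:4], "big") % self.dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec))
        # L2-normalise so a dot product equals cosine similarity.
        return [v / norm for v in vec] if norm else vec


class LocalVectorBackend:
    """In-memory backend that ranks chunks by cosine similarity."""

    def __init__(self, provider: EmbeddingProvider) -> None:
        self._provider = provider
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, chunk_id: str, text: str) -> None:
        self._vectors[chunk_id] = self._provider.embed(text)

    def query(
        self, text: str, top_k: int = 5, min_score: float = 0.05
    ) -> list[tuple[str, float]]:
        q = self._provider.embed(text)
        scored = [
            (cid, sum(a * b for a, b in zip(q, v)))
            for cid, v in self._vectors.items()
        ]
        hits = [hit for hit in scored if hit[1] >= min_score]
        hits.sort(key=lambda hit: hit[1], reverse=True)
        return hits[:top_k]
```

Keeping the provider and the backend behind separate protocols is what would let a production embedding provider or a Chroma adapter be swapped in later without touching the retrieval callers.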

Boundary

This stops before P7. Chroma is an optional adapter scaffold using the current local hash embedding provider by default; production embedding providers remain planned.
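
One way to keep Chroma optional is to import it only when the environment asks for it. The sketch below assumes the `RAG_VECTOR_BACKEND` variable named in this PR's review thread; the function name, the silent fallback to `local` when `chromadb` is not installed, and the `ValueError` for unknown names are illustrative assumptions, not confirmed behavior of this PR.

```python
import importlib
import os
from typing import Any, Callable, Mapping, Optional


def resolve_backend_name(
    env: Optional[Mapping[str, str]] = None,
    importer: Callable[[str], Any] = importlib.import_module,
) -> str:
    """Pick the vector backend name from the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    name = env.get("RAG_VECTOR_BACKEND", "local").strip().lower()
    if name == "local":
        return "local"
    if name == "chroma":
        try:
            # Lazy import: chromadb is only touched when explicitly requested.
            importer("chromadb")
        except ImportError:
            # Assumed policy: fall back rather than fail when the extra is missing.
            return "local"
        return "chroma"
    raise ValueError(f"unknown RAG_VECTOR_BACKEND: {name!r}")
```

Passing the importer as a parameter keeps the selection logic testable on machines where `chromadb` is not installed.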

Validation

  • python -m pytest -q: 273 passed
  • python -m ruff check .: passed
  • detect-secrets scan over source/doc/test/config paths: results empty
  • python tools/package_project_helper.py . NUL 0: OK
  • git diff --check: passed with CRLF warnings only
  • python -m mypy --explicit-package-bases src: existing 18 soft-check errors remain
  • Playwright smoke on http://localhost:8504: RAG panel, indexed docs, chunk preview, vector backend, mode, min_score, debug toggle visible; no Traceback

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3b5750c834

Comment thread: src/rag/chroma_backend.py

    top_k: int = 5,
    min_score: float = 0.05,
) -> list[RagSearchResult]:
    _ = index

[P1] Filter Chroma queries to the active index

When RAG_VECTOR_BACKEND=chroma, rebuilding an index after removing a document (or using different index_path values through the API) leaves the old chunk ids in the persistent Chroma collection: upsert_index only upserts the current chunks, and query discards its index argument. A backend_vector search can therefore return stale chunks from documents that are no longer in the active JSON index, which corrupts both the source context and the eval results. Either delete or replace the stale ids, or constrain the Chroma query to the current index's chunk ids or document hashes.
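
Both remedies suggested here can be sketched in a few lines. The helper names below are hypothetical, and how `upsert_index` would obtain the active chunk ids is not shown in this thread. The only assumption about the collection is that it offers `get()` returning a dict with an `"ids"` list and `delete(ids=...)`, as Chroma collections do.

```python
from typing import Any, Iterable


def prune_stale_chunks(collection: Any, active_ids: Iterable[str]) -> list[str]:
    """Delete ids from the collection that are not in the active index.

    Hypothetical helper: meant to run right after upserting the current chunks.
    """
    active = set(active_ids)
    stored = collection.get()["ids"]
    stale = sorted(cid for cid in stored if cid not in active)
    if stale:
        # Skip the call entirely when there is nothing to delete.
        collection.delete(ids=stale)
    return stale


def filter_hits_to_active(
    hits: Iterable[tuple[str, float]], active_ids: Iterable[str]
) -> list[tuple[str, float]]:
    """Alternative remedy: drop query hits whose chunk id is not in the active index."""
    active = set(active_ids)
    return [(cid, score) for cid, score in hits if cid in active]
```

Pruning keeps the persistent collection in step with the JSON index, while filtering at query time also covers the case where several index_path values share one collection. Filtering after the query can return fewer than top_k results, so a caller may want to over-fetch.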

@2002yy merged commit 337adc7 into main on Jun 5, 2026
2 checks passed
@2002yy deleted the codex/rag-ui-backends branch on June 5, 2026 10:12