Local web UI to list Chroma collections, page through documents and metadata, run semantic search (LangChain Chroma + OpenAIEmbeddings), and visualize embeddings in 2D (UMAP with PCA fallback for tiny sets, optional k-means colors).
Repo context: this app reads the same chroma_data/ produced by vector_builder (ingest.py). For a prompt + RAG → HTML flow in the browser, see prototype. How everything connects: docs/REPO-MAP.md.
Feature reference (what each control does + how the backend reads Chroma): USER-GUIDE.md. Manual QA: TESTING.md.
- Python 3.11+
- Node 20+ (or 18 LTS)
- A populated Chroma persist directory (e.g. after running python vector_builder/scripts/ingest.py)
- OPENAI_API_KEY for semantic search (embeddings must match the model used when the index was built). For a self-hosted OpenAI-compatible embedding server (e.g. vLLM), set OPENAI_BASE_URL in backend/.env (with trailing /v1) and point OPENAI_EMBED_MODEL at the served model id; use any non-empty key if the server does not enforce auth.
cd chromaUI/backend
python -m venv .venv
# Windows: .venv\Scripts\activate
# Git Bash: source .venv/Scripts/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env: set CHROMA_PERSIST_DIRECTORY to your Chroma folder (absolute, or relative to chromaUI/backend/)
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000
API docs: http://127.0.0.1:8000/docs
In a second terminal:
cd chromaUI/frontend
npm install
npm run dev
Open http://localhost:5173. The dev server proxies /api to the backend on port 8000.
To call a backend on another origin, set VITE_API_BASE in frontend/.env (see frontend/.env.example).
UI layout
On xl and up (1280px+), the shell uses resizable panels (react-resizable-panels): drag the grips between Collections and the workspace, and between the main column and Selection detail. Double-click a grip to collapse the panel on its left to a slim bar; drag the bar to expand. Sizes persist in localStorage (chromaui-sidebar, chromaui-main-detail). The collections list stays in the left sidebar; the main column stacks semantic search, metadata filter, 2D projection controls, then documents and embedding plot (Split / Table / Viz). Selection detail sits in the right panel on xl+. Below xl, the sidebar is a drawer (header menu), detail stacks under the workspace, and panel resizing is disabled so touch layout stays predictable.
Shortcuts and accessibility: Use Tab from the top of the page to reveal Skip to main content (jumps to #main-workspace). Press / (when not focused in an input or textarea) to focus semantic search. With the collections drawer open on small screens, Escape closes it and returns focus to the menu button; the drawer also locks background scroll. The 2D chart loads Plotly on demand so the first paint stays small.
| Variable | Where | Purpose |
|---|---|---|
| CHROMA_PERSIST_DIRECTORY | chromaUI/backend/.env | Path to Chroma persist_directory (absolute, or relative to chromaUI/backend/, not the shell cwd, so ../vector_builder/chroma_data works even if you start uvicorn from the repo root) |
| OPENAI_API_KEY | backend .env | Required for /search |
| OPENAI_EMBED_MODEL | optional | Default text-embedding-3-small; must match the model used to build the index |
| OPENAI_BASE_URL | optional | OpenAI-compatible root URL, e.g. vLLM behind ngrok; must end with /v1. When set, LangChain sends embedding requests there instead of api.openai.com |
| FRONTEND_ORIGINS | optional | CORS allowlist, comma-separated; default http://localhost:5173 |
| VITE_API_BASE | frontend .env | Optional; empty uses same-origin /api (Vite proxy in dev) |
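
The rules in the table above (path anchoring, the /v1 suffix, the comma-separated allowlist) can be sketched in a few lines of Python. This is only an illustration and not the code in backend/app/config.py: the helper names and the BACKEND_DIR constant are hypothetical, and the legacy-path rewrite mentioned under troubleshooting is not modelled.

```python
import os
from pathlib import Path

# Hypothetical constant: stands in for the chromaUI/backend/ folder.
BACKEND_DIR = Path("chromaUI/backend")


def resolve_persist_dir(raw: str, backend_dir: Path = BACKEND_DIR) -> Path:
    """Absolute paths pass through; relative ones are anchored at backend_dir, not the shell cwd."""
    path = Path(raw)
    if path.is_absolute():
        return path
    # normpath collapses ".." segments without touching the filesystem.
    return Path(os.path.normpath(backend_dir / path))


def check_base_url(url: str) -> str:
    """Accept an OpenAI-compatible root URL only if it ends with /v1."""
    cleaned = url.rstrip("/")
    if not cleaned.endswith("/v1"):
        raise ValueError("OPENAI_BASE_URL must end with /v1")
    return cleaned


def parse_origins(raw: str = "http://localhost:5173") -> list[str]:
    """Split a comma-separated CORS allowlist, dropping empty entries."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


print(resolve_persist_dir("../../vector_builder/chroma_data"))
print(check_base_url("http://localhost:8001/v1/"))
print(parse_origins("http://localhost:5173, http://localhost:5174"))
```

Anchoring relative paths at a fixed folder rather than the cwd is what lets the same .env work no matter where uvicorn is started from.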
Pass a Chroma where clause as JSON (see Chroma filtering). Examples:
- {"sectionType": "hero"}
- {"sectionType": {"$eq": "hero"}} (explicit operator form)
Use Apply filter in the UI, then reload documents and the 2D plot.
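
Before pasting a filter into the UI, it can help to confirm that it parses as a JSON object. The sketch below does that and also shows that the two example forms above are equivalent. The helper names parse_where and to_explicit are hypothetical and are not part of this app or of Chroma.

```python
import json


def parse_where(text: str) -> dict:
    """Parse a where clause typed as JSON; reject anything that is not a JSON object."""
    clause = json.loads(text)
    if not isinstance(clause, dict):
        raise ValueError("where clause must be a JSON object")
    return clause


def to_explicit(clause: dict) -> dict:
    """Rewrite shorthand equality ({"k": v}) into the explicit {"k": {"$eq": v}} form."""
    explicit = {}
    for key, value in clause.items():
        if key.startswith("$") or isinstance(value, dict):
            # Logical operator or already explicit: leave untouched.
            explicit[key] = value
        else:
            explicit[key] = {"$eq": value}
    return explicit


short = parse_where('{"sectionType": "hero"}')
full = parse_where('{"sectionType": {"$eq": "hero"}}')
print(to_explicit(short) == full)
```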
chromaUI/
README.md
USER-GUIDE.md
TESTING.md
backend/
app/
main.py
config.py
routers/collections.py
services/chroma_service.py
requirements.txt
.env.example
frontend/
src/
App.tsx
api.ts
components/
package.json
- Scores: search returns Chroma/LangChain distance (lower is closer for typical L2 setups) and a derived similarity = 1/(1+distance) for quick comparison.
- UMAP: collections with fewer than 15 points use PCA (or raw coords) instead of UMAP.
- No auth: bind to 127.0.0.1 and do not expose on a public network.
- Empty collections / blank sidebar: CHROMA_PERSIST_DIRECTORY must point at the same folder vector_builder/scripts/ingest.py writes to (vector_builder/chroma_data at repo root). Relative values are resolved from chromaUI/backend/; the legacy ../vector_builder/chroma_data in .env is rewritten to that repo path automatically. If you previously opened the UI with an older backend, delete any stray empty folder chromaUI/vector_builder/chroma_data to avoid confusion.
- Port 5173 already in use: the prototype app’s Vite dev server also defaults to 5173. Run only one frontend, or change one app’s Vite port (see docs/REPO-MAP.md).
- pip resolution errors (ResolutionImpossible): upgrade pip (python -m pip install -U pip) and reinstall. This repo pins numpy<2 in backend/requirements.txt for compatibility with Chroma and umap-learn on many platforms.
- Read timed out during install: retry pip install or use a faster index mirror.
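
The Scores and UMAP notes in the list above can be sketched as two small functions. This is a minimal illustration, not the backend's code: the function names are hypothetical, the "raw coords" case is not modelled, and the similarity formula assumes a non-negative distance such as L2.

```python
UMAP_MIN_POINTS = 15  # threshold taken from the UMAP note above


def similarity_from_distance(distance: float) -> float:
    """Map a non-negative distance to (0, 1]; a distance of 0 gives a similarity of 1."""
    return 1.0 / (1.0 + distance)


def choose_projection(n_points: int) -> str:
    """Pick the 2D projection method by collection size (hypothetical helper)."""
    return "umap" if n_points >= UMAP_MIN_POINTS else "pca"


# Made-up distances for illustration only.
hits = [("doc-a", 0.0), ("doc-b", 1.0), ("doc-c", 3.0)]
for doc_id, dist in hits:
    print(doc_id, similarity_from_distance(dist))

print(choose_projection(14), choose_projection(15))
```

The derived similarity is monotonic in distance, so it preserves the ranking; it is a convenience for reading results, not a cosine similarity.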
Step-by-step checks for the ChromaUI app (browser + optional /docs): TESTING.md. Feature semantics and backend loading: USER-GUIDE.md. To (re)build the index this UI reads, use the vector builder — vector_builder/README.md (ingest.py → chroma_data/).