Add Groq cloud LLM backend for enrichment pipeline (non-private content only).

Privacy enforced via `build_external_prompt()` + `Sanitizer` — PII stripped before any content reaches Groq. Also adds `--recent N` flag for on-demand enrichment of chunks from the last N hours.

- `call_groq()`: OpenAI-compatible API with auth, usage logging
- `--backend groq` CLI flag (enrichment.py + CLI)
- `--recent HOURS` flag for on-demand enrichment after indexing
- `since_hours` param threaded through `VectorStore` → `enrich_batch` → `run_enrichment`
- 17 new tests (14 groq, 3 recent), 520 total passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
> **Caution:** Review failed. The pull request is closed. (Review profile: ASSERTIVE; 5 files selected for processing.)
## Walkthrough

Adds Groq cloud backend support to the enrichment pipeline with new CLI options.
## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User/CLI
    participant Enrich as enrich()
    participant RunEnrich as run_enrichment()
    participant Store as Vector Store
    participant Groq as Groq API
    participant Log as Supabase Log
    User->>Enrich: enrich(..., backend='groq', recent=24)
    Enrich->>Enrich: Set BRAINLAYER_ENRICH_BACKEND='groq'
    Enrich->>RunEnrich: run_enrichment(..., since_hours=24)
    RunEnrich->>RunEnrich: Validate GROQ_API_KEY
    RunEnrich->>Store: get_unenriched_chunks(..., since_hours=24)
    Store-->>RunEnrich: Chunks from last 24 hours
    RunEnrich->>RunEnrich: For each chunk, build_external_prompt (sanitize)
    RunEnrich->>Groq: call_groq(sanitized_prompt, timeout)
    Groq-->>RunEnrich: Enrichment response
    RunEnrich->>Log: Log token usage with model tag
    Log-->>RunEnrich: Logged
    RunEnrich-->>User: Enrichment complete
```

```mermaid
sequenceDiagram
    participant User as User/CLI
    participant Enrich as enrich()
    participant CallLLM as call_llm()
    participant RouteCheck as Backend Check
    participant CallGroq as call_groq()
    participant CallOther as call_ollama/mlx()
    User->>Enrich: enrich(..., backend='groq')
    Enrich->>Enrich: Set BRAINLAYER_ENRICH_BACKEND
    Enrich->>CallLLM: call_llm(prompt, backend='groq')
    CallLLM->>RouteCheck: Check backend == 'groq'
    RouteCheck-->>CallLLM: Yes, route to Groq
    CallLLM->>CallGroq: call_groq(prompt)
    CallGroq-->>CallLLM: Response
    CallLLM-->>Enrich: Enriched result
    alt Alternative Backend
        Enrich->>CallLLM: call_llm(prompt, backend='mlx')
        CallLLM->>RouteCheck: Check backend != 'groq'
        CallLLM->>CallOther: call_ollama/mlx(prompt)
        CallOther-->>CallLLM: Response
        CallLLM-->>Enrich: Enriched result
    end
```
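The routing shown in the second diagram can be sketched in a few lines. This is an illustrative stand-in, not the project's code: `call_groq`/`call_local` are stubs for the real HTTP calls, and the `sanitize` rule is a toy placeholder for the PR's `build_external_prompt()` + `Sanitizer` pipeline.

```python
def sanitize(prompt: str) -> str:
    # toy stand-in for the PR's PII stripping; real code uses a Sanitizer
    return prompt.replace("alice@example.com", "[EMAIL]")

def call_groq(prompt: str) -> str:
    return f"groq:{prompt}"    # would POST to Groq's OpenAI-compatible API

def call_local(prompt: str) -> str:
    return f"local:{prompt}"   # ollama/mlx path; raw prompt never leaves the machine

def call_llm(prompt: str, backend: str) -> str:
    if backend == "groq":
        # cloud path: only sanitized content may be sent out
        return call_groq(sanitize(prompt))
    return call_local(prompt)

result = call_llm("contact alice@example.com", "groq")
```

The key property the diagrams assert is that sanitization always happens before the cloud branch, while local backends receive the prompt untouched.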
**Estimated code review effort:** 🎯 3 (Moderate) | ⏱️ ~20 minutes
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```python
from .sanitize import Sanitizer
# ...
sanitizer = Sanitizer.from_env()
prompt, _sanitize_result = build_external_prompt(chunk, sanitizer, context_chunks)
```
**New Sanitizer and spaCy model loaded per chunk** (High Severity)
Sanitizer.from_env() is called inside _enrich_one() for every single chunk when using the groq backend. Each new Sanitizer instance initializes self._nlp = None, so every chunk triggers a fresh spacy.load("en_core_web_sm") call (~1 second per model load), plus regex recompilation. The existing cloud_backfill.py correctly creates the sanitizer once via _init_sanitizer() and reuses it across all chunks. This pattern makes groq enrichment orders of magnitude slower than necessary — for a 100-chunk batch, this adds ~100 seconds of pure overhead just from spaCy loads.
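The fix the review points at is to construct the sanitizer once and reuse it across chunks. Below is a minimal sketch of that pattern, with a stub class standing in for the real `Sanitizer` (whose `from_env()` triggers the expensive `spacy.load()`); the caching wrapper is illustrative, not the project's actual fix.

```python
import functools

class ExpensiveSanitizer:
    load_count = 0  # counts how many times the "heavy model" was loaded

    def __init__(self):
        # stands in for spacy.load("en_core_web_sm") + regex compilation
        ExpensiveSanitizer.load_count += 1

    @classmethod
    def from_env(cls):
        return cls()

@functools.lru_cache(maxsize=1)
def get_sanitizer():
    # cached: the heavy constructor runs once, no matter how many chunks we process
    return ExpensiveSanitizer.from_env()

def enrich_one(chunk: str) -> str:
    sanitizer = get_sanitizer()  # same instance reused for every chunk
    return f"sanitized:{chunk}"

for chunk in ["a", "b", "c"]:
    enrich_one(chunk)
```

This mirrors the `_init_sanitizer()` approach the review credits `cloud_backfill.py` with: pay the model-load cost once per process instead of once per chunk.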
```python
# Try primary backend
if effective == "mlx":
if effective == "groq":
```
**Fallback-active check can silently bypass groq routing** (Low Severity)
In call_llm, the _fallback_active check at line 535 runs before the groq routing at line 543. If _fallback_active is ever True when effective == "groq", the function falls through to a local backend (mlx) instead of calling call_groq, silently sending sanitized prompts to the wrong backend and likely failing on every chunk. While run_enrichment resets the flag, direct callers of call_llm or future code changes could trigger this path.
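The hazard reduces to branch ordering. This toy sketch (function names echo the PR's `call_llm()`/`call_groq()`, but the bodies are stubs and the flag handling is simplified) shows how checking the fallback flag first starves the groq branch, and how deciding cloud routing first avoids it:

```python
_fallback_active = True  # e.g. left set by an earlier local-backend failure

def call_llm_buggy(prompt: str, effective: str) -> str:
    # fallback checked first: the groq branch below is never reached
    if _fallback_active:
        return f"mlx:{prompt}"
    if effective == "groq":
        return f"groq:{prompt}"
    return f"mlx:{prompt}"

def call_llm_fixed(prompt: str, effective: str) -> str:
    # cloud routing decided before any local-fallback state is consulted
    if effective == "groq":
        return f"groq:{prompt}"
    if _fallback_active:
        return f"mlx:{prompt}"
    return f"mlx:{prompt}"
```

With the buggy ordering, a sanitized prompt intended for Groq is silently handed to a local backend; hoisting the `groq` check above the fallback check keeps the explicit backend selection authoritative.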


## Summary

- Groq cloud backend for enrichment (`--backend groq`)
- `build_external_prompt()` + `Sanitizer` strips PII before any content reaches Groq cloud
- `--recent N` flag for on-demand enrichment of chunks from last N hours

### Changes

- `enrichment.py`: `call_groq()`, `GROQ_*` config, `--backend`/`--recent` CLI args, privacy routing in `_enrich_one()`
- `cli/__init__.py`: `--backend` and `--recent` options for `brainlayer enrich`
- `vector_store.py`: `since_hours` param on `get_unenriched_chunks()`
- `test_groq_backend.py`, `test_recent_enrichment.py`: new test files

### Usage
### Test plan

- `brainlayer enrich --backend groq --max 5` with real API key

🤖 Generated with Claude Code
> **Note: Medium Risk**
>
> Introduces a new cloud LLM backend (`groq`) and routes prompts through sanitization to avoid sending raw PII externally; misconfiguration or sanitization gaps could leak data. Also changes chunk selection SQL for `--recent`, which could affect enrichment coverage/performance.

## Overview
- Adds an optional Groq cloud backend to the enrichment pipeline, including `call_groq()` (OpenAI-compatible) with usage logging and a required `GROQ_API_KEY`, and wires backend selection through `call_llm()` and the CLI via `--backend`.
- Enforces privacy for cloud calls by switching `_enrich_one()` to use `build_external_prompt()` + `Sanitizer` when the effective backend is `groq`, while keeping local backends on the existing `build_prompt()` path.
- Adds on-demand enrichment via `--recent HOURS`, threading `since_hours` through `run_enrichment()`/`enrich_batch()` into `VectorStore.get_unenriched_chunks()` to only fetch chunks created within the specified window. Includes new tests covering Groq routing/privacy and `since_hours` propagation.

Written by Cursor Bugbot for commit e2b488c.
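As a rough illustration of what threading `since_hours` into chunk selection can look like, here is a hypothetical sketch against SQLite. The table and column names (`chunks`, `enriched_at`, `created_at`) are assumptions for illustration only; the PR's actual schema and SQL are not shown in this thread.

```python
import sqlite3

def get_unenriched_chunks(conn, limit=100, since_hours=None):
    # base query: chunks that have never been enriched
    sql = "SELECT id, content FROM chunks WHERE enriched_at IS NULL"
    params = []
    if since_hours is not None:
        # restrict to chunks created within the last N hours
        sql += " AND created_at >= datetime('now', ?)"
        params.append(f"-{since_hours} hours")
    sql += " ORDER BY created_at DESC LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (id INTEGER, content TEXT, enriched_at TEXT, created_at TEXT)"
)
conn.execute(
    "INSERT INTO chunks VALUES (1, 'old', NULL, datetime('now', '-48 hours'))"
)
conn.execute(
    "INSERT INTO chunks VALUES (2, 'new', NULL, datetime('now', '-1 hours'))"
)
recent = get_unenriched_chunks(conn, since_hours=24)  # only the 1-hour-old chunk
```

Making the window filter optional (a `None` default) keeps the existing full-backfill behavior intact, which matches the review's note that the `--recent` change could otherwise affect enrichment coverage.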
## Summary by CodeRabbit

**Release Notes**

- `--backend` flag to explicitly select enrichment backend (ollama, mlx, groq, or auto-detect)
- `--recent` flag to enrich only chunks created within the specified number of hours