
feat: groq backend + on-demand enrichment #43

Merged
EtanHey merged 1 commit into main from feat/groq-backend
Feb 26, 2026

Conversation

@EtanHey (Owner) commented Feb 26, 2026

Summary

  • Add Groq cloud LLM backend for enrichment pipeline (--backend groq)
  • Privacy enforced: build_external_prompt() + Sanitizer strips PII before any content reaches Groq cloud
  • Add --recent N flag for on-demand enrichment of chunks from last N hours
  • GROQ_API_KEY in 1Password (vault: development)

Changes

| File | Change |
|---|---|
| `enrichment.py` | `call_groq()`, `GROQ_*` config, `--backend`/`--recent` CLI args, privacy routing in `_enrich_one()` |
| `cli/__init__.py` | `--backend` and `--recent` options for `brainlayer enrich` |
| `vector_store.py` | `since_hours` param on `get_unenriched_chunks()` |
| `test_groq_backend.py` | 14 tests: `call_groq`, routing, privacy, config |
| `test_recent_enrichment.py` | 3 tests: `since_hours` threading |

Usage

```shell
# Groq for non-private content (youtube transcripts, public docs)
GROQ_API_KEY=gsk_... brainlayer enrich --backend groq --max 100

# On-demand: enrich only recent chunks
brainlayer enrich --recent 2 --max 50
```

Test plan

  • 520 tests pass (17 new), 0 failures
  • Lint clean (ruff)
  • Privacy: groq backend uses build_external_prompt() with Sanitizer
  • Local backends (mlx/ollama) unchanged — use build_prompt() as before
  • Manual: brainlayer enrich --backend groq --max 5 with real API key

🤖 Generated with Claude Code


Note

Medium Risk
Introduces a new cloud LLM backend (groq) and routes prompts through sanitization to avoid sending raw PII externally; misconfiguration or sanitization gaps could leak data. Also changes chunk selection SQL for --recent, which could affect enrichment coverage/performance.

Overview
Adds an optional Groq cloud backend to the enrichment pipeline, including call_groq() (OpenAI-compatible) with usage logging and a required GROQ_API_KEY, and wires backend selection through call_llm() and the CLI via --backend.

Enforces privacy for cloud calls by switching _enrich_one() to use build_external_prompt() + Sanitizer when the effective backend is groq, while keeping local backends on the existing build_prompt() path.
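The split between the sanitized cloud path and the raw local path can be sketched as below. This is an illustration only: `sanitize` here is a stand-in for the PR's `Sanitizer` + `build_external_prompt` pair, and the prompt format is invented.

```python
def sanitize(text: str) -> str:
    """Placeholder PII scrub; the real PR uses Sanitizer + build_external_prompt."""
    return text.replace("alice@example.com", "[EMAIL]")

def select_prompt(chunk: str, backend: str) -> str:
    if backend == "groq":
        # Cloud backend: strip PII before any content leaves the machine.
        return "Enrich:\n" + sanitize(chunk)
    # Local backends (mlx/ollama): raw content never leaves the host.
    return "Enrich:\n" + chunk
```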

Adds on-demand enrichment via --recent HOURS, threading since_hours through run_enrichment()/enrich_batch() into VectorStore.get_unenriched_chunks() to only fetch chunks created within the specified window. Includes new tests covering Groq routing/privacy and since_hours propagation.
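The `since_hours` threading described above boils down to one conditional WHERE clause on the chunk query. A minimal sketch (SQLite-style SQL; the table and column names are assumptions, not the PR's actual schema):

```python
def build_unenriched_query(since_hours=None):
    """Build the chunk-selection query, optionally filtered to a recent window."""
    sql = "SELECT id, content FROM chunks WHERE enriched_at IS NULL"
    params = []
    if since_hours is not None:
        # Only chunks created within the last N hours.
        sql += " AND created_at >= datetime('now', ?)"
        params.append(f"-{since_hours} hours")
    return sql, params
```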

Written by Cursor Bugbot for commit e2b488c.

Summary by CodeRabbit

Release Notes

  • New Features
    • Added support for Groq as a cloud-based LLM backend option for enrichment
    • Added --backend flag to explicitly select enrichment backend (ollama, mlx, groq, or auto-detect)
    • Added --recent flag to enrich only chunks created within the specified number of hours

Add Groq cloud LLM backend for enrichment pipeline (non-private content only).
Privacy enforced via build_external_prompt() + Sanitizer — PII stripped before
any content reaches Groq. Also adds --recent N flag for on-demand enrichment
of chunks from the last N hours.

- call_groq(): OpenAI-compatible API with auth, usage logging
- --backend groq CLI flag (enrichment.py + CLI)
- --recent HOURS flag for on-demand enrichment after indexing
- since_hours param threaded through VectorStore → enrich_batch → run_enrichment
- 17 new tests (14 groq, 3 recent), 520 total passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EtanHey merged commit 58371cb into main on Feb 26, 2026
4 of 6 checks passed

coderabbitai Bot commented Feb 26, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8eaf49e and e2b488c.

📒 Files selected for processing (5)
  • src/brainlayer/cli/__init__.py
  • src/brainlayer/pipeline/enrichment.py
  • src/brainlayer/vector_store.py
  • tests/test_groq_backend.py
  • tests/test_recent_enrichment.py

📝 Walkthrough

Walkthrough

Adds Groq cloud backend support to the enrichment pipeline with new CLI options --backend and --recent. Implements call_groq function, introduces time-based chunk filtering via since_hours, and updates backend routing logic. Includes comprehensive test coverage for Groq integration and recent enrichment workflows.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **CLI Enrich Command**<br>`src/brainlayer/cli/__init__.py` | Added `backend` and `recent` options to the `enrich` command. `backend` allows runtime LLM backend selection (ollama, mlx, groq); `recent` filters enrichment to chunks from the last N hours. Sets an environment variable and overrides the enrichment backend before calling `run_enrichment`. |
| **Enrichment Pipeline Core**<br>`src/brainlayer/pipeline/enrichment.py` | Introduces Groq backend support with the `call_groq` function and `GROQ_API_KEY`, `GROQ_URL`, `GROQ_MODEL` constants. Extends `enrich_batch` and `run_enrichment` with `since_hours` and `backend` parameters. Updates backend routing to call Groq for the external backend, uses `build_external_prompt` with `Sanitizer` for the groq backend, and adds API key validation at runtime. |
| **Vector Store Filtering**<br>`src/brainlayer/vector_store.py` | Added an optional `since_hours` parameter to `get_unenriched_chunks` to filter chunks by recency (`created_at` newer than now minus the given hours). When provided, appends a SQL WHERE condition for time-based filtering. |
| **Groq Backend Tests**<br>`tests/test_groq_backend.py` | Comprehensive test suite validating Groq backend integration: `call_groq` behavior, backend routing via `call_llm`, privacy enforcement through prompt sanitization, API key requirement, Authorization header presence, model configuration, and Supabase logging of token usage. |
| **Recent Enrichment Tests**<br>`tests/test_recent_enrichment.py` | Tests verify `since_hours` propagation through `enrich_batch` to `store.get_unenriched_chunks`, confirm default behavior when `since_hours` is None, and validate that the `run_enrichment` signature includes the new parameter. |
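Since Groq exposes an OpenAI-compatible chat-completions endpoint, `call_groq` amounts to a POST with a bearer token. A hedged sketch using only the standard library (the model name and payload shape are illustrative assumptions, not the PR's actual configuration):

```python
import json
import os
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"  # OpenAI-compatible

def build_groq_request(prompt: str, model: str = "llama-3.1-8b-instant"):
    """Assemble the HTTP request for an OpenAI-compatible chat completion.

    Requires GROQ_API_KEY in the environment, mirroring the PR's
    runtime API-key validation.
    """
    api_key = os.environ.get("GROQ_API_KEY")
    if not api_key:
        raise RuntimeError("GROQ_API_KEY is required for the groq backend")
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        GROQ_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

The real `call_groq` would send this request, parse the JSON response, and log token usage; only the request assembly is shown here.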

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User/CLI
    participant Enrich as enrich()
    participant RunEnrich as run_enrichment()
    participant Store as Vector Store
    participant Groq as Groq API
    participant Log as Supabase Log

    User->>Enrich: enrich(..., backend='groq', recent=24)
    Enrich->>Enrich: Set BRAINLAYER_ENRICH_BACKEND='groq'
    Enrich->>RunEnrich: run_enrichment(..., since_hours=24)
    RunEnrich->>RunEnrich: Validate GROQ_API_KEY
    RunEnrich->>Store: get_unenriched_chunks(..., since_hours=24)
    Store-->>RunEnrich: Chunks from last 24 hours
    RunEnrich->>RunEnrich: For each chunk, build_external_prompt (sanitize)
    RunEnrich->>Groq: call_groq(sanitized_prompt, timeout)
    Groq-->>RunEnrich: Enrichment response
    RunEnrich->>Log: Log token usage with model tag
    Log-->>RunEnrich: Logged
    RunEnrich-->>User: Enrichment complete
```
```mermaid
sequenceDiagram
    participant User as User/CLI
    participant Enrich as enrich()
    participant CallLLM as call_llm()
    participant RouteCheck as Backend Check
    participant CallGroq as call_groq()
    participant CallOther as call_ollama/mlx()

    User->>Enrich: enrich(..., backend='groq')
    Enrich->>Enrich: Set BRAINLAYER_ENRICH_BACKEND
    Enrich->>CallLLM: call_llm(prompt, backend='groq')
    CallLLM->>RouteCheck: Check backend == 'groq'
    RouteCheck-->>CallLLM: Yes, route to Groq
    CallLLM->>CallGroq: call_groq(prompt)
    CallGroq-->>CallLLM: Response
    CallLLM-->>Enrich: Enriched result

    alt Alternative Backend
        Enrich->>CallLLM: call_llm(prompt, backend='mlx')
        CallLLM->>RouteCheck: Check backend != 'groq'
        CallLLM->>CallOther: call_ollama/mlx(prompt)
        CallOther-->>CallLLM: Response
        CallLLM-->>Enrich: Enriched result
    end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • EtanHey/brainlayer#14: Both PRs modify enrichment backend selection and runtime routing in src/brainlayer/pipeline/enrichment.py, adding backend override propagation and CLI options for LLM backend selection.

Poem

🐰✨ The rabbit hops with glee,
Groq clouds now set enrichment free,
Whisper --recent for chunks so fine,
Time-bound treasures in every line!
🌙🚀



@cursor (Bot) left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

```python
from .sanitize import Sanitizer

sanitizer = Sanitizer.from_env()
prompt, _sanitize_result = build_external_prompt(chunk, sanitizer, context_chunks)
```

New Sanitizer and spaCy model loaded per chunk

High Severity

Sanitizer.from_env() is called inside _enrich_one() for every single chunk when using the groq backend. Each new Sanitizer instance initializes self._nlp = None, so every chunk triggers a fresh spacy.load("en_core_web_sm") call (~1 second per model load), plus regex recompilation. The existing cloud_backfill.py correctly creates the sanitizer once via _init_sanitizer() and reuses it across all chunks. This pattern makes groq enrichment orders of magnitude slower than necessary — for a 100-chunk batch, this adds ~100 seconds of pure overhead just from spaCy loads.



```python
# Try primary backend
if effective == "mlx":
if effective == "groq":
```

Fallback-active check can silently bypass groq routing

Low Severity

In call_llm, the _fallback_active check at line 535 runs before the groq routing at line 543. If _fallback_active is ever True when effective == "groq", the function falls through to a local backend (mlx) instead of calling call_groq, silently sending sanitized prompts to the wrong backend and likely failing on every chunk. While run_enrichment resets the flag, direct callers of call_llm or future code changes could trigger this path.
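One way to close this gap is to route on the backend before consulting the fallback flag, so the cloud path can never be silently replaced. A sketch under assumed names (the real `call_llm` internals are not shown in this review):

```python
def route_backend(effective: str, fallback_active: bool) -> str:
    """Pick the backend to call, checking groq before the local fallback flag."""
    if effective == "groq":
        return "groq"    # cloud path is explicit, never silently replaced
    if fallback_active:
        return "ollama"  # local fallback only applies to local backends
    return effective     # "mlx" or "ollama"
```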

