Labels: benchmark (Performance/cost benchmarking) · llm (LLM integration features) · v0.9.0 (LLM-Augmented Intelligence)
Description
Summary
Augment the import pipeline with LLM-powered fact extraction to capture the WHY behind decisions, implicit relationships, and confidence calibration — things the rule-based extractor misses.
Priority: HIGH IMPACT — improves quality of every future import.
Spec
Modified Pipeline: internal/extract/extract.go
Current flow: file → chunk → rule-based extract → facts
New flow: file → chunk → rule-based extract → LLM enrichment (optional) → facts
```go
func EnrichFacts(ctx context.Context, llm llm.Provider, chunk string, ruleFacts []Fact) ([]Fact, error)
```

What the LLM adds:
- Decision reasoning: "Q locked ORB config" → also extracts "because IEX volume filter was the problem"
- Implicit relationships: "SB needs this for Eyes Web" → links SB ↔ Eyes Web ↔ health
- Confidence calibration: "we might try X" (tentative, 0.4) vs "X is locked" (definitive, 0.9)
- Missed facts: Things the rule extractor skipped but are clearly important
LLM Prompt Strategy
- Send: the raw chunk + the rule-extracted facts
- Ask: "What additional facts are missing? What reasoning/relationships did the rules miss?"
- Format: structured JSON output matching existing Fact schema
- Dedup: compare LLM facts against rule facts before inserting (fuzzy match on subject+predicate)
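A minimal sketch of the dedup step, assuming a normalized subject+predicate key stands in for full fuzzy matching (a real implementation might add edit distance or stemming); all names here are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

type Fact struct {
	Subject, Predicate, Object string
}

// dedupKey normalizes subject+predicate: lowercase, trimmed, collapsed
// whitespace. This catches casing and spacing variants only.
func dedupKey(f Fact) string {
	norm := func(s string) string {
		return strings.Join(strings.Fields(strings.ToLower(s)), " ")
	}
	return norm(f.Subject) + "|" + norm(f.Predicate)
}

// dedupe drops LLM facts whose subject+predicate already appears in the
// rule-extracted set (or earlier in the LLM set).
func dedupe(ruleFacts, llmFacts []Fact) []Fact {
	seen := make(map[string]bool, len(ruleFacts))
	for _, f := range ruleFacts {
		seen[dedupKey(f)] = true
	}
	var fresh []Fact
	for _, f := range llmFacts {
		if k := dedupKey(f); !seen[k] {
			fresh = append(fresh, f)
			seen[k] = true
		}
	}
	return fresh
}

func main() {
	rules := []Fact{{"Q", "locked", "ORB config"}}
	llm := []Fact{
		{"q", " Locked ", "ORB config"},          // near-duplicate, dropped
		{"SB", "needs", "Eyes Web health check"}, // new, kept
	}
	fmt.Println(len(dedupe(rules, llm))) // 1
}
```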
CLI Integration
```
cortex import notes.md --extract --enrich                              # rule extract + LLM enrichment
cortex import notes.md --extract --enrich --llm google/gemini-3-flash
cortex import notes.md --extract                                       # unchanged (rule-based only)
```

Sync Integration
```
cortex connect sync --provider file --extract --enrich  # enriched sync
```

Files to Create/Modify
- internal/extract/enrich.go — LLM enrichment logic
- internal/extract/enrich_test.go — tests (mock LLM)
- internal/search/prompts/enrich_facts.txt — prompt template
- internal/extract/extract.go — wire enrichment into pipeline
- cmd/cortex/main.go — add --enrich flag to import command
Benchmark Test Spec
Test Corpus
Use 10 real memory files from the Cortex test fixtures (or anonymized versions):
- 3 daily notes (decisions, conversations, progress)
- 2 MEMORY.md sections (curated facts)
- 2 trading journal entries (technical decisions)
- 1 agent handoff doc
- 1 meeting notes
- 1 config change log
Metrics (per file, per model)
| Metric | Target |
|---|---|
| Latency (per chunk) | <3s |
| Tokens in | <500 |
| Tokens out | <300 |
| Cost per import | <$0.01 |
| New facts found (vs rule-only) | ≥20% more |
| New fact quality (0-5 rubric) | ≥3.0 avg |
| False positive rate | <15% |
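The cost target can be sanity-checked from the token budgets in the table. A sketch with placeholder per-token prices (substitute the benchmarked model's actual rates):

```go
package main

import "fmt"

// Per-million-token prices are assumptions for illustration, not real
// rates for any particular model.
const (
	priceInPerMTok  = 0.10 // USD per 1M input tokens (assumed)
	priceOutPerMTok = 0.40 // USD per 1M output tokens (assumed)
)

// chunkCost returns the USD cost of one enrichment call.
func chunkCost(tokensIn, tokensOut int) float64 {
	return float64(tokensIn)/1e6*priceInPerMTok + float64(tokensOut)/1e6*priceOutPerMTok
}

func main() {
	// At the table's ceilings (500 in, 300 out), even a 20-chunk import
	// stays well under the $0.01 budget at these assumed rates.
	perChunk := chunkCost(500, 300)
	fmt.Printf("per chunk: $%.6f, 20 chunks: $%.4f\n", perChunk, 20*perChunk)
}
```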
Quality Rubric for New Facts
- 5: Critical fact that rule extractor missed entirely
- 4: Useful relationship or reasoning not in rule output
- 3: Valid but somewhat obvious fact
- 2: Marginally useful, borderline noise
- 1: Duplicate of existing rule-extracted fact
- 0: Wrong or hallucinated fact
Benchmark Script
Create scripts/benchmark_enrich.go:
- Runs all 10 files through both models
- Compares LLM-enriched facts vs rule-only facts
- Human rates new facts on quality rubric
- Outputs: new fact count, quality scores, cost, latency per model
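The per-model output could be aggregated along these lines; the `Result` and `Summary` types are illustrative, not the final schema of scripts/benchmark_enrich.go:

```go
package main

import "fmt"

// Result holds one file's numbers for one model.
type Result struct {
	Model       string
	RuleFacts   int   // facts from rule-only extraction
	NewFacts    int   // additional facts from enrichment
	Quality     []int // human rubric scores (0-5), one per new fact
	CostUSD     float64
	LatencySecs float64
}

// Summary aggregates a model's results across the whole corpus.
type Summary struct {
	FactGainPct float64 // new facts as % of rule-only facts (target ≥20)
	AvgQuality  float64 // mean rubric score (target ≥3.0)
	TotalCost   float64
	AvgLatency  float64 // mean per-file latency in seconds
}

func summarize(results []Result) map[string]Summary {
	type agg struct {
		rule, added, qSum, qN, files int
		cost, latency                float64
	}
	byModel := map[string]*agg{}
	for _, r := range results {
		a := byModel[r.Model]
		if a == nil {
			a = &agg{}
			byModel[r.Model] = a
		}
		a.rule += r.RuleFacts
		a.added += r.NewFacts
		for _, q := range r.Quality {
			a.qSum += q
			a.qN++
		}
		a.cost += r.CostUSD
		a.latency += r.LatencySecs
		a.files++
	}
	out := map[string]Summary{}
	for m, a := range byModel {
		out[m] = Summary{
			FactGainPct: float64(a.added) / float64(a.rule) * 100,
			AvgQuality:  float64(a.qSum) / float64(a.qN),
			TotalCost:   a.cost,
			AvgLatency:  a.latency / float64(a.files),
		}
	}
	return out
}

func main() {
	res := []Result{{Model: "gemini-3-flash", RuleFacts: 20, NewFacts: 6,
		Quality: []int{4, 3, 5, 3, 2, 4}, CostUSD: 0.004, LatencySecs: 2.1}}
	fmt.Printf("%+v\n", summarize(res)["gemini-3-flash"])
}
```

Keeping the raw per-fact rubric scores in `Result` (rather than a pre-averaged number) lets the script recompute the false positive rate as the share of scores at 0-1.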
Acceptance Criteria
- --enrich flag works on cortex import
- LLM enrichment is additive (never removes rule-extracted facts)
- Dedup prevents inserting near-duplicate facts
- Confidence calibration adjusts fact confidence scores
- Relationship extraction creates proper subject→predicate→object triples
- Graceful fallback on LLM error (rule-only results still saved)
- Benchmark results documented in PR
- All existing tests pass
Dependencies
- 🧠 Query Expansion (Pre-Search) #216 — uses internal/llm/adapter
Estimated Cost
$0.01-0.05 per import cycle × 8 cycles/day = **$0.08-0.40/day** (≈ $2.40-12/month)