Follow-up to feedback from @m13v on #198. Two concrete changes to the memify / compression pipeline.
## Triple-gate before cosine dedup
Current compression pass dedups purely on cosine similarity over normalised chunks. This merges memories that are semantically close but factually distinct:
- "met with Jay on Monday about budgets"
- "met with Jay on Tuesday about hiring"
Cosine >0.9 on both, but they're different events. Fix:
- Extract a (subject, relation, object) triple from each candidate chunk.
- Bucket candidates by overlapping triples.
- Apply cosine similarity dedup within a bucket only — cosine becomes a secondary filter, not the primary key.
- Across-bucket merges are never allowed regardless of cosine score.
This needs an entity + predicate extractor in the compression pipeline. We already have the machinery in the KG extraction step; the work is wiring it in as a pre-dedup gate instead of a parallel concern.
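A minimal sketch of the gate, assuming the KG step can hand back one triple per chunk. All names here (`dedup_with_triple_gate`, `extract_triple`, `embed`) are illustrative placeholders, not the real pipeline API:

```python
from collections import defaultdict

def cosine(a, b):
    # Plain cosine similarity over two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def dedup_with_triple_gate(chunks, embed, extract_triple, threshold=0.9):
    # 1. Bucket candidates by their (subject, relation, object) triple.
    buckets = defaultdict(list)
    for chunk in chunks:
        buckets[extract_triple(chunk)].append(chunk)
    # 2. Cosine dedup runs within a bucket only, so cosine is a secondary
    #    filter; chunks in different buckets are never merged.
    kept = []
    for _triple, members in buckets.items():
        unique = []
        for chunk in members:
            if all(cosine(embed(chunk), embed(u)) < threshold for u in unique):
                unique.append(chunk)
        kept.extend(unique)
    return kept
```

With this shape, the Monday/Tuesday pair lands in different buckets (different objects), so a cosine score above 0.9 can no longer merge them.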
## Claim-level memify granularity
Current pipeline over-chunks: one paragraph can produce 4+ overlapping facts that all carry identical metadata. The storage waste is manageable, but retrieval quality suffers because redundant atomic facts dilute the ranking.
Move to one-assertable-claim-per-node, and let the graph edges carry composition. Example:
- ❌ Today: 4 nodes for "Jay works at JAN LABS", "JAN LABS is a company", "Jay is employed", "Jay's employer is JAN LABS"
- ✅ Target: 1 node `{subject: Jay, predicate: works_at, object: JAN LABS}`; composition lives on the edges
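A rough sketch of the collapse, assuming predicate normalisation comes from the KG extraction step. The `ClaimNode` shape and the `CANONICAL_PREDICATE` table are hypothetical stand-ins for whatever that step actually emits:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClaimNode:
    subject: str
    predicate: str
    object: str

# Assumed normalisation table; the real mapping would come from the
# KG extraction step, not a hard-coded dict.
CANONICAL_PREDICATE = {
    "employed_by": "works_at",
    "employer_is": "works_at",
}

def memify_claims(extracted_triples):
    # One assertable claim per node: surface variants like "Jay works at
    # JAN LABS" and "Jay's employer is JAN LABS" normalise to one key.
    nodes = {}
    for subj, pred, obj in extracted_triples:
        key = (subj.lower(), CANONICAL_PREDICATE.get(pred, pred), obj.lower())
        nodes.setdefault(key, ClaimNode(*key))
    return list(nodes.values())
```

Retrieval then ranks one node per claim instead of four near-duplicates with identical metadata.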
## Acceptance