Single tag taxonomy across memory rows #388

## Background

Two prior discussions explain the memory system architecture and the mempalace-inspired tagging idea: #300 (semantic memory rationale) and #301 (mempalace-inspired additions, including topic pre-classification as a transparency feature).

## Context

The memory system currently has two tag-related concepts in flight:

1. **LLM-generated topical tags on episodes.** Stage-2 episode generation (#385, PR #387) attaches tags to each `source: episode` row at write time. Tags are stored on the row but not yet used for filtering at retrieval and not displayed in `/memory`.

2. **Topic pre-classification (#309).** A planned keyword-based tagger, originally intended to pre-filter facts before vector search and to provide a fixed-vocabulary browse axis in `/memory`. It was deferred after the memory evaluation work in #362 showed that retrieval is robust without pre-filtering, which disproved the original retrieval-quality premise for #309.

If both ship and both surface to end users, `/memory` ends up with two parallel tag taxonomies: one free-form LLM-generated set on episodes, one fixed-keyword set on facts. End users see two unrelated tag namespaces over the same memory store.

## Decision

One user-visible tag taxonomy across all memory rows. The LLM-generated tags from stage-2 episode generation become the single tag system. The keyword-based topic pre-classifier in #309 is retired (closed without implementation; reasoning preserved for future revisit).

## Concrete deltas

1. **Extend the stage-2 tag generator to fact extraction.** Tags currently attach only to `source: episode` rows at the moment they are created. Extend the same tag-generation step (or an equivalent prompt) to attach tags to `source: extracted` rows when facts are written. Implementation can reuse the existing tag prompt with content-appropriate framing.

2. **Close #309.** Add a closing comment explaining why it is being retired (the memory evaluation result in #362 falsified the retrieval-quality justification; the parallel-taxonomy concern is resolved by this issue) and link back to this issue.

3. **`/memory` UI v1: tags are not a primary browse axis.** Tags display as a per-row decoration (for example, "tagged: alpha, beta, gamma") and inform retrieval scoring. Users browse `/memory` by recency, source, or search; there is no "show me all memories tagged X" filter axis in v1. This sidesteps the unbounded-vocabulary concern (free-form LLM tags) for the v1 surface.
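Deltas 1 and 3 can be sketched together as a single write path that tags both row sources, plus a display helper that only decorates. This is a rough illustration, not the existing implementation: `MemoryRow`, `FRAMING`, `write_row`, `render_tag_decoration`, and the `generate_tags` callable (standing in for the stage-2 LLM call) are all hypothetical names, and the real row schema and prompt are not shown in this issue.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class MemoryRow:
    """Hypothetical row shape; the real schema may differ."""
    source: str  # "episode" or "extracted"
    content: str
    tags: list[str] = field(default_factory=list)


# Hypothetical content-appropriate framing, one entry per row source.
FRAMING = {
    "episode": "Tag this conversation episode by topic.",
    "extracted": "Tag this extracted fact by topic.",
}

# prompt -> tags; a stand-in for the LLM-backed tag generator.
TagGenerator = Callable[[str], list[str]]


def write_row(source: str, content: str, generate_tags: TagGenerator) -> MemoryRow:
    """Build a row and attach tags at write time, for either source."""
    prompt = f"{FRAMING[source]}\n\n{content}"
    return MemoryRow(source=source, content=content, tags=generate_tags(prompt))


def render_tag_decoration(row: MemoryRow) -> str:
    """Per-row decoration only; there is deliberately no filter-by-tag helper."""
    return f"tagged: {', '.join(row.tags)}" if row.tags else ""


if __name__ == "__main__":
    # A stub tagger so the sketch runs without an LLM.
    stub = lambda prompt: ["alpha", "beta", "gamma"]
    fact = write_row("extracted", "User prefers dark mode.", stub)
    print(render_tag_decoration(fact))
```

Routing both sources through one function keeps the taxonomy single by construction: a fact row cannot pick up tags from a different generator than an episode row does.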

## Open question (deferred to implementation)

Vocabulary control for free-form LLM tags. Two candidate approaches:

- **(a) Soft-vocabulary prompt plus periodic dedup.** The tag-generation prompt includes a list of preferred tags ("prefer these N tags when applicable; only invent new ones when nothing fits"), seeded from existing usage. A periodic job merges near-duplicates (case differences, plural/singular, close semantic neighbors). Keeps the tag space tractable if tags are ever promoted to a browse axis.

- **(b) Accept unbounded vocabulary.** Tags remain free-form indefinitely. Acceptable while tags are decoration-only, but blocks promoting tags to a browse axis later without a separate cleanup pass.

(a) is preferred if and when tags become a browse axis. (b) is fine for the v1 decoration-only role and requires no extra work. The choice can wait until promoting tags to a filter axis is on the table.
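If option (a) is ever picked up, the two pieces could look roughly like the sketch below: a dedup pass that merges case and plural/singular variants, and a prompt fragment seeded from the most-used tags. Everything here is an assumption for illustration. `normalize_tag`, `build_merge_map`, and `preferred_tags_prompt` are hypothetical names, the plural rule is a crude trailing-"s" heuristic, and merging close semantic neighbors (which would likely need embeddings or an LLM pass) is not attempted.

```python
from collections import Counter


def normalize_tag(tag: str) -> str:
    """Collapse case and a naive trailing-'s' plural (crude on purpose)."""
    t = tag.strip().lower()
    if len(t) > 3 and t.endswith("s") and not t.endswith("ss"):
        t = t[:-1]
    return t


def build_merge_map(tag_counts: Counter) -> dict[str, str]:
    """Map each observed tag to the most-used spelling in its normalized group."""
    canonical: dict[str, str] = {}
    # Visit most-used tags first so the dominant spelling becomes canonical;
    # ties break alphabetically to keep the result deterministic.
    for tag, _ in sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        canonical.setdefault(normalize_tag(tag), tag)
    return {tag: canonical[normalize_tag(tag)] for tag in tag_counts}


def preferred_tags_prompt(tag_counts: Counter, n: int) -> str:
    """Build the soft-vocabulary prompt fragment from the top-n merged tags."""
    merge = build_merge_map(tag_counts)
    merged: Counter = Counter()
    for tag, count in tag_counts.items():
        merged[merge[tag]] += count
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    top = [tag for tag, _ in ranked[:n]]
    return (
        f"Prefer these {len(top)} tags when applicable; "
        "only invent new ones when nothing fits: " + ", ".join(top)
    )
```

The merge map is computed from usage counts rather than a hand-written list, which matches the "seeded from existing usage" framing; a periodic job would apply the map to stored rows and regenerate the prompt fragment.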

## Why

The user-visible surface needs a single tag taxonomy. Two parallel systems (LLM tags plus keyword categories) mean two unrelated tag namespaces over the same memory store, which is confusing for end users and dilutes the trust/transparency value of `/memory`. Collapsing to one system before `/memory` exposes tags is cheaper than discovering the conflict at rollout time and walking it back.

The LLM tagger is the survivor for two reasons. First, it is already running in production for episodes, so extending it to facts is one prompt change rather than building a parallel system. Second, the original retrieval-quality justification for the keyword classifier in #309 was disproven by the memory evaluation work in #362, leaving #309 with only a "browse axis" purpose, which the v1 `/memory` UI explicitly does not need.

## Acceptance

- [ ] The stage-2 tag generator (or equivalent) is invoked on fact-extraction writes, producing `tags: [...]` on `source: extracted` rows.
- [ ] Tag display appears in the `/memory` per-row detail view. No filter-by-tag axis.
- [ ] #309 is closed with a link back to this issue and a short closing rationale.
- [ ] No regression in the existing episode tag pipeline.

## Out of scope

- Vocabulary control mechanism (deferred until tags become a browse axis; see Open question).
- Filter-by-tag UI in `/memory` (deferred; revisit only if the tag space stays well-shaped enough to make a closed list worthwhile).
- Tag-based retrieval-boost tuning beyond what is already in place.

## Related

- Epic: #306 (semantic memory system).
- Stage-2 episode tagger: #385, PR #387.
- Memory evaluation that falsified the #309 retrieval premise: #362.
- To be closed by this issue: #309.

