seed_project: delete_memories_by_tag('seeded') is global, ignores domain scope #16

@PSGSupport

Summary

mcp_server/handlers/seed_project.py:180 purges memories globally by tag, regardless of the domain argument:

purged = 0
if not dry_run:
    purged = _get_store().delete_memories_by_tag("seeded")

This means every non-dry-run call to seed_project wipes all previously seeded memory rows across every domain, so only the rows from the most recent seed survive.

Reproducer

Sequentially seed two repos with distinct domains:

seed_project(directory="/path/to/repoA", domain="repo-a")
# stored 4, purged_stale 0

seed_project(directory="/path/to/repoB", domain="repo-b")
# stored 7, purged_stale 4   <-- wiped repoA's rows, despite different domain

memory_stats()
# only repoB's 7 rows + any pre-existing non-"seeded" memories

Expected: purged_stale 0 on the second call, since repo-b has no prior seeded memories.
Actual: every prior seeded-tagged memory is deleted regardless of domain.
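
For anyone who wants to see the behavior without a running Cortex instance, here is a minimal in-memory sketch of the same sequence. FakeStore, Memory, and this simplified seed_project are hypothetical stand-ins written for this issue, not the real classes; only the delete_memories_by_tag name and the "seeded" tag come from the handler.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str
    domain: str
    tags: set[str] = field(default_factory=set)


class FakeStore:
    """Hypothetical in-memory stand-in for the real store."""

    def __init__(self) -> None:
        self.rows: list[Memory] = []

    def add(self, content: str, domain: str, tags: set[str]) -> None:
        self.rows.append(Memory(content, domain, set(tags)))

    def delete_memories_by_tag(self, tag: str) -> int:
        # Mirrors the reported behavior: no domain filter at all.
        before = len(self.rows)
        self.rows = [m for m in self.rows if tag not in m.tags]
        return before - len(self.rows)


def seed_project(store: FakeStore, domain: str, n_rows: int) -> dict:
    """Simplified seed: purge globally by tag, then store n_rows new rows."""
    purged = store.delete_memories_by_tag("seeded")
    for i in range(n_rows):
        store.add(f"{domain}-summary-{i}", domain, {"seeded"})
    return {"stored": n_rows, "purged_stale": purged}


store = FakeStore()
store.add("hand-written note", "repo-a", {"manual"})  # non-seeded row
first = seed_project(store, "repo-a", 4)
second = seed_project(store, "repo-b", 7)

# Domains that still hold seeded rows after both calls.
remaining_domains = sorted({m.domain for m in store.rows if "seeded" in m.tags})
print(first, second, remaining_domains)
```

The second call reports the first repo's rows as purged, and only repo-b remains among the seeded domains, which matches the counts in the reproducer above.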

Impact

  • Bootstrapping multiple repos in one session (the natural workflow on a multi-project dev machine) leaves only the last repo's structural-summary / configs / docs / entry-points in memory.
  • Knowledge-graph entities and relationships accumulate fine across calls (different table, different lifecycle), so cross-repo recall via the KG still works. The loss is the per-repo discovery memory rows.
  • Workaround: make the final seed_project call the one for the repo whose rows you most want to keep.

Suggested fix

Scope the purge by domain when one is provided:

purged = 0
if not dry_run:
    store = _get_store()
    if domain:
        # delete_memories_by_tag_and_domain is the proposed new store method
        purged = store.delete_memories_by_tag_and_domain("seeded", domain)
    else:
        purged = store.delete_memories_by_tag("seeded")

Or keep the global purge but filter by directory (since each seed has a directory field on the stored memory).
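
To make the domain-scoped option concrete, here is a rough sketch of what the store-side delete could look like. It runs against an in-memory SQLite table so it is self-contained; the memories table, its columns (one tag per row, for simplicity), and the optional domain parameter are assumptions for illustration and will not match the real PostgreSQL schema.

```python
import sqlite3
from typing import Optional


def delete_memories_by_tag(conn: sqlite3.Connection, tag: str,
                           domain: Optional[str] = None) -> int:
    """Delete rows carrying `tag`; restrict to `domain` when one is given."""
    if domain is None:
        cur = conn.execute("DELETE FROM memories WHERE tag = ?", (tag,))
    else:
        cur = conn.execute(
            "DELETE FROM memories WHERE tag = ? AND domain = ?", (tag, domain)
        )
    conn.commit()
    return cur.rowcount


# Hypothetical schema, only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories ("
    "id INTEGER PRIMARY KEY, domain TEXT, tag TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO memories (domain, tag, content) VALUES (?, ?, ?)",
    [("repo-a", "seeded", f"a-{i}") for i in range(4)]
    + [("repo-b", "seeded", f"b-{i}") for i in range(7)]
    + [("repo-a", "manual", "hand-written note")],
)

# Re-seeding repo-b should only clear repo-b's seeded rows.
purged = delete_memories_by_tag(conn, "seeded", domain="repo-b")
left_in_a = conn.execute(
    "SELECT COUNT(*) FROM memories WHERE domain = 'repo-a' AND tag = 'seeded'"
).fetchone()[0]
print(purged, left_in_a)
```

Folding the domain into an optional argument keeps existing callers working, while seed_project can pass its domain through; a separate delete_memories_by_tag_and_domain method as in the snippet above would work equally well.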

Environment

  • Cortex 3.14.5, Docker install
  • PostgreSQL 15 + pgvector
  • Discovered while bootstrapping 8 sibling repos with seed_project per repo

Notes from same session (separate, lower priority)

While verifying the above, also surfaced:

  1. remember, recall, and get_telemetry fail with "structured_content must be a dict or None. Got str: ..." from the FastMCP layer even though the underlying operation succeeds. Likely a missing or incorrect output_schema declaration on those handlers.
  2. query_methodology(cwd="C:/Users/...") does not slug-normalize Windows-style paths to the existing domain; passing project= works.
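
For item 2, a possible shape for the normalization, purely as a sketch: I have not checked how Cortex actually derives domain slugs, so the slugify_path helper and the lowercase-hyphenated scheme below are assumptions, and the example paths are made up.

```python
import re


def slugify_path(path: str) -> str:
    """Turn a POSIX- or Windows-style path into a lowercase hyphenated slug."""
    # Collapse every run of non-alphanumerics (":", "\\", "/", spaces) into "-",
    # so both separator styles and the drive-letter colon end up the same.
    return re.sub(r"[^a-z0-9]+", "-", path.lower()).strip("-")


# Hypothetical paths; forward and backward slashes should map to one slug.
forward = slugify_path("D:/work/repoA")
backward = slugify_path("D:\\work\\repoA")
print(forward, backward)
```

If the existing domains were created from a different slug scheme, the same idea applies: run the cwd value through whatever function produced the stored slug before looking it up.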

Happy to file these as separate issues if useful.
