v3.1.0

@github-actions github-actions released this 27 May 17:05
· 185 commits to main since this release

3.1.0 - 2026-05-27

The loop story: a sibling tier for journal-shaped run-state, plus
five surface improvements that close adjacent audit gaps. Tool count
goes from 18 to 22; on-disk format unchanged (SCHEMA_VERSION stays at
1, additive-only).

Added — episode tier (sibling to memory)

  • episode_write(body, takeaway?, scopes?). Journal-shaped writes
    for run-state and iteration takeaways. The durability gate
    (TRANSIENT_PHRASE_MARKERS) that rejects state-shaped content on
    memory_write doesn't apply here — episodes are the alternative
    home transient content used to lack. Storage at
    <root>/episodes/<session_id>/<ulid>.md, 30-day TTL, pruned on
    each write. Invisible to memory_search / memory_health /
    memory_list.
  • episode_handoff(prior_session_id?, max_episodes?). Read the
    most-recent N takeaways from a prior session in this worktree.
    Auto-resolves the session via the event-log boundary when
    prior_session_id is omitted. Designed as the first MCP call at
    /loop iteration entry.
  • episode_search(scopes?, parent_session_id?, since?, max_results?).
    Cross-session lookup. Not ranked — episodes are chronological,
    filtered by scope intersection / session id / ISO timestamp.
  • episode_promote(episode_id, scopes, category?, ..., use_body?).
    Distill a takeaway into a durable memory via the standard
    memory_write path (full durability gate fires). Deletes the
    source episode on commit; leaves it intact on any non-committed
    status so the caller can adjust and retry.
  • bettermemory episodes list | prune CLI. Mirrors
    bettermemory tombstones in shape — offline inspection and a
    manual TTL-based cleanup pass.
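
The episode_promote lifecycle above (delete the source only when the durable write commits) can be sketched roughly as follows. This is a toy model, not bettermemory's internals: `promote`, the in-memory `store` dict, the `toy_gate` callable, and the literal status strings are all hypothetical stand-ins.

```python
from typing import Callable, Dict


def promote(
    episodes: Dict[str, str],
    episode_id: str,
    write_memory: Callable[[str], str],
) -> str:
    """Promote one episode takeaway; drop the source only if the write commits.

    `episodes` maps episode id -> takeaway text (an in-memory stand-in for
    the on-disk store). `write_memory` stands in for the durable write path
    and returns a status string; the literal "committed" is an assumption.
    """
    takeaway = episodes[episode_id]
    status = write_memory(takeaway)
    if status == "committed":
        # The source is removed only after the durable write succeeded.
        del episodes[episode_id]
    # Any other status leaves the episode in place so the caller can retry.
    return status


def toy_gate(text: str) -> str:
    """Hypothetical durability gate: reject text that looks like run-state."""
    return "rejected" if "currently" in text.lower() else "committed"


store = {
    "ep1": "Currently on step 3 of the migration",
    "ep2": "Use fsync_dir after every rename",
}
first = promote(store, "ep1", toy_gate)   # gate rejects, episode is kept
second = promote(store, "ep2", toy_gate)  # gate accepts, episode is removed
```

Keeping the delete strictly after a committed status means a rejected promotion never loses the takeaway, which is what lets the caller adjust and retry.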

Added — memory_search proactive surface

  • memory_search(since_prior_session=True) filter. Restricts
    candidates to memories updated at or after the prior-session
    boundary (find_prior_session_boundary against the recorder's
    session_id). Loop entry can now ask "what's changed in this
    worktree since last time I was here?" without scanning. Empty
    return when no prior session exists — caller distinguishes
    "nothing new" from "no baseline" via
    curation_pending_new_since_last_session.
  • depends_on_resolved on hits. When a hit's memory carries
    MemoryLink(type="depends_on", ...) links, the targets'
    summaries (and link notes) are inlined automatically. Bounded:
    3 per hit, 10 total. Closes the "graph in the schema, retrieval
    ignores it" gap that's been open since 2.x.
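
To make the "3 per hit, 10 total" bound concrete, here is a minimal sketch of a two-level cap. The function name, the dict-based inputs, and the choice to spend the global budget in hit order are assumptions for illustration; only the two cap values come from this release note.

```python
from typing import Dict, List

PER_HIT_CAP = 3   # from the release note: at most 3 targets per hit
TOTAL_CAP = 10    # and at most 10 across the whole result set


def resolve_depends_on(
    hits: List[str],
    links: Dict[str, List[str]],
    summaries: Dict[str, str],
) -> Dict[str, List[str]]:
    """Return hit id -> inlined target summaries, honouring both caps."""
    resolved: Dict[str, List[str]] = {}
    budget = TOTAL_CAP
    for hit_id in hits:
        picked: List[str] = []
        for target_id in links.get(hit_id, []):
            if len(picked) >= PER_HIT_CAP or budget <= 0:
                break
            if target_id in summaries:  # skip dangling links
                picked.append(summaries[target_id])
                budget -= 1
        resolved[hit_id] = picked
    return resolved


# Five hits with four depends_on targets each (all hypothetical ids).
hits = [f"m{i}" for i in range(5)]
links = {h: [f"{h}-t{j}" for j in range(4)] for h in hits}
summaries = {t: f"summary of {t}" for ts in links.values() for t in ts}
out = resolve_depends_on(hits, links, summaries)
counts = [len(out[h]) for h in hits]
```

With these inputs the first three hits each get three summaries, the fourth gets the last remaining slot, and the fifth gets none.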

Added — proactive curation surface

  • HealthReport.recommendations. Distills the bucket rollups
    (dead_weight, contradicted, endorsement_debt, drifted, rare_scopes)
    into actionable one-line suggestions with
    {kind, summary, action, count, memory_ids} shape. Closed enum
    RECOMMENDATION_KINDS so a consumer can switch over them.
    Size-driven kinds fire at 3+; per-row kinds at 1+.
  • Inline curation_hint on memory_write responses. One-shot per
    session: when dead + drifted + endorsement_debt >= threshold
    (configurable, default 5), the first successful write inlines a
    one-line nudge. New curation_hint_threshold and
    curation_hint_enabled config knobs; 0 / false disables.
  • endorsement_debt_ratio_threshold config knob. Default 0.0
    preserves the existing strict "zero explicit applies" rule. Setting
    a value > 0 also flags memories whose explicit/total-applied ratio
    falls below the threshold, which catches the "1 explicit
    endorsement out of 50 auto" case the binary check misses.
  • recently_removed_in_worktree on memory_scope_overview. Count
    of tombstones removed in the last 7 days, filtered by
    origin.worktree_root under auto_scope=True. Hint when the model
    is about to re-cover ground it already explicitly trimmed.
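
The endorsement_debt_ratio_threshold behaviour can be illustrated with a small predicate. This is a sketch under stated assumptions: the function name is hypothetical, "total applied" is taken to mean explicit plus auto applies, and a memory with zero explicit applies is assumed to always count as debt. The real rule may differ in edge cases such as memories that were never applied at all.

```python
def has_endorsement_debt(
    explicit: int,
    auto: int,
    ratio_threshold: float = 0.0,
) -> bool:
    """Flag a memory whose explicit endorsements are absent or too sparse."""
    if explicit == 0:
        # Strict rule (assumed): zero explicit applies is always debt.
        return True
    if ratio_threshold <= 0.0:
        # Default 0.0 keeps only the binary check above.
        return False
    total = explicit + auto  # assumption: total applied = explicit + auto
    return explicit / total < ratio_threshold


strict = has_endorsement_debt(explicit=1, auto=50)                       # binary check only
ratio = has_endorsement_debt(explicit=1, auto=50, ratio_threshold=0.1)  # 1/51 is below 0.1
```

Under the default threshold the "1 explicit out of 50 auto" memory passes; with a threshold of 0.1 it is flagged.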

Fixed

The loops-phase-1 surface ran two audit drains and a post-merge polish
pass before this release; the fixes below catch the airtight-blocking
items the audit cycle surfaced. Each line names the user-visible win;
the commit hash carries the implementation.

Concurrency / multi-MCP correctness:

  • Empty-dir prune branch now serialised against concurrent
    episode_write under the per-session flock, with a recheck after
    acquisition so a sibling worktree can't lose its just-written
    episode to a stale prune decision [cef3e23].
  • _delete_source_episode (called from episode_promote) holds the
    per-session flock for the unlink + empty-dir rmdir window, so a
    concurrent episode_write to the same session can't observe a
    half-deleted directory tree [5910a39].
  • prune_old_sessions past-cutoff branch acquires the same
    per-session flock and re-checks the prune predicate after lock
    acquisition, closing a TOCTOU window where a concurrent
    episode_write could land an episode mid-prune [a4565b8].
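
The three fixes above share one pattern: take the per-session lock, then re-check the predicate before mutating. A minimal POSIX-only sketch of that pattern follows. The function name and the lock-file location (a sibling `<session>.lock` file) are assumptions, not bettermemory's actual layout; `fcntl.flock` is the standard-library advisory lock.

```python
import fcntl
import tempfile
from pathlib import Path


def prune_session_if_empty(session_dir: Path) -> bool:
    """Remove an empty session directory under an exclusive advisory lock.

    The lock file lives beside the directory (an assumed layout) so that
    removing the directory never removes the lock out from under a peer.
    """
    lock_path = session_dir.parent / (session_dir.name + ".lock")
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until peers release
        try:
            # Re-check *after* acquiring the lock: a writer may have landed
            # a file between the caller's decision and this point.
            if not session_dir.is_dir() or any(session_dir.iterdir()):
                return False
            session_dir.rmdir()
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


root = Path(tempfile.mkdtemp())
empty = root / "session-a"
empty.mkdir()
busy = root / "session-b"
busy.mkdir()
(busy / "01.md").write_text("takeaway")

pruned_empty = prune_session_if_empty(empty)  # directory is removed
pruned_busy = prune_session_if_empty(busy)    # recheck sees a file, keeps it
```

The lock only helps if writers take the same lock before creating files in the session directory; the recheck is what closes the window between deciding to prune and actually pruning.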

Durability (POSIX fsync discipline):

  • Episode._write_path now uses fsync_file + fsync_dir on the
    atomic rename, matching the memory writer's crash-durability
    guarantees [7017b2c].
  • First write of a fresh event log calls fsync_dir on the parent so
    an OS-level crash between create-and-write doesn't leave the dirent
    in flight [0ea5094].
  • Episode prune (rmdir + rmtree), the first write to a brand-new
    session_dir, and _delete_source_episode all now fsync_dir the
    parent after dirent-mutating operations so the directory state
    survives a crash on the same footing as the file contents
    [36fc35f].
  • semantic.flush_persistent_cache chmods the temp file before
    atomic-rename so a process crash between rename and chmod can't
    leave the cache world-readable [d77217b].
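
For readers unfamiliar with the fsync discipline these items refer to, the following is a generic POSIX sketch of a crash-durable atomic write: fsync the file, set permissions before the rename, then fsync the parent directory. The helper bodies are written from scratch for illustration and are not the project's own `fsync_file` / `fsync_dir` code; `durable_write` and the 0o600 mode are assumed names and values.

```python
import os
import tempfile
from pathlib import Path


def fsync_dir(directory: Path) -> None:
    """Flush a directory's entries to stable storage (POSIX only)."""
    fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_write(path: Path, data: bytes, mode: int = 0o600) -> None:
    """Write `data` to `path` so a crash leaves the old or the new file."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fchmod(tmp.fileno(), mode)  # permissions set before rename
            os.fsync(tmp.fileno())         # file contents reach disk
        os.replace(tmp_name, path)         # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)            # do not leave temp files behind
        raise
    fsync_dir(path.parent)                 # the rename itself reaches disk


target = Path(tempfile.mkdtemp()) / "episode.md"
durable_write(target, b"takeaway\n")
```

Setting the mode on the temp file means there is no moment at which the final path exists with looser permissions, which is the point of the chmod-before-rename fix.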

Scoping / privacy:

  • episode_search and episode_handoff now honor disabled_scopes,
    so the same session-local opt-out users already trust on the memory
    side applies to the episode tier [b982ad0].
  • episode_handoff filters prior_session_id candidates by the
    caller's worktree before adoption, so an episode-bearing prior
    session from a sibling worktree isn't accidentally adopted
    [2988fff].
  • episode_handoff applies the same worktree filter to the
    zero-episode candidate-adoption path, closing the gap where a
    prior session with no episodes could still be adopted across
    worktrees [1a77999].
  • memory_search re-applies the active scope filter to the FTS
    prefilter result before depends_on auto-pull, so a graph edge
    can't drag in a target from a disabled scope [bf92912].
  • depends_on auto-pull targeted-load applies scope and origin
    filters to targets fetched outside the FTS prefilter set, so the
    graph-edge expansion respects the same isolation as direct hits
    [00ac037].

Size caps / data integrity:

  • episode_write enforces max_content_bytes on the body and
    returns a structured rejection, so an oversized journal entry
    can't silently truncate at the storage layer [a60bce2].
  • episode_write enforces max_takeaway_bytes on the takeaway
    field separately from the body so a giant takeaway can't silently
    drop on commit [4d36967].
  • Episode.scopes and Memory.scopes both cap at 64 entries on
    load, preventing pathological scope lists from blowing up FTS
    prefilter cost or scope-overview pagination [e928b33].
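
A byte-based cap with a structured rejection might look like the sketch below. The knob names max_content_bytes and max_takeaway_bytes come from this release note; the function name, the UTF-8 measurement, and the shape of the returned dict are assumptions made for illustration.

```python
from typing import Optional


def check_episode_size(
    body: str,
    takeaway: Optional[str],
    max_content_bytes: int,
    max_takeaway_bytes: int,
) -> dict:
    """Reject oversized fields explicitly instead of truncating them."""
    body_size = len(body.encode("utf-8"))  # bytes, not characters
    if body_size > max_content_bytes:
        return {"status": "rejected", "field": "body",
                "size": body_size, "limit": max_content_bytes}
    if takeaway is not None:
        takeaway_size = len(takeaway.encode("utf-8"))
        if takeaway_size > max_takeaway_bytes:
            return {"status": "rejected", "field": "takeaway",
                    "size": takeaway_size, "limit": max_takeaway_bytes}
    return {"status": "ok"}


# Three "é" characters are six bytes in UTF-8, so a 5-byte cap rejects them.
too_big_body = check_episode_size("ééé", None, 5, 5)
too_big_takeaway = check_episode_size("abc", "x" * 10, 5, 5)
accepted = check_episode_size("abc", "ok", 5, 5)
```

Measuring encoded bytes rather than characters matters for non-ASCII text, where a character count would understate what reaches the storage layer.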

Search correctness:

  • memory_search(since_prior_session=True) bypasses the FTS
    prefilter so candidates that genuinely matter after the prior
    session boundary aren't dropped by a pre-boundary token-frequency
    cutoff [3bd27dc].
  • since_prior_session boundary is now strict-after — the
    prior-session boundary memory itself is excluded from results, so
    the count aligns with curation_pending_new_since_last_session's
    delta semantics [ffad750].
  • endorsement_debt_ratio_threshold config knob now threads through
    every callsite (compute_health, curation_counts, the CLI), so
    setting it once actually changes the rollups everywhere they're
    surfaced [3db9cfc].
  • episode_search(max_results=N) returns the most-recent N
    episodes instead of the oldest N, matching the loop-iteration
    intent where recent run-state is the relevant slice [3d77bac].
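
Two of the fixes above are easy to get wrong by one comparison or one sort direction: the strict-after boundary and "most-recent N". The toy function below combines them purely for brevity (in the release they apply to memory_search and episode_search respectively); its name and the tuple-based input are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def most_recent_since(
    items: List[Tuple[str, datetime]],
    boundary: datetime,
    max_results: int,
) -> List[str]:
    """Return up to `max_results` ids strictly newer than `boundary`, newest first."""
    newer = [(item_id, ts) for item_id, ts in items if ts > boundary]  # strict: > not >=
    newer.sort(key=lambda pair: pair[1], reverse=True)                 # newest first
    return [item_id for item_id, _ in newer[:max_results]]


def at_hour(hour: int) -> datetime:
    return datetime(2026, 5, 27, hour, tzinfo=timezone.utc)


items = [("a", at_hour(1)), ("b", at_hour(2)), ("c", at_hour(3)), ("d", at_hour(4))]
top_two = most_recent_since(items, boundary=at_hour(2), max_results=2)
top_one = most_recent_since(items, boundary=at_hour(2), max_results=1)
```

The item stamped exactly at the boundary ("b") is excluded, and truncation keeps the newest entries rather than the oldest.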

Tests pinning previously-implicit invariants:

  • recently_removed_in_worktree worktree filter pinned by an
    explicit test so a future refactor of memory_scope_overview
    can't quietly drop the per-worktree slicing [0c131b9].
  • max_total cross-hit cap for depends_on auto-pull pinned by a
    test so a graph-heavy memory can't blow the global budget even
    when each hit stays under its per-hit cap [00ac037].

Documentation accuracy (model-facing):

  • Public API docs sync for the episode tier and the curation
    surface so a consumer reading docs/api.md gets shape-accurate
    return values for every tool in the 22-tool surface [1b41b51].
  • Handler DESC strings synced across memory_search,
    memory_health, memory_scope_overview, and memory_write so
    the FastMCP-published descriptions match the implementation's
    field enumeration [053ab9d].
  • docs/api.md sweep: episode_search shape, memory_show full
    field enumeration, memory_health timestamp surface, and
    episode_handoff filter semantics all corrected so the page
    ships as the canonical reference [8c072f9].
  • DESC drift cleanup: memory_audit_turn predicate language,
    since_prior_session wording, and the four episode-tier DESCs
    all reworded for accuracy against the implementation [a2076d8].

Internal

  • New Episode Pydantic model + EpisodeStore (lazy directory
    creation, atomic frontmatter writes, traversal-safe session_id
    validation, TTL-based prune that exempts the active session).
  • _TOOL_REF_RE in tests/test_prompts.py broadened to cover both
    memory_* and episode_* families so SKILL.md / system prompt
    parity catches drift in either tool group.
  • FastMCP instructions block carries a one-line loop pointer
    (episode_handoff at entry, episode_write(takeaway) at exit)
    under the 1700-char ceiling.
  • plugin/skills/bettermemory/SKILL.md gains a full "Episodes: the
    sibling tier for run-state" section with the loop-iteration
    pattern, storage layout, and the episode_promote lifecycle note.
  • README, docs/api.md, docs/ROADMAP.md, CONTRIBUTING.md, and
    plugin/README.md all updated to the new 22-tool count with the
    memory_* + episode_* split called out.
  • 50+ new tests across test_episodes.py, test_server.py,
    test_health.py, test_cli_smoke.py, test_config.py,
    test_direct_imports.py, test_prompts.py, test_eval.py.
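
As an illustration of what "traversal-safe session_id validation" can mean for a `<root>/episodes/<session_id>/` layout, here is one common approach: allow-list the characters before joining paths. The regex, the 64-character limit, and the function name are assumptions; the actual EpisodeStore rules are not stated in these notes.

```python
import re
from pathlib import Path

# Hypothetical allow-list: letters, digits, underscore, hyphen; 1 to 64 chars.
_SESSION_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def session_dir(root: Path, session_id: str) -> Path:
    """Build the per-session directory path, refusing ids that could escape it."""
    if not _SESSION_ID_RE.fullmatch(session_id):
        raise ValueError(f"unsafe session_id: {session_id!r}")
    return root / "episodes" / session_id


good = session_dir(Path("/tmp/store"), "abc-123")
try:
    session_dir(Path("/tmp/store"), "../../etc")
    escaped = True
except ValueError:
    escaped = False
```

An allow-list rejects separators and dot segments outright, which is simpler to reason about than resolving the path and checking containment afterwards.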

Full diff: v3.0.2...v3.1.0

Distributions: bettermemory-3.1.0-py3-none-any.whl + bettermemory-3.1.0.tar.gz published to PyPI via trusted publishing.