
refactor: Refactor core agent flow, tighten messaging delivery, and add QA coverage #3

Merged
Fullstop000 merged 17 commits into main from
codex/update-agents-testing
Mar 22, 2026

Conversation

Fullstop000 commented Mar 21, 2026

Summary

  • refactor the Rust backend into dedicated server/ and store/ modules and align repo guidance/docs with the new layout
  • improve agent lifecycle, bridge, and prompt behavior so wakeups, split message tools, and activity logging are clearer and more reliable
  • tighten messaging behavior across DMs and threads, including scoped thread delivery to implicit thread participants only
  • add reusable QA infrastructure under qa/, expand messaging and activity cases, and add regression tests for delivery/read-path behavior
  • add CI and local guardrails with the Rust/UI workflow and pre-commit formatting/lint checks

Highlights

  • thread replies now reach only the parent agent author plus agents that have already replied in that same thread
  • unread message filtering now matches delivery policy, so unrelated agents no longer see thread messages just because they share the parent channel
  • wake-message handling carries concrete message context into agent restart flows
  • the activity timeline is more explicit about message send/receive and agent state transitions
  • QA now has static case docs, presets, report templates, and stronger messaging/activity coverage
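
The thread routing rule in the first highlight can be sketched as a small pure function. This is an illustration only, not the project's code: the `Reply` type, the `thread_recipients` name, and the choice to exclude the sender of the new reply are all assumptions made for the example.

```rust
use std::collections::BTreeSet;

/// One earlier reply in a thread. Hypothetical type; the real schema is not shown in this PR.
#[derive(Debug, Clone)]
pub struct Reply {
    pub author: String,
}

/// Agents that should receive a new reply in a thread:
/// the parent message's author plus every agent that already replied.
/// Assumption: the sender of the new reply does not receive its own message.
pub fn thread_recipients(
    parent_author: &str,
    prior_replies: &[Reply],
    sender: &str,
) -> BTreeSet<String> {
    // Start from everyone who has already replied in this thread.
    let mut recipients: BTreeSet<String> =
        prior_replies.iter().map(|r| r.author.clone()).collect();
    // The parent author is always an implicit participant.
    recipients.insert(parent_author.to_string());
    // Do not echo the message back to whoever sent it.
    recipients.remove(sender);
    recipients
}
```

Under this sketch, an agent that merely shares the parent channel never appears in the result, which is the behavior the unread-filtering highlight describes.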

Verification

  • cargo test
  • cargo build
  • focused browser QA for MSG-003 thread participant routing
  • browser DM smoke for MSG-002

Notes

  • per-run QA artifacts under qa/runs/ remain gitignored; the tracked qa/ docs/templates/cases are part of this PR

Move all cases from QA_CASES.md into qa/cases/ with one file per domain:
- cases/agents.md   — agent creation, lifecycle, profile, activity, navigation, workspace, recovery
- cases/channels.md — channel create, validation, membership, delete
- cases/messaging.md — channel fan-out, DMs, threads, history, attachments, error recovery
- cases/tasks.md    — task board create/advance and composer integration

QA_CASES.md becomes a lightweight index: preconditions, result definitions,
module table, and maintenance notes only.
…ements

Codex driver:
- rewrite system prompt to match Claude prompt structure (identity, tool
  list, startup sequence, messaging, tasks, memory sections)
- add view_file (mcp_chat_view_file) to tool list
- clarify block=false receive-only startup; no indefinite blocking
- add formatting rules: no HTML, no backtick-wrapped mentions/channels

Claude driver prompt:
- add view_file as tool #11 in the tool list
- improve startup sequence: explicit block=true idle loop, block=false
  mid-task check, never block while work is in progress
- add attachments section with view_file usage guidance
- add backtick mention formatting rule

Models:
- add suppress_agent_delivery flag on send request for human-only
  side effects (e.g. send + create task without triggering agent replies)
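
A minimal sketch of how such a flag might gate delivery. Only the `suppress_agent_delivery` name comes from this commit; the `SendRequest` fields, the `Member` enum, and `delivery_targets` are hypothetical and exist only to illustrate the idea.

```rust
/// Hypothetical send request. Only `suppress_agent_delivery` is named in the PR.
#[derive(Debug, Clone)]
pub struct SendRequest {
    pub target: String,
    pub content: String,
    pub suppress_agent_delivery: bool,
}

/// Hypothetical channel member, distinguishing humans from agents.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Member {
    Human(String),
    Agent(String),
}

/// Members that should be notified about the message.
/// Humans are always included; agents are skipped when the flag is set,
/// so a human-only side effect does not trigger agent replies.
pub fn delivery_targets(req: &SendRequest, members: &[Member]) -> Vec<Member> {
    members
        .iter()
        .filter(|m| match m {
            Member::Human(_) => true,
            Member::Agent(_) => !req.suppress_agent_delivery,
        })
        .cloned()
        .collect()
}
```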

Store:
- expose data_dir() and attachments_dir() helpers on Store

Tests:
- extend server_tests.rs and add driver_tests.rs coverage

Docs:
- update AGENTS.md: generalize branch prefix to {agent}/, trim
  redundant multi-agent setup bullet (now in QA_CASES.md)
- update qa/README.md and QA_REPORT_TEMPLATE.md
- add qa/BUG_FIX_REPORT_TEMPLATE.md

Misc:
- gitignore _prompts/ and ui/tsconfig.tsbuildinfo
messaging.md:
- MSG-002: expand DM round-trip with token-based verification, target
  switching check, and stronger failure signals
- MSG-004 (new): DM wake and reply visibility — verifies a sleeping agent
  can wake from a DM and render the reply back into the correct timeline

agents.md:
- ACT-001: expand activity checks with per-row field verification
  (sender, target, preview) and refresh stability
- ACT-002 (new): activity timeline ordering during wake and recovery —
  verifies correct sequencing of wake-up flow tied to MSG-004

misc: QA_CASES.md module index, QA_REPORT_TEMPLATE, README updates,
handlers.rs formatting
Fullstop000 changed the title from "Refactor core modules, improve activity UX, and add QA regression docs" to "Refactor core agent flow, tighten messaging delivery, and add QA coverage" on Mar 22, 2026
Fullstop000 changed the title from "Refactor core agent flow, tighten messaging delivery, and add QA coverage" to "refactor: Refactor core agent flow, tighten messaging delivery, and add QA coverage" on Mar 22, 2026
Fullstop000 merged commit 6d1d647 into main on Mar 22, 2026
3 checks passed
Fullstop000 added a commit that referenced this pull request Apr 27, 2026
…an label test

- templates.rs: pass human.id and result.id instead of human.name/result.name
  to join_channel (Copilot review comments #2, #3)
- store_tests: add UUID-id human join assertion verifying label resolution
  (Copilot review comment #1)
- agents.rs auto-join path already fixed in prior refactor commit 9e10c5c
  (Copilot review comment #4)
Fullstop000 added a commit that referenced this pull request Apr 28, 2026
…ng (#116)

* feat: system message when member joins a channel (#114)

When a member joins a channel via API handlers (creation, invite, team
assignment), post a server-authored system message into the channel so
the join is visible in chat history.

Backend:
- Added Store::resolve_member_label_tx to resolve human-readable labels
  (display_name for agents, name for humans)
- Added join_channel_by_id_with_system_message and
  join_channel_with_system_message: atomically insert membership row
  and create a system message, then emit both member_joined and
  message.created stream events. Idempotent — returns false and skips
  the system message when the member is already present.
- Updated all runtime API handlers to use the new methods:
  handle_create_channel, handle_invite_channel_member,
  handle_create_agent, handle_create_team, handle_add_team_member,
  handle_launch_trio
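
The idempotent join described above can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not the Store implementation: the `Channel` type and its fields are hypothetical, and a single `&mut self` call takes the place of the database transaction that makes the real operation atomic.

```rust
use std::collections::HashSet;

/// In-memory stand-in for one channel's membership and history (hypothetical).
#[derive(Default)]
pub struct Channel {
    pub members: HashSet<String>,
    pub messages: Vec<String>,
}

impl Channel {
    /// Adds `member` and records a system message in one step.
    /// Returns `false` and writes nothing if the member is already present.
    pub fn join_with_system_message(&mut self, channel_name: &str, member: &str) -> bool {
        // `HashSet::insert` returns false when the value was already in the set.
        if !self.members.insert(member.to_string()) {
            return false;
        }
        self.messages.push(format!("{member} joined #{channel_name}"));
        true
    }
}
```

Keeping the membership check and the message write in the same call is what prevents a repeated join from producing a second notice.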

Tests:
- Added test_join_channel_with_system_message_creates_notice_and_is_idempotent
  verifying human join, agent join with 'Agent' prefix, and idempotency

* fix: ensure system message on agent creation by moving auto-join out of inner helper

The  function was directly inserting into
 for the #all channel. This meant that when
 later called
for auto-join channels, the INSERT OR IGNORE returned rows=0 (already a
member), so no system message was ever created.

Fix: remove the channel_members INSERT from
and have  /  call
 instead. The connection lock is
dropped first to avoid deadlock with the method's own lock acquisition.
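
The lock-ordering point can be shown with `std::sync::Mutex`, which is not reentrant: locking it again on a thread that already holds it will deadlock or panic. The sketch below is illustrative only. The `Store` here is a toy type, and `create_agent` and `join_channel` are stand-in names, not the functions whose names are missing from the message above.

```rust
use std::sync::Mutex;

/// Toy store; the Vec stands in for a database connection (hypothetical).
pub struct Store {
    conn: Mutex<Vec<String>>,
}

impl Store {
    pub fn new() -> Self {
        Store { conn: Mutex::new(Vec::new()) }
    }

    /// Acquires the connection lock itself, like the join method in the commit.
    pub fn join_channel(&self, member: &str) -> bool {
        let mut rows = self.conn.lock().unwrap();
        if rows.iter().any(|m| m.as_str() == member) {
            return false; // already a member, nothing inserted
        }
        rows.push(member.to_string());
        true
    }

    /// Does its own work under the lock, then releases it before delegating.
    pub fn create_agent(&self, name: &str) -> bool {
        {
            let _conn = self.conn.lock().unwrap();
            // ... the agent row would be inserted here while the lock is held ...
        } // guard dropped here, so `join_channel` can take the lock
        self.join_channel(name)
    }
}
```

Calling `join_channel` inside the inner block, while `_conn` is still alive, is the pattern the fix avoids.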

QA verified: creating an agent now shows 'Agent <name> joined #all' in chat.

* refactor: eliminate join_channel duplication, promote system-message variants to canonical API

The old  and  duplicated the INSERT logic
and were only used by tests. The  variants were the
actual production API but had verbose names.

Changes:
- Removed old silent  /  from public API
- Renamed  →
- Renamed  →
-  delegates to  after name resolution, eliminating duplication
- Added  /  for unit tests
- Added  for integration tests
- Updated all test files to use silent helpers where they assert on message counts
- Fixed test data bugs where  was passed by name instead of ID

* fix(copilot-review): use stable IDs in template handler, add UUID human label test

- templates.rs: pass human.id and result.id instead of human.name/result.name
  to join_channel (Copilot review comments #2, #3)
- store_tests: add UUID-id human join assertion verifying label resolution
  (Copilot review comment #1)
- agents.rs auto-join path already fixed in prior refactor commit 9e10c5c
  (Copilot review comment #4)

* style: cargo fmt

* refactor: unify system-message structured payloads

Rename `messages.notice` column to `messages.payload` and migrate task
events from JSON-in-content to the same payload column. Two roles, one
column:

  - `content` — always-readable English fallback
  - `payload` — kind-discriminated JSON (`{kind, audience?, ...}`)

Producers:
  - `member_joined` → payload `{kind, audience: "humans", actor, verb, target}`,
    content `"alice joined #planning"`
  - `task_event` → payload (existing camelCase shape) + English sentence in
    content via new `as_human_sentence()` (no `[task]` prefix)

Agent visibility filter is structural — `payload.audience != 'humans'`,
not a kind allowlist. Adding new ambient kinds = set audience humans.
Adding new operational kinds = omit audience (defaults to all). Honors
the project memory rule "no typed event allowlists."
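
A minimal sketch of the structural filter, assuming plain Rust structs in place of the JSON column. The `Payload` and `Message` types and the `visible_to_agents` name are hypothetical; only the `kind` and `audience` keys and the `audience != "humans"` rule come from the commit message.

```rust
/// Simplified payload: `kind` discriminates, `audience` is optional (hypothetical type).
#[derive(Debug, Clone)]
pub struct Payload {
    pub kind: String,
    pub audience: Option<String>,
}

/// Simplified message: readable fallback text plus an optional structured payload.
#[derive(Debug, Clone)]
pub struct Message {
    pub content: String,
    pub payload: Option<Payload>,
}

/// Agents see every message except those whose payload targets humans only.
/// The check never inspects `kind`, so new kinds need no allowlist entry.
pub fn visible_to_agents(msg: &Message) -> bool {
    match &msg.payload {
        Some(p) => p.audience.as_deref() != Some("humans"),
        None => true, // ordinary messages carry no payload
    }
}
```

In this sketch a payload with no `audience` is visible to agents, matching the stated default of "all".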

Frontend `Notice/NoticeActor/NoticeTarget` interfaces collapse to a loose
`MessagePayload` (`{kind, [k]: unknown}`); `SystemNotice` and
`parseTaskEvent` narrow at use time. `format_message_for_agent` deleted —
agents read `content` raw now that producers always write it.

No data migration. Existing dev DBs need to be reset on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.0.4.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: simplify v0.0.4.0 changelog entry

Drop implementation detail in favor of two user-facing bullets.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>