fix(agent-room): enforce 1:1 invariant + admin view + migrate legacy multi-member pods by lilyshen0722 · Pull Request #232 · Team-Commonly/commonly

lilyshen0722 · 2026-04-25T04:59:03Z

Summary

ADR-001 §3.10 says agent-rooms are personal 1:1 DMs — one user, one agent, no third party. The §3.10 amendment that corrected the original "N humans × 1 agent office" framing was documented but never enforced in code. Live data on api-dev showed agent-room pods with 5–6 members — a privacy/UX bug where what users perceive as a private DM was joinable by other people.

Surfaced during ADR-005 Phase 2 codex smoke (the commonly pod list output included agent-rooms with 5–6 members).

What changed

File	Change
`backend/controllers/podController.ts`	`joinPod` rejects any third-person join on `agent-room` (HTTP 403). `getAllPods` / `getPodsByType` skip the membership filter when caller is a global admin.
`backend/services/agentIdentityService.ts`	`ensureAgentInPod` refuses to add an agent to an agent-room when that agent isn't already a member — closes the auto-install path that was the most likely entry route for the multi-member legacy data.
`backend/services/dmService.ts`	Comment refresh — was still documenting the rejected N×1 "office" design.
`backend/scripts/migrate-agent-room-multimember.ts` (NEW)	Convert `agent-room` pods with 3+ members to `type: 'chat'`. Preserves messages, members, createdBy. Run with `--dry` to preview. Idempotent.
`backend/__tests__/unit/controllers/podController.test.js`	4 new tests: joinPod 403 on agent-room, chat-pod regression guard, admin bypass present, non-admin still filtered.
`backend/__tests__/unit/services/agentIdentityService.ensureAgentInPod.test.js` (NEW)	3 tests: refuse intruder agent, no-regression on chat pod, no-op for host agent re-running for its own DM.

Tests

PASS __tests__/unit/controllers/podController.test.js                12/12  (4 new)
PASS __tests__/unit/services/agentIdentityService.ensureAgentInPod   3/3   (all new)
PASS __tests__/unit/routes/pods.test.js                              5/5   (regression check)

20/20 across affected suites. No regression in adjacent test files.

Deploy steps after merge

Roll out the backend image (standard deploy).

Once running, trigger the migration on dev:

kubectl exec -n commonly-dev <backend-pod> -- \
  npx ts-node backend/scripts/migrate-agent-room-multimember.ts --dry
# review the offender list, then re-run without --dry

Verify commonly pod list --instance dev shows no agent-room pod with > 2 members.

Out of scope (filed as follow-ups)

agent-admin type: ADR-001 §3.10 says this is legacy and slated for deprecation as LiteLLM session observability matures. This PR doesn't touch it; agent-admin can still have multiple admins (its design intent).
Frontend admin UI for the new "view all agent-rooms" affordance — backend now supports it; UI surface is a separate ticket.
A periodic invariant-check cron that warns if any agent-room ever drifts back to >2 members (defense in depth) — not added since the three write paths are now all guarded.

Test plan

Unit tests for joinPod reject + chat regression
Unit tests for ensureAgentInPod refuse + chat regression + self-noop
Unit tests for getPodsByType admin bypass + non-admin filter
Migration script idempotent + dry-run mode
Manual smoke after merge: log in as a non-admin, attempt to join an existing agent-room owned by another user → expect 403
Manual smoke after merge: log in as admin, hit /api/pods?type=agent-room → expect ALL agent-rooms returned regardless of membership
Run migration against dev DB, verify commonly pod list shows no agent-room with >2 members afterward

🤖 Generated with Claude Code

…multi-member pods ADR-001 §3.10 says agent-rooms are personal 1:1 DMs (one user, one agent, no third party). The amendment that corrected the original "N humans × 1 agent office" framing was documented but never enforced in code. Live data showed agent-room pods with 5–6 members on dev, which is a privacy/UX bug: an agent's "DM" had multiple humans in it. Changes: 1. `joinPod` (podController.ts): reject any third-person join on a pod whose `type === 'agent-room'` with HTTP 403. Applies to humans AND agents — the room is closed once the host agent + opening user are in. Multi-party human↔agent surfaces should use `type: 'chat'`. 2. `ensureAgentInPod` (agentIdentityService.ts): refuse to add an agent to an agent-room when that agent isn't already a member. Closes the auto-install path that was sneaking second agents into existing DMs. Existing-member case is unaffected (host agent re-running for its own DM still no-ops). 3. `getAllPods` / `getPodsByType` (podController.ts): admin-view bypass. Global admins skip the membership filter on agent-room + agent-admin queries so the moderation surface for these private pod types actually exists. Non-admins still see only their own. 4. Migration script `migrate-agent-room-multimember.ts`: convert `agent-room` pods with 3+ members to `type: 'chat'`. Preserves all messages, members, createdBy — only the type field changes. Run with `--dry` to preview. Idempotent. 5. `getOrCreateAgentRoom` (dmService.ts): comment refresh — was documenting the rejected N×1 "office" design that the §3.10 amendment killed. Tests: 4 new podController cases (joinPod reject + chat-pod regression, admin bypass present + non-admin still filtered) + 3 ensureAgentInPod cases (refuse, regress chat, no-op for self). All 20 affected tests pass; no regression in adjacent suites. Migration on dev needs a manual run after deploy: kubectl exec -n commonly-dev <backend-pod> -- \\ npx ts-node backend/scripts/migrate-agent-room-multimember.ts --dry # review offenders, then drop --dry Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…comments + getAllPods test Three findings from the code-reviewer pass on PR #232. 1. CRITICAL — ObjectId equality bug in ensureAgentInPod (pre-existing, exposed by this PR). `pod.members.includes(agentUser._id)` uses JS `===` which compares object references, not ObjectId values. In production with real Mongoose docs, the host agent is always treated as "not a member" of its own room, the new agent-room guard fires, and `null` propagates to callers as "pod not found." Replace with `.some((m) => m.equals?.(id) ?? String(m) === String(id))` — same pattern already used by removeAgentFromPod a few lines below. The previous test masked this bug by overriding pod.members.includes with a hand-rolled .equals lookup. Test now uses fakeObjectId() with real `.equals()` semantics — no override — so it would fail under the broken `.includes()` and pass only with the fixed `.some(.equals)`. 2. Test gap on getAllPods admin bypass — only getPodsByType was directly tested. Add a parity case asserting that an admin caller hitting getAllPods with type=agent-room sees all agent-rooms, not just their own. (getAllPods has an extra .populate(parentPod) call vs. getPodsByType — the test mock chain mirrors that.) 3. The joinPod guard comment said "not even the creator (the agent itself)" can rejoin. Tighten to make explicit that `pod.createdBy` for agent-rooms is the host agent, not a human user, so the creator-bypass branch in the invite-only block is effectively dead for this pod type — preventing future readers from misreading the comment as implying a creator-bypass exists. 4. Migration script header — document the downstream consequences a reviewer flagged: UI tab move, privacy-filter behavior change after conversion, agentAutoJoinService scan visibility. These are correct behaviors but worth surfacing for the operator running the migration on prod. 21/21 tests pass on affected suites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

samxu01 · 2026-04-25T21:56:10Z

Self-review pass via code-reviewer subagent — addressed the critical finding + two important items in fbd93c7512.

Finding	Severity	Resolution
`pod.members.includes(agentUser._id)` uses JS `===`, fails for Mongoose ObjectIds — host agent treated as "not a member" of its own room, the new agent-room guard fires, and `null` propagates to callers as "pod not found." Pre-existing bug; this PR exposed it.	Critical	Replace with `.some((m) => m.equals?.(id) ?? String(m) === String(id))` — same pattern as `removeAgentFromPod` below. Test mock that hand-rolled the override deleted; new test uses real `.equals()`-bearing fakes so it would fail under the broken `.includes()`.
`getAllPods` admin-bypass not directly tested (only `getPodsByType` was)	Important	Added parity case asserting an admin sees all agent-rooms via `getAllPods`
`joinPod` comment said "not even the creator" without clarifying that `pod.createdBy` for agent-rooms is the host agent (not a human user), so the creator-bypass in the invite-only block is effectively dead	Important	Tightened comment to make this explicit
Migration script silently changed `pod.type = 'chat'` without documenting downstream effects (UI tab move, privacy filter behavior, agentAutoJoinService scan visibility)	Important	Header now lists each consequence so the operator sees them before running on prod

Reviewer findings deliberately deferred:

Q: 8+ other ensureAgentInPod callers ignore the return value — true, and now null can mean "refused" not just "pod not found." For most callers (install paths) the membership prerequisite is already enforced upstream, so the new refuse path closes a back door rather than silently breaking. Worth a follow-up audit but not blocking this PR.
Q: Migration UX — the post-migration "5-member chat pod" is the simplest reasonable shape. All messages preserved, members preserved, just shows up under Chat instead of Agent DMs. Now documented in the script header.
Q: agentAutoJoinService visibility post-migration — same point, now documented as a known consequence the operator should audit before running on prod.

Verdict: Request changes → critical + 3 important addressed. Tests now 21/21 across affected suites (was 20/20 before; +1 for the getAllPods admin parity case).

This reverts commit 99f2ffd.

… chat PR #232's migration script converted any agent-room with >2 members to type: 'chat'. Investigating the actual offenders on dev showed every multi-member agent-room was an originally-1:1 DM (1 host agent + 1 human) polluted with rogue agents that auto-joined via ensureAgentInPod — exactly the bug PR #232's runtime guards close going forward. The relabel-to-chat strategy legitimized that pollution; restoration is the correct action. New migration logic: for each agent-room with members.length > 2: humans = members where User.isBot === false case humans.length === 1: keep [createdBy (host agent), the_human]; drop rogue agents stays type='agent-room' case humans.length === 0: agent↔agent DM. Trust insertion order — getOrCreateAgentRoom creates pods with [hostAgent, otherParty]; later auto-joins append. Keep [members[0], members[1]]; drop the rest. stays type='agent-room' case humans.length >= 2: Was never a DM (multi-human surfaces are chat pods, not DMs). type → 'chat'. All members preserved. 5 unit tests cover all three branches + dry-run + idempotency, mocking Pod + User so they run without a real Mongo connection. Also: tighten stale comment on POST /api/agents/runtime/room — was still documenting the rejected "many humans × one agent office" framing. Now matches ADR-001 §3.10 and notes that the endpoint accepts only human JWT (agent↔agent DM creation has no agent-runtime endpoint yet — file a follow-up if needed). The previous migration script was never run on dev, so swapping in the new logic doesn't require any rollback or compensating action. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lilyshen0722 and others added 2 commits April 24, 2026 21:58

lilyshen0722 added 2 commits April 25, 2026 15:04

Align agent guidance and skill symlink

99f2ffd

Revert "Align agent guidance and skill symlink"

ee767a2

This reverts commit 99f2ffd.

samxu01 merged commit 9ea8ee0 into main Apr 25, 2026
10 checks passed

samxu01 deleted the fix/agent-room-1to1-invariant branch April 25, 2026 22:16

samxu01 mentioned this pull request Apr 25, 2026

fix(agent-room): rewrite migration to RESTORE 1:1 DMs, not relabel as chat #235

Merged

4 tasks

samxu01 mentioned this pull request Apr 26, 2026

feat(cli): close cross-agent loop — auto-handle agent.ask + ${COMMONLY_AGENT_TOKEN} substitution #237

Closed

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent-room): enforce 1:1 invariant + admin view + migrate legacy multi-member pods#232

fix(agent-room): enforce 1:1 invariant + admin view + migrate legacy multi-member pods#232
samxu01 merged 4 commits intomainfrom
fix/agent-room-1to1-invariant

lilyshen0722 commented Apr 25, 2026

Uh oh!

samxu01 commented Apr 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lilyshen0722 commented Apr 25, 2026

Summary

What changed

Tests

Deploy steps after merge

Out of scope (filed as follow-ups)

Test plan

Uh oh!

samxu01 commented Apr 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants