Skip to content

docs: update skill docs with dead event recovery, multi-server routing, timestamp convention#515

Merged
PureWeen merged 3 commits intomainfrom
fix/skill-docs-388
Apr 6, 2026
Merged

docs: update skill docs with dead event recovery, multi-server routing, timestamp convention#515
PureWeen merged 3 commits intomainfrom
fix/skill-docs-388

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

@PureWeen PureWeen commented Apr 6, 2026

Summary

Addresses the 3 remaining documentation gaps from issue #388.

Changes

.claude/skills/processing-state-safety/SKILL.md:

  • Added Dead Event Stream Recovery section documenting LoadHistoryFromDiskAsync
  • Covers the method's purpose, its 3 call sites in Organization.cs, and timestamp filtering
  • Explains the defense-in-depth pattern: FlushCurrentResponse → events.jsonl → LoadHistoryFromDiskAsync

.claude/skills/multi-agent-orchestration/SKILL.md:

  • Added Multi-Server Routing via GetClientForGroup section with decision tree, callers, thread safety notes
  • Added CopilotService.Codespace.cs to Key Files table
  • Added Timestamp Convention section explaining why DateTime.Now (local time) is used for dispatch filtering and how PendingOrchestration.StartedAt is UTC-to-local converted on resume

Closes #388

PureWeen and others added 3 commits April 5, 2026 19:17
…g, timestamp convention

Addresses issue #388 — remaining documentation gaps:

- processing-state-safety: Add Dead Event Stream Recovery section documenting
  LoadHistoryFromDiskAsync, its 3 call sites, and timestamp filtering
- multi-agent-orchestration: Add GetClientForGroup multi-server routing docs
  with decision tree, callers, and thread safety notes
- multi-agent-orchestration: Add timestamp convention section explaining why
  DateTime.Now (local time) is used for dispatch filtering and how
  PendingOrchestration.StartedAt is converted on resume

Closes #388

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses review findings:
- Call site #2 is RecoverFromPrematureIdleIfNeededAsync (dispatch-phase),
  not synthesis fallback
- Update line numbers to match current code (~2466/~2762/~3059)
- Fix LoadBestHistoryAsync description: uses most-recent-user-message
  selection (5s threshold), not completeness comparison
- Fix LoadBestHistoryAsync line reference (793, not 795)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Timestamp: parsed event timestamp is primary, DateTime.Now is fallback
  (doc had it backwards)
- GetClientForGroup callers: 'all' → 'primary' — codespace health-check
  and permission-recovery resumes bypass this method

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Apr 6, 2026

🔍 Multi-Model Code Review — PR #515

docs: update skill docs with dead event recovery, multi-server routing, timestamp convention

CI Status

⚠️ No CI checks reported (documentation-only PR)


Findings

🟡 MODERATE — Timestamp fallback priority described backwards (1/3 reviewers)

File: .claude/skills/multi-agent-orchestration/SKILL.md — Timestamp Convention section

The doc stated ChatMessage.Timestamp "defaults to DateTime.Now... falls back to parsed event timestamp" — but the code does the opposite: sets DateTime.Now as a fallback, then overwrites with the parsed event timestamp if present. The parsed timestamp is primary, DateTime.Now is the fallback.

Status:FIXED in fb1ce43 — corrected to "uses the parsed event timestamp if present, falling back to DateTime.Now"


🟡 MODERATE — "all" callers overstated for GetClientForGroup (1/3 reviewers)

File: .claude/skills/multi-agent-orchestration/SKILL.md — Multi-Server Routing section

The doc said callers were "all session creation/resume paths" but ResumeCodespaceSessionsAsync and TryRecoverPermissionAsync use their client references directly, bypassing GetClientForGroup.

Status:FIXED in fb1ce43 — changed "all" to "primary" and documented the exceptions.


Verified Correct ✅

All 3 reviewers confirmed these claims are accurate:

  • GetClientForGroup at Codespace.cs:38, uses SnapshotGroups(), fail-fast design
  • LoadHistoryFromDiskAsync parses 4 event types, uses FileShare.ReadWrite
  • Three call sites at ~2466, ~2762, ~3059 with correct triggers and context
  • Call site Add mobile sidebar top-bar & flyout UI #2 is RecoverFromPrematureIdleIfNeededAsync (dispatch-phase, [DISPATCH-RECOVER] tag)
  • LoadBestHistoryAsync at line 793: most-recent-user-message selection, 5s threshold
  • Timestamp convention code snippet is verbatim match of lines 3028–3030
  • PendingOrchestration.StartedAt set with DateTime.UtcNow, converted to local on resume

Test Coverage

Documentation-only PR — no tests needed.

Recommendation

Approve — both findings already fixed in fb1ce43. All factual claims verified accurate by all 3 reviewers.

@PureWeen PureWeen merged commit 7ebf7c2 into main Apr 6, 2026
@PureWeen PureWeen deleted the fix/skill-docs-388 branch April 6, 2026 00:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update skill docs: document lazy resume, premature idle recovery, stale references

1 participant