feat: track context window usage #260
Conversation
There was a problem hiding this comment.
Pull request overview
Adds support for tracking ACP context window usage by capturing usage_update session notifications and exposing the latest used/size values for downstream, context-aware features (e.g., /doctor, auto-compact triggers).
Changes:
- Added
UsageUpdate { used, size }toAcpEventand extendedclassify_notificationto parseusage_update. - Added
context_used/context_sizeatomic counters toAcpConnection. - Captured
usage_updatevalues in the ACP reader loop and stored them into the new atomics.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/acp/protocol.rs |
Adds a UsageUpdate event and parses used/size from usage_update notifications. |
src/acp/connection.rs |
Stores latest context usage counters on the connection and updates them when session/update emits usage_update. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Capture usage_update for context window tracking | ||
| if msg.method.as_deref() == Some("session/update") { | ||
| if let Some(upd) = msg.params.as_ref() | ||
| .and_then(|p| p.get("update")) | ||
| { | ||
| if upd.get("sessionUpdate").and_then(|v| v.as_str()) == Some("usage_update") { |
There was a problem hiding this comment.
The reader loop duplicates usage_update parsing logic that now also exists in protocol::classify_notification (and currently uses different semantics). Consider reusing the shared classifier or a small helper to avoid divergence as the ACP schema evolves.
|
@copilot apply changes based on the comments in this thread |
|
@copilot re-review |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let used = update.get("used").and_then(|v| v.as_u64()).unwrap_or(0); | ||
| let size = update.get("size").and_then(|v| v.as_u64()).unwrap_or(0); |
There was a problem hiding this comment.
In usage_update classification, missing/invalid used or size currently falls back to 0, which can silently clobber previously observed values and makes 0 indistinguishable from “field absent/unknown” (notably for backends that emit the event but omit these fields). Consider returning None unless both fields are present, or only updating the fields that are present (e.g., keep prior value when a field is missing).
| let used = update.get("used").and_then(|v| v.as_u64()).unwrap_or(0); | |
| let size = update.get("size").and_then(|v| v.as_u64()).unwrap_or(0); | |
| let used = update.get("used").and_then(|v| v.as_u64())?; | |
| let size = update.get("size").and_then(|v| v.as_u64())?; |
| if let Some(crate::acp::protocol::AcpEvent::UsageUpdate { used, size }) = | ||
| crate::acp::protocol::classify_notification(&msg) | ||
| { | ||
| ctx_used.store(used, Ordering::Relaxed); | ||
| ctx_size.store(size, Ordering::Relaxed); | ||
| } |
There was a problem hiding this comment.
context_used and context_size are updated independently with relaxed atomics. Any reader that loads both to compute a ratio can observe a mixed snapshot (e.g., used from the new update with size from the previous one). If downstream consumers will read them as a pair, consider providing a snapshot getter using a version/sequence approach, or storing both values behind a small lock so reads are consistent.
| /// Context window usage from the latest `usage_update` ACP notification. | ||
| pub context_used: Arc<std::sync::atomic::AtomicU64>, | ||
| pub context_size: Arc<std::sync::atomic::AtomicU64>, |
There was a problem hiding this comment.
The public fields context_used/context_size default to 0, which conflates “unknown/not provided by backend yet” with a legitimate 0 value. Since these are part of the connection’s public API surface, consider documenting the sentinel semantics (e.g., 0 == unknown) or representing “unknown” explicitly (Option/sentinel like u64::MAX) to avoid ambiguous downstream behavior.
|
Re: atomic consistency concern — intentional trade-off. The only current reader is The other two suggestions (require both fields via |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
嘿嘿終於 這是我第一次PR 下次不會再出現這些問題了 抱歉抱歉 |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
160b698 to
572f88b
Compare
Capture `used` and `size` from ACP `usage_update` session notifications
into atomic counters on AcpConnection. This enables downstream features
like context display in /doctor and auto-compact triggers.
- Add `context_used` / `context_size` AtomicU64 fields to AcpConnection
- Capture values from `usage_update` in the reader loop
- Add `UsageUpdate { used, size }` variant to AcpEvent enum
- Add `usage_update` branch to classify_notification
Verified with Claude Code (1M ctx), Codex CLI (258K ctx), and Gemini CLI
(schema-defined). Zero-cost when no usage_update is emitted (Copilot).
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Replace inline usage_update parsing in the reader loop with a call to the shared classify_notification() helper (protocol.rs), keeping the parsing logic in one place and avoiding divergence as ACP evolves.
Address review feedback: - protocol.rs: use `?` instead of unwrap_or(0) so a usage_update missing either field returns None (preserves previous counters). - connection.rs: document that 0 means "no usage_update received yet". The two atomic fields are intentionally independent (no lock) because the only current reader is /doctor display, which tolerates a brief mixed snapshot. A snapshot helper can be added when paired reads become critical.
Add `.github/copilot-instructions.md` to guide GitHub Copilot's automated PR reviews with project-specific context: - ACP protocol correctness (JSON-RPC routing, notification handling) - Concurrency safety (atomic fields, Mutex across await, pool locks) - Discord API constraints (2000 char limit, 3s autocomplete deadline) - Skip CI-covered checks (rustfmt, clippy, tests) - >80% confidence threshold to reduce noise Modeled after block/goose's production-grade instructions. 3440 chars — within the 4000 char limit.
efb623c to
2d53709
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pub last_active: Instant, | ||
| pub session_reset: bool, | ||
| /// Context window usage from the latest `usage_update` ACP notification. | ||
| /// A value of 0 means no `usage_update` has been received yet (unknown). |
There was a problem hiding this comment.
The doc comment says a value of 0 means no usage_update has been received yet, but used can legitimately be 0 (e.g., empty/new context), so this makes the “unknown” state ambiguous and potentially misleading for downstream consumers. Consider tracking “unknown” separately (e.g., an AtomicBool flag), or document a sentinel based on context_size == 0 / Option-like encoding instead.
| /// A value of 0 means no `usage_update` has been received yet (unknown). | |
| /// `context_used` may legitimately be 0 (for example, for a new or empty context). | |
| /// Treat `context_size == 0` as meaning that no `usage_update` has been received yet. |
| - `session/request_permission` must always get a response (auto-allow or forwarded) | ||
| - `session/update` notifications must not be consumed — forward to subscriber after capture | ||
| - `usage_update`, `available_commands_update`, `tool_call`, `agent_message_chunk` must be classified correctly | ||
| - Timeout values: initialize=90s, session/new=120s, others=30s (Gemini cold-start is slow) |
There was a problem hiding this comment.
This instruction claims timeout values include initialize=90s, but the current implementation in acp/connection.rs uses 30s for all methods except session/new (120s); this mismatch can cause reviewers to flag the wrong behavior or miss real issues. Update the instruction to match the current code, or explicitly label these values as a desired target rather than current behavior.
| - Timeout values: initialize=90s, session/new=120s, others=30s (Gemini cold-start is slow) | |
| - Timeout values currently implemented: `session/new`=120s; all other methods, including `initialize`, use 30s |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Context window usage from the latest `usage_update` ACP notification. | ||
| /// `context_used` may legitimately be 0 (e.g., new or empty context). | ||
| /// Treat `context_size == 0` as meaning no `usage_update` has been received yet. | ||
| pub context_used: Arc<std::sync::atomic::AtomicU64>, | ||
| pub context_size: Arc<std::sync::atomic::AtomicU64>, |
There was a problem hiding this comment.
context_used and context_size are updated via separate atomic stores, so readers can observe a mixed snapshot (new used with old size, or vice versa). Either document that these values are best-effort and may be temporarily inconsistent, or provide a helper that reads them consistently (e.g., read size, then used, then re-read size and retry if it changed).
|
✦ 是的,在没有这个 PR 之前,现在的 Discord 完全没有这个功能。 从刚才的搜索结果来看:
PR #260 补齐了这块拼图:它在底层让 OpenAB 能够“听懂” Agent 发出的 Context 使用量通知,并实时存下来。这样之后就可以在 Discord 消息里通过 Emoji 进度条、或者特定命令(如 没有这个 PR 的话,用户在 Discord 上对 Agent 的上下文消耗情况是完全未知的。 |
|
Re: atomic consistency — this is an intentional trade-off documented in the field's doc comment. The two counters are updated from the same |
|
通过分析 PR #260 的变动文件:
这个 PR 是如何“准备”让 Discord 展示这些值的? 1 // src/acp/connection.rs 这意味着,虽然这个 PR 还没在 src/discord.rs 里加代码,但它已经打通了数据链路。后续在处理 Discord 消息(例如用户输入一个 /usage 指令,或者在每一轮对话结束时)时,Discord 的 Handler 1 let used = connection.context_used.load(Ordering::Relaxed); 为什么作者不直接把 Discord 展示也做了?
结论:这个 PR 目前只是“准备好了弹药”。它让 OpenAB 的核心引擎具备了感知 Context 使用量的能力,但真正的 Discord 展示逻辑(比如在回复消息末尾加一个进度条 [▓▓░░░░] 25%)可能计划在下一个 |
Append a small footer to each agent response showing the current context window utilization, e.g.: 📊 Context: 31434/1000000 tokens (3%) Only shown when context_size > 0 (i.e., at least one usage_update has been received). Uses Discord's -# small text syntax. No impact on backends that don't emit usage_update.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let ctx_size = conn.context_size.load(std::sync::atomic::Ordering::Relaxed); | ||
| let final_content = if ctx_size > 0 { | ||
| let ctx_used = conn.context_used.load(std::sync::atomic::Ordering::Relaxed); | ||
| let pct = (ctx_used as f64 / ctx_size as f64 * 100.0).round() as u64; |
There was a problem hiding this comment.
The % computation can produce nonsensical values if ctx_used > ctx_size (backend bug or a mixed atomic snapshot) and the float→int cast will saturate for huge ratios; clamp the percentage to 100 and prefer integer math (e.g., via u128) to avoid float precision/saturation surprises.
| let pct = (ctx_used as f64 / ctx_size as f64 * 100.0).round() as u64; | |
| let pct = (((ctx_used as u128) * 100 + (ctx_size as u128) / 2) / (ctx_size as u128)) | |
| .min(100) as u64; |
| // Append context window usage footer if available. | ||
| let ctx_size = conn.context_size.load(std::sync::atomic::Ordering::Relaxed); | ||
| let final_content = if ctx_size > 0 { | ||
| let ctx_used = conn.context_used.load(std::sync::atomic::Ordering::Relaxed); | ||
| let pct = (ctx_used as f64 / ctx_size as f64 * 100.0).round() as u64; | ||
| format!("{final_content}\n-# 📊 Context: {ctx_used}/{ctx_size} tokens ({pct}%)") |
There was a problem hiding this comment.
This PR adds new user-visible output formatting for the final Discord message (context footer) but there are no tests covering the formatting/percent logic; consider extracting the footer formatting into a small pure helper and adding unit tests for the ctx_size == 0 and ctx_used > ctx_size cases.
| // Append context window usage footer if available. | |
| let ctx_size = conn.context_size.load(std::sync::atomic::Ordering::Relaxed); | |
| let final_content = if ctx_size > 0 { | |
| let ctx_used = conn.context_used.load(std::sync::atomic::Ordering::Relaxed); | |
| let pct = (ctx_used as f64 / ctx_size as f64 * 100.0).round() as u64; | |
| format!("{final_content}\n-# 📊 Context: {ctx_used}/{ctx_size} tokens ({pct}%)") | |
| fn format_context_footer(ctx_used: u64, ctx_size: u64) -> Option<String> { | |
| if ctx_size == 0 { | |
| return None; | |
| } | |
| let pct = (ctx_used as f64 / ctx_size as f64 * 100.0).round() as u64; | |
| Some(format!( | |
| "-# 📊 Context: {ctx_used}/{ctx_size} tokens ({pct}%)" | |
| )) | |
| } | |
| // Append context window usage footer if available. | |
| let ctx_size = conn.context_size.load(std::sync::atomic::Ordering::Relaxed); | |
| let ctx_used = conn.context_used.load(std::sync::atomic::Ordering::Relaxed); | |
| let final_content = if let Some(footer) = format_context_footer(ctx_used, ctx_size) { | |
| format!("{final_content}\n{footer}") |
- Skip footer when message is >1900 chars to prevent it splitting into a separate Discord message - Shorten "📊 Context: X/Y tokens (N%)" to "📊 X/Y tokens (N%)"
Instead of hiding the context footer on long messages, append it to the last chunk if it fits within the 2000-char limit. This ensures users always see context usage — especially important for long responses that consume more context window.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Append context footer to the last chunk if it fits. | ||
| let content = if i == last_idx && !ctx_footer.is_empty() | ||
| && chunk.len() + ctx_footer.len() < 2000 | ||
| { | ||
| format!("{chunk}{ctx_footer}") | ||
| } else { | ||
| chunk.to_string() | ||
| }; |
There was a problem hiding this comment.
The footer-fit check uses chunk.len()/ctx_footer.len() (byte length), but Discord’s 2000 limit (and format::split_message) are based on Unicode character count; use .chars().count() (or a shared helper) so the footer isn’t incorrectly dropped for non-ASCII content.
| let chunks = format::split_message(&final_content, 2000); | ||
| let last_idx = chunks.len().saturating_sub(1); | ||
| for (i, chunk) in chunks.iter().enumerate() { | ||
| // Append context footer to the last chunk if it fits. | ||
| let content = if i == last_idx && !ctx_footer.is_empty() | ||
| && chunk.len() + ctx_footer.len() < 2000 | ||
| { | ||
| format!("{chunk}{ctx_footer}") | ||
| } else { |
There was a problem hiding this comment.
Right now the context footer is only appended if it fits in the last chunk; when the final chunk is near the 2000-char limit, the footer is silently omitted, which contradicts “display it in Discord after each prompt response”. Consider emitting the footer as its own extra message/chunk when it doesn’t fit.
- Extract format_context_footer() helper with unit tests - Clamp percentage to 100 via u128 integer math (handles used > size) - Use chars().count() instead of byte .len() for Discord 2000-char limit - Send footer as separate message when it doesn't fit in last chunk - Document atomic consistency semantics on context_used/context_size Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Append context footer to the last chunk if it fits (char count, not bytes). | ||
| let content = if i == last_idx { | ||
| if let Some(ref footer) = ctx_footer { | ||
| if chunk.chars().count() + footer.chars().count() < 2000 { | ||
| format!("{chunk}{footer}") | ||
| } else { | ||
| chunk.to_string() | ||
| } |
There was a problem hiding this comment.
The footer append check uses < 2000, so a chunk that would be exactly 2000 characters after appending the footer won’t get it; the follow-up >= 2000 check will then send the footer as a separate message even though it would have fit exactly. Consider using <= 2000 for the append condition and > 2000 (or equivalent) for the separate-message condition to avoid this off-by-one behavior.
| if let Some(ref footer) = ctx_footer { | ||
| let last_chunk = chunks.last().map(|c| c.as_str()).unwrap_or(""); | ||
| if last_chunk.chars().count() + footer.chars().count() >= 2000 { | ||
| let _ = channel.say(&ctx.http, footer).await; |
There was a problem hiding this comment.
When the footer doesn’t fit and is sent as a separate message, it currently includes a leading newline from format_context_footer, which will render as an empty first line in Discord. Consider sending a newline-less variant when posting the footer as a standalone message (or trimming leading \n in this branch).
| let _ = channel.say(&ctx.http, footer).await; | |
| let standalone_footer = footer.trim_start_matches('\n'); | |
| let _ = channel.say(&ctx.http, standalone_footer).await; |
| /// atomic stores, so readers may observe a mixed snapshot (e.g., new `used` | ||
| /// with old `size`). This is acceptable for best-effort display purposes; | ||
| /// any consumer requiring a consistent pair should use the snapshot helper. |
There was a problem hiding this comment.
The struct doc comment mentions “any consumer requiring a consistent pair should use the snapshot helper,” but there doesn’t appear to be a snapshot helper in this module. Either add the referenced helper (e.g., a method that reads both atomics consistently) or adjust the comment so it doesn’t point to a non-existent API.
| /// atomic stores, so readers may observe a mixed snapshot (e.g., new `used` | |
| /// with old `size`). This is acceptable for best-effort display purposes; | |
| /// any consumer requiring a consistent pair should use the snapshot helper. | |
| /// atomic stores, so readers may observe a mixed pair (e.g., new `used` | |
| /// with old `size`). This is acceptable for best-effort display purposes. | |
| /// Code that requires a coordinated view of both values must use additional | |
| /// synchronization instead of reading these atomics independently. |
- Use <= 2000 for footer append check (fixes off-by-one at boundary)
- Use > 2000 for separate-message condition (consistent with above)
- trim_start_matches('\n') on standalone footer (no empty first line)
- Remove reference to non-existent snapshot helper in doc comment
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Footer string no longer carries a leading '\n' — callers add the separator explicitly. This eliminates the trim_start_matches hack when sending the footer as a standalone message. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Store ctx_size before ctx_used to reduce race where reader sees size=0 (sentinel for "no data") while used is already updated - Add "Context:" label to footer to match PR description format Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
Track ACP context window usage from
usage_updatesession notifications and display it in Discord after each prompt response. Also adds project-specific Copilot code review instructions.Changes
src/acp/protocol.rsUsageUpdate { used, size }variant toAcpEvent; parse via?— returnsNoneif either field is missingsrc/acp/connection.rscontext_used/context_sizeAtomicU64fields; capture values fromusage_updatevia sharedclassify_notificationparsersrc/discord.rsformat_context_footer()helper with 4 unit tests.github/copilot-instructions.mdHow it works
Discord display
After each agent response, a small footer line appears:
-#small text syntax (renders as subdued footer)context_size > 0(at least oneusage_updatereceived)usage_update(e.g. Copilot)Design decisions
context_size == 0means unknown —context_usedcan legitimately be 0 (empty context), socontext_sizeis the sentinel?instead ofunwrap_or(0)— missing fields ->None-> counters unchanged (no silent clobbering)classify_notification— parsing logic stays inprotocol.rs, reader loop delegatesOrdering::Relaxed— sufficient for best-effort metric counters;sizestored beforeusedto reduce sentinel raceformat_context_footer()helper — pure function, separately testable; percentage clamped to 100 viau128integer mathchars().count()for Discord limit — Discord's 2000 limit is character-based, not byte-basedTests
4 new unit tests for
format_context_footer:context_footer_none_when_size_zero— returnsNonewhen size is 0context_footer_normal_percentage— correct format with "Context:" labelcontext_footer_clamps_over_100—used > sizeclamped to 100%context_footer_100_percent— exact 100% boundaryCopilot instructions
The
.github/copilot-instructions.mdteaches Copilot's automated PR reviewer about OpenAB-specific concerns:.await, session pool lock scope)>80%confidence threshold