feat(context): track context tokens #2009
Conversation
Codecov Report ✅ All modified and coverable lines are covered by tests.
Assessment: Approve ✅ Clean, minimal implementation that addresses issue #1197 with zero overhead. The tests are comprehensive and cover all edge cases well.

Nice work keeping this focused and backward-compatible! 🎉
Force-pushed 7ed59d7 to 9ff32d5
Thanks for addressing the docstring feedback! Assessment: ✅ Approve. Ready to merge.
Hi @lizradway, Thanks for this PR! I've been hoping for this kind of context token tracking in Strands for a while. One concern: this works well when the agent instance stays in memory, but when the agent is restored from a session (e.g., via RepositorySessionManager), EventLoopMetrics is re-initialized from scratch, so latest_context_tokens returns 0 until the next LLM call completes. This could cause issues for smart compaction strategies that rely on this value to make threshold-based decisions immediately after restoration. To address this, I think we'd also need per-message metrics metadata that gets persisted and restored alongside the messages themselves. I opened #1532 with a proposal for adding a _metadata field to messages for this purpose; it could complement this PR nicely.
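To illustrate the restoration gap described above, here is a minimal, self-contained sketch. The message shape, the `_metadata` field name, and the `context_tokens` key are hypothetical (modeled on the #1532 proposal, not an actual Strands API): if only live metrics hold the token count, a restored agent reads 0 until the next LLM call, but a per-message metadata fallback can answer immediately after restore.

```python
import json

# Messages as a session manager might persist them (hypothetical shape,
# following the per-message _metadata idea proposed in #1532).
saved_messages = [
    {"role": "user", "content": "...", "_metadata": {}},
    {"role": "assistant", "content": "...", "_metadata": {"context_tokens": 3400}},
]


def latest_context_tokens_after_restore(messages):
    # Live EventLoopMetrics was re-initialized, so fall back to the newest
    # message that carries a persisted context_tokens value.
    for msg in reversed(messages):
        tokens = msg.get("_metadata", {}).get("context_tokens")
        if tokens is not None:
            return tokens
    return 0  # mirrors the behavior described above: no data until the next call


restored = json.loads(json.dumps(saved_messages))  # round-trip through storage
print(latest_context_tokens_after_restore(restored))  # 3400
```

With this fallback, a compaction strategy can make its threshold decision right after restoration instead of waiting for one more LLM call.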
Force-pushed 3771ebd to 516837d
Assessment: Approve (pending API review) ✅ The PR has evolved well based on review feedback.

Code Quality: ✅ Clean implementation, comprehensive tests, proper type hints. Awaiting API reviewer sign-off.
Description
Add a `latest_context_size` property to `EventLoopMetrics` and a `context_size` convenience property on `AgentResult` that surface the most recent context window size (in tokens) as reported by the model.

Strands currently has no visibility into how full the context window is. The LLM already reports `inputTokens` on every call, but that data was buried in per-cycle usage metrics. These properties expose it as a simple, top-level read.

This is a lagging indicator (post-call), not a pre-call estimate. It enables downstream features like compression and externalization to make threshold-based decisions between invocations.
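The behavior the description outlines can be sketched as follows. This is a minimal stand-in, not the Strands source: the `EventLoopMetricsSketch` and `AgentResultSketch` classes and the `cycle_usages` field are hypothetical, but the property names and the delegation pattern match the PR description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Usage:
    # Mirrors the inputTokens value the model reports per call.
    inputTokens: Optional[int] = None


@dataclass
class EventLoopMetricsSketch:
    cycle_usages: list = field(default_factory=list)

    @property
    def latest_context_size(self) -> Optional[int]:
        # Newest cycle that reported inputTokens wins; None before any call.
        for usage in reversed(self.cycle_usages):
            if usage.inputTokens is not None:
                return usage.inputTokens
        return None


@dataclass
class AgentResultSketch:
    metrics: EventLoopMetricsSketch

    @property
    def context_size(self) -> Optional[int]:
        # Pure delegation: the convenience property just reads the metrics.
        return self.metrics.latest_context_size


metrics = EventLoopMetricsSketch()
metrics.cycle_usages.append(Usage(inputTokens=1200))
metrics.cycle_usages.append(Usage(inputTokens=3400))
result = AgentResultSketch(metrics)
print(result.context_size)  # 3400, the most recent call's input token count
```

Because the value is read from already-recorded usage data, the property adds no work to the event loop itself, consistent with the "zero overhead" observation in the review.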
- `inputTokens` already returned by every LLM call
- `Usage` TypedDict with `inputTokens`

Related Issues
Closes #1197
Documentation PR
N/A — no user-facing documentation changes needed for this property addition.
Type of Change
New feature
Testing
- `hatch run prepare`
- `EventLoopMetrics.latest_context_size`: no invocations, no cycles, multiple cycles, multiple invocations, missing `inputTokens` key
- `AgentResult.context_size`: delegates to metrics, returns `None` when no data

Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
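The edge cases listed under Testing can be sketched as bare assertions against a minimal stand-in. The dict-based `latest_context_size` helper below is hypothetical; the real tests exercise Strands' `EventLoopMetrics` and `AgentResult`.

```python
def latest_context_size(cycle_usages):
    # Stand-in for the property: the newest cycle that reports
    # inputTokens wins; None when no cycle has reported it yet.
    for usage in reversed(cycle_usages):
        if "inputTokens" in usage:
            return usage["inputTokens"]
    return None


# No invocations / no cycles: nothing reported yet.
assert latest_context_size([]) is None
# Multiple cycles: the most recent value wins.
assert latest_context_size([{"inputTokens": 10}, {"inputTokens": 25}]) == 25
# Missing inputTokens key is skipped, not an error.
assert latest_context_size([{"inputTokens": 10}, {}]) == 10
print("all cases pass")
```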