feat: add Limits and support it during invoke/stream by notowen333 · Pull Request #2360 · strands-agents/sdk-python

notowen333 · 2026-05-28T17:01:33Z

Description

Motivation

An agent loop today runs until the model stops requesting tools, a hook halts it, or it's cancelled. There's no caller-side guardrail on cost or runaway behavior — anyone wanting to bound a run has to wire a hook themselves or wrap the call in a timeout that races against the model rather than terminating cleanly at a turn boundary. This adds first-class, per-invocation budget caps so callers can put a ceiling on a single invoke_async / stream_async without fighting the loop.

Ports the equivalent TypeScript feature merged in strands-agents/sdk-typescript#1106.

Public API Changes

Agent.__call__, Agent.invoke_async, and Agent.stream_async accept an optional limits argument with three caps:

# Cap loop iterations
result = await agent.invoke_async("Plan a trip", limits={"turns": 5})
if result.stop_reason == "limit_turns":
    ...

# Cap cumulative model-generated tokens across the loop
result = await agent.invoke_async("Summarize", limits={"output_tokens": 10_000})
if result.stop_reason == "limit_output_tokens":
    ...

# Cap cumulative input + output tokens (compounds across turns; approximates billed spend)
result = await agent.invoke_async("Research X", limits={"total_tokens": 100_000})

# Combine
result = await agent.invoke_async("Plan a trip", limits={
    "turns": 10, "output_tokens": 50_000, "total_tokens": 200_000,
})

When a cap is reached the loop terminates gracefully at the next turn boundary and returns an AgentResult with one of three new stop_reason values: "limit_turns", "limit_output_tokens", "limit_total_tokens". No exception is raised — this mirrors how "cancelled" already works, and leaves result.message and result.metrics accessible.
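A caller would typically branch on these three values after the call returns. The helper below is a minimal sketch, not part of the SDK: `LIMIT_STOP_REASONS`, `hit_limit`, and `tripped_cap` are hypothetical names, and only the three stop reason strings come from this PR.

```python
from typing import Optional

# The three stop reasons introduced by this PR. The helpers are hypothetical.
LIMIT_STOP_REASONS = frozenset(
    {"limit_turns", "limit_output_tokens", "limit_total_tokens"}
)


def hit_limit(stop_reason: str) -> bool:
    """Return True when the run ended because a caller-set cap was reached."""
    return stop_reason in LIMIT_STOP_REASONS


def tripped_cap(stop_reason: str) -> Optional[str]:
    """Map a limit stop reason back to the limits key that tripped, else None."""
    if not hit_limit(stop_reason):
        return None
    # "limit_turns" -> "turns", "limit_total_tokens" -> "total_tokens", ...
    return stop_reason[len("limit_"):]
```

With a real result this would be called as `tripped_cap(result.stop_reason)`. Note that `"max_tokens"` is deliberately not in the set, since that reason still raises rather than returning gracefully.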

The existing "max_tokens" reason (and MaxTokensReachedException) is unchanged. That one signals the model provider's per-call output cap and still raises; the new caller-set caps are graceful.

Caps are scoped to a single invocation: counters reset on each invoke_async / stream_async call against the same agent. Tools requested by the previous turn always run to completion before a cap fires, so agent.messages stays in a state the agent can be reinvoked from. If several caps trip at the same turn boundary, the reported reason follows the priority turns → total_tokens → output_tokens, giving the most informative reason in the result. Caps are soft: checks happen at turn boundaries, not within an individual model call, so a single oversized model response can overshoot the budget by one turn.
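The boundary check and its priority order can be illustrated with a small standalone simulation. This is a sketch under stated assumptions, not the code in event_loop.py: `check_limits` and `run_loop` are hypothetical names, the `>=` comparison is an assumption, and `"end_turn"` stands in for a normal finish.

```python
from typing import Optional

# Priority order described in this PR: turns, then total_tokens, then output_tokens.
_PRIORITY = ("turns", "total_tokens", "output_tokens")


def check_limits(limits, turns, input_tokens, output_tokens) -> Optional[str]:
    """Return the stop reason for the highest-priority tripped cap, or None."""
    used = {
        "turns": turns,
        "total_tokens": input_tokens + output_tokens,
        "output_tokens": output_tokens,
    }
    for key in _PRIORITY:
        cap = limits.get(key)
        # Assumption: a cap trips once usage reaches it (>=), not only beyond it.
        if cap is not None and used[key] >= cap:
            return f"limit_{key}"
    return None


def run_loop(limits, turn_usages):
    """Simulate one invocation; each item is (input_tokens, output_tokens) per turn."""
    turns = in_tok = out_tok = 0  # counters start fresh for every invocation
    for turn_in, turn_out in turn_usages:
        # Checks happen only at the boundary, before the next model call starts.
        reason = check_limits(limits, turns, in_tok, out_tok)
        if reason is not None:
            return reason, turns, out_tok
        turns += 1
        in_tok += turn_in
        out_tok += turn_out
    return "end_turn", turns, out_tok
```

For example, `run_loop({"output_tokens": 100}, [(10, 150), (10, 10)])` returns `("limit_output_tokens", 1, 150)`: the first response alone overshoots the cap, and the loop only notices at the following boundary.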

Each cap, when set, must be a positive int; invalid values raise TypeError at the start of the invocation. The change is backward compatible: omitting limits preserves prior behavior.
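The validation rule can be sketched as follows. This is an illustration only: the `Limits` TypedDict shape is inferred from the three keys above, `validate_limits` is a hypothetical stand-in for the SDK's own check, and rejecting `bool` values is an assumption.

```python
from typing import Optional, TypedDict


class Limits(TypedDict, total=False):
    """Assumed shape: every cap is optional."""

    turns: int
    output_tokens: int
    total_tokens: int


def validate_limits(limits: Optional[Limits]) -> None:
    """Raise TypeError if any cap that is set is not a positive int."""
    if limits is None:
        return  # omitting limits preserves prior behavior
    for key in ("turns", "output_tokens", "total_tokens"):
        value = limits.get(key)
        if value is None:
            continue  # cap not set
        # Assumption: bool is rejected even though it subclasses int.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise TypeError(f"limits[{key!r}] must be a positive int, got {value!r}")
```

Iterating over a tuple of literal keys is what lets a type checker narrow `key` to a literal union, which matches the note in the follow-up commit about dropping the type: ignore.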

Why a `limits` dict instead of top-level kwargs

Grouping under limits keeps the top-level kwarg surface stable as future caps (wall-clock, cost, etc.) are added. The stop reasons keep a limit_ prefix rather than collapsing to a single "limit_exceeded" so granular telemetry / log buckets are preserved — callers shouldn't have to derive which cap tripped from result.metrics plus the limits they passed in.

Use Cases

Cost ceilings on user-facing agents — bound an interactive turn so a misbehaving model can't burn an unbounded number of tokens before the user sees a response.
Eval / batch harnesses — give every run a fixed turns budget so a single stuck agent doesn't pin a worker.
Cheap timeouts at turn boundaries — terminate cleanly with a real stop_reason instead of cancelling mid-stream.

Related Issues

#2124

Documentation PR

N/A — will follow up.

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

I ran hatch run prepare

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

codecov · 2026-05-28T17:04:08Z

Codecov Report

❌ Patch coverage is 92.00000% with 4 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
strands-py/src/strands/event_loop/event_loop.py	84.61%	2 Missing and 2 partials ⚠️


feat: add Limits and support it during invoke/stream

bfa053a

notowen333 requested a review from zastrowm May 28, 2026 17:01

github-actions Bot added the size/m label May 28, 2026

notowen333 had a problem deploying to auto-approve May 28, 2026 17:01 — with GitHub Actions Failure

notowen333 requested a deployment to manual-approval May 28, 2026 17:01 — with GitHub Actions Waiting

zastrowm previously approved these changes May 28, 2026


fix: remove unnecessary type: ignore in _validate_limits

7e6fff1

mypy narrows the loop key to a literal union, so the TypedDict access no longer needs the literal-required ignore.

notowen333 dismissed zastrowm’s stale review via 7e6fff1 May 28, 2026 18:58

zastrowm approved these changes May 28, 2026


notowen333 had a problem deploying to manual-approval May 28, 2026 19:10 — with GitHub Actions Error

github-actions Bot added size/m and removed size/m labels May 28, 2026

notowen333 had a problem deploying to auto-approve May 28, 2026 19:10 — with GitHub Actions Failure

notowen333 enabled auto-merge (squash) May 28, 2026 19:58

notowen333 merged commit 9c17f2e into strands-agents:main May 28, 2026
36 of 40 checks passed

github-actions Bot mentioned this pull request May 29, 2026

chore: sync strands-agents/sdk-typescript into monorepo #2363

Merged

7 tasks

notowen333 merged 2 commits into strands-agents:main from notowen333:feat/per-invocation-limits
