agent(prompts): move per-step metadata out of <agent_state> into a tail block by sauravpanda · Pull Request #4891 · browser-use/browser-use

sauravpanda · 2026-05-23T00:58:34Z

Context

Closes part of #4887 (item #3 — strip per-step metadata from anything prefix-stable).

AgentMessagePrompt._get_agent_state_description() was rendering two per-step-varying values inside <agent_state>:

- Step{N+1} maximum:{M} (changes every step).
- datetime.now().strftime('%Y-%m-%d') (changes daily).

The user message currently looks like:

<agent_history>...</agent_history>          ← grows append-only (prefix-stable if HistoryItem is stable)
<agent_state>...<step_info>...</...> </agent_state>   ← cache miss starts here today
<browser_state>...</browser_state>
<read_state>...</read_state>

So the cache boundary already lands at <agent_state>, and the step counter inside it isn't actively busting a live cache. But the layout meant that any future move of the more stable <agent_state> fields (user_request, file_system, todo_contents) into the system prompt, or anywhere else we'd want to cache them, would still leave per-step varying bytes sitting inside the would-be prefix. That silently caps how far the cache can ever extend.
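To make the capping effect concrete, here is a minimal sketch that measures the shared prefix of two consecutive prompts under the old layout. The `old_layout` function and its field contents are invented for illustration and are not the real `AgentMessagePrompt` output; character-level comparison is only a rough proxy for how a provider's token-level prefix cache behaves.

```python
from os.path import commonprefix


def old_layout(step: int, max_steps: int) -> str:
    """Hypothetical stand-in for the pre-change user message; field contents are invented."""
    return (
        "<agent_state>"
        "<user_request>find a laptop</user_request>"
        f"<step_info>Step{step} maximum:{max_steps}</step_info>"
        "</agent_state>\n"
        "<browser_state>same page as before</browser_state>"
    )


# Two consecutive steps where nothing but the step counter changed.
step_3 = old_layout(3, 100)
step_4 = old_layout(4, 100)

# commonprefix compares character by character, a rough proxy for prefix-cache matching.
shared = commonprefix([step_3, step_4])

print(len(shared), "of", len(step_3), "characters are shared")
print("browser_state reusable:", "<browser_state>" in shared)
```

In this toy case the shared prefix stops at the step counter, so the unchanged `<browser_state>` bytes after it can never be part of a cached prefix, however stable they are.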

Change

Pull the step counter + date into a new helper _get_step_meta_description() and append it at the very tail of get_user_message(), after <agent_state>, <browser_state>, <read_state>, <page_specific_actions>, and the unavailable-skills info block. The new layout:

<agent_history>...</agent_history>
<agent_state>...</agent_state>              ← no more <step_info> inside
<browser_state>...</browser_state>
<read_state>...</read_state>
<page_specific_actions>...</page_specific_actions>
[unavailable_skills_info]
<step_info>Step{N} maximum:{M}\nToday:{YYYY-MM-DD}</step_info>   ← suffix, explicitly per-step

Everything above <step_info> is now eligible to be treated as the cacheable region — when/if we want to push that boundary further out, no per-step varying bytes are in the way.
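The tail-append layout can be sketched roughly as follows. Only the names `_get_step_meta_description()` and `get_user_message()` and the `<step_info>` format come from this PR; the `PromptSketch` class, its fields, and the injected `today` value are assumptions made for illustration and do not reflect the real `AgentMessagePrompt` signature.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PromptSketch:
    """Hypothetical, simplified stand-in for AgentMessagePrompt."""

    history: str
    agent_state: str
    browser_state: str
    read_state: str
    step: int
    max_steps: int
    today: date  # injected here so the sketch is deterministic (an assumption)

    def _get_step_meta_description(self) -> str:
        # All per-step varying bytes live in this one block.
        return (
            f"<step_info>Step{self.step} maximum:{self.max_steps}\n"
            f"Today:{self.today:%Y-%m-%d}</step_info>"
        )

    def get_user_message(self) -> str:
        blocks = [
            f"<agent_history>{self.history}</agent_history>",
            f"<agent_state>{self.agent_state}</agent_state>",
            f"<browser_state>{self.browser_state}</browser_state>",
            f"<read_state>{self.read_state}</read_state>",
        ]
        # Appended last, so nothing above it depends on the step number or date.
        blocks.append(self._get_step_meta_description())
        return "\n".join(blocks)


message = PromptSketch(
    history="...",
    agent_state="<user_request>demo</user_request>",
    browser_state="...",
    read_state="...",
    step=2,
    max_steps=100,
    today=date(2026, 5, 23),
).get_user_message()
print(message)
```

Keeping the per-step values in a single helper means there is one place to audit if another volatile field needs to join the suffix later.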

Tests

New regression tests at tests/ci/test_prompt_step_meta_suffix.py:

- <step_info> appears after both <agent_state> and <browser_state>.
- <step_info> does not leak back into <agent_state>.
- Bytes before <step_info> are byte-identical across two different step numbers (proves the step counter isn't in the prefix).
- <agent_state> block is byte-identical across step numbers.
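The four checks above could look roughly like this in pytest style. This is a sketch against a hypothetical `render_user_message` helper, not the contents of tests/ci/test_prompt_step_meta_suffix.py; the real tests presumably build an actual `AgentMessagePrompt`.

```python
def render_user_message(step: int, max_steps: int = 100) -> str:
    """Hypothetical renderer that follows the new layout; not the real AgentMessagePrompt."""
    return "\n".join(
        [
            "<agent_history>...</agent_history>",
            "<agent_state><user_request>demo</user_request></agent_state>",
            "<browser_state>...</browser_state>",
            f"<step_info>Step{step} maximum:{max_steps}</step_info>",
        ]
    )


def agent_state_block(message: str) -> str:
    """Return the <agent_state>...</agent_state> slice of a rendered message."""
    start = message.index("<agent_state>")
    end = message.index("</agent_state>") + len("</agent_state>")
    return message[start:end]


def test_step_info_comes_after_state_blocks():
    msg = render_user_message(step=1)
    assert msg.index("<step_info>") > msg.index("<agent_state>")
    assert msg.index("<step_info>") > msg.index("<browser_state>")


def test_step_info_not_inside_agent_state():
    assert "<step_info>" not in agent_state_block(render_user_message(step=1))


def test_prefix_identical_across_steps():
    first = render_user_message(step=1)
    second = render_user_message(step=7)
    # Everything before the tail block must not depend on the step number.
    assert first.split("<step_info>")[0] == second.split("<step_info>")[0]


def test_agent_state_identical_across_steps():
    first = agent_state_block(render_user_message(step=1))
    second = agent_state_block(render_user_message(step=7))
    assert first == second
```

Comparing everything before `<step_info>` byte for byte is the strongest of the four checks, since it fails if any per-step value reappears anywhere in the prefix, not only inside `<agent_state>`.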

Test plan

- [x] New tests pass.
- [x] Existing prompt / message_manager tests still pass (pytest tests/ci -k 'prompt or message_manager or agent_message').
- [x] pyright + ruff clean via pre-commit.
- [ ] Eyeball one real agent loop to confirm the model still parses <step_info> correctly at the tail (no expected change in behavior; the LLM doesn't care about position).

Summary by cubic

Moved per-step metadata (step counter and date) out of <agent_state> into a trailing <step_info> block so the user-message prefix is stable for caching. Preps the prompt layout for deeper caching and covers part of #4887.

Refactors
- Added _get_step_meta_description() and appended its output at the end of get_user_message(), after the agent, browser, read, page-actions, and unavailable-skills blocks.
- Removed per-step <step_info> from <agent_state> so all bytes before <step_info> are stable across steps.
- Added tests to lock ordering, prevent leakage into <agent_state>, and verify a byte-identical prefix and <agent_state> across step numbers.

Written for commit b06b47a. Summary will update on new commits.

…il block

The step counter (Step N maximum:M) and datetime.now() were rendered inside <agent_state>, ahead of <browser_state> in the user message. The cache miss already happens at the <agent_state> boundary today, so this isn't a live cache regression, but the layout meant that any future move of more-stable agent_state fields into the system prompt would still leave per-step varying bytes in the middle of the prefix, silently capping how far the cache could extend.

Pull both fields into a new _get_step_meta_description() and append it at the very end of get_user_message(), after <agent_state>, <browser_state>, <read_state>, <page_specific_actions>, and unavailable-skills info. Everything above this tail block is now eligible to be treated as the cacheable region.

Adds regression tests that lock the layout:
- <step_info> must appear after <agent_state> and <browser_state>
- <step_info> must not leak back into <agent_state>
- bytes before <step_info> must be identical across two different step numbers (the step counter must not be in the prefix)

github-actions · 2026-05-23T00:59:14Z

Agent Task Evaluation Results: 2/2 (100%)

Task	Result	Reason
browser_use_pip	✅ Pass	Skipped - API key not available (fork PR or missing secret)
amazon_laptop	✅ Pass	Skipped - API key not available (fork PR or missing secret)

Check the evaluate-tasks job for detailed task execution logs.

cubic-dev-ai

No issues found across 2 files

cubic-dev-ai Bot reviewed May 23, 2026

Merge branch 'main' into prompt-cache/relocate-per-step-metadata

b06b47a

sauravpanda merged commit 640360e into main May 23, 2026
99 checks passed

sauravpanda deleted the prompt-cache/relocate-per-step-metadata branch May 23, 2026 01:32

sauravpanda merged 2 commits into main from prompt-cache/relocate-per-step-metadata
