agent(prompts): move per-step metadata out of <agent_state> into a tail block #4891

Merged: sauravpanda merged 2 commits into main from prompt-cache/relocate-per-step-metadata on May 23, 2026

@sauravpanda (Collaborator) commented May 23, 2026

Context

Closes part of #4887 (item #3 — strip per-step metadata from anything prefix-stable).

AgentMessagePrompt._get_agent_state_description() was rendering two per-step-varying values inside <agent_state>:

  • Step{N+1} maximum:{M} — changes every step.
  • datetime.now().strftime('%Y-%m-%d') — changes daily.

The user message currently looks like:

<agent_history>...</agent_history>          ← grows append-only (prefix-stable if HistoryItem is stable)
<agent_state>...<step_info>...</...> </agent_state>   ← cache miss starts here today
<browser_state>...</browser_state>
<read_state>...</read_state>

So the cache boundary already lands at <agent_state>, and the step counter inside it isn't actively busting a live cache. But the layout meant that any future move of the more stable <agent_state> fields (user_request, file_system, todo_contents) into the system prompt, or anywhere else we'd want to cache them, would still leave per-step-varying bytes sitting inside the would-be prefix. That silently caps how far the cache can ever extend.
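To make the "cap" concrete, here is a minimal sketch that measures the shared prefix of two consecutive steps under each layout. It is illustrative only: `old_layout`, `new_layout`, and `STABLE_FIELDS` are hypothetical stand-ins, not the real `AgentMessagePrompt` code, and the history and browser blocks are held constant so that only the step counter varies.

```python
import os

# Stand-in for the stable <agent_state> fields (an assumption for this sketch).
STABLE_FIELDS = "<user_request>demo task</user_request>"


def old_layout(step: int, max_steps: int) -> str:
    # Step counter rendered inside <agent_state>, as before this change.
    return (
        "<agent_history>...</agent_history>\n"
        f"<agent_state>{STABLE_FIELDS}"
        f"<step_info>Step{step} maximum:{max_steps}</step_info></agent_state>\n"
        "<browser_state>...</browser_state>\n"
    )


def new_layout(step: int, max_steps: int) -> str:
    # Step counter moved to the tail, after every other block.
    return (
        "<agent_history>...</agent_history>\n"
        f"<agent_state>{STABLE_FIELDS}</agent_state>\n"
        "<browser_state>...</browser_state>\n"
        f"<step_info>Step{step} maximum:{max_steps}</step_info>"
    )


def shared_prefix(a: str, b: str) -> str:
    # os.path.commonprefix compares character by character, so it works on any strings.
    return os.path.commonprefix([a, b])


old_shared = shared_prefix(old_layout(3, 10), old_layout(4, 10))
new_shared = shared_prefix(new_layout(3, 10), new_layout(4, 10))

print("</agent_state>" in old_shared)    # False: the prefix ends inside <agent_state>
print("</browser_state>" in new_shared)  # True: the prefix runs up to the tail block
```

In the old layout the shared prefix stops at the step digit inside <agent_state>, no matter how stable the fields around it are. In the new layout it only stops once it reaches the trailing <step_info>.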

Change

Pull the step counter + date into a new helper _get_step_meta_description() and append it at the very tail of get_user_message(), after <agent_state>, <browser_state>, <read_state>, <page_specific_actions>, and the unavailable-skills info block. The new layout:

<agent_history>...</agent_history>
<agent_state>...</agent_state>              ← no more <step_info> inside
<browser_state>...</browser_state>
<read_state>...</read_state>
<page_specific_actions>...</page_specific_actions>
[unavailable_skills_info]
<step_info>Step{N} maximum:{M}\nToday:{YYYY-MM-DD}</step_info>   ← suffix, explicitly per-step

Everything above <step_info> is now eligible to be treated as the cacheable region — when/if we want to push that boundary further out, no per-step varying bytes are in the way.
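The assembly order described above could be sketched roughly as follows. This is not the actual diff: the function names, signatures, and the `today` override are assumptions made for illustration, and only the <step_info> format string is taken from the layout shown in this description.

```python
from datetime import datetime
from typing import List, Optional


def get_step_meta_description(
    step_number: int, max_steps: int, today: Optional[str] = None
) -> str:
    """Render the per-step tail block (hypothetical free-function version of the helper)."""
    # `today` is injectable so tests can pin the date; the default mirrors the PR text.
    today = today or datetime.now().strftime("%Y-%m-%d")
    return f"<step_info>Step{step_number} maximum:{max_steps}\nToday:{today}</step_info>"


def build_user_message(
    stable_blocks: List[str],
    step_number: int,
    max_steps: int,
    today: Optional[str] = None,
) -> str:
    """Join the step-independent blocks, then append the step metadata last."""
    parts = [block for block in stable_blocks if block]  # skip empty optional blocks
    parts.append(get_step_meta_description(step_number, max_steps, today))
    return "\n".join(parts)


message = build_user_message(
    [
        "<agent_history>...</agent_history>",
        "<agent_state>...</agent_state>",
        "<browser_state>...</browser_state>",
        "<read_state>...</read_state>",
        "",  # e.g. no <page_specific_actions> on this step
    ],
    step_number=4,
    max_steps=10,
    today="2026-05-23",
)
print(message.splitlines()[-2:])
```

Keeping the per-step values in one helper that is always appended last means a new optional block can be added without anyone having to remember where the cache-unfriendly bytes live.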

Tests

New regression tests at tests/ci/test_prompt_step_meta_suffix.py:

  • <step_info> appears after both <agent_state> and <browser_state>.
  • <step_info> does not leak back into <agent_state>.
  • Bytes before <step_info> are byte-identical across two different step numbers (proves the step counter isn't in the prefix).
  • <agent_state> block is byte-identical across step numbers.
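For readers who want the shape of these checks without opening the test file, the sketch below restates them against a stand-in renderer. `render` and `agent_state_block` are hypothetical helpers written for this example. The real tests in tests/ci/test_prompt_step_meta_suffix.py presumably exercise `AgentMessagePrompt.get_user_message()` and may be structured differently.

```python
def render(step: int, max_steps: int = 10) -> str:
    """Hypothetical stand-in for get_user_message(); not the real implementation."""
    return "\n".join(
        [
            "<agent_history>...</agent_history>",
            "<agent_state><user_request>demo</user_request></agent_state>",
            "<browser_state>...</browser_state>",
            f"<step_info>Step{step} maximum:{max_steps}\nToday:2026-05-23</step_info>",
        ]
    )


def agent_state_block(message: str) -> str:
    """Return the <agent_state>...</agent_state> slice of a rendered message."""
    start = message.index("<agent_state>")
    end = message.index("</agent_state>") + len("</agent_state>")
    return message[start:end]


def test_step_info_is_after_state_blocks() -> None:
    msg = render(step=1)
    pos = msg.index("<step_info>")
    assert pos > msg.index("</agent_state>")
    assert pos > msg.index("</browser_state>")


def test_step_info_not_in_agent_state() -> None:
    assert "<step_info>" not in agent_state_block(render(step=1))


def test_prefix_identical_across_steps() -> None:
    a, b = render(step=2), render(step=7)
    # Compare encoded bytes, since cache keys are a property of bytes, not characters.
    assert a.split("<step_info>")[0].encode() == b.split("<step_info>")[0].encode()
    assert agent_state_block(a) == agent_state_block(b)
```

The prefix test is the one that guards the caching property directly. The ordering and leakage tests mainly make a failure easier to diagnose.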

Test plan

  • New tests pass.
  • Existing prompt / message_manager tests still pass (pytest tests/ci -k 'prompt or message_manager or agent_message').
  • pyright + ruff clean via pre-commit.
  • Eyeball one real agent loop to confirm the model still parses <step_info> correctly at the tail (no expected change in behavior — the LLM doesn't care about position).

Summary by cubic

Moved per-step metadata (step counter and date) out of <agent_state> into a trailing <step_info> block so the user-message prefix is stable for caching. Preps the prompt layout for deeper caching and covers part of #4887.

  • Refactors
    • Added _get_step_meta_description() and appended it at the end of get_user_message(), after the agent, browser, read, page-actions, and unavailable-skills blocks.
    • Removed per-step <step_info> from <agent_state> so all bytes before <step_info> are stable across steps.
    • Added tests to lock ordering, prevent leakage into <agent_state>, and verify a byte-identical prefix and <agent_state> across step numbers.

Written for commit b06b47a.

…il block

The step counter (Step N maximum:M) and datetime.now() were rendered
inside <agent_state>, ahead of <browser_state> in the user message.
The cache miss already happens at the <agent_state> boundary today, so
this isn't a live cache regression — but the layout meant that any
future move of more-stable agent_state fields into the system prompt
would still leave per-step varying bytes in the middle of the prefix,
silently capping how far the cache could extend.

Pull both fields into a new _get_step_meta_description() and append it
at the very end of get_user_message(), after <agent_state>,
<browser_state>, <read_state>, <page_specific_actions>, and unavailable-
skills info. Everything above this tail block is now eligible to be
treated as the cacheable region.

Adds regression tests that lock the layout:
- <step_info> must appear after <agent_state> and <browser_state>
- <step_info> must not leak back into <agent_state>
- bytes before <step_info> must be identical across two different step
  numbers (the step counter must not be in the prefix)

github-actions Bot commented May 23, 2026

Agent Task Evaluation Results: 2/2 (100%)

Task | Result | Reason
browser_use_pip | ✅ Pass | Skipped - API key not available (fork PR or missing secret)
amazon_laptop | ✅ Pass | Skipped - API key not available (fork PR or missing secret)

Check the evaluate-tasks job for detailed task execution logs.

@cubic-dev-ai Bot (Contributor) left a comment

No issues found across 2 files


@sauravpanda sauravpanda merged commit 640360e into main May 23, 2026
99 checks passed
@sauravpanda sauravpanda deleted the prompt-cache/relocate-per-step-metadata branch May 23, 2026 01:32
r266-tech pushed a commit to r266-tech/browser-use that referenced this pull request May 26, 2026
…il block (browser-use#4891)
