Context
Browser Use Desktop now has a provider-neutral skills pattern that should be adapted for browser-use-terminal.
The important Desktop idea is not "browser-only skills" and not automatic URL memory. It is progressive procedural memory:
- Keep reusable instructions as local skill files.
- Inject only a compact metadata index into the agent prompt.
- Require the agent to explicitly search/view a skill before relying on full instructions.
- Let the agent create, patch, validate, or delete only persistent user skills under controlled rules.
- Treat skills as broad procedural memory: browser workflows, terminal workflows, debugging recoveries, repo conventions, output/reporting preferences, and recurring user processes.
Gregor's old per-task skill-memory experiment is useful mainly as prior art for the reflection checklist after a task: look for failure recovery, retries, uncertainty, stable selectors, auth quirks, API shapes, CLI commands, config paths, data formats, and verification steps that future agents should not rediscover. We should not copy its browser-only URL memory architecture.
browser-use-terminal already has adjacent pieces:
- prompts/browser-agent-system.md explains the agent contract and browser-harness workflow.
- prompts/interaction-skills/ contains read-only browser mechanics guidance.
- crates/browser-use-core/src/tools/mod.rs owns the provider-neutral tool registry.
- crates/browser-use-store owns the state dir and session event stream.
- crates/browser-use-tui renders session events in the terminal UI.
Proposal
Add progressive skills to browser-use-terminal as reusable procedural memory.
1. Storage layout
Use the existing state-dir model:
.browser-use-terminal/
  skills/
    workflow/<name>/SKILL.md
    browser/<name>/SKILL.md
    debugging/<name>/SKILL.md
    repo/<name>/SKILL.md
Bundled prompt skills stay read-only. User-created skills live under the state dir so they persist across sessions and are not overwritten by app updates.
A user SKILL.md should use simple frontmatter:
---
name: crm-triage
summary: Reusable CRM queue triage workflow after repeated account checks
---
# CRM Triage
Use when...
## Steps
...
## Verification
...
## Gotcha
X What seems obvious but fails
V What actually works
The X/V section is optional. It is useful for sharp lessons learned from a run, but the skill should stay one coherent reusable procedure rather than one tiny skill per gotcha.
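A minimal sketch of how the frontmatter above could be read, assuming only flat `key: value` lines between the two `---` markers. `SkillMeta` and `parse_skill` are hypothetical names for illustration, not existing items in the repo, and a real implementation might prefer a proper YAML parser.

```rust
/// Metadata parsed from the frontmatter of a SKILL.md file (illustrative type).
#[derive(Debug, PartialEq)]
pub struct SkillMeta {
    pub name: String,
    pub summary: String,
}

/// Split a SKILL.md into its metadata and body.
/// Assumption: frontmatter holds only flat `key: value` lines.
pub fn parse_skill(text: &str) -> Result<(SkillMeta, &str), String> {
    let rest = text
        .strip_prefix("---\n")
        .ok_or_else(|| "missing opening --- line".to_string())?;
    let (front, body) = rest
        .split_once("\n---\n")
        .ok_or_else(|| "missing closing --- line".to_string())?;

    let mut name = None;
    let mut summary = None;
    for line in front.lines() {
        if let Some((key, value)) = line.split_once(':') {
            let value = value.trim();
            match key.trim() {
                "name" if !value.is_empty() => name = Some(value.to_string()),
                "summary" if !value.is_empty() => summary = Some(value.to_string()),
                _ => {} // unknown or empty keys are ignored in this sketch
            }
        }
    }

    Ok((
        SkillMeta {
            name: name.ok_or_else(|| "missing name".to_string())?,
            summary: summary.ok_or_else(|| "missing summary".to_string())?,
        },
        body,
    ))
}
```

Rejecting a missing or empty summary at parse time keeps the compact index usable, since the summary is the only text the agent sees before loading a skill.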
2. Compact index injection
Inject a compact skill index into the prompt, not full skill bodies and not hidden runtime reminders after every browser action.
The index should include:
- bundled prompts/interaction-skills/ metadata
- optional future bundled prompts/domain-skills/ metadata
- user state-dir skills/**/SKILL.md metadata
Example prompt section:
## Available Skills
Compact metadata index only. If a skill looks relevant, load full instructions with `skill_view` before using it.
### User skills
- user/workflow/crm-triage: CRM Triage - Reusable CRM queue triage workflow...
### Interaction skills
- interaction/screenshots: Screenshots - Capture and verify visual state...
This is the only "injection" proposed for the first version: normal prompt-time metadata injection. It should not auto-inject URL-matched tips into tool outputs.
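The index rendering could look roughly like the sketch below. It assumes metadata has already been collected into a hypothetical `IndexEntry` type and that summaries are cut to a caller-chosen character budget; the names and the truncation rule are illustrative, not taken from the codebase.

```rust
/// One row of the compact index. Field names are illustrative.
pub struct IndexEntry {
    pub id: String,      // e.g. "user/workflow/crm-triage"
    pub title: String,   // e.g. "CRM Triage"
    pub summary: String, // one-line summary from the frontmatter
}

/// Shorten `text` to at most `max_chars` characters, ending in "..." when cut.
fn truncate(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    let kept: String = text.chars().take(max_chars.saturating_sub(3)).collect();
    format!("{}...", kept.trim_end())
}

/// Render the prompt section: metadata only, never skill bodies.
pub fn render_index(
    user: &[IndexEntry],
    interaction: &[IndexEntry],
    max_summary: usize,
) -> String {
    let mut out = String::from("## Available Skills\n");
    out.push_str(
        "Compact metadata index only. If a skill looks relevant, \
         load full instructions with `skill_view` before using it.\n",
    );
    for (heading, entries) in [("User skills", user), ("Interaction skills", interaction)] {
        if entries.is_empty() {
            continue; // skip empty groups to keep the prompt short
        }
        out.push_str(&format!("### {}\n", heading));
        for entry in entries {
            out.push_str(&format!(
                "- {}: {} - {}\n",
                entry.id,
                entry.title,
                truncate(&entry.summary, max_summary)
            ));
        }
    }
    out
}
```

Because the renderer never receives skill bodies, the "index but not full bodies" property holds by construction rather than by filtering.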
3. Provider-neutral skill tools
Because terminal has a shared Rust tool registry, prefer native tools over a shell CLI wrapper:
- skill_list
- skill_search(query, limit?)
- skill_view(id)
- skill_create(id, summary, body)
- skill_patch(id, old, new, replace_all?)
- skill_delete(id) for user skills only
- skill_validate(id)
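The proposal does not pin down how `skill_patch` treats ambiguous matches. One possible reading, sketched below under that assumption, is that `old` must occur exactly once unless `replace_all` is set. `apply_patch` is a hypothetical helper name.

```rust
/// Apply a `skill_patch`-style edit to a skill body.
/// Assumed rule: without `replace_all`, `old` must occur exactly once.
pub fn apply_patch(
    body: &str,
    old: &str,
    new: &str,
    replace_all: bool,
) -> Result<String, String> {
    if old.is_empty() {
        return Err("old text must not be empty".to_string());
    }
    // Count non-overlapping occurrences, which is also what `replace` rewrites.
    let hits = body.matches(old).count();
    match (hits, replace_all) {
        (0, _) => Err("old text not found".to_string()),
        (1, _) | (_, true) => Ok(body.replace(old, new)),
        (n, false) => Err(format!("old text matches {} times; pass replace_all", n)),
    }
}
```

Failing on an ambiguous match, instead of silently editing the first hit, gives the agent a clear signal to supply more surrounding context.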
Implementation surface:
- add handler kinds in crates/browser-use-core/src/tools/mod.rs
- add filesystem-backed implementation alongside tools/files.rs
- keep path traversal protection and restrict writes/deletes to state-dir user skills
- emit skill.used for view/search hits and skill.written for create/patch/delete
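The write restriction and traversal protection could be enforced when an id is mapped to a path. The sketch below assumes user skill ids have the shape `user/<category>/<name>` (matching the index example) and uses an allow-list of characters per segment. `user_skill_path` is a hypothetical helper, and the existing checks in tools/files.rs may work differently.

```rust
use std::path::{Path, PathBuf};

/// Map a user skill id such as "user/workflow/crm-triage" to
/// `<state_dir>/skills/workflow/crm-triage/SKILL.md`.
/// Assumed id shape: "user/<category>/<name>"; anything else is rejected.
pub fn user_skill_path(state_dir: &Path, id: &str) -> Result<PathBuf, String> {
    // Bundled skills (e.g. "interaction/...") are read-only, so only the
    // "user/" namespace resolves to a writable path.
    let rest = id
        .strip_prefix("user/")
        .ok_or_else(|| format!("{:?} is not a user skill", id))?;

    let segments: Vec<&str> = rest.split('/').collect();
    if segments.len() != 2 {
        return Err(format!("expected user/<category>/<name>, got {:?}", id));
    }
    for segment in &segments {
        // Allow-list: rules out "..", ".", path separators, and empty segments.
        let ok = !segment.is_empty()
            && segment
                .chars()
                .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-' || c == '_');
        if !ok {
            return Err(format!("invalid id segment {:?}", segment));
        }
    }

    Ok(state_dir
        .join("skills")
        .join(segments[0])
        .join(segments[1])
        .join("SKILL.md"))
}
```

This check is purely lexical. It does not follow symlinks, so a real implementation may also want to canonicalize the result and confirm it still sits under the skills directory.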
4. Prompt lifecycle rules
Update prompts/browser-agent-system.md with broad, non-browser-only guidance:
- Search/view relevant skills before inventing browser, repo, terminal, debugging, or workflow-specific steps.
- After a successful nontrivial task, create or patch a user skill only if the new procedure is likely to repeat, long-running enough to justify reuse, or generally applicable beyond the current session.
- Do not write skills for one-off facts/calculations, temporary page state, secrets/tokens, private account details, failed/speculative workflows, or content that belongs in the task output.
Add a lightweight post-task reflection checklist:
- Did the run discover a repeatable procedure?
- Did it recover from an error in a way future agents should know?
- Did it learn a stable selector, API shape, auth flow, CLI command, config path, file layout, data format, or verification step?
- Did an existing skill help, fail, or need patching?
5. Non-goals
Do not copy the old cloud skill-memory architecture for this first version:
- No DB-backed URL-prefix memories.
- No one-gotcha-per-skill auto-generation.
- No automatic hidden injection after every navigation/tool call.
- No creating active skills from failed/speculative workflows.
- No free-text URL/tag dedupe as the only quality gate.
6. Tests and acceptance criteria
Add focused tests for:
- frontmatter parsing and invalid/missing summaries
- index construction and truncation
- search ranking over id/title/summary/body
- skill_view returns full instructions and emits skill.used
- create/patch/delete are restricted to user skills and block traversal
- prompt contains compact index but not full user skill bodies
- no-write guidance is present for one-offs, secrets, failed/speculative workflows
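For the search-ranking test, a simple field-weighted scorer gives the test something concrete to assert against. The weights, the `Skill` type, and the tie-breaking rule below are illustrative assumptions, not a specification of the eventual ranking.

```rust
/// In-memory view of a skill for search purposes (illustrative type).
pub struct Skill {
    pub id: String,
    pub title: String,
    pub summary: String,
    pub body: String,
}

/// Score one skill against a query. Assumed weights: id 8, title 4,
/// summary 2, body 1, counted once per query term and field.
fn score(skill: &Skill, query: &str) -> u32 {
    let query = query.to_lowercase();
    let mut total = 0;
    for term in query.split_whitespace() {
        for (field, weight) in [
            (&skill.id, 8),
            (&skill.title, 4),
            (&skill.summary, 2),
            (&skill.body, 1),
        ] {
            if field.to_lowercase().contains(term) {
                total += weight;
            }
        }
    }
    total
}

/// Return up to `limit` matching skill ids, best match first; ties break by id.
pub fn search<'a>(skills: &'a [Skill], query: &str, limit: usize) -> Vec<&'a str> {
    let mut hits: Vec<(u32, &'a str)> = skills
        .iter()
        .map(|s| (score(s, query), s.id.as_str()))
        .filter(|(points, _)| *points > 0)
        .collect();
    hits.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(b.1)));
    hits.into_iter().take(limit).map(|(_, id)| id).collect()
}
```

A deterministic tie-break keeps ranking tests stable: a skill that matches only in its body should always rank below one that matches in its id or title.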
Add a small eval/smoke script:
- find an existing interaction skill
- create a new reusable workflow skill after a successful task
- patch an existing skill after learning a better step
- refuse to write a skill for a one-off calculation
- refuse to write a skill for secret/private data
For terminal UI changes, run the repo's existing verification path:
scripts/verify-terminal-ui.sh
Why this shape
This is primarily the Browser Use Desktop skills model adapted to terminal. The only thing borrowed from Gregor's per-task work is the idea that a completed task can reveal reusable lessons. The system shape should stay conservative: broad procedural skills, explicit loading, compact metadata injection, validation, and write restrictions.