Add progressive agent skills for reusable terminal/browser workflows

## Context

Browser Use Desktop now has a provider-neutral skills pattern that should be adapted for `browser-use-terminal`.

The important Desktop idea is not "browser-only skills" and not automatic URL memory. It is progressive procedural memory:

- Keep reusable instructions as local skill files.
- Inject only a compact metadata index into the agent prompt.
- Require the agent to explicitly search/view a skill before relying on full instructions.
- Let the agent create, patch, validate, or delete persistent user skills only, and only under controlled rules.
- Treat skills as broad procedural memory: browser workflows, terminal workflows, debugging recoveries, repo conventions, output/reporting preferences, and recurring user processes.

Gregor's old per-task skill-memory experiment is useful mainly as prior art for the post-task reflection checklist: look for failure recoveries, retries, points of uncertainty, stable selectors, auth quirks, API shapes, CLI commands, config paths, data formats, and verification steps that future agents should not have to rediscover. We should not copy its browser-only URL memory architecture.

`browser-use-terminal` already has adjacent pieces:

- `prompts/browser-agent-system.md` explains the agent contract and browser-harness workflow.
- `prompts/interaction-skills/` contains read-only browser mechanics guidance.
- `crates/browser-use-core/src/tools/mod.rs` owns the provider-neutral tool registry.
- `crates/browser-use-store` owns the state dir and session event stream.
- `crates/browser-use-tui` renders session events in the terminal UI.

## Proposal

Add progressive skills to `browser-use-terminal` as reusable procedural memory.

### 1. Storage layout

Use the existing state-dir model:

```text
.browser-use-terminal/
  skills/
    workflow/<name>/SKILL.md
    browser/<name>/SKILL.md
    debugging/<name>/SKILL.md
    repo/<name>/SKILL.md
```

Bundled prompt skills stay read-only. User-created skills live under the state dir so they persist across sessions and are not overwritten by app updates.

A user `SKILL.md` should use simple frontmatter:

```md
---
name: crm-triage
summary: Reusable CRM queue triage workflow after repeated account checks
---

# CRM Triage

Use when...

## Steps
...

## Verification
...

## Gotcha
X What seems obvious but fails
V What actually works
```

The Gotcha (X/V) section is optional. It is useful for sharp lessons learned from a run, but a skill should stay one coherent reusable procedure; do not create one tiny skill per gotcha.
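To make the frontmatter contract concrete, here is a minimal sketch of a parser for it. It assumes only flat `key: value` lines between two `---` markers and treats `name` and `summary` as required. `SkillMeta` and `parse_skill` are hypothetical names, not existing code in the repo, and this is not a full YAML parser.

```rust
/// Metadata parsed from the frontmatter of a SKILL.md file.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
pub struct SkillMeta {
    pub name: String,
    pub summary: String,
}

/// Split a SKILL.md file into (metadata, body).
/// Only flat `key: value` lines are understood; unknown keys are ignored.
pub fn parse_skill(text: &str) -> Result<(SkillMeta, String), String> {
    let mut lines = text.lines();
    if lines.next().map(str::trim) != Some("---") {
        return Err("missing opening frontmatter marker".to_string());
    }

    let mut name = None;
    let mut summary = None;
    let mut closed = false;
    for line in lines.by_ref() {
        if line.trim() == "---" {
            closed = true;
            break;
        }
        // Split at the first colon so values may themselves contain colons.
        if let Some((key, value)) = line.split_once(':') {
            let value = value.trim().to_string();
            match key.trim() {
                "name" => name = Some(value),
                "summary" => summary = Some(value),
                _ => {}
            }
        }
    }
    if !closed {
        return Err("missing closing frontmatter marker".to_string());
    }

    // Empty values count as missing.
    let name = name.filter(|v| !v.is_empty()).ok_or("missing name")?;
    let summary = summary.filter(|v| !v.is_empty()).ok_or("missing summary")?;

    // Everything after the closing marker is the skill body.
    let body = lines.collect::<Vec<_>>().join("\n").trim().to_string();
    Ok((SkillMeta { name, summary }, body))
}
```

Returning an error for a missing or empty `summary` keeps invalid skills out of the index and gives `skill_validate` something specific to report.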

### 2. Compact index injection

Inject a compact skill index into the prompt, not full skill bodies and not hidden runtime reminders after every browser action.

The index should include:

- bundled `prompts/interaction-skills/` metadata
- optional future bundled `prompts/domain-skills/` metadata
- user state-dir `skills/**/SKILL.md` metadata

Example prompt section:

```md
## Available Skills
Compact metadata index only. If a skill looks relevant, load full instructions with `skill_view` before using it.

### User skills
- user/workflow/crm-triage: CRM Triage - Reusable CRM queue triage workflow...

### Interaction skills
- interaction/screenshots: Screenshots - Capture and verify visual state...
```

This is the only "injection" proposed for the first version: normal prompt-time metadata injection. The first version should not auto-inject URL-matched tips into tool outputs.
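A rough sketch of how one index section could be rendered from parsed metadata follows. `IndexEntry`, `render_section`, and the `max_summary` limit are illustrative assumptions; the issue does not fix a truncation length.

```rust
/// One row of the compact index. (Hypothetical type for illustration.)
pub struct IndexEntry {
    pub id: String, // e.g. "user/workflow/crm-triage"
    pub title: String,
    pub summary: String,
}

/// Cut `text` to at most `max_chars` characters, appending "..." if it was cut.
/// Counting chars (not bytes) avoids splitting a multi-byte character.
fn truncate(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    let cut: String = text.chars().take(max_chars).collect();
    format!("{}...", cut.trim_end())
}

/// Render one section of the index: a heading plus one line per skill.
/// Skill bodies are never included, only id, title, and a short summary.
pub fn render_section(heading: &str, entries: &[IndexEntry], max_summary: usize) -> String {
    let mut out = format!("### {}\n", heading);
    for e in entries {
        out.push_str(&format!(
            "- {}: {} - {}\n",
            e.id,
            e.title,
            truncate(&e.summary, max_summary)
        ));
    }
    out
}
```

Because the function takes only metadata, the "index but no full bodies" property holds by construction and can be asserted in a unit test.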

### 3. Provider-neutral skill tools

Because `browser-use-terminal` has a shared Rust tool registry, prefer native tools over a shell CLI wrapper:

- `skill_list`
- `skill_search(query, limit?)`
- `skill_view(id)`
- `skill_create(id, summary, body)`
- `skill_patch(id, old, new, replace_all?)`
- `skill_delete(id)` for user skills only
- `skill_validate(id)`
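The issue does not spell out `skill_patch` semantics. One plausible reading, sketched below as an assumption, is a literal string replacement that requires `old` to match exactly once unless `replace_all` is set, so an ambiguous patch fails loudly and does not edit the wrong step. `apply_patch` is a hypothetical helper name.

```rust
/// Apply a literal old -> new replacement to a skill body.
/// Assumed rule: `old` must match exactly once unless `replace_all` is true.
pub fn apply_patch(
    body: &str,
    old: &str,
    new: &str,
    replace_all: bool,
) -> Result<String, String> {
    if old.is_empty() {
        return Err("old text must not be empty".to_string());
    }
    // Count non-overlapping occurrences, matching what `replace` will touch.
    let hits = body.matches(old).count();
    match (hits, replace_all) {
        (0, _) => Err("old text not found".to_string()),
        (1, _) | (_, true) => Ok(body.replace(old, new)),
        (n, false) => Err(format!(
            "old text matches {} times; add context or set replace_all",
            n
        )),
    }
}
```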

Implementation surface:

- add handler kinds in `crates/browser-use-core/src/tools/mod.rs`
- add filesystem-backed implementation alongside `tools/files.rs`
- keep path traversal protection and restrict writes/deletes to state-dir user skills
- emit `skill.used` for view/search hits and `skill.written` for create/patch/delete
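As a sketch of the write restriction, the helper below maps a skill id to a path and rejects anything that is not a user skill with a safe name. It assumes ids look like `user/<category>/<name>` (as in the index example) and that names are limited to lowercase ASCII letters, digits, and `-`. `user_skill_path` and the character rule are assumptions, not existing code.

```rust
use std::path::{Path, PathBuf};

/// Categories from the storage layout section.
const CATEGORIES: [&str; 4] = ["workflow", "browser", "debugging", "repo"];

/// Allow only lowercase ASCII letters, digits, and '-'. This rules out
/// "..", path separators, and absolute paths without inspecting the filesystem.
fn is_safe_segment(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-')
}

/// Map a user skill id such as "user/workflow/crm-triage" to its SKILL.md
/// path under the state dir. Bundled ids and malformed ids are rejected.
pub fn user_skill_path(state_dir: &Path, id: &str) -> Result<PathBuf, String> {
    let parts: Vec<&str> = id.split('/').collect();
    match parts.as_slice() {
        ["user", category, name]
            if CATEGORIES.contains(category) && is_safe_segment(name) =>
        {
            Ok(state_dir
                .join("skills")
                .join(category)
                .join(name)
                .join("SKILL.md"))
        }
        _ => Err(format!("not a writable user skill id: {}", id)),
    }
}
```

Validating the id before building the path means create, patch, and delete can share one choke point. A real implementation may also want to guard against symlinks inside the state dir, which this sketch does not attempt.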

### 4. Prompt lifecycle rules

Update `prompts/browser-agent-system.md` with broad, non-browser-only guidance:

- Search/view relevant skills before inventing browser, repo, terminal, debugging, or workflow-specific steps.
- After a successful nontrivial task, create or patch a user skill only if the new procedure is likely to repeat, is long-running enough to justify reuse, or applies beyond the current session.
- Do not write skills for one-off facts/calculations, temporary page state, secrets/tokens, private account details, failed/speculative workflows, or content that belongs in the task output.

Add a lightweight post-task reflection checklist:

- Did the run discover a repeatable procedure?
- Did it recover from an error in a way future agents should know?
- Did it learn a stable selector, API shape, auth flow, CLI command, config path, file layout, data format, or verification step?
- Did an existing skill help, fail, or need patching?

### 5. Non-goals

Do not copy the old cloud skill-memory architecture for this first version:

- No DB-backed URL-prefix memories.
- No one-gotcha-per-skill auto-generation.
- No automatic hidden injection after every navigation/tool call.
- No creating active skills from failed/speculative workflows.
- No free-text URL/tag dedupe as the only quality gate.

### 6. Tests and acceptance criteria

Add focused tests for:

- frontmatter parsing and invalid/missing summaries
- index construction and truncation
- search ranking over id/title/summary/body
- `skill_view` returns full instructions and emits `skill.used`
- create/patch/delete are restricted to user skills and block traversal
- prompt contains compact index but not full user skill bodies
- no-write guidance is present for one-offs, secrets, failed/speculative workflows
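For the search-ranking test, a simple weighted substring match would be enough to start with. The sketch below is one such scheme, not a settled design: the weights (id 8, title 4, summary 2, body 1) and the names `SkillDoc`, `score`, and `search` are assumptions chosen for illustration.

```rust
/// Searchable view of a skill. (Hypothetical type for illustration.)
pub struct SkillDoc {
    pub id: String,
    pub title: String,
    pub summary: String,
    pub body: String,
}

/// For each query term, add the weight of every field that contains it.
/// Terms are expected to be lowercase already.
fn score(doc: &SkillDoc, terms: &[String]) -> u32 {
    let fields = [(&doc.id, 8), (&doc.title, 4), (&doc.summary, 2), (&doc.body, 1)];
    terms
        .iter()
        .map(|term| {
            fields
                .iter()
                .filter(|(text, _)| text.to_lowercase().contains(term.as_str()))
                .map(|(_, weight)| *weight)
                .sum::<u32>()
        })
        .sum()
}

/// Return up to `limit` matching skills, best match first.
pub fn search<'a>(docs: &'a [SkillDoc], query: &str, limit: usize) -> Vec<&'a SkillDoc> {
    let terms: Vec<String> = query.split_whitespace().map(|t| t.to_lowercase()).collect();
    let mut hits: Vec<(u32, &SkillDoc)> = docs
        .iter()
        .map(|d| (score(d, &terms), d))
        .filter(|(s, _)| *s > 0)
        .collect();
    // Highest score first; ties broken by id so results are deterministic.
    hits.sort_by(|a, b| b.0.cmp(&a.0).then_with(|| a.1.id.cmp(&b.1.id)));
    hits.into_iter().take(limit).map(|(_, d)| d).collect()
}
```

Deterministic tie-breaking keeps ranking tests stable, and weighting id and title above body keeps a skill that merely mentions a term from outranking one that is about it.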

Add a small eval/smoke script:

- find an existing interaction skill
- create a new reusable workflow skill after a successful task
- patch an existing skill after learning a better step
- refuse to write a skill for a one-off calculation
- refuse to write a skill for secret/private data

For terminal UI changes, run the repo's existing verification path:

```bash
scripts/verify-terminal-ui.sh
```

## Why this shape

This is primarily the Browser Use Desktop skills model adapted to terminal. The only thing borrowed from Gregor's per-task work is the idea that a completed task can reveal reusable lessons. The system shape should stay conservative: broad procedural skills, explicit loading, compact metadata injection, validation, and write restrictions.

