Read a website once. Generate a JSON SDK. From then on, the LLM uses the site like an API — without ever seeing the page.
A mapper agent visits a site, proposes a list of tool functions
(submit_contact_form, get_top_stories, …), and records each one as a
deterministic recipe. At runtime, an LLM picks a tool from the schema and a
pure-replay engine executes it. No screenshots, no per-step LLM calls during
execution, no fragile prompting.
Built on Python + Playwright + OpenAI gpt-5.4-mini.
Agentify is an Agent Skill — the
single SKILL.md works in both Claude Code and Codex (and any tool that
reads the open SKILL.md standard). Only the install folder differs.
The skill bundles a Python venv + Playwright Chromium, so installing is: clone into your tool's skills directory, then build the bundled environment. Paste the matching prompt to your agent and it will do all of it for you.
Install the Agentify skill from
https://github.com/rivka2003/Agentify.
git clone https://github.com/rivka2003/Agentify ~/.claude/skills/Agentifycd ~/.claude/skills/Agentifypython3 -m venv venvvenv/bin/pip install -e sourcevenv/bin/python -m playwright install chromiumcp source/.env.example source/.envand setOPENAI_API_KEY=to my key.Then confirm the Agentify skill loads (
/skills).
Install the Agentify skill from
https://github.com/rivka2003/Agentify.
git clone https://github.com/rivka2003/Agentify ~/.agents/skills/Agentifycd ~/.agents/skills/Agentifypython3 -m venv venvvenv/bin/pip install -e sourcevenv/bin/python -m playwright install chromiumcp source/.env.example source/.envand setOPENAI_API_KEY=to my key.Then confirm the Agentify skill loads (
/skills).
The only difference is the target directory: Claude Code discovers skills in
~/.claude/skills/, Codex in ~/.agents/skills/. The SKILL.md itself is
identical and path-independent (it resolves the install folder via $SKILL_DIR).
python -m venv venv && source venv/bin/activate
pip install -e source
playwright install chromium
cp source/.env.example source/.env # OPENAI_API_KEY=...agentify map --url https://news.ycombinator.com --name hackernewsWhat this does:
- Crawls the landing page + a few same-origin nav links.
- Sends the survey to the LLM; gets back proposed tools with JSON-schema parameters.
- Shows you the proposals in the terminal — accept all or pick a subset.
- For each accepted tool: drives the site once with sentinel placeholders
(
__W2A_NAME__→{{name}}), captures every browser action with a robust selector (role+name first, CSS fallback). - Writes
recipes/hackernews.tools.json.
Add --auto-approve to skip the interactive approval step.
Two ways:
Direct tool call (no LLM, deterministic):
agentify call --site hackernews --tool get_top_stories --args '{"n": 5}'Natural language → tool pick (LLM picks tool + arguments; never sees the page):
agentify run-mapped --site hackernews \
--task "Give me the top 3 stories right now"PHASE 1 — MAP (one-shot per site)
─────────────────────────────────
Crawler ─→ surveys landing page + nav links
↓
Proposer ─→ LLM call: candidate tool list
↓
Approver ─→ CLI prompt: keep / drop
↓
Recorder ─→ drives site with placeholders, captures recipe
↓
recipes/<slug>.tools.json
PHASE 2 — USE (every invocation)
─────────────────────────────────
Registry ─→ load recipes/<slug>.tools.json
↓
Picker ─→ LLM call: { tool_name, arguments } (NO PAGE CONTENT)
↓
Engine ─→ deterministic replay of the recipe
A site registry is recipes/<slug>.tools.json. Each tool has a name,
description, JSON-Schema parameters, and a list of steps:
| op | purpose |
|---|---|
goto |
navigate to a URL |
click |
click a Target |
type |
fill a textbox (supports {{param}} substitution) |
select |
pick an option in a combobox |
press_enter |
press Enter on a Target |
scroll |
scroll up / down / top / bottom |
wait |
sleep N ms |
extract |
save text/value/attr from a Target into the result dict |
js_extract |
run arbitrary in-page JS for tricky extractions |
verify |
assert page state; failure raises RecipeFailure |
A Target records up to three strategies. The engine tries them in
priority order until one resolves to an element:
role+name— ARIA / accessibility tree (most stable).css— recorded at map time from the element's id / name / data-testid / nth-of-type position.text— visible text match.
agentify/
├── cli.py Typer commands: map, call, run-mapped
├── browser.py Playwright wrapper; actions keyed by element id
├── ax_tree.py Injected JS that builds the numbered element list
│ used during crawling and recording
├── selectors.py Target dataclass + multi-strategy resolver
├── recipe.py Recipe dataclass + deterministic Engine
├── registry.py Load/save recipes/<slug>.tools.json + OpenAI conv.
├── recorder.py Recording Browser subclass: tees every action
│ into a recipe step list with placeholder binding
├── mapper.py The full Phase-1 pipeline: Crawler + Proposer +
│ Approver + Recorder
├── llm.py OpenAI client + system prompts + tool schema
│ (used by the Proposer, the recording driver,
│ and the runtime Picker)
├── agent.py Internal observe→think→act loop used by the
│ Recorder to drive the site during mapping
└── memory.py Step history used by the recording loop
The mapper records arbitrary linear multi-step flows for any site, with no per-site code, via four site-agnostic mechanisms:
- Realistic example inputs — the proposer attaches an
exampleto each parameter (carried onToolProposal.examples, used by_placeholders_for) so typeaheads/live-search respond during recording. - Autocomplete normalization —
_normalize_autocompleterewritestype {{param}} into a combobox → click a named suggestionintotype {{param}} → verify an option exists → click the FIRST option({"role": "option"}resolves to.first), which is parameter-independent. - Auto result-extraction —
record_action_recipeappends ajs_extractof the landing page so action tools return data. - Self-verifying record→replay —
_record_verified_actionreplays each freshly recorded recipe with the example args (_verify_replay) and re-records once with the failure as a hint if it doesn't replay.
What's still missing are non-linear shapes (loops, branches) and a few binding edge cases — below.
The current system works well for single-form submissions, linear multi-step flows (fill fields → pick suggestions → submit → read), and single-page data extraction. Once you've internalised how it works, these are the real seams where it falls over — listed with the concrete fix each one needs:
Every agentify call opens a fresh Browser, executes the recipe, closes
it. So call login(...) followed by call view_cart() won't work — the
second call has no cookies. Any flow that needs auth state, shopping
carts, OAuth, or multi-step wizards is blocked.
- Why it matters: real apps need login.
- Fix size: small. Add a
Sessionclass inbrowser.pythat keeps a PlaywrightBrowserContextalive across calls (keyed by--sessionflag).run-mappedalready runs everything in one context, so onlycallhas the problem.
Recipes are straight-line sequences. You can't say "for each story on
this list, open it and extract X." To paginate HN to page 5, you'd need
5 separate recipes or a js_extract that does all the work in one shot.
- Why it matters: scraping, batch processing, "show me all..." tasks.
- Fix size: medium. Add
for_each {items_target, sub_steps}to the Engine op vocabulary; iteratepage.locator(items_target).all()and run the sub-steps with a{{_item}}variable in scope. Mapper has to learn to propose such recipes for list-shaped sites.
verify either passes or raises RecipeFailure. There's no "if the
cart is empty, go shop; else go to checkout." Every conditional has to
live in the runtime LLM via tool composition, which is fine for top-level
decisions but awkward for "does this modal have a Cancel button or a
Close button?" micro-branches.
- Why it matters: any site with dialogs, optional flows, or validation errors that the recipe needs to react to.
- Fix size: medium. Add
if_verify {kind, value, then: [...steps], else: [...steps]}torecipe.py's Engine and recurse on the chosen branch.
The Crawler only visits the landing page + a few same-origin nav links.
Anything deep in a flow — logged-in pages, search results, multi-step
wizards, modals — is invisible to the Proposer, so no tools are proposed
for it. That's why Wikipedia got search_wikipedia but not
get_article_facts(query): the Crawler never visited an article.
- Why it matters: the Proposer is the bottleneck on how rich the generated SDK is.
- Fix size: medium-large. Make the Crawler an agent itself — follow forms, interact with menus, capture state at each layer. This is real work because crawling becomes recursive and stateful (and may need login fixtures).
When the Recorder sees type_text(id=4, text="__W2A_EMAIL__") it knows
to bind that field to {{email}}. Typeahead/autocomplete is handled now
(see "Multi-step flows" above — the suggestion click is normalized to a
parameter-independent first-option select). But for clicks on radio
buttons, checkboxes, date pickers, file uploads — and for "open X by
title" flows where the agent navigates by clicking a named link instead of
typing — there's no typed text, so no sentinel to swap. Result: those
params get hardcoded to whatever the mapping agent picked (e.g. pizza_size
frozen to "Small" instead of {{pizza_size}}). Note the replay-verify pass
can still pass such a recipe, because the hardcoded path works for the
example value — so inspect recipes whose parameters drive a click rather
than a type.
- Why it matters: any form with selects, radios, checkboxes is partially parameterized.
- Fix size: medium. The Mapper needs to track which placeholder slot is "active" when a non-text action fires, by feeding the agent the synthetic-task placeholders with their parameter names and asking it to pick options whose visible text matches the placeholder. The Recorder then matches role+name+value against the placeholder map.
The cleanest design decision when extending is: is the recipe a flat sequence or a small language?
- Path X — keep recipes flat, push complexity to the LLM.
Add
Session+ improve param binding. Each tool stays a minimal atomic action (login,view_cart,checkout_step_1...). The runtime LLM composes them. Simple, predictable, more LLM calls. - Path Y — make recipes a small language.
Add
for_each,if_verify,subcallto the Engine. One tool can encode "log in if needed, paginate to page N, extract everything, return." LLM only picks top-level tools. More expressive, more code to maintain.
The pragmatic order for this codebase: do (1) and (5) first — they're small, high-value, and unlock most real multi-page cases. Add (2) only when a concrete pagination task demands it. Defer (3) and (4) until you've hit a wall with the simpler path.
Recipes you generate with agentify map land in recipes/ and are
git-ignored by default (they're yours, and may target private sites). A
neutral, public-site example ships in examples/ so you can
see the format and try a replay without mapping anything first:
examples/hackernews.tools.json—get_top_storiesover news.ycombinator.com
pip install -e "source[dev]"
pytest source # hermetic: no Playwright / OpenAI key needed.env:
OPENAI_API_KEY=sk-...
AGENTIFY_MODEL=gpt-5.4-mini # any function-calling model
See CONTRIBUTING.md. By participating you agree to the Code of Conduct. Security issues: see SECURITY.md — please report privately.
Apache-2.0. © Agentify contributors.
{ "name": "submit_contact_form", "description": "Submit a contact request.", "parameters": { "type": "object", "properties": { "name": {"type": "string"}, "email": {"type": "string"} }, "required": ["name", "email"] }, "steps": [ {"op": "goto", "url": "https://example.com/#contact"}, {"op": "type", "target": {"role": "textbox", "name": "Name *"}, "text": "{{name}}"}, {"op": "type", "target": {"role": "textbox", "name": "Email *"}, "text": "{{email}}"}, {"op": "click", "target": {"role": "button", "name": "Send"}}, {"op": "wait", "ms": 1500}, {"op": "verify", "kind": "page_text_contains", "value": "thanks"} ] }