web-skill

Bridge humans and agents through web UI.

Vision

AI agents are becoming increasingly capable, but in many real-world workflows they still behave like black boxes, with too little transparency and too few chances for humans to verify, guide, or step in.

The web is already the interface people know and trust. That makes it the most natural bridge between human workflows and agent workflows.

Yet conventional DOM-driven browser automation is still a poor interface for agents: slow, expensive, and often unreliable.

web-skill aims to change that by making agent work more transparent, more interruptible, and easier to hand over between humans and agents, enabling safer participation in complex, high-stakes, and business-critical workflows.

How it works

web-skill helps a web app publish task-level automation APIs for agents and browser tooling.

From one shared config it can:

register functions under window._web_skills
generate one SKILL.md file per published skill
inject <link rel="web-skill" ...> tags into the HTML <head> through Vite

Entry points are split by purpose:

web-skill for browser/runtime APIs
web-skill/vite for the Vite plugin
web-skill/dev for markdown generation and other build-time helpers

The package is TypeScript-first and works best in apps that already use Vite and Zod.

Install

npm install web-skill zod

To use the Vite plugin, make sure Vite is also installed in your project:

npm install vite

Examples

examples/request-workbench: a pure front-end React + Zustand mock showing how to list existing records, create new drafts, and expose both flows through web-skill
examples/checkout-desk: a three-step checkout form with local React state only, showing how a prepareCheckout skill can still drive a conventional DOM-shaped workflow

The checkout example is intentionally plain. web-skill is still useful when the site is just a conventional form flow with named inputs and submit buttons, not only when the app already has a polished agent-native architecture.

Agent setup

On the agent side, install or load the use-web-skill skill (skills/use-web-skill/SKILL.md in this repository). It gives the agent a simple discovery workflow: inspect the page <head>, look for <link rel="web-skill"> tags, fetch the relevant SKILL.md, and only then continue with browser interaction.
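The discovery step can be sketched as a small pure function over the page HTML. This is only an illustration: findWebSkillLinks, readAttribute, and the WebSkillLink shape are hypothetical names, and the sketch assumes each link tag carries an href (and optionally a title) attribute.

```typescript
// Hypothetical helper: the names, return shape, and regex approach are
// illustrative assumptions, not part of web-skill or use-web-skill.
interface WebSkillLink {
  href: string;
  title: string | undefined;
}

// Read one quoted attribute value from a single tag's source text.
function readAttribute(tag: string, name: string): string | undefined {
  const pattern = new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, "i");
  const match = pattern.exec(tag);
  if (match === null) return undefined;
  return match[1] !== undefined ? match[1] : match[2];
}

// Collect every <link rel="web-skill"> found inside the document <head>.
function findWebSkillLinks(html: string): WebSkillLink[] {
  const head = /<head\b[^>]*>([\s\S]*?)<\/head>/i.exec(html);
  if (head === null) return [];
  const links: WebSkillLink[] = [];
  const linkTag = /<link\b[^>]*>/gi;
  let tag: RegExpExecArray | null;
  while ((tag = linkTag.exec(head[1])) !== null) {
    const rel = readAttribute(tag[0], "rel");
    const href = readAttribute(tag[0], "href");
    if (rel !== undefined && rel.toLowerCase() === "web-skill" && href !== undefined) {
      links.push({ href, title: readAttribute(tag[0], "title") });
    }
  }
  return links;
}
```

An agent running inside the page could get the same list with document.head.querySelectorAll('link[rel="web-skill"]'); the string-based version is shown because it also works on fetched HTML.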

Quick start

Create one generator, point it at the app actions you already trust, and install it at startup:

import { createWebSkillGenerator } from "web-skill";
import { z } from "zod";
import { useProcurementStore } from "./stores/procurement-store";

// Exported so the Vite config can pass the same generator instance to the plugin.
export const webSkills = createWebSkillGenerator();
const procurementStore = useProcurementStore;

const procurementSkill = webSkills.newSkill({
  name: "erpProcurement",
  title: "ERP procurement console API for supplier lookup and purchase order preparation",
  description: "Expose ERP procurement actions to browser agents and the dev console.",
});

procurementSkill.addFunction(
  (input) => procurementStore.getState().findSupplierItem(input),
  "findSupplierItem",
  {
    description: "Search supplier catalog items and return normalized matches.",
    inputSchema: z.object({
      supplierId: z.string().min(1).optional(),
      keyword: z.string().min(1).optional(),
      activeOnly: z.boolean().optional(),
      limit: z.number().int().positive().optional(),
    }),
    outputSchema: z.array(
      z.object({
        itemId: z.string(),
        sku: z.string(),
        itemName: z.string(),
        unitCost: z.number().nullable(),
      }),
    ),
  },
);

procurementSkill
  .addFunction(
    (input) => procurementStore.getState().createPurchaseOrderDraft(input),
    "createPurchaseOrderDraft",
    {
      description: "Prepare a purchase order draft and navigate to the ERP purchase order screen.",
      inputSchema: z.object({
        supplierId: z.string().min(1),
        requesterId: z.string().min(1),
        currency: z.string().min(1),
        lines: z.array(
          z.object({
            itemId: z.string(),
            quantity: z.number().positive(),
            expectedUnitCost: z.number().nonnegative().optional(),
          }),
        ),
      }),
      outputSchema: z.object({
        draftId: z.string(),
        route: z.string(),
      }),
    },
  )
  .addFunction(
    (input) => procurementStore.getState().submitPurchaseRequisition(input),
    "submitPurchaseRequisition",
    {
      description: "Create and submit a purchase requisition from the current selection.",
      inputSchema: z.object({
        departmentCode: z.string().min(1),
        neededByDate: z.string().min(1),
        lines: z.array(
          z.object({
            itemId: z.string(),
            quantity: z.number().int().positive().optional(),
          }),
        ),
      }),
    },
  );

webSkills.install();

That exposes browser-callable functions, plus a _meta entry for the skill, under window._web_skills:

window._web_skills.erpProcurement.findSupplierItem(input)
window._web_skills.erpProcurement.createPurchaseOrderDraft(input)
window._web_skills.erpProcurement.submitPurchaseRequisition(input)
window._web_skills.erpProcurement._meta
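An agent or console script may want to check that a function is actually installed before calling it. The sketch below assumes only the registry shape shown above (skill keys mapping to named functions); callWebSkill, SkillRegistry, and the error messages are hypothetical and not part of the package.

```typescript
// Shape assumed from the entrypoints listed above; the helper name and error
// messages are hypothetical and not part of the web-skill package.
type SkillFunction = (input: unknown) => unknown;
type SkillRegistry = Record<string, Record<string, unknown>>;

function callWebSkill(
  registry: SkillRegistry | undefined,
  skillKey: string,
  functionName: string,
  input: unknown,
): unknown {
  const skill = registry === undefined ? undefined : registry[skillKey];
  if (!skill) {
    throw new Error(`Skill "${skillKey}" is not installed on this page`);
  }
  const fn = skill[functionName] as SkillFunction | undefined;
  if (typeof fn !== "function") {
    throw new Error(`"${skillKey}.${functionName}" is not a callable skill function`);
  }
  // The result may be a plain value or a promise; callers can await it either way.
  return fn(input);
}
```

In a browser, the first argument would be window._web_skills, for example: await callWebSkill(window._web_skills, "erpProcurement", "findSupplierItem", { keyword: "bolt", limit: 5 }).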

Vite integration

Use the Vite plugin with the same generator instance:

import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";
import { webSkillVitePlugin } from "web-skill/vite";
import { webSkills } from "./src/web-skills.ts";

export default defineConfig({
  plugins: [
    react(),
    webSkillVitePlugin({
      generator: webSkills,
    }),
  ],
});

At dev/build time the plugin will:

generate /skills/<skill-slug>/SKILL.md for every configured skill
serve those markdown files in Vite dev
emit those markdown files as build assets
inject one <link rel="web-skill" ...> tag per skill into the HTML <head>
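To make the last two steps concrete, here is a rough sketch of deriving a document path and injecting link tags into an HTML string. It is not the plugin's implementation: the kebab-case slug rule, the href and title attributes, and every function name below are assumptions made for illustration.

```typescript
// All names here are hypothetical; the plugin's real slug rule and the exact
// attributes on its <link> tags are not specified in this README.
interface PublishedSkill {
  name: string; // e.g. "erpProcurement"
  title: string;
}

// Assumed rule: camelCase and non-alphanumeric runs become kebab-case.
function toSkillSlug(name: string): string {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2")
    .replace(/[^A-Za-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .toLowerCase();
}

function skillDocumentPath(skill: PublishedSkill): string {
  return `/skills/${toSkillSlug(skill.name)}/SKILL.md`;
}

// Escape characters that would break a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

function injectSkillLinks(html: string, skills: PublishedSkill[]): string {
  if (skills.length === 0) return html;
  const tags = skills
    .map(
      (skill) =>
        `<link rel="web-skill" href="${skillDocumentPath(skill)}" title="${escapeAttribute(skill.title)}">`,
    )
    .join("\n");
  // Insert before the closing </head>; the HTML is returned unchanged if none exists.
  return html.replace(/<\/head>/i, (closing) => `${tags}\n${closing}`);
}
```

In a real Vite plugin this kind of transform would typically live in the transformIndexHtml hook, which is the standard place to modify the HTML entry file.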

What gets generated

Each generated SKILL.md contains:

the skill title
the console entrypoint under window._web_skills.<skillKey>
one section per function
summarized input and output schemas
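As a rough picture of how those four parts could fit together, the sketch below renders a markdown document from a small metadata object. The SkillSummary and FunctionSummary shapes, the heading layout, and renderSkillMarkdown itself are hypothetical; the files web-skill actually generates may look different.

```typescript
// Hypothetical metadata shape and markdown layout, for illustration only;
// the SKILL.md files that web-skill generates may be structured differently.
interface FunctionSummary {
  name: string;
  description: string;
  inputSummary: string; // e.g. "{ supplierId?: string, keyword?: string }"
  outputSummary?: string; // omitted when no output schema is declared
}

interface SkillSummary {
  skillKey: string;
  title: string;
  functions: FunctionSummary[];
}

function renderSkillMarkdown(skill: SkillSummary): string {
  const lines: string[] = [
    `# ${skill.title}`,
    "",
    `Console entrypoint: window._web_skills.${skill.skillKey}`,
  ];
  for (const fn of skill.functions) {
    // One section per function, with its summarized schemas.
    lines.push("", `## ${fn.name}`, "", fn.description, "", `Input: ${fn.inputSummary}`);
    if (fn.outputSummary !== undefined) {
      lines.push(`Output: ${fn.outputSummary}`);
    }
  }
  return lines.join("\n") + "\n";
}
```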

The repository also includes skills/use-web-skill/SKILL.md, a discovery workflow for agents that inspect <head> before automating a page.

API reference

For the exported surface, expected behaviors, and common runtime errors, see docs/api.md.

Design guidance

web-skill works best when you expose task-level actions instead of raw UI primitives.

Good examples:

findCustomer
openInvoiceDraft
fillPurchaseOrderForm
submitCurrentApproval

Well-designed functions should also help move the workflow to the right interface for the next step. When appropriate, a function can open the relevant screen, navigate to a draft, or return route and record identifiers that make human handoff easy.

Avoid low-level wrappers like:

clickSaveButton
setInputValue
selectTableRow

The goal is to publish a stable, browser-callable API even when the underlying page is old, DOM-heavy, or inconsistent.
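A task-level action might look like the following sketch, which performs the whole step in one call and returns identifiers a human can pick up from. Everything here is hypothetical: the InvoiceApp state stands in for a real store or form, and the route format is invented for the example.

```typescript
// Hypothetical in-memory app state standing in for a real store or form.
interface InvoiceLine {
  sku: string;
  quantity: number;
}

interface InvoiceDraft {
  draftId: string;
  customerId: string;
  lines: InvoiceLine[];
}

interface InvoiceApp {
  drafts: Record<string, InvoiceDraft>;
  currentRoute: string;
}

interface OpenInvoiceDraftResult {
  draftId: string;
  route: string;
  lineCount: number;
}

// One task-level call replaces a sequence of clickSaveButton / setInputValue
// style wrappers: it creates the draft, navigates, and reports where it went.
function openInvoiceDraft(
  app: InvoiceApp,
  input: { customerId: string; lines: InvoiceLine[] },
): OpenInvoiceDraftResult {
  const draftId = `draft-${Object.keys(app.drafts).length + 1}`;
  app.drafts[draftId] = { draftId, customerId: input.customerId, lines: input.lines.slice() };
  app.currentRoute = `/invoices/drafts/${draftId}`;
  // Returning the route and record id makes it easy to hand off to a human.
  return { draftId, route: app.currentRoute, lineCount: input.lines.length };
}
```

Because the function returns the route and draft id instead of a bare success flag, a person taking over can open exactly the screen the agent left off on.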

Safety

Treat destructive actions as special cases:

require explicit confirmation fields for submit/delete/finalize flows
validate inputs with Zod whenever possible
return structured results instead of relying on toast text
avoid exposing unstable internal helpers just because they are easy to call
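The first three points can be combined in one small pattern: a destructive function that refuses to run without an explicit confirmation field and always returns a structured result. The sketch below uses manual checks so it runs without dependencies; in a Zod schema the confirmation constraint could be expressed with z.literal(true). The deleteDraft function, its input shape, and the error codes are hypothetical.

```typescript
// Hypothetical input and result shapes; the manual checks stand in for a Zod schema.
interface DeleteDraftInput {
  draftId: string;
  confirm: boolean; // must be explicitly true for the deletion to run
}

type DeleteDraftResult =
  | { ok: true; deletedDraftId: string }
  | { ok: false; code: "INVALID_INPUT" | "CONFIRMATION_REQUIRED" | "NOT_FOUND"; message: string };

function deleteDraft(drafts: Record<string, unknown>, input: DeleteDraftInput): DeleteDraftResult {
  if (typeof input.draftId !== "string" || input.draftId.length === 0) {
    return { ok: false, code: "INVALID_INPUT", message: "draftId must be a non-empty string" };
  }
  if (input.confirm !== true) {
    return { ok: false, code: "CONFIRMATION_REQUIRED", message: "Set confirm: true to delete this draft" };
  }
  if (!Object.prototype.hasOwnProperty.call(drafts, input.draftId)) {
    return { ok: false, code: "NOT_FOUND", message: `No draft with id ${input.draftId}` };
  }
  delete drafts[input.draftId];
  // A structured result lets the caller branch on code instead of parsing toast text.
  return { ok: true, deletedDraftId: input.draftId };
}
```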

Authorization guidance

web-skill does not enforce authorization by itself. The recommended model is for the application to decide which skill documents and runtime functions are exposed in the current session.

Typical patterns:

only render admin-only <link rel="web-skill" ...> tags for admin users
serve different skill documents for different audiences such as /admin-only/SKILL.md and /user/SKILL.md
keep backend APIs as the final authorization boundary even if a browser skill becomes visible
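One way to apply the first two patterns is to filter the list of skill documents by the current session's role before rendering any link tags. The sketch below is an assumption-laden illustration: the Role type, the audience field, and visibleSkillDocuments are hypothetical and not provided by web-skill.

```typescript
// Hypothetical session and document shapes; web-skill does not define these.
type Role = "admin" | "user";

interface SkillDocument {
  path: string;
  title: string;
  audience: Role[]; // roles allowed to see this document
}

// Return only the documents the current role should have advertised in <head>.
function visibleSkillDocuments(role: Role, documents: SkillDocument[]): SkillDocument[] {
  return documents.filter((doc) => doc.audience.indexOf(role) !== -1);
}

const allDocuments: SkillDocument[] = [
  { path: "/admin-only/SKILL.md", title: "Admin-only order approval API", audience: ["admin"] },
  { path: "/user/SKILL.md", title: "Order lookup API", audience: ["admin", "user"] },
];
```

This filtering only controls what is advertised to the agent. The backend still has to reject unauthorized calls on its own.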

If your agent flow benefits from advisory hints, put that information directly in the visible skill description or link title. For example, a title or description such as "Admin-only order approval API" is often enough to help agents choose the right skill without adding new package-level permission features.

Local development

npm install
npm run verify

License

MIT
