alexrupom/flunky

Flunky

Flunky lets any AI agent drive a real browser. It borrows the layering of a browser automation library (a thin driver, a stateful session, a readable action DSL) and adds the two things an agent needs that a test framework does not: a way to show the page to a model, and a way to expose actions as tools the model can call.

The gem never calls an AI vendor itself. It emits tool schemas and a dispatcher; you inject the model client.

How it fits together

  • Driver (Drivers::Base, Drivers::FerrumDriver) talks to the browser. Ferrum drives Chrome over the DevTools Protocol with no Selenium server. The backend is swappable.
  • Snapshot reduces the live page to the elements an agent can act on, each stamped with an integer ref, and renders a compact prompt block.
  • Actions is the human-readable DSL over the driver (click, type, fill_in, ...).
  • Session owns one driver, caches the latest snapshot, and exposes the actions.
  • Tools turns a session into vendor-neutral tool schemas plus a dispatcher.
  • Agent is an optional observe/decide/act loop around an injected model.
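
To make the Snapshot idea concrete, here is a toy sketch in plain Ruby of stamping integer refs onto actionable elements and rendering them as a prompt block. It is not Flunky's implementation: the Element struct, stamp_refs, to_prompt, the "[ref] <tag> text" line format, and numbering from 1 are all assumptions made for illustration.

```ruby
# Toy illustration only; none of these names come from Flunky.
Element = Struct.new(:tag, :text, :interactive, keyword_init: true)

# Keep only elements an agent could act on, cap the count,
# and stamp each with an integer ref starting at 1 (assumed).
def stamp_refs(elements, max_elements: 200)
  elements.select(&:interactive)
          .first(max_elements)
          .each_with_index
          .map { |el, i| [i + 1, el] }
          .to_h
end

# Render the stamped elements as one compact line each (hypothetical format).
def to_prompt(stamped)
  stamped.map { |ref, el| "[#{ref}] <#{el.tag}> #{el.text}" }.join("\n")
end

page = [
  Element.new(tag: "h1", text: "Example Domain", interactive: false),
  Element.new(tag: "a", text: "More information...", interactive: true),
  Element.new(tag: "button", text: "Accept", interactive: true)
]

stamped = stamp_refs(page)
puts to_prompt(stamped)
```

The model only ever sees the refs, so an action such as click(1) can be resolved back to a concrete element without the model having to write a selector.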

Installation

Flunky needs Ruby >= 3.0 and a local Chrome (Ferrum launches it).

bundle add flunky

or

gem install flunky

Usage

require "flunky"

Flunky.session do |s|
  s.visit("https://example.com")
  puts s.snapshot.to_prompt   # the page as the model sees it
  s.actions.click(1)          # act on a stamped ref
end

Refs exist only after a snapshot stamps them, so call observe (or read snapshot) before acting. After client-side navigation a ref can go stale; the tool dispatcher re-observes after every action, so the model always sees the current page.

Tool calling

session = Flunky::Session.new
tools = Flunky::Tools.new(session)

tools.definitions          # hand straight to an Anthropic-style client
tools.dispatch("click", { ref: 1 })  # run a returned tool call

The tool schema shape (name / description / input_schema) matches Anthropic's tool format. For OpenAI, wrap each definition as { type: "function", function: { **defn, parameters: defn[:input_schema] } }.
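
The OpenAI wrapping can be sketched as a small helper. This variant copies the three fields explicitly instead of spreading the whole definition, so the input_schema key is not carried into the function object. The helper name to_openai_tool and the sample click definition (its description and schema) are hypothetical; only the name / description / input_schema shape comes from the text above.

```ruby
# Convert one Anthropic-style tool definition (symbol keys assumed)
# into OpenAI's function-tool wrapper.
def to_openai_tool(defn)
  {
    type: "function",
    function: {
      name: defn[:name],
      description: defn[:description],
      parameters: defn[:input_schema]
    }
  }
end

# Hypothetical definition, standing in for one entry of tools.definitions.
click = {
  name: "click",
  description: "Click the element with the given ref.",
  input_schema: {
    type: "object",
    properties: { ref: { type: "integer" } },
    required: ["ref"]
  }
}

openai_tool = to_openai_tool(click)
```

With a real session, the same helper would presumably be applied as tools.definitions.map { |d| to_openai_tool(d) }.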

Agent loop

Flunky::Agent.new(session, model:) drives an observe/decide/act loop. The model must respond to call(messages:, tools:) and return { text:, tool_calls: [{ id:, name:, arguments: }] }. See examples/anthropic_agent.rb for a roughly 50-line adapter to Anthropic's /v1/messages endpoint.
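
Because the model is just an object that responds to call(messages:, tools:), a scripted stand-in is enough to exercise the loop without any vendor. The sketch below is a minimal example of that contract. ScriptedModel is a hypothetical name, and two details are assumptions rather than documented behavior: that arguments is a hash with symbol keys, and that an empty tool_calls array is how a model signals it has nothing more to do.

```ruby
# A canned model: returns pre-written replies in order,
# each in the { text:, tool_calls: [...] } shape described above.
class ScriptedModel
  def initialize(replies)
    @replies = replies.dup
  end

  # messages and tools are accepted to satisfy the contract but ignored here.
  def call(messages:, tools:)
    # Once the script runs out, reply with no tool calls (assumed stop signal).
    @replies.shift || { text: "done", tool_calls: [] }
  end
end

model = ScriptedModel.new([
  { text: "Clicking the first link.",
    tool_calls: [{ id: "call_1", name: "click", arguments: { ref: 1 } }] }
])

first  = model.call(messages: [], tools: [])
second = model.call(messages: [], tools: [])
```

An object like this could be passed as Flunky::Agent.new(session, model: model), which keeps specs deterministic and free of network calls.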

Configuration

Flunky.configure do |c|
  c.headless = true
  c.default_timeout = 10
  c.max_elements = 200
  c.window_size = [1280, 800]
end

Per-session options passed to Session.new override the global configuration.

Development

After checking out the repo, run bin/setup to install dependencies, then bundle exec rspec. Specs tagged :browser are skipped automatically when Chrome is not installed, so the suite passes on a bare machine.

License

Available as open source under the terms of the MIT License.
