KrakeIO/krake_browser

krake_browser

Persistent local Chromium that a human and an LLM share. The LLM drives most flows; the human steps in for 2FA, judgment calls, and approvals. Sessions stay warm forever.

Why

Most browser-automation tools today fall into two camps:

  1. Fully autonomous agents (Browser Use, Stagehand, Skyvern, MultiOn, Open Operator): these spin up their own headless browser, log in with stored credentials, and try to complete tasks unsupervised. Great for benchmarks, but brittle on real-world flows that hit 2FA, anti-bot checks, or judgment calls. The "stored credentials in someone else's cloud" model is also a non-starter for serious operational work.
  2. Pure record-replay (Playwright Codegen, Selenium IDE): fast to build, with no AI in the loop, but it breaks on the first DOM change.

krake_browser is the third camp: symbiosis.

A long-lived Chromium runs against your own user profile, so you're already logged into WhatsApp, Instagram, LinkedIn, your FDA portal, your Google Workspace, or whatever else you use. The LLM attaches over CDP (the Chrome DevTools Protocol), runs a recipe, and pauses with an explicit human_intervention action whenever it needs you: to solve a CAPTCHA, complete a 2FA challenge, approve a draft message, or take over a step that's easier done by hand.

The key property: session inheritance in both directions.

  • You 2FA into the FDA portal once; the LLM can now operate FDA pages.
  • The LLM keeps your WhatsApp Web session warm indefinitely; you skip the QR-scan dance every Monday.
  • When LinkedIn's anti-bot check challenges you, the LLM hands off, you click through, and the recipe resumes.
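
The hand-off described above can be sketched as a small executor loop. This is only an illustration, not the engine's code: the step fields (`action`, `selector`, `prompt`) and the `do_action` / `ask_human` callbacks are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List

Step = Dict[str, str]  # hypothetical step shape: {"action": ..., plus action-specific fields}


def run_recipe(
    steps: List[Step],
    do_action: Callable[[Step], None],
    ask_human: Callable[[str], bool],
) -> List[str]:
    """Run steps in order; on human_intervention, wait for the human's answer."""
    log: List[str] = []
    for index, step in enumerate(steps):
        if step["action"] == "human_intervention":
            # ask_human blocks until the human responds; False aborts the recipe.
            if not ask_human(step["prompt"]):
                log.append(f"{index}: aborted by human")
                break
            log.append(f"{index}: human acked")
        else:
            # Every other action is delegated to the browser driver.
            do_action(step)
            log.append(f"{index}: {step['action']}")
    return log


recipe = [
    {"action": "click", "selector": "#login"},
    {"action": "human_intervention", "prompt": "Complete 2FA, then confirm."},
    {"action": "click", "selector": "#dashboard"},
]

performed: List[Step] = []
log = run_recipe(recipe, performed.append, lambda prompt: True)
aborted_log = run_recipe(recipe, lambda step: None, lambda prompt: False)
```

Because the session lives in the shared browser profile, the steps after the pause run against whatever state the human left behind; the executor itself carries no credentials.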

How it fits

+----------------------+        +-------------------------+
|  Chat / LLM Client   |<-MCP-->|  krake_browser engine   |
|  (Claude, Cursor,    |        |  (Sinatra + Playwright) |
|   custom UI)         |        +-------------------------+
+----------------------+                    |
                                            | CDP
                                            v
                              +--------------------------+
                              |  Persistent Chromium     |
                              |  (--user-data-dir=...)   |
                              |  Tabs: WA / IG / FDA /.. |
                              +--------------------------+
                                            ^
                                            |
                                       Human, watching
                                       + co-piloting
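
As a rough sketch of the bottom half of the diagram, the persistent Chromium could be started with a fixed profile directory and a CDP port. `--user-data-dir` and `--remote-debugging-port` are real Chromium flags; the helper names, the binary name, and the profile path below are assumptions made for illustration.

```python
from typing import List


def chromium_launch_args(binary: str, profile_dir: str, cdp_port: int = 9222) -> List[str]:
    """Build the command line for a Chromium that keeps its profile and exposes CDP."""
    return [
        binary,
        f"--user-data-dir={profile_dir}",          # persistent profile: cookies and sessions survive restarts
        f"--remote-debugging-port={cdp_port}",     # CDP endpoint the engine attaches to
    ]


def cdp_endpoint(cdp_port: int = 9222) -> str:
    """URL an automation client would attach to; loopback only."""
    return f"http://127.0.0.1:{cdp_port}"


# Hypothetical binary name and profile path, for illustration only.
args = chromium_launch_args("chromium", "/tmp/krake-profile")
endpoint = cdp_endpoint()
```

Passing `args` to `subprocess.Popen` would start the browser, and a Playwright client could then attach with `connect_over_cdp(endpoint)`. Whether the engine wires it exactly this way is not specified here.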

Recipes (sequences of click / insert / wait / human_intervention / etc.) come from sibling repos:

  • krake_recipes — generic public recipes (WhatsApp, IG, LinkedIn, FDA, …). Community-contributable.
  • TrueSightDAO/tdg_recipes — DAO-specific recipes (Partner Follow-ups, Edgar contribution check-ins, …) that consume the same engine.
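
To make "a sequence of actions" concrete, here is a minimal sketch of a recipe as plain data plus a validator. The action names come from the DSL vocabulary listed under Lineage; the dictionary layout and field names (`selector`, `value`, `prompt`) are assumptions, and DSL.md remains the authority on the real format.

```python
from typing import Any, Dict, List

# Action vocabulary as named in this README.
KNOWN_ACTIONS = {
    "click", "insert", "scroll_bottom", "wait",
    "trigger_change", "solve_captcha", "human_intervention",
}


def validate_recipe(steps: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the recipe looks sane."""
    problems: List[str] = []
    for index, step in enumerate(steps):
        action = step.get("action")
        if action not in KNOWN_ACTIONS:
            problems.append(f"step {index}: unknown action {action!r}")
        elif action == "human_intervention" and not step.get("prompt"):
            # A hand-off without a prompt leaves the human guessing what to do.
            problems.append(f"step {index}: human_intervention needs a prompt")
    return problems


good_recipe = [
    {"action": "insert", "selector": "#search", "value": "invoice"},
    {"action": "click", "selector": "#submit"},
    {"action": "human_intervention", "prompt": "Approve the draft message."},
]
bad_recipe = [{"action": "hover"}, {"action": "human_intervention"}]

good_problems = validate_recipe(good_recipe)
bad_problems = validate_recipe(bad_recipe)
```

Validating before execution means a malformed community recipe fails fast instead of halfway through a logged-in session.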

Status

Stage 0 — scaffold. This repo currently contains the design:

  • README.md — what and why (this file)
  • ARCHITECTURE.md — components and how they wire together
  • DSL.md — recipe format, extending the Krake DSL from 2014 with formalized human_intervention

The engine itself is being built across subsequent commits. See the roadmap below.

Roadmap

  • Recipe DSL spec (extends Krake DSL from 2014 with formalized human_intervention)
  • CDP attach to externally-launched Chromium
  • Persistent profile management (--user-data-dir, restart-safe)
  • Recipe executor (consumes recipes from local clones of krake_recipes / tdg_recipes)
  • Chat-handoff transport (MCP server first; WebSocket fallback)
  • Tab coordination (LLM works in named tabs; warns before touching human's active tab)
  • Localhost auth on the CDP/MCP port (keys-to-the-kingdom; not optional)
  • Reference client: minimal CLI that proxies a recipe + prompts user when human_intervention fires
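
The tab-coordination item could look something like the sketch below: the LLM registers the tabs it works in by name and checks each one against the tab the human currently has focused. The `TabRegistry` class, its method names, and the tab ids are all hypothetical; the roadmap does not specify a design.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TabRegistry:
    """Tracks LLM-owned named tabs and the tab the human is currently using."""
    named_tabs: Dict[str, str] = field(default_factory=dict)  # name -> tab id
    human_active_tab: Optional[str] = None

    def claim(self, name: str, tab_id: str) -> None:
        """Register a tab the LLM intends to work in."""
        self.named_tabs[name] = tab_id

    def check(self, name: str) -> str:
        """Return 'ok', 'warn', or 'unknown' for a tab the LLM wants to touch."""
        tab_id = self.named_tabs.get(name)
        if tab_id is None:
            return "unknown"  # never claimed: the LLM should not touch it
        if tab_id == self.human_active_tab:
            return "warn"     # the human is looking at this tab right now
        return "ok"


registry = TabRegistry()
registry.claim("whatsapp", "tab-1")
registry.claim("fda", "tab-2")
registry.human_active_tab = "tab-2"
```

A caller would treat `"warn"` as a reason to pause and ask before acting, rather than as a hard error.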

Lineage

Direct evolution of the Krake.io DSL Gary Teh designed in 2014 for the original Krake scraper platform. The action vocabulary (click, insert, scroll_bottom, wait, trigger_change, solve_captcha) is preserved verbatim; the only new primitive is human_intervention with a prompt string and an ack channel, which formalizes what solve_captcha was already half-doing.

If you're familiar with the original Krake DSL, you already know 90% of krake_browser. See DSL.md for the delta.

Security note

The CDP port that the engine attaches to is keys-to-the-kingdom: it exposes every cookie and every active session in your browser to anything that can reach the port. By design, CDP stays bound to localhost only and the engine requires a token on the MCP/WebSocket surface. Do not expose port 9222 to your LAN. Do not run this on a multi-user machine where other accounts are untrusted.
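
A minimal sketch of the two checks described above, loopback-only peers and a shared token, using only the Python standard library. The function names and the token length are assumptions; the engine's actual auth scheme is not specified in this README.

```python
import hmac
import ipaddress
import secrets


def new_token() -> str:
    """Generate a random token to hand to the local client out of band."""
    return secrets.token_urlsafe(32)


def is_loopback(host: str) -> bool:
    """True for 127.0.0.0/8, ::1, or the literal name 'localhost'."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; accept only the conventional loopback hostname.
        return host == "localhost"


def authorize(presented: str, expected: str, peer_host: str) -> bool:
    """Allow a request only from a loopback peer presenting the right token."""
    # compare_digest avoids leaking how many leading characters matched.
    token_ok = hmac.compare_digest(presented.encode(), expected.encode())
    return is_loopback(peer_host) and token_ok


token = new_token()
```

The token check matters even on localhost, because any local process that can reach the port would otherwise inherit every logged-in session.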

License

MIT — see LICENSE.
