marcopared/browser-flow

Flow

Flow is a local AI-assisted browser-flow capture and extraction harness. It captures browser or fixture evidence, analyzes saved artifacts, creates an extraction plan, requires human approval, generates a reusable extractor, and validates that extractor against the saved HTML.
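To make the "generate a reusable extractor and validate it against the saved HTML" idea concrete, here is a minimal standalone sketch. It is not Flow's generated code: the class and function names, the inline sample page, and the choice of JSON-LD `JobPosting` parsing are all assumptions for illustration (the fixture name `jsonld_job.html` only hints that JSON-LD is involved).

```python
import json
from html.parser import HTMLParser

# Hypothetical stand-in for a saved HTML artifact; not a real Flow fixture.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "JobPosting", "title": "Backend Engineer",
 "hiringOrganization": {"name": "Example Corp"},
 "description": "Build services."}
</script>
</head><body></body></html>
"""


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_job(html):
    """Return a dict for the first JobPosting found in the page, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        if isinstance(data, dict) and data.get("@type") == "JobPosting":
            org = data.get("hiringOrganization")
            return {
                "title": data.get("title"),
                "company": org.get("name") if isinstance(org, dict) else org,
                "description": data.get("description"),
            }
    return None


def missing_fields(record, required=("title", "company", "description")):
    """Validation step: list the required fields that are absent or empty."""
    if record is None:
        return list(required)
    return [name for name in required if not record.get(name)]


record = extract_job(SAMPLE_HTML)
problems = missing_fields(record)
```

An empty `problems` list means the extractor recovered every required field from the saved page. That is the kind of check a validation step can repeat whenever the extractor changes, without touching the live site.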

Install

python -m pip install -e ".[test]"

Flow uses filesystem storage only. Runs are written under runs/; generated code is written under generated/.

bb-browser Setup

For authenticated captures, start Chrome with remote debugging enabled on Flow's default port, 19825 (macOS path shown):

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=19825
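Before running a capture, it can help to confirm that Chrome is actually listening on the debugging port. The sketch below queries the `/json/version` HTTP endpoint that Chrome exposes when remote debugging is enabled. The helper names `cdp_version_url` and `probe_cdp` are hypothetical and not part of Flow or bb-browser.

```python
import json
import urllib.request
from http.client import HTTPException


def cdp_version_url(port=19825, host="127.0.0.1"):
    """Build the URL of Chrome's DevTools version endpoint."""
    return f"http://{host}:{port}/json/version"


def probe_cdp(port=19825, timeout=2.0):
    """Return Chrome's version info as a dict, or None if nothing usable answers."""
    try:
        with urllib.request.urlopen(cdp_version_url(port), timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, HTTPException, ValueError):
        # OSError covers refused connections and timeouts (URLError subclasses it);
        # ValueError covers a response body that is not valid JSON.
        return None
```

Calling `probe_cdp()` and getting `None` back usually means Chrome was started without the flag or on a different port.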

Flow calls bb-browser through:

npx -y bb-browser ...

Unit tests do not require Chrome, CDP, or bb-browser.

Workflow

flow capture \
  --name demo \
  --from-html tests/fixtures/jsonld_job.html \
  --goal "Extract job" \
  --fields title,company,location,description,apply_url

flow analyze runs/demo/latest
flow review runs/demo/latest
flow approve runs/demo/latest --notes "Correct"
flow generate runs/demo/latest --target python-extractor
flow validate generated/demo
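The same sequence can be scripted. The following is a rough sketch rather than anything shipped with Flow: `workflow_commands` and `run_workflow` are hypothetical helpers that only assemble and run the commands shown above, pausing for confirmation before the approval step so the human check stays in the loop.

```python
import shlex
import subprocess


def workflow_commands(name, fixture, goal, fields, notes="Correct"):
    """Return the six CLI invocations from the workflow above as argv lists."""
    run_dir = f"runs/{name}/latest"
    return [
        ["flow", "capture", "--name", name, "--from-html", fixture,
         "--goal", goal, "--fields", ",".join(fields)],
        ["flow", "analyze", run_dir],
        ["flow", "review", run_dir],
        ["flow", "approve", run_dir, "--notes", notes],
        ["flow", "generate", run_dir, "--target", "python-extractor"],
        ["flow", "validate", f"generated/{name}"],
    ]


def run_workflow(commands):
    """Run each command in order; stop on the first failure or a declined approval."""
    for argv in commands:
        if argv[1] == "approve":
            answer = input("Plan reviewed. Approve and continue? [y/N] ")
            if answer.strip().lower() != "y":
                print("Stopped before approval.")
                return False
        print("$", shlex.join(argv))
        subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return True


# Dry run: print the commands without executing anything.
commands = workflow_commands(
    "demo",
    "tests/fixtures/jsonld_job.html",
    "Extract job",
    ["title", "company", "location", "description", "apply_url"],
)
for argv in commands:
    print(shlex.join(argv))
```

Passing `commands` to `run_workflow` executes them for real. Argv lists are used instead of a shell string so that values such as the goal text need no manual quoting.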

During development, python -m flow.cli <args> works the same as flow <args>.

Live Examples

LinkedIn authenticated browser capture:

flow capture \
  --name linkedin_recommended \
  --url "https://www.linkedin.com/jobs/collections/recommended" \
  --goal "Extract recommended LinkedIn jobs" \
  --browser bb \
  --port 19825 \
  --fields title,company,location,detail_url,description,apply_url

Goldman Sachs fetch capture:

flow capture \
  --name goldman_sachs_jobs \
  --url "https://higher.gs.com/results?LOCATION=New%20York&page=1&search=software%20engineer&sort=RELEVANCE" \
  --goal "Extract Goldman Sachs software engineering jobs in New York" \
  --browser fetch \
  --fields title,location,division,description,compensation,apply_url
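Search URLs like the one above contain percent-encoded query values, which are easy to get wrong by hand. As a small sketch using only the standard library, the query string can be built from plain values. The helper name `build_search_url` is hypothetical, and the parameter names are simply copied from the example URL.

```python
from urllib.parse import quote, urlencode


def build_search_url(base, params):
    """Join a base URL and query parameters, encoding spaces as %20."""
    # quote_via=quote yields %20 for spaces; the default (quote_plus) yields "+".
    return f"{base}?{urlencode(params, quote_via=quote)}"


url = build_search_url(
    "https://higher.gs.com/results",
    {
        "LOCATION": "New York",
        "page": 1,
        "search": "software engineer",
        "sort": "RELEVANCE",
    },
)
```

The resulting `url` can be passed to `--url` unchanged. Changing `page` or `search` then needs no manual escaping.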

Safety

Flow is read-only by default. It captures, analyzes, extracts, and validates. It does not submit applications, purchase anything, send messages, change account settings, delete data, modify profiles, bypass CAPTCHAs, or store credentials.

Tests

python -m compileall flow
pytest -q
python -m flow.cli --help

About

Browser-flow capture and extraction harness for generating validated extractors and source adapters
