ReCAP-Agent

Training and evaluating CAPTCHA-capable GUI agents

Dynamic CAPTCHA generation, CAPTCHA benchmarks, unified evaluation framework, and trace generation pipelines for reasoning-action supervision.


Why ReCAP-Agent

Modern GUI agents can navigate websites, apps, and interfaces, but CAPTCHAs still break many real workflows. ReCAP-Agent is a practical stack for studying that failure mode end to end: generate CAPTCHA tasks, benchmark agents against them, and convert runs into training traces that support better reasoning and recovery behavior.

This repository brings together:

dynamic CAPTCHA environment and benchmark with diverse interaction patterns;
static real-world CAPTCHA benchmarks (contributed by Teoh et al.);
direct reasoning-action trace generation;
self-correction trace generation from failed attempts;
cross-provider evaluation for multiple model families.

Components

| Module | Purpose |
| --- | --- |
| `dynamic_captchas/` | Dynamically generated CAPTCHA tasks used to probe transfer across layouts and interaction styles. |
| `halligan_captchas/` | Static benchmark set based on real-world CAPTCHAs, contributed by Teoh et al. Included here for convenient local evaluation. |
| `captcha_eval_framework/` | Unified benchmarking framework for running GUI agents across providers and model families. |
| `trace_generation/` | Pipelines for generating direct traces, self-correction traces, and model-specific training data formats. |

CAPTCHA Coverage

The dynamic CAPTCHA system covers seven representative interactive types:

text
compact_text
icon_match
icon_selection
paged
slider
image_grid

These tasks collectively target four broad capabilities shown above:

optical character recognition,
continuous control,
spatial localization, and
visual-semantic comprehension.
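
When comparing agents across the seven types, it can help to report a success rate per type rather than one aggregate number. The following is a minimal sketch of such a tally. The `(captcha_type, solved)` record shape and the function name `success_rate_by_type` are assumptions made for illustration, not the evaluation framework's actual output format, and the sample outcomes are made up.

```python
from collections import defaultdict

# The seven dynamic CAPTCHA types listed above.
CAPTCHA_TYPES = (
    "text", "compact_text", "icon_match", "icon_selection",
    "paged", "slider", "image_grid",
)


def success_rate_by_type(results):
    """Return {captcha_type: fraction solved} for types with at least one attempt.

    `results` is an iterable of (captcha_type, solved) pairs. This record
    shape is hypothetical; adapt it to whatever your runs actually log.
    """
    attempts = defaultdict(int)
    solved = defaultdict(int)
    for captcha_type, ok in results:
        if captcha_type not in CAPTCHA_TYPES:
            raise ValueError(f"unknown CAPTCHA type: {captcha_type!r}")
        attempts[captcha_type] += 1
        solved[captcha_type] += int(bool(ok))
    # Keep the listing order of CAPTCHA_TYPES and skip types never attempted.
    return {t: solved[t] / attempts[t] for t in CAPTCHA_TYPES if attempts[t]}


# Made-up outcomes for illustration only, not benchmark results.
sample = [("text", True), ("text", False), ("slider", True)]
print(success_rate_by_type(sample))  # {'text': 0.5, 'slider': 1.0}
```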

Quick Start

1. Start the Dynamic CAPTCHA server

cd dynamic_captchas
pip install -r requirements.txt
python download_datasets.py
python app.py
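
If you script the later steps, you may want to wait until the server is reachable before launching an evaluation. The sketch below polls a URL using only the Python standard library. The helper name `wait_for_server` and the example address are assumptions: this README does not state which host or port `app.py` binds to, so substitute the address printed when the server starts.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Poll `url` until it answers an HTTP request or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s + 1.0):
                return True  # the server returned a successful response
        except urllib.error.HTTPError:
            return True  # the server answered, just with an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval_s)  # nothing is listening yet; retry
    return False


# Example call (hypothetical address, replace with the real one):
# wait_for_server("http://127.0.0.1:5000/", timeout_s=10.0)
```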

2. Optional: start the Halligan benchmark server

cd halligan_captchas
conda env create --file environment.yml --name halligan-benchmark
conda activate halligan-benchmark
python server.py

3. Run the unified evaluation framework

cd captcha_eval_framework
pip install -r requirements.txt
cp .env.example .env
python3 ./main.py --provider dynamic --test-mode once --model-family qwen3
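
To launch the same command from a script, for example to repeat runs with different settings, the argument list can be assembled programmatically. This is a minimal sketch: the three flags and the values `dynamic`, `once`, and `qwen3` come from the command above, while the helper `build_eval_command` is a hypothetical name. Other accepted values for each flag are not listed here; check the framework's own README before passing anything else.

```python
import shlex
import sys


def build_eval_command(provider: str, test_mode: str, model_family: str) -> list[str]:
    """Assemble the argument list for captcha_eval_framework/main.py."""
    return [
        sys.executable, "./main.py",  # current interpreter instead of a bare `python3`
        "--provider", provider,
        "--test-mode", test_mode,
        "--model-family", model_family,
    ]


cmd = build_eval_command("dynamic", "once", "qwen3")
print(shlex.join(cmd[1:]))
# ./main.py --provider dynamic --test-mode once --model-family qwen3

# To actually run it from the repository root:
# import subprocess
# subprocess.run(cmd, cwd="captcha_eval_framework", check=True)
```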

4. Generate traces for training

cd ..
pip install -r captcha_eval_framework/requirements.txt
python -m playwright install chromium
python -m trace_generation direct
python -m trace_generation self-correction
python -m trace_generation convert
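
The three `trace_generation` commands can also be chained from Python so that a failing stage stops the run. This sketch assumes the order shown above is the intended order and that each stage signals failure with a non-zero exit code; neither is confirmed by this README. The names `stage_command` and `run_stages` are hypothetical helpers, not part of the module.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence

# Subcommands in the order given in the Quick Start.
TRACE_STAGES = ("direct", "self-correction", "convert")


def stage_command(stage: str) -> list[str]:
    """Build the `python -m trace_generation <stage>` argument list."""
    return [sys.executable, "-m", "trace_generation", stage]


def run_stages(
    stages: Sequence[str] = TRACE_STAGES,
    runner: Optional[Callable[[list[str]], int]] = None,
) -> list[str]:
    """Run stages in order, stop at the first non-zero exit code.

    Returns the names of the stages that completed successfully.
    `runner` takes an argument list and returns an exit code; it defaults
    to subprocess.run and can be swapped out for a dry run.
    """
    if runner is None:
        runner = lambda cmd: subprocess.run(cmd).returncode
    completed = []
    for stage in stages:
        if runner(stage_command(stage)) != 0:
            break  # assumed convention: non-zero exit means the stage failed
        completed.append(stage)
    return completed


# Run from the repository root, as in the commands above:
# run_stages()
```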

For setup details, environment variables, and advanced usage, refer to the component READMEs linked above.

Repository Layout

ReCAP-Agent/
├── dynamic_captchas/
├── halligan_captchas/
├── captcha_eval_framework/
├── trace_generation/
├── images/
└── README.md

Roadmap

Dynamic CAPTCHA generation and verification server
Static benchmark integration
Unified cross-provider evaluation framework
Trace generation module with direct and self-correction traces

Contributing

Contributions are welcome.

1. Fork the repository and create a branch for your change.
2. Make the change with clear commits and any necessary documentation updates.
3. Push your branch and open a pull request describing the motivation and behavior change.

License

This project is licensed under the MIT License. See the LICENSE file for details.
