WebHarbor docks popular websites into local, stable, Docker-based mirrors with full auth, database, and multimodal image content. Environments evolve with agent capability.
Live websites are noisy: reCAPTCHA, geo-blocks, network flakiness, content drift. Their most useful features sit behind login walls that benchmarks can't touch. Existing offline web environments either freeze the web into toy synthetic sites or fall back to static traces with no real interaction, which limits large-scale RL training.
WebHarbor takes a different approach. We use coding agents (e.g., Claude Code or Codex) to mirror real sites into local Docker images that are:
- Stable & reproducible — no network noise, no content drift, no geo-blocks
- Deep features unlocked — carts, checkouts, accounts, all fully testable
- Evolving — harder tasks drive richer mirrors; the environment grows with agents
- RL-ready — sub-second database resets between rollouts
- Community-driven — 15 sites today, scaling to 100+ together
One command to run all web environments:
docker run -p 8101:8101 -p 40000-40014:40000-40014 battalion7244/webharbor:latest
Then point your agent at http://localhost:40000 through http://localhost:40014 to explore 15 local mirrors of WebVoyager sites: Allrecipes, Amazon, Apple, ArXiv, BBC News, Booking, GitHub, Google Flights, Google Maps, Google Search, Hugging Face, Wolfram Alpha, Cambridge Dictionary, Coursera, and ESPN.
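Before pointing an agent at the mirrors, it can help to confirm that every port is answering. The following is a minimal sketch, not part of WebHarbor itself: `mirror_urls`, `wait_for_mirrors`, `BASE_PORT`, and `NUM_SITES` are hypothetical names chosen for illustration, and it assumes `curl` is installed on the host. The port range comes from the `docker run` command above; which site lives on which port is not stated here, so the sketch only enumerates the ports.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with WebHarbor.
# Port range taken from the docker run command: 40000-40014 (15 mirrors).
BASE_PORT=40000
NUM_SITES=15

# Print one URL per mirror, in port order.
mirror_urls() {
  i=0
  while [ "$i" -lt "$NUM_SITES" ]; do
    echo "http://localhost:$((BASE_PORT + i))"
    i=$((i + 1))
  done
}

# Poll each mirror once per second until it returns any HTTP response.
# Gives up on a mirror after the given number of attempts (default 60).
wait_for_mirrors() {
  max_attempts="${1:-60}"
  for url in $(mirror_urls); do
    attempt=0
    until curl -s -o /dev/null --max-time 2 "$url"; do
      attempt=$((attempt + 1))
      if [ "$attempt" -ge "$max_attempts" ]; then
        echo "gave up waiting for $url" >&2
        return 1
      fi
      sleep 1
    done
  done
}
```

Sourcing this file and running `wait_for_mirrors 30` after starting the container would block until all 15 ports respond, or return a non-zero status if one of them stays unreachable. Any HTTP response counts as "up" in this sketch, including error pages, so it checks reachability rather than correctness.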
For sub-second resets between rollouts, expose the control plane (port 8101 in the command above) and call /reset/<site> for one site or /reset-all for every site:
curl -X POST http://localhost:8101/reset/amazon # one site
curl -X POST http://localhost:8101/reset-all # all sites in parallel
If you prefer to build the image yourself:
git clone https://github.com/aiming-lab/WebHarbor && cd WebHarbor
./scripts/fetch_assets.sh # pulls static assets from ChilleD/WebHarbor on HF
./scripts/build.sh # docker build -t webharbor:dev .
We have built 15 high-quality mirrors covering the WebVoyager benchmark. The next goal is 100+ sites, covering everything in Online-Mind2Web. We are inviting the community to build this together.
There are two ways to join the author list:
Use a coding agent to build a new mirror (frontend + backend + database + tasks). Contributing one website qualifies you for consideration on the final paper's author list.
- Browse the Contribution Track Sheet and pick an unclaimed site.
- Submit the Contribution Request Form to claim it. We lock the site to prevent duplicate work.
- Follow the Website Contribution Guide and CONTRIBUTING.md to build and open a PR.
Review submitted mirrors for visual fidelity, functional correctness, and task grounding. Reviewing 5 environments earns a spot on the author list.
- Browse open Pull Requests.
- Check whether the submitted environment supports its proposed tasks, and whether those tasks are meaningful and challenging.
- Follow the Review Pipeline for systematic verification.
Any other improvement (bug fixes, UI polish, data enrichment, task suggestions, or even feedback) qualifies for the paper's acknowledgement section.
| Name | Link |
|---|---|
| 🏠 WebHarbor Project Page | WebHarbor |
| 🤗 HuggingFace Dataset | ChilleD/WebHarbor |
| 💻 WebHarbor GitHub | Code Repo |
| 📊 Contribution Track Sheet | Google Sheet |
| 📝 Contribution Request Form | Google Form |
WebHarbor is initiated by UNC-Chapel Hill and Microsoft, with contributions from the broader community. If you have any questions, please contact us via webharborcomm at gmail dot com or zhaoyang at cs dot unc dot edu.
@misc{webharbor2026,
title = {WebHarbor: Docking Real Websites for Evolving GUI Agent Environments},
author = {{WebHarbor Team and Contributors}},
year = {2026},
url = {https://aiming-lab.github.io/webharbor.github.io},
note = {Project website.}
}