Feat/manage tabs #8

olesho · 2025-10-22T02:29:00Z

No description provided.

- Add VisionJudge class to evals/lib/judge.py for visual evaluation with screenshots
- Add /page/content endpoint to api-server.js for retrieving page HTML/text
- Add /page/screenshot endpoint to api-server.js for capturing screenshots
- Fixes ImportError: cannot import name 'VisionJudge'

These were lost during cleanup but are required for the eval runner to work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Created separate example config files for different model providers:

- config.example.openai.yml: OpenAI models (now default)
- config.example.openrouter-gemini.yml: OpenRouter with Gemini
- config.example.openrouter-gpt.yml: OpenRouter with GPT

Main config.yml now defaults to OpenAI provider for reliability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

…nt into feat/manage-tabs # Conflicts: # evals/config.yml

Copilot

Pull Request Overview

This PR implements a comprehensive tab management system for the browser automation framework, transitioning from an eval-server architecture to a browser-agent-server approach. The changes remove the eval-server components and introduce new browser-agent-server functionality with improved tab handling, initialization scripts, and configuration management.

Key changes:

- Removed eval-server Python and NodeJS implementations
- Introduced browser-agent-server configuration for both standard and Cloud Run deployments
- Added container initialization scripts for cleanup of lock files
- Implemented comprehensive evaluation framework with LLM-as-a-judge capabilities
- Updated Chromium configuration to remove auto-devtools flag

Reviewed Changes

Copilot reviewed 112 out of 270 changed files in this pull request and generated 1 comment.

Show a summary per file

| File | Description |
| --- | --- |
| `supervisor/services/browser-agent-server.conf` | Added new supervisor configuration for browser-agent-server service |
| `supervisor/services-cloudrun/browser-agent-server.conf` | Updated Cloud Run supervisor config to use browser-agent-server instead of eval-server |
| `supervisor/services/chromium.conf` | Removed CHROMIUM_FLAGS environment variable for auto-opening devtools |
| `scripts/init-container.sh` | New initialization script for container startup that cleans lock files |
| `scripts/test-browser-agent-server.sh` | New test script for browser-agent-server Docker build |
| `evals/run.py` | New universal evaluation runner with screenshot capture and vision judge support |
| `evals/lib/*.py` | New evaluation framework library modules (config, judge, api client, eval loader) |
| `evals/config*.yml` | New configuration files for evaluation framework with model settings |
| `evals/data/*/.yaml` | Updated evaluation definitions with corrected field names and structure |

Files not reviewed (2)

browser-agent-server/nodejs/package-lock.json: Language not supported
eval-server/nodejs/package-lock.json: Language not supported


Copilot · 2025-10-22T05:50:30Z

scripts/test-browser-agent-server.sh

+docker run --rm browser-agent-server-test ls -la /browser-agent-server/node_modules | head -5
+
+echo ""
+echo "✅ All checks passed! Eval-server build is working."


The message refers to 'Eval-server' but should refer to 'browser-agent-server' to match the actual component being tested.

Suggested change

-echo "✅ All checks passed! Eval-server build is working."
+echo "✅ All checks passed! browser-agent-server build is working."

tysonthomas9 · 2025-10-22T05:51:48Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2025-10-22T05:55:23Z

browser-agent-server/start.js

 #!/usr/bin/env node

-// Custom eval-server startup script for Cloud Run
+// Custom browser-agent-server startup script for Cloud Run
 // Uses environment variables for port configuration

 import { EvalServer } from './src/lib/EvalServer.js';


Update Cloud Run start script to use BrowserAgentServer

The startup script still imports ./src/lib/EvalServer.js and constructs EvalServer, but that module was removed in this commit when the implementation was renamed to BrowserAgentServer.js. Executing this script will throw ERR_MODULE_NOT_FOUND before the server even starts, so Cloud Run deployments will fail. Replace the import (and class name) with BrowserAgentServer to match the new file.
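A stale import like this one can be caught before deployment. The snippet below is only a minimal sketch of such a check, not code from this PR: `findMissingRelativeImports`, the `exists` callback, and the `presentFiles` set are hypothetical names, and the file layout is assumed from the review comment (only `BrowserAgentServer.js` remains after the rename).

```javascript
// Hypothetical helper, not part of the PR: returns the relative import
// specifiers in a module's source for which `exists` reports no file.
function findMissingRelativeImports(source, exists) {
  // Matches static imports such as: import { X } from './path.js';
  // and side-effect imports such as: import './path.js';
  const importPattern =
    /^\s*import\s+(?:[^'"]*?\s+from\s+)?['"](\.{1,2}\/[^'"]+)['"]/gm;
  const missing = [];
  for (const match of source.matchAll(importPattern)) {
    const specifier = match[1];
    if (!exists(specifier)) {
      missing.push(specifier);
    }
  }
  return missing;
}

// Example input mirroring the import flagged in the review.
const startScript = [
  '#!/usr/bin/env node',
  "import { EvalServer } from './src/lib/EvalServer.js';",
].join('\n');

// Assumed layout after the rename: only BrowserAgentServer.js is present.
const presentFiles = new Set(['./src/lib/BrowserAgentServer.js']);

const missing = findMissingRelativeImports(startScript, (p) =>
  presentFiles.has(p)
);
console.log(missing);
```

In a real check, the `exists` callback could wrap `fs.existsSync` on the path resolved relative to the importing file instead of consulting a fixed set.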


Should be tackled now.

olesho and others added 22 commits October 15, 2025 15:28

Local deployment minor fixes

9a07b1f

Make dev build

edbf470

Local deployment setup: should support fast building

8afb683

Added proper /tabs API

81afe0e

Added API for managing tabs

7f5e4cd

New tab opened for each /responses API call

469cc40

Updated documentation

ccc914d

Initial eval setup

6f33f63

Evals refactoring. Only simple test works.

54af04e

Changed evals structure

c94dd24

Cleanup

0a23ba6

Changed evals structure

17d3bff

Another attempt to cleanup

1fb0bc1

Merge branch 'feat/manage-tabs' of github.com:BrowserOperator/web-age…

29ad772

…nt into feat/manage-tabs # Conflicts: # evals/config.yml

Cleanup and refactoring; fixed docker-compose setup

035855f

Adjust UDP ranges

dc58434

Remove unused code from eval-server

6b1baa4

Refactored mounted directories

d0b464c

Refactoring

501fbb5

Refactoring and proper naming

0c8530d

olesho requested a review from tysonthomas9 October 22, 2025 02:29

olesho added 2 commits October 21, 2025 21:53

Cleanup config

3044f90

Example configs

bd5061e

tysonthomas9 requested a review from Copilot October 22, 2025 05:49

Copilot AI reviewed Oct 22, 2025


chatgpt-codex-connector bot reviewed Oct 22, 2025


Fix the old ./src/lib/EvalServer.js reference

29081c5

tysonthomas9 approved these changes Oct 23, 2025


olesho merged commit 355b80a into main Oct 23, 2025
