Feat/manage tabs #8
- Add VisionJudge class to evals/lib/judge.py for visual evaluation with screenshots
- Add /page/content endpoint to api-server.js for retrieving page HTML/text
- Add /page/screenshot endpoint to api-server.js for capturing screenshots
- Fixes ImportError: cannot import name 'VisionJudge'
These were lost during cleanup but are required for the eval runner to work.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

Created separate example config files for different model providers:
- config.example.openai.yml: OpenAI models (now default)
- config.example.openrouter-gemini.yml: OpenRouter with Gemini
- config.example.openrouter-gpt.yml: OpenRouter with GPT
Main config.yml now defaults to OpenAI provider for reliability.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

…nt into feat/manage-tabs
# Conflicts:
# evals/config.yml
Pull Request Overview
This PR implements a comprehensive tab management system for the browser automation framework, transitioning from an eval-server architecture to a browser-agent-server approach. The changes remove the eval-server components and introduce new browser-agent-server functionality with improved tab handling, initialization scripts, and configuration management.
Key changes:
- Removed eval-server Python and NodeJS implementations
- Introduced browser-agent-server configuration for both standard and Cloud Run deployments
- Added container initialization scripts for cleanup of lock files
- Implemented comprehensive evaluation framework with LLM-as-a-judge capabilities
- Updated Chromium configuration to remove auto-devtools flag
Reviewed Changes
Copilot reviewed 112 out of 270 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| supervisor/services/browser-agent-server.conf | Added new supervisor configuration for browser-agent-server service |
| supervisor/services-cloudrun/browser-agent-server.conf | Updated Cloud Run supervisor config to use browser-agent-server instead of eval-server |
| supervisor/services/chromium.conf | Removed CHROMIUM_FLAGS environment variable for auto-opening devtools |
| scripts/init-container.sh | New initialization script for container startup that cleans lock files |
| scripts/test-browser-agent-server.sh | New test script for browser-agent-server Docker build |
| evals/run.py | New universal evaluation runner with screenshot capture and vision judge support |
| evals/lib/*.py | New evaluation framework library modules (config, judge, api client, eval loader) |
| evals/config*.yml | New configuration files for evaluation framework with model settings |
| evals/data/**/*.yaml | Updated evaluation definitions with corrected field names and structure |
Files not reviewed (2)
- browser-agent-server/nodejs/package-lock.json: Language not supported
- eval-server/nodejs/package-lock.json: Language not supported
docker run --rm browser-agent-server-test ls -la /browser-agent-server/node_modules | head -5

echo ""
echo "✅ All checks passed! Eval-server build is working."
Copilot AI, Oct 22, 2025
The message refers to 'Eval-server' but should refer to 'browser-agent-server' to match the actual component being tested.
- echo "✅ All checks passed! Eval-server build is working."
+ echo "✅ All checks passed! browser-agent-server build is working."
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
browser-agent-server/start.js
Outdated
  #!/usr/bin/env node

- // Custom eval-server startup script for Cloud Run
+ // Custom browser-agent-server startup script for Cloud Run
  // Uses environment variables for port configuration

  import { EvalServer } from './src/lib/EvalServer.js';
Update Cloud Run start script to use BrowserAgentServer
The startup script still imports ./src/lib/EvalServer.js and constructs EvalServer, but that module was removed in this commit when the implementation was renamed to BrowserAgentServer.js. Executing this script will throw ERR_MODULE_NOT_FOUND before the server even starts, so Cloud Run deployments will fail. Replace the import (and class name) with BrowserAgentServer to match the new file.
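A stale import like this could in principle be flagged before deployment. The following is only a minimal sketch of such a check, not something this PR contains: the helper name `findStaleImports` and the sample script text are hypothetical, and it assumes that matching single-line `import ... from '...'` statements is sufficient.

```javascript
// Hypothetical pre-deploy check (not part of this PR).
// Returns the import lines in `source` whose module path ends with
// one of the suffixes in `removedModules`.
function findStaleImports(source, removedModules) {
  // Matches single-line static imports and captures the module path.
  const importLine = /^\s*import\s.*from\s+['"]([^'"]+)['"]/;
  return source
    .split('\n')
    .filter((line) => {
      const match = importLine.exec(line);
      if (!match) return false;
      return removedModules.some((suffix) => match[1].endsWith(suffix));
    })
    .map((line) => line.trim());
}

// Sample input modeled on the start.js lines quoted in the review.
const startScript = [
  '#!/usr/bin/env node',
  '// Custom browser-agent-server startup script for Cloud Run',
  "import { EvalServer } from './src/lib/EvalServer.js';",
].join('\n');

const stale = findStaleImports(startScript, ['/EvalServer.js']);
if (stale.length > 0) {
  console.log('Stale imports found:', stale);
}
```

With the import switched to `./src/lib/BrowserAgentServer.js`, as the comment suggests, the same check would return an empty list.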
Should be tackled now.
No description provided.