
@xingyaoww
Collaborator

@xingyaoww xingyaoww commented Nov 11, 2025


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:97e82c2-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-97e82c2-python \
  ghcr.io/openhands/agent-server:97e82c2-python

All tags pushed for this build

ghcr.io/openhands/agent-server:97e82c2-golang-amd64
ghcr.io/openhands/agent-server:97e82c2-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:97e82c2-golang-arm64
ghcr.io/openhands/agent-server:97e82c2-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:97e82c2-java-amd64
ghcr.io/openhands/agent-server:97e82c2-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:97e82c2-java-arm64
ghcr.io/openhands/agent-server:97e82c2-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:97e82c2-python-amd64
ghcr.io/openhands/agent-server:97e82c2-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:97e82c2-python-arm64
ghcr.io/openhands/agent-server:97e82c2-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:97e82c2-golang
ghcr.io/openhands/agent-server:97e82c2-java
ghcr.io/openhands/agent-server:97e82c2-python
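
The long tag names above appear to be derived from the base image reference by replacing `/` with `_s_` and `:` with `_tag_`. The following is a minimal sketch of that mapping, inferred only from the tags listed in this comment; the function name `base_image_to_tag` is hypothetical and is not taken from the build scripts.

```python
from typing import Optional

REGISTRY_IMAGE = "ghcr.io/openhands/agent-server"


def base_image_to_tag(short_sha: str, base_image: str, arch: Optional[str] = None) -> str:
    """Build a tag like the ones listed above (assumed scheme, inferred from the list)."""
    # Assumption: "/" becomes "_s_" and ":" becomes "_tag_", as the listed tags suggest.
    sanitized = base_image.replace("/", "_s_").replace(":", "_tag_")
    tag = f"{short_sha}-{sanitized}"
    if arch:
        # Per-architecture tags carry an "-amd64" or "-arm64" suffix.
        tag = f"{tag}-{arch}"
    return f"{REGISTRY_IMAGE}:{tag}"


if __name__ == "__main__":
    print(base_image_to_tag("97e82c2", "golang:1.21-bookworm", "arm64"))
```

Under that assumption, the three base images in the variants table reproduce the long per-architecture tags in the list.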

About Multi-Architecture Support

  • Each variant tag (e.g., 97e82c2-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 97e82c2-python-amd64) are also available if needed
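
If a script needs to pin the architecture-specific tag rather than rely on the multi-arch manifest, one option is to map the host's machine name onto the `-amd64` / `-arm64` suffix used above. This is only a sketch: the helper name `arch_specific_tag` and the alias table are assumptions, not part of the published images or tooling.

```python
import platform
from typing import Optional

# Assumed mapping from common machine names to the suffixes used in the tag list.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def arch_specific_tag(variant_tag: str, machine: Optional[str] = None) -> str:
    """Append the architecture suffix for `machine` (defaults to the current host)."""
    name = (machine or platform.machine()).lower()
    arch = _ARCH_ALIASES.get(name)
    if arch is None:
        raise ValueError(f"no published architecture tag for machine type {name!r}")
    return f"{variant_tag}-{arch}"


if __name__ == "__main__":
    print(arch_specific_tag("ghcr.io/openhands/agent-server:97e82c2-python"))
```

In most cases the plain variant tag is enough, since Docker resolves the manifest to the matching architecture on its own.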

@xingyaoww xingyaoww added the integration-test label (Runs the integration tests and comments the results) Nov 11, 2025
@github-actions
Contributor

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

@xingyaoww xingyaoww added the test-examples label (Run all applicable "examples/" files. Expensive operation.) Nov 11, 2025
@github-actions
Contributor

github-actions bot commented Nov 11, 2025

🔄 Running Examples with openhands/claude-haiku-4-5-20251001

Last updated: 2025-11-11 18:35:41 UTC

| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 30s | $0.02 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 11s | $0.01 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 10s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 42s | $0.02 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 11s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 33s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 33s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 16s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 41s | $0.02 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 201s | $0.41 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 17s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 27s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 14s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 16s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 11s | $0.01 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 12s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 40s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 248s | $0.27 |
| 01_standalone_sdk/25_agent_delegation.py | ❌ FAIL (exit: 1) | 76s | $0.00 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 25s | $0.00N/A |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 71s | $0.06 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 102s | $0.04 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 57s | $0.04 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 98s | $0.02 |

❌ Some tests failed

Total: 24 | Passed: 23 | Failed: 1

View full workflow run

@github-actions
Contributor

Coverage

Coverage Report •
| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| TOTAL | 12171 | 5633 | 53% | |
report-only-changed-files is enabled. No files were changed during this commit :)

@github-actions
Contributor

🧪 Integration Tests Results

Overall Success Rate: 100.0%
Total Cost: $0.70
Models Tested: 4
Timestamp: 2025-11-11 18:19:29 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

📊 Summary

| Model | Success Rate | Tests Passed | Skipped | Total Tests | Cost |
|---|---|---|---|---|---|
| litellm_proxy_gpt_5_mini_2025_08_07 | 100.0% | 8/8 | 0 | 8 | $0.04 |
| litellm_proxy_claude_sonnet_4_5_20250929 | 100.0% | 8/8 | 0 | 8 | $0.39 |
| litellm_proxy_moonshot_kimi_k2_thinking | 100.0% | 7/7 | 1 | 8 | $0.25 |
| litellm_proxy_deepseek_deepseek_chat | 100.0% | 7/7 | 1 | 8 | $0.02 |
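
The headline figures can be cross-checked against the per-model rows. The sketch below assumes that skipped tests are excluded from the success-rate denominator (as the 7/7 rows suggest) and that the total cost is a plain sum of the per-model costs; the names `ROWS` and `summarize` are illustrative and do not come from the workflow itself.

```python
from decimal import Decimal

# (model, tests passed, tests run, skipped, cost in USD), copied from the summary above.
ROWS = [
    ("litellm_proxy_gpt_5_mini_2025_08_07", 8, 8, 0, "0.04"),
    ("litellm_proxy_claude_sonnet_4_5_20250929", 8, 8, 0, "0.39"),
    ("litellm_proxy_moonshot_kimi_k2_thinking", 7, 7, 1, "0.25"),
    ("litellm_proxy_deepseek_deepseek_chat", 7, 7, 1, "0.02"),
]


def summarize(rows):
    """Return (overall success rate in percent, total cost) for the given rows."""
    passed = sum(row[1] for row in rows)
    run = sum(row[2] for row in rows)  # assumption: skipped tests are not counted here
    # Decimal avoids binary floating-point drift when adding currency amounts.
    total_cost = sum((Decimal(row[4]) for row in rows), Decimal("0"))
    return 100 * passed / run, total_cost


if __name__ == "__main__":
    rate, cost = summarize(ROWS)
    print(f"Overall Success Rate: {rate:.1f}%")
    print(f"Total Cost: ${cost}")
```

With the four rows above this gives 30 of 30 executed tests passing and a combined cost of $0.70, matching the reported totals.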

📋 Detailed Results

litellm_proxy_gpt_5_mini_2025_08_07

  • Success Rate: 100.0% (8/8)
  • Total Cost: $0.04
  • Run Suffix: litellm_proxy_gpt_5_mini_2025_08_07_f45c900_gpt5_mini_run_N8_20251111_181445

litellm_proxy_claude_sonnet_4_5_20250929

  • Success Rate: 100.0% (8/8)
  • Total Cost: $0.39
  • Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_f45c900_sonnet_run_N8_20251111_181440

litellm_proxy_moonshot_kimi_k2_thinking

  • Success Rate: 100.0% (7/7)
  • Total Cost: $0.25
  • Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_f45c900_kimi_k2_run_N8_20251111_181447
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_deepseek_deepseek_chat

  • Success Rate: 100.0% (7/7)
  • Total Cost: $0.02
  • Run Suffix: litellm_proxy_deepseek_deepseek_chat_f45c900_deepseek_run_N8_20251111_181441
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

@openhands-ai

openhands-ai bot commented Nov 11, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Run Examples Scripts

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1138 at branch `xw/release`

Feel free to include any additional details that might help me get this PR into a better state.


Collaborator

@tofarr tofarr left a comment

🚀

@xingyaoww
Collaborator Author

The example scripts actually work; the failure is due to #1112.

@xingyaoww xingyaoww merged commit 4e2ecd8 into main Nov 11, 2025
53 of 54 checks passed
@xingyaoww xingyaoww deleted the xw/release branch November 11, 2025 19:08
