Fix run examples workflow failed on schedule run & use parallel execution with pytest #1229

ryanhoangt · 2025-11-22T10:10:13Z

This PR:

• Fixes the failing example scripts and the run-examples workflow failure on scheduled (cron) runs
• Speeds up the workflow by running examples in parallel across multiple workers with pytest-xdist
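
A minimal sketch of how example scripts could be exposed to pytest so that pytest-xdist can spread them across workers. This is not the PR's actual test file: the `examples` root, the helper names `discover_examples` and `run_example`, and the 600-second timeout are all assumptions made for illustration.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical location of the example scripts; adjust to the real layout.
EXAMPLES_ROOT = Path("examples")


def discover_examples(root: Path) -> list:
    """Return every .py file under root in a stable, sorted order.

    A deterministic order matters because each xdist worker collects
    the tests independently and the collections have to match.
    """
    return sorted(p for p in root.rglob("*.py") if p.is_file())


def run_example(path: Path, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run one example in its own interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,
        text=True,
        timeout=timeout,  # assumed limit, not taken from the PR
    )


def pytest_generate_tests(metafunc):
    """Turn each discovered script into its own parametrized test case."""
    if "example_path" in metafunc.fixturenames:
        paths = discover_examples(EXAMPLES_ROOT)
        metafunc.parametrize("example_path", paths, ids=[str(p) for p in paths])


def test_example(example_path):
    """An example passes when its process exits with code 0."""
    result = run_example(example_path)
    assert result.returncode == 0, result.stderr
```

Saved under a hypothetical name such as `test_examples.py`, this could be run with `pytest -n 4 test_examples.py`, where `-n` is the pytest-xdist option that sets the number of worker processes. Because every script is a separate test case, a slow example only occupies one worker while the others keep going.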

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:30c3275-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-30c3275-python \
  ghcr.io/openhands/agent-server:30c3275-python

All tags pushed for this build

ghcr.io/openhands/agent-server:30c3275-golang-amd64
ghcr.io/openhands/agent-server:30c3275-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:30c3275-golang-arm64
ghcr.io/openhands/agent-server:30c3275-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:30c3275-java-amd64
ghcr.io/openhands/agent-server:30c3275-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:30c3275-java-arm64
ghcr.io/openhands/agent-server:30c3275-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:30c3275-python-amd64
ghcr.io/openhands/agent-server:30c3275-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:30c3275-python-arm64
ghcr.io/openhands/agent-server:30c3275-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:30c3275-golang
ghcr.io/openhands/agent-server:30c3275-java
ghcr.io/openhands/agent-server:30c3275-python

About Multi-Architecture Support

• Each variant tag (e.g., 30c3275-python) is a multi-arch manifest supporting both amd64 and arm64.
• Docker automatically pulls the correct architecture for your platform.
• Individual architecture tags (e.g., 30c3275-python-amd64) are also available if needed.
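
If a single-architecture tag has to be pinned explicitly, the `-amd64` / `-arm64` suffix convention from the tag list above can be applied programmatically. The following is only an illustrative sketch: the helper name `arch_specific_tag` and the alias table are assumptions, not part of the build tooling.

```python
import platform
from typing import Optional

# Assumed mapping from common platform.machine() values to the
# architecture suffixes used in the pushed tags.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def arch_specific_tag(variant_tag: str, machine: Optional[str] = None) -> str:
    """Append the architecture suffix for a machine to a variant tag.

    When machine is omitted, the host's platform.machine() value is used.
    """
    key = (machine or platform.machine()).lower()
    if key not in _ARCH_ALIASES:
        raise ValueError(f"unsupported architecture: {key!r}")
    return f"{variant_tag}-{_ARCH_ALIASES[key]}"


image = arch_specific_tag("ghcr.io/openhands/agent-server:30c3275-python", "aarch64")
print(image)  # ghcr.io/openhands/agent-server:30c3275-python-arm64
```

In most cases this is unnecessary, since pulling the plain variant tag already resolves to the right architecture.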

github-actions · 2025-11-24T12:37:32Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2025-11-24 16:54:13 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	32.1s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	10.0s	$0.01
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	12.0s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	50.0s	$0.02
01_standalone_sdk/09_pause_example.py	✅ PASS	14.6s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	37.1s	$0.02
01_standalone_sdk/11_async.py	✅ PASS	40.5s	$0.03
01_standalone_sdk/12_custom_secrets.py	✅ PASS	14.1s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	32.7s	$0.01
01_standalone_sdk/14_context_condenser.py	✅ PASS	3m 44s	$0.41
01_standalone_sdk/17_image_input.py	✅ PASS	17.3s	$0.02
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	15.5s	$0.01
01_standalone_sdk/19_llm_routing.py	✅ PASS	31.9s	$0.02
01_standalone_sdk/20_stuck_detector.py	✅ PASS	20.4s	$0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	10.2s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	18.0s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	36.9s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	5m 40s	$0.41
01_standalone_sdk/25_agent_delegation.py	✅ PASS	48.8s	$0.04
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	34.1s	$0.03
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	1m 14s	$0.05
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	1m 55s	$0.04
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	1m 7s	$0.06
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	❌ FAIL (Exit code 1)	5m 13s	--

❌ Some tests failed

Total: 24 | Passed: 23 | Failed: 1 | Total Cost: $1.27

Failed examples:

- examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1

View full workflow run


github-actions · 2025-11-24T15:26:16Z

Coverage Report

File	Stmts	Miss	Cover	Missing
TOTAL	12502	5792	53%

report-only-changed-files is enabled. No files were changed during this commit :)

openhands-ai · 2025-11-24T16:54:31Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- Run Examples Scripts

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1229 at branch `ht/fix-examples`

Feel free to include any additional details that might help me get this PR into a better state.

You can manage your notification settings

github-actions · 2025-11-24T17:09:19Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Run in progress...

github-actions · 2025-11-24T17:16:21Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2025-11-24 17:26:47 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	29.9s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	10.3s	$0.01
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	13.7s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	49.4s	$0.02
01_standalone_sdk/09_pause_example.py	✅ PASS	17.0s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	39.7s	$0.02
01_standalone_sdk/11_async.py	✅ PASS	37.8s	$0.02
01_standalone_sdk/12_custom_secrets.py	✅ PASS	14.0s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	32.7s	$0.01
01_standalone_sdk/14_context_condenser.py	✅ PASS	3m 15s	$0.37
01_standalone_sdk/17_image_input.py	✅ PASS	17.2s	$0.02
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	21.5s	$0.01
01_standalone_sdk/19_llm_routing.py	✅ PASS	17.4s	$0.02
01_standalone_sdk/20_stuck_detector.py	✅ PASS	23.9s	$0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	10.4s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	13.5s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	39.7s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	7m 34s	$0.57
01_standalone_sdk/25_agent_delegation.py	✅ PASS	1m 31s	$0.09
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	25.9s	$0.02
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	1m 8s	$0.05
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	1m 3s	$0.02
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	2m 17s	$0.04
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	✅ PASS	4m 57s	$0.03

✅ All tests passed!

Total: 24 | Passed: 24 | Failed: 0 | Total Cost: $1.42

View full workflow run
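
For reference, the summary line and the duration column in these reports can be derived from per-example records. The sketch below is an illustration under assumptions, not the workflow's actual reporting code: the `ExampleResult` record and both helper functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ExampleResult:
    """Hypothetical record for one example run."""
    name: str
    passed: bool
    duration_s: float
    cost_usd: float


def format_duration(seconds: float) -> str:
    """Render short runs as '29.9s' and longer ones as '3m 15s'."""
    if round(seconds, 1) < 60:
        return f"{seconds:.1f}s"
    minutes, rest = divmod(round(seconds), 60)
    return f"{minutes}m {rest}s"


def summarize(results: list) -> str:
    """Build the 'Total | Passed | Failed | Total Cost' summary line."""
    passed = sum(1 for r in results if r.passed)
    failed = len(results) - passed
    total_cost = sum(r.cost_usd for r in results)
    return (
        f"Total: {len(results)} | Passed: {passed} | "
        f"Failed: {failed} | Total Cost: ${total_cost:.2f}"
    )


# Made-up sample data, only to show the output shape.
sample = [
    ExampleResult("first_example.py", True, 29.9, 0.03),
    ExampleResult("second_example.py", False, 313.0, 0.0),
]
print(summarize(sample))  # Total: 2 | Passed: 1 | Failed: 1 | Total Cost: $0.03
```

Note that summing the rounded per-row costs shown in a table may differ by a cent from a total computed on unrounded values.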

github-actions · 2025-11-24T18:13:54Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2025-11-24 18:21:33 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	38.5s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	13.6s	$0.01
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	15.1s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	51.0s	$0.02
01_standalone_sdk/09_pause_example.py	✅ PASS	18.5s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	43.5s	$0.02
01_standalone_sdk/11_async.py	✅ PASS	36.3s	$0.03
01_standalone_sdk/12_custom_secrets.py	✅ PASS	23.0s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	35.3s	$0.02
01_standalone_sdk/14_context_condenser.py	✅ PASS	2m 43s	$0.30
01_standalone_sdk/17_image_input.py	✅ PASS	17.4s	$0.02
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	20.7s	$0.01
01_standalone_sdk/19_llm_routing.py	✅ PASS	17.9s	$0.02
01_standalone_sdk/20_stuck_detector.py	✅ PASS	22.0s	$0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	13.8s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	22.2s	$0.02
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	41.8s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	4m 46s	$0.32
01_standalone_sdk/25_agent_delegation.py	✅ PASS	47.0s	$0.04
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	24.6s	$0.03
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	56.4s	$0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	2m 29s	$0.04
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	2m 45s	$0.10
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	✅ PASS	1m 40s	$0.03

✅ All tests passed!

Total: 24 | Passed: 24 | Failed: 0 | Total Cost: $1.13

View full workflow run

xingyaoww

This is so awesome! Thank you!

github-actions · 2025-11-24T19:16:03Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2025-11-24 19:24:10 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	29.1s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	12.5s	$0.01
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	12.9s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	48.3s	$0.02
01_standalone_sdk/09_pause_example.py	✅ PASS	16.6s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	39.0s	$0.02
01_standalone_sdk/11_async.py	✅ PASS	37.8s	$0.03
01_standalone_sdk/12_custom_secrets.py	✅ PASS	19.4s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	32.7s	$0.01
01_standalone_sdk/14_context_condenser.py	✅ PASS	2m 58s	$0.34
01_standalone_sdk/17_image_input.py	✅ PASS	19.0s	$0.02
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	23.6s	$0.01
01_standalone_sdk/19_llm_routing.py	✅ PASS	24.9s	$0.02
01_standalone_sdk/20_stuck_detector.py	✅ PASS	20.8s	$0.01
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	11.4s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	16.6s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	38.2s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	3m 41s	$0.21
01_standalone_sdk/25_agent_delegation.py	✅ PASS	1m 42s	$0.22
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	23.0s	$0.02
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	1m 10s	$0.05
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	2m 28s	$0.05
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	2m 54s	$0.07
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	✅ PASS	1m 37s	$0.03

✅ All tests passed!

Total: 24 | Passed: 24 | Failed: 0 | Total Cost: $1.23

View full workflow run

ryanhoangt added 4 commits November 22, 2025 10:08

dont wait for server build on scheduled run

907b241

fix example 25

4a6c463

fix example 26

43ce9b8

test parallelization

4e998b7

ryanhoangt added the test-examples Run all applicable "examples/" files. Expensive operation. label Nov 24, 2025

ryanhoangt added 2 commits November 24, 2025 14:58

Revert "test parallelization"

0436ad9

This reverts commit 4e998b7.

use pytest for parallel run

5ebeb0f

ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Nov 24, 2025

fix workflow

c0b8f54

ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Nov 24, 2025

ryanhoangt added 3 commits November 24, 2025 16:00

use 4 worker

832a608

fix data parsing

26d1065

increase to 4 workers

7481b07

ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Nov 24, 2025

fix comments

2d79688

ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Nov 24, 2025

ryanhoangt marked this pull request as ready for review November 24, 2025 16:41

ryanhoangt added 2 commits November 24, 2025 17:02

fix json handling

b002393

dont reuse comment

2b705bf

ryanhoangt removed the test-examples Run all applicable "examples/" files. Expensive operation. label Nov 24, 2025

use 4 workers

8217b00

ryanhoangt added the test-examples Run all applicable "examples/" files. Expensive operation. label Nov 24, 2025

fix formatting

bae2bb9

ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Nov 24, 2025

clean up

dc46525

ryanhoangt changed the title ~~Fix run examples workflow failed on schedule run & fix failed example scripts~~ Fix run examples workflow failed on schedule run & use parallel execution with pytest Nov 24, 2025

ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Nov 24, 2025

ryanhoangt requested a review from xingyaoww November 24, 2025 18:12

Merge branch 'main' into ht/fix-examples

efe9614

xingyaoww approved these changes Nov 24, 2025

View reviewed changes

xingyaoww added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Nov 24, 2025

xingyaoww merged commit e996867 into main Nov 24, 2025
31 checks passed

xingyaoww deleted the ht/fix-examples branch November 24, 2025 19:28


3 participants
