Dev/refine models #3

Merged
wangxingjun778 merged 13 commits into modelscope:main from wangxingjun778:dev/refine_models
Dec 18, 2023

Conversation

@wangxingjun778

Member

No description provided.

@wangxingjun778 wangxingjun778 merged commit bd6726b into modelscope:main Dec 18, 2023
Yunnglin pushed a commit that referenced this pull request Mar 7, 2025
Yunnglin pushed a commit that referenced this pull request Dec 3, 2025
Yunnglin added a commit that referenced this pull request May 27, 2026
- Replace two asyncio.run() calls with single AsyncioLoopRunner.run()
  to avoid cross-loop resource errors (Gemini #1, Copilot #2 #3)
- Add _sanitize_local_paths() in benchmark_stats.py to strip absolute
  home paths from generated sample_example (Copilot #4 #5)
- Regenerate _meta JSON and docs via make docs-pipeline
- Add test_terminal_bench_v2_1 smoke test with environment_kwargs (Copilot #6)
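The cross-loop problem named in the first bullet can be illustrated with a minimal sketch. It does not use the project's `AsyncioLoopRunner` (its interface is not shown here); `Resource`, `setup`, `run_all`, and `run_split` are hypothetical names, and the explicit loop check only imitates how loop-bound objects tend to fail when reused from a second `asyncio.run()` call.

```python
import asyncio


class Resource:
    """Hypothetical stand-in for a client or container handle tied to one event loop."""

    def __init__(self) -> None:
        # Remember the loop this object was created on.
        self.loop = asyncio.get_running_loop()

    async def use(self) -> str:
        # Imitate a loop-bound object: refuse to run on any other loop.
        if asyncio.get_running_loop() is not self.loop:
            raise RuntimeError("resource used on a different event loop")
        await asyncio.sleep(0)
        return "ok"


async def setup() -> Resource:
    return Resource()


async def run_all() -> str:
    # Both phases share the single loop that asyncio.run() creates.
    resource = await setup()
    return await resource.use()


def run_split() -> str:
    # Anti-pattern: each asyncio.run() call creates and closes its own loop.
    resource = asyncio.run(setup())
    return asyncio.run(resource.use())


single_loop_result = asyncio.run(run_all())

try:
    run_split()
    split_error = None
except RuntimeError as exc:
    split_error = str(exc)
```

Wrapping both phases in one coroutine keeps every awaitable on the same loop, which is the property the fix relies on.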
Yunnglin added a commit that referenced this pull request May 27, 2026
#1376)

* feat(benchmarks): add Terminal-Bench v2.1 + upgrade harbor + expose environment_kwargs

- Add terminal_bench_v2_1 adapter (Terminal-Bench 2.1 with 26 task fixes)
- Upgrade harbor dependency from ==0.1.28 to >=0.8.0,<1.0.0
- Switch dataset source from GitHub to Harbor Hub
- Add environment_kwargs extra param for container resource config
- Update agent_name description to clarify terminus-2 vs standalone agents
- Update docs (en/zh) for both v2 and v2.1

Closes #1327, closes #1373

* fix: address PR review — single event loop, path sanitization, v2.1 test

- Replace two asyncio.run() calls with single AsyncioLoopRunner.run()
  to avoid cross-loop resource errors (Gemini #1, Copilot #2 #3)
- Add _sanitize_local_paths() in benchmark_stats.py to strip absolute
  home paths from generated sample_example (Copilot #4 #5)
- Regenerate _meta JSON and docs via make docs-pipeline
- Add test_terminal_bench_v2_1 smoke test with environment_kwargs (Copilot #6)
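As a rough illustration of the path sanitization bullet, the sketch below replaces absolute home directories with `~` in nested sample data. It is not the `_sanitize_local_paths()` helper from `benchmark_stats.py`; the function name, the regular expression, and the `~` placeholder are assumptions chosen for the example.

```python
import re
from typing import Any

# Assumed pattern: Linux (/home/<user>) and macOS (/Users/<user>) home prefixes.
_HOME_PATH = re.compile(r"(?:/home/|/Users/)[^/\s\"']+")


def sanitize_local_paths(value: Any, placeholder: str = "~") -> Any:
    """Recursively replace absolute home prefixes in strings, dicts, and lists."""
    if isinstance(value, str):
        return _HOME_PATH.sub(placeholder, value)
    if isinstance(value, dict):
        return {key: sanitize_local_paths(item, placeholder) for key, item in value.items()}
    if isinstance(value, list):
        return [sanitize_local_paths(item, placeholder) for item in value]
    # Numbers, booleans, None, and other types pass through unchanged.
    return value


sample = {
    "path": "/home/alice/.cache/data.json",
    "n": 3,
    "files": ["/Users/bob/x.txt"],
}
cleaned = sanitize_local_paths(sample)
```

Here `cleaned["path"]` becomes `~/.cache/data.json`, so generated documentation no longer exposes the username of whoever ran the pipeline.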

* update

* fix: address remaining PR review — type hint, env kwarg guard, dynamic trace env

- Add return type annotation to _on_inference
- Strip reserved 'type' key from environment_kwargs to prevent TypeError
- Use self.environment_type in AgentTrace instead of hardcoded 'docker'
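The `TypeError` mentioned in the second bullet can be reproduced with plain Python keyword arguments. The sketch below is an assumption about the shape of the problem, not the adapter's code: `make_environment` and `build_environment` are hypothetical names standing in for whatever factory receives `environment_kwargs`.

```python
from typing import Any, Dict, Optional


def make_environment(type: str, **options: Any) -> Dict[str, Any]:
    """Hypothetical factory that takes the environment type as a named argument."""
    return {"type": type, **options}


def build_environment(
    environment_type: str,
    environment_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    # Copy so the caller's dict is not mutated, then drop the reserved key.
    kwargs = dict(environment_kwargs or {})
    kwargs.pop("type", None)
    return make_environment(type=environment_type, **kwargs)


user_kwargs = {"type": "daytona", "cpus": 2}

# Without the guard, 'type' is supplied twice and the call fails.
try:
    make_environment(type="docker", **user_kwargs)
    unguarded_failed = False
except TypeError:
    unguarded_failed = True

env = build_environment("docker", user_kwargs)
```

With the guard in place, the explicitly configured environment type wins and the remaining keys pass through untouched.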