
Conversation

@pan-x-c
Collaborator

@pan-x-c pan-x-c commented Feb 5, 2026

Description

As the title says.

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has passed all tests
  • Docstrings have been added/updated in Google Style
  • Documentation has been updated
  • Code is ready for review

@gemini-code-assist
Contributor

Summary of Changes

Hello @pan-x-c, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request prepares the project for its 0.5.0 release by updating version identifiers and refining system requirements. It also enhances compatibility with newer vLLM versions by adjusting dependency ranges and implementing conditional logic for tokenizer initialization, ensuring smoother operation with the latest vLLM features. Additionally, installation instructions have been slightly modified for clarity and correctness.

Highlights

  • Version Update: The project version has been incremented to 0.5.0 across pyproject.toml and trinity/__init__.py, signifying a new release.
  • GPU Requirements Refinement: The documentation in README.md, README_zh.md, and both trinity_installation.md files now states the requirement as "At least one NVIDIA GPU with compute capability 8.0 or higher" instead of "at least 2 GPUs".
  • vLLM Dependency Update: The vllm dependency range in pyproject.toml and the vllm patch logic in worker_patch.py have been extended to support versions up to 0.15.1.
  • vLLM Tokenizer Compatibility: Conditional logic was added to vllm_model.py to correctly initialize the tokenizer based on the vllm version, accommodating changes in get_tokenizer behavior in vllm versions >= 0.15.0.
  • Installation Instructions Adjustment: PyPI installation commands for flash-attn in the documentation were updated to include the --no-build-isolation flag, and the "Via PyPI" section was reordered in the English installation guide.
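
The version-dependent tokenizer initialization described in the highlights could look roughly like the sketch below. It is only an illustration of the branching pattern, not the code in vllm_model.py: `parse_version`, `init_tokenizer`, and the two loader callables are hypothetical names, and the actual vLLM calls on either side of the 0.15.0 boundary are not shown in this conversation.

```python
from typing import Callable, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a release string such as "0.15.1" into (0, 15, 1).

    Simplified on purpose: a local suffix such as "+cu128" is dropped and
    parsing stops at the first non-numeric component (for example "0rc1").
    """
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def init_tokenizer(
    vllm_version: str,
    legacy_loader: Callable[[], object],
    new_loader: Callable[[], object],
) -> object:
    """Call the loader that matches the installed vLLM version.

    Both loaders are placeholders (assumptions) for whatever the real code
    does before and after the 0.15.0 change in get_tokenizer behavior.
    """
    if parse_version(vllm_version) >= (0, 15, 0):
        return new_loader()
    return legacy_loader()


if __name__ == "__main__":
    picked = init_tokenizer("0.15.1", lambda: "legacy path", lambda: "new path")
    print(picked)
```

Passing the loaders in as callables keeps the version check testable without importing vLLM at all.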
Changelog
  • README.md
    • Updated GPU hardware requirements.
  • README_zh.md
    • Updated GPU hardware requirements in the Chinese version.
  • docs/sphinx_doc/source/tutorial/trinity_installation.md
    • Revised GPU requirements.
    • Adjusted PyPI installation steps for flash-attn, including reordering the PyPI section.
  • docs/sphinx_doc/source_zh/tutorial/trinity_installation.md
    • Revised GPU requirements in the Chinese version.
    • Adjusted PyPI installation steps for flash-attn in the Chinese version.
  • pyproject.toml
    • Incremented the project version to 0.5.0.
    • Updated the maximum allowed vllm dependency to 0.15.1.
  • trinity/__init__.py
    • Updated the __version__ string to 0.5.0.
  • trinity/common/models/vllm_model.py
    • Implemented version-specific logic for vllm tokenizer initialization to support vllm versions 0.15.0 and above.
  • trinity/common/models/vllm_patch/worker_patch.py
    • Extended the supported vllm version range for the prompt logprobs patch to include versions up to 0.15.1.
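
A version-range guard of the kind the changelog describes for the prompt logprobs patch might be sketched as follows. Only the upper bound (0.15.1) comes from this pull request; the lower bound, the constant names, and the helper functions are assumptions made for illustration and are not taken from worker_patch.py.

```python
from typing import Tuple

# Upper bound taken from the changelog above. The lower bound is a
# placeholder because this conversation does not state it.
ASSUMED_MIN_VERSION = (0, 0, 0)
MAX_SUPPORTED_VERSION = (0, 15, 1)


def version_tuple(version: str) -> Tuple[int, int, int]:
    """Parse "X.Y.Z" into three ints, ignoring any local suffix after "+".

    Expects three purely numeric components; anything else raises ValueError.
    """
    major, minor, patch = version.split("+")[0].split(".")[:3]
    return int(major), int(minor), int(patch)


def patch_is_supported(version: str) -> bool:
    """Return True when the version falls inside the assumed supported range."""
    return ASSUMED_MIN_VERSION <= version_tuple(version) <= MAX_SUPPORTED_VERSION


if __name__ == "__main__":
    for candidate in ("0.15.0", "0.15.1", "0.16.0"):
        print(candidate, patch_is_supported(candidate))
```

Comparing integer tuples avoids the string-comparison trap where "0.9.0" would sort after "0.15.1".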
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/docker.yaml
Activity
  • No human activity (comments, reviews, approvals) has been recorded on this pull request yet.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request prepares for the v0.5.0 release. It includes version bumps, updates to the vllm dependency to support up to v0.15.1, and corresponding code changes for compatibility. The documentation has also been updated to reflect new GPU requirements and improved installation instructions. The changes are consistent and well-organized for a release. I have one minor suggestion for improving code maintainability.
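
The updated GPU requirement mentioned above (at least one NVIDIA GPU with compute capability 8.0 or higher) can be checked locally with something like the sketch below. It is not part of this pull request: `has_supported_gpu` and `detect_capabilities` are hypothetical helpers, and the detection step assumes PyTorch is installed, returning an empty list when it is not.

```python
from typing import Iterable, List, Tuple

# Minimum compute capability quoted in the updated documentation.
MIN_CAPABILITY = (8, 0)


def has_supported_gpu(capabilities: Iterable[Tuple[int, int]]) -> bool:
    """Return True if at least one (major, minor) pair meets the minimum."""
    return any(tuple(cap) >= MIN_CAPABILITY for cap in capabilities)


def detect_capabilities() -> List[Tuple[int, int]]:
    """List the compute capability of each visible CUDA device.

    Returns an empty list when PyTorch is missing or no CUDA device is visible.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_capability(index)
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    print("supported GPU found:", has_supported_gpu(detect_capabilities()))
```

Keeping the comparison separate from the detection means the rule itself can be tested on a machine without a GPU.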

@pan-x-c
Collaborator Author

pan-x-c commented Feb 5, 2026

/unittest-all

@github-actions

github-actions bot commented Feb 5, 2026

Summary

| Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️ |
| --- | --- | --- | --- | --- | --- | --- |
| 254 | 247 | 0 | 7 | 0 | 0 | 1h 34m |

Skipped

| Tests | Status |
| --- | --- |
| tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async | skipped ⏭️ |
| tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer | skipped ⏭️ |
| tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer | skipped ⏭️ |
| tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer | skipped ⏭️ |
| tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class | skipped ⏭️ |
| tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner | skipped ⏭️ |
| tests/utils/swanlab_test.py::TestSwanlabMonitor::test_swanlab_monitor_smoke | skipped ⏭️ |

Tests

Test Name Status Flaky Duration
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_std_grpo 5.2s
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_step_wise_grpo_advantage 3.5s
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_duplicate_grpo 5.1s
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_advantage 3.5s
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_correct_bias 2.0s
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_reward_std 1.7s
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_advantage 2.0s
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_with_std_threshold 2.4s
tests/algorithm/kl_fn_test.py::KLFnTest::test_abs_kl_fn 1.8s
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_fallback 931ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_loss 982ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_same_policy 866ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_corrected_k3_with_old_logprob 840ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_dummy_kl_fn 834ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k1_kl_fn 807ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k2_kl_fn 841ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_k3_kl_fn 806ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_kl_loss_aggregation_modes 891ms
tests/algorithm/kl_fn_test.py::KLFnTest::test_low_var_kl_fn 872ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss 2.1s
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_gspo_policy_loss 2.0s
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss 3.4s
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss 1.5s
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss 1.2s
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss_with_sequence_masking 1.3s
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sapo_policy_loss 1.9s
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss 997ms
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline 3h 4m
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation 1h 38m
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer 42m 4s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft 1h 10m
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo 1h 23m
tests/buffer/file_test.py::TestFileBuffer::test_file_reader 6m 20s
tests/buffer/file_test.py::TestFileBuffer::test_file_writer 28m 6s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter 8m 41s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter 8m 20s
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter 14m 7s
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter 16m 29s
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter 12m 7s
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter 3m 46s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse 1h 46m
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity 39m 3s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control 1h 8m
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue 51m 10s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue 51m 12s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity 58m 58s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration 13m 55s
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage 6.4s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_default_sample_strategy 34m 17s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_staleness_control_sample_strategy 29m 37s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_default_sample_strategy 29m 46s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_staleness_control_sample_strategy 29m 33s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_sql_staleness_control_sample_strategy 1h 18m
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_default_sample_strategy 38m 2s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_staleness_control_sample_strategy 32m 34s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_default_sample_strategy 32m 52s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_staleness_control_sample_strategy 32m 52s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_sql_staleness_control_sample_strategy 1h 7m
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_0 1h 33m
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_1 38m 22s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write 45m 33s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0 5m 26s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1 4m 45s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2 5m 18s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3 5m 17s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4 5m 20s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5 5m 23s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6 5m 41s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple 4m 39s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file 6m 20s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql 49m 8s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file 40.5s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql 45m 38s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file 41.4s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql 59m 41s
tests/cli/launcher_test.py::TestLauncherMain::test_debug_mode 11h 16m
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_command 1h 39m
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_in_dlc 28m 27s
tests/cli/launcher_test.py::TestLauncherMain::test_main_studio_command 5m 15s
tests/cli/launcher_test.py::TestLauncherMain::test_multi_stage_run 4h 1m
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 8h 57m
tests/common/config_test.py::TestConfig::test_chat_template_path 5m 2s
tests/common/config_test.py::TestConfig::test_config_flatten 32.4s
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 6m 25s
tests/common/config_test.py::TestConfig::test_default_workflow 5m 1s
tests/common/config_test.py::TestConfig::test_load_default_config 1h 25m
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly 14m 11s
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation 5m 4s
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster 6m 16s
tests/common/experience_test.py::TestEID::test_eid_properties 592ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 539ms
tests/common/experience_test.py::TestExperience::test_assertions 319ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 392ms
tests/common/experience_test.py::TestExperience::test_gather 818ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward 567ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion 14.9s
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 366ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 1.1s
tests/common/experience_test.py::TestExperience::test_single_turn_experience 354ms
tests/common/experience_test.py::TestExperience::test_to_dict 335ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 710ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 529ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 790ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields 491ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 599ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution 1.0s
tests/common/sudoku_test.py::test_9x9_generator_creates_holes 657ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled 673ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board 252ms
tests/common/sudoku_test.py::test_judge_detects_row_violation 222ms
tests/common/sudoku_test.py::test_judge_detects_column_violation 217ms
tests/common/sudoku_test.py::test_judge_detects_block_violation 242ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution 265ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled 257ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation 236ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation 224ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 15h 36m
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 10h 55m
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 10h 46m
tests/common/vllm_test.py::TestModelLen_0::test_model_len 7h 39m
tests/common/vllm_test.py::TestModelLen_1::test_model_len 8h 53m
tests/common/vllm_test.py::TestModelLen_2::test_model_len 7h 4m
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len 7h 37m
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation 7h 26m
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status 7h 29m
tests/common/vllm_test.py::TestAPIServer::test_api 8h 13m
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api 7h 39m
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async 8h 23m
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async ⏭️ 595ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask 4m 5s
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools 3m 59s
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls 8h 58m
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls 8h 43m
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate 25h 53m
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api 11h 39m
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 27h 36m
tests/explorer/explorer_test.py::TestExplorerEvalDetailedStats::test_explorer 20h 26m
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer 15h 2m
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer 50h 43m
tests/explorer/explorer_test.py::ServeTest::test_serve 15h 24m
tests/explorer/proxy_test.py::RecorderTest::test_recorder 1m 24s
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow 1h 23m
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 1h 24m
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout 3h 38m
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 5h 42m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0 1h 25m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1 1h 20m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0 1h 19m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1 1h 23m
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution 1h 29m
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow 1h 21m
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait 2h 27m
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 4h 4m
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 2h 32m
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 2h 13m
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid 6h 57m
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 2h 13m
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 3h 48m
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection 2h 53m
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0 1.6s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1 10m 3s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0 1.4s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1 16m 42s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error 1.2s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps 16m 42s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 12.3s
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 17.3s
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 10m 41s
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow 4.2s
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 12.0s
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 8.0s
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0 770ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1 1m 41s
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0 761ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1 3m 21s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow 6h 19m
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow 6h 33m
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording 1h 6m
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0 12m 55s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1 13.7s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner 2m 17s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state 2h 14m
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai 7h 34m
tests/explorer/workflow_test.py::TestConcurrentWorkflowRunner::test_concurrent_workflow_runner 10h 58m
tests/manager/synchronizer_test.py::TestSynchronizerExit_0::test_synchronizer 44h
tests/manager/synchronizer_test.py::TestSynchronizerExit_1::test_synchronizer 41h 25m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_0::test_synchronizer 34h 23m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_1::test_synchronizer 26h 48m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_2::test_synchronizer 39h 7m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_3::test_synchronizer 42h 1m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_4::test_synchronizer 40h 35m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_5::test_synchronizer 43h 57m
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_0::test_synchronizer 19h 28m
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_1::test_synchronizer 18h 2m
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_2::test_synchronizer 18h 6m
tests/service/data_juicer_test.py::TestDataJuicer::test_config 18m 9s
tests/service/data_juicer_test.py::TestDataJuicer::test_server_start 5h 58m
tests/service/data_juicer_test.py::TestDataJuicerExperiencePipeline::test_data_juicer_operators 5h 44m
tests/service/data_juicer_test.py::TestDataJuicerTaskPipeline::test_data_juicer_task_pipeline 4h 11m
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer 62h 59m
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer 86h 4m
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 26h 53m
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer 18h 28m
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer 19h 26m
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer 19h 31m
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer 20h 43m
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 39h 36m
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 9h 31m
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer 8h 24m
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools 9h 2m
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode 27h 12m
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode 27h 17m
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode 40h 32m
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer 48h 16m
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer 98h 14m
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer 33h 29m
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer 29h 33m
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer ⏭️ 16m 59s
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer ⏭️ 20m 47s
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer 55h 30m
tests/trainer/trainer_test.py::TestOverRollout::test_trainer 18h 33m
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer 12h 42m
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer ⏭️ 780ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class ⏭️ 293ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner ⏭️ 347ms
tests/trainer/trainer_test.py::ColocateModeTest::test_trainer 33h 30m
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_equivalent 10.3s
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_not_equivalent 1.2s
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_ground_truth 1.8s
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_solution_string 294ms
tests/utils/eval_utils_test.py::TestComputeScore::test_multiple_boxed_answers_in_solution 1.8s
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_equivalent 1.1s
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_not_equivalent 1.2s
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_not_boxed 293ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_raw_and_ground_truth_boxed_equivalent 1.1s
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_extract_answer 4.2s
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_verify_math_answer 1m 3s
tests/utils/eval_utils_test.py::TestEvalUtils::test_is_equiv 5.3s
tests/utils/log_test.py::LogTest::test_actor_log 37m 51s
tests/utils/log_test.py::LogTest::test_group_by_node 35m 52s
tests/utils/log_test.py::LogTest::test_no_actor_log 18m 35s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_0__workspace_tests_utils_plugins 5m 21s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_1_tests_utils_plugins 5m 9s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_0__workspace_tests_utils_plugins 2h 29m
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_1_tests_utils_plugins 2h 26m
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_0__workspace_tests_utils_plugins 1h 23m
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_1_tests_utils_plugins 1h 21m
tests/utils/registry_test.py::TestRegistryWithRay::test_dynamic_import 39m 8s
tests/utils/registry_test.py::TestRegistry::test_algorithm_registry_mapping 8.6s
tests/utils/registry_test.py::TestRegistry::test_buffer_module_registry_mapping 3.2s
tests/utils/registry_test.py::TestRegistry::test_common_module_registry_mapping 54.9s
tests/utils/registry_test.py::TestRegistry::test_register_module 530ms
tests/utils/registry_test.py::TestRegistry::test_utils_module_registry_mapping 677ms
tests/utils/swanlab_test.py::TestSwanlabMonitor::test_swanlab_monitor_smoke ⏭️ 421ms

Github Test Reporter by CTRF 💚

@pan-x-c
Collaborator Author

pan-x-c commented Feb 5, 2026

/unittest-module-common

@github-actions

github-actions bot commented Feb 5, 2026

Summary

| Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️ |
| --- | --- | --- | --- | --- | --- | --- |
| 55 | 54 | 0 | 1 | 0 | 0 | 10m 27s |

Skipped

| Tests | Status |
| --- | --- |
| tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async | skipped ⏭️ |

Tests

Test Name Status Flaky Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid 9h 21m
tests/common/config_test.py::TestConfig::test_chat_template_path 5m 1s
tests/common/config_test.py::TestConfig::test_config_flatten 32.9s
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid 6m 24s
tests/common/config_test.py::TestConfig::test_default_workflow 4m 59s
tests/common/config_test.py::TestConfig::test_load_default_config 1h 39m
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly 5m
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation 5m 5s
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster 31m 57s
tests/common/experience_test.py::TestEID::test_eid_properties 516ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type 469ms
tests/common/experience_test.py::TestExperience::test_assertions 322ms
tests/common/experience_test.py::TestExperience::test_dpo_experience 386ms
tests/common/experience_test.py::TestExperience::test_gather 876ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward 549ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion 14.8s
tests/common/experience_test.py::TestExperience::test_multi_turn_experience 357ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize 1.9s
tests/common/experience_test.py::TestExperience::test_single_turn_experience 361ms
tests/common/experience_test.py::TestExperience::test_to_dict 312ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion 686ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion 516ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion 786ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields 492ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion 626ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution 972ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes 676ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled 1.6s
tests/common/sudoku_test.py::test_judge_allows_incomplete_board 240ms
tests/common/sudoku_test.py::test_judge_detects_row_violation 233ms
tests/common/sudoku_test.py::test_judge_detects_column_violation 221ms
tests/common/sudoku_test.py::test_judge_detects_block_violation 220ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution 271ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled 250ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation 234ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation 218ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate 15h 40m
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate 11h 6m
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate 10h 42m
tests/common/vllm_test.py::TestModelLen_0::test_model_len 7h 40m
tests/common/vllm_test.py::TestModelLen_1::test_model_len 6h 44m
tests/common/vllm_test.py::TestModelLen_2::test_model_len 7h 46m
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len 8h 54m
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation 7h 23m
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status 7h 33m
tests/common/vllm_test.py::TestAPIServer::test_api 8h 13m
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api 7h 18m
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async 7h 58m
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async ⏭️ 654ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask 4m 14s
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools 4m
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls 8h 53m
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls 7h 48m
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate 26h 5m
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api 11h 9m


@pan-x-c pan-x-c merged commit fdfdc7b into agentscope-ai:main Feb 5, 2026
2 checks passed
2 participants