@EmmonsCurse
Collaborator

Motivation

This PR adds the ability to use the /re-run command in PR comments to easily restart any failed CI workflows.
It aims to improve the developer experience when handling flaky tests or temporary CI issues.

Modifications

  • Added new GitHub Actions workflow file: .github/workflows/rerun.yml
  • The workflow listens to issue_comment events and checks if the comment contains /re-run
  • Supports selective reruns for specific CI jobs by matching keywords such as:
    • all-failed, approval, ci_iluvatar, ci_xpu, codestyle, clone, build, run_ce_cases, accuracy_tests, base_tests, run_tests_logprob, run_tests_with_coverage, stable_tests, etc.
  • Automatically ensures rerun requests are only accepted from the PR author
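
The comment check and the author restriction described above can be sketched in Python. This is only an illustration of the logic, not the contents of .github/workflows/rerun.yml: the function names are hypothetical, and the sketch assumes the command sits on its own line, which is stricter than "contains /re-run".

```python
import re
from typing import Optional

# Hypothetical pattern: "/re-run <keyword>" on a line of its own (an assumption).
RERUN_PATTERN = re.compile(r"^/re-run[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def parse_rerun_keyword(comment_body: str) -> Optional[str]:
    """Return the keyword following /re-run, or None if no command is present."""
    match = RERUN_PATTERN.search(comment_body)
    return match.group(1).lower() if match else None


def should_rerun(comment_body: str, comment_author: str, pr_author: str) -> Optional[str]:
    """Return the keyword to re-run only when the PR author posted the command."""
    if comment_author != pr_author:
        # Comments from anyone other than the PR author are ignored.
        return None
    return parse_rerun_keyword(comment_body)
```

For example, `should_rerun("/re-run build", "alice", "alice")` yields a keyword, while the same comment from a different user is ignored.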

Usage or Command

To re-run CI jobs, comment the following under your Pull Request:

/re-run all-failed              # all-failed
/re-run approval                # Approval
/re-run ci_iluvatar             # CI_ILUVATAR
/re-run ci_xpu                  # CI_XPU
/re-run codestyle               # Pre Commit (alias: /re-run pre_commit)
/re-run clone                   # FD-Clone-Linux / code-clone
/re-run build                   # FD-Build-Linux / fd-build
/re-run run_ce_cases            # Extracted partial CE model tasks to run in CI. / run_ce_cases
/re-run accuracy_tests          # Run Accuracy Tests / accuracy_tests
/re-run base_tests              # Run Base Tests / base_tests
/re-run run_tests_logprob       # Run FastDeploy LogProb Tests / run_tests_logprob
/re-run run_tests_with_coverage # Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage
/re-run diff_coverage_report    # Run FastDeploy Unit Tests and Coverage / diff_coverage_report
/re-run stable_tests            # Run Stable Tests / stable_tests
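
The keyword-to-job mapping in the list above can be expressed as a lookup table. The sketch below is illustrative only: `RERUN_TARGETS` and `resolve_target` are hypothetical names, and the job names are copied from the comments in the command list rather than from the workflow file.

```python
from typing import Optional

# Keyword -> CI job name, copied from the command list (not from rerun.yml).
RERUN_TARGETS = {
    "all-failed": "all-failed",
    "approval": "Approval",
    "ci_iluvatar": "CI_ILUVATAR",
    "ci_xpu": "CI_XPU",
    "codestyle": "Pre Commit",
    "pre_commit": "Pre Commit",  # alias of codestyle
    "clone": "FD-Clone-Linux / code-clone",
    "build": "FD-Build-Linux / fd-build",
    "run_ce_cases": "Extracted partial CE model tasks to run in CI. / run_ce_cases",
    "accuracy_tests": "Run Accuracy Tests / accuracy_tests",
    "base_tests": "Run Base Tests / base_tests",
    "run_tests_logprob": "Run FastDeploy LogProb Tests / run_tests_logprob",
    "run_tests_with_coverage": "Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage",
    "diff_coverage_report": "Run FastDeploy Unit Tests and Coverage / diff_coverage_report",
    "stable_tests": "Run Stable Tests / stable_tests",
}


def resolve_target(keyword: str) -> Optional[str]:
    """Return the job name for a /re-run keyword, or None if it is unknown."""
    return RERUN_TARGETS.get(keyword.strip().lower())
```

Keeping the alias as a second dictionary key means `codestyle` and `pre_commit` resolve to the same job without special-case code.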

Accuracy Tests

This PR only modifies GitHub workflow logic and does not affect model accuracy or inference results.

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, state the reason in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Note:

  1. Commands take effect only when commented by the PR author.
    Comments from anyone else do not trigger the re-run workflow.
  2. You can trigger multiple re-runs by commenting several commands.
    Each /re-run command can be used independently to restart different CI workflows.
  3. Task dependencies and re-run timing:
    • The following jobs belong to the same execution group: clone, build, run_ce_cases, accuracy_tests, base_tests, run_tests_logprob, run_tests_with_coverage, stable_tests.
    • These jobs have a dependency chain:
    • ✅ clone → automatically triggers build after success
    • ✅ build → automatically triggers all test jobs: run_ce_cases, accuracy_tests, base_tests, run_tests_logprob, run_tests_with_coverage, stable_tests.
    ⚠️ Re-run restriction: Jobs in the same execution group can only be re-run after all jobs in the group have finished. Triggering a re-run before dependent jobs complete will not take effect.
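
The dependency chain and the re-run restriction in item 3 can be modeled as follows. This is a minimal sketch under stated assumptions: the job names come from the note above, while the status labels and the function names are hypothetical and are not taken from the workflow.

```python
from typing import Mapping

# Job names are taken from the note above.
EXECUTION_GROUP = (
    "clone", "build", "run_ce_cases", "accuracy_tests", "base_tests",
    "run_tests_logprob", "run_tests_with_coverage", "stable_tests",
)
TEST_JOBS = EXECUTION_GROUP[2:]

# clone triggers build on success; build triggers every test job.
DOWNSTREAM = {"clone": ("build",), "build": TEST_JOBS}

# Assumed labels for a job that is no longer queued or running.
FINISHED_STATES = frozenset({"success", "failure", "cancelled", "skipped"})


def group_is_idle(statuses: Mapping[str, str]) -> bool:
    """True when every job in the execution group has finished."""
    return all(statuses.get(job) in FINISHED_STATES for job in EXECUTION_GROUP)


def can_rerun(job: str, statuses: Mapping[str, str]) -> bool:
    """A job in the group may be re-run only once the whole group is idle."""
    if job not in EXECUTION_GROUP:
        raise ValueError(f"{job!r} is not part of the execution group")
    return group_is_idle(statuses)
```

A job with no recorded status is treated as unfinished, so a re-run request made while downstream jobs are still pending is rejected, matching the restriction above.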

Acknowledgements

  • Modification method reference: PaddlePaddle implementation
  • Thanks to @SigureMo and @DrRyanHuang for their guidance.

@paddle-bot

paddle-bot bot commented Oct 26, 2025

Thanks for your contribution!

@EmmonsCurse EmmonsCurse merged commit ebae69b into PaddlePaddle:develop Oct 27, 2025
33 of 38 checks passed
kevincheng2 pushed a commit to kevincheng2/FastDeploy that referenced this pull request Oct 27, 2025
Jiang-Jia-Jun added a commit that referenced this pull request Oct 27, 2025
* support mm prefix caching

* update code

* fix mm_hashes

* support encoder cache

* add encoder cache

* update code

* update encoder cache

* fix features bug

* fix worker bug

* support processor cache, need to optimize yet

* refactor multimodal data cache

* update code

* update code

* update v1 scheduler

* update code

* update code

* update codestyle

* support turn off processor cache and encoder cache

* update pre-commit

* fix code

* solve review

* update code

* update code

* update test case

* set processor cache in GiB

* update test case

* support mm prefix caching for qwen model

* fix code style check

* update pre-commit

* fix unit test

* fix unit test

* add ci test case

* fix rescheduled bug

* change text_after_process to prompt_tokens

* fix unit test

* fix chat template

* change model path

* [EP] fix adapter bugs (#4572)

* Update expert_service.py

* Update common_engine.py

* Update expert_service.py

* fix v1 hang bug (#4573)

* fix import image_ops error on some platforms (#4559)

* [CLI]Update parameters in bench latecy cli tool and fix collect-env cli tool (#4558)

* add collect-env

* del files

* [Graph Optimization] Add dy_runnable and introduce cudagraph_switch_threshold for cudagraph mode switching (#4578)

* add new branch for sot

* reorder

* fix batch bug

* [XPU]Moe uses a new operator (#4585)

* [XPU]Moe uses a new operator

* [XPU]Moe uses a new operator

* update response

* [Feature] Support Paddle-OCR (#4396)

* init

* update code

* fix code style & disable thinking

* adapt for common_engine.update_mm_requests_chunk_size

* use 3d rope

* use flash_attn_unpadded

* opt siglip

* update to be compatible with the latest codebase

* fix typo

* optim OCR performance

* fix bug

* fix bug

* fix bug

* fix bug

* normlize name

* modify xpu rope

* revert logger

* fix bug

* fix bug

* fix bug

* support default_v1

* optim performance

* fix bug

---------

Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com>
Co-authored-by: zhangyue66 <zhangyue66@baidu.com>

* [DataProcessor] add reasoning_tokens into usage info (#4520)

* add reasoning_tokens into usage info initial commit

* add unit tests

* modify unit test

* modify and add unit tests

* fix unit test

* move steam usage to processor

* modify processor

* modify test_logprobs

* modify test_logprobs.py

* modify stream reasoning tokens accumulation

* fix unit test

* perf: Optimize task queue communication from engine to worker (#4531)

* perf: Optimize task queue communication from engine to worker

* perf: get_tasks to numpy

* perf: get_tasks remove to_numpy

* fix: request & replace ENV

* remove test_e2w_perf.py

* fix code style

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>

* Clean up ports after processing results (#4587)

* [CI] Add /re-run command in PR comments to restart failed CI workflows (#4593)

* [Others] api server exits when worker process is dead (#3271)

* [fix] fix terminal hangs when worker process is dead

* [chore] change sleep time of monitor

* [chore] remove redundant comments

* update docs

---------

Co-authored-by: ApplEOFDiscord <wwy640130@163.com>
Co-authored-by: ApplEOFDiscord <31272106+ApplEOFDiscord@users.noreply.github.com>
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: yinwei <yinwei_hust@163.com>
Co-authored-by: JYChen <zoooo0820@qq.com>
Co-authored-by: qwes5s5 <45442318+qwes5s5@users.noreply.github.com>
Co-authored-by: Ryan <zihaohuang@aliyun.com>
Co-authored-by: yyssys <atyangshuang@foxmail.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com>
Co-authored-by: zhangyue66 <zhangyue66@baidu.com>
Co-authored-by: kxz2002 <115912648+kxz2002@users.noreply.github.com>
Co-authored-by: SunLei <sunlei5788@gmail.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: 李泳桦 <39643373+liyonghua0910@users.noreply.github.com>