[Cherry-Pick][CI] Sync dev optimizations to 2.6 (#7335) #7343

Merged
EmmonsCurse merged 1 commit into PaddlePaddle:release/2.6 from EmmonsCurse:ci_optimize_26_0411
Apr 12, 2026

@EmmonsCurse
Collaborator

Motivation

The CI pipeline may leave behind running containers or uncleaned workspaces when jobs are canceled or fail unexpectedly. This can cause resource leakage, workspace conflicts, and instability in subsequent jobs.

Modifications

Cherry-pick of #7198 #7227 #7283 #7268 #7315 #7335 to release/2.6.

Usage or Command

N/A

Accuracy Tests

N/A

Checklist

  • Add at least one tag to the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code, run pre-commit before commit.
  • Add unit tests. If there are no unit tests, state the reason in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Apr 11, 2026

Thanks for your contribution!

@EmmonsCurse
Collaborator Author

/skip-ci ci_iluvatar
/skip-ci ci_hpu
/skip-ci build_xpu

@fastdeploy-bot fastdeploy-bot left a comment

🤖 AI Code Review | 2026-04-11

📋 Review Summary

PR overview: Cherry-pick that syncs the CI optimizations from the dev branch to the release/2.6 branch

Scope of changes: 13 GitHub workflow files and 1 test script

Impact tag: [CI]

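Issues

| Severity | File | Summary |
|---|---|---|
| 🔴 Bug | .github/workflows/_accuracy_test.yml:227 | The container cleanup logic has a gap: when the container has already stopped, the workspace is not cleaned |
| 🟡 Suggestion | .github/workflows/_accuracy_test.yml:163 | The letter case of --shm-size=64g is inconsistent with other workflows |
| 🟡 Suggestion | scripts/run_pre_ce.sh:10 | Missing a detailed comment explaining why prebuilt wheel files are used |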

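Overall assessment

The CI optimizations in this PR are correct overall and improve container resource cleanup and error handling. However, there is one P0-level issue in the container cleanup logic that needs to be fixed; it affects all 13 workflow files that add the cleanup step. Recommend merging after the fix.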

echo "TEST_EXIT_CODE=${TEST_EXIT_CODE}"
exit ${TEST_EXIT_CODE}

- name: Terminate and delete the container

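🔴 Bug: The container cleanup logic has a gap

When the container has already stopped (for example, because a test failure caused it to exit), docker exec -t ${{ runner.name }} fails, so the workspace contents are not cleaned, yet the container is still removed by docker rm -f.

Suggested change: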

- name: Terminate and delete the container
  if: always()
  run: |
    set +e
    # Clean the workspace (if the container is still running)
    docker exec -t ${{ runner.name }} /bin/bash -c 'find /workspace -mindepth 1 -delete' 2>/dev/null || true
    # Force-remove the container
    docker rm -f ${{ runner.name }}
    # If the container had already stopped and the workspace is left over, clean the workspace directly on the host
    find $(pwd) -mindepth 1 -maxdepth 1 -delete 2>/dev/null || true
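
One detail may be worth checking in a host-side fallback like the one suggested above: `find -delete` processes entries depth-first, so when it is combined with `-maxdepth 1` it only attempts the top-level entries and cannot remove a non-empty directory (the error is then hidden by `2>/dev/null || true`). The sketch below illustrates this on a throwaway temp directory standing in for the workspace. The `clean_dir` helper and the file names are hypothetical and not taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): remove everything inside a
# directory while keeping the directory itself.
clean_dir() {
  local dir="$1"
  # Refuse empty paths, the filesystem root, and non-directories.
  [ -n "$dir" ] && [ "$dir" != "/" ] && [ -d "$dir" ] || return 1
  find "$dir" -mindepth 1 -delete
}

# Throwaway stand-in for a CI workspace with a nested, non-empty directory.
ws="$(mktemp -d)"
mkdir -p "$ws/build/cache"
touch "$ws/build/cache/a.o" "$ws/log.txt"

# Shallow variant: top-level files go away, non-empty directories do not.
find "$ws" -mindepth 1 -maxdepth 1 -delete 2>/dev/null || true
left_after_shallow="$(find "$ws" -mindepth 1 | wc -l | tr -d ' ')"

# Full-depth variant: everything below the workspace root is removed.
clean_dir "$ws"
left_after_full="$(find "$ws" -mindepth 1 | wc -l | tr -d ' ')"

echo "entries left: shallow=${left_after_shallow} full=${left_after_full}"
rmdir "$ws"
```

If the fallback is meant to clear the whole workspace, dropping `-maxdepth 1` (as in the `docker exec` line of the suggestion) is one option; whether that matches the intent of the workflow is an assumption here.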

fi

docker run --rm --ipc=host --pid=host --net=host \
docker run --rm --net=host \

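Inconsistent letter case

The --shm-size=64g here uses a lowercase g, while other workflow files (such as _unit_test_coverage.yml) use an uppercase G. Docker is not case-sensitive here, but keeping the spelling consistent is recommended.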
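
A quick way to audit this kind of inconsistency is to list the distinct `--shm-size` spellings across the workflow directory. The following is a minimal sketch: the `list_shm_sizes` helper is hypothetical, and the two demo files are stand-ins whose contents are assumed for illustration rather than copied from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct --shm-size spelling found
# under the given directory, one per line.
list_shm_sizes() {
  grep -rhoE -- '--shm-size=[0-9]+[A-Za-z]*' "$1" | LC_ALL=C sort -u
}

# Throwaway files standing in for .github/workflows (contents are assumed).
demo="$(mktemp -d)"
printf 'docker run --rm --shm-size=64g image\n' > "$demo/_accuracy_test.yml"
printf 'docker run --rm --shm-size=64G image\n' > "$demo/_unit_test_coverage.yml"

variants="$(list_shm_sizes "$demo" | wc -l | tr -d ' ')"
echo "distinct --shm-size spellings: ${variants}"
rm -rf "$demo"
```

In a real checkout the helper would be pointed at `.github/workflows`; more than one line of output indicates mixed spellings.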

python -m pip install -r requirements.txt
python -m pip install jsonschema aistudio_sdk==0.3.5
python -m pip install xgrammar==0.1.19 torch==2.6.0
# Use prebuilt wheel files to install xgrammar==0.1.19 and torch==2.6.0 specifically for the CI environment

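🟡 Suggestion: Add a comment explaining why prebuilt wheel files are needed

Consider stating the reason in the comment (for example: the PyPI index may be unavailable, CI network restrictions, or a specific build version), to ease future maintenance.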

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (release/2.6@42b0f59). Learn more about missing BASE report.

Additional details and impacted files
@@              Coverage Diff               @@
##             release/2.6    #7343   +/-   ##
==============================================
  Coverage               ?   73.85%           
==============================================
  Files                  ?      376           
  Lines                  ?    52960           
  Branches               ?     8268           
==============================================
  Hits                   ?    39112           
  Misses                 ?    11112           
  Partials               ?     2736           
Flag Coverage Δ
GPU 73.85% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

@EmmonsCurse EmmonsCurse merged commit 9e8ea7d into PaddlePaddle:release/2.6 Apr 12, 2026
36 of 37 checks passed
@EmmonsCurse EmmonsCurse deleted the ci_optimize_26_0411 branch April 12, 2026 05:22

3 participants