
[CI] Ensure container cleanup after job to avoid resource leakage #7315

Merged
EmmonsCurse merged 2 commits into PaddlePaddle:develop from EmmonsCurse:ci_optimize_dev_0410
Apr 10, 2026

@EmmonsCurse
Collaborator

Motivation

The CI pipeline may leave behind running containers or uncleaned workspaces when jobs are canceled or fail unexpectedly. This can cause resource leakage, workspace conflicts, and instability in subsequent jobs.

Additionally, the use of --privileged in the build task is unnecessary for the current workflow and introduces avoidable security risks.

Modifications

  • Add a cleanup step to ensure containers are stopped and removed after each job.
  • Clean the workspace to prevent interference between different CI runs.
  • Improve overall CI stability, especially in scenarios where jobs are canceled or interrupted.
  • Remove --privileged from the build task to reduce unnecessary privilege usage and enhance security.
  • Use prebuilt wheel files to install xgrammar==0.1.19 and torch==2.6.0 in the CI environment.
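
The container cleanup described in the list above could look roughly like the helper below. This is a minimal sketch, not the code from this PR: the function name `cleanup_container` is hypothetical, and it assumes the container carries a known name (the workflow passes one via `--name`). It checks whether the container still exists before removing it, so the step stays harmless when the container is already gone.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustrative name, not taken from the PR).
# Force-removes a container by name and tolerates the case where
# the container no longer exists.
cleanup_container() {
  local name="$1"

  # `docker container inspect` exits non-zero when no such container exists.
  if docker container inspect "${name}" >/dev/null 2>&1; then
    # -f stops a running container before removing it.
    docker rm -f "${name}"
  else
    echo "container ${name} not found, nothing to remove"
  fi
}
```

In a GitHub Actions workflow, a helper like this would typically be called from a step guarded with `if: always()`, so that it also runs after failed or canceled jobs, for example `cleanup_container "${RUNNER_NAME}"`.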

Usage or Command

N/A

Accuracy Tests

N/A

Checklist

  • Add at least one tag to the PR title.
    • Tag list: [[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If there are no unit tests, state the reason in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@EmmonsCurse
Collaborator Author

/skip-ci ci_iluvatar
/skip-ci ci_hpu
/skip-ci build_xpu
/skip-ci gpu_4cards_test
/skip-ci coverage
/skip-ci stable_test
/skip-ci base_test

@paddle-bot

paddle-bot bot commented Apr 10, 2026

Thanks for your contribution!


@fastdeploy-bot fastdeploy-bot left a comment


🤖 AI Code Review | 2026-04-10 18:34 CST

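📋 Review Summary

PR overview: adds a container cleanup step to the CI workflows and removes the unnecessary --privileged flag
Scope of changes: 12 GitHub workflow configuration files, 1 shell script
Impact tag: [CI]

Issues

Level | File | Summary
🔴 Bug | .github/workflows/_build_linux.yml:161 | The --rm flag conflicts with the cleanup step, so the cleanup logic cannot run properly

Overall assessment

The PR's intent is reasonable, but there is a key bug: docker run --rm removes the container automatically, so the later cleanup step cannot run and the workspace cleanup will not take effect. Recommendation: remove the --rm flag.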

  docker run --rm --net=host \
- --cap-add=SYS_PTRACE --privileged --shm-size=64G \
+ --cap-add=SYS_PTRACE --shm-size=64G \
  --name ${runner_name} \

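🔴 Bug: docker run is invoked with --rm, so the container is removed automatically when it exits. The docker rm -f ${{ runner.name }} in the later cleanup step will then fail because the container no longer exists, and docker exec cannot perform the workspace cleanup once the container has exited.

Suggestion: remove the --rm flag, since there is already an explicit cleanup step. Change lines 159-161 to: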

docker run --net=host \
--cap-add=SYS_PTRACE --shm-size=64G \
--name ${runner_name} \
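
The ordering concern raised in this review can be illustrated with a short sketch. It is not the cleanup step from the PR: the function name `cleanup_job` and the `workspace` argument are hypothetical. It assumes the workspace is cleaned from inside the container, which only works while the container is still running, and it treats a missing container as a no-op so the step behaves the same with or without `--rm`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (illustrative names, not taken from the PR).
# Cleans the workspace from inside the container while it is still
# running, then removes the container, ignoring "already gone" errors.
cleanup_job() {
  local name="$1" workspace="$2"
  local running

  # Prints "true" for a running container; prints nothing if the
  # container does not exist (the error is suppressed).
  running="$(docker inspect -f '{{.State.Running}}' "${name}" 2>/dev/null || true)"

  if [ "${running}" = "true" ]; then
    # docker exec requires a running container.
    # ${workspace:?} aborts instead of expanding to an empty path.
    docker exec "${name}" find "${workspace:?}" -mindepth 1 -delete || true
  fi

  # Force-remove the container; do not fail the step if it is absent.
  docker rm -f "${name}" >/dev/null 2>&1 || true
}
```

With this shape, a job that was canceled mid-run still gets its workspace cleaned and its container removed, while a job whose container already exited simply skips the `docker exec` part.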

@EmmonsCurse EmmonsCurse merged commit 1269eda into PaddlePaddle:develop Apr 10, 2026
90 of 91 checks passed
@EmmonsCurse EmmonsCurse deleted the ci_optimize_dev_0410 branch April 10, 2026 14:32
EmmonsCurse added a commit to EmmonsCurse/FastDeploy that referenced this pull request Apr 11, 2026
EmmonsCurse added a commit that referenced this pull request Apr 12, 2026
* [Cherry-Pick][CI] Sync dev optimizations to 2.5(#7315)