[Cherry-Pick][CI] Add 4-GPU test job and fix stable_test (#6082) (#6270) #6279

Merged
EmmonsCurse merged 2 commits into PaddlePaddle:release/2.4 from EmmonsCurse:add_4cards_24
Jan 30, 2026

@EmmonsCurse
Collaborator

Motivation

Existing CI coverage lacks validation for 4-card (4-GPU) deployment scenarios. This PR adds dedicated test cases to verify multi-card inference stability and basic functionality, ensuring issues related to distributed setup, resource allocation, and startup behavior can be detected earlier in CI.

In CI environments, running containers with --ipc=host and --pid=host makes them share the host's IPC and PID namespaces, so concurrent jobs on the same machine can interfere with one another. Removing these options reduces cross-job coupling and improves the isolation and stability of CI runs.
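
As a rough illustration of the isolation point above, the following sketch builds a `docker run` argument list that leaves out the host-namespace flags and includes a small check a CI script could use to reject them. The image name, test command, GPU IDs, and the `shm_size` parameter are hypothetical and are not taken from `_stable_test.yml`. Without `--ipc=host`, the container falls back to Docker's default `/dev/shm` size, so a workflow may need `--shm-size`; whether the actual workflow sets it is not stated in this PR.

```python
# Flags that make a container share host namespaces (removed by this PR).
HOST_NAMESPACE_FLAGS = ("--ipc=host", "--pid=host")


def build_docker_run(image, command, gpu_ids, shm_size=None):
    """Return a `docker run` argv list that keeps private IPC/PID namespaces.

    All parameters are hypothetical and for illustration only.
    """
    argv = ["docker", "run", "--rm"]
    # Docker expects the device list wrapped in literal double quotes,
    # because the value of --gpus is parsed as comma-separated fields.
    device_list = ",".join(str(g) for g in gpu_ids)
    argv += ["--gpus", '"device=' + device_list + '"']
    if shm_size is not None:
        # Optional: enlarge the container's private /dev/shm.
        argv.append("--shm-size=" + shm_size)
    argv.append(image)
    argv += list(command)
    return argv


def uses_host_namespaces(argv):
    """Return True if argv requests the host IPC or PID namespace."""
    for i, arg in enumerate(argv):
        if arg in HOST_NAMESPACE_FLAGS:
            return True
        # Also catch the space-separated form, e.g. `--ipc host`.
        if arg in ("--ipc", "--pid") and i + 1 < len(argv) and argv[i + 1] == "host":
            return True
    return False
```

A CI helper could call `uses_host_namespaces` on the final command and fail early if either flag reappears.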

Modifications

  • Add new end-to-end test cases for 4-card (4-GPU) deployment scenarios
  • Integrate the new tests into the CI workflow
  • Add GLM E2E tests for both non-MTP and MTP configurations
  • Remove --ipc=host and --pid=host from _stable_test.yml
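
A 4-card test is only meaningful when four GPUs are actually visible to the job. The sketch below shows one way a test could check that precondition before launching. It assumes the CI job sets `CUDA_VISIBLE_DEVICES` explicitly and treats an unset variable as zero GPUs, which is a simplification. The function names are hypothetical and are not part of the tests added in this PR.

```python
import os

REQUIRED_GPUS = 4  # this PR targets 4-card deployment scenarios


def visible_gpu_ids(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of device ID strings.

    Assumption: the variable is set explicitly by CI; an unset or empty
    value is treated as "no GPUs" in this sketch.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def has_enough_gpus(env=None, required=REQUIRED_GPUS):
    """Return True if at least `required` GPUs are listed as visible."""
    return len(visible_gpu_ids(env)) >= required
```

A test module could skip or fail fast when `has_enough_gpus()` returns False, so that a misconfigured runner is reported as an environment problem and not as a model failure.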

Usage or Command

N/A

Accuracy Tests

N/A

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, state the reason in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Jan 29, 2026

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Jan 29, 2026
@codecov-commenter

codecov-commenter commented Jan 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (release/2.4@1079f92).

Additional details and impacted files
@@              Coverage Diff               @@
##             release/2.4    #6279   +/-   ##
==============================================
  Coverage               ?   56.98%           
==============================================
  Files                  ?      330           
  Lines                  ?    41307           
  Branches               ?     6289           
==============================================
  Hits                   ?    23538           
  Misses                 ?    15932           
  Partials               ?     1837           
Flag Coverage Δ
GPU 56.98% <ø> (?)

@EmmonsCurse EmmonsCurse merged commit 0f35091 into PaddlePaddle:release/2.4 Jan 30, 2026
25 of 29 checks passed
@EmmonsCurse EmmonsCurse deleted the add_4cards_24 branch January 30, 2026 03:52