[https://nvbugs/5716787][fix] terminate nixl running when exiting #9785
Conversation
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
/bot run
📝 Walkthrough
This change introduces a unified running-state query (isRunning()) across the ConnectionManager implementations (Agent, MPI, UCX) so that dependent threads can detect shutdown and exit cleanly.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (1)
507-507: Consider setting mIsRunning = false earlier in the destructor. Currently, mIsRunning = false is set after stopping the progress threads and joining the ZMQ thread (line 507). Consider moving it to the very beginning of the destructor (before line 483) to signal the shutdown state immediately, allowing any concurrent operations to detect it and exit early before waiting for thread cleanup.
Apply this diff to signal shutdown earlier:
UcxConnectionManager::~UcxConnectionManager()
{
    TLLM_LOG_DEBUG(mRank, "UcxConnectionManager::~UcxConnectionManager");
+   mIsRunning = false;
    for (auto& worker : mWorkersPool)
    {
        worker->stopProgressThread();
    }
    // ... rest of cleanup ...
-   mIsRunning = false;
    mZmqRepSocket.close();
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- cpp/include/tensorrt_llm/executor/cacheCommunicator.h (1 hunks)
- cpp/tensorrt_llm/batch_manager/dataTransceiver.cpp (2 hunks)
- cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp (3 hunks)
- cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.h (2 hunks)
- cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (1 hunks)
- cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.h (1 hunks)
- cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (2 hunks)
- cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.h (2 hunks)
- tests/unittest/llmapi/test_llm_pytorch.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{cpp,h,cu}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.{cpp,h,cu}: Closing braces of namespaces should have a comment saying the namespace it closes (e.g., } // namespace foo)
Prefer const or constexpr variables over #define whenever possible, as the latter are not visible to the compiler
A variable that is not modified after its initialization should be declared as const
Except 0 (only used in comparison for checking signness/existence/emptiness) and nullptr, true, false, all other literals should only be used for variable initialization and should be replaced with named constants
Use Allman indentation style for braces in C++
Put the semicolon for an empty for or while loop in a new line
The statement forming the body of a switch, while, do .. while or for statement shall be a compound statement (use brace-delimited statements)
If and else should always be followed by brace-delimited statements, even if empty or a single statement
C++ filenames should use camel case with first letter lowercase (e.g., thisIsASubDir and thisIsAFilename.cpp)
All filenames involved in compilation of a compilation target must have case-insensitive unique filenames
All types (including class names) should use camel case with uppercase first letter (e.g., FooBarClass)
Local variables, methods and namespaces should use camel case with first letter lowercase (e.g., localFooBar)
Non-magic-number global variables that are non-static and not defined in anonymous namespace should use camel case prefixed by a lower case 'g' (e.g., gDontUseGlobalFoos)
Non-magic-number global variables that are static or defined in an anonymous namespace should use camel case prefixed by a lower case 's' (e.g., sMutableStaticGlobal)
Locally visible static variables should use camel case with lowercase prefix 's' as the first letter of the name (e.g., static std::once_flag sFlag;)
Public, private and protected class member variables should use camel case prefixed with 'm' (e.g., mNbFooValues), though the 'm' pre...
Files:
- cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.h
- cpp/include/tensorrt_llm/executor/cacheCommunicator.h
- cpp/tensorrt_llm/batch_manager/dataTransceiver.cpp
- cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp
- cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp
- cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.h
- cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.h
- cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp
**/*.h
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.h: Use a preprocessor guard in C++ header files with the guard name format TRTLLM_ followed by the filename in all caps (e.g., TRTLLM_FOO_BAR_HELLO_H for file FooBarHello.h); do not include directory names in the symbol
Do not use underscore prefix or suffix in C++ preprocessor guard symbols; they are reserved in C++ standard for compilers or implementation
Files:
- cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.h
- cpp/include/tensorrt_llm/executor/cacheCommunicator.h
- cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.h
- cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.h
**/*.{cpp,h,cu,py}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
All TensorRT-LLM Open Source Software code files should contain an NVIDIA copyright header that includes the current year at the top
Files:
- cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.h
- cpp/include/tensorrt_llm/executor/cacheCommunicator.h
- cpp/tensorrt_llm/batch_manager/dataTransceiver.cpp
- cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp
- cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp
- cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.h
- cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.h
- tests/unittest/llmapi/test_llm_pytorch.py
- cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+
Indent Python code with 4 spaces; do not use tabs
Always maintain the namespace when importing in Python, even if only one class or function from a module is used (e.g., use from package.subpackage import foo and then foo.SomeClass() instead of from package.subpackage.foo import SomeClass)
Python filenames should use snake_case (e.g., some_file.py)
Python class names should use PascalCase (e.g., class SomeClass)
Python function and method names should use snake_case (e.g., def my_awesome_function():)
Python local variable names should use snake_case, with prefix k for variable names that start with a number (e.g., k_99th_percentile = ...)
Python global variables should use upper snake_case with prefix G (e.g., G_MY_GLOBAL = ...)
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...)
Avoid shadowing variables declared in an outer scope in Python
Initialize all externally visible members of a Python class in the constructor
For Python interfaces that may be used outside a file, prefer docstrings over comments
Python comments should be reserved for code within a function, or interfaces that are local to a file
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx
Python attributes and variables can be documented inline with type and description (e.g., self.x = 5 followed by """<type>: Description of 'x'""")
Avoid using reflection in Python when functionality can be easily achieved without reflection
When using try-except blocks in Python, limit the except clause to the smallest set of specific errors possible instead of catching all exceptions
When using try-except blocks in Python to handle multiple possible variable types (duck-typing), keep the body of the try as small as possible and use the else block to implement the logic
Files:
tests/unittest/llmapi/test_llm_pytorch.py
🧠 Learnings (5)
📚 Learning: 2025-08-06T08:18:28.669Z
Learnt from: zhengd-nv
Repo: NVIDIA/TensorRT-LLM PR: 6633
File: cpp/tensorrt_llm/batch_manager/dataTransceiverImpl.cpp:145-155
Timestamp: 2025-08-06T08:18:28.669Z
Learning: In cpp/tensorrt_llm/batch_manager/dataTransceiverImpl.cpp, the existing `mMtxForMap` mutex in DataSenderImpl is sufficient to synchronize measurement file operations in the `release` method, as all file operations occur within the same critical section that protects the `mRequestToSession` map access.
Applied to files:
cpp/tensorrt_llm/batch_manager/dataTransceiver.cpp
📚 Learning: 2025-08-20T06:56:02.889Z
Learnt from: eopXD
Repo: NVIDIA/TensorRT-LLM PR: 6768
File: cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp:577-579
Timestamp: 2025-08-20T06:56:02.889Z
Learning: In cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp, maxSequenceLength is now enforced as a non-optional argument in the BlockManager constructor, so concerns about std::nullopt defaulting to 0 are not applicable. When windowSize > maxSequenceLength, a warning should be added instead of handling optional parameter cases.
Applied to files:
cpp/tensorrt_llm/batch_manager/dataTransceiver.cpp
📚 Learning: 2025-08-21T09:41:49.347Z
Learnt from: eopXD
Repo: NVIDIA/TensorRT-LLM PR: 6768
File: cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp:2010-2045
Timestamp: 2025-08-21T09:41:49.347Z
Learning: In cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp, updateSequenceCacheBlockOffsets is specifically for updating bookkeeping when blocks are added during the context phase, not for refreshing offsets after detach operations. During detach operations, GenerationRequest::removeFrontBlock handles the necessary cache block bookkeeping internally.
Applied to files:
cpp/tensorrt_llm/batch_manager/dataTransceiver.cpp
📚 Learning: 2025-08-15T06:46:54.897Z
Learnt from: eopXD
Repo: NVIDIA/TensorRT-LLM PR: 6767
File: cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp:0-0
Timestamp: 2025-08-15T06:46:54.897Z
Learning: In cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp addToken function, newly allocated blocks are unshared by design. The beam search path in addToken (when sequence.getNumTokens() > windowSize) is currently broken/non-functional with SWA, so the block allocation doesn't follow a shared-then-unshared pattern.
Applied to files:
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.h
📚 Learning: 2025-07-28T17:06:08.621Z
Learnt from: moraxu
Repo: NVIDIA/TensorRT-LLM PR: 6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Applied to files:
tests/unittest/llmapi/test_llm_pytorch.py
🧬 Code graph analysis (7)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.h (3)
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp (2)
isRunning (603-606), isRunning (603-603)
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (2)
isRunning (80-83), isRunning (80-80)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (2)
isRunning (676-679), isRunning (676-676)
cpp/include/tensorrt_llm/executor/cacheCommunicator.h (2)
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (2)
isRunning (80-83), isRunning (80-80)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (2)
isRunning (676-679), isRunning (676-676)
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (3)
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp (2)
isRunning (603-606), isRunning (603-603)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (2)
isRunning (676-679), isRunning (676-676)
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.h (1)
MpiConnectionManager (41-55)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (2)
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp (2)
isRunning (603-606), isRunning (603-603)
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (2)
isRunning (80-83), isRunning (80-80)
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.h (4)
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp (3)
nodiscard (395-406), isRunning (603-606), isRunning (603-603)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.h (2)
nodiscard (75-78), nodiscard (83-88)
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (2)
isRunning (80-83), isRunning (80-80)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (2)
isRunning (676-679), isRunning (676-676)
tests/unittest/llmapi/test_llm_pytorch.py (3)
tests/unittest/llmapi/apps/_test_openai_chat.py (1)
backend (26-27)
tests/unittest/llmapi/apps/_test_openai_completions.py (1)
backend (21-22)
tests/unittest/llmapi/apps/_test_openai_misc.py (1)
backend (14-15)
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp (2)
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (2)
isRunning (80-83), isRunning (80-80)
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (2)
isRunning (676-679), isRunning (676-676)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (16)
cpp/include/tensorrt_llm/executor/cacheCommunicator.h (1)
69-69: LGTM! Well-designed interface addition. The new isRunning() pure virtual method provides a clean way for external components to query connection manager lifecycle state. The [[nodiscard]] attribute ensures callers don't ignore the state, which is appropriate for this safety-critical query.
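For illustration, a minimal C++17 sketch of what such a [[nodiscard]] pure virtual query looks like (the class body below is simplified and is not the verbatim cacheCommunicator.h declaration):

class ConnectionManager
{
public:
    virtual ~ConnectionManager() = default;

    // Callers must consume the lifecycle state rather than silently discard it.
    [[nodiscard]] virtual bool isRunning() const = 0;
};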
cpp/tensorrt_llm/batch_manager/dataTransceiver.cpp (2)
625-628: Appropriate early-exit guard for graceful shutdown. The dual check of mTerminate || !mManager->isRunning() correctly handles both internal shutdown signals and external manager lifecycle changes. This prevents the response thread from attempting operations on a shutting-down manager.
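A self-contained sketch of this kind of loop guard, assuming a condition-variable-driven response loop (the helper types and names below are illustrative, not the actual dataTransceiver.cpp code):

#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>

// Simplified stand-in for the connection manager's lifecycle query.
struct ManagerStub
{
    std::atomic<bool> running{true};
};

// Illustrative response loop showing the dual shutdown check described above.
void responseLoop(ManagerStub& manager, std::atomic<bool>& terminate, std::queue<int>& work,
    std::mutex& mutex, std::condition_variable& condVar)
{
    std::unique_lock<std::mutex> lock(mutex);
    while (true)
    {
        condVar.wait(lock, [&] { return terminate || !manager.running || !work.empty(); });
        if (terminate || !manager.running)
        {
            return; // Exit before touching a shutting-down manager.
        }
        work.pop(); // ... handle one queued response ...
    }
}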
363-367: Verify that callers of recvRequestInfo() safely handle the returned info object during shutdown scenarios. When connection is null and the manager is not running, the function returns info without signaling its validity to callers. Ensure all call sites either validate the returned object's state or use a mechanism (e.g., optional, status field) to indicate shutdown conditions where the result is invalid.
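One hypothetical way to make the shutdown case explicit to callers is an optional-style return; the function and type names below are invented for illustration and do not match the current recvRequestInfo() signature:

#include <optional>
#include <string>

// Hypothetical payload type; the real RequestInfo in this PR is different.
struct RequestInfo
{
    std::string requestId;
};

// Return std::nullopt when the manager is no longer running, instead of a
// default-constructed info whose validity the caller cannot check.
std::optional<RequestInfo> recvRequestInfoOrShutdown(bool managerRunning)
{
    if (!managerRunning)
    {
        return std::nullopt;
    }
    RequestInfo info;
    // ... fill `info` from the received connection ...
    return info;
}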
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.h (1)
299-299: LGTM! Consistent implementation of running state tracking. The isRunning() override and mIsRunning atomic flag follow the same pattern as other connection managers (MpiConnectionManager, UcxConnectionManager). The atomic initialization to true is correct, indicating the manager starts in a running state.
Also applies to: 313-313
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.h (1)
45-45: LGTM! Complete lifecycle management interface. The additions provide proper lifecycle management with:
- Destructor to clean up and signal shutdown
- isRunning() query method for external state checks
- Atomic mIsRunning flag for thread-safe state tracking
Consistent with the broader pattern across all connection manager implementations.
Also applies to: 49-49, 55-55
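A minimal self-contained sketch of this lifecycle pattern, assuming only the pieces named above (the real class has additional members, threads, and cleanup logic):

#include <atomic>

class ConnectionManagerLifecycleSketch
{
public:
    [[nodiscard]] bool isRunning() const
    {
        return mIsRunning;
    }

    ~ConnectionManagerLifecycleSketch()
    {
        // Signal shutdown first so any thread polling isRunning() can exit gracefully.
        mIsRunning = false;
    }

private:
    std::atomic<bool> mIsRunning{true};
};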
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.h (1)
65-65: LGTM! Consistent with other connection manager implementations. The mIsRunning atomic flag and isRunning() override maintain consistency across all connection manager implementations (Agent, MPI, UCX), enabling uniform lifecycle management throughout the cache transmission layer.
Also applies to: 90-90
cpp/tensorrt_llm/executor/cache_transmission/agent_utils/connection.cpp (4)
498-501: Appropriate early-exit for graceful shutdown. The early return when !mIsRunning allows waiting threads to cleanly exit during shutdown. This prevents indefinite blocking when the manager is being destroyed.
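As a rough illustration of the idea, a waiter can re-check the running flag instead of blocking indefinitely; the function name, arguments, and poll interval below are assumptions rather than code from connection.cpp:

#include <atomic>
#include <chrono>
#include <thread>

// Returns false if shutdown was requested before the peer became ready.
bool waitForPeerReady(std::atomic<bool> const& isRunning, std::atomic<bool> const& peerReady)
{
    while (!peerReady)
    {
        if (!isRunning)
        {
            return false; // Shutdown requested; stop waiting so destruction can proceed.
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10)); // illustrative poll interval
    }
    return true;
}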
599-599: Correct lifecycle management in destructor. Setting mIsRunning = false at the beginning of the destructor ensures that any concurrent operations can detect shutdown and exit early before resources are cleaned up. This ordering is critical for preventing use-after-free issues.
603-606: LGTM! Simple and thread-safe accessor. The isRunning() implementation correctly returns the atomic flag, providing thread-safe state queries without requiring additional synchronization.
322-325: Verify nullptr handling in callers of recvConnectionAndRequestInfo. When mIsRunning is false, this method returns nullptr without setting requestInfo. Ensure all callers (particularly in dataTransceiver.cpp) properly handle the null return and the potentially uninitialized requestInfo.
cpp/tensorrt_llm/executor/cache_transmission/mpi_utils/connection.cpp (2)
80-83: LGTM! Thread-safe state accessor. The implementation correctly returns the atomic mIsRunning flag, consistent with other connection manager implementations (AgentConnectionManager, UcxConnectionManager).
85-88: LGTM! Proper lifecycle signaling in destructor. Setting mIsRunning = false in the destructor ensures that any threads checking the running state can detect shutdown and exit gracefully. This follows the same pattern as the other connection managers.
cpp/tensorrt_llm/executor/cache_transmission/ucx_utils/ucxCacheCommunicator.cpp (1)
676-679: LGTM! Consistent with other connection manager implementations. The isRunning() implementation correctly returns the atomic flag, providing thread-safe state queries. This matches the pattern used in AgentConnectionManager and MpiConnectionManager.
tests/unittest/llmapi/test_llm_pytorch.py (3)
1055-1055: LGTM! Good test coverage for both backends. The parameterization correctly extends the test to cover both NIXL and UCX backends, ensuring that the KV cache exhaustion and timeout behavior works correctly for both implementations. This aligns with the PR's goal of introducing unified running state tracking across ConnectionManager implementations.
1056-1057: Function signature correctly updated. The backend parameter is properly added to accept the parameterized values from the decorator, maintaining compatibility with the existing sender_future_timeout_ms parameter.
1077-1077: Backend parameter correctly applied. The change from the hardcoded "UCX" to the parameterized backend variable correctly enables the test to run against both NIXL and UCX backends, validating that the new unified running state tracking works properly for both implementations.
PR_Github #27292 [ run ] triggered by Bot. Commit:
PR_Github #27292 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #27349 [ run ] triggered by Bot. Commit:
PR_Github #27349 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #27515 [ run ] triggered by Bot. Commit:
PR_Github #27515 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #27546 [ run ] triggered by Bot. Commit:
PR_Github #27546 [ run ] completed with state
/bot run --disable-fail-fast
1 similar comment
/bot run --disable-fail-fast
PR_Github #27733 [ run ] triggered by Bot. Commit:
/bot run --disable-fail-fast
PR_Github #27860 [ run ] triggered by Bot. Commit:
PR_Github #27860 [ run ] completed with state
…IDIA#9785) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com> Co-authored-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Summary by CodeRabbit
Bug Fixes
Tests
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...
Provide a user-friendly way for developers to interact with a Jenkins server.
Run /bot [-h|--help] to print this help message.
See details below for each supported subcommand.
Details
run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug (experimental)]
Launch build/test pipelines. All previously running jobs will be killed.
--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.
--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.
--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.
--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.
--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.
--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.
--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.
--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.
--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.
--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.
--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".
--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.
--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.
For guidance on mapping tests to stage names, see
docs/source/reference/ci-overview.md and the scripts/test_to_stage_mapping.py helper.
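For example, a typical invocation combining the options above might look like /bot run --disable-fail-fast --stage-list "A10-PyTorch-1" (the stage name here is illustrative).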
kill
Kill all running builds associated with pull request.
skip
skip --comment COMMENT
Skip testing for latest commit on pull request.
--comment "Reason for skipping build/test"is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.reuse-pipeline
reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.