[None][chore] Use affinity rather than CPU count for default build parallelism#12167
tburt-nv merged 3 commits into NVIDIA:main
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
@CodeRabbit review
✅ Actions performed: Review triggered.
📝 Walkthrough
The change introduces CPU affinity awareness to build parallelism defaults by adding a new helper function that retrieves the number of CPUs available to the current process, honoring Linux CPU affinity constraints. This function replaces direct …
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/build_wheel.py (1)
2-2: Update copyright year to 2026.
The file has been meaningfully modified. As per coding guidelines, the copyright header should reflect the year of the latest meaningful modification.
Proposed fix
-# SPDX-FileCopyrightText: Copyright (c) 2022-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2022-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/build_wheel.py` at line 2, Update the SPDX header year from "2022-2025" to "2022-2026" in the top-of-file copyright comment (the line starting with "# SPDX-FileCopyrightText: Copyright (c) 2022-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved."); replace the year range so the header reflects the latest meaningful modification year 2026.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/build_wheel.py`:
- Around line 42-52: The get_available_cpu_count function must handle
cpu_count() possibly returning None; update the except-path in
get_available_cpu_count to coerce the result to an int with a safe default
(e.g., 1) so the function always returns an int. Specifically, when catching
AttributeError and calling cpu_count(), treat a None result as 1 (or use max(1,
cpu_count() or 1)) and return that value. Ensure the change is made inside
get_available_cpu_count so downstream callers (e.g., parallel flags) never
receive None.
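A minimal sketch of the guard this comment asks for. The actual body of `get_available_cpu_count` in `scripts/build_wheel.py` is not shown in this thread, so the structure below is an assumption: it tries `os.sched_getaffinity(0)` first and falls back to `os.cpu_count()`, treating a `None` result as 1.

```python
import os


def get_available_cpu_count() -> int:
    """Return the number of CPUs usable by this process, always an int >= 1."""
    try:
        # Linux-only call: the set of CPUs this process is allowed to run on.
        return max(1, len(os.sched_getaffinity(0)))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS, Windows).
        # os.cpu_count() may return None, so coerce to a safe default of 1.
        return max(1, os.cpu_count() or 1)
```

With this shape, callers that build a parallelism flag from the result should never see `None`.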
---
Nitpick comments:
In `@scripts/build_wheel.py`:
- Line 2: Update the SPDX header year from "2022-2025" to "2022-2026" in the
top-of-file copyright comment (the line starting with "# SPDX-FileCopyrightText:
Copyright (c) 2022-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.");
replace the year range so the header reflects the latest meaningful modification
year 2026.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b146b208-88c3-4864-967d-8883e376164e
📒 Files selected for processing (1)
scripts/build_wheel.py
/bot run
PR_Github #38762 [ run ] triggered by Bot. Commit:
PR_Github #38762 [ run ] completed with state
/bot run
PR_Github #38783 [ run ] triggered by Bot. Commit:
PR_Github #38783 [ run ] completed with state
Summary by CodeRabbit
Release Notes
Description
On some clusters, a build job can have its affinity limited to a subset of the machine's CPU cores and to a portion of its RAM. Launching more processes than there are available CPU cores is wasteful and raises the risk of nvcc crashing with out-of-memory (OOM) errors.
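To see the gap described above on a given machine, something like the following could be run. It is only an illustration, not code from this PR: `describe_cpus` is a hypothetical name, and the script simply compares the machine-wide CPU count with the number of CPUs the current process is allowed to use.

```python
import os


def describe_cpus():
    """Return (machine-wide CPU count, CPUs usable by this process)."""
    total = os.cpu_count()  # may be None if it cannot be determined
    if hasattr(os, "sched_getaffinity"):
        # Linux: honors affinity masks set by the scheduler or job launcher.
        usable = len(os.sched_getaffinity(0))
    else:
        # No affinity API on this platform: fall back to the total, or 1.
        usable = total or 1
    return total, usable


if __name__ == "__main__":
    total, usable = describe_cpus()
    print(f"machine CPUs: {total}, usable by this process: {usable}")
```

On Linux, launching the script under a restricted affinity mask (for example with `taskset -c 0-3`) should report a usable count no larger than the size of that mask, even when the machine has many more cores.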
Test Coverage
N/A
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.