This repository was archived by the owner on Nov 19, 2025. It is now read-only.

fix: pynvml is pinned when using TRTLLM v13 due to breaking change in 12.0.0 #485

Merged
terrykong merged 3 commits into main from tk/pynvml-fix
Jan 21, 2025

Conversation

@terrykong
Collaborator

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Changelog

  • Please update the CHANGELOG.md under next version with high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

Before your PR is "Ready for review"

Pre checks:

Checklist when contributing a new algorithm

  • Does the trainer resume and restore all model states?
  • Does the trainer support all parallelism techniques (PP, TP, DP)?
  • Does the trainer support max_steps=-1 and validation?
  • Does the trainer only call APIs defined in alignable_interface.py?
  • Does the trainer have proper logging?

Additional Information

  • Related to # (issue)

… 12.0.0

Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong requested a review from ko3n1g January 17, 2025 20:50
@terrykong terrykong added the Run CICD Set + un-set to retrigger (add after r*.*.* labels) label Jan 17, 2025
Collaborator

@ko3n1g ko3n1g left a comment

What was the reason for placing this dependency here instead of in requirements.txt?

@terrykong
Collaborator Author

What was the reason for placing this dependency here instead of in requirements.txt?

At the moment we install with --no-deps, so it wouldn't actually make a difference. That's a good point, though: we can also document this in requirements.txt, since we'll eventually rely on it.

ko3n1g previously approved these changes Jan 17, 2025
Collaborator

@ko3n1g ko3n1g left a comment

Thanks for the reminder! Sounds good. I think we'll be revisiting the dependency management soon anyway.

Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong
Collaborator Author

@ko3n1g Updated requirements.txt + comment

@terrykong terrykong added Run CICD Set + un-set to retrigger (add after r*.*.* labels) and removed Run CICD Set + un-set to retrigger (add after r*.*.* labels) labels Jan 17, 2025
Collaborator

@ko3n1g ko3n1g left a comment

Thanks for this change!

@terrykong terrykong enabled auto-merge (squash) January 21, 2025 17:07
@terrykong terrykong added Run CICD Set + un-set to retrigger (add after r*.*.* labels) and removed Run CICD Set + un-set to retrigger (add after r*.*.* labels) labels Jan 21, 2025
@terrykong terrykong merged commit 5f4f6d6 into main Jan 21, 2025
@terrykong terrykong deleted the tk/pynvml-fix branch January 21, 2025 17:46
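
Because the thread notes that packages are installed with --no-deps, a pin recorded in requirements.txt would not be enforced by pip on its own. As a rough illustration only, the check below shows one way an environment could be verified against the constraint named in the PR title. The exact version specifier is not shown in this conversation, so the sketch assumes the intent is "any pynvml release before 12.0.0"; the function names and the `BREAKING_VERSION` constant are hypothetical and not taken from the repository.

```python
import re
from importlib import metadata

# Assumption: the pin is meant to exclude pynvml 12.0.0 and later,
# based only on the PR title ("breaking change in 12.0.0").
BREAKING_VERSION = (12, 0, 0)


def is_before_breaking_release(version: str) -> bool:
    """Return True if the numeric release part of `version` is below 12.0.0."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        # Unparseable versions are treated as incompatible.
        return False
    release = [int(part) for part in match.group().split(".")]
    # Pad short versions such as "12" to three components before comparing.
    release += [0] * (3 - len(release))
    return tuple(release) < BREAKING_VERSION


def installed_pynvml_is_compatible() -> bool:
    """Check the installed pynvml distribution, if any, against the assumed pin."""
    try:
        return is_before_breaking_release(metadata.version("pynvml"))
    except metadata.PackageNotFoundError:
        # Nothing installed, so nothing satisfies the pin.
        return False


if __name__ == "__main__":
    print("pynvml satisfies the assumed pin:", installed_pynvml_is_compatible())
```

A check like this could run at container build time or at import time, which matters most while --no-deps is in use and dependency resolution is not doing the work.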

Labels

Run CICD Set + un-set to retrigger (add after r*.*.* labels)


2 participants