Skip to content

Conversation

sergiopaniego
Copy link
Contributor

@sergiopaniego sergiopaniego commented Sep 25, 2025

Purpose

Update TRL integration docs with more recent details.
@hmellor


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the documentation Improvements or additions to documentation label Sep 25, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the TRL integration documentation, which is a valuable improvement. The new information about supported trainers and modes is helpful. I've provided a couple of suggestions to enhance clarity and consistency in the documentation. Specifically, I've pointed out an inconsistency between the list of trainers and the list of configurations, and noted that the usage of the vllm_mode parameter could be clarified.

@hmellor
Copy link
Member

hmellor commented Sep 26, 2025

Please fix DCO (instructions here).

Also you can fix the pre-commit checks with:

pip install pre-commit
pre-commit install
pre-commit run -a --hook-stage manual markdownlint

(for future changes pre-commit will run all the non-manual hooks automatically now)

Copy link
Member

@hmellor hmellor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some unrelated changes have been picked up in a merge. Can they please be reverted?

Comment on lines 11 to 30
- [`trl.GRPOConfig.use_vllm`](https://huggingface.co/docs/trl/main/en/grpo_trainer#trl.GRPOConfig.use_vllm)
- [`trl.OnlineDPOConfig.use_vllm`](https://huggingface.co/docs/trl/main/en/online_dpo_trainer#trl.OnlineDPOConfig.use_vllm)
- [`trl.RLOOConfig.use_vllm`](https://huggingface.co/docs/trl/main/en/rloo_trainer#trl.RLOOConfig.use_vllm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this list necessary now that the more complete link has been added above?

If yes, can we make it match the list above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the list since we already have the links to the docs above to simplify.
I've also added some more details about the modes of using vLLM during TRL training.

sergiopaniego and others added 4 commits September 30, 2025 10:39
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
sergiopaniego and others added 4 commits September 30, 2025 10:44
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Copy link
Member

@hmellor hmellor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for improving this doc!

@vllm-bot vllm-bot merged commit 1ad3aca into vllm-project:main Sep 30, 2025
5 checks passed
@sergiopaniego sergiopaniego deleted the trl-docs branch September 30, 2025 10:49
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Sergio Paniego Blanco <sergiopaniegoblanco@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/build documentation Improvements or additions to documentation multi-modality Related to multi-modality (#4194) v1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants