Pull requests: huggingface/trl

Pull requests list

⚰️ Remove deprecated
#3704 opened Jul 8, 2025 by qgallouedec
5 tasks
[GRPO] Log generation entropy
#3700 opened Jul 7, 2025 by LeonEricsson
2 of 5 tasks
FSDP2+GRPO
#3687 opened Jul 3, 2025 by SalmanMohammadi
5 tasks
Support FSDP2 in GRPOTrainer
#3670 opened Jun 30, 2025 by thepowerfuldeez
[SFT] Dry up the sft tests
#3657 opened Jun 27, 2025 by kashif
5 tasks
feat: Initial implementation of RePO trainer and components
#3655 opened Jun 26, 2025 by celsowm
5 tasks
Ensure Chat Template Safe Prompt Truncation
#3646 opened Jun 25, 2025 by pramodith
4 of 5 tasks
[WIP] vllm-server-spec-dec-support
#3643 opened Jun 24, 2025 by shirinyamani
5 tasks
GRPO: Pack Responses within the same group.
#3642 opened Jun 24, 2025 by pramodith Draft
4 of 5 tasks
Add Entropy Control to GRPOTrainer
#3628 opened Jun 22, 2025 by 1485840691
Feature: Add SGLang support for GRPO Trainer
#3627 opened Jun 21, 2025 by PrinsYin Draft
5 tasks
[WIP] [SFT] SFT doc rewrite
#3619 opened Jun 18, 2025 by qgallouedec
5 tasks
ClearML logging of visualization in RewardTrainer evaluation
#3602 opened Jun 16, 2025 by ioverho
2 of 5 tasks
Fix: corrected fsdp in GRPO trainer
#3582 opened Jun 13, 2025 by tryumanshow
2 of 5 tasks
Check rewards shapes in RewardTrainer
#3577 opened Jun 13, 2025 by ioverho
4 tasks done
Chisquare regularized DPO
#3573 opened Jun 12, 2025 by asparius
[WIP] 🥳 new rloo
#3533 opened Jun 3, 2025 by shirinyamani
5 tasks
Push KTAE impl
#3518 opened May 30, 2025 by SamComber
5 tasks
intuit
#3513 opened May 29, 2025 by shirinyamani
5 tasks
🎀 New defaults: gradient_checkpointing=True
#3510 opened May 29, 2025 by qgallouedec
5 tasks
Add Bidirectional Knowledge Distillation Option to GKDTrainer
#3508 opened May 29, 2025 by shaischaudhry
3 of 5 tasks
HF Doc Builder style
#3498 opened May 26, 2025 by qgallouedec Draft
[GRPO] Adds SSR priorized replay buffer
#3496 opened May 26, 2025 by edbeeching