huggingface / trl Public

generated from fastai/nbdev_template

Fork 1.8k
Star 13.2k

Pull requests: huggingface/trl

62 Open 1,468 Closed

Pull requests list

add vllm support for token ids as input

#3280 opened Apr 11, 2025 by wybryan

Reward takes completion ids

#3272 opened Apr 9, 2025 by qgallouedec • Draft

5 tasks

🦙 Llama 4

#3267 opened Apr 9, 2025 by qgallouedec • Draft

5 tasks

[NOT MEANT TO BE MERGED] Log correct/incorrect lengths

#3263 opened Apr 8, 2025 by qgallouedec • Draft

[SFT] support for ring_attn in SFTTrainer

#3262 opened Apr 8, 2025 by kashif

5 tasks

[🐯+GRPO] Support FSDP + Fix bug when using LigerGRPO with DDP

#3260 opened Apr 8, 2025 by shivam15s • Draft

5 tasks

Add a raw generate API to the vLLM server

#3227 opened Apr 3, 2025 by wilrop

5 tasks

Support iterable datasets in GRPO

#3226 opened Apr 3, 2025 by wilrop

5 tasks

feat(trainer): Support multi-role & consecutive turns in DataCollatorForCompletionOnlyLM (#3223)

#3224 opened Apr 3, 2025 by Kirili4ik

4 tasks done

update weight update process group

#3211 opened Apr 2, 2025 by ji-huazhong • Draft

5 tasks

Adding sampling parameters for vllm generation

#3210 opened Apr 2, 2025 by shaipranesh2

Support for Models With Pre-Finetuned LoRA Adapters in GRPO: Add use_peft_as_reference Flag

#3196 opened Mar 31, 2025 by LoganVegnaSHOP

5 tasks done

GRPO: Scalable training with one LLM/node

#3186 opened Mar 31, 2025 by jglaser

3 of 5 tasks

🚀 Enhance GRPO VLLM server from sync to async and accelerate training

#3182 opened Mar 30, 2025 by binary-husky

Co-Locating vLLM w/ training to achieve higher throughput and GPU utilization

#3162 opened Mar 26, 2025 by toslali-ibm

2 of 5 tasks

Extend BCO Trainer dataset format support

#3134 opened Mar 22, 2025 by reihig-ut

1 of 5 tasks

Add GRPO/ Online DPO support for quantitative models when use vllm as infer backbone.

#3133 opened Mar 22, 2025 by maoulee

improvement(utils.py): simplify repeating completion string

#3122 opened Mar 20, 2025 by tpoisonooo

feat: Add Interleaved Trainer implementation

#3107 opened Mar 18, 2025 by ucalyptus2

3 tasks done

Update sft trainer to include better packing

#3100 opened Mar 17, 2025 by Ishan-Kumar2

4 tasks done

[GRPO] add vlm training capabilities to the trainer

#3072 opened Mar 13, 2025 by CompN3rd

3 of 5 tasks

[WIP] PEFT 🤝 Liger DPO

#3065 opened Mar 12, 2025 by SalmanMohammadi • Draft

5 tasks

[WIP] Iterative training scripts for SPIN and SPPO

#3011 opened Mar 5, 2025 by jkx19 • Draft

3 of 5 tasks

Fixing GRPO reward_func being a model with DeepSpeed ZeRO-3

#2984 opened Feb 28, 2025 by jamesbraza

Feature: Add SGLang as inference backend for generation in GRPO

#2981 opened Feb 28, 2025 by jhinpan

5 tasks done
