Pull requests: huggingface/transformers
fix: filter None router logits in Qwen3 MoE and handle empty router logits (#39203)
#39206
opened Jul 3, 2025 by
SwiftAkira
Fix errors when use verl to train GLM4.1v model
#39199
opened Jul 3, 2025 by
kaln27
5 tasks
Add packed tensor format support for flex/sdpa/eager through the mask!
for patch
#39194
opened Jul 3, 2025 by
Cyrilvallez
fix(pipelines): QA pipeline returns fewer than top_k results in batch mode
#39193
opened Jul 3, 2025 by
yushi2006
adjust input and output texts for test_modeling_recurrent_gemma.py
#39190
opened Jul 3, 2025 by
kaixuanliu
[modular] Follow global indexing and attribute setting, and their dependencies
#39180
opened Jul 2, 2025 by
Cyrilvallez
fix bug using FSDP V1 will lead to model device not properly set
#39177
opened Jul 2, 2025 by
kaixuanliu
Make _compute_dynamic_ntk_parameters exportable
#39171
opened Jul 2, 2025 by
xadupre
5 tasks
Don't send new comment if the previous one is less than 30 minutes (unless the content is changed)
#39170
opened Jul 2, 2025 by
ydshieh
[bugfix] fix flash attention 2 unavailable error on Ascend NPU
#39166
opened Jul 2, 2025 by
FightingZhen
5 tasks done
Refactor PretrainedConfig.__init__ method to make it more explicit
#39158
opened Jul 1, 2025 by
qubvel
Scaffolding transformers serve to be compatible with Responses API
#39155
opened Jul 1, 2025 by
LysandreJik
Draft
Efficient Expert Weight Fusion for Moe deepseek v3
#39150
opened Jul 1, 2025 by
VassilyLombard