Insights: huggingface/transformers
Overview
62 Pull requests merged by 38 people
-
[FIX] Save speed metrics to logs
#38136 merged
May 15, 2025 -
Omit creation of positional IDs within ESM if applicable
#38089 merged
May 15, 2025 -
disable deepspeed when setting up fake trainer
#38101 merged
May 15, 2025 -
enable trainer test cases on xpu
#38138 merged
May 15, 2025 -
Hotfix: Flash Attention 2 support in Pixtral
#38146 merged
May 15, 2025 -
[generate] Run custom generation code from the Hub
#36405 merged
May 15, 2025 -
🔴 Remove head mask in generative models
#35786 merged
May 15, 2025 -
enable csm integration cases on xpu, all passed
#38140 merged
May 15, 2025 -
[Qwen3] Qwen3 MoE add tp plan for expert mlps
#38135 merged
May 15, 2025 -
Fix incorrect attention mask truncate in WhisperFlashAttention2
#36477 merged
May 14, 2025 -
enable d_fine finetuning properly
#37962 merged
May 14, 2025 -
Add my username to `run_slow` whitelist
#38126 merged
May 14, 2025 -
[docs] add uv installation instructions for source builds
#37968 merged
May 14, 2025 -
Update trainer.md
#38113 merged
May 14, 2025 -
Add config validation and style tweaks
#37589 merged
May 14, 2025 -
Fix auto batch size finder test
#38125 merged
May 14, 2025 -
[video processor] fix tests
#38104 merged
May 14, 2025 -
enable finegrained_fp8 and granite_speech cases on XPU
#38036 merged
May 14, 2025 -
Fix description and formatting errors in code docs
#38074 merged
May 13, 2025 -
Add style bot
#38102 merged
May 13, 2025 -
[CSM] update test for t4 runners
#38110 merged
May 13, 2025 -
Add Fast Image Processor for vilt
#37304 merged
May 13, 2025 -
Fix InternVL interpolate_pos_encoding and add to video_processing_auto
#38092 merged
May 13, 2025 -
fix `check_bad_commit.py` gives wrong results
#38107 merged
May 13, 2025 -
[bug] fix llava processor to calculate unpadding size correctly
#37988 merged
May 13, 2025 -
Fix `past_key_values` type hint in model output types
#37953 merged
May 13, 2025 -
Fix bug in prefill_chunk_size that ignores disable_compile flag
#38067 merged
May 13, 2025 -
[smolvlm] skip the test
#38099 merged
May 13, 2025 -
Disable report callbacks for certain training tests
#38088 merged
May 13, 2025 -
fix: Propagate `lr_scheduler_kwargs` options to create LR Scheduler when LayerWiseDummyOptimizer is used
#34559 merged
May 13, 2025 -
add timeout for downloading the `librispeech_asr` dataset
#38073 merged
May 13, 2025 -
update `require_read_token`
#38093 merged
May 13, 2025 -
Refactor image processor phi4
#36976 merged
May 12, 2025 -
uninstall `kernels` from docker images
#38083 merged
May 12, 2025 -
update seed_worker to set seed based on worker_id and rank
#37980 merged
May 12, 2025 -
Fix tot update in trainer
#37923 merged
May 12, 2025 -
fix the inconsistent docstring in apply_chat_template
#38069 merged
May 12, 2025 -
chore(qwen2): display warning log only when sliding window attention …
#36316 merged
May 12, 2025 -
Fix mt5 test on AMD devices
#38081 merged
May 12, 2025 -
Add cuda graphs
#38059 merged
May 12, 2025 -
docs: fix md style
#38057 merged
May 12, 2025 -
Add AMD expectation to test_gpt2_sample
#38079 merged
May 12, 2025 -
Fix OneFormer integration test
#38016 merged
May 12, 2025 -
[`chat`] generate parameterization powered by `GenerationConfig` and UX-related changes
#38047 merged
May 12, 2025 -
[VLM] fix loading issues
#38051 merged
May 12, 2025 -
🔴 Video processors as a separate class
#35206 merged
May 12, 2025 -
fix(conversion): Fix size mismatch error during TF->PT model loading
#38014 merged
May 10, 2025 -
enable generation fsdp/utils cases on XPU
#38009 merged
May 9, 2025 -
Fix linalg.norm for ConvNextV2
#38015 merged
May 9, 2025 -
Fix cache update!
#38046 merged
May 9, 2025 -
Fix reduce-labels in BEIT Fast Image Processor
#38042 merged
May 9, 2025 -
Re-Enable "Trigger CircleCI via GitHub Actions when ready for review" (#37885)
#38041 merged
May 9, 2025 -
Support for version spec in requires & arbitrary mismatching depths across folders
#37854 merged
May 9, 2025 -
Do not erase a cache_position passed explicitly to generate(), if there is one
#37986 merged
May 9, 2025 -
Disable "Trigger CircleCI via GitHub Actions when ready for review"
#38038 merged
May 9, 2025 -
Trigger CircleCI via GitHub Actions when ready for review
#37885 merged
May 9, 2025 -
[Temporary] Log some information in some pytest/pluggy internal places
#37996 merged
May 9, 2025 -
enable utils test cases on XPU
#38005 merged
May 9, 2025 -
make mistral3 pass on xpu
#37882 merged
May 9, 2025 -
fix document masking for chunked attention
#37429 merged
May 9, 2025 -
[`AutoDocstring`] Based on inspect parsing of the signature
#33771 merged
May 8, 2025
49 Pull requests opened by 38 people
-
Update Loss Functions to Accept Tensor num_items_in_batch
#38029 opened
May 8, 2025 -
Add `TemplateConstraint` and `OrdredConstraint` features (#27706)
#38030 opened
May 8, 2025 -
check github actions 3
#38044 opened
May 9, 2025 -
[fix] sliding window attention mask
#38045 opened
May 9, 2025 -
Better pipeline type hints ✨
#38049 opened
May 9, 2025 -
Handling Overlapping Annotations in Mask2Former by A Small Trick
#38054 opened
May 9, 2025 -
SQuat cache implementation
#38055 opened
May 9, 2025 -
[SAM-HQ] Update names in the docs
#38058 opened
May 10, 2025 -
Improved cache docs
#38060 opened
May 10, 2025 -
Fix broken example generation script for Llama3
#38062 opened
May 10, 2025 -
Added scores in the streamer classes based on generation flag
#38064 opened
May 10, 2025 -
Updated the Model docs - for the ALIGN model
#38072 opened
May 11, 2025 -
Cache System Refactor: Layered Architecture
#38077 opened
May 12, 2025 -
[gemma3] fix bidirectional attention mask
#38080 opened
May 12, 2025 -
fix multi-image case for llava-onevision
#38084 opened
May 12, 2025 -
Add CB
#38085 opened
May 12, 2025 -
Refactor `MambaCache` to `modeling_mamba.py` (parity with Zamba)
#38086 opened
May 12, 2025 -
Add optional RMSNorm support to BitNet quantization (config + layers)
#38087 opened
May 12, 2025 -
Refactor `get_XXX_dataloader` from Trainer
#38090 opened
May 12, 2025 -
In Llama4 fix wrongly inverted causal attention mask when using SDPA implementation
#38094 opened
May 12, 2025 -
Fix amp deprecation issue
#38100 opened
May 13, 2025 -
Fix incorrect batching audio index calculation for Phi-4-Multimodal
#38103 opened
May 13, 2025 -
[video processors] support frame sampling within processors
#38105 opened
May 13, 2025 -
Better typing in src/transformers/training_args.py
#38106 opened
May 13, 2025 -
[`Attention`] Refactor Attention Interface for Bart-based Models and Enable Flex Attention
#38108 opened
May 13, 2025 -
Support TP for save_pretrained()
#38111 opened
May 13, 2025 -
Force real tensors and clone state_dict in src/transformers/modeling_utils.py
#38114 opened
May 13, 2025 -
Minor llama4 fixes
#38123 opened
May 14, 2025 -
[phi-4] add processor tests
#38124 opened
May 14, 2025 -
[`compile`] re-enable for Qwen-VL models
#38127 opened
May 14, 2025 -
🚨🚨🚨 [pipelines] update defaults in pipelines that can `generate`
#38129 opened
May 14, 2025 -
Make HF implementation match original OLMo 2 models for lower precisions
#38131 opened
May 14, 2025 -
Add AMD MI300 CI caller leveraging self-hosted runner scale set workflow in hf-workflows
#38132 opened
May 14, 2025 -
Skip non-selected experts for qwen3_moe
#38133 opened
May 15, 2025 -
Fix error in calculating `cache_position` with past_length for Chatglm and Mamba model
#38134 opened
May 15, 2025 -
Fix FSDP + llava-next/llava-onevision
#38141 opened
May 15, 2025 -
[omni modality] support composite processor config
#38142 opened
May 15, 2025 -
add dots1
#38143 opened
May 15, 2025 -
[VLMs] add helpers for get/set embedding
#38144 opened
May 15, 2025 -
remove unhandled parameter
#38145 opened
May 15, 2025 -
[omni modality] support composite preprocessor config
#38149 opened
May 15, 2025 -
Allow qwen/emu3 to process low res images
#38150 opened
May 15, 2025 -
Fix Qwen2.5 Omni `SinusoidsPositionEmbedding` precision
#38151 opened
May 15, 2025 -
Support for transformers explicit filename
#38152 opened
May 15, 2025 -
[core] support tensor-valued _extra_state values in `from_pretrained`
#38155 opened
May 15, 2025 -
Avoid incorrect generations for KV caches containing more than sliding_window tokens
#38156 opened
May 15, 2025 -
Add Idefics2/3 and SmolVLM Fast image processors + improvements for fast image processors
#38157 opened
May 15, 2025 -
[Examples] Add Comprehensive GPT2 vs DistilGPT2 Comparison with Perplexity and Benchmarks
#38158 opened
May 15, 2025 -
Fix handling of slow/fast image processors in image_processing_auto.py
#38161 opened
May 15, 2025
48 Issues closed by 17 people
-
Multiple processor classes have input side-effects
#36865 closed
May 15, 2025 -
transformers has no attribute TFFlorence2ForConditionalGeneration
#37235 closed
May 15, 2025 -
Llama4TextExperts module implementation
#37325 closed
May 15, 2025 -
BatchEncoding.to(device, dtype) could be worked!!
#38096 closed
May 15, 2025 -
Tensor Parallelism with Quantized Models
#38122 closed
May 14, 2025 -
Inconsistency in installation instructions for `venv` and `uv`
#37956 closed
May 14, 2025 -
Potential bug in Qwen 2/2.5 VL Image Preprocessor
#38003 closed
May 14, 2025 -
`output_hidden_states` only return part of hidden_state when setting `device_map="auto"`
#36636 closed
May 14, 2025 -
Unable to load google/siglip2-so400m-patch14-384/
#36845 closed
May 14, 2025 -
OSError: meta-llama/Llama-4-Scout-17B-16E-Instruct does not appear to have a file named X
#37314 closed
May 14, 2025 -
Versions greater than 4.49 are not compatible with Ascend NPU
#37992 closed
May 12, 2025 -
Different DataLoader worker share the same seed and lost randomness
#37932 closed
May 12, 2025 -
[Trainer] tot update steps is incorrect
#37777 closed
May 12, 2025 -
transformers require torch >= 2.1.0 to run fp8 model, but im using 2.7.0
#38034 closed
May 12, 2025 -
Add GPT-2-climate
#20747 closed
May 12, 2025 -
Is there any plan to add kosmos-2 to the transformers.
#24671 closed
May 12, 2025 -
Add MobileViT v2
#22570 closed
May 12, 2025 -
[New model] RT-DETR
#26742 closed
May 12, 2025 -
Typo in modeling_utils.py causing checkpoint loading error with Qwen2.5-VL
#38070 closed
May 12, 2025 -
Qwen/Qwen2.5-VL-7B-Instruct not work [2025-05-10]
#38056 closed
May 12, 2025 -
Video Processor as a separate class
#33504 closed
May 12, 2025 -
Jitter Noise added to input being passed to experts in Switch Transformers
#33969 closed
May 12, 2025 -
opencv imshow stuck forever when importing transformer
#37239 closed
May 12, 2025 -
ed_video = input_tokens.index(video_token_id, st) ValueError: 151656 is not in list
#37240 closed
May 12, 2025 -
TypeError: 'NoneType' object cannot be interpreted as an integer
#37242 closed
May 12, 2025 -
Inconsistent results between torch and jax versions of DINOv2
#37246 closed
May 12, 2025 -
Incorrect word timestamps and word repetitions with Whisper-Large-v3-turbo model
#37248 closed
May 12, 2025 -
RuntimeError when loading InternVL3-14B model: Embedding size mismatch
#38033 closed
May 12, 2025 -
XLA FSDP V2 + TPU + T5 Family Models doesn't work
#35142 closed
May 11, 2025 -
LayerDrop broken in various Flax models (Whisper/BART/more...)
#35468 closed
May 11, 2025 -
llama code break with torch compile
#36484 closed
May 11, 2025 -
a logic error in _preprocess function of Qwen2VLImageProcessor Class
#37064 closed
May 11, 2025 -
Whether transformers Trainer support pipeline parallelism?
#37129 closed
May 11, 2025 -
Qwen FSDP model training hangs when some batches do not contain images
#37186 closed
May 11, 2025 -
Bug when using StaticCache in Qwen2.5 Inference with custom inputs_embeds and attention_masks
#37189 closed
May 11, 2025 -
Gemma3 Gradient Accumulation loss
#37197 closed
May 11, 2025 -
torch.compile graph break when tuning llama with FA2
#37199 closed
May 11, 2025 -
RWKV6-Finch-7B-HF crashes during inference
#37221 closed
May 11, 2025 -
Why does `transformers` load FA2 when it's not asked to do so?
#37227 closed
May 11, 2025 -
Request to add D-FINE
#35283 closed
May 11, 2025 -
Loading a Pytorch model from a Tensorflow saved model doesn't work
#37786 closed
May 10, 2025 -
Removing GenerateMixin inheritance from PreTrainedModel class results in Phi4 load fail
#38050 closed
May 10, 2025 -
Performance degradation on certain vision models from v4.51.*
#37748 closed
May 9, 2025 -
Swinv2Model reports an error when using the parameter use_absolute_embeddings
#37161 closed
May 9, 2025 -
qwen3-moe attention module is defined repeatedly.
#37813 closed
May 9, 2025
28 Issues opened by 25 people
-
`tie_word_embeddings` not saved on customized model
#38160 opened
May 15, 2025 -
Support `extra_state` attributes in from_pretrained
#38154 opened
May 15, 2025 -
Have to import cv2 and pop up a window first, or else it is stuck forever
#38139 opened
May 15, 2025 -
Speed metrics are not logged
#38137 opened
May 15, 2025 -
Emu3 precision regression
#38121 opened
May 14, 2025 -
phi-4-mm HF format
#38120 opened
May 14, 2025 -
Unable to quantize a pretrained SegFormer-B0 to int8 using Quanto
#38119 opened
May 14, 2025 -
Llama4 inference encounter unsupported op in dynamo ?
#38118 opened
May 14, 2025 -
OLMo and OLMo 2 models do not match original models for low precisions
#38117 opened
May 14, 2025 -
support static kv cache with torch.compile for qwen2vl
#38115 opened
May 13, 2025 -
[Bug] Phi-4-multimodal audio processor failed to process multiple audios with close length
#38098 opened
May 13, 2025 -
ImportError: cannot import name 'amp' from 'apex'
#38095 opened
May 13, 2025 -
transformers showing decoder model architecture detected so padding should be left
#38071 opened
May 11, 2025 -
Adding native support to load GGUF models using transformers
#38063 opened
May 10, 2025 -
Weights not initialized correctly when instantiating model with a pretrained backbone
#38061 opened
May 10, 2025 -
Attention mask for multi-image input in gemma3
#38053 opened
May 9, 2025 -
Modernbert 3D attention mask
#38040 opened
May 9, 2025 -
Trainer API doesn't stop after the training has been completed
#38039 opened
May 9, 2025 -
Removing the modification of loss value due to rounding off to 4 digits
#38032 opened
May 9, 2025 -
bug in new prefill_chunk_size implementation
#38028 opened
May 8, 2025
112 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Add Magma Agentic Model from Microsoft
#37267 commented on
May 13, 2025 • 64 new comments -
New cache tests and modular Hybrid Cache
#37972 commented on
May 14, 2025 • 25 new comments -
[core] Completely rewrite the masking logic for all attentions
#37866 commented on
May 15, 2025 • 18 new comments -
Add ColQwen2 to 🤗 transformers
#35778 commented on
May 13, 2025 • 14 new comments -
Bart: new cache format
#35314 commented on
May 15, 2025 • 12 new comments -
Superpoint fast image processor
#37804 commented on
May 15, 2025 • 8 new comments -
Add Fast Image Processor for mobileViT
#37143 commented on
May 12, 2025 • 8 new comments -
36978 | Fast image processor for DPT model
#37481 commented on
May 15, 2025 • 7 new comments -
[Validation] First implementation of `@strict` from `huggingface_hub`
#36534 commented on
May 15, 2025 • 6 new comments -
add profiler to trainer
#37889 commented on
May 15, 2025 • 6 new comments -
Enhance Model Loading By Providing Parallelism, Uses Optional Env Flag
#36835 commented on
May 15, 2025 • 4 new comments -
Support Kosmos-2.5
#31711 commented on
May 15, 2025 • 4 new comments -
Add args support for fast image processors
#37018 commented on
May 13, 2025 • 3 new comments -
Feat: save_pretrained for tensor parallel (and other parallelisms) models
#37919 commented on
May 15, 2025 • 3 new comments -
Fix Float64 RuntimeError on Integrated Graphics when using DirectML
#37735 commented on
May 12, 2025 • 2 new comments -
parallelism goes brrr
#37877 commented on
May 15, 2025 • 2 new comments -
Feat: add warnings for unused keys and rules in tensor parallel
#37893 commented on
May 15, 2025 • 2 new comments -
Adds use_repr to model_addition_debugger_context
#37984 commented on
May 12, 2025 • 2 new comments -
feat: support indivisible shards for TP model loading and TPlizing.
#37220 commented on
May 12, 2025 • 1 new comment -
fix: support grad clipping for TP through replicating non-sharded modules
#36132 commented on
May 15, 2025 • 1 new comment -
add fast image processor nougat
#37661 commented on
May 13, 2025 • 1 new comment -
Translating model_doc/bert.md to Chinese
#37806 commented on
May 14, 2025 • 1 new comment -
make Llama4TextMoe forward more readable
#37529 commented on
May 12, 2025 • 0 new comments -
internalize build_inputs_with_special_tokens and prepare_for_model
#37522 commented on
May 15, 2025 • 0 new comments -
Docs: fix docstrings for Gemma3 modeling
#37534 commented on
May 9, 2025 • 0 new comments -
Add callback to monitor progress in whisper transcription
#37483 commented on
May 14, 2025 • 0 new comments -
Mllama fast image processor
#37539 commented on
May 15, 2025 • 0 new comments -
Inherited CausalLM Tests
#37590 commented on
May 15, 2025 • 0 new comments -
Fix interpolation of convnext image processor
#37460 commented on
May 14, 2025 • 0 new comments -
[Cache] Support compilable cache reuse with smaller batch sizes
#37394 commented on
May 15, 2025 • 0 new comments -
Add `segmentation_maps` support to MobileNetV2ImageProcessor
#37312 commented on
May 9, 2025 • 0 new comments -
Add Fast Image Processor for Chameleon
#37140 commented on
May 13, 2025 • 0 new comments -
Improve typing in TrainingArgument
#36944 commented on
May 13, 2025 • 0 new comments -
fix unexpected kws of input_ids when setup no speech detection of whisper
#36809 commented on
May 13, 2025 • 0 new comments -
RuntimeError when converting and saving Flax ViT model to PyTorch
#37999 commented on
May 12, 2025 • 0 new comments -
Pass `eps` to `Mistral3RMSNorm`
#38026 commented on
May 13, 2025 • 0 new comments -
[ESM] Add flash-attention-2 backend for ESM-2
#38023 commented on
May 15, 2025 • 0 new comments -
Qwen2.5-Omni: Update modeling_qwen2_5_omni.py to fix error when loading quantized weights with AutoAWQ.
#38013 commented on
May 12, 2025 • 0 new comments -
proof of concept for using dataset of test cases for tokenizer tests
#37994 commented on
May 13, 2025 • 0 new comments -
update loss computation in modeling code
#37993 commented on
May 15, 2025 • 0 new comments -
CI result inspector util
#37976 commented on
May 14, 2025 • 0 new comments -
Include output embedding as well with `include_embedding` flag
#37935 commented on
May 13, 2025 • 0 new comments -
Fix wrong example in grounding dino
#37921 commented on
May 10, 2025 • 0 new comments -
support MiniCPM-o2.6
#37917 commented on
May 15, 2025 • 0 new comments -
Feat: Add class_proba option to semantic segmentation post-processing
#37904 commented on
May 13, 2025 • 0 new comments -
Get our efficiency back
#37884 commented on
May 9, 2025 • 0 new comments -
[WIP] Perception lm
#37878 commented on
May 14, 2025 • 0 new comments -
New bart model card
#37858 commented on
May 14, 2025 • 0 new comments -
Add z-loss to Bamba for v2
#37842 commented on
May 9, 2025 • 0 new comments -
Added False case implementation for config.do_stable_layer_norm in FlaxWav2vec2Models
#37822 commented on
May 15, 2025 • 0 new comments -
general spm converter
#37763 commented on
May 15, 2025 • 0 new comments -
Stop autoconverting custom code checkpoints
#37751 commented on
May 9, 2025 • 0 new comments -
[VLMs] add helpers to get multimodal encodings
#37743 commented on
May 9, 2025 • 0 new comments -
refactor can_save_slow_tokenizer
#37722 commented on
May 9, 2025 • 0 new comments -
:rotating_light: :rotating_light: Fix custom code saving
#37716 commented on
May 15, 2025 • 0 new comments -
Add support for manually setting `head_dim` in Qwen2 MoE
#37643 commented on
May 9, 2025 • 0 new comments -
Add time-based evaluation strategy to Trainer
#37642 commented on
May 9, 2025 • 0 new comments -
Add PLM Model
#37634 commented on
May 9, 2025 • 0 new comments -
[WiP] Add EoMT Model
#37610 commented on
May 12, 2025 • 0 new comments -
Trainer Stuck at 0% Progress during Training on Multi-GPU Setup
#38008 commented on
May 12, 2025 • 0 new comments -
Trainer.training_step incorrectly normalizes mean token loss when n_gpu > 1
#37474 commented on
May 12, 2025 • 0 new comments -
Community contribution: Adding GGUF support for more architectures
#33260 commented on
May 12, 2025 • 0 new comments -
How to solve the error of converting Qwen onnx_model to tensorRT_model?
#37408 commented on
May 12, 2025 • 0 new comments -
Loading HQQ quantized models is broken since #35926
#37263 commented on
May 12, 2025 • 0 new comments -
Support multimodal models in vLLM with transformers backend
#37780 commented on
May 12, 2025 • 0 new comments -
Model implementation with Transformers and Hugging face hub.
#27532 commented on
May 12, 2025 • 0 new comments -
how to fine tune TrOCR on specifique langage guide.
#33106 commented on
May 12, 2025 • 0 new comments -
Patches for different modalities
#34585 commented on
May 12, 2025 • 0 new comments -
Refactor bert-based models to use global attention function
#37495 commented on
May 12, 2025 • 0 new comments -
FileNotFoundError when using SentenceTransformerTrainingArguments(load_best_model_at_end=True) and Peft
#34747 commented on
May 12, 2025 • 0 new comments -
Inconsistent Documentation for `dataset_index` Requirement Across ViTPose Models
#36773 commented on
May 12, 2025 • 0 new comments -
Since 4.50.0, saving and loading a Whisper model causes an error
#37172 commented on
May 12, 2025 • 0 new comments -
Issue: Unexpected Shape of logits: When Using generate() with num_return_sequences > 1
#37378 commented on
May 11, 2025 • 0 new comments -
ImportError: cannot import name '_flash_supports_window_size' from 'transformers.modeling_flash_attention_utils'
#37428 commented on
May 11, 2025 • 0 new comments -
facebook/opt-30b Cuda Allocation Error with version >= 4.50.0 code
#37436 commented on
May 11, 2025 • 0 new comments -
Processor multiprocessing error when load custom processor
#37637 commented on
May 10, 2025 • 0 new comments -
Make `argmax` in `post_process_semantic_segmentation` optional
#37715 commented on
May 10, 2025 • 0 new comments -
FP8 tensors not saved correctly
#37250 commented on
May 10, 2025 • 0 new comments -
TimeSformer assumes a fixed number of frames in its layers even though it interpolates temporal embeddings based on the input
#38027 commented on
May 10, 2025 • 0 new comments -
clarify the label shifting behavior of llama models when `labels` is given.
#32944 commented on
May 10, 2025 • 0 new comments -
A shallow copy in groundingdino
#37333 commented on
May 9, 2025 • 0 new comments -
Image Processor fails to process void segmentation maps
#30064 commented on
May 9, 2025 • 0 new comments -
Are there any plans to provide some performance analysis tools for transformers?
#36360 commented on
May 9, 2025 • 0 new comments -
Can't load Llama4 Processor
#37375 commented on
May 9, 2025 • 0 new comments -
Does Qwen_2_5_VL support variable length attention computation?
#38007 commented on
May 9, 2025 • 0 new comments -
[WIP] Add DINO DETR Model to HuggingFace Transformers
#36711 commented on
May 11, 2025 • 0 new comments -
Add Aimv2 model
#36625 commented on
May 13, 2025 • 0 new comments -
Add evolla rebase main
#36232 commented on
May 12, 2025 • 0 new comments -
[WIP] Add a dedicated tokenizer for byte level transformers
#36216 commented on
May 12, 2025 • 0 new comments -
[ModernBERT] Add CausalLM functionality to ModernBERT
#35946 commented on
May 14, 2025 • 0 new comments -
Add padding-free to bamba
#35861 commented on
May 12, 2025 • 0 new comments -
[Whisper] Pipeline: handle long form generation
#35750 commented on
May 9, 2025 • 0 new comments -
Integrate xlstm cleanly.
#35377 commented on
May 15, 2025 • 0 new comments -
[`ESM`] Add support for sdpa.
#34954 commented on
May 13, 2025 • 0 new comments -
Add Molmo (7B-D, 7B-O, 70B)
#33962 commented on
May 12, 2025 • 0 new comments -
Custom beam search scorer argument in generate function
#32097 commented on
May 14, 2025 • 0 new comments -
[Contributions Welcome] Add Fast Image Processors
#36978 commented on
May 15, 2025 • 0 new comments -
Cannot run backward with tensor parallel
#36657 commented on
May 15, 2025 • 0 new comments -
Why can't InternVL3-8B start vLLM after being converted to the Hugging Face format? It shows the error: `ValueError: 'limit_mm_per_prompt' is only supported for multimodal models.'
#38000 commented on
May 15, 2025 • 0 new comments -
Incorrect installation instructions
#37476 commented on
May 15, 2025 • 0 new comments -
4.51.3 is much faster than prevous version - do you see the same?
#37504 commented on
May 15, 2025 • 0 new comments -
Trainer num_tokens() function seem to be outdated and not correct
#37510 commented on
May 15, 2025 • 0 new comments -
Wrong KV cache update for sliding-window attention (SWA) layers when total sequence length reaches window size
#37574 commented on
May 15, 2025 • 0 new comments -
[Community contributions] Model cards
#36979 commented on
May 15, 2025 • 0 new comments -
Qwen2vl support for GGUF
#35282 commented on
May 14, 2025 • 0 new comments -
The "force_words_ids" does not seem to be available on llama4
#37478 commented on
May 14, 2025 • 0 new comments -
Convnext image preprocessor raises an AssertionError when comparing logits
#37461 commented on
May 13, 2025 • 0 new comments -
`last_cache_position` definition issue in hybrid SWA models
#37706 commented on
May 13, 2025 • 0 new comments -
FSDP Torch XLA vs. FSDPv2 (SMPD) Torch XLA checkpoint saving bug
#36004 commented on
May 13, 2025 • 0 new comments -
Whisper word-level timestamp extraction fails with beam search
#36093 commented on
May 13, 2025 • 0 new comments -
pytorch_utils.py > isin_mps_friendly > RuntimeError: Expected elements.dtype() == test_elements.dtype() to be true, but got false.
#37423 commented on
May 13, 2025 • 0 new comments -
Broken phi4 model
#37464 commented on
May 13, 2025 • 0 new comments