Issues Search Results · repo:huggingface/transformers language:Python

17k results

System Info Irrelevant Who can help? @ArthurZucker Documentation says: The number of layers that use SWA (Sliding Window Attention). The bottom layers use SWA while the top use full attention. But ...
bug
  • norpadon
  • 2
  • Opened 4 hours ago
  • #38787

System Info transformers v4.52.3 The docs at https://huggingface.co/docs/transformers/accelerate show [image]. However, when working with accelerate in TrainingArgs I get the following issue with the fsdp strategy ...
bug
  • PT-10
  • 1
  • Opened 9 hours ago
  • #38776

Feature request Hi from pytorch distributed! Thanks for showcasing pytorch APIs device_map= auto and tp_plan= auto are somehow coupled right now: https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L4280-L4283 ...
Feature request
  • weifengpy
  • 3
  • Opened 17 hours ago
  • #38771

Model description Apple recently released FastVLM, a new vision-language model introduced at CVPR 2025, which significantly improves on previous models in the LLaVA family. The smallest FastVLM variant ...
New model
  • kamila-chay
  • 2
  • Opened 23 hours ago
  • #38765

System Info - transformers version: 4.49.0 - Platform: Linux-5.4.0-216-generic-x86_64-with-glibc2.31 - Python version: 3.13.2 - Huggingface_hub version: 0.29.2 - Safetensors version: 0.5.3 ...
bug
  • VladPyzh
  • 2
  • Opened yesterday
  • #38753

System Info Before PR 38288, the program would run smoothly even when we set output_attentions=True and the attn implementation was not eager, since it would fall back to eager mode; after this PR, ...
bug
  • kaixuanliu
  • Opened yesterday
  • #38750

System Info When I set infomerconfig.input_size = 1, I find a bug, but I don't know how to fix it. - Function Name: create_network_inputs time_feat = ( torch.cat( ...
bug
  • 2004learner
  • 6
  • Opened yesterday
  • #38745

Feature request Implement handling for configurations where the q_lora_rank parameter is set to None. Motivation 1. DeepSeek-V2-Lite model has q_lora_rank=None so we can support this model with this ...
Feature request
  • bzantium
  • Opened yesterday
  • #38742

Feature request Have a section on Pruna AI within the documentation. We did a similar PR for diffusers and thought it would be nice to show how to optimize transformers models too. Motivation Have ...
Feature request
  • davidberenstein1957
  • 2
  • Opened yesterday
  • #38740

System Info For the current version (4.52.4), in the LlamaAttention class, the type hint for the forward function https://github.com/huggingface/transformers/blob/aa798b7ac9ff5018b3578eb927dc438671ab6a3e/src/transformers/models/llama/modeling_llama.py#L231 ...
bug
  • nhatkhtn
  • 7
  • Opened yesterday
  • #38739
ProTip! Restrict your search to the title by using the in:title qualifier.
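The in:title qualifier above is part of GitHub's search syntax: a query is just free-text terms mixed with key:value qualifiers such as repo:, language:, and in:. A minimal sketch of composing such a query string in Python (build_issue_query is a hypothetical helper, not part of any GitHub tooling):

```python
# Sketch: compose a GitHub issue-search query like the one in this page's
# header. build_issue_query is a hypothetical helper; the qualifier names
# (repo:, language:, in:) are real GitHub search syntax.
from urllib.parse import quote_plus


def build_issue_query(terms, **qualifiers):
    """Join free-text terms with key:value GitHub search qualifiers."""
    parts = list(terms)
    for key, value in qualifiers.items():
        parts.append(f"{key}:{value}")
    return " ".join(parts)


# `in` is a Python keyword, so pass the in:title qualifier via dict unpacking.
query = build_issue_query(
    ["sliding window"],
    **{"repo": "huggingface/transformers", "in": "title"},
)
print(query)              # sliding window repo:huggingface/transformers in:title
print(quote_plus(query))  # percent-encoded form for a /search?q=... URL
```

The same query string can be pasted into the GitHub search box directly or percent-encoded into a search URL.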