Insights: huggingface/transformers
Overview
34 Pull requests merged by 22 people
-
[`MT5`] Fix CONFIG_MAPPING issue leading it to load umt5 class
#24678 merged
Jul 7, 2023 -
Fix integration with Accelerate and failing test
#24691 merged
Jul 6, 2023 -
Avoid import `sentencepiece_model_pb2` in `utils.__init__.py`
#24689 merged
Jul 6, 2023 -
DeepSpeed/FSDP ckpt saving utils fixes and FSDP training args fixes
#24591 merged
Jul 6, 2023 -
Add dropouts to GPT-NeoX
#24680 merged
Jul 6, 2023 -
LlamaTokenizer should be picklable
#24681 merged
Jul 6, 2023 -
Add Nucleotide Transformer notebooks and restructure notebook list
#24669 merged
Jul 5, 2023 -
Fix model referenced and results in documentation. Model mentioned was inaccessible
#24609 merged
Jul 5, 2023 -
Unpin `huggingface_hub`
#24667 merged
Jul 5, 2023 -
Add is_torch_mps_available function to utils
#24660 merged
Jul 5, 2023 -
Fix `VisionTextDualEncoderIntegrationTest`
#24661 merged
Jul 5, 2023 -
Fix `EncodecModelTest::test_multi_gpu_data_parallel_forward`
#24663 merged
Jul 5, 2023 -
Make warning disappear for remote code in pipelines
#24603 merged
Jul 4, 2023 -
Add `finetuned_from` property in the autogenerated model card
#24528 merged
Jul 4, 2023 -
Update warning messages referring to post_process_object_detection
#24649 merged
Jul 4, 2023 -
documentation_tests.txt - sort filenames alphabetically
#24647 merged
Jul 4, 2023 -
llama fp16 torch.max bug fix
#24561 merged
Jul 4, 2023 -
Fix audio feature extractor deps
#24636 merged
Jul 4, 2023 -
Check precompiled_charsmap before adding to the normalizers' list for XLNetTokenizerFast conversion.
#24618 merged
Jul 4, 2023 -
Generate: force cache with `inputs_embeds` forwarding
#24639 merged
Jul 3, 2023 -
Generate: multi-device support for contrastive search
#24635 merged
Jul 3, 2023 -
Fix loading dataset docs link in run_translation.py example
#24594 merged
Jul 3, 2023 -
Pin `Pillow` for now
#24633 merged
Jul 3, 2023 -
[Time-Series] Added blog-post to tips
#24482 merged
Jul 3, 2023 -
🌐 [i18n-KO] Translated `perplexity.mdx` to Korean
#23850 merged
Jul 3, 2023 -
[`Umt5`] Add google's umt5 to `transformers`
#24477 merged
Jul 3, 2023 -
Limit Pydantic to V1 in dependencies
#24596 merged
Jun 30, 2023 -
Use protobuf 4
#24599 merged
Jun 30, 2023 -
[several models] improve readability
#24585 merged
Jun 30, 2023 -
Speed up TF tests by reducing hidden layer counts
#24595 merged
Jun 30, 2023 -
Make (TF) CI faster (test only a random subset of model classes)
#24592 merged
Jun 30, 2023 -
Show a warning for missing attention masks when pad_token_id is not None
#24510 merged
Jun 30, 2023 -
Update link to RunHouse hardware setup documentation.
#24590 merged
Jun 30, 2023 -
⚠️⚠️[`T5Tokenize`] Fix T5 family tokenizers⚠️⚠️
#24565 merged
Jun 30, 2023
26 Pull requests opened by 23 people
-
[WIP] Add Llama Flax Implementation
#24587 opened
Jun 30, 2023 -
🌐 [i18n-KO] Translated `tasks/document_question_answering.md` to Korean
#24588 opened
Jun 30, 2023 -
Add forward methods to quantizer that also computes commitment loss
#24593 opened
Jun 30, 2023 -
add link to accelerate doc
#24601 opened
Jun 30, 2023 -
translate the English documentation into Chinese
#24611 opened
Jul 1, 2023 -
[DOC] Clarify relationship between load_best_model_at_end and save_total_limit
#24614 opened
Jul 1, 2023 -
🌐 [i18n-KO] Translated `model_summary.md` to Korean
#24625 opened
Jul 2, 2023 -
Create SECURITY.md
#24627 opened
Jul 2, 2023 -
[`MPT`] Add MosaicML's `MPT` model to transformers
#24629 opened
Jul 3, 2023 -
[WIP] Add LaVIN
#24645 opened
Jul 4, 2023 -
Enable `conversational` pipeline for `GPTSw3Tokenizer`
#24648 opened
Jul 4, 2023 -
fixing name position_embeddings to object_queries
#24652 opened
Jul 4, 2023 -
Llama: add RoPE scaling
#24653 opened
Jul 4, 2023 -
add CFG for .generate()
#24654 opened
Jul 5, 2023 -
🌐 [i18n-KO] Fixed Korean and English `quicktour.md`
#24664 opened
Jul 5, 2023 -
Whisper: fix prompted max length
#24666 opened
Jul 5, 2023 -
updating _compute_mask_indices fn to work with torch compile
#24668 opened
Jul 5, 2023 -
Remove WWT from README
#24672 opened
Jul 5, 2023 -
Fix non-deterministic Megatron-LM checkpoint name
#24674 opened
Jul 5, 2023 -
[`T5`] Adding model_parallel = False to `T5ForQuestionAnswering` and `MT5ForQuestionAnswering`
#24684 opened
Jul 6, 2023 -
🌐 [i18n-KO] Updated Korean `serialization.md`
#24686 opened
Jul 6, 2023 -
[DO NOT MERGE] Test PR for studying #24622
#24690 opened
Jul 6, 2023 -
Removing unnecessary `device=device` in modeling_llama.py
#24696 opened
Jul 6, 2023 -
Bump scipy from 1.8.0 to 1.10.0 in /examples/research_projects/decision_transformer
#24699 opened
Jul 6, 2023 -
Suppress warnings from LUKE for unexpected keys
#24703 opened
Jul 7, 2023
71 Issues closed by 20 people
-
Bug in "Gather all remaining tensors and put them back on the CPU"
#24391 closed
Jul 7, 2023 -
Trainer evaluation stuck when using dynamic padding in distributed evaluation
#23780 closed
Jul 6, 2023 -
Sliding window for finetuning
#23835 closed
Jul 6, 2023 -
Add EMD loss
#23838 closed
Jul 6, 2023 -
InstructBlipProcessor not working with load_in_4bit and load_in_8bit
#24564 closed
Jul 6, 2023 -
TrainingArguments not working in transformers v4.30
#24676 closed
Jul 6, 2023 -
AssertionError: Dynamo only supports FSDP with use_orig_params=True
#24641 closed
Jul 6, 2023 -
Language Modeling on Already Tokenized Data
#24673 closed
Jul 6, 2023 -
Gradient clipping is no longer recommended?
#24677 closed
Jul 6, 2023 -
Custom vision encoder-decoder problem
#24679 closed
Jul 6, 2023 -
Finetuning Whisper with prompts
#24272 closed
Jul 6, 2023 -
RuntimeError
#23815 closed
Jul 5, 2023 -
AttributeError: EagerTensor object has no attribute 'size'
#23819 closed
Jul 5, 2023 -
index out of range in self torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
#23822 closed
Jul 5, 2023 -
IndexError when training on the GLUE dataset using ELECTRA pretrained from scratch.
#23831 closed
Jul 5, 2023 -
Deepspeed ZeRO2 + Trainer does not resume training after evaluation
#24313 closed
Jul 5, 2023 -
LlamaForCausalLM returning prompt without answer
#24624 closed
Jul 5, 2023 -
'eos_token_id' for llama model.generate is not working
#24644 closed
Jul 5, 2023 -
TrainingArguments.report_to is not configured as documented
#24646 closed
Jul 4, 2023 -
Simplifying Output from Text Classification Pipelines
#23092 closed
Jul 4, 2023 -
`torch.compile` is ignored when using DeepSpeed
#23095 closed
Jul 4, 2023 -
Causal language modeling documentation is wrong?
#23841 closed
Jul 4, 2023 -
Training siamese (biencoder) based transformer model with gradient checkpointing throws error
#23801 closed
Jul 4, 2023 -
object of type 'IterableDataset' has no len()
#23809 closed
Jul 4, 2023 -
How to convert flax model to pytorch?
#23810 closed
Jul 4, 2023 -
LoRA training adapter_model.bin is 888 bytes always
#24551 closed
Jul 4, 2023 -
XLNetTokenizerFast conversion fails with identity normalization in Sentencepiece tokenizer
#24616 closed
Jul 4, 2023 -
Datasets in run_translation.py
#24579 closed
Jul 3, 2023 -
Does .generate() support contrastive search on multiple devices?
#24634 closed
Jul 3, 2023 -
How to run a very big model like BLOOM on a cluster of machines?
#23761 closed
Jul 3, 2023 -
Have Trainer do model generation during the evaluation loop
#23763 closed
Jul 3, 2023 -
Transformers Trainer training crashes with GLM models
#23794 closed
Jul 3, 2023 -
Type hinting Inconsistency in beam_search.py
#22856 closed
Jul 3, 2023 -
Questions about using Trainer
#24626 closed
Jul 3, 2023 -
Hi
#24623 closed
Jul 3, 2023 -
Problem with Deepspeed integration
#24438 closed
Jul 2, 2023 -
Dependency package `accelerate` is not installed when installing transformers v4.29.1
#23323 closed
Jul 2, 2023 -
BertTokenizer.save_vocabulary does not save the full vocab
#23743 closed
Jul 2, 2023 -
ImportError: cannot import name 'PartialState' from 'transformers.trainer_pt_utils'
#23744 closed
Jul 2, 2023 -
It seems some BART models from facebook have been removed
#24617 closed
Jul 2, 2023 -
BART is not found (404)
#24620 closed
Jul 2, 2023 -
tokenizer = AutoTokenizer.from_pretrained('distilroberta-base') reports an error
#24586 closed
Jul 1, 2023 -
OneFormerImageProcessor does not support passing local config file, always tries to download from repo
#23116 closed
Jul 1, 2023 -
llama model can't generate EOS
#23230 closed
Jul 1, 2023 -
Save checkpoint asynchronously on cpu to keep GPU training going
#23419 closed
Jul 1, 2023 -
Loading LLaMA hf format from local folder is not using GPU in Google Colab
#23425 closed
Jul 1, 2023 -
`run_mlm.py` doesn't log perplexity to `wandb`
#23593 closed
Jul 1, 2023 -
Unable to download Google/vit-base-patch-16-224 / Getting 404 repo not found error
#23696 closed
Jul 1, 2023 -
Help with TrOCR training for Spanish
#23700 closed
Jul 1, 2023 -
Cannot use the LION optimizer
#23727 closed
Jul 1, 2023 -
High memory usage for BigBirdForPreTraining
#23733 closed
Jul 1, 2023 -
Is it possible to use the transformers library with models, e.g. t5-small, commercially?
#23734 closed
Jul 1, 2023 -
PreTrainedTokenizerFast - whitespace merge skipped
#24610 closed
Jul 1, 2023 -
Protobuf 4 support (again)
#24323 closed
Jul 1, 2023 -
RuntimeError: Could not infer dtype of NoneType
#24606 closed
Jul 1, 2023 -
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:3!
#24410 closed
Jun 30, 2023 -
[BUG] Protobuf not being correctly installed
#24533 closed
Jun 30, 2023 -
Add MQTTS
#24142 closed
Jun 30, 2023 -
TransformerEngine FP8 inference
#23660 closed
Jun 30, 2023 -
AutoTokenizer Encode Error
#23671 closed
Jun 30, 2023 -
About Tokenizer
#23676 closed
Jun 30, 2023 -
How to check word ids for BartTokenizer?
#23679 closed
Jun 30, 2023 -
[RWKV] Inference memory leak unless use_cache=False is specified
#23687 closed
Jun 30, 2023 -
Token Alignment
#23692 closed
Jun 30, 2023 -
ZeRO 3 error: expected the next 4 parameters in the parameter fetch queue to be ... but got ()
#23693 closed
Jun 30, 2023
45 Issues opened by 45 people
-
bf16 with DeepSpeed stage 3 with CPU offload breaks LLaMA 13b+ training
#24702 opened
Jul 7, 2023 -
In RWForCausalLM.prepare_inputs_for_generation, the past_key_values are always None.
#24701 opened
Jul 7, 2023 -
Pix2StructImageProcessor does not accept list of PIL Images
#24700 opened
Jul 7, 2023 -
Assertion `srcIndex < srcSelectDimSize` failed
#24698 opened
Jul 6, 2023 -
`Trainer` class on Mac uses `accelerate` to incorrectly set MPS device
#24697 opened
Jul 6, 2023 -
Time Series Transformer - Dynamic Categorical Features
#24695 opened
Jul 6, 2023 -
Make correct padding for text generation with GPT-NEO
#24694 opened
Jul 6, 2023 -
TF: tensor mismatch error in training with opus100 and t5-small
#24693 opened
Jul 6, 2023 -
Breaking change in upcoming PyTorch version for weight norm and loading pretrained models
#24692 opened
Jul 6, 2023 -
Is there any plan to add Falcon to InstructBLIP?
#24688 opened
Jul 6, 2023 -
OSError: Error no file named pytorch_model.bin
#24687 opened
Jul 6, 2023 -
How to get the last 4 Hidden states from the feature extraction pipeline
#24685 opened
Jul 6, 2023 -
Model checkpoint twice as large when saved with safetensors
#24683 opened
Jul 6, 2023 -
Unable to use Trainer with T5ForQuestionAnswering
#24682 opened
Jul 6, 2023 -
Is there any plan to add Kosmos-2 to transformers?
#24671 opened
Jul 5, 2023 -
Unable to Get Decoded Output from Whisper
#24670 opened
Jul 5, 2023 -
Add ELECTRA/DeBERTa v3 pretraining script (replaced token detection pretraining)
#24665 opened
Jul 5, 2023 -
Loading mT5 checkpoint will load from UMT5 class
#24662 opened
Jul 5, 2023 -
Add HyenaDNA model
#24659 opened
Jul 5, 2023 -
CUDA error: out of memory with zero3 offload
#24658 opened
Jul 5, 2023 -
At least one model's inference seems to have broken from transformers 4.29.2 -> 4.30.*
#24657 opened
Jul 5, 2023 -
Discontinuous learning rate when resuming from checkpoint
#24656 opened
Jul 5, 2023 -
Add a mechanism to transform the forward pass on Flax models
#24655 opened
Jul 5, 2023 -
CLIP pooling is not compatible with adding new tokens
#24650 opened
Jul 4, 2023 -
"RuntimeError: 'weight' must be 2-D" training with DeepSpeed
#24643 opened
Jul 4, 2023 -
openlm-research/open_llama_13b_easylm cannot be downloaded
#24642 opened
Jul 4, 2023 -
'DummyOptim' object has no attribute 'step'
#24640 opened
Jul 3, 2023 -
attention weight clipping
#24638 opened
Jul 3, 2023 -
TFOPTForCausalLM Attention mask size mismatch exception
#24637 opened
Jul 3, 2023 -
TrOCRProcessor.from_pretrained raise KeyError(key)
#24632 opened
Jul 3, 2023 -
Fine-tuning Bloom model - Failed to import transformers.training_args
#24631 opened
Jul 3, 2023 -
Loading GPT-Neo-2.7B raises an error
#24630 opened
Jul 3, 2023 -
[i18n-<languageCode>] Translating docs to <languageName>
#24628 opened
Jul 3, 2023 -
Cannot load BART model
#24615 opened
Jul 1, 2023 -
Fine-tune T5 on SQuAD
#24613 opened
Jul 1, 2023 -
CUDA error: an illegal memory access was encountered
#24608 opened
Jul 1, 2023 -
`logging_dir` is not being generated.
#24607 opened
Jul 1, 2023 -
Contradictory information in documentation about the ability to push quantized models to the Hub
#24604 opened
Jun 30, 2023 -
Support gradient checkpointing for ESM models
#24602 opened
Jun 30, 2023 -
IndexError: index -1 is out of bounds for dimension 1 with size 0
#24600 opened
Jun 30, 2023 -
Falcon-40b-instruct on Runpod
#24598 opened
Jun 30, 2023 -
Test with Pydantic V2
#24597 opened
Jun 30, 2023 -
stuck in the evaluation_loop of trainer.py when training
#24589 opened
Jun 30, 2023
120 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Falcon port
#24523 commented on
Jul 6, 2023 • 105 new comments -
Add Pop2Piano
#21785 commented on
Jul 6, 2023 • 83 new comments -
add VITS model
#24085 commented on
Jul 6, 2023 • 71 new comments -
Add bark
#24086 commented on
Jul 6, 2023 • 66 new comments -
Add Classifier-Free Guidance sampling
#24536 commented on
Jul 6, 2023 • 41 new comments -
Add UDOP
#22940 commented on
Jul 5, 2023 • 17 new comments -
Add Image Completion Transformer (ICT)
#21990 commented on
Jul 4, 2023 • 14 new comments -
Add image object
#24577 commented on
Jul 3, 2023 • 9 new comments -
Add Compact Convolutional Transformer model (CCT)
#24507 commented on
Jul 6, 2023 • 7 new comments -
📓 Text Generation docs rework
#24575 commented on
Jul 6, 2023 • 6 new comments -
Add Multi Resolution Analysis (MRA) (New PR)
#24513 commented on
Jul 6, 2023 • 6 new comments -
`.to` is not supported for `8-bit` models
#23336 commented on
Jul 6, 2023 • 5 new comments -
FP-16 training producing nans on t5-large/flan-t5-xl
#23918 commented on
Jul 6, 2023 • 5 new comments -
[Feature Request] Add timestamp prediction for TF Whisper
#23928 commented on
Jul 6, 2023 • 5 new comments -
Contrastive Search peak memory reduction
#24120 commented on
Jul 6, 2023 • 5 new comments -
LlamaTokenizer: Slow implementation opts for whitespace-lead token (different from fast)
#24569 commented on
Jul 4, 2023 • 4 new comments -
Add ViViT
#22518 commented on
Jun 30, 2023 • 4 new comments -
Separate kwargs of tokenizer and feature_extractor in `ClapProcessor`
#24503 commented on
Jul 3, 2023 • 4 new comments -
Failed to import transformers.pipelines because of the following error (look up to see its traceback): cannot import name 'PartialState' from 'accelerate'
#23340 commented on
Jul 3, 2023 • 3 new comments -
Whisper with Elastic Weight Consolidation
#23880 commented on
Jul 3, 2023 • 3 new comments -
TF Swiftformer
#22771 commented on
Jul 6, 2023 • 3 new comments -
Cannot reproduce results for Pix2struct on InfographicVQA
#23877 commented on
Jul 6, 2023 • 3 new comments -
[WIP]Add TF BEiT Implementation
#18559 commented on
Jul 6, 2023 • 3 new comments -
Add DINOv2
#24016 commented on
Jul 6, 2023 • 3 new comments -
[InstructBLIP] Fix bos token of LLaMa checkpoints
#24492 commented on
Jul 6, 2023 • 3 new comments -
MT5 data padding not working
#24567 commented on
Jun 30, 2023 • 2 new comments -
QLoRA Training does not give expected results
#24212 commented on
Jun 30, 2023 • 2 new comments -
Regression Models
#23189 commented on
Jul 1, 2023 • 2 new comments -
`.to_dict` does not correctly serialize `torch.dtype` in some cases (e.g., vision models)
#23876 commented on
Jul 2, 2023 • 2 new comments -
Get output_hidden_state and output_scores from Flax whisper model
#22612 commented on
Jul 3, 2023 • 2 new comments -
`prompt_ids` does not seem to work with `repetition_penalty`
#23951 commented on
Jul 3, 2023 • 2 new comments -
Add support for BLIP and GIT in image-to-text and VQA pipelines
#21110 commented on
Jul 5, 2023 • 2 new comments -
[WIP] Add CLIPViP
#23802 commented on
Jul 6, 2023 • 2 new comments -
Addition of test code for GPTNeoX Flax support
#24002 commented on
Jul 5, 2023 • 2 new comments -
Add Flax diverse group search
#24508 commented on
Jun 30, 2023 • 2 new comments -
🌐 [i18n-KO] Translated `custom_tools.mdx` to Korean
#24580 commented on
Jul 3, 2023 • 2 new comments -
Model resources contribution
#20055 commented on
Jun 30, 2023 • 1 new comment -
Open AI GPT Model Implementation in Flax
#22647 commented on
Jun 30, 2023 • 1 new comment -
Add training support for EnCodec
#24295 commented on
Jun 30, 2023 • 1 new comment -
fsdp supports bool type in TrainingArguments; using len(args.fsdp) raises a TypeError when fsdp=True is set
#24584 commented on
Jun 30, 2023 • 1 new comment -
[XLMModel, FlaubertModel] Compatibility with torch make_fx
#23907 commented on
Jun 30, 2023 • 1 new comment -
OSError when loading pszemraj/flan-t5-large-grammar-synthesis from Hugging Face
#23902 commented on
Jun 30, 2023 • 1 new comment -
MarkupLM: feature_extraction_markuplm.py only extracts SearchableText
#23887 commented on
Jun 30, 2023 • 1 new comment -
MarkupLM: TypeError: unsupported operand type(s) for +: 'Tensor' and 'NoneType' in modeling_markuplm.py", line 217
#23886 commented on
Jun 30, 2023 • 1 new comment -
Distillation training for the Arabic language
#23879 commented on
Jun 30, 2023 • 1 new comment -
Importing transformers 4.29.2 slows down PyTorch DataLoader's multi-processing significantly
#23870 commented on
Jun 30, 2023 • 1 new comment -
forced_decoder_ids in Whisper models significantly impacts performance, use decoder_input_ids instead
#23845 commented on
Jun 30, 2023 • 1 new comment -
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
#22994 commented on
Jun 30, 2023 • 1 new comment -
GPU memory not completely free after one BERT/RoBERTa fwd pass
#23931 commented on
Jul 1, 2023 • 1 new comment -
4.29.0 bug
#23927 commented on
Jul 1, 2023 • 1 new comment -
AttributeError: 'Wav2Vec2Processor' object has no attribute 'set_lang'
#23925 commented on
Jul 1, 2023 • 1 new comment -
Training "microsoft/beit-large-finetuned-ade-640-640" on my dataset present some worry can not to solve
#23924 commented on
Jul 1, 2023 • 1 new comment -
Adding support for 3D deep learning models.
#23923 commented on
Jul 1, 2023 • 1 new comment -
I want to use 'from_pretrained' to read the '.safetensors' model file. What should I do?
#23176 commented on
Jul 1, 2023 • 1 new comment -
Name Error: "Partial State" is not defind
#22816 commented on
Jul 1, 2023 • 1 new comment -
Two tokenizer initialization methods result in inconsistent segmentation results for special words
#23930 commented on
Jul 2, 2023 • 1 new comment -
https://huggingface.co/sentence-transformers/clip-ViT-B-32 license?
#23956 commented on
Jul 2, 2023 • 1 new comment -
Got errors when calling AutoModelForCausalLM related APIs
#23950 commented on
Jul 2, 2023 • 1 new comment -
Unable to load from pretrained by inputting state_dict directly
#23947 commented on
Jul 2, 2023 • 1 new comment -
Dataset loads training data with values, but training reports num_examples 0
#23893 commented on
Jul 2, 2023 • 1 new comment -
How to generate one token after the other when no past_key_values is returned?
#23639 commented on
Jul 2, 2023 • 1 new comment -
libssl.so.10: cannot open shared object file: No such file or directory
#21805 commented on
Jul 2, 2023 • 1 new comment -
Adding GPTNeoX (Tensorflow version)
#23814 commented on
Jul 2, 2023 • 1 new comment -
Improve EncoderDecoderModel docs
#16135 commented on
Jul 3, 2023 • 1 new comment -
DeepSpeed ZeRO stage3+huggyllama/llama: RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
#24468 commented on
Jul 3, 2023 • 1 new comment -
Inference API takes forever and output: "Model ... is currently loading"
#23793 commented on
Jul 3, 2023 • 1 new comment -
Inconsistency issue in multi-GPU training on a single machine
#22511 commented on
Jul 3, 2023 • 1 new comment -
load_in_8bit=True returns gibberish when inferencing on multi GPU
#23989 commented on
Jul 4, 2023 • 1 new comment -
Bug Report - Tokenizer Issue with Tensor Device Assignment in transformers/pipelines/text_generation.py
#23988 commented on
Jul 4, 2023 • 1 new comment -
How can I use DeepSpeed along with the LION optimizer?
#23987 commented on
Jul 4, 2023 • 1 new comment -
Improve PreTrainedModel.from_pretrained return type
#23980 commented on
Jul 4, 2023 • 1 new comment -
Incorrect handling of EOS tokens in DataCollatorForLanguageModeling when pad_token is set to eos_token
#23530 commented on
Jul 4, 2023 • 1 new comment -
Implement QFormer for pretrain
#22645 commented on
Jul 4, 2023 • 1 new comment -
VideoMAEForVideoClassification does not support `device_map='auto'` yet.
#23086 commented on
Jul 4, 2023 • 1 new comment -
Handle `g_state` in RWKV's customized CUDA kernel to overcome sequence length limitation
#23979 commented on
Jul 4, 2023 • 1 new comment -
Unexpected behaviour
#24006 commented on
Jul 5, 2023 • 1 new comment -
Zero-shot classification pipeline does not appear to batch examples
#24005 commented on
Jul 5, 2023 • 1 new comment -
Use python generator instead of streamer for generation
#23640 commented on
Jul 5, 2023 • 1 new comment -
MaskFormerSwin shows as unsupported in the index
#22948 commented on
Jul 5, 2023 • 1 new comment -
PhraseConstraints appearing only directly after input or at the end of the generated sentence
#19070 commented on
Jul 5, 2023 • 1 new comment -
VideoMAE pretraining error when customizing compute_metrics
#24474 commented on
Jul 6, 2023 • 1 new comment -
How to use LogitsWarper within .generate()?
#24021 commented on
Jul 6, 2023 • 1 new comment -
Parameter: encoder_no_repeat_ngram_size or something that makes model not repeat input tokens in the output.
#23834 commented on
Jul 6, 2023 • 1 new comment -
sequences and scores dimensions are mismatched when using generate()
#23750 commented on
Jul 6, 2023 • 1 new comment -
lr_scheduler not updated when auto_find_batch_size set to True and batch_size decays
#21521 commented on
Jul 6, 2023 • 1 new comment -
Question about resume_from_checkpoint in run_translation_no_trainer.py
#23213 commented on
Jul 6, 2023 • 1 new comment -
Setting fsdp and bf16 doesn't save memory
#22821 commented on
Jul 6, 2023 • 1 new comment -
Add `MegatronT5ForConditionalGeneration`
#22317 commented on
Jul 2, 2023 • 1 new comment -
Add support of output_scores to flax models
#22700 commented on
Jun 30, 2023 • 1 new comment -
Changed "perplexity" to "eval_perplexity"
#23875 commented on
Jun 30, 2023 • 1 new comment -
[`bnb`] Fix blip2 4bit
#23895 commented on
Jul 6, 2023 • 1 new comment -
fixed direct input of state dict functionality
#23948 commented on
Jul 2, 2023 • 1 new comment -
Improve typing for bitsandbytes util
#23981 commented on
Jul 4, 2023 • 1 new comment -
Add overloads for from_pretrained and from_dict on PretrainedConfig
#23983 commented on
Jul 4, 2023 • 1 new comment -
Create overload signatures for cached_file
#23984 commented on
Jul 4, 2023 • 1 new comment -
Add none check when instantiating tokenizer from auto
#24019 commented on
Jul 6, 2023 • 1 new comment -
Fixes all hidden states output in FlaxT5
#24027 commented on
Jul 6, 2023 • 1 new comment -
Add overloads for PretrainedModel.from_pretrained
#24035 commented on
Jul 6, 2023 • 1 new comment -
RWKV can't stop correctly.
#23852 commented on
Jun 30, 2023 • 0 new comments -
Please fix Lora model resume in transformers when using DeepSpeed
#23881 commented on
Jun 30, 2023 • 0 new comments -
CLIP image processor fails when resizing a 1x1 image
#21638 commented on
Jun 30, 2023 • 0 new comments -
Function infer_channel_dimension_format has a bug
#21981 commented on
Jun 30, 2023 • 0 new comments -
HF CLIP image features different from OpenAI CLIP image features
#22505 commented on
Jun 30, 2023 • 0 new comments -
Going above version 4.21.3 gives UnicodeDecodeError
#22675 commented on
Jun 30, 2023 • 0 new comments -
OwlVit gives different results compared to original colab version
#21206 commented on
Jun 30, 2023 • 0 new comments -
🌐 [i18n-KO] Translating docs to Korean
#20179 commented on
Jul 1, 2023 • 0 new comments -
MPT
#23174 commented on
Jul 3, 2023 • 0 new comments -
Add DINOv2 to Transformers
#23739 commented on
Jul 3, 2023 • 0 new comments -
Error projecting concatenated Fourier Features.
#22769 commented on
Jul 3, 2023 • 0 new comments -
Model outputs are impacted by the aspect ratios of other images in a batch
#23218 commented on
Jul 3, 2023 • 0 new comments -
_keys_to_ignore_on_load_unexpected not working with GPT2 model
#24581 commented on
Jul 4, 2023 • 0 new comments -
[Bug] Failure to generate Diffusion images / AI language responses when upgrading past 4.19.2
#23238 commented on
Jul 4, 2023 • 0 new comments -
OneFormer processor does not return correctly formatted class_labels tensors
#23058 commented on
Jul 4, 2023 • 0 new comments -
RWKV - Inference NF4 quantization broken, also Int8 quantization weirdness.
#23848 commented on
Jul 6, 2023 • 0 new comments -
save_pretrained 4-bit models with bitsandbytes
#23904 commented on
Jul 6, 2023 • 0 new comments -
4bit Blip2 compatibility
#23839 commented on
Jul 6, 2023 • 0 new comments -
Add keypoint-detection task
#24044 commented on
Jul 6, 2023 • 0 new comments -
Add accelerate support - vision MAE models
#23114 commented on
Jun 30, 2023 • 0 new comments -
[`config.to_dict()`] update test and models that have a composition
#23888 commented on
Jul 2, 2023 • 0 new comments -
owl-vit-eval-postprocessor
#23982 commented on
Jul 4, 2023 • 0 new comments