Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update with forked from #1

Merged
merged 956 commits into from Mar 20, 2023

Conversation

oushu1zhangxiangxuan1
Copy link
Owner

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

ydshieh and others added 30 commits February 23, 2023 09:41
* Update expect output values - as Hub repo. files are updated

* Update expect output values - as librosa is from 0.9.2 to 0.10.0 on CI docker

* fix

* update one more

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fix bug

* forward contrib credits from discussions

* change logic

---------

Co-authored-by: edbeeching <edbeeching@users.noreply.github.com>
* Ran Black formatting

* Added imports and reformatted

* Update src/transformers/models/encoder_decoder/modeling_tf_encoder_decoder.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* troubleshooting guide: added an error description for missing auto-mapping

* minor polishing

* changed the example

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/troubleshooting.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* [deepspeed tests] fix issues introduced by #21700

* fix

* fix
* Removed useless check for backend

* fix style check for graphormer

* Reverted change and corrected requires_backend for cython

* code qual
#21612)

* fix: Change is_last chunk calc and add conditional break

* format fix

* account for 0 and full stride_rights, add comment

* add new test

* make style

* update slow whisper asr test timestamps

* use nested_simplify on output and round timestamp to hundreths place
* [flax] adding support for batch norm layers

* fixing bugs related to pt+flax integration

* cleanup, batchnorm support in sharded pt to flax

* support for batchnorm tests in pt+flax integration

* simplifying checking batch norm layer
…1756)

* [Examples] Generalise run audio classification for log-mel models

* batch feature extractor

* make style
* Different behavior in DistilBERT when using "inputs_embeds"
Fixes #21089

* fix failing test
* Return and rescale attention_mask

* Add SpecAugment to Whisper modeling

* Fix test

* Update docstring

* Add SpecAug related parameters to model config

* Add the _mask_input_features function to doc

* Fix quality

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove dev comments

* Add test

* Resolve conflict

* feat: mask {feature, time} prob fast tests

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix history

* input_features instead of input ids for TFWhisport doctest

* use translate intead of transcribe
* updated expected

* prediction_length fix

* prediction_length default value

* default prediction_length 24

* revert back prediction_length default

* move prediction_length test
* fix gradient checkpointing bug

* fix gradient checkpointing bug

* ran make fix-copies

* fixed bug

* fixed bug
* Fix resume_from_checkpoint for deepspeed

Fix resume_from_checkpoint for deepspeed, by ensuring that the deepspeed engine is the one to load the checkpoint.

* Empty commit to trigger CI

* Removed deepspeed skipping 

Removed deepspeed skipping inside the _load_from_checkpoint function, as it is obsolete

* another adjustment

* Trigger CI

* trigger circleci

* style

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
* Override the decoding parameters of Seq2SeqTrainer

* Fix quality

* Fix max_length parameter

* Fix quality

* Remove redundant parameter max_length

* Separate the preprocess of train and validation to use different max_target_length
* fix wrong url

* typos in english documentation
make concrete_args from outside available
* add pipeline

* update init

* add zero shot to init

* update inits and correct checkpoints

* update base to support input features

* add tests

* Update src/transformers/pipelines/zero_shot_audio_classification.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/pipelines/zero_shot_audio_classification.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* update pieline code

* use tiny checkpoint

* nits and expected value with tiny model

* style

* last nit on tests values

* fix styling

* fix collate fn that was casting t float

* update

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* uint8 -> bool

* fix copies

* style

* update test modeling commen when checking attention buffers

* style

* use logical not on random mask instead of subtraction with 1

* remove torch uint8

* quality

* remove modified modeling utils

* Update based on review

Co-authored-by: sgugger <sylvain.gugger@gmail.com>

---------

Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* add `accelerate` marker

* add to docs

* Update docs/source/en/testing.mdx
fix nn.init.trunc_normal_ call on half data
* Fix gradient checkpointing bug in gptneox

* Remove use_cache block
amyeroberts and others added 29 commits March 15, 2023 18:37
* Fix regression in pipeline when device=-1 is passed

* Add regression test
* Use return_loss for BridgeTowerForContrastiveLearning, add example

* fix tests

* Update example in BridgeTowerForContrastiveLearning

* Update test_modeling_bridgetower.py

* update model output format

* minor update

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

* make style

---------

Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* t5 remove data dependency

* make style

* make fix-copies

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
* Deal with torch-tensorrt

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Fix align docs typo
Update values

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Tranlstion Italian: migration

* Update migration.mdx

minor fixes

* Update _toctree.yml

* Delete migration.mdx

* Add italian translation of migration.mdx

* Update of migration.mdx translation and toctree
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* Deal with 2 model tester classes in single test file

* make style and quality

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143

* Reduced column width

* Fix formatting.

* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"

This reverts commit 6e95a10.

* Fix export error.

* Revert "Fix formatting."

This reverts commit 8310f60.

* Propagated changes made in SwinV2 to Swin2SR
* add `accelerate` support for XGLM

* fix order
* py38 + torch 2

* increment cache versions

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
Use dash 2.8.1 for now

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* added doc to toc, auto tip with  supported models, mention of task guide in model docs

* make style

* removed "see also"

* minor fix
* LLaMA house-keeping

* Doc links
* fix AutoTP in deepspeed could not work for bloom

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add a method in BloomModel to build ailib

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Add LlamaForSequenceClassification

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Add docstring

* Add test

* Add input embedding getter and setter

* Remove dead code

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
fix(docs): task guide links in model docs
* Add kernel size to NATTEN's QK arguments.

The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional
argument to the QK operation to allow optional RPBs.

This ends up failing NATTEN tests.

This commit adds NATTEN back to circleci and adds the arguments to get
it working again.

* Force NATTEN >= 0.14.5
Revert "Use `dash==2.8.1` for now for daily CI (#22227)"

This reverts commit 5321867.
@oushu1zhangxiangxuan1 oushu1zhangxiangxuan1 merged commit 22ead54 into oushu1zhangxiangxuan1:main Mar 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment