Feat: WandbCallback upgrades #2

parambharat · 2023-10-25T11:56:16Z

What does this PR do?

Fixes # (issue)

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.


parambharat added 3 commits October 25, 2023 15:38

feat: add peft config to wandb if it exists in the model

bb7e5fd

feat: add model parameter count to wandb config and model metadata

2b386bb

feat: add metrics on prediction to wandb

665f284
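The diff for these three commits is not shown in this conversation, so the following is only a rough sketch of the idea their titles describe: adding a PEFT config and a parameter count to the run config. `build_run_config` and `DummyModel` are hypothetical names, the `peft_config` attribute and `num_parameters()` method are assumed to exist on the model, and no wandb call is made.

```python
from typing import Any, Dict


def build_run_config(model: Any, base_config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: collect extra model metadata for a run config."""
    config = dict(base_config)  # copy so the caller's dict is not mutated

    # Assumption: PEFT-wrapped models expose a `peft_config` attribute.
    peft_config = getattr(model, "peft_config", None)
    if peft_config is not None:
        config["peft_config"] = peft_config

    # Assumption: the model offers a `num_parameters()` method.
    try:
        config["model/num_parameters"] = model.num_parameters()
    except AttributeError:
        pass  # models without the method are simply skipped

    return config


class DummyModel:
    """Stand-in for a real model, used only for this illustration."""

    peft_config = {"r": 8}

    def num_parameters(self) -> int:
        return 1000


run_config = build_run_config(DummyModel(), {"learning_rate": 1e-4})
```

Guarding both lookups means a model without PEFT or without `num_parameters()` still produces a valid config rather than failing during setup.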

morganmcg1 reviewed Oct 27, 2023

src/transformers/integrations/integration_utils.py

parambharat and others added 5 commits October 27, 2023 11:19

feat: add model architecture to the model artifact

d0f3176

feat: add initial model and architecture to the model artifact on setup

46d0115

Merge branch 'main' into wandb/callback-upgrade

72480ff

feat: add markdown badge to model card

7a3b476

feat: add parameters for peft models and model card badge

44a4226

parambharat added 5 commits February 19, 2024 09:46

Merge branch 'main' into wandb/callback-upgrade

e59e15e

refactor: change checkpoints to log and model and rename initial to base

bf93923

feat: add step and epoch aliases to the checkpoints

8ab50ad

chore: run fixup and style fixes

08ced55

Merge branch 'main' into wandb/callback-upgrade

f0bcb24
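As an illustration of the "step and epoch aliases" commit in this group, the sketch below shows one way such aliases could be derived from the training state. The helper name `checkpoint_aliases` and the exact alias format are assumptions, not taken from the PR diff, which is not shown here.

```python
from typing import List


def checkpoint_aliases(epoch: float, global_step: int) -> List[str]:
    """Hypothetical helper: build human-readable aliases for one checkpoint."""
    # Round the epoch so fractional epochs produce short, stable names.
    return [f"epoch_{round(epoch, 2)}", f"step_{global_step}"]


aliases = checkpoint_aliases(epoch=1.5, global_step=300)
```

Aliases like these would let a specific checkpoint be looked up by training progress rather than by an opaque version number.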

parambharat added 2 commits March 12, 2024 18:04

Merge branch 'main' into wandb/callback-upgrade

62155b2

# Conflicts: # src/transformers/integrations/integration_utils.py

fix: address review comments related to DRY and naming consistency

b1a3110

parambharat and others added 11 commits March 21, 2024 10:14

Merge branch 'main' of github.com:parambharat/transformers into wandb/callback-upgrade

9042c82

Merge branch 'main' of github.com:parambharat/transformers into wandb/callback-upgrade

b50e13b

[docs] Big model loading (huggingface#29920)

096f304

* update * feedback

[bnb] Fix bug in _replace_with_bnb_linear (huggingface#29958)

33288ff

fix bug

[Docs] Make an ordered list prettier in add_tensorflow_model.md (huggingface#29949)

cb5927c

zucchini-nlp and others added 28 commits April 3, 2024 17:00

Fix vipllava for generation (huggingface#29874)

cc75f1a

* fix vipllava generation * consistent llava code * revert llava tests changes

[docs] Fix audio file (huggingface#30006)

34bfe95

new audio file

Superpoint imports fix (huggingface#29898)

c10b5dd

quick fix

[Main CIs] Fix the red cis (huggingface#30022)

695d823

* fix * sort imports

Enable multi-device for efficientnet (huggingface#29989)

03732de

feat: enable mult-idevice for efficientnet

Refactor Cohere Model (huggingface#30027)

517a3e6

* changes * addressing comments * smol fix

Add whisper to IMPORTANT_MODELS (huggingface#30046)

24d787c

Add whisper Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

skip test_encode_decode_fast_slow_all_tokens for now (huggingface#30044)

8b52fa6

skip test_encode_decode_fast_slow_all_tokens for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Fix mixtral ONNX Exporter Issue. (huggingface#29858)

d704c0b

* fix mixtral onnx export * fix qwen model

[Trainer] Allow passing image processor (huggingface#29896)

1ab7136

* Add image processor to trainer * Replace tokenizer=image_processor everywhere

feat: add peft config to wandb if it exists in the model

ec7e47a

feat: add model parameter count to wandb config and model metadata

d1717c6

feat: add metrics on prediction to wandb

042d1aa

feat: add model architecture to the model artifact

cf31c9a

feat: add initial model and architecture to the model artifact on setup

13a4d43

chore: update and rebase with upstream main

940f296

# Conflicts: # src/transformers/integrations/integration_utils.py

feat: add parameters for peft models and model card badge

859b414

refactor: change checkpoints to log and model and rename initial to base

f43dd42

feat: add step and epoch aliases to the checkpoints

a98ffeb

chore: run fixup and style fixes

e80a34e

fix: address review comments related to DRY and naming consistency

b25675b

chore: update and rebase with upstream main

4e5e2a4

# Conflicts: # src/transformers/integrations/integration_utils.py

chore: update and rebase with upstream main

e5ad376

# Conflicts: # src/transformers/integrations/integration_utils.py

chore: run make fixup

10c1142

parambharat closed this Apr 9, 2024

parambharat deleted the wandb/callback-upgrade branch April 9, 2024 05:02
