fix get_scheduler when name is warmup_stable_decay #31128

zspo · 2024-05-30T03:10:04Z

What does this PR do?

fix #31085

I also added some test cases.

@ArthurZucker and @amyeroberts

amyeroberts

Thanks for fixing!

LGTM, but it would be good to have a second look from @SunMarc to confirm this is as expected.

SunMarc

Makes sense! Thanks for fixing, @zspo!

MostHumble · 2024-05-30T14:53:48Z

@SunMarc this means you'd have to pass num_warmup_steps=num_warmup_steps AND scheduler_specific_kwargs={num_stable_steps, num_decay_steps, num_cycles, min_lr_ratio}, as they're required params. This can be confusing, since all of them are specific to that scheduler. I'm not sure why scheduler_specific_kwargs exists.

zspo · 2024-05-30T15:10:09Z

> @SunMarc this means you'd have to pass num_warmup_steps=num_warmup_steps AND scheduler_specific_kwargs={num_stable_steps, num_decay_steps, num_cycles, min_lr_ratio}, as they're required params. This can be confusing, since all of them are specific to that scheduler. I'm not sure why scheduler_specific_kwargs exists.

I considered putting num_warmup_steps in scheduler_specific_kwargs, but it is already an explicit parameter of get_scheduler.

SunMarc · 2024-05-30T15:12:12Z

scheduler_specific_kwargs exists so that get_scheduler doesn't end up with too many arguments, especially ones that are specific to only a few schedulers.
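To make the split concrete, here is a minimal pure-Python sketch of the pattern being discussed: shared arguments such as num_warmup_steps stay explicit, and scheduler-only arguments travel in a dict. This is not the transformers implementation. `get_scheduler_sketch`, `wsd_multiplier` and `linear_multiplier` are hypothetical names made up for illustration, the warmup-stable-decay shape and the defaults for `min_lr_ratio` and `num_cycles` are assumptions, and the functions return a plain learning-rate multiplier rather than a PyTorch scheduler.

```python
import math
from functools import partial


def wsd_multiplier(step, *, num_warmup_steps, num_stable_steps, num_decay_steps,
                   min_lr_ratio=0.0, num_cycles=0.5):
    """Hypothetical warmup-stable-decay multiplier (illustrative shape only)."""
    if step < num_warmup_steps:
        # Linear warmup from 0 to 1.
        return step / max(1, num_warmup_steps)
    if step < num_warmup_steps + num_stable_steps:
        # Hold the peak learning rate.
        return 1.0
    if step < num_warmup_steps + num_stable_steps + num_decay_steps:
        # Cosine decay from 1 down towards min_lr_ratio.
        progress = (step - num_warmup_steps - num_stable_steps) / max(1, num_decay_steps)
        cosine = max(0.0, 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress)))
        return (1.0 - min_lr_ratio) * cosine + min_lr_ratio
    return min_lr_ratio


def linear_multiplier(step, *, num_warmup_steps, num_training_steps):
    """Hypothetical linear warmup followed by linear decay to 0."""
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


def get_scheduler_sketch(name, num_warmup_steps=None, num_training_steps=None,
                         scheduler_specific_kwargs=None):
    """Dispatch on the scheduler name; only shared arguments are explicit."""
    extra = scheduler_specific_kwargs or {}
    if name == "warmup_stable_decay":
        # Shared argument passed explicitly, scheduler-only ones unpacked from the dict.
        return partial(wsd_multiplier, num_warmup_steps=num_warmup_steps, **extra)
    if name == "linear":
        return partial(linear_multiplier, num_warmup_steps=num_warmup_steps,
                       num_training_steps=num_training_steps)
    raise ValueError(f"unknown scheduler name: {name}")


# Usage: the caller supplies num_warmup_steps directly and the rest via the dict.
schedule = get_scheduler_sketch(
    "warmup_stable_decay",
    num_warmup_steps=10,
    scheduler_specific_kwargs={"num_stable_steps": 20, "num_decay_steps": 10, "min_lr_ratio": 0.1},
)
print([round(schedule(s), 3) for s in (0, 5, 15, 35, 40)])
```

The trade-off raised in the thread shows up in the usage lines: one scheduler's settings end up split between a keyword argument and a dict, in exchange for a top-level signature that does not grow with every new scheduler.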

MostHumble · 2024-05-30T15:22:55Z

I see, so it's just two different styles clashing. Thanks.

fix get_scheduler args

commit bf6ea14 Merge: b3261f5 96eb062 Author: Vasqu <antonprogamer@gmail.com> Date: Sat Jun 1 02:49:53 2024 +0200 Merge remote-tracking branch 'origin/main' commit b3261f5 Author: Arthur <48595927+ArthurZucker@users.noreply.github.com> Date: Fri May 31 18:37:43 2024 +0200 Diff converter v2 (huggingface#30868) * current working example! * commit regex and result file * update * nit * push the conversion file * oups * roadmap and nits * attempt diffs for 3 files * persimmon * nit * add diff file that is the same as the modeling_llama.py * fix rope nits * updates * updates with converted versions * give some breathing space to the code * delete * update * update * push the actual result * update regex patterns * update regex patterns * fix some issues * fix some issues * fix some issues * updates * updates * updates * updates * updates * revert changes done to llama * updates * update gemma * updates * oups * current state * current state * update * ouiiii * nit * clear diffs * nit * fixup * update * doc 🚀 * 🔥 * for now use gemma * deal with comments * style * handle funtions * deal with assigns * todos * process inheritage * keep decorators? * 🤗 * deal with duplicates * fixup * correctly remove duplicate code * run ruff post script * ruff deals pretty well with imports, let's leave it to him * ah maybe not lol * for now remove all imports from child. * nit * conversion of llama * okay * convert starcoder2 * synch with main * update llama diff * updates * https://docs.astral.sh/ruff/rules/redefined-while-unused/ fixes the imports, bit needs later version of ruff * updates * okay actual state * non zero exit * update! * revert unrelated * remove other diff files * updates * cleanup * update * less diff! 
* stash * current updates * updates * No need for call * finished fining deps * update * current changes * current state * current state * new status * nit * finally * fixes * nits * order is now expected * use logger info instead of prints * fixup * up * nit * update * nits * update * correct merge * update * update * update * add warning * update caution message * update * better merging strategy * copy class statements :wink * fixups * nits * update * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits * smaller header * do cleanup some stuff * even simpler header? * fixup * updates * ruff * update examples * nit * TODO * state * OUUUUUUF * current state * nits * final state * add a readme * fixup * remove diff llama * fix * nit * dummy noy funny * ruff format tests src utils --check * everless diffs * less diffs and fix test * fixes * naming nit? * update converter and add supper example * nits * updated for function signatures * update * update * add converted dummies * autoformat * single target assign fix * fixup * fix some imports * fixes * don't push them * `# noqa: F841` --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit ba34b39 Author: Vallepu Vamsi Krishna <vallepu670@gmail.com> Date: Fri May 31 21:53:11 2024 +0530 Added description of quantization_config (huggingface#31133) * Description of quantization_config Added missing description about quantization_config in replace_with_bnb_linear for better readability. 
* Removed trailing spaces commit 2a2ec42 Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Fri May 31 16:56:17 2024 +0100 Instance segmentation examples (huggingface#31084) * Initial setup * Metrics * Overfit on two batches * Train 40 epochs * Memory leak debugging * Trainer fine-tuning * Draft * Fixup * Trained end-to-end * Add requirements * Rewrite evaluator * nits * Add readme * Add instance-segmentation to the table * Support void masks * Remove sh * Update docs * Add pytorch test * Add accelerate test * Update examples/pytorch/instance-segmentation/README.md * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Fix consistency oneformer * Fix imports * Fix imports sort * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Add resources to docs * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove explicit model_type argument * Fix tests * Update readme * Note about other models --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit 3231ed4 Author: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com> Date: Fri May 31 14:16:23 2024 +0200 Add 
streaming, various fixes (huggingface#30838) * Implement streaming run in ReAct agents * Allow additional imports in code agents * Python interpreter: support classes and exceptions, fixes commit 899d73f Author: Marc Sun <57196510+SunMarc@users.noreply.github.com> Date: Fri May 31 12:44:20 2024 +0200 [trainer] add sanity evaluation option (huggingface#31146) * add sanity evaluation * fix * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> commit 09daece Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Fri May 31 12:36:46 2024 +0200 Quantization: Enhance bnb error message (huggingface#31160) enhance error message commit 390c9f4 Author: Asif Ajrof <asifajrof@gmail.com> Date: Fri May 31 16:34:29 2024 +0600 Update sam.md (huggingface#31130) `mask` variable is not defined. probably a writing mistake. it should be `segmentation_map`. `segmentation_map` should be a `1` channel image rather than `RGB`. [on a different note, the `mask_url` is the same as `raw_image`. could provide a better example. 
commit a6967c0 Author: Marc Sun <57196510+SunMarc@users.noreply.github.com> Date: Fri May 31 12:08:55 2024 +0200 Fix quantized cache output (huggingface#31143) commit aa2e1d4 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Fri May 31 10:35:54 2024 +0200 pytest -rsfE (huggingface#31140) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 6c33f18 Author: Arthur <48595927+ArthurZucker@users.noreply.github.com> Date: Fri May 31 08:49:33 2024 +0200 helper (huggingface#31152) * helper * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updates * more doc --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit adb74a2 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Thu May 30 17:21:10 2024 +0200 Workflow: Remove `IS_GITHUB_CI` (huggingface#31147) remove `IS_GITHUB_CI` commit 3553184 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Thu May 30 16:47:35 2024 +0200 Docs / Quantization: Replace all occurences of `load_in_8bit` with bnb config (huggingface#31136) Replace all occurences of `load_in_8bit` with bnb config commit e6dcdfd Author: zspo <songpo.zhang@foxmail.com> Date: Thu May 30 22:25:43 2024 +0800 fix get_scheduler when name is warmup_stable_decay (huggingface#31128) fix get_scheduler args commit 9d8b6ea Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Thu May 30 11:45:03 2024 +0200 FIX / Quantization: Add extra validation for bnb config (huggingface#31135) add validation for bnb config commit 7fc432f Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Wed May 29 19:43:51 2024 +0200 Cleanup docker build (huggingface#31119) * remove * build --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit c350b52 Author: Dhruv Pai <46631243+dhruvbpai@users.noreply.github.com> Date: Wed May 29 07:20:59 2024 -0700 Add 
on_optimizer_step to callback options (huggingface#31095) * Modified test * Added on_optimizer_step to callbacks * Move callback after step is called * Added on optimizer step callback commit 545d7ca Author: Joao Gante <joaofranciscocardosogante@gmail.com> Date: Wed May 29 15:17:14 2024 +0100 Add VLM generation default contributor (huggingface#31115) * add Raushan * add Raushan commit 296c546 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Wed May 29 15:56:28 2024 +0200 FIX / Docs: Fix GPTQ expected number of bits (huggingface#31111) Update overview.md commit b643801 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Wed May 29 15:42:39 2024 +0200 Fix nightly circleci (huggingface#31114) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 89261a1 Author: Zach Mueller <muellerzr@gmail.com> Date: Wed May 29 09:35:37 2024 -0400 Rm maintainer + migrate (huggingface#31089) commit 0e3643c Author: Matt <Rocketknight1@users.noreply.github.com> Date: Wed May 29 13:33:26 2024 +0100 Fix faulty rstrip in module loading (huggingface#31108) commit a41deea Author: Matt <Rocketknight1@users.noreply.github.com> Date: Wed May 29 13:20:36 2024 +0100 Fix env.py in cases where torch is not present (huggingface#31113) * Fix env.py in cases where torch is not present * Simplify the fix (and avoid some issues) commit 61f854a Author: Huazhong Ji <hzji210@gmail.com> Date: Wed May 29 18:57:54 2024 +0800 Improve `transformers-cli env` reporting (huggingface#31003) * Improve `transformers-cli env` reporting * move the line `"Using GPU in script?": "<fill in>"` to in if conditional statement * same option for npu commit 40ed3a8 Author: Lucain <lucainp@gmail.com> Date: Wed May 29 12:55:43 2024 +0200 Use `HF_HUB_OFFLINE` + fix has_file in offline mode (huggingface#31016) * Fix has_file in offline mode * harmonize env variable for offline mode * Switch to HF_HUB_OFFLINE * fix test * revert test_offline to test 
TRANSFORMERS_OFFLINE * Add new offline test * merge conflicts * docs commit 300d03c Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Wed May 29 11:43:54 2024 +0200 FEAT: Add mistral v3 conversion script (huggingface#30981) * add mistral v3 conversion script * Update src/transformers/models/mistral/convert_mistral_weights_to_hf.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> commit 524d7bf Author: Raushan Turganbay <raushan@huggingface.co> Date: Wed May 29 14:25:44 2024 +0500 Quantized KV cache: update quanto (huggingface#31052) * quanto latest version was refactored * add error msg * incorrect compare sign * Update src/transformers/cache_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit 9f98c9c Author: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Date: Tue May 28 18:07:07 2024 +0100 Deprecate low use models (huggingface#30781) * Deprecate models - graphormer - time_series_transformer - xlm_prophetnet - qdqbert - nat - ernie_m - tvlt - nezha - mega - jukebox - vit_hybrid - x_clip - deta - speech_to_text_2 - efficientformer - realm - gptsan_japanese * Fix up * Fix speech2text2 imports * Make sure message isn't indented * Fix docstrings * Correctly map for deprecated models from model_type * Uncomment out * Add back time series transformer and x-clip * Import fix and fix-up * Fix up with updated ruff commit 1cb30f0 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 18:29:22 2024 +0200 Docs / Quantization: Redirect deleted page (huggingface#31063) Update _redirects.yml commit 1ed4924 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 18:29:11 2024 +0200 TST: Fix instruct-blip tests (huggingface#31088) * fix 
flan t5 tests * better format commit 2a08fd3 Author: Jonny Li <jonny_li@live.ca> Date: Tue May 28 12:25:15 2024 -0400 Fix DeepSpeed compatibility with weight_norm (huggingface#30881) (huggingface#31018) commit b5f4ec6 Author: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com> Date: Tue May 28 17:47:35 2024 +0200 Fix PretrainedConfig docstring with deprecated resume_download (huggingface#31014) commit 454cbe0 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Tue May 28 17:44:52 2024 +0200 skip `test_multi_gpu_data_parallel_forward` for `vit` and `deit` (huggingface#31086) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit e70c2ea Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 17:06:00 2024 +0200 FIX / OPT: Fix OPT multi-GPU training for `OPTForQuestionAnswering` (huggingface#31092) Update modeling_opt.py commit 6560e25 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 17:05:44 2024 +0200 FIX: Add `accelerate` as a hard requirement (huggingface#31090) add accelerate commit 9bf05ec Author: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> Date: Tue May 28 16:02:51 2024 +0200 Render chat template tojson filter as unicode (huggingface#31041) * Render chat template tojson filter as unicode * ruff-- commit e405f2b Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 15:04:43 2024 +0200 Docs / PEFT: Add PEFT API documentation (huggingface#31078) * add peft references * add peft references * Update docs/source/en/peft.md * Update docs/source/en/peft.md commit 5237955 Author: Raushan Turganbay <raushan@huggingface.co> Date: Tue May 28 17:07:42 2024 +0500 Watermark: fix tests (huggingface#30961) * fix tests * style * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> commit f2a7f7c Author: Lysandre Debut <hi@lysand.re> Date: Tue May 28 13:34:23 2024 +0200 Fix failing tokenizer tests (huggingface#31083) * Fix failing tokenizer tests * Use small tokenizer * Fix remaining reference commit 0e1935b Author: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Date: Tue May 28 13:22:06 2024 +0200 [SuperPoint, PaliGemma] Update docs (huggingface#31025) * Update docs * Add PaliGemma resources * Address comment * Update docs commit 2fe8356 Author: Sina Taslimi <33656391+taslimisina@users.noreply.github.com> Date: Tue May 28 13:09:32 2024 +0200 Fix typo in trainer.py (huggingface#31048) commit b74960c Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Tue May 28 11:06:06 2024 +0000 Fix OWLv2 post_process_object_detection for multiple images (huggingface#31082) * Add test for multiple images * [run slow] owlv2 * Fix box rescaling * [run slow] owlv2 commit 3e3599d Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Tue May 28 10:41:40 2024 +0000 Remove float64 cast for OwlVit and OwlV2 to support MPS device (huggingface#31071) Remove float64 commit 48d33da Author: oOraph <13552058+oOraph@users.noreply.github.com> Date: Tue May 28 11:56:05 2024 +0200 fix from_pretrained in offline mode when model is preloaded in cache (huggingface#31010) * Unit test to verify fix Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> * fix from_pretrained in offline mode when model is preloaded in cache Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> * minor: fmt Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> --------- Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> Co-authored-by: Raphael Glon <oOraph@users.noreply.github.com> commit 7c472e6 Author: Hengwen Tong <kevint324@gmail.com> Date: Tue May 28 17:52:47 2024 +0800 Remove redundant backend checks in training_args.py (huggingface#30999) * Remove backend checks in training_args.py * Expilicit initialize 
the device --------- Co-authored-by: tonghengwen <tonghengwen@cambricon.com> commit 46b606e Author: AP <108011872+apalkk@users.noreply.github.com> Date: Tue May 28 09:50:45 2024 +0000 Update quicktour.md to fix broken link to Glossary (huggingface#31072) Update quicktour.md to fix broken link Missing '/' in attention mask link in the transformers quicktour commit 580f464 Author: Clint Adams <clint@gcfm.net> Date: Tue May 28 05:48:23 2024 -0400 fix "piano" typo (huggingface#31027) commit 5e211d5 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Tue May 28 11:36:26 2024 +0200 Remove `ninja` from docker image build (huggingface#31080) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 8b91c20 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Tue May 28 10:53:28 2024 +0200 use `@main` (huggingface#31065) use main Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 04440a0 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Mon May 27 18:36:39 2024 +0200 skip `test_model_parallelism` for 2 model test classes (huggingface#31067) skip Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit f803e2b Author: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> Date: Mon May 27 16:09:05 2024 +0200 Fix pad_to_max_length Whisper (huggingface#30787) * fix pad_to_max_length Whisper * add tests * make style commit b6eb29b Author: Marc Sun <57196510+SunMarc@users.noreply.github.com> Date: Mon May 27 15:53:45 2024 +0200 Fix quanto tests (huggingface#31062) fix quanto tests commit e581213 Author: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Date: Mon May 27 14:16:47 2024 +0100 Update feature request label in template (huggingface#30940) commit 05eff71 Author: Eitan Turok <150733043+eitanturok@users.noreply.github.com> Date: Mon May 27 08:57:43 2024 -0400 Follow up: Fix link in dbrx.md (huggingface#30514) * Fix link in dbrx.md * remove "though this may not be up to date" 
--------- Co-authored-by: Lysandre Debut <hi@lysand.re> commit d5aa839 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Mon May 27 13:47:47 2024 +0200 unpin uv (huggingface#31055) [push-ci-image] Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 165bd7a Author: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com> Date: Mon May 27 10:34:14 2024 +0200 Redirect transformers_agents doc to agents (huggingface#31054) commit 6df5028 Author: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Date: Fri May 24 19:02:55 2024 +0200 Paligemma- fix devices and dtype assignments (huggingface#31008) * fix devices and dtype assignments * [run-slow]paligemma commit 61f1d47 Author: Ita Zaporozhets <31893021+itazap@users.noreply.github.com> Date: Fri May 24 17:38:58 2024 +0200 Add split special tokens (huggingface#30772) * seems like `split_special_tokens` is used here * split special token * add new line at end of file * moving split special token test to common tests * added assertions * test * fixup * add co-author * passing rest of args to gptsan_japanese, fixing tests * removing direct comparison of fast and slow models * adding test support for UDOP and LayoutXLM * ruff fix * readd check if slow tokenizer * modify test to handle bos tokens * removing commented function * trigger build * applying review feedback - updated docstrings, var names, and simplified tests * ruff fixes * Update tests/test_tokenization_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * applying feedback, comments * shutil temp directory fix --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain> Co-authored-by: itazap <itazap@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MacBook-Pro.local> commit e2b9913 Author: BHUVAN M 
<121122109+bhuvanmdev@users.noreply.github.com> Date: Fri May 24 20:50:09 2024 +0530 added interpolation for vitmae model in pytorch as well as tf. (huggingface#30732) * added interpolation for vitmae model in pytorch as well as tf. * Update modeling_vit_mae.py irreugalr import fixed * small changes and proper formatting * changes suggested in review. * modified decoder interpolate_func * arguments and docstring fix * Apply suggestions from code review doc fixes Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit 96eb062 Author: Arthur <48595927+ArthurZucker@users.noreply.github.com> Date: Fri May 31 18:37:43 2024 +0200 Diff converter v2 (huggingface#30868) * current working example! * commit regex and result file * update * nit * push the conversion file * oups * roadmap and nits * attempt diffs for 3 files * persimmon * nit * add diff file that is the same as the modeling_llama.py * fix rope nits * updates * updates with converted versions * give some breathing space to the code * delete * update * update * push the actual result * update regex patterns * update regex patterns * fix some issues * fix some issues * fix some issues * updates * updates * updates * updates * updates * revert changes done to llama * updates * update gemma * updates * oups * current state * current state * update * ouiiii * nit * clear diffs * nit * fixup * update * doc 🚀 * 🔥 * for now use gemma * deal with comments * style * handle funtions * deal with assigns * todos * process inheritage * keep decorators? * 🤗 * deal with duplicates * fixup * correctly remove duplicate code * run ruff post script * ruff deals pretty well with imports, let's leave it to him * ah maybe not lol * for now remove all imports from child. 
* nit * conversion of llama * okay * convert starcoder2 * synch with main * update llama diff * updates * https://docs.astral.sh/ruff/rules/redefined-while-unused/ fixes the imports, bit needs later version of ruff * updates * okay actual state * non zero exit * update! * revert unrelated * remove other diff files * updates * cleanup * update * less diff! * stash * current updates * updates * No need for call * finished fining deps * update * current changes * current state * current state * new status * nit * finally * fixes * nits * order is now expected * use logger info instead of prints * fixup * up * nit * update * nits * update * correct merge * update * update * update * add warning * update caution message * update * better merging strategy * copy class statements :wink * fixups * nits * update * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * nits * smaller header * do cleanup some stuff * even simpler header? * fixup * updates * ruff * update examples * nit * TODO * state * OUUUUUUF * current state * nits * final state * add a readme * fixup * remove diff llama * fix * nit * dummy noy funny * ruff format tests src utils --check * everless diffs * less diffs and fix test * fixes * naming nit? * update converter and add supper example * nits * updated for function signatures * update * update * add converted dummies * autoformat * single target assign fix * fixup * fix some imports * fixes * don't push them * `# noqa: F841` --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit 372baec Author: Vallepu Vamsi Krishna <vallepu670@gmail.com> Date: Fri May 31 21:53:11 2024 +0530 Added description of quantization_config (huggingface#31133) * Description of quantization_config Added missing description about quantization_config in replace_with_bnb_linear for better readability. 
* Removed trailing spaces commit cdc8131 Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Fri May 31 16:56:17 2024 +0100 Instance segmentation examples (huggingface#31084) * Initial setup * Metrics * Overfit on two batches * Train 40 epochs * Memory leak debugging * Trainer fine-tuning * Draft * Fixup * Trained end-to-end * Add requirements * Rewrite evaluator * nits * Add readme * Add instance-segmentation to the table * Support void masks * Remove sh * Update docs * Add pytorch test * Add accelerate test * Update examples/pytorch/instance-segmentation/README.md * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation_no_trainer.py * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py * Fix consistency oneformer * Fix imports * Fix imports sort * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/run_instance_segmentation.py Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> * Add resources to docs * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update examples/pytorch/instance-segmentation/README.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove explicit model_type argument * Fix tests * Update readme * Note about other models --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sangbum Daniel Choi <34004152+SangbumChoi@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit 9837a25 Author: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com> Date: Fri May 31 14:16:23 2024 +0200 Add 
streaming, various fixes (huggingface#30838) * Implement streaming run in ReAct agents * Allow additional imports in code agents * Python interpreter: support classes and exceptions, fixes commit f8e6ba4 Author: Marc Sun <57196510+SunMarc@users.noreply.github.com> Date: Fri May 31 12:44:20 2024 +0200 [trainer] add sanity evaluation option (huggingface#31146) * add sanity evaluation * fix * Apply suggestions from code review Co-authored-by: Zach Mueller <muellerzr@gmail.com> * fix --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com> commit fc5d3e1 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Fri May 31 12:36:46 2024 +0200 Quantization: Enhance bnb error message (huggingface#31160) enhance error message commit bd9d1dd Author: Asif Ajrof <asifajrof@gmail.com> Date: Fri May 31 16:34:29 2024 +0600 Update sam.md (huggingface#31130) `mask` variable is not defined. probably a writing mistake. it should be `segmentation_map`. `segmentation_map` should be a `1` channel image rather than `RGB`. [on a different note, the `mask_url` is the same as `raw_image`. could provide a better example. 
commit 48cada8 Author: Marc Sun <57196510+SunMarc@users.noreply.github.com> Date: Fri May 31 12:08:55 2024 +0200 Fix quantized cache output (huggingface#31143) commit d19566e Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Fri May 31 10:35:54 2024 +0200 pytest -rsfE (huggingface#31140) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit f3f640d Author: Arthur <48595927+ArthurZucker@users.noreply.github.com> Date: Fri May 31 08:49:33 2024 +0200 helper (huggingface#31152) * helper * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * updates * more doc --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit 6bd511a Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Thu May 30 17:21:10 2024 +0200 Workflow: Remove `IS_GITHUB_CI` (huggingface#31147) remove `IS_GITHUB_CI` commit f5590de Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Thu May 30 16:47:35 2024 +0200 Docs / Quantization: Replace all occurences of `load_in_8bit` with bnb config (huggingface#31136) Replace all occurences of `load_in_8bit` with bnb config commit cda9c82 Author: zspo <songpo.zhang@foxmail.com> Date: Thu May 30 22:25:43 2024 +0800 fix get_scheduler when name is warmup_stable_decay (huggingface#31128) fix get_scheduler args commit 5e5c4d6 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Thu May 30 11:45:03 2024 +0200 FIX / Quantization: Add extra validation for bnb config (huggingface#31135) add validation for bnb config commit 2b9e252 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Wed May 29 19:43:51 2024 +0200 Cleanup docker build (huggingface#31119) * remove * build --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 5c88253 Author: Dhruv Pai <46631243+dhruvbpai@users.noreply.github.com> Date: Wed May 29 07:20:59 2024 -0700 Add 
on_optimizer_step to callback options (huggingface#31095) * Modified test * Added on_optimizer_step to callbacks * Move callback after step is called * Added on optimizer step callback commit 4af705c Author: Joao Gante <joaofranciscocardosogante@gmail.com> Date: Wed May 29 15:17:14 2024 +0100 Add VLM generation default contributor (huggingface#31115) * add Raushan * add Raushan commit cb879c5 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Wed May 29 15:56:28 2024 +0200 FIX / Docs: Fix GPTQ expected number of bits (huggingface#31111) Update overview.md commit 1f84141 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Wed May 29 15:42:39 2024 +0200 Fix nightly circleci (huggingface#31114) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit d16053c Author: Zach Mueller <muellerzr@gmail.com> Date: Wed May 29 09:35:37 2024 -0400 Rm maintainer + migrate (huggingface#31089) commit 0bef4a2 Author: Matt <Rocketknight1@users.noreply.github.com> Date: Wed May 29 13:33:26 2024 +0100 Fix faulty rstrip in module loading (huggingface#31108) commit 97a58a5 Author: Matt <Rocketknight1@users.noreply.github.com> Date: Wed May 29 13:20:36 2024 +0100 Fix env.py in cases where torch is not present (huggingface#31113) * Fix env.py in cases where torch is not present * Simplify the fix (and avoid some issues) commit c886137 Author: Huazhong Ji <hzji210@gmail.com> Date: Wed May 29 18:57:54 2024 +0800 Improve `transformers-cli env` reporting (huggingface#31003) * Improve `transformers-cli env` reporting * move the line `"Using GPU in script?": "<fill in>"` to in if conditional statement * same option for npu commit c3044ec Author: Lucain <lucainp@gmail.com> Date: Wed May 29 12:55:43 2024 +0200 Use `HF_HUB_OFFLINE` + fix has_file in offline mode (huggingface#31016) * Fix has_file in offline mode * harmonize env variable for offline mode * Switch to HF_HUB_OFFLINE * fix test * revert test_offline to test 
TRANSFORMERS_OFFLINE * Add new offline test * merge conflicts * docs commit bfe6f51 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Wed May 29 11:43:54 2024 +0200 FEAT: Add mistral v3 conversion script (huggingface#30981) * add mistral v3 conversion script * Update src/transformers/models/mistral/convert_mistral_weights_to_hf.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> commit d521ba5 Author: Raushan Turganbay <raushan@huggingface.co> Date: Wed May 29 14:25:44 2024 +0500 Quantized KV cache: update quanto (huggingface#31052) * quanto latest version was refactored * add error msg * incorrect compare sign * Update src/transformers/cache_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> commit a564d10 Author: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Date: Tue May 28 18:07:07 2024 +0100 Deprecate low use models (huggingface#30781) * Deprecate models - graphormer - time_series_transformer - xlm_prophetnet - qdqbert - nat - ernie_m - tvlt - nezha - mega - jukebox - vit_hybrid - x_clip - deta - speech_to_text_2 - efficientformer - realm - gptsan_japanese * Fix up * Fix speech2text2 imports * Make sure message isn't indented * Fix docstrings * Correctly map for deprecated models from model_type * Uncomment out * Add back time series transformer and x-clip * Import fix and fix-up * Fix up with updated ruff commit 7f08817 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 18:29:22 2024 +0200 Docs / Quantization: Redirect deleted page (huggingface#31063) Update _redirects.yml commit 3264be4 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 18:29:11 2024 +0200 TST: Fix instruct-blip tests (huggingface#31088) * fix 
flan t5 tests * better format commit 476890e Author: Jonny Li <jonny_li@live.ca> Date: Tue May 28 12:25:15 2024 -0400 Fix DeepSpeed compatibility with weight_norm (huggingface#30881) (huggingface#31018) commit aada568 Author: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com> Date: Tue May 28 17:47:35 2024 +0200 Fix PretrainedConfig docstring with deprecated resume_download (huggingface#31014) commit 3af7bf3 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Tue May 28 17:44:52 2024 +0200 skip `test_multi_gpu_data_parallel_forward` for `vit` and `deit` (huggingface#31086) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit ab19f90 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 17:06:00 2024 +0200 FIX / OPT: Fix OPT multi-GPU training for `OPTForQuestionAnswering` (huggingface#31092) Update modeling_opt.py commit 94d416f Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 17:05:44 2024 +0200 FIX: Add `accelerate` as a hard requirement (huggingface#31090) add accelerate commit 22dab24 Author: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> Date: Tue May 28 16:02:51 2024 +0200 Render chat template tojson filter as unicode (huggingface#31041) * Render chat template tojson filter as unicode * ruff-- commit 4f98b14 Author: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Date: Tue May 28 15:04:43 2024 +0200 Docs / PEFT: Add PEFT API documentation (huggingface#31078) * add peft references * add peft references * Update docs/source/en/peft.md * Update docs/source/en/peft.md commit 779bc36 Author: Raushan Turganbay <raushan@huggingface.co> Date: Tue May 28 17:07:42 2024 +0500 Watermark: fix tests (huggingface#30961) * fix tests * style * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> commit a3c7b59 Author: Lysandre Debut <hi@lysand.re> Date: Tue May 28 13:34:23 2024 +0200 Fix failing tokenizer tests (huggingface#31083) * Fix failing tokenizer tests * Use small tokenizer * Fix remaining reference commit 90da0b1 Author: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Date: Tue May 28 13:22:06 2024 +0200 [SuperPoint, PaliGemma] Update docs (huggingface#31025) * Update docs * Add PaliGemma resources * Address comment * Update docs commit 66add16 Author: Sina Taslimi <33656391+taslimisina@users.noreply.github.com> Date: Tue May 28 13:09:32 2024 +0200 Fix typo in trainer.py (huggingface#31048) commit 98e2d48 Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Tue May 28 11:06:06 2024 +0000 Fix OWLv2 post_process_object_detection for multiple images (huggingface#31082) * Add test for multiple images * [run slow] owlv2 * Fix box rescaling * [run slow] owlv2 commit c31473e Author: Pavel Iakubovskii <qubvel@gmail.com> Date: Tue May 28 10:41:40 2024 +0000 Remove float64 cast for OwlVit and OwlV2 to support MPS device (huggingface#31071) Remove float64 commit 936ab7b Author: oOraph <13552058+oOraph@users.noreply.github.com> Date: Tue May 28 11:56:05 2024 +0200 fix from_pretrained in offline mode when model is preloaded in cache (huggingface#31010) * Unit test to verify fix Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> * fix from_pretrained in offline mode when model is preloaded in cache Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> * minor: fmt Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> --------- Signed-off-by: Raphael Glon <oOraph@users.noreply.github.com> Co-authored-by: Raphael Glon <oOraph@users.noreply.github.com> commit 537deb7 Author: Hengwen Tong <kevint324@gmail.com> Date: Tue May 28 17:52:47 2024 +0800 Remove redundant backend checks in training_args.py (huggingface#30999) * Remove backend checks in training_args.py * Expilicit initialize 
the device --------- Co-authored-by: tonghengwen <tonghengwen@cambricon.com> commit dd4654e Author: AP <108011872+apalkk@users.noreply.github.com> Date: Tue May 28 09:50:45 2024 +0000 Update quicktour.md to fix broken link to Glossary (huggingface#31072) Update quicktour.md to fix broken link Missing '/' in attention mask link in the transformers quicktour commit e18da4e Author: Clint Adams <clint@gcfm.net> Date: Tue May 28 05:48:23 2024 -0400 fix "piano" typo (huggingface#31027) commit 8e3b1fe Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Tue May 28 11:36:26 2024 +0200 Remove `ninja` from docker image build (huggingface#31080) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 8f0f727 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Tue May 28 10:53:28 2024 +0200 use `@main` (huggingface#31065) use main Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 9d35edb Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Mon May 27 18:36:39 2024 +0200 skip `test_model_parallelism` for 2 model test classes (huggingface#31067) skip Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit d355741 Author: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> Date: Mon May 27 16:09:05 2024 +0200 Fix pad_to_max_length Whisper (huggingface#30787) * fix pad_to_max_length Whisper * add tests * make style commit b84cd67 Author: Marc Sun <57196510+SunMarc@users.noreply.github.com> Date: Mon May 27 15:53:45 2024 +0200 Fix quanto tests (huggingface#31062) fix quanto tests commit cd79777 Author: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Date: Mon May 27 14:16:47 2024 +0100 Update feature request label in template (huggingface#30940) commit 0a064dc Author: Eitan Turok <150733043+eitanturok@users.noreply.github.com> Date: Mon May 27 08:57:43 2024 -0400 Follow up: Fix link in dbrx.md (huggingface#30514) * Fix link in dbrx.md * remove "though this may not be up to date" 
--------- Co-authored-by: Lysandre Debut <hi@lysand.re> commit d7942d9 Author: Yih-Dar <2521628+ydshieh@users.noreply.github.com> Date: Mon May 27 13:47:47 2024 +0200 unpin uv (huggingface#31055) [push-ci-image] Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> commit 84c4b72 Author: Aymeric Roucher <69208727+aymeric-roucher@users.noreply.github.com> Date: Mon May 27 10:34:14 2024 +0200 Redirect transformers_agents doc to agents (huggingface#31054) commit bdb9106 Author: Pablo Montalvo <39954772+molbap@users.noreply.github.com> Date: Fri May 24 19:02:55 2024 +0200 Paligemma- fix devices and dtype assignments (huggingface#31008) * fix devices and dtype assignments * [run-slow]paligemma commit deba765 Author: Ita Zaporozhets <31893021+itazap@users.noreply.github.com> Date: Fri May 24 17:38:58 2024 +0200 Add split special tokens (huggingface#30772) * seems like `split_special_tokens` is used here * split special token * add new line at end of file * moving split special token test to common tests * added assertions * test * fixup * add co-author * passing rest of args to gptsan_japanese, fixing tests * removing direct comparison of fast and slow models * adding test support for UDOP and LayoutXLM * ruff fix * readd check if slow tokenizer * modify test to handle bos tokens * removing commented function * trigger build * applying review feedback - updated docstrings, var names, and simplified tests * ruff fixes * Update tests/test_tokenization_common.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * applying feedback, comments * shutil temp directory fix --------- Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com> Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MBP.localdomain> Co-authored-by: itazap <itazap@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Ita Zaporozhets <itazaporozhets@Itas-MacBook-Pro.local> commit e5103a7 Author: BHUVAN M 
<121122109+bhuvanmdev@users.noreply.github.com> Date: Fri May 24 20:50:09 2024 +0530 added interpolation for vitmae model in pytorch as well as tf. (huggingface#30732) * added interpolation for vitmae model in pytorch as well as tf. * Update modeling_vit_mae.py irreugalr import fixed * small changes and proper formatting * changes suggested in review. * modified decoder interpolate_func * arguments and docstring fix * Apply suggestions from code review doc fixes Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

fix get_scheduler args

3189ff1

zspo changed the title from "fix get_scheduler args" to "fix get_scheduler when name is warmup_stable_decay" on May 30, 2024

amyeroberts approved these changes May 30, 2024

SunMarc approved these changes May 30, 2024

amyeroberts merged commit cda9c82 into huggingface:main May 30, 2024
21 checks passed

vasqu pushed a commit to vasqu/transformers that referenced this pull request Jun 1, 2024

fix get_scheduler when name is warmup_stable_decay (huggingface#31128)

e6dcdfd

fix get_scheduler args

zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Jun 11, 2024

fix get_scheduler when name is warmup_stable_decay (huggingface#31128)

138cb7d

fix get_scheduler args
