This repository was archived by the owner on Jun 4, 2025. It is now read-only.

@natuan natuan commented Nov 15, 2021

Add support for overriding hyperparameters defined in a recipe. With this change, hyperparameters such as distill_hardness and distill_temperature defined in a quantization recipe can be modified by passing

--recipe_args '{"distill_hardness": 0.8, "distill_temperature": 10}'

to the training command. For an override to take effect, the parameter must be placed within "eval(...)" in the recipe.
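
To illustrate the override behavior described above, here is a minimal, self-contained sketch. It is not the code from this PR: the recipe layout, the `apply_recipe_args` helper, and the key names under `modifier` are assumptions made up for the example. It shows only the idea that values passed as a JSON string replace recipe defaults wherever the recipe refers to them through `eval(...)`.

```python
import json
import re

# Hypothetical recipe fragment: top-level variables plus a modifier that
# refers to them through eval(...). The layout and names are illustrative
# assumptions, not the actual recipe schema.
recipe = {
    "variables": {"distill_hardness": 0.5, "distill_temperature": 2.0},
    "modifier": {
        "hardness": "eval(distill_hardness)",
        "temperature": "eval(distill_temperature)",
        "start_epoch": 0.0,
    },
}

# Matches a value that is exactly "eval(<variable_name>)".
EVAL_PATTERN = re.compile(r"^eval\((\w+)\)$")


def apply_recipe_args(recipe, recipe_args_json):
    """Return modifier settings with eval(...) references resolved.

    Values in recipe_args_json take precedence over the recipe defaults.
    """
    overrides = json.loads(recipe_args_json) if recipe_args_json else {}
    variables = {**recipe["variables"], **overrides}
    resolved = {}
    for key, value in recipe["modifier"].items():
        match = EVAL_PATTERN.match(value) if isinstance(value, str) else None
        # Only values written as eval(...) can be overridden; literals pass through.
        resolved[key] = variables[match.group(1)] if match else value
    return resolved


settings = apply_recipe_args(
    recipe, '{"distill_hardness": 0.8, "distill_temperature": 10}'
)
print(settings)
```

In this sketch, `start_epoch` is a literal rather than an `eval(...)` reference, so no entry in the JSON string can change it, which mirrors the requirement stated in the description.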

@spacemanidol spacemanidol merged commit 644c8fa into master Nov 17, 2021
@spacemanidol spacemanidol deleted the recipe_args_ner branch November 18, 2021 00:06
bfineran pushed a commit that referenced this pull request Jun 5, 2024
* Copy model

* changes

* misc

* fixes

* add embed and residual dropout (#30)

* misc

* remove rms norm and gated MLP

* remove copied mentions where its not a copy anymore

* remove unused _shape

* copied from mistral instead

* fix copies

* fix copies

* add not doctested

* fix

* fix copyright

* Update docs/source/en/model_doc/starcoder2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/starcoder2/configuration_starcoder2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/starcoder2/configuration_starcoder2.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix doc

* revert some changes

* add fa2 tests

* fix styling nit

* fix

* push dummy docs

---------

Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>