Add kvcache config for Mistral #1766
Conversation
Can you add an explanation/comment explaining why this is needed and how it differs from the other Llama models?
Also, will this work for the 70b model? If so, can we update the name of the config to cover the group of Llama models it applies to?
@@ -138,6 +138,16 @@ class Config:
     multiply_batch_by_num_att_heads=False,
 )

+MISTRAL_CONFIG = KeyValueCacheConfig(
+    model_name="mistral",
+    additional_transforms=AdditionalTransformsLLAMA,
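For context on the registry the diff adds to, here is a minimal, hypothetical sketch of how a per-model `KeyValueCacheConfig` lookup could work. Only the field names visible in the diff (`model_name`, `additional_transforms`, `multiply_batch_by_num_att_heads`) come from the source; the dataclass definition, the `SUPPORTED_CONFIGS` list, and the `get_kv_cache_config` helper are illustrative stand-ins, not sparseml's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for sparseml's KeyValueCacheConfig; only the
# three field names shown in the diff are taken from the source.
@dataclass
class KeyValueCacheConfig:
    model_name: str
    additional_transforms: Optional[object] = None
    multiply_batch_by_num_att_heads: bool = False


LLAMA_CONFIG = KeyValueCacheConfig(model_name="llama")
MISTRAL_CONFIG = KeyValueCacheConfig(model_name="mistral")

# Assumed registry: one entry per supported model family.
SUPPORTED_CONFIGS = [LLAMA_CONFIG, MISTRAL_CONFIG]


def get_kv_cache_config(model_name: str) -> KeyValueCacheConfig:
    """Return the registered config whose model_name matches."""
    for config in SUPPORTED_CONFIGS:
        if config.model_name == model_name:
            return config
    raise ValueError(f"No KV cache config registered for {model_name!r}")


print(get_kv_cache_config("mistral").model_name)  # mistral
```

The design question raised in the review thread then becomes: is `"mistral"` its own registry entry, or one of several names sharing a Llama-style entry?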
A warning will show up every time about the number of updated nodes if they aren't the standard count, just an FYI.
Yeah, we should fix this for all Llama models
@dsikka could you elaborate?
Mistral actually uses a "completely" new model/config class, so the config.json holds a "MistralForCausalLM" architecture. Here is an example config. There is only one pretrained Mistral arch model and it is 7b. Based on how we're currently architecting this, I assume we need to make a new config entry for every new architecture. We should make a separate PR in order to support Llama models with GQA enabled.
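To illustrate the point about config.json declaring a distinct architecture: a Hugging Face-style config lists its model class under the `architectures` key, and a registry keyed on model family would need to map that string to a config entry. The `ARCH_TO_MODEL_NAME` table and `model_name_from_config` helper below are hypothetical sketches, not sparseml code.

```python
import json

# Illustrative mapping (assumed, not from the PR) from the
# `architectures` entry in a Hugging Face config.json to the
# model_name a KV cache config registry would be keyed on.
ARCH_TO_MODEL_NAME = {
    "LlamaForCausalLM": "llama",
    "MistralForCausalLM": "mistral",
}


def model_name_from_config(config_path: str) -> str:
    """Read config.json and resolve its architecture to a registry key."""
    with open(config_path) as f:
        config = json.load(f)
    arch = config["architectures"][0]  # e.g. "MistralForCausalLM"
    try:
        return ARCH_TO_MODEL_NAME[arch]
    except KeyError:
        raise ValueError(f"Unsupported architecture: {arch}") from None
```

Under this scheme, each genuinely new architecture class (as Mistral is) needs a new mapping entry, which matches the "one config entry per new architecture" observation above.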
@mgoin from what we've seen, though, it seems like the config you added could also be used for GQA?
@dsikka yes, it should work for Llama models with GQA if the name is changed to "llama". My position is that Llama GQA support is a separate issue from Mistral support, but if y'all want to add it into this diff I'm good with that. I'm not sure how to structure the change without affecting non-GQA Llama models.
* Add kvcache config for Mistral
* Update configs.py
* Update configs.py
* Update `src/sparseml/modifiers/obcq/pytorch.py` to use the layer prefix from the model
* Remove `layer_prefix` from the `SparseGPTModifier` base
* Update `ModelMetaData` to include `layer_prefix`
* Add a convenience function to update missing values in a `RecipeMetaData` instance from another `RecipeMetaData` instance
* Update simplified recipe to also include metadata
* Update `simplify_combine_recipes` to include metadata
* Add `layer_prefix` property to `ModifiableModel`; propagate `layer_prefix` to the superclass
* Update `session.py` to `set_layer_prefix` on the model before initializing modifiers
* Update example recipe to include `layer_prefix` in metadata
* Add missing docstring
* Address review comment; update docstring; add test for `update_missing_metadata`
* Add test
* Style
* Fix tests
* Style
* [modifier refactor] Add constant pruning tests (#1752): add end-to-end tests for the constant pruning modifier
* Move imports inside the test functions so that torch isn't imported unless running the tests
* Update `setup.py` to not run modifier tests unless pytorch is specified
* [Bugfix] `.dict()` method on `Recipe` (#1753); remove extraneous local test [faulty commit]
* [modifier refactor] Add serialization tests (#1755); clean up
* Keep original stage and group names; clean up `_get_yaml_dict`
* Fix comment; fix typo
* [Unit Tests][Modifier Refactor] (#1756): move valid recipes to a helper file; add tests for `session.py`
* Increase test coverage of `src/sparseml/core/session.py` to 100%; run style; add logs to `.gitignore`
* Increase coverage of `tests/sparseml/core/test_state.py` to 100%
* Add tests for `lifecycle/event.py`; increase its code coverage to 100%
* Increase `lifecycle/session.py` code coverage to 93%
* Address review comments from @Satrat
* Address review comments on 1752 (#1772): update makefile to only ignore `*pytorch.py` files in the modifier dir; fix order in test; add regex to makefile
* Add helper function to determine if torch tests should be run; check masks; make transformers import optional in `sparsegpt.py`
* Fix merge conflict
* Add more tests to check that valid modifiers are created (#1774)
* [Bug][ConstantPruningModifier] Fix mask de-register bug (#1773): fix mask de-register logic; remove commented-out line
* Move tests inside the pytorch directory as requested
* Fix session reset (#1790)
* Fix datasets version to be compatible with fsspec (#1797)
* Add kvcache config for Mistral (#1766); update `configs.py`
* Fix reset logic
* Style after resolving merge conflicts

Co-authored-by: Sara Adkins <sara@neuralmagic.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
HF Baseline :)