Add support for the latest GPTQ models with group-size #530
Merged
Commits (24)
ba6b5b5 (oobabooga): Update GPTQ_loader.py
edd5a7a (oobabooga): Update download-model.py
f4b35ee (oobabooga): Fix offloading
5fa482a (oobabooga): Update server.py
c7598a0 (oobabooga): Add a default prompt for alpaca models
aabf072 (oobabooga): Update shared.py
04eb089 (oobabooga): Update GPTQ_loader.py
40e0cab (oobabooga): Update shared.py
b865a41 (oobabooga): Update GPTQ_loader.py
4b9d45b (oobabooga): Update models.py
9771017 (oobabooga): Update shared.py
5dd9208 (oobabooga): Update GPTQ_loader.py
8be8e6d (oobabooga): Update shared.py
bf1eeb5 (oobabooga): Update GPTQ_loader.py
2aac1fb (oobabooga): Merge main
ee58c5f (oobabooga): Remove gptq-group-size
558e7db (oobabooga): Update GPTQ_loader.py
98a1d5f (oobabooga): Update GPTQ_loader.py
b79708a (oobabooga): Update GPTQ_loader.py
26fcc62 (oobabooga): Better recognize the model type by the model name
a47c6e7 (oobabooga): Update README.md
58506a9 (oobabooga): Update README.md
f793ed2 (oobabooga): Update shared.py
071d006 (oobabooga): Merge branch 'main' into gptq-group-size
Could this hardcoded 128 break act-order? GPTQ states: "Currently, groupsize and act-order do not work together and you must choose one of them." So I would imagine that passing 128 when you are not supposed to will cause issues.
This function is imported from here: https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/main/llama_inference.py#L26
Maybe it hasn't been updated to work with act-order yet?
I changed both 128 values to -1 and was able to load an act-order model. So I guess just make groupsize a parameter, as @sgsdxzy said. If act-order becomes the default, just have groupsize default to -1, as it already does in GPTQ.
That's good to hear! I will make it a parameter.
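The change agreed on above could look something like the following sketch. The flag names (`--groupsize`, `--act-order`) and the `validate` helper are illustrative assumptions for this example, not the exact code this PR ends up with; the only details taken from the thread are the -1 default (GPTQ's "no grouping" convention) and the fact that groupsize and act-order currently cannot be combined.

```python
import argparse

# Hypothetical CLI wiring; flag names are assumptions, not the PR's actual code.
parser = argparse.ArgumentParser()
parser.add_argument('--groupsize', type=int, default=-1,
                    help='GPTQ group size; -1 disables grouping (the GPTQ default).')
parser.add_argument('--act-order', action='store_true',
                    help='Load a model quantized with act-order.')

def validate(args):
    # Per the GPTQ README quoted above, groupsize and act-order
    # do not currently work together; reject that combination early.
    if args.act_order and args.groupsize != -1:
        raise ValueError('groupsize and act-order do not work together; choose one.')
    return args

args = validate(parser.parse_args(['--groupsize', '128']))
print(args.groupsize)
```

With this default, omitting the flag reproduces the act-order-friendly behavior (-1), while group-size models opt in explicitly with `--groupsize 128`.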