support model_free WOQ quantization #1699
Open
xin3he wants to merge 40 commits into main from xinhe/4-14
+3,509
−691
40 commits
dc592e9 implement model free (xin3he)
177bf48 polished implementation (xin3he)
97e0362 remove useless gpu_concurrency (xin3he)
ff47a97 add a precompiled pattern matcher to improve performance and extensibility during quantization (xin3he)
4d9ad0e fix typo (xin3he)
58709e6 update document (xin3he)
d3951f2 remove useless code and update UT (xin3he)
16991ea mend (xin3he)
83b9b4f remove high_gpu_mem_usage since no performance benefit. (xin3he)
687260d update regex (xin3he)
68d0cb7 fix bug and simplify UT (xin3he)
312f75d fix bug (xin3he)
3ca4d3b add WOQ limitation and support bits group_size setting (xin3he)
3f15e02 Merge branch 'main' into xinhe/4-14 (xin3he)
47b3f35 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
76f9915 update doc (xin3he)
c588ad2 minor fix (xin3he)
0c14165 enable quant_nontext_module (xin3he)
405de53 Enhance model-free quantization support and improve documentation (xin3he)
6c5ce29 Merge remote-tracking branch 'origin/main' into xinhe/4-14 (xin3he)
0697324 support loading pytorch_model.bin and ignore conv1d embed by creating… (xin3he)
f4fc5f4 add UT to cover conv1d detection (xin3he)
4f6f97e support MXFP4/8 dequantization (xin3he)
ed46cd6 Merge branch 'main' into xinhe/4-14 (xin3he)
7e3a3f8 fix pylint (xin3he)
958191a Merge branch 'main' into xinhe/4-14 (xin3he)
7440c32 add auto fallback and change class name (xin3he)
8b8d084 fix CI (xin3he)
eb5fdf4 update readme (xin3he)
98a5040 add fallback compressor functionality to support quantization and saving (xin3he)
46465c3 Merge branch 'main' into xinhe/4-14 (xin3he)
7c76188 support diffusion model (xin3he)
a92acc2 fix bug (xin3he)
46ed32c support layer_config={".ffn.experts.": {"scheme": "W2A16"}} usage (xin3he)
6f41cec fix bug (xin3he)
9f81c67 update UT (xin3he)
16ead43 fix bug (xin3he)
48994a4 Merge remote-tracking branch 'origin/main' into xinhe/4-14 (xin3he)
3d9812c add model free for new arch (xin3he)
bd31861 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])