Insights: vllm-project/llm-compressor
Overview
8 Pull requests merged by 4 people
- [Callbacks] Remove MagnitudePruningModifier.leave_enabled (#1198, merged Mar 7, 2025)
- [Docs] Add info on when to use which PTQ/Sparsification (#1157, merged Mar 6, 2025)
- [Training] Unifying Preprocess + Postprocessing logic for Train/Oneshot (#1212, merged Mar 6, 2025)
- [BugFix] Fix logging disabling bug and add tests (#1218, merged Mar 5, 2025)
- [Training] Datasets - update Module (#1209, merged Mar 5, 2025)
- [Cosmetic] Rename data_args to dataset_args (#1206, merged Mar 5, 2025)
- Remove MonkeyPatch for GPUs (#1227, merged Mar 5, 2025)
- [Training] Decouple Argument parser (#1207, merged Mar 3, 2025)
3 Pull requests opened by 2 people
- fixing reproducibility of lmeval tests (#1220, opened Mar 4, 2025)
- wandb/tensorboard loggers set default init to False (#1235, opened Mar 7, 2025)
6 Issues closed by 3 people
- Bug of quantizing part of the Qwen model (#1230, closed Mar 6, 2025)
- Why quantized model of 640MB took almost 3GB of VRAM? (#1229, closed Mar 6, 2025)
- How to choose sparsification rates for different layers? (#1231, closed Mar 6, 2025)
- Does llmcompressor support hybrid sparsity? (#1037, closed Mar 5, 2025)
- run llmcompressor to transfer deepseek r1 bf16 to w8a8 occur error (#1221, closed Mar 4, 2025)
- [Clarification] Using FP8 Dynamic quantization saves space on disk, but not when loaded? (#1210, closed Mar 3, 2025)
10 Issues opened by 5 people
- Fix docker container to build llm-compressor, not sparseml (#1236, opened Mar 7, 2025)
- Cuda out of memory while loading the quantized model (#1234, opened Mar 7, 2025)
- [Bug] Sporadic test failure on sparsification test (#1232, opened Mar 6, 2025)
- Quantization Memory Requirements (#1228, opened Mar 5, 2025)
- Fail when invalid arguments are provided in the recipe for a modifier (#1226, opened Mar 5, 2025)
- Fail when faulty recipes are provided during `oneshot` (#1225, opened Mar 5, 2025)
- Expand tests for `ModuleSparsificationInfo` (#1224, opened Mar 5, 2025)
- Update to use `loguru` (#1223, opened Mar 5, 2025)
- Update observers to make `MSE` the default (#1222, opened Mar 5, 2025)
12 Unresolved conversations
Sometimes conversations happen on old items that aren't yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- does it support asymmetric ? (#1190, commented on Mar 5, 2025 • 0 new comments)
- W4A8 model larger than W4A16 (#1215, commented on Mar 5, 2025 • 0 new comments)
- Significant Inference Performance Degradation After W8A8 Quantization on CommandR-35B Model (#1196, commented on Mar 5, 2025 • 0 new comments)
- Lazy Loading of Weights for Large Model Quantization (#1216, commented on Mar 6, 2025 • 0 new comments)
- [Question] Has anyone successfully quantinize Deepseek-V3 to int4-w4a16? (#1203, commented on Mar 7, 2025 • 0 new comments)
- Cannot load quantized Multimodal_audio model using whisper.load_model("quantized-model-path) (#1204, commented on Mar 7, 2025 • 0 new comments)
- Perplexity (ppl) Calculation of Local Sparse Model: NaN issue (#853, commented on Mar 7, 2025 • 0 new comments)
- OOM during save_pretrained of compressed model (#1183, commented on Mar 9, 2025 • 0 new comments)
- [Callbacks] Remove double initialization, replace with updating the state directly (#1169, commented on Mar 6, 2025 • 0 new comments)
- Bdellabe/awq modifier v3 (#1177, commented on Mar 5, 2025 • 0 new comments)
- [Callbacks] Remove `on_update` (#1199, commented on Mar 7, 2025 • 0 new comments)
- [Train] Training Pipeline (#1214, commented on Mar 7, 2025 • 0 new comments)