Insights: vllm-project/llm-compressor
Overview
8 Pull requests merged by 4 people
- [Callbacks] Remove MagnitudePruningModifier.leave_enabled (#1198, merged Mar 7, 2025)
- [Docs] Add info on when to use which PTQ/Sparsification (#1157, merged Mar 6, 2025)
- [Training] Unifying Preprocess + Postprocessing logic for Train/Oneshot (#1212, merged Mar 6, 2025)
- [BugFix] Fix logging disabling bug and add tests (#1218, merged Mar 5, 2025)
- [Training] Datasets - update Module (#1209, merged Mar 5, 2025)
- [Cosmetic] Rename data_args to dataset_args (#1206, merged Mar 5, 2025)
- Remove MonkeyPatch for GPUs (#1227, merged Mar 5, 2025)
- [Training] Decouple Argument parser (#1207, merged Mar 3, 2025)
3 Pull requests opened by 2 people
- fixing reproducibility of lmeval tests (#1220, opened Mar 4, 2025)
- wandb/tensorboard loggers set default init to False (#1235, opened Mar 7, 2025)
6 Issues closed by 3 people
- Bug of quantizing part of the Qwen model (#1230, closed Mar 6, 2025)
- Why quantized model of 640MB took almost 3GB of VRAM? (#1229, closed Mar 6, 2025)
- How to choose sparsification rates for different layers? (#1231, closed Mar 6, 2025)
- Does llmcompressor support hybrid sparsity? (#1037, closed Mar 5, 2025)
- run llmcompressor to transfer deepseek r1 bf16 to w8a8 occur error (#1221, closed Mar 4, 2025)
- [Clarification] Using FP8 Dynamic quantization saves space on disk, but not when loaded? (#1210, closed Mar 3, 2025)
10 Issues opened by 5 people
- Fix docker container to build llm-compressor, not sparseml (#1236, opened Mar 7, 2025)
- Cuda out of memory while loading the quantized model (#1234, opened Mar 7, 2025)
- [Bug] Sporadic test failure on sparsification test (#1232, opened Mar 6, 2025)
- Quantization Memory Requirements (#1228, opened Mar 5, 2025)
- Fail when invalid arguments are provided in the recipe for a modifier (#1226, opened Mar 5, 2025)
- Fail when faulty recipes are provided during `oneshot` (#1225, opened Mar 5, 2025)
- Expand tests for `ModuleSparsificationInfo` (#1224, opened Mar 5, 2025)
- Update to use `loguru` (#1223, opened Mar 5, 2025)
- Update observers to make `MSE` the default (#1222, opened Mar 5, 2025)
12 Unresolved conversations
Sometimes conversations happen on old items that aren't yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- does it support asymmetric ? (#1190, commented on Mar 5, 2025 • 0 new comments)
- W4A8 model larger than W4A16 (#1215, commented on Mar 5, 2025 • 0 new comments)
- Significant Inference Performance Degradation After W8A8 Quantization on CommandR-35B Model (#1196, commented on Mar 5, 2025 • 0 new comments)
- Lazy Loading of Weights for Large Model Quantization (#1216, commented on Mar 6, 2025 • 0 new comments)
- [Question] Has anyone successfully quantinize Deepseek-V3 to int4-w4a16? (#1203, commented on Mar 7, 2025 • 0 new comments)
- Cannot load quantized Multimodal_audio model using whisper.load_model("quantized-model-path) (#1204, commented on Mar 7, 2025 • 0 new comments)
- Perplexity (ppl) Calculation of Local Sparse Model: NaN issue (#853, commented on Mar 7, 2025 • 0 new comments)
- OOM during save_pretrained of compressed model (#1183, commented on Mar 9, 2025 • 0 new comments)
- [Callbacks] Remove double initialization, replace with updating the state directly (#1169, commented on Mar 6, 2025 • 0 new comments)
- Bdellabe/awq modifier v3 (#1177, commented on Mar 5, 2025 • 0 new comments)
- [Callbacks] Remove `on_update` (#1199, commented on Mar 7, 2025 • 0 new comments)
- [Train] Training Pipeline (#1214, commented on Mar 7, 2025 • 0 new comments)