
feat: switch autogptq to gptqmodel #41

Merged
johannaSommer merged 5 commits into main from feat/update-autogptq-to-gptqmodel
Apr 1, 2025

Conversation

@johnrachwan123
Member

@johnrachwan123 johnrachwan123 commented Mar 27, 2025

Description

AutoGPTQ is now deprecated. This PR switches our GPTQ algorithm to gptqmodel instead.

Related Issue

AutoGPTQ already caused problems with Python 3.12 support.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

The integration test still passes.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

I did not investigate the full feature set of the new repo; this PR only makes the switch. With some further investigation we might be able to find some nice speedups.

@@ -1,13 +1,13 @@
import pytest

from pruna.algorithms.quantization.gptq_model import GPTQQuantizer
Member

While you're at it (sorry, but it needs to be done), can you add a proper post_smash_hook for GPTQ please? Thanks!
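For illustration, a minimal sketch of what such a post-smash hook could do. The class name, hook signature, and attribute below are invented for this example and are not pruna's actual API:

```python
# Hypothetical sketch only: class name, hook signature, and the
# is_quantized attribute are assumptions; pruna's real post_smash_hook
# API may differ.
class GPTQQuantizerSketch:
    """Stand-in quantizer showing where a post-smash hook would live."""

    def post_smash_hook(self, model):
        # After quantization ("smashing"), re-attach any bookkeeping the
        # rest of the pipeline expects, e.g. a flag marking the model
        # as quantized.
        model.is_quantized = True
        return model


class DummyModel:
    """Bare object standing in for a quantized model."""


smashed = GPTQQuantizerSketch().post_smash_hook(DummyModel())
```

The point of the hook is simply to have one well-defined place to restore pipeline-facing state after the quantization step, rather than scattering it across the algorithm.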

Member Author

Good?

device_map="auto",
torch_dtype="auto",
)
model = imported_modules["GPTQModel"].load(temp_dir, gptq_config)
Member

Check with the handler and execute at least one evaluation. This will likely require a handler exception :/

@johannaSommer
Member

Linking #21

Collaborator

@llcnt llcnt left a comment

Thank you for the switch! I added only two minor comments, and we are good :)

"use_exllama",
default=True,
meta=dict(desc="Whether to use exllama for quantization."),
),
Collaborator

I think we can remove this hyperparameter, if it is deprecated in gptqmodel ;)

Member

If this is unused, we should still deprecate it properly: the hyperparameter stays here, a warning is raised in apply, and the argument is no longer used.
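A minimal sketch of that deprecation pattern, assuming a sentinel-based check; the function shape is illustrative, not pruna's actual code:

```python
import warnings

_DEPRECATED = object()  # sentinel to detect whether the caller passed a value


def apply(model, use_exllama=_DEPRECATED):
    # Hypothetical sketch: 'use_exllama' stays in the signature so old call
    # sites keep working, but its value is ignored and a DeprecationWarning
    # is emitted whenever a caller passes it explicitly.
    if use_exllama is not _DEPRECATED:
        warnings.warn(
            "'use_exllama' is deprecated and has no effect since the "
            "switch to gptqmodel.",
            DeprecationWarning,
            stacklevel=2,
        )
    # ... quantization via gptqmodel would happen here ...
    return model
```

The sentinel avoids warning on every call with the default value: the warning only fires when a caller actually passes use_exllama.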

Member Author

I think it still exists

@@ -44,6 +43,7 @@ class GPTQQuantizer(PrunaQuantizer):
run_on_cuda = True
Collaborator

The gptqmodel repo mentions that it can also run on CPU. However, I did not test it; do you think we should change to run_on_cpu=True?

Member Author

I tried running it on CPU and it failed at inference time. This needs more investigation; it should be possible to make it work, but the way we currently quantize the models might not be the best.

@johnrachwan123 johnrachwan123 force-pushed the feat/update-autogptq-to-gptqmodel branch from 3465f95 to 50b6d28 Compare April 1, 2025 22:45
@johannaSommer johannaSommer merged commit db761d5 into main Apr 1, 2025
20 checks passed
@johannaSommer johannaSommer deleted the feat/update-autogptq-to-gptqmodel branch April 1, 2025 22:58
