Conversation
@@ -1,13 +1,13 @@
import pytest

from pruna.algorithms.quantization.gptq_model import GPTQQuantizer
While you're at it (sorry, but it needs to be done), can you add a proper post_smash_hook for GPTQ please? Thanks!
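As a rough illustration of the requested hook, here is a minimal sketch. The class name, the hook signature, and the housekeeping steps are all assumptions for illustration, not the actual pruna API:

```python
class GPTQHookSketch:
    """Hypothetical stand-in for pruna's GPTQQuantizer; the real
    class and post_smash_hook signature may differ."""

    def post_smash_hook(self, model):
        # Typical post-smash housekeeping: switch the model to
        # inference mode and tag it so downstream handlers can
        # detect that GPTQ quantization was applied.
        if hasattr(model, "eval"):
            model.eval()
        model.is_gptq_quantized = True  # illustrative tag only
        return model
```

The tag would let an inference handler special-case GPTQ models without re-inspecting their weights.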
    device_map="auto",
    torch_dtype="auto",
)
model = imported_modules["GPTQModel"].load(temp_dir, gptq_config)
Check with the handler and execute at least one evaluation. This will likely require a handler exception :/
Linking #21
llcnt
left a comment
Thank you for the switch! I added only two minor comments, and we are good :)
"use_exllama",
default=True,
meta=dict(desc="Whether to use exllama for quantization."),
),
I think we can remove this hyperparameter, if it is deprecated in gptqmodel ;)
If this is unused, we should still deprecate it properly: the hyperparameter stays here, a warning is emitted in apply, and the argument is no longer used.
I think it still exists
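The deprecation pattern discussed above (keep the argument, warn when it is set, stop forwarding it) can be sketched as follows. The class and method names are hypothetical stand-ins, not the real pruna code; only the warnings-based pattern itself is the point:

```python
import warnings

_UNSET = object()  # sentinel: distinguishes "not passed" from any real value


class GPTQDeprecationSketch:
    """Hypothetical sketch of deprecating an unused hyperparameter."""

    def apply(self, model, use_exllama=_UNSET):
        if use_exllama is not _UNSET:
            # The argument is accepted for backward compatibility but
            # no longer has any effect, so tell the user explicitly.
            warnings.warn(
                "'use_exllama' is deprecated and ignored; the backend "
                "kernel is now selected automatically.",
                DeprecationWarning,
                stacklevel=2,
            )
        # ... quantize the model without passing use_exllama ...
        return model
```

The sentinel keeps the warning quiet for callers who never touch the argument, while anyone still passing it gets a clear DeprecationWarning before the parameter is removed in a later release.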
@@ -44,6 +43,7 @@ class GPTQQuantizer(PrunaQuantizer):
run_on_cuda = True
The gptqmodel repo mentions that it can also run on CPU. However, I did not test it. Do you think we should change to run_on_cpu=True?
I tried to run it on CPU and it failed at inference time. This needs more investigation; it should be possible to make it work, but the way we currently quantize the models might not be the best.
Description
AutoGPTQ is now deprecated. This PR switches our GPTQ algorithm to gptqmodel instead.
Related Issue
AutoGPTQ already caused problems with supporting Python 3.12.
Type of Change
How Has This Been Tested?
The integration test still runs.
Checklist
Additional Notes
I did not investigate the full extent of the new repo; this PR only makes the switch. We might find some nice speedups with further investigation.