
AutoGPTQ is abandoned? #187

Closed

TheBloke opened this issue Jul 5, 2023 · 6 comments

Comments

@TheBloke
Contributor

TheBloke commented Jul 5, 2023

@PanQiWei, if you're no longer able to update AutoGPTQ, could you let us know?

Just this week, AutoGPTQ was promoted by a member of the Hugging Face team (https://twitter.com/joao_gante/status/1674830924452892672) and that was announced in the Hugging Face Discord today.

But AutoGPTQ is losing relevance every day:

  • Hugging Face Text Generation Inference just added GPTQ support, but they used GPTQ-for-LLaMa instead;
  • So did LmSys' FastChat;
  • No 0.3.0 release, so the excellent PEFT mode is getting no visibility;
  • ExLlama has completely taken over for Llama inference, because it's 2x faster, uses less VRAM, doesn't slow down with group_size + act-order together, and recently added support for extended context via RoPE;
  • No MPT support;
  • Falcon support is unusably slow.

It is so sad that all your amazing work, and that of others, is being forgotten.

If you can't maintain AutoGPTQ any more, please appoint someone else. I can do it. I'm not an ML coder but I can merge bug fixes, fix the build issues, push out 0.3.0 and then get together some coders to implement new features. I'd even be willing to pay a contractor to work on the project, if necessary.

But there's no point me doing anything right now because I know it won't be merged.

Otherwise I think I will fork AutoGPTQ and see if I can get together a team to continue its work in a new repo.

Please let's not let AutoGPTQ die!

@qwopqwop200
Collaborator

qwopqwop200 commented Jul 6, 2023

I hope this project never dies.

@PanQiWei
Collaborator

PanQiWei commented Jul 6, 2023

Hi, to all of you who care about this project: AutoGPTQ will not die!! I'm sorry that this project hasn't been updated in the past month for personal reasons. Now that I've gradually cleared the work at hand, I will be back on this project next week, and there will be much more architecture refactoring, performance optimization, and new features. Stay tuned! 🥂

And thank you all again for loving AutoGPTQ! ❤️

@TheBloke
Contributor Author

TheBloke commented Jul 6, 2023

That's great news, @PanQiWei. I'm relieved to hear you're coming back.

The timing is potentially very good, because I just heard today that people on the Hugging Face team are investigating the possibility of adding GPTQ support into Transformers itself. This will bring GPTQ to far more people, far more easily.

I was told they will likely use the ExLlama kernel as it's so much faster and doesn't suffer from performance problems when using group_size + act_order together. But hopefully they will look to AutoGPTQ for the infrastructure code.
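For readers unfamiliar with the group_size setting discussed above: GPTQ-style quantizers store 4-bit weights with a separate scale and zero-point per group of weights. Below is a minimal, hypothetical sketch of that idea using plain round-to-nearest; real GPTQ additionally applies second-order error compensation, and none of these function names come from AutoGPTQ's actual API.

```python
# Sketch of 4-bit group quantization (round-to-nearest min-max).
# Each group of `group_size` weights gets its own scale/zero-point,
# which is what the group_size option controls in GPTQ-style tools.

def quantize_group(weights, bits=4):
    # Asymmetric min-max quantization of one group to integers in [0, 2^bits - 1].
    qmax = (1 << bits) - 1                      # 15 for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0             # avoid zero scale for constant groups
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize_group(q, scale, lo):
    # Map integers back to approximate float weights.
    return [v * scale + lo for v in q]

def quantize(weights, group_size=128, bits=4):
    # Split a flat weight list into groups, quantizing each independently.
    return [quantize_group(weights[i:i + group_size], bits)
            for i in range(0, len(weights), group_size)]

w = [0.5, -1.25, 0.75, 2.0, -0.125, 1.5, 0.0, -2.0]
groups = quantize(w, group_size=4)
recon = [x for g in groups for x in dequantize_group(*g)]
```

Smaller groups mean more scale/zero metadata but lower quantization error per weight; the performance issue mentioned in this thread arises in kernels that must fetch per-group metadata in an act-order permuted layout.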

@TheBloke
Contributor Author

TheBloke commented Jul 7, 2023

@PanQiWei can you please get in touch with me privately, by email (tom@thebloke.ai), Discord (@the_bloke) or Twitter DM (@TheBlokeAI). The 🤗 team want to talk to you about a possible integration of GPTQ into transformers, but don't know how to reach you.

@casper-hansen
Contributor

> @PanQiWei can you please get in touch with me privately, by email (tom@thebloke.ai), Discord (@the_bloke) or Twitter DM (@TheBlokeAI). The 🤗 team want to talk to you about a possible integration of GPTQ into transformers, but don't know how to reach you.

That would certainly be huge! The Hugging Face team should also investigate AWQ, as it's also good for quantization and looks superior to GPTQ.

Check out https://huggingface.co/abhinavkulkarni to see a bunch of quantized models.

Repo: https://github.com/mit-han-lab/llm-awq

@fxmarty
Collaborator

fxmarty commented Oct 27, 2023

Closing as not abandoned (hopefully)

@fxmarty fxmarty closed this as completed Oct 27, 2023