Issue with new wheels, people can't install 0.2.0 with CUDA 11.8 #124
Comments
Also, downloading the prebuilt wheels does not seem to work, at least with the syntax mentioned in the README. It seems to expect an already-downloaded package.
@TheBloke Hi, I can install successfully using the wheel.
You need to execute the command in the directory where the wheel was downloaded and saved.
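For example (a sketch; the wheel filename is illustrative and depends on your Python version and platform):

```sh
# Run pip from the directory containing the downloaded wheel.
pip install ./auto_gptq-0.2.0+cu118-cp310-cp310-linux_x86_64.whl
```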
That user is me, by the way. I was using it in Colab. I read in @TheBloke's model card that for CUDA 12.0 and above you should compile from the source code. I did, but the version it installed was 0.1.0. When I ran Tom's Falcon GPTQ it showed a "RefinedWeb model not supported" error. I made an issue on Tom's repo and he told me to install from pip. I did, but even though the pip project shows version 0.2.0, it only installs version 0.1.0. After some conversation with Tom, I compiled from the source code. It successfully compiled and installed version 0.2.0+cu118. But how did it compile for CUDA 11.8 when my Colab CUDA version is 12?
It's not a problem for me personally, but I have had several support requests about it this morning from people trying to use AutoGPTQ from Google Colab and Docker containers, e.g. @kumpulak is using Docker and @TheFaheem is using Google Colab. I can inform users to unset CUDA_VERSION, but is it possible to fix whatever is causing this issue, so that isn't necessary going forward? Otherwise I expect it's going to generate a lot of support requests; I've already had four messages about it this morning.
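A minimal sketch of that workaround, assuming an environment (Colab, Docker) where CUDA_VERSION has been exported:

```sh
# Clear the environment variable that breaks the install, then install normally.
unset CUDA_VERSION
pip install auto-gptq==0.2.0
```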
@TheFaheem you don't have CUDA toolkit 12.0 installed, otherwise it wouldn't have worked; you have 11.8. You can see your CUDA toolkit version by running:
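The original snippet was not preserved here; the standard check is the CUDA compiler driver, with illustrative output:

```sh
nvcc --version
# ...
# Cuda compilation tools, release 11.8, V11.8.89   <- installed toolkit (illustrative)
```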
But when I run it, it shows CUDA Version: 12.0.
Yes, that is the version supported by your GPU driver. But you have CUDA toolkit 11.8 installed, and that is fine. It is the same for me:
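The original output was not preserved; illustratively, the two tools report different things:

```sh
nvidia-smi      # header shows e.g. "CUDA Version: 12.0": the highest version the driver supports
nvcc --version  # shows e.g. "release 11.8": the toolkit actually installed
```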
I just fixed the problem that occurs when users have CUDA_VERSION set while installing auto-gptq; I will release a patch fix later.
Thank you so much!
@PanQiWei Can you please explain what these are?
And is there any parameter or option to stream the output? Did you implement any generator function?
Ah... thanks! I thought it was showing the installed CUDA version.
Ah yeah, I was going to raise another issue about this, @PanQiWei. People are quite confused by all the WARNINGs that get printed, which are actually just information. People think something is wrong when it's actually all fine, so I think it would be a good idea to change how these messages are printed. In the case of these messages: perhaps they should only be printed if the user actually passed inject_fused_attention=True or inject_fused_mlp=True, but not when the value gets set by default?
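A minimal sketch of passing the flags explicitly, assuming the from_quantized keyword arguments named above (the model id is hypothetical):

```python
from auto_gptq import AutoGPTQForCausalLM

# Passing the flags explicitly makes the intent unambiguous,
# so an informational "not injected" message is expected rather than alarming.
model = AutoGPTQForCausalLM.from_quantized(
    "some-user/some-model-GPTQ",  # hypothetical model id
    device="cuda:0",
    inject_fused_attention=False,
    inject_fused_mlp=False,
)
```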
You are right. I should improve the warnings, set some arguments' default values to None, and reset them to proper values internally if users don't manually specify them.
There is no streaming code in AutoGPTQ at the moment, I think. But you could use third-party software like text-generation-webui. That can provide an API, and that API has a streaming option. See the example API script here: https://github.com/oobabooga/text-generation-webui/blob/main/api-example-stream.py
@TheFaheem auto-gptq is compatible with HF transformers' TextGenerationPipeline, so its streamer should also work with auto-gptq's models, but I haven't tried it yet.
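Untested, as noted above, but a sketch of that idea using transformers' TextStreamer (the model id is hypothetical):

```python
from transformers import AutoTokenizer, TextStreamer
from auto_gptq import AutoGPTQForCausalLM

model_id = "some-user/some-model-GPTQ"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoGPTQForCausalLM.from_quantized(model_id, device="cuda:0")

# TextStreamer prints tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True)
inputs = tokenizer("Tell me about gravity.", return_tensors="pt").to("cuda:0")
model.generate(**inputs, streamer=streamer, max_new_tokens=128)
```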
I can confirm that unsetting CUDA_VERSION works. My Dockerfile installs it like this now:
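The original snippet was not preserved; a sketch of the idea (the nvidia/cuda base images export CUDA_VERSION, so clear it in the same RUN step as the install):

```dockerfile
# Unset the variable inherited from the CUDA base image before installing.
RUN unset CUDA_VERSION && pip install auto-gptq==0.2.0
```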
The contents of the CUDA_VERSION variable were:
@PanQiWei @TheBloke When I tried to compile AutoGPTQ in a Kaggle notebook I got the following:
I used:
How should I get rid of these errors and compile successfully? It'll be very helpful!
This is because the CUDA version used by PyTorch isn't compatible with the one installed on the machine: the major versions are different. In this case, you can only install using the cu118 pre-compiled wheel; installing from source will fail. Alternatively, you can first compile PyTorch from source, then install auto-gptq from source.
Can you please elaborate and tell me more clearly what I should do?
When the CUDA major version used by PyTorch and the one installed on your machine are different, you can't compile auto-gptq from source code; instead, you can install using a pre-compiled wheel.
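A quick way to compare the two versions in question:

```sh
# CUDA version PyTorch was built against:
python -c "import torch; print(torch.version.cuda)"   # e.g. 11.8
# CUDA toolkit installed on the machine:
nvcc --version                                        # e.g. release 12.0
```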
Are there any example scripts? Please!
Yes, I solved it using the pre-compiled wheel.
@TheFaheem can you list the steps you followed? That would be helpful for concluding the issue.
Here you go, my friend! Get the link to the pre-compiled wheel from the latest release and install it like this:
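The URL and filename below are illustrative; pick the wheel matching your Python version and platform from the releases page:

```sh
wget https://github.com/PanQiWei/AutoGPTQ/releases/download/v0.2.0/auto_gptq-0.2.0+cu118-cp310-cp310-linux_x86_64.whl
pip install ./auto_gptq-0.2.0+cu118-cp310-cp310-linux_x86_64.whl
```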
@PanQiWei When I use the TextGenerationPipeline, I get the following warning:
But it runs fine. Why does this show up?
This is a warning raised by HF transformers for models that are not officially supported by them, but you can just ignore it in auto-gptq.
This is because of the backward compatibility of CUDA: builds for an older toolkit still run on a newer driver.
Ahh... is that all? Thanks for clarifying this!
And please, can you create an inference example script using the HF pipeline?
Here is a simple example of using an auto-gptq quantized model with the HF pipeline; for more advanced usage, for now you can turn to HF's tutorials and documentation. Systematic tutorials and example scripts for using auto-gptq will continue to be added as development progresses.
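The original snippet was not preserved; a sketch in its spirit, using auto-gptq's from_quantized loader with a standard HF pipeline (the model id is hypothetical):

```python
from transformers import AutoTokenizer, TextGenerationPipeline
from auto_gptq import AutoGPTQForCausalLM

quantized_model_dir = "some-user/some-model-GPTQ"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(quantized_model_dir, device="cuda:0")

# Wrap the quantized model in a standard HF text-generation pipeline.
pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
print(pipeline("auto-gptq is")[0]["generated_text"])
```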
Closing this issue, as the main problem here is solved. Anyone who has other questions or suggestions can raise them in a new issue. ❤️
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory |
@lucasjinreal See the issue you opened |
Awesome work on the 0.2.0 release and the wheels, PanQiWei! Thousands of new people are trying AutoGPTQ today and that is amazing.
Got an issue that's affecting some of them:
Describe the bug
People trying to run `pip install auto-gptq` or `pip install auto-gptq==0.2.0` are getting the following errors.
Full log:
Software version
Example of one user with the problem:
To Reproduce
Expected behavior
Installs auto-gptq 0.2.0 + cu118