Add unit/integration testing #31
Comments
I'll take a shot at some of these. If nothing else I'll learn a lot.
I would love some help implementing the tests. The T4 has compute capability 7.5, so it is not compatible with the AWQ CUDA kernel that runs the quantized layers, which requires 8.0 (Ampere architecture or later). EDIT: To add support for earlier GPUs, you would have to implement a completely new CUDA kernel because the current one utilizes tensor cores that are 10x faster than CUDA cores. GPUs with compute capability below 8.0 do not have tensor cores (I believe), so they cannot install or run the current CUDA kernel.
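For the tests, a small guard along these lines might help skip GPU-only cases on unsupported cards. This is only a sketch: the 8.0 threshold is taken from the comment above, the helper names (`supports_awq_kernel`, `current_capability`, `should_skip_gpu_tests`) are hypothetical and not part of this project, and the only library call used is PyTorch's `torch.cuda.get_device_capability()`.

```python
from typing import Optional, Tuple

# Assumption: minimum compute capability as stated in the comment above (8.0, Ampere).
MIN_CAPABILITY = (8, 0)


def supports_awq_kernel(capability: Tuple[int, int],
                        minimum: Tuple[int, int] = MIN_CAPABILITY) -> bool:
    """Return True if a (major, minor) compute capability meets the minimum."""
    # Tuples compare element by element, so (7, 5) < (8, 0) < (8, 6).
    return tuple(capability) >= tuple(minimum)


def current_capability() -> Optional[Tuple[int, int]]:
    """Query the active GPU via PyTorch; return None if torch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


def should_skip_gpu_tests() -> bool:
    """Hypothetical helper: True when kernel tests cannot run on this machine."""
    cap = current_capability()
    return cap is None or not supports_awq_kernel(cap)
```

With pytest, `should_skip_gpu_tests()` could be passed to `pytest.mark.skipif` so that a T4 (7.5) runner skips the kernel tests instead of failing them.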
Ok, will work on tests. Switching from Tensor cores to CUDA cores doesn't sound totally out of the realm of possibility, especially since I'm only interested in inference for that task, but I won't even think about it for a while.
@casper-hansen, @bdambrosio The real reason that AWQ requires GPU sm_80 or higher lies in the fact that the
A nice list of tests that I would like to implement in order to more easily make sure everything works.