
[BUG] torch._C._LinAlgError: linalg.cholesky always raised #572

Closed
1649759610 opened this issue Mar 1, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@1649759610

1649759610 commented Mar 1, 2024

Describe the bug
Hi, @PanQiWei @TheBloke

Thanks for all contributions in quantization.

Recently, I tried to quantize a LLaMA-style model with 16B parameters and a 32k context length, but unfortunately an exception was raised while quantizing the 43rd layer. The detailed error message is shown below.

I tried to work around it by varying the number of calibration examples and damp_percent:

  • tried the number of calibration examples: 128, 256, 512, 1024
  • tried damp_percent: 0.01, 0.1, 0.15, 0.2

But no matter which combination of the two parameters I used, torch._C._LinAlgError: linalg.cholesky was always raised when quantizing the 43rd layer.

Any help would be appreciated.
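For context on what damp_percent does: GPTQ-style quantizers add a fraction of the mean Hessian diagonal to the diagonal before the Cholesky factorization, which shifts every eigenvalue upward. The sketch below uses a toy 2×2 matrix (not the real 16B-parameter Hessian) to illustrate why damping can rescue a singular Hessian, and why a sufficiently degenerate one may still need other fixes:

```python
import numpy as np

# Toy singular "Hessian": rank 1, so it is only positive SEMI-definite.
# Its Cholesky factorization fails, just like torch.linalg.cholesky
# does in the report (NumPy raises np.linalg.LinAlgError instead).
H = np.array([[4.0, 2.0],
              [2.0, 1.0]])

try:
    np.linalg.cholesky(H)
    failed = False
except np.linalg.LinAlgError:
    failed = True  # raised: H is singular

# GPTQ-style damping: add damp_percent * mean(diag(H)) to the diagonal.
# The perturbed matrix has strictly positive eigenvalues, so the
# factorization goes through.
damp_percent = 0.01
H_damped = H + damp_percent * np.mean(np.diag(H)) * np.eye(len(H))
L = np.linalg.cholesky(H_damped)  # succeeds

print(failed)                           # True
print(np.allclose(L @ L.T, H_damped))   # True
```

In practice the damping is relative to the Hessian's own scale, so raising damp_percent from 0.01 to 0.2 (as tried above) only helps when the smallest eigenvalues are merely *near* zero rather than exactly zero across many directions.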

(screenshot of the torch._C._LinAlgError traceback)

Hardware details
GPU: A800
RAM: 400G

Software version
autogptq: 0.7.0+cu118
torch: 2.0.1
cuda: 11.8
python: 3.10

@1649759610 1649759610 added the bug Something isn't working label Mar 1, 2024
@Kk1984up

I got the same error. Have you solved it yet?

@1649759610
Author

Replacing the calibration dataset sometimes seems to work, but that is not the approach I want to use.

@fxmarty
Collaborator

fxmarty commented Mar 19, 2024

Hi @1649759610 @Kk1984up, this seems to be the same issue: IST-DASLab/gptq#8 (comment)

Here's Elias's suggestion: IST-DASLab/gptq#8 (comment)

@liuy-2019

I hit the same error. But why would the calibration dataset cause the Hessian matrix to not be positive definite?
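(One plausible mechanism, sketched as an illustration rather than a diagnosis of this exact model: GPTQ builds each layer's Hessian from that layer's calibration inputs, roughly H ∝ XᵀX over the activation rows. If those activations do not span the layer's full input dimension, e.g. because the samples are too few or too redundant, the Gram matrix is rank-deficient and the Cholesky factorization fails. Toy NumPy example with made-up data:)

```python
import numpy as np

# A single calibration "activation" row cannot span a 2-dimensional
# input space, so the Gram matrix X.T @ X is rank-deficient.
X = np.array([[1.0, 1.0]])  # one sample, two input features
H = X.T @ X                 # shape (2, 2), but rank 1

rank = int(np.linalg.matrix_rank(H))
print(rank)  # 1 -- rank-deficient, hence not positive definite

try:
    np.linalg.cholesky(H)
    raised = False
except np.linalg.LinAlgError:
    raised = True
print(raised)  # True
```

That is also why swapping the calibration dataset (as noted above) can make the error disappear: a more diverse dataset excites more input directions of the problematic layer.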

@Qubitium
Collaborator

Qubitium commented Apr 13, 2024

Just ran into the same issue. Semi-resolved in 636293f in PR #640: it gives the user a hint on how to bypass the error, with a reference to this issue.
