internal assert failure #128564

Open
emiletimothy opened this issue Jun 12, 2024 · 1 comment
Labels
module: linear algebra - Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul
needs reproduction - Someone else needs to try reproducing the issue given the instructions. No action needed from user
triaged - This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

emiletimothy commented Jun 12, 2024

🐛 Describe the bug

Traceback (most recent call last):
  File "/home/ec2-user/experiments/kan/pykan/kan_exp_1.py", line 25, in <module>
    model.train(dataset, opt="LBFGS", steps=50);
  File "/home/ec2-user/experiments/kan/pykan/kan/KAN.py", line 898, in train
    self.update_grid_from_samples(dataset['train_input'][train_id].to(device))
  File "/home/ec2-user/experiments/kan/pykan/kan/KAN.py", line 244, in update_grid_from_samples
    self.act_fun[l].update_grid_from_samples(self.acts[l])
  File "/home/ec2-user/experiments/kan/pykan/kan/KANLayer.py", line 218, in update_grid_from_samples
    self.coef.data = curve2coef(x_pos, y_eval, self.grid, self.k, device=self.device)
  File "/home/ec2-user/experiments/kan/pykan/kan/spline.py", line 138, in curve2coef
    coef = torch.linalg.lstsq(mat.to(device), y_eval.unsqueeze(dim=2).to(device),
RuntimeError: false INTERNAL ASSERT FAILED at "../aten/src/ATen/native/BatchLinearAlgebra.cpp":1539, please report a bug to PyTorch. torch.linalg.lstsq: (Batch element 0): Argument 6 has illegal value. Most certainly there is a bug in the implementation calling the backend library.
Reproduction script (it fails roughly once in every three runs):

from kan import *
# create a KAN: 2D inputs, 1D output, and 5 hidden neurons. cubic spline (k=3), 5 grid intervals (grid=5).
model = KAN(width=[2,5,1], grid=5, k=3, seed=0)

# create dataset f(x,y) = exp(sin(pi*x)+y^2)
f = lambda x: torch.exp(torch.sin(torch.pi*x[:,[0]]) + x[:,[1]]**2)
dataset = create_dataset(f, n_var=2)

model.train(dataset, opt="LBFGS", steps=20, lamb=0.01, lamb_entropy=10.)
model = model.prune()
model.train(dataset, opt="LBFGS", steps=50);

mode = "auto" # "manual"

if mode == "manual":
    # manual mode
    model.fix_symbolic(0,0,0,'sin');
    model.fix_symbolic(0,1,0,'x^2');
    model.fix_symbolic(1,0,0,'exp');
elif mode == "auto":
    # automatic mode
    lib = ['x','x^2','x^3','x^4','exp','log','sqrt','tanh','sin','abs']
    model.auto_symbolic(lib=lib)

model.train(dataset, opt="LBFGS", steps=50);
print(model.symbolic_formula()[0][0])
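
Because the failure is intermittent, anyone trying to reproduce it may want to measure how often it occurs. The following is only a minimal sketch, not part of the original report: `count_failures` is a hypothetical helper that reruns a command in a fresh process and counts the runs whose stderr contains the assert message from the traceback.

```python
import subprocess
import sys


def count_failures(command, runs=10, needle="INTERNAL ASSERT FAILED"):
    """Run `command` `runs` times and count runs that fail with `needle` in stderr.

    Hypothetical helper for illustration; `needle` is taken from the
    RuntimeError text in the traceback above.
    """
    failures = 0
    for _ in range(runs):
        # Each run gets a fresh interpreter, so no state leaks between runs.
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0 and needle in result.stderr:
            failures += 1
    return failures
```

Assuming the script is saved as kan_exp_1.py (the name shown in the traceback), `count_failures([sys.executable, "kan_exp_1.py"], runs=30)` would give a rough failure count to compare against the "once every three tries" estimate.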

Versions

wget https://raw.githubusercontent.com/pytorch/pytorch/main/torch/utils/collect_env.py

For security purposes, please check the contents of collect_env.py before running it.

python collect_env.py

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @xwang233 @lezcano

malfet added the needs reproduction, triaged, and module: linear algebra labels on Jun 13, 2024
Contributor

malfet commented Jun 13, 2024

@emiletimothy can you please run python3 -m torch.utils.collect_env and post its results here? The most important details are the PyTorch version, how it was installed (wheels, conda, conda-forge, or built from source), and the CPU architecture (x86 vs aarch64).
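
If running collect_env is not possible for some reason, the key details requested here could be gathered with something like the sketch below. It is a rough stand-in, not a replacement for collect_env: `summarize_environment` is a hypothetical name, and the install method is only hinted at by the install path rather than detected.

```python
import platform


def summarize_environment():
    """Collect Python version, OS, CPU architecture, and PyTorch version/path."""
    info = {
        "python": platform.python_version(),
        "os": platform.platform(),
        "cpu_arch": platform.machine(),  # e.g. x86_64 or aarch64
        "torch": None,
        "torch_path": None,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter; leave the fields empty.
        return info
    info["torch"] = torch.__version__
    # The path hints at the install method (pip site-packages, conda env, source tree).
    info["torch_path"] = torch.__file__
    return info


if __name__ == "__main__":
    for key, value in summarize_environment().items():
        print(f"{key}: {value}")
```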
