Splatfacto training crashes after 18k steps when training on VR-NeRF dataset #3152

Open
bernard0047 opened this issue May 20, 2024 · 2 comments

@bernard0047

Description

When training Splatfacto on a VR-NeRF dataset, it crashes after 17910 steps with this error:

  File "/root/miniconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

The loss being calculated here no longer requires gradients after 17910 steps. I verified this by printing it on the command line, but I am not sure why this is happening.
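For anyone trying to confirm the same symptom, a check along these lines may help. It is only a sketch: `grad_state` is a hypothetical helper, not part of Nerfstudio or PyTorch, and it relies solely on the `requires_grad` and `grad_fn` attributes that `torch.Tensor` exposes. The PyTorch demo at the bottom runs only if `torch` is installed.

```python
def grad_state(value):
    """Summarize whether `value` could be passed to backward().

    Hypothetical helper: it reads the `requires_grad` and `grad_fn`
    attributes that torch.Tensor exposes. Objects without those
    attributes are reported as detached.
    """
    requires_grad = bool(getattr(value, "requires_grad", False))
    has_grad_fn = getattr(value, "grad_fn", None) is not None
    return {
        "requires_grad": requires_grad,
        "has_grad_fn": has_grad_fn,
        # backward() raises the RuntimeError quoted above when neither holds.
        "can_backward": requires_grad or has_grad_fn,
    }


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None:
        w = torch.ones(3, requires_grad=True)
        # Loss built from a tensor that is still attached to the graph.
        print(grad_state((w * 2).sum()))
        # Loss built from a detached tensor: calling backward() on it would raise.
        print(grad_state((w.detach() * 2).sum()))
```

Logging `grad_state(loss)` just before the `backward()` call, together with the step number, would show the exact step at which the loss stops being attached to the graph.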

To Reproduce
Steps to reproduce the behavior:

  1. Download a scene from the VR-NeRF dataset using ns-download-data eyefultower --capture-name apartment --resolution-name jpeg_1k
  2. Run ns-train splatfacto --data eyefultower/apartment/images-jpeg-1k/transforms.json

Additional context
From the viewer, it seems like the model isn't learning anything either. I'm also unable to run the viewer after this update.

@jb-ye
Collaborator

jb-ye commented May 24, 2024

Let me take a look into this issue.

@jb-ye
Collaborator

jb-ye commented May 24, 2024

Could you try the following command?

ns-train splatfacto --data eyefultower/apartment/images-jpeg-1k/transforms_300.json --logging.local-writer.max-log-size=0

Splatfacto currently does not work well when there are more than 1k training cameras. See #2927.
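For captures that do not ship a reduced file like transforms_300.json, a subset can be produced by hand. The sketch below assumes the transforms file keeps its cameras in a top-level "frames" list, as in the usual Nerfstudio layout; the EyefulTower files may carry extra keys, which are copied through unchanged. `subsample_frames` and `write_subset` are hypothetical helpers, not Nerfstudio commands.

```python
import json
from pathlib import Path


def subsample_frames(meta, max_frames):
    """Return a copy of `meta` keeping at most `max_frames` evenly spaced frames."""
    if max_frames < 1:
        raise ValueError("max_frames must be at least 1")
    frames = meta.get("frames", [])
    if len(frames) <= max_frames:
        return dict(meta)
    # Pick indices spread evenly over the whole capture.
    step = len(frames) / max_frames
    keep = [frames[int(i * step)] for i in range(max_frames)]
    # All other top-level keys (intrinsics and so on) are copied as-is.
    return {**meta, "frames": keep}


def write_subset(src, dst, max_frames):
    """Read a transforms file, subsample it, and write the result to `dst`.

    Returns (original frame count, kept frame count).
    """
    meta = json.loads(Path(src).read_text())
    subset = subsample_frames(meta, max_frames)
    Path(dst).write_text(json.dumps(subset, indent=2))
    return len(meta.get("frames", [])), len(subset.get("frames", []))
```

Calling `write_subset("transforms.json", "transforms_subset.json", 300)` would report how many cameras the capture had and how many were kept; the output filename here is only an example.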
