This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

Error when preprocessing data #35

Open
mishless opened this issue Nov 11, 2019 · 24 comments

@mishless

I followed the instructions on how to set up the environment, and when I ran the preprocessing script I got many lines with the following two errors.

OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument.
In: /usr/local/include/pangolin/gl/gl.hpp, line 205
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Unfortunately, nothing is generated in the output folder. I am using the latest versions of all the dependencies, and I am running the script on a cloud VM in headless mode. What could be the problem?

@alex-cherg

alex-cherg commented Dec 20, 2019

I have the same problem here:

Unable to read texture 'texture_1'
Unable to read texture 'texture_3'
Unable to read texture 'texture_2'
Unable to read texture 'texture_3'
Unable to read texture 'texture_4'
OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument.
In: [...]/Pangolin/include/pangolin/gl/gl.hpp, line 205

Have you fixed it?

@tschmidt23

Hi Mihaela,

I personally haven't tried running the code on a cloud VM, but I wouldn't necessarily expect that type of setup to have the OpenGL support required for the preprocessing, unfortunately. This error message is likely an indication that the OpenGL implementation provided by the VM is missing some required features. When you say you were running in headless mode, do you mean that the VM was headless, or that you ran the code in headless mode as described in the README (i.e. using 'export PANGOLIN_WINDOW_URI=headless://')?

Alex, are you also running on a VM?
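For reference, a minimal sketch of launching the preprocessing with the headless Pangolin backend from Python. The preprocess_data.py flags are assumed from the README and may differ on your setup; paths and split file are placeholders.

```python
# Minimal sketch (assumptions: preprocess_data.py flags as in the README,
# placeholder ShapeNet path and split file). The essential part is setting
# PANGOLIN_WINDOW_URI=headless:// so Pangolin renders off-screen with no display.
import os
import subprocess

env = os.environ.copy()
env["PANGOLIN_WINDOW_URI"] = "headless://"

subprocess.run(
    [
        "python", "preprocess_data.py",
        "--data_dir", "data",
        "--source", "/path/to/ShapeNetCore.v2/",            # placeholder path
        "--name", "ShapeNetV2",
        "--split", "examples/splits/sv2_chairs_train.json",  # placeholder split
        "--skip",
    ],
    env=env,
    check=True,
)
```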

@alex-cherg

alex-cherg commented Dec 22, 2019

@tschmidt23 I am running it locally. I noticed that the script actually outputs the .npz files regardless of the mentioned error message.

@CuiLily

CuiLily commented Dec 22, 2019

I have also run into the same problem:
OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument. In: /usr/local/include/pangolin/gl/gl.hpp, line 205

However, I could still get all the processed .npz files. But when I tried to train the network, I ran into the error below:
Traceback (most recent call last):
  File "/home/cuili/DeepSDF/train_deep_sdf.py", line 591, in <module>
    main_function(args.experiment_directory, args.continue_from, int(args.batch_split))
  File "/home/cuili/DeepSDF/train_deep_sdf.py", line 511, in main_function
    chunk_loss.backward()
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/tensor.py", line 166, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: leaf variable has been moved into the graph interior

How can I fix it?

@zhujunli1993

> I have also run into the same problem:
> OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument. In: /usr/local/include/pangolin/gl/gl.hpp, line 205
>
> However, I could still get all the processed .npz files. But when I tried to train the network, I ran into the error below:
> RuntimeError: leaf variable has been moved into the graph interior
>
> How can I fix it?

I also hit the leaf-variable issue when using PyTorch 1.2.0.
I finally got it working with PyTorch 1.0.0.

@CuiLily

CuiLily commented Jan 4, 2020

My torch version is 1.3.1. If I use version 1.0.0, will it be incompatible with my CUDA version? My CUDA version is 10.2.89.

@tschmidt23

@CuiLily that is a separate issue and is actually related to a bug in PyTorch; see https://discuss.pytorch.org/t/why-am-i-getting-this-error-about-a-leaf-variable/58468/9 and the related GitHub issue at pytorch/pytorch#28370. It is a known issue with version 1.3.0 and apparently happens on 1.3.1 and 1.2.0 as well. The solutions for now are to use an older version (as @zhujunli1993 did) or to disable the maximum-norm constraint on the latent codes, which can be done by removing "CodeBound" from specs.json.
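For context, a minimal sketch of what the "CodeBound" workaround amounts to. This is not DeepSDF's exact code: it assumes the latent codes live in a torch.nn.Embedding whose max_norm comes from "CodeBound", that the specs file also contains a "CodeLength" entry, and the path and shape count below are placeholders.

```python
# Sketch only: assumes DeepSDF stores per-shape latent codes in a
# torch.nn.Embedding and feeds the "CodeBound" value from specs.json into
# the embedding's max_norm argument.
import json
import torch

with open("examples/chairs/specs.json") as f:   # placeholder experiment directory
    specs = json.load(f)

# If "CodeBound" has been removed from specs.json, this returns None, which
# disables the max-norm re-projection that triggers the
# "leaf variable has been moved into the graph interior" bug on PyTorch 1.2/1.3.
code_bound = specs.get("CodeBound", None)

lat_vecs = torch.nn.Embedding(
    num_embeddings=1000,                # placeholder: number of training shapes
    embedding_dim=specs["CodeLength"],  # assumed key in the example specs
    max_norm=code_bound,
)
```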

@CuiLily

CuiLily commented Jan 9, 2020

@tschmidt23 I followed your advice and removed "CodeBound". When training reached epoch 5, I got these errors:
```
DeepSdf - INFO - epoch 5...
Traceback (most recent call last):
  File "/home/cuili/DeepSDF/train_deep_sdf.py", line 591, in <module>
    main_function(args.experiment_directory, args.continue_from, int(args.batch_split))
  File "/home/cuili/DeepSDF/train_deep_sdf.py", line 511, in main_function
    chunk_loss.backward()
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/tensor.py", line 166, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/autograd/function.py", line 77, in apply
    return self._forward_cls.backward(self, *args)
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 101, in backward
    return None, None, None, Gather.apply(ctx.input_device, ctx.dim, grad_output)
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 68, in forward
    return comm.gather(inputs, ctx.dim, ctx.target_device)
  File "/home/cuili/.local/lib/python3.6/site-packages/torch/cuda/comm.py", line 165, in gather
    return torch._C.gather(tensors, dim, destination)
RuntimeError: CUDA error: an illegal memory access was encountered (copy_kernel_cuda at /pytorch/aten/src/ATen/native/cuda/Copy.cu:187)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f2372e04813 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x529bceb (0x7f22e5428ceb in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x197925d (0x7f22e1b0625d in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x197649f (0x7f22e1b0349f in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #4: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x43 (0x7f22e1b05163 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x40726b2 (0x7f22e41ff6b2 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #6: <unknown function> + 0x1902d94 (0x7f22e1a8fd94 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #7: torch::cuda::gather(c10::ArrayRef<at::Tensor>, long, c10::optional<int>) + 0xaaa (0x7f22e471d0aa in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #8: <unknown function> + 0x77f4aa (0x7f23782194aa in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x2110f4 (0x7f2377cab0f4 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #10: /usr/bin/python3.6() [0x50ac25]
frame #11: _PyEval_EvalFrameDefault + 0x449 (0x50c5b9 in /usr/bin/python3.6)
frame #12: /usr/bin/python3.6() [0x508245]
frame #13: /usr/bin/python3.6() [0x50a080]
frame #14: /usr/bin/python3.6() [0x50aa7d]
frame #15: _PyEval_EvalFrameDefault + 0x449 (0x50c5b9 in /usr/bin/python3.6)
frame #16: /usr/bin/python3.6() [0x508245]
frame #17: /usr/bin/python3.6() [0x5893bb]
frame #18: PyObject_Call + 0x3e (0x5a067e in /usr/bin/python3.6)
frame #19: THPFunction_apply(_object*, _object*) + 0xa4f (0x7f2377f395ef in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #20: PyCFunction_Call + 0x52 (0x567802 in /usr/bin/python3.6)
frame #21: _PyEval_EvalFrameDefault + 0x55ae (0x51171e in /usr/bin/python3.6)
frame #22: /usr/bin/python3.6() [0x508245]
frame #23: /usr/bin/python3.6() [0x5893bb]
frame #24: PyObject_Call + 0x3e (0x5a067e in /usr/bin/python3.6)
frame #25: _PyEval_EvalFrameDefault + 0x17f6 (0x50d966 in /usr/bin/python3.6)
frame #26: /usr/bin/python3.6() [0x508245]
frame #27: _PyFunction_FastCallDict + 0x2e2 (0x509642 in /usr/bin/python3.6)
frame #28: /usr/bin/python3.6() [0x595311]
frame #29: PyObject_Call + 0x3e (0x5a067e in /usr/bin/python3.6)
frame #30: torch::autograd::PyNode::apply(std::vector<torch::autograd::Variable, std::allocator<torch::autograd::Variable> >&&) + 0x163 (0x7f2377f328d3 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #31: <unknown function> + 0x3d4ae06 (0x7f22e3ed7e06 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #32: torch::autograd::Engine::evaluate_function(torch::autograd::NodeTask&) + 0x10b7 (0x7f22e3ed1417 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #33: torch::autograd::Engine::thread_main(torch::autograd::GraphTask*) + 0x1c4 (0x7f22e3ed3424 in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #34: torch::autograd::python::PythonEngine::thread_init(int) + 0x2a (0x7f2377f2be8a in /home/cuili/.local/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #35: <unknown function> + 0xf14f (0x7f2378a5414f in /home/cuili/.local/lib/python3.6/site-packages/torch/_C.cpython-36m-x86_64-linux-gnu.so)
frame #36: <unknown function> + 0x76db (0x7f237cd246db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #37: clone + 0x3f (0x7f237d05d88f in /lib/x86_64-linux-gnu/libc.so.6)

Process finished with exit code 1
```
Do you have any ideas about that?

@Robonchu

Robonchu commented Jan 13, 2020

I did the following:

  1. updated Pangolin
  2. export PANGOLIN_WINDOW_URI=headless://

However, I still get the same problem:
OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument. In: /usr/local/include/pangolin/gl/gl.hpp, line 205

After this error happens, preprocess_data.py cannot make any further progress.

@tschmidt23
Could you help me?

@lywcn

lywcn commented Jan 29, 2020

I have the same error:
OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument. In: /usr/local/include/pangolin/gl/gl.hpp, line 205

Any solution from the authors? Would it be possible to let us know the build environment they used, e.g. the OS and the version of each package? Thank you!

@ErlerPhilipp

I have the same problem. However, it did still process the dataset on my dev PC; I had to wait ~1 min per object.

It did nothing on my training machine, maybe because OpenGL wasn't set up correctly.

@fishfishson

I have a similar problem with "Unable to read texture XXXX". However, the error message is

terminate called after throwing an instance of 'pangolin::WindowExceptionNoKnownHandler'
  what():  No known window handler for URI 'headless'

I have no idea about Pangolin. Can you help me solve this problem?

BTW I was running it on nvidia-docker.

@zhujunli1993

Did you run "export PANGOLIN_WINDOW_URI=headless://" before preprocessing it?

@zhujunli1993

@fishfishson

@kyewei

kyewei commented Aug 2, 2020

I commented out the line https://github.com/facebookresearch/DeepSDF/blob/master/src/ShaderProgram.cpp#L97 (as described in "gl_PrimitiveID does not need to be declared since it's assumed to exist by default? This causes GLSL compile failure"), used headless Pangolin, and it now seems to be chugging out results slowly, ~1 min per object, like #35 (comment).

@CZ-Wu

CZ-Wu commented Sep 23, 2020

> @tschmidt23 I am running it locally. I noticed that the script actually outputs the .npz files regardless of the mentioned error message.

> I have the same problem here:
>
> Unable to read texture 'texture_1'
> Unable to read texture 'texture_3'
> Unable to read texture 'texture_2'
> Unable to read texture 'texture_3'
> Unable to read texture 'texture_4'
> OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument.
> In: [...]/Pangolin/include/pangolin/gl/gl.hpp, line 205
>
> Have you fixed it?

I have the same problem, but the script does not output any .npz files. What should I do?

@suyz526

suyz526 commented Oct 21, 2020

About this error:

OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument. In: /usr/local/include/pangolin/gl/gl.hpp, line 205

I submitted an issue to Pangolin: stevenlovegrove/Pangolin#625. But this error doesn't matter; the preprocessing still works even with it.

@nikwl

nikwl commented Jan 26, 2021

I have this same error.

OpenGL Error 500: GL_INVALID_ENUM: An unacceptable value is specified for an enumerated argument.
In: [...]/Pangolin/include/pangolin/gl/gl.hpp, line 205

The .npz files are still produced, but the preprocessing runs unacceptably slowly, something like 2 minutes per mesh per thread.

@danielegrattarola

@nikwl were you able to make the preprocessing run faster?

@nikwl

nikwl commented Feb 9, 2022

@danielegrattarola No, I wasn't; I just ended up reimplementing the preprocessing from scratch. This PyPI library was helpful: https://pypi.org/project/mesh-to-sdf/

@danielegrattarola

@nikwl oh god, two days spent trying to get this to work and there was a Python package all along 😵
Did you do anything else besides using this library? I'm only really interested in the mesh-to-SDF part...

Thanks

@nikwl

nikwl commented Feb 9, 2022

@danielegrattarola If you only need the SDF values, that library should serve you pretty well. You'll just need to build the infrastructure to run it on ShapeNet if that's your goal.
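For anyone looking for a starting point, here is a minimal sketch of producing DeepSDF-style samples with the mesh-to-sdf package. The file names and the 'pos'/'neg' .npz layout (xyz + sdf per row) are my assumptions about what the DeepSDF data loader expects, not something confirmed in this thread.

```python
# Sketch only: uses the mesh-to-sdf package (pip install mesh-to-sdf) instead of
# the original DeepSDF preprocessing. File names and npz keys are assumptions.
import numpy as np
import trimesh
from mesh_to_sdf import sample_sdf_near_surface

mesh = trimesh.load("model_normalized.obj")   # placeholder ShapeNet mesh

# Sample SDF values concentrated near the surface, similar in spirit to
# DeepSDF's sampling strategy.
points, sdf = sample_sdf_near_surface(mesh, number_of_points=250000)

# Stack xyz and sdf into (N, 4) rows and split by sign.
samples = np.concatenate([points, sdf[:, None]], axis=1).astype(np.float32)
np.savez(
    "shape.npz",
    pos=samples[sdf > 0],   # samples outside the surface
    neg=samples[sdf <= 0],  # samples inside the surface
)
```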

@Boltzmachine

> @danielegrattarola No, I wasn't; I just ended up reimplementing the preprocessing from scratch. This PyPI library was helpful: https://pypi.org/project/mesh-to-sdf/

Hey, would you mind sharing your scripts built on this PyPI library?

@sinAshish

I wish I had come across this page earlier; it could have saved me 2 days of hair pulling 🤷
