mesh2sdf errors #20

Open
kingcodefish opened this issue Sep 15, 2021 · 6 comments

Comments

@kingcodefish

Hi,

I'm trying to build the sdf-net portion of the repository, but I'm having issues with the mesh2sdf library. I'm not entirely sure which library this corresponds to because it's not in requirements.txt, so I guessed that it must be the mesh-to-sdf library. I tried that and changed the imports to match, but the build still breaks in compute_sdf.py, which uses mesh2sdf_gpu, and that doesn't exist in mesh-to-sdf.

So I'm curious which library I'm supposed to use for this, or whether this is just legacy code that needs to be reconfigured. I've had other dependency issues, but so far those have been pretty easy to resolve. Happy to help contribute to this, but I want to make sure I'm not making a glaring mistake.

Thanks.

@joeylitalien
Collaborator

Hi @kingcodefish,

Our codebase relies on two custom CUDA kernels which can be found in sdf-net/lib/extensions, where mesh2sdf is one of them. You can refer to our read-me file for install instructions. What other dependency issues were you having?

@kingcodefish
Author

@joeylitalien Whoops, my bad. Somehow I managed to skip over that part of the setup. I've now built the extensions and have gotten the neural network to train on the armadillo example. However, when I run the following command to export the .npz file:

python app/sdf_renderer.py \
    --net OctreeSDF \
    --num-lods 5 \
    --pretrained _results/models/armadillo.pth \
    --render-res 1280 720 \
    --shading-mode matcap \
    --lod 4 \
    --export armadillo.npz

I get the following error:

Total number of parameters: 10146213
Traceback (most recent call last):
  File "app/sdf_renderer.py", line 106, in <module>
    net = SOL_NGLOD(net)
  File "~/Documents/nglod/sdf-net/lib/models/SOL_NGLOD.py", line 50, in __init__
    self.vs = voxel_sparsify(2000000, net, self.lod, sol=False)
  File "~/Documents/nglod/sdf-net/lib/renderutils.py", line 63, in voxel_sparsify
    surface = sample_surface(n, net, sol=sol, device=device)[:n]
  File "~/Documents/nglod/sdf-net/lib/renderutils.py", line 33, in sample_surface
    tracer = SphereTracer(device, sol=sol)
TypeError: __init__() got an unexpected keyword argument 'sol'
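
For anyone hitting the same traceback: the error means the call site in renderutils.py still passes sol= while the SphereTracer constructor no longer accepts it. Below is a minimal sketch of how one could confirm that mismatch and get past it by dropping keyword arguments the constructor does not declare. SphereTracerStandIn and call_with_supported_kwargs are hypothetical names made up for illustration; the stand-in only mimics a constructor without a sol parameter and is not the repository's class.

```python
import inspect


def call_with_supported_kwargs(factory, *args, **kwargs):
    """Call factory, dropping keyword arguments its signature does not accept.

    Returns (result, dropped), where dropped lists the ignored keyword names.
    """
    params = inspect.signature(factory).parameters
    # A callable that declares **kwargs accepts every keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return factory(*args, **kwargs), []
    supported = {k: v for k, v in kwargs.items() if k in params}
    dropped = sorted(set(kwargs) - set(supported))
    return factory(*args, **supported), dropped


class SphereTracerStandIn:
    """Hypothetical stand-in: a tracer whose constructor has no 'sol' parameter."""

    def __init__(self, device):
        self.device = device


# Mirrors the failing call SphereTracer(device, sol=sol) from the traceback.
tracer, dropped = call_with_supported_kwargs(SphereTracerStandIn, "cpu", sol=False)
print(dropped)  # ['sol']
```

This only avoids the TypeError. Whether the export then produces a usable .npz without the sol argument is not something the thread confirms.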

@tovacinni
Collaborator

A lot of this is affected by some recent refactors, which include deprecating some of the exporting capabilities in favor of a Python-based interactive renderer. That Python-based renderer isn't available yet but will be soon (hopefully over the weekend)!

@kingcodefish
Author

@tovacinni Thanks for getting back so quickly. Looking forward to it.

I went ahead and tried the .npz files from the other issues in the real-time renderer as well, but all I get is this:
[Image: Screenshot from 2021-10-01 10-48-05]

Attempting to zoom in or out, or really do anything, instantly crashes the program, leaving this error in the console:

NLOD Demo starting...
GPU Device 0: "Pascal" with compute capability 6.1

terminate called after throwing an instance of 'c10::CUDAError'
  what():  CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from nonzero_cuda_out_impl at /pytorch/aten/src/ATen/native/cuda/Nonzero.cu:64 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f278f51970b in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libc10.so)
frame #1: void at::native::nonzero_cuda_out_impl<bool>(at::Tensor const&, at::Tensor&) + 0x136f (0x7f269cacba5f in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cuda_cu.so)
frame #2: at::native::nonzero_out_cuda(at::Tensor const&, at::Tensor&) + 0x1eb (0x7f269caab29b in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cuda_cu.so)
frame #3: at::native::nonzero_cuda(at::Tensor const&) + 0x105 (0x7f269caab755 in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cuda_cu.so)
frame #4: <unknown function> + 0x2926ecd (0x7f269d8d8ecd in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cuda_cu.so)
frame #5: <unknown function> + 0x2926f30 (0x7f269d8d8f30 in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cuda_cu.so)
frame #6: <unknown function> + 0x18d3e44 (0x7f26e89dae44 in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cpu.so)
frame #7: at::redispatch::nonzero(c10::DispatchKeySet, at::Tensor const&) + 0x63 (0x7f26e89e4b83 in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0x33bbdaf (0x7f26ea4c2daf in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cpu.so)
frame #9: <unknown function> + 0x33bbec3 (0x7f26ea4c2ec3 in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cpu.so)
frame #10: at::nonzero(at::Tensor const&) + 0x124 (0x7f26e86a56a4 in ~/Documents/nglod/sol-renderer/third-party/libtorch/lib/libtorch_cpu.so)
frame #11: <unknown function> + 0x410f3 (0x556a9759b0f3 in ./sdfRenderer)
frame #12: <unknown function> + 0x27a27 (0x556a97581a27 in ./sdfRenderer)
frame #13: <unknown function> + 0x1906a (0x556a9757306a in ./sdfRenderer)
frame #14: <unknown function> + 0x20194 (0x7f278fa6a194 in /lib/x86_64-linux-gnu/libglut.so.3)
frame #15: fgEnumWindows + 0x39 (0x7f278fa6dc39 in /lib/x86_64-linux-gnu/libglut.so.3)
frame #16: glutMainLoopEvent + 0x1cd (0x7f278fa6a7bd in /lib/x86_64-linux-gnu/libglut.so.3)
frame #17: glutMainLoop + 0x65 (0x7f278fa6aff5 in /lib/x86_64-linux-gnu/libglut.so.3)
frame #18: <unknown function> + 0x19dac (0x556a97573dac in ./sdfRenderer)
frame #19: __libc_start_main + 0xf3 (0x7f269a8970b3 in /lib/x86_64-linux-gnu/libc.so.6)
frame #20: <unknown function> + 0x1726e (0x556a9757126e in ./sdfRenderer)

Aborted (core dumped)

By the way, I am using CUDA 11.1 with Python 3.8.8 and the latest PyTorch build with CUDA 11.1 support, on Ubuntu 20.04 with an NVIDIA Titan X (Pascal).

I assume this is the same issue as #5. I just want to bring it to your attention again and ask whether there is anything I could try on my end to resolve it (changing the CUDA toolkit or PyTorch version?).
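
The log above suggests passing CUDA_LAUNCH_BLOCKING=1 so that the reported stack trace points at the kernel that actually failed. A rough sketch of one way to launch a program with that variable set from Python follows. run_with_blocking_launches is a hypothetical helper name, and the demo runs a child Python process that only echoes the variable, since the real ./sdfRenderer arguments are not shown in this thread.

```python
import os
import subprocess
import sys


def run_with_blocking_launches(cmd):
    """Run cmd with CUDA_LAUNCH_BLOCKING=1 added to a copy of the environment."""
    env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Demo: the child process prints the variable, showing it was passed through.
# For the renderer, cmd would instead start with "./sdfRenderer".
result = run_with_blocking_launches(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]
)
print(result.stdout.strip())  # 1
```

Synchronous launches slow the program down, so this is for debugging only.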

@zhaoyuanyuan2011

Hi @kingcodefish, I'm just wondering whether you have solved the "TypeError: __init__() got an unexpected keyword argument 'sol'" issue. Thank you!

@Sylva-Lin

How did you solve this problem (TypeError: __init__() got an unexpected keyword argument 'sol') in the end? Thank you!
