Hello,
I'm running a remote EC2 instance with a remote desktop client called NICE DCV (an enterprise competitor to VNC, free on EC2). The instance has 24GB VRAM and 64GB RAM.
I can train without a viewer with no problems. However, when I try to run it with a viewer, I get a segmentation fault. The app window opens, but it crashes before anything loads.
I have tried both experimental and normal docker builds (I have only tried docker). I have tried checking out multiple versions of the repo (783c41f and e72ae5b), to see if the problem was recently introduced. Nothing has worked so far. The problem I get looks like this:
/workspace/permuto_sdf$ ./permuto_sdf_py/train_permuto_sdf.py --dataset dtu --scene dtu_scan24 --comp_name comp_3 --exp_info default
args.with_mask False
args.low_res False
checkpoint_path /workspace/permuto_sdf/checkpoints
with_viewer True
has_apex True
[ D96CB740]DataLoaderDTU.cxx:173 1| loaded nr of scenes 1 for mode train
[ D96CB740]DataLoaderDTU.cxx:432 1| reading poses and intrinsics for scene "dtu_scan24"
[ D96CB740]DataLoaderDTU.cxx:173 1| loaded nr of scenes 1 for mode test
[ D96CB740]DataLoaderDTU.cxx:432 1| reading poses and intrinsics for scene "dtu_scan24"
[ D96CB740] Mesh.cxx:3390 1| read obj with path /workspace/easy_pbr/data/sphere.obj
Segmentation fault (core dumped)
In contrast, when I train without a viewer, it looks like this:
/workspace/permuto_sdf$ ./permuto_sdf_py/train_permuto_sdf.py --dataset dtu --scene dtu_scan24 --comp_name comp_3 --exp_info default --no_viewer
args.with_mask False
args.low_res False
checkpoint_path /workspace/permuto_sdf/checkpoints
with_viewer False
has_apex True
[ 2A5FF740]DataLoaderDTU.cxx:173 1| loaded nr of scenes 1 for mode train
[ 2A5FF740]DataLoaderDTU.cxx:432 1| reading poses and intrinsics for scene "dtu_scan24"
[ 2A5FF740]DataLoaderDTU.cxx:173 1| loaded nr of scenes 1 for mode test
[ 2A5FF740]DataLoaderDTU.cxx:432 1| reading poses and intrinsics for scene "dtu_scan24"
phase.iter_nr 1000 loss 1.3530950546264648
phase.iter_nr 2000 loss 0.15609805285930634
phase.iter_nr 3000 loss 0.10311679542064667
...
How should I best troubleshoot this?
Same here. I'm using WSL on Windows to get a Linux environment, and I run the docker as instructed. I can train without the viewer, but I get a segmentation fault with the viewer. For now I check the result by creating a mesh from the saved checkpoint.
Unfortunately, the viewer cannot currently render on headless machines, so I have only used it when training locally. If you train on EC2 instances, I recommend disabling the viewer with --no_viewer.
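If the same launch command has to work both locally and on remote machines, the viewer can be switched off automatically. The following is only a minimal sketch and not part of the repository: `has_display` and `build_command` are hypothetical helper names, and the check on `DISPLAY` / `WAYLAND_DISPLAY` is an assumed heuristic. Only the script path and the `--no_viewer` flag come from this thread. The sketch also starts the script with Python's built-in `-X faulthandler` option, which prints the Python traceback if the process segfaults and may help narrow down where the crash happens.

```python
import sys

# Script path as used in this thread; adjust to your checkout.
TRAIN_SCRIPT = "./permuto_sdf_py/train_permuto_sdf.py"


def has_display(env):
    """Heuristic (assumption): a display exists if X11 or Wayland advertises one."""
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


def build_command(train_args, env):
    """Build the training command, adding --no_viewer when no display is found."""
    # -X faulthandler dumps the Python traceback on a fatal signal such as SIGSEGV.
    cmd = [sys.executable, "-X", "faulthandler", TRAIN_SCRIPT, *train_args]
    if not has_display(env) and "--no_viewer" not in cmd:
        cmd.append("--no_viewer")
    return cmd
```

The resulting list can be passed to `subprocess.call(build_command(sys.argv[1:], os.environ))`. Note that a display variable being set does not guarantee that the viewer can render: in both setups reported above (NICE DCV and WSL) a window could open and the process still crashed, so passing `--no_viewer` explicitly remains the safe choice on such machines.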