
Customized Dataset for Stable View Synthesis & CUDA Error #12

Closed
matveymor opened this issue Jun 16, 2021 · 5 comments

@matveymor

Thank you very much for publishing "Stable View Synthesis"; it appears to be a significant photorealistic approach to novel view synthesis!
Could you please add detailed instructions to the GitHub page https://github.com/intel-isl/StableViewSynthesis on how to build your own customized dataset?

Besides, I am interested in the following questions:

  1. Can you please tell us how you calculated the depth maps in your work?
  2. When I run the training process on my own data, the following error is raised:
     invalid configuration argument in /notebook/SVS/StableViewSynthesis/ext/mytorch/include/common_cuda.h at 171
     What might be the reason for this?

Thank you in advance!

@griegler
Contributor

Thanks.

> Can you please tell us how you calculated the depth maps in your work?

I used pyrender to render the depth maps given the 3D mesh.
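Roughly along these lines; the mesh path, intrinsics, resolution, and camera pose below are placeholders rather than the exact script, and pyrender expects camera poses in the OpenGL convention (camera looking down -z):

```python
import numpy as np
import trimesh
import pyrender

# Placeholder image size and intrinsics; use the values from your reconstruction.
W, H = 992, 576
fx = fy = 580.0
cx, cy = W / 2.0, H / 2.0

# Load the reconstructed 3D mesh (path is a placeholder).
mesh = trimesh.load("scene_mesh.ply")

scene = pyrender.Scene()
scene.add(pyrender.Mesh.from_trimesh(mesh))

# Camera-to-world pose as a 4x4 matrix in pyrender's OpenGL convention;
# the identity here is only a placeholder.
cam_pose = np.eye(4)
camera = pyrender.IntrinsicsCamera(fx=fx, fy=fy, cx=cx, cy=cy)
scene.add(camera, pose=cam_pose)

renderer = pyrender.OffscreenRenderer(viewport_width=W, viewport_height=H)
# depth is an (H, W) float array in scene units, 0 where no surface is hit.
_, depth = renderer.render(scene)
renderer.delete()
```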

> When I run the training process on my own data, the following error is raised: ...

What GPU are you using? The error is raised in
https://github.com/intel-isl/StableViewSynthesis/blob/main/ext/mytorch/include/common_cuda.h#L169-L171
and it could be that the default kernel launch parameters are problematic for your GPU. If that is the case, you could try changing https://github.com/intel-isl/StableViewSynthesis/blob/main/ext/mytorch/include/common_cuda.h#L109 to a smaller number.
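
To narrow it down, a quick way to check which device and compute capability you are running on (plain PyTorch, nothing specific to this repository) is:

```python
import torch

# Print basic properties of the GPU the extension runs on. The compute
# capability (major.minor) must be covered by the architectures the CUDA
# extension was compiled for, and the kernel launch configuration must stay
# within the device limits.
props = torch.cuda.get_device_properties(0)
print("device:", props.name)
print("compute capability: %d.%d" % (props.major, props.minor))
print("multiprocessors:", props.multi_processor_count)
print("total memory (GB): %.1f" % (props.total_memory / 1024 ** 3))
```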

@KaLiMaLi555

KaLiMaLi555 commented Jun 25, 2021

Hey @griegler, I tried setting CUDA_NUM_THREADS to a smaller value, and I also tried changing the nvcc flags in setup.py. Neither helped.
It would be great if you could suggest another fix for this issue.
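
For reference, the kind of setup.py change I tried looks roughly like this; the compute_75/sm_75 values and source file names are placeholders for my setup, not the actual contents of the repository's setup.py:

```python
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

# Sketch of passing explicit nvcc architecture flags to a CUDA extension;
# compute_75/sm_75 stand in for the GPU's compute capability.
nvcc_flags = [
    "-gencode=arch=compute_75,code=sm_75",
    "-O3",
]

setup(
    name="mytorch_ext_sketch",
    ext_modules=[
        CUDAExtension(
            name="mytorch_ext_sketch",
            sources=["ext.cpp", "ext_kernels.cu"],  # placeholder source files
            extra_compile_args={"cxx": ["-O3"], "nvcc": nvcc_flags},
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```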

@griegler
Contributor

@KaLiMaLi555 do you have more information, e.g., an error log? Can you also post the command that you executed?

@KaLiMaLi555

I ran the command provided in the README:
python exp.py --net resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16 --cmd eval --iter last --eval-dsets tat-subseq

Library versions:

torch==1.6.0
torch-geometric==1.7.1
torch-scatter==2.0.5
torch-sparse==0.6.8
torchvision==0.7.0

I wasn't able to run the code with some of the library versions given in the README; the versions above worked for me.

Error log:

[2021-06-25/05:51/INFO/mytorch] Set seed to 42
[2021-06-25/05:51/INFO/mytorch] ================================================================================
[2021-06-25/05:51/INFO/mytorch] Start cmd "eval": tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg
[2021-06-25/05:51/INFO/mytorch] 2021-06-25 05:51:01
[2021-06-25/05:51/INFO/mytorch] host: ip-172-31-44-59
[2021-06-25/05:51/INFO/mytorch] --------------------------------------------------------------------------------
[2021-06-25/05:51/INFO/mytorch] worker env:
    experiments_root: experiments
    experiment_name: tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg
    n_train_iters: -65536
    seed: 42
    train_batch_size: 1
    train_batch_acc_steps: 1
    eval_batch_size: 1
    num_workers: 6
    save_frequency: <co.mytorch.Frequency object at 0x7fd6472f4a50>
    eval_frequency: <co.mytorch.Frequency object at 0x7fd64f6a8910>
    train_device: cuda:0
    eval_device: cuda:0
    clip_gradient_value: None
    clip_gradient_norm: None
    empty_cache_per_batch: False
    log_debug: []
    train_iter_messages: []
    stopwatch:
    train_dsets: ['tat-wo-val']
    eval_dsets: ['tat-subseq']
    train_n_nbs: 3
    train_src_mode: image
    train_nbs_mode: argmax
    train_scale: 0.25
    eval_scale: 0.5
    invalid_depth: 1000000000.0
    point_aux_data: ['dirs']
    point_edges_mode: penone
    eval_n_max_sources: 5
    train_rank_mode: pointdir
    eval_rank_mode: pointdir
    train_loss: VGGPerceptualLoss(
  (vgg): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (17): ReLU(inplace=True)
    (18): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (19): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (24): ReLU(inplace=True)
    (25): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (26): ReLU(inplace=True)
    (27): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (31): ReLU(inplace=True)
    (32): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (33): ReLU(inplace=True)
    (34): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (35): ReLU(inplace=True)
    (36): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
)
    eval_loss: L1Loss()
    exp_out_root: experiments/tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg
    db_path: experiments/tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg/exp.ip-172-31-44-59.db
    db_logger: <co.sqlite.Logger object at 0x7fd640347590>
[2021-06-25/05:51/INFO/mytorch] ================================================================================
[2021-06-25/05:51/INFO/exp] Create eval datasets
[2021-06-25/05:51/INFO/exp]   create dataset for tat_subseq_training_Truck
[2021-06-25/05:51/INFO/dataset]     #tgt_im_paths=25, #tgt_counts=(25, 226), tgt_im=(3, 576, 992), tgt_dm=(576, 992), train=False
[2021-06-25/05:51/INFO/exp]   create dataset for tat_subseq_intermediate_M60
[2021-06-25/05:51/INFO/dataset]     #tgt_im_paths=36, #tgt_counts=(36, 277), tgt_im=(3, 576, 1088), tgt_dm=(576, 1088), train=False
[2021-06-25/05:51/INFO/exp]   create dataset for tat_subseq_intermediate_Playground
[2021-06-25/05:51/INFO/dataset]     #tgt_im_paths=32, #tgt_counts=(32, 275), tgt_im=(3, 576, 1024), tgt_dm=(576, 1024), train=False
[2021-06-25/05:51/INFO/exp]   create dataset for tat_subseq_intermediate_Train
[2021-06-25/05:51/INFO/dataset]     #tgt_im_paths=43, #tgt_counts=(43, 258), tgt_im=(3, 576, 992), tgt_dm=(576, 992), train=False
[2021-06-25/05:51/INFO/modules] [NET][EncNet] resunet3.16
[2021-06-25/05:51/INFO/modules] [NET][RefNet] point_edges_mode=penone
[2021-06-25/05:51/INFO/modules] [NET][RefNet] point_aux_data=dirs
[2021-06-25/05:51/INFO/modules] [NET][RefNet] point_avg_mode=avg
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Seq 9 nets, nets_residual=True
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   Unet(in_channels=16, enc_channels=[16, 32, 64, 128, 128], dec_channels=[128, 64, 32, 16], n_conv=2)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] Single gnn
[2021-06-25/05:51/INFO/modules] [NET][RefNet]   MLPDir(in_channels=16, hidden_channels=64, n_mods=3, out_channels=16, aggr=mean)
[2021-06-25/05:51/INFO/modules] [NET][RefNet] out_conv(16, 3)
[2021-06-25/05:51/INFO/mytorch] [EVAL] loading net for iter last: experiments/tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg/net_0000000000000000.params
[2021-06-25/05:51/INFO/mytorch]
[2021-06-25/05:51/INFO/mytorch] ================================================================================
[2021-06-25/05:51/INFO/mytorch] Evaluating set tat_subseq_training_Truck
[2021-06-25/05:51/INFO/exp] --------------------------------------------------------------------------------
[2021-06-25/05:51/INFO/mytorch] 2021-06-25 05:51:04
[2021-06-25/05:51/INFO/exp] Eval iter 0
[2021-06-25/05:51/INFO/exp]   preprocess all source images
[2021-06-25/05:51/INFO/exp]     feat tmp dir: experiments/tmp_srcfeat_tat-wo-val_bs1_nbs3_rpointdir_s0.25_resunet3.16_penone.dirs.avg.seq+9+1+unet+5+2+16.single+mlpdir+mean+3+64+16_vgg_tat_subseq_training_Truck
[2021-06-25/05:51/INFO/exp]   create target images
invalid device function in /home/ubuntu/PreImage/StableViewSynthesis/ext/mytorch/include/common_cuda.h at 171
[1]    31933 segmentation fault (core dumped)  python exp.py --net  --cmd eval --iter last --eval-dsets tat-subseq

matveymor reopened this on Jul 1, 2021
@alex04072000

@matveymor
Did you solve the customized dataset issue?
I am facing the same problem.
There is no script for generating delaunay_photometric.ply in create_data_own.py.
