Thank you for sharing the code!
After building a custom dataset and training the vq_gan_3d model, I tried to run the ddpm model and hit the following error:
[2023-06-07 15:09:43,745][torch.distributed.nn.jit.instantiator][INFO] - Created a temporary directory at /tmp/tmp_69yy41e
[2023-06-07 15:09:43,746][torch.distributed.nn.jit.instantiator][INFO] - Writing /tmp/tmp_69yy41e/_remote_module_non_scriptable.py
/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
warnings.warn(
/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
loaded pretrained LPIPS loss from /home/airfmt/DosePainting/src/vq_gan_3d/model/cache/vgg.pth
Error executing job with overrides: ['model=ddpm', 'dataset=Brain_TR_GammaKnife', 'model.results_folder_postfix=Brain_TR_GammaKnife_ddpm', 'model.vqgan_ckpt=/home/airfmt/DosePainting/src/checkpoints/vq_gan/Brain_TR_GammaKnife/lightning_logs/version_1/checkpoints/epoch\\=656-step\\=42000-train/recon_loss\\=0.11.ckpt', 'model.diffusion_img_size=32', 'model.diffusion_depth_size=32', 'model.diffusion_num_channels=8', 'model.dim_mults=[1,2,4,8]', 'model.batch_size=10', 'model.gpus=1']
Traceback (most recent call last):
File "/home/airfmt/DosePainting/train/train_ddpm.py", line 54, in run
trainer = Trainer(
File "/home/airfmt/DosePainting/src/ddpm/diffusion.py", line 984, in __init__
self.ema_model = copy.deepcopy(self.model)
File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 270, in _reconstruct
state = deepcopy(state, memo)
File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 146, in deepcopy
y = copier(x, memo)
File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 230, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 172, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/home/airfmt/anaconda3/envs/dosepaint/lib/python3.9/copy.py", line 296, in _reconstruct
value = deepcopy(value, memo)
...
rv = reductor(4)
TypeError: cannot pickle '_thread.lock' object
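As far as I understand, the failure happens when the Trainer builds the EMA copy via self.ema_model = copy.deepcopy(self.model) (diffusion.py, line 984): deepcopy falls back to the pickle machinery (the rv = reductor(4) frame in the traceback), and that fails as soon as anything reachable from the model holds a raw thread lock (for example a logger handle or an open resource kept by an attached module). A minimal, repo-independent reproduction of the same error:

import copy
import threading

class Holder:
    def __init__(self):
        # An attribute wrapping a raw thread lock makes the whole object
        # impossible to deep-copy; this is exactly what the traceback reports.
        self.lock = threading.Lock()

try:
    copy.deepcopy(Holder())
except TypeError as exc:
    print(exc)  # -> cannot pickle '_thread.lock' object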
For reference, these are the commands I ran:

# vqgan
!PL_TORCH_DISTRIBUTED_BACKEND=gloo CUDA_VISIBLE_DEVICES=0 python train/train_vqgan.py dataset="Brain_TR_GammaKnife" model=vq_gan_3d model.gpus=1 model.precision=16 model.embedding_dim=8 model.n_hiddens=16 model.downsample=[2,2,2] model.num_workers=32 model.gradient_clip_val=1.0 model.lr=3e-4 model.discriminator_iter_start=10000 model.perceptual_weight=4 model.image_gan_weight=1 model.video_gan_weight=1 model.gan_feat_weight=4 model.batch_size=2 model.n_codes=16384 model.accumulate_grad_batches=1
# diffusion
!python train/train_ddpm.py model=ddpm dataset="Brain_TR_GammaKnife" model.results_folder_postfix='Brain_TR_GammaKnife_ddpm' model.vqgan_ckpt='/home/airfmt/DosePainting/src/checkpoints/vq_gan/Brain_TR_GammaKnife/lightning_logs/version_1/checkpoints/epoch\=656-step\=42000-train/recon_loss\=0.11.ckpt' model.diffusion_img_size=32 model.diffusion_depth_size=32 model.diffusion_num_channels=8 model.dim_mults=[1,2,4,8] model.batch_size=10 model.gpus=1
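One workaround I am considering, in case the culprit is the frozen VQ-GAN attached to the diffusion model: detach the offending attribute before the deep copy and re-attach it afterwards, so only the trainable parts get copied. This is just a sketch under my own assumptions (the attribute name vqgan and the call site are guesses, not the repo's confirmed API):

import copy

def make_ema_copy(model):
    # Hypothetical replacement for `self.ema_model = copy.deepcopy(self.model)`:
    # temporarily detach the attribute assumed to hold the non-picklable handle,
    # deep-copy the rest, then re-attach the shared reference to both copies.
    non_copyable = getattr(model, "vqgan", None)  # assumed attribute name
    if non_copyable is not None:
        model.vqgan = None
    ema_model = copy.deepcopy(model)
    if non_copyable is not None:
        model.vqgan = non_copyable
        ema_model.vqgan = non_copyable  # the frozen VQ-GAN can be shared, no copy needed
    return ema_model

Has anyone hit this before, or is there a recommended fix on the repo side?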