testing any other size image #17

Closed
ouyangjiacs opened this issue May 23, 2023 · 9 comments

Comments

@ouyangjiacs

I tested a 128*128 input image with this script:
python scripts/sr_val_ddpm_text_T_vqganfin_old.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt /data/work/StableSR-main/model/stablesr_000117.ckpt --vqgan_ckpt /data/work/StableSR-main/model/vqgan_cfw_00011.ckpt --init-img /data/work/StableSR/datasetoyj/plant/ --outdir /data/work/StableSR/datasetoyj/plant_out --ddpm_steps 200 --dec_w 0.5
There was no problem.

But I encountered an error when testing images of any other size with this script:
python scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt /data/work/StableSR-main/model/stablesr_000117.ckpt --vqgan_ckpt /data/work/StableSR-main/model/vqgan_cfw_00011.ckpt --init-img /data/work/StableSR/datasetoyj/human_all/ --outdir /data/work/StableSR/datasetoyj/human_all_out --ddpm_steps 200 --dec_w 0.5 --vqgantile_size 256 --vqgantile_stride 200
the error is:
...
│ /data/work/StableSR/ldm/models/diffusion/ddpm.py:2621 in p_mean_variance_canvas │
│ │
│ 2618 │ │ │ │ # print(noise_preds[row][col].size()) │
│ 2619 │ │ │ │ # print(tile_weights.size()) │
│ 2620 │ │ │ │ # print(noise_pred.size()) │
│ ❱ 2621 │ │ │ │ noise_pred[:, :, input_start_y:input_end_y, input_sta │
│ 2622 │ │ │ │ contributors[:, :, input_start_y:input_end_y, input_s │
│ 2623 │ │ # Average overlapping areas with more than 1 contributor │
│ 2624 │ │ noise_pred /= contributors │
╰──────────────────────────────────────────────────────────────────────────────╯
RuntimeError: The size of tensor a (32) must match the size of tensor b (64) at
non-singleton dimension 3

@ouyangjiacs (Author)

The other size is 3K*4K.

@IceClear (Owner)

The VQGAN tile size should be at least 512.
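
For context, a minimal sketch of why a 256-pixel tile trips the size check, assuming the standard Stable Diffusion 8x VAE downsampling (the diffusion model works on latents at 1/8 of the pixel resolution; the snippet is illustrative, not code from the repository):

latent_downscale = 8  # assumed SD VAE downsampling factor
for vqgantile_size in (256, 512):
    print(vqgantile_size, "->", vqgantile_size // latent_downscale)
# 256 -> 32   matches "tensor a (32)" in the traceback above
# 512 -> 64   matches "tensor b (64)", the latent tile size the 512-pixel model expects

So with --vqgantile_size 256 the latent tiles are 32 wide, while the tile weights built for the 512-pixel model are 64 wide, which is the dimension-3 mismatch reported above.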

@ouyangjiacs (Author)

Ohohoh, I previously set --vqgantile_size to 512, but got this error:
RuntimeError: CUDA out of memory. Tried to allocate 2.15 GiB (GPU 0; 23.88 GiB
total capacity; 21.09 GiB already allocated; 2.04 GiB free; 21.18 GiB reserved
in total by PyTorch) If reserved memory is >> allocated memory try setting
max_split_size_mb to avoid fragmentation. See documentation for Memory
Management and PYTORCH_CUDA_ALLOC_CONF

Then I forgot about this restriction and just made it smaller.

How much GPU memory is required for a --vqgantile_size of 512?
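
For what it's worth, the OOM message itself points at one documented knob: PyTorch's PYTORCH_CUDA_ALLOC_CONF with max_split_size_mb, which limits allocator block splitting to reduce fragmentation. A minimal sketch, assuming it is set before torch touches the GPU (the value 128 is just an example):

import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # set before the first CUDA allocation
# ...then run the StableSR script as usual (or export the variable in the shell instead)

This only helps when reserved memory is much larger than allocated memory, as the error text says; if 512-pixel tiles genuinely need more than the ~24 GiB the card has, a smaller image or a larger GPU is still required.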

@ouyangjiacs (Author)

But now, with my test script:
python scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt /data/work/StableSR-main/model/stablesr_000117.ckpt --vqgan_ckpt /data/work/StableSR-main/model/vqgan_cfw_00011.ckpt --init-img /data/work/StableSR/datasetoyj/human/ --outdir /data/work/StableSR/datasetoyj/human_out --ddpm_steps 200 --dec_w 0.5 --vqgantile_size 512 --vqgantile_stride 384
the error is still:
... │ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ /data/work/StableSR/ldm/modules/diffusionmodules/openaimodel.py:1333 in forward │
│ │
│ 1330 │ │ │ hs.append(h) │
│ 1331 │ │ h = self.middle_block(h, emb, context, struct_cond) │
│ 1332 │ │ for module in self.output_blocks: │
│ ❱ 1333 │ │ │ h = th.cat([h, hs.pop()], dim=1) │
│ 1334 │ │ │ h = module(h, emb, context, struct_cond) │
│ 1335 │ │ h = h.type(x.dtype) │
│ 1336 │ │ if self.predict_codebook_ids: │
╰──────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2
but got size 1 for tensor number 1 in the list.

@IceClear (Owner)

IceClear commented May 24, 2023

Hi.
I have just found a bug related to the tile operation and fixed it.
You may pull the latest code and try again.
If the bug still exists, you can show me the image and the settings and I will check then.
Thx.

@ouyangjiacs (Author)

Hello, thank you very much for your kind reply.
I used the latest code (2023.5.25), but I still get an error when I set vqgantile_size=512.
The test script is:
python scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt /data/work/StableSR-main/model/stablesr_000117.ckpt --vqgan_ckpt /data/work/StableSR-main/model/vqgan_cfw_00011.ckpt --init-img /data/work/StableSR/datasetoyj/test_data_hq6_x2/ --outdir /data/work/StableSR/datasetoyj/test_data_hq6_x2_out --ddpm_steps 200 --dec_w 0.5 --vqgantile_size 512 --vqgantile_stride 384 --colorfix_type adain
the error is:
RuntimeError: CUDA out of memory. Tried to allocate 8.58 GiB (GPU 0; 31.75 GiB
total capacity; 23.00 GiB already allocated; 7.59 GiB free; 23.07 GiB reserved
in total by PyTorch) If reserved memory is >> allocated memory try setting
max_split_size_mb to avoid fragmentation. See documentation for Memory
Management and PYTORCH_CUDA_ALLOC_CONF

The image size is 6K*8K. Is this related to the image size? My understanding is that GPU memory should only depend on vqgantile_size?

@IceClear (Owner)

IceClear commented May 25, 2023

With the default upscale of 4.0, the output will be 24k*32k. I do not know why you need such a large image, but it is equivalent to loading about 3k 512x512 images into GPU memory all at once. I conjecture you need at least 10G of VRAM to store the input even if you do nothing else.
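
A rough back-of-the-envelope check of that estimate, assuming a 6000x8000 input, the default 4.0 upscale, 3 channels, and float32 storage:

h, w, c = 6000 * 4, 8000 * 4, 3          # 24000 x 32000 output
gib = h * w * c * 4 / 1024**3            # 4 bytes per float32 value
print(f"{gib:.2f} GiB")                  # ~8.58 GiB for a single output-sized tensor

which lines up with the 8.58 GiB allocation in the OOM message above, before counting the model weights, activations, or the latent canvas.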

@ouyangjiacs (Author)

Uh huh, I made the test image smaller, but I still encountered errors during the test:
Sampling t: 96%|█████████▌| 192/200 [02:07<00:05, 1.52it/s]
...
Sampling t: 100%|██████████| 200/200 [02:13<00:00, 1.50it/s]

Sampling: 0%| | 0/146 [7:19:17<?, ?it/s]
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ /data/work/StableSR/scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py:416 in <module> │
│ │
│ 413 │
│ 414 │
│ 415 if __name__ == "__main__": │
│ ❱ 416 │ main() │
│ 417 │
│ │
│ /data/work/StableSR/scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py:401 in main │
│ │
│ 398 │ │ │ │ │ │ im_sr = im_sr.cpu().numpy().transpose(0,2,3,1) │
│ 399 │ │ │ │ │ │ │
│ 400 │ │ │ │ │ │ if flag_pad: │
│ ❱ 401 │ │ │ │ │ │ │ im_sr = im_sr[:, :ori_h*sf, :ori_w*sf, ] │
│ 402 │ │ │ │ │ │ │
│ 403 │ │ │ │ │ │ for jj in range(im_lq_bs.shape[0]): │
│ 404 │ │ │ │ │ │ │ img_name = str(Path(im_path_bs[jj]).name) │
╰──────────────────────────────────────────────────────────────────────────────╯
NameError: name 'sf' is not defined
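
For reference, a hypothetical sketch of the crop that the traceback points to, assuming sf is the upscale factor (4.0 by default, per the comment above) and ori_h/ori_w are the pre-padding input dimensions; the actual fix committed to the repository may look different:

sf = 4  # assumed: the upscale factor has to be defined (or passed in) before the crop
if flag_pad:
    # crop the padded super-resolved output back to the original size times sf
    im_sr = im_sr[:, :ori_h * sf, :ori_w * sf, ]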

@IceClear (Owner)

Fixed
