If I use 2 or more GPUs for inference, the following error occurs.
Traceback (most recent call last):
File "/hub_data1/minhyuk/diffusion/opensora/scripts/inference.py", line 114, in <module>
main()
File "/hub_data1/minhyuk/diffusion/opensora/scripts/inference.py", line 95, in main
samples = scheduler.sample(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 72, in sample
samples = self.p_sample_loop(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 434, in p_sample_loop
for sample in self.p_sample_loop_progressive(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 485, in p_sample_loop_p
rogressive
out = self.p_sample(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 388, in p_sample
out = self.p_mean_variance(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 94, in p_mean_variance
return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 267, in p_mean_variance
model_output = model(x, t, **model_kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 127, in __call__
return self.model(x, new_ts, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 89, in forward_with_cfg
model_out = model.forward(combined, timestep, y, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 267, in forward
x = auto_grad_checkpoint(block, x, y, t0, y_lens, tpe)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/acceleration/checkpoint.py", line 24, in auto_grad_checkpoint
return module(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 111, in forward
x = x + self.cross_attn(x, y, mask)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/layers/blocks.py", line 313, in forward
kv = self.kv_linear(cond).view(B, -1, 2, self.num_heads, self.head_dim)
RuntimeError: shape '[4, -1, 2, 16, 72]' is invalid for input of size 105523
I tested on 2/3/4 GPUs, and all give the same error.
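For context, the failure itself is just `torch.Tensor.view` refusing to infer the `-1` dimension: 4 * 2 * 16 * 72 = 9216 does not divide 105523, so the flattened conditioning tensor cannot be split evenly per sample. A minimal sketch that reproduces the same error (the tensor below is a stand-in, not the real text-embedding projection; the shape values are taken from the error message):

```python
import torch

B, num_heads, head_dim = 4, 16, 72      # values from the error message above
cond = torch.randn(105523)              # stand-in for the flattened kv projection output
# 105523 is not divisible by B * 2 * num_heads * head_dim (= 9216), so view()
# cannot infer the -1 dimension and raises the same RuntimeError as in the traceback.
kv = cond.view(B, -1, 2, num_heads, head_dim)
```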
"My solution is to change the 'batch_size' value on line 32 of 'configs/opensora/inference/16x512x512.py' to 1, and that resolves the error." 16x512x512.py