Kernel size error when rave.decode(z) #15

Closed
robinmeier opened this issue Dec 18, 2021 · 15 comments

Comments

@robinmeier

i get the following error when i try generating from the prior

Traceback (most recent call last):
  File "/home/syrinx/RAVE/ravezeke1-generate.py", line 43, in <module>
    y = rave.decode(z)
  File "/home/syrinx/RAVE/rave/model.py", line 582, in decode
    y = self.decoder(z, add_noise=True)
  File "/home/syrinx/miniconda2/envs/rave/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/syrinx/RAVE/rave/model.py", line 235, in forward
    x = self.net(x)
  File "/home/syrinx/miniconda2/envs/rave/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/syrinx/miniconda2/envs/rave/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/syrinx/miniconda2/envs/rave/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1120, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/syrinx/miniconda2/envs/rave/lib/python3.9/site-packages/cached_conv/convs.py", line 74, in forward
    return nn.functional.conv1d(
RuntimeError: Calculated padded input size per channel: (6). Kernel size: (7). Kernel size can't be greater than actual input size

this is the code

`################ PRIOR GENERATION ################

# STEP 1: CREATE DUMMY INPUT TENSOR

generation_length = 2**18  # approximately 6s at 48kHz
x = torch.randn(1, 1, generation_length)  # dummy input
z = rave.encode(x)  # dummy latent representation
z = torch.zeros_like(z)

# STEP 2: AUTOREGRESSIVE GENERATION

z = prior.quantized_normal.encode(prior.diagonal_shift(z))
z = prior.generate(z)
z = prior.diagonal_shift.inverse(prior.quantized_normal.decode(z))

# STEP 3: SYNTHESIS AND EXPORT

y = rave.decode(z)
sf.write("output_audio.wav", y.reshape(-1).numpy(), sr)
`

when i change generation_length to a smaller size, i get the error

RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

my audio is in 44.1kHz
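
For reference, the comment in the snippet gives the duration at 48 kHz, while the audio here is at 44.1 kHz. A minimal sketch of the arithmetic follows; `duration_seconds` is a hypothetical helper written for illustration and is not part of RAVE.

```python
def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Length of a mono signal in seconds, given its sample count and rate."""
    return num_samples / sample_rate


generation_length = 2**18  # value used in the snippet above

at_48k = duration_seconds(generation_length, 48000)
at_44k = duration_seconds(generation_length, 44100)

print(f"{at_48k:.2f} s at 48 kHz, {at_44k:.2f} s at 44.1 kHz")
```

The same sample count therefore covers a slightly longer stretch of audio at the lower rate.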

@robinmeier
Author

ps: reconstruction is working though

@caillonantoine
Collaborator

Can you check the size of z before decoding?

@robinmeier
Author

when i do z.size() i get
[1, 128, 0]
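
A shape like this can be caught before it reaches the decoder. The sketch below is a hypothetical guard, not part of RAVE: `check_latent_shape` is a name made up for illustration, and it works on a plain list of sizes so it runs without PyTorch. With a real tensor one would pass `list(z.shape)`.

```python
from typing import Sequence


def check_latent_shape(shape: Sequence[int]) -> None:
    """Raise ValueError if any dimension of the latent shape is zero."""
    empty_axes = [axis for axis, size in enumerate(shape) if size == 0]
    if empty_axes:
        raise ValueError(
            f"latent tensor is empty along axes {empty_axes}: shape={list(shape)}"
        )


# A non-empty latent passes silently.
check_latent_shape([1, 128, 128])

# The shape reported above fails with a readable message.
try:
    check_latent_shape([1, 128, 0])
except ValueError as err:
    message = str(err)
    print(message)
```

Failing early like this points at the time axis directly, rather than at a kernel size deep inside the decoder.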

@caillonantoine
Collaborator

So your tensor is empty... what does generation_length equal?

@robinmeier
Author

generation_length = 2**18

@caillonantoine
Collaborator

Didn't you lower it ?

@robinmeier
Author

with generation_length = 2**17 i get

i = i // self.groups
Traceback (most recent call last):
  File "/home/syrinx/RAVE/ravezeke1-generate.py", line 37, in <module>
    z = prior.quantized_normal.encode(prior.diagonal_shift(z))
  File "/home/syrinx/RAVE/prior/core.py", line 24, in encode
    return self.to_stack_one_hot(x.long())
  File "/home/syrinx/RAVE/prior/core.py", line 29, in to_stack_one_hot
    x = x.reshape(x.shape[0], x.shape[1], -1)
RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

and in any case i get the following warning

/home/syrinx/RAVE/prior/core.py:52: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
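
The difference the warning describes only shows up for negative operands. Here is a small pure-Python illustration; `trunc_div` and `floor_div` are hypothetical names, and plain integers stand in for tensors. The PyTorch equivalents are the `torch.div(a, b, rounding_mode=...)` calls named in the warning itself.

```python
import math


def trunc_div(a: int, b: int) -> int:
    """Divide and round toward zero, like rounding_mode='trunc'."""
    return math.trunc(a / b)


def floor_div(a: int, b: int) -> int:
    """Divide and round toward negative infinity, like rounding_mode='floor'."""
    return a // b


# Non-negative operands: both modes agree.
print(trunc_div(7, 2), floor_div(7, 2))    # 3 3

# Negative operand: the two modes differ by one.
print(trunc_div(-7, 2), floor_div(-7, 2))  # -3 -4
```

If the values divided in `i = i // self.groups` are never negative, which seems likely for an index and a group count, the two modes give the same result and the warning would not explain the errors above.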

@caillonantoine
Collaborator

Mhhh... how many dimensions are in the latent space? I'm guessing the problem comes from the diagonal shift, which requires at least 2 x latent_dim time steps to operate!
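
Taking that guess at face value, the minimum input length can be estimated as below. This is only a sketch under stated assumptions: the factor of 2 is the hypothesis above, not a confirmed requirement; `latent_dim = 128` is read off the `[1, 128, 0]` shape reported earlier; and `samples_per_step = 2048` is an assumed number of audio samples per latent time step. `min_generation_length` is a hypothetical helper, not a RAVE function.

```python
def min_generation_length(
    latent_dim: int, samples_per_step: int, factor: int = 2
) -> int:
    """Smallest sample count that yields factor * latent_dim latent steps."""
    return factor * latent_dim * samples_per_step


# Assumed values, see the note above.
needed = min_generation_length(latent_dim=128, samples_per_step=2048)

print(needed, needed == 2**19)
```

Under these assumptions the dummy input would need 2**19 samples, twice the 2**18 used in the original snippet.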

@robinmeier
Author

sorry, how do i see the number of dims inside the latent space?
i was using the default values of the cli helper:

choose a name for the training: starling2
path to the .wav files: /home/syrinx/robin-voizo/zekestar/b1159/wavs/
temporary folder (fast drive): /home/syrinx/RAVE/
sampling rate (defaults to 48000): 44100
multiband number (defaults to 16):
training example duration (defaults to 65536 samples):
model capacity (defaults to 6):
number of steps for stage 1 (defaults to 1000000):
prior resolution (defaults to 32):
reconstruction fidelity (defaults to 0.95):
latency compensation (defaults to false):

@caillonantoine
Collaborator

It's the size of the dummy latent representation

@robinmeier
Author

robinmeier commented Dec 19, 2021

with generation_length 2**17, z.size() is torch.Size([1, 128, 64])
with generation_length 2**18, it's torch.Size([1, 128, 128])
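
These two sizes imply a fixed number of audio samples per latent time step. A quick check of that arithmetic follows; `samples_per_latent_step` is a hypothetical helper, and the only inputs are the sizes reported in this comment.

```python
def samples_per_latent_step(generation_length: int, latent_steps: int) -> int:
    """Audio samples covered by one latent time step, from reported sizes."""
    if latent_steps <= 0:
        raise ValueError("latent_steps must be positive")
    return generation_length // latent_steps


# generation_length -> size of the last axis of z, as reported above.
reported = {2**17: 64, 2**18: 128}

ratios = {
    length: samples_per_latent_step(length, steps)
    for length, steps in reported.items()
}

print(ratios)
```

Both cases work out to 2048 samples per step. In neither case does the number of time steps exceed the 128 latent dimensions.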

i'm getting my checkpoints from here, in case this could be relevant

rave = RAVE.load_from_checkpoint("/home/syrinx/RAVE/runs/starling1/rave/version_0/checkpoints/best.ckpt", strict=False).eval()
prior = Prior.load_from_checkpoint("/home/syrinx/RAVE/runs/starling1/prior/version_0/checkpoints/best.ckpt", strict=False).eval()

@caillonantoine
Collaborator

OK, I see the problem! The README is quite wrong; I'm about to fix it ASAP.

@robinmeier
Author

awesome. thanks.
can i continue to run training sets or should i wait for the fix?

@caillonantoine
Collaborator

You can, and I've modified the instructions in the readme :)

@robinmeier
Author

ah yes of course - helps to load the model :)
thank you!
