
Fix inefficient phase unwrapping #7

Merged
merged 4 commits on Jun 21, 2023

Conversation

@roebel (Contributor) commented Jun 16, 2023

Thanks for sharing the code of this study. Reading the paper on arXiv, I was astonished to see highly redundant phase unwrapping code. I explain the problem in the code extract below:

```
p = predicted_phase  # the network's predicted phase
# calculate cos/sin with phase wrapping
x = torch.cos(p)
y = torch.sin(p)
# recalculate the wrapped phase
phase = torch.atan2(y, x)
# calculate complex values (here with unit magnitude) from 1j * phase
S = torch.exp(phase * 1j)
```

I'd suggest changing to 100% equivalent code that is about three times faster:

```
p = predicted_phase  # the network's predicted phase
# calculate cos/sin with phase wrapping
x = torch.cos(p)
y = torch.sin(p)
S = x + 1j * y
```

You may not know that exp(1j*p) internally computes cos(p) + 1j*sin(p). On the other hand, recalculating the phase after the first pass through cos/sin does produce a wrapped phase, but it does not avoid any of the numerical problems that may arise when p >> 1; those problems already occurred inside cos/sin. Apart from additional rounding errors, the final S will always equal what the proposed change produces, while avoiding the call to atan2 as well as the call to exp(1j * p). In the end, this makes it about three times faster.
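The equivalence can be checked directly. This is a minimal scalar sketch using Python's `math`/`cmath` in place of the torch tensor ops from the PR, showing that the atan2/exp round trip reproduces `x + 1j*y` up to rounding, even for large phases:

```python
import cmath
import math

def via_atan2_exp(p: float) -> complex:
    # the redundant path: wrap the phase with atan2, then exponentiate
    x, y = math.cos(p), math.sin(p)
    phase = math.atan2(y, x)          # wrapped phase in (-pi, pi]
    return cmath.exp(1j * phase)      # exp(1j*phase) = cos(phase) + 1j*sin(phase)

def direct(p: float) -> complex:
    # the proposed path: build the unit-magnitude complex value directly
    x, y = math.cos(p), math.sin(p)
    return complex(x, y)              # x + 1j*y, no atan2/exp round trip

for p in (0.5, 3.0, 1234.567):        # includes a p >> 1 case
    a, b = via_atan2_exp(p), direct(p)
    assert abs(a - b) < 1e-12, (p, a, b)
```

Both paths agree to machine precision because wrapping only changes the phase by a multiple of 2π, which cos/sin cannot distinguish anyway.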

See here

https://github.com/roebel/vocos/blob/main/Improving_Vocos_vocoder_phase_unwrapping_issue.ipynb

for a benchmark.

Personally, if I were you, I would also fix this in the arXiv paper; it really looks strange to recalculate what you already have once the cos/sin operations have been performed.
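As a rough illustration of the relative cost, here is a scalar `timeit` micro-benchmark using only the stdlib (the linked notebook contains the actual torch benchmark; the iteration count here is arbitrary):

```python
import timeit

setup = "import math, cmath; p = 1234.567"

# redundant path: sin/cos, atan2, then a complex exponential
t_redundant = timeit.timeit(
    "cmath.exp(1j * math.atan2(math.sin(p), math.cos(p)))",
    setup=setup, number=200_000)

# direct path: sin/cos, then assemble the complex value
t_direct = timeit.timeit(
    "complex(math.cos(p), math.sin(p))",
    setup=setup, number=200_000)

print(f"redundant: {t_redundant:.3f}s  direct: {t_direct:.3f}s")
```

The exact speedup depends on the backend and tensor sizes, but the direct path does strictly less work per element.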

roebel and others added 3 commits June 16, 2023 11:47
@hubertsiuzdak (Collaborator) commented

Thanks for pointing this out. You're absolutely right, the current implementation is awkward and indeed redundant. Now it's clear to me that we can certainly simplify it using your suggestion.

Interestingly, when I trained the model with your proposed change up to 200k iterations, I noted a slight decrease in the UTMOS score. Although the forward pass seems perfectly equivalent, I suspect that the different backward computations might be influencing the training dynamics. However, I believe that with more runs, it should converge similarly.
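The forward equivalence, and the fact that any backward-pass difference can only come from rounding, can be probed numerically. This is an illustrative stdlib sketch using central differences on scalars, not the actual torch autograd computation:

```python
import cmath
import math

def f_redundant(p: float) -> complex:
    # atan2 round trip, as in the original implementation
    return cmath.exp(1j * math.atan2(math.sin(p), math.cos(p)))

def f_direct(p: float) -> complex:
    # proposed direct construction
    return complex(math.cos(p), math.sin(p))

def num_grad(f, p: float, h: float = 1e-6) -> complex:
    # central-difference derivative of a complex-valued function of p
    return (f(p + h) - f(p - h)) / (2 * h)

p = 2.0
# analytic derivative of e^{ip} with respect to p is i * e^{ip}
g_true = 1j * cmath.exp(1j * p)
assert abs(num_grad(f_redundant, p) - g_true) < 1e-8
assert abs(num_grad(f_direct, p) - g_true) < 1e-8
```

Both paths have the same analytic gradient, so any training difference should indeed wash out over multiple runs.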

Regarding the arXiv paper, yes, it's likely to be updated. Please note that it's not yet peer-reviewed and should be considered a work in progress.

@roebel (Contributor, PR author) commented Jun 20, 2023

> Interestingly, when I trained the model with your proposed change up to 200k iterations, I noted a slight decrease in the UTMOS score. Although the forward pass seems perfectly equivalent, I suspect that the different backward computations might be influencing the training dynamics. However, I believe that with more runs, it should converge similarly.

I don't know what you consider a slight decrease. To see whether this is a systematic or a random effect, you would need to train multiple times, compute the mean and standard deviation of the UTMOS score, and then check whether the slight decrease lies outside the standard deviation that is naturally present because all initializations are random. It is tempting to look only at a single result, but that often leads to wrong conclusions. Given that the two forward paths differ only by rounding errors, I would be very surprised if your original approach turned out to be systematically better.
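The suggested sanity check could be sketched with the stdlib `statistics` module; the scores below are made-up placeholders, not real UTMOS measurements:

```python
import statistics

# Hypothetical UTMOS scores from repeated training runs (placeholder numbers,
# purely for illustration)
baseline_runs = [3.81, 3.84, 3.79, 3.86]
changed_runs = [3.80, 3.83, 3.78, 3.82]

mean_b = statistics.mean(baseline_runs)
std_b = statistics.stdev(baseline_runs)
mean_c = statistics.mean(changed_runs)
std_c = statistics.stdev(changed_runs)

# A difference smaller than the run-to-run spread is not evidence of a
# systematic effect.
diff = abs(mean_b - mean_c)
spread = max(std_b, std_c)
print(f"diff={diff:.4f}, spread={spread:.4f}, systematic={diff > 2 * spread}")
```

With only one run per variant, there is no spread estimate at all, which is the point being made above.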

I just wondered whether this change makes a significant difference to the overall computational cost in the end. atan/cos/sin are costly, but since they are computed only in the output layer, it does not help that much.

@hubertsiuzdak (Collaborator) left a comment
I'd be happy to accept this PR, but I guess we no longer need this notebook in the main codebase after merging, so I suggest removing it.

@hubertsiuzdak hubertsiuzdak merged commit 748219e into gemelo-ai:main Jun 21, 2023