
Fix inefficient phase unwrapping #7

Merged
merged 4 commits on Jun 21, 2023

Conversation

@roebel (Contributor) commented Jun 16, 2023

Thanks for sharing the code of this study. Reading the paper on arXiv, I was astonished to see highly redundant phase unwrapping code. I explain the problem in the code extract below:

```
p = predicted_phase  # the network's predicted phase
# calculate cos/sin with phase wrapping
x = torch.cos(p)
y = torch.sin(p)
# recalculate the wrapped phase
phase = torch.atan2(y, x)
# calculate complex values (here with unit magnitude) from 1j * phase
S = torch.exp(phase * 1j)
```

I'd suggest changing to 100% equivalent code that is about three times faster:

```
p = predicted_phase  # the network's predicted phase
# calculate cos/sin with phase wrapping
x = torch.cos(p)
y = torch.sin(p)
S = x + 1j * y
```

You may not know that exp(1j*p) internally computes cos(p) + 1j*sin(p). On the other hand, recalculating the phase after the first pass through cos/sin does produce a wrapped phase, but it does not avoid any of the numerical problems that may arise when p >> 1; those problems already occurred inside cos/sin. Apart from additional rounding errors, the final S will always equal what the proposed change produces, while avoiding the call to atan2 as well as the call to exp(1j * p). In the end, this makes it about three times faster.
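The equivalence can be checked directly. This is a minimal scalar sketch using Python's `math`/`cmath` in place of the torch tensor ops from the PR, showing that the atan2/exp round trip reproduces `x + 1j*y` up to rounding, even for large phases:

```python
import cmath
import math

def via_atan2_exp(p: float) -> complex:
    # the redundant path: wrap the phase with atan2, then exponentiate
    x, y = math.cos(p), math.sin(p)
    phase = math.atan2(y, x)          # wrapped phase in (-pi, pi]
    return cmath.exp(1j * phase)      # exp(1j*phase) = cos(phase) + 1j*sin(phase)

def direct(p: float) -> complex:
    # the proposed path: build the unit-magnitude complex value directly
    x, y = math.cos(p), math.sin(p)
    return complex(x, y)              # x + 1j*y, no atan2/exp round trip

for p in (0.5, 3.0, 1234.567):        # includes a p >> 1 case
    a, b = via_atan2_exp(p), direct(p)
    assert abs(a - b) < 1e-12, (p, a, b)
```

Both paths agree to machine precision because wrapping only changes the phase by a multiple of 2π, which cos/sin cannot distinguish anyway.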

See here

https://github.com/roebel/vocos/blob/main/Improving_Vocos_vocoder_phase_unwrapping_issue.ipynb

for a benchmark.

Personally, if I were you, I would also fix this in the arXiv paper; it really looks strange to recalculate what you already have once the cos/sin operations have been performed.
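As a rough illustration of the relative cost, here is a scalar `timeit` micro-benchmark using only the stdlib (the linked notebook contains the actual torch benchmark; the iteration count here is arbitrary):

```python
import timeit

setup = "import math, cmath; p = 1234.567"

# redundant path: sin/cos, atan2, then a complex exponential
t_redundant = timeit.timeit(
    "cmath.exp(1j * math.atan2(math.sin(p), math.cos(p)))",
    setup=setup, number=200_000)

# direct path: sin/cos, then assemble the complex value
t_direct = timeit.timeit(
    "complex(math.cos(p), math.sin(p))",
    setup=setup, number=200_000)

print(f"redundant: {t_redundant:.3f}s  direct: {t_direct:.3f}s")
```

The exact speedup depends on the backend and tensor sizes, but the direct path does strictly less work per element.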

roebel and others added 3 commits June 16, 2023 11:47
@hubertsiuzdak (Collaborator) commented

Thanks for pointing this out. You're absolutely right, the current implementation is awkward and indeed redundant. Now it's clear to me that we can certainly simplify it using your suggestion.

Interestingly, when I trained the model with your proposed change up to 200k iterations, I noted a slight decrease in the UTMOS score. Although the forward pass seems perfectly equivalent, I suspect that the different backward computations might be influencing the training dynamics. However, I believe that with more runs, it should converge similarly.
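The forward equivalence, and the fact that any backward-pass difference can only come from rounding, can be probed numerically. This is an illustrative stdlib sketch using central differences on scalars, not the actual torch autograd computation:

```python
import cmath
import math

def f_redundant(p: float) -> complex:
    # atan2 round trip, as in the original implementation
    return cmath.exp(1j * math.atan2(math.sin(p), math.cos(p)))

def f_direct(p: float) -> complex:
    # proposed direct construction
    return complex(math.cos(p), math.sin(p))

def num_grad(f, p: float, h: float = 1e-6) -> complex:
    # central-difference derivative of a complex-valued function of p
    return (f(p + h) - f(p - h)) / (2 * h)

p = 2.0
# analytic derivative of e^{ip} with respect to p is i * e^{ip}
g_true = 1j * cmath.exp(1j * p)
assert abs(num_grad(f_redundant, p) - g_true) < 1e-8
assert abs(num_grad(f_direct, p) - g_true) < 1e-8
```

Both paths have the same analytic gradient, so any training difference should indeed wash out over multiple runs.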

Regarding the arXiv paper, yes, it's likely to be updated. Please note that it's not yet peer-reviewed and should be considered a work in progress.

@roebel (Contributor, PR author) commented Jun 20, 2023

> Interestingly, when I trained the model with your proposed change up to 200k iterations, I noted a slight decrease in the UTMOS score. Although the forward pass seems perfectly equivalent, I suspect that the different backward computations might be influencing the training dynamics. However, I believe that with more runs, it should converge similarly.

I don't know what you consider a slight decrease. To see whether this is a systematic or a random effect, you would need to train multiple times, compute the mean and standard deviation of the UTMOS score, and then check whether the slight decrease lies outside the standard deviation that is naturally present because all initializations are random. It is tempting to look only at a single result, but that often leads to wrong conclusions. Given that the two forward paths differ only by rounding errors, I would be very surprised if your original approach turned out to be systematically better.
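The suggested sanity check could be sketched with the stdlib `statistics` module; the scores below are made-up placeholders, not real UTMOS measurements:

```python
import statistics

# Hypothetical UTMOS scores from repeated training runs (placeholder numbers,
# purely for illustration)
baseline_runs = [3.81, 3.84, 3.79, 3.86]
changed_runs = [3.80, 3.83, 3.78, 3.82]

mean_b = statistics.mean(baseline_runs)
std_b = statistics.stdev(baseline_runs)
mean_c = statistics.mean(changed_runs)
std_c = statistics.stdev(changed_runs)

# A difference smaller than the run-to-run spread is not evidence of a
# systematic effect.
diff = abs(mean_b - mean_c)
spread = max(std_b, std_c)
print(f"diff={diff:.4f}, spread={spread:.4f}, systematic={diff > 2 * spread}")
```

With only one run per variant, there is no spread estimate at all, which is the point being made above.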

I just wondered whether this change makes a significant difference to the overall computational cost in the end. atan/cos/sin are costly, but since they are computed only in the output layer, it does not help that much.

@hubertsiuzdak (Collaborator) left a comment
I'd be happy to accept this PR, but I guess we no longer need this notebook in the main codebase after merging, so I suggest removing it.

@hubertsiuzdak hubertsiuzdak merged commit 748219e into gemelo-ai:main Jun 21, 2023