onnx exported problem #8

Open
unparalleled-ysj opened this issue Nov 16, 2022 · 19 comments

Comments

@unparalleled-ysj

May I ask how you dealt with the "RuntimeError: Unknown number type: complex" error raised by torch.istft when exporting the model to ONNX?

@Jackiexiao

Jackiexiao commented Nov 28, 2022

torch.istft cannot currently be converted to ONNX; support is still in development. See: pytorch/pytorch#81075

@MasayaKawamura
Owner

Hi @unparalleled-ysj, sorry for the late reply.
torch.istft is currently not supported by ONNX, so please exclude only the istft part when exporting to ONNX.
@Jackiexiao, thank you for your comment!

@MasayaKawamura
Owner

Maybe this URL will also be useful for torch onnx.

@Jackiexiao

Jackiexiao commented Nov 28, 2022

I'm confused. We have to run the istft function to get the wav, so if we exclude the istft part, we can't get a result. @MasayaKawamura

@MasayaKawamura
Owner

@Jackiexiao
I think it is possible to export everything except istft to ONNX. During inference, the wav can then be obtained by combining the ONNX model with the torch istft code.
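
A minimal sketch of the post-processing half of that split, assuming the exported ONNX graph returns magnitude and phase arrays of shape (n_fft // 2 + 1, frames). The comment above suggests torch.istft for this step; the NumPy overlap-add below is a stand-in intended to mirror its default centered behavior, so the example runs without torch. The function name and the commented-out session call are hypothetical and not taken from the repository.

```python
import numpy as np


def istft_overlap_add(magnitude, phase, n_fft, hop_length, window):
    """Inverse STFT by overlap-add.

    magnitude, phase: arrays of shape (n_fft // 2 + 1, frames).
    window: synthesis window of length n_fft (same one used for analysis).
    Returns a 1-D waveform with the n_fft // 2 centering pad trimmed.
    """
    # Complex numbers are fine here because this runs outside the ONNX graph.
    spectrum = magnitude * np.exp(1j * phase)
    frames = np.fft.irfft(spectrum, n=n_fft, axis=0) * window[:, None]

    n_frames = frames.shape[1]
    length = n_fft + hop_length * (n_frames - 1)
    signal = np.zeros(length)
    envelope = np.zeros(length)
    for t in range(n_frames):
        start = t * hop_length
        signal[start:start + n_fft] += frames[:, t]
        envelope[start:start + n_fft] += window ** 2

    # Drop the padding added by a centered STFT, then undo the window overlap.
    pad = n_fft // 2
    signal = signal[pad:length - pad]
    envelope = envelope[pad:length - pad]
    return signal / np.maximum(envelope, 1e-8)


# Hypothetical usage with an exported model (names are assumptions):
# magnitude, phase = session.run(None, inputs)
# wav = istft_overlap_add(magnitude[0], phase[0], n_fft, hop_length, window)
```

For the multi-band models discussed in this thread, this step would cover only the istft itself; any filtering that follows it still has to run after this function.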

@Jackiexiao

OK, I get it. Looking forward to istft support in torch nightly, so that we need only ONNX during inference.

@unparalleled-ysj
Author

When exporting, you can use the inverse(self, magnitude, phase) method of the STFT class instead of the inverse(self, magnitude, phase) method of TorchSTFT. This exports the model to ONNX successfully, but the resulting speech contains rattling noise. My ablation comparison shows that the noise comes from the exported istft, because the original model does not have this problem.
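
For context, an inverse DFT can be written with real arithmetic only, which is the general idea that lets an export avoid complex tensors: the real and imaginary parts, magnitude * cos(phase) and magnitude * sin(phase), are multiplied by fixed cosine and sine basis matrices. The NumPy sketch below illustrates this for a batch of frames and an even n_fft. It is not the repository's code, and the function names are hypothetical.

```python
import numpy as np


def inverse_dft_basis(n_fft):
    """Real cosine/sine bases that reproduce np.fft.irfft for an even n_fft."""
    k = np.arange(n_fft // 2 + 1)[:, None]   # frequency bin index
    n = np.arange(n_fft)[None, :]            # time sample index
    angle = 2.0 * np.pi * k * n / n_fft
    # Interior bins appear twice in the full spectrum (conjugate symmetry);
    # the DC and Nyquist bins appear once.
    weight = np.full((n_fft // 2 + 1, 1), 2.0)
    weight[0] = 1.0
    weight[-1] = 1.0
    cos_basis = weight * np.cos(angle) / n_fft
    sin_basis = -weight * np.sin(angle) / n_fft
    return cos_basis, sin_basis


def real_inverse_frames(magnitude, phase, n_fft):
    """magnitude, phase: (frames, n_fft // 2 + 1) -> time frames (frames, n_fft)."""
    cos_basis, sin_basis = inverse_dft_basis(n_fft)
    real = magnitude * np.cos(phase)
    imag = magnitude * np.sin(phase)
    return real @ cos_basis + imag @ sin_basis
```

In a torch module, the same bases would typically be stored as buffers and applied with a transposed convolution whose stride is the hop length, which also performs the overlap-add.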

@unparalleled-ysj
Author

@Jackiexiao

thx

@FanhuaandLuomu

Hi @MasayaKawamura
Can you share your code for saving the ONNX model? I ran into some problems when converting to ONNX.

@Jackiexiao

FYI, see: https://github.com/wenet-e2e/wetts/blob/main/wetts/vits/export_onnx.py (note that you can't export the istft vocoder to ONNX there). @FanhuaandLuomu

@abylouw

abylouw commented Dec 21, 2022

FYI, see: https://github.com/wenet-e2e/wetts/blob/main/wetts/vits/export_onnx.py (note that you can't export the istft vocoder to ONNX there). @FanhuaandLuomu

hi @Jackiexiao,

I have tried the above script to export but I have had no success. Would you mind sharing your export code?

@abylouw

abylouw commented Dec 21, 2022

Hi @MasayaKawamura Can you share your code for saving the ONNX model? I ran into some problems when converting to ONNX.

Hi @FanhuaandLuomu, have you succeeded in exporting the model?

@Jackiexiao

@abylouw It only works for the original VITS (not for MB-iSTFT; the two work the same way except for the vocoder part), and the wetts repo has all the code you need.

@JohnHerry

When exporting, you can use the inverse(self, magnitude, phase) method of the STFT class instead of the inverse(self, magnitude, phase) method of TorchSTFT. This exports the model to ONNX successfully, but the resulting speech contains rattling noise. My ablation comparison shows that the noise comes from the exported istft, because the original model does not have this problem.

Do we need to use the class STFT instead of TorchSTFT during the training in this case?

@JohnHerry

@Jackiexiao I think it is possible to export everything except istft to ONNX. During inference, the wav can then be obtained by combining the ONNX model with the torch istft code.

I tried splitting MB-iSTFT-VITS into these two parts and converting the former to ONNX, and it succeeded. For MS-iSTFT-VITS, however, I have to split the model into three parts, of which the first and the third should be converted to ONNX models.

The third part is the multi-band filter, whose self.multistream_conv_post layer has a weight_norm. Should I keep the weight_norm there? I saw that the remove_weight_norm function in your class does not remove it. If the weight_norm can be removed when converting to ONNX, should I just copy the "dec.multistream_conv_post.weight_v" value from the checkpoint into my self-defined third part?
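
On the weight_v question: assuming the layer was wrapped with torch.nn.utils.weight_norm using its default dim=0, the effective weight is g * v / ||v||, with the norm taken per output channel. Copying weight_v alone would therefore drop the learned scale stored in weight_g. A NumPy sketch of the folding step follows; the function name is hypothetical, and the arrays stand in for the checkpoint's weight_g and weight_v entries.

```python
import numpy as np


def fold_weight_norm(weight_g, weight_v):
    """Recombine weight-norm parameters into a plain weight tensor.

    weight_v: direction tensor, e.g. (out_channels, in_channels, kernel_size).
    weight_g: per-output-channel scale, e.g. (out_channels, 1, 1).
    Assumes the norm was taken over every axis except axis 0 (dim=0).
    """
    reduce_axes = tuple(range(1, weight_v.ndim))
    norm = np.sqrt(np.sum(weight_v ** 2, axis=reduce_axes, keepdims=True))
    return weight_g * weight_v / norm
```

The folded tensor could then be loaded as the weight of an ordinary convolution in the hand-written third part, provided the layer shapes match the checkpoint.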

@nshmyrev

nshmyrev commented Oct 1, 2023

Do we need to use the class STFT instead of TorchSTFT during the training in this case?

You do not have to use STFT during training, only during export. See here

alphacep/MB-iSTFT-VITS2@29c91d4

see also

FENRlR/MB-iSTFT-VITS2#3

@Insensiblee

Do we need to use the class STFT instead of TorchSTFT during the training in this case?

You do not have to use STFT during training, only during export. See here

alphacep/MB-iSTFT-VITS2@29c91d4

see also

FENRlR/MB-iSTFT-VITS2#3

I used this code to export the provided pre-trained model to ONNX. Why do I get this error: AttributeError: 'ResidualCouplingLayer' object has no attribute 'remove_weight_norm'
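
The traceback says the export script calls remove_weight_norm() on a module that does not define it. One hypothetical workaround, not confirmed by anyone in this thread, is to call the method only where it exists and report what was skipped. The classes below are stand-ins so the sketch runs on its own; the helper name and the layer names are assumptions.

```python
def remove_weight_norm_if_present(named_modules):
    """Call remove_weight_norm() where defined; return names of skipped modules.

    named_modules: iterable of (name, module) pairs that the export script
    would otherwise call remove_weight_norm() on unconditionally.
    """
    skipped = []
    for name, module in named_modules:
        remover = getattr(module, "remove_weight_norm", None)
        if callable(remover):
            remover()
        else:
            skipped.append(name)
    return skipped


class WithRemoval:
    """Stand-in for a layer that defines remove_weight_norm()."""

    def __init__(self):
        self.removed = False

    def remove_weight_norm(self):
        self.removed = True


class WithoutRemoval:
    """Stand-in for a layer like the ResidualCouplingLayer in the error above."""


layers = [("dec", WithRemoval()), ("flow.0", WithoutRemoval())]
skipped = remove_weight_norm_if_present(layers)
print(skipped)  # prints ['flow.0']
```

Skipped layers keep their weight_norm wrappers, so whether the exported graph is still acceptable would need to be checked separately.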

@JohnHerry

JohnHerry commented Nov 10, 2023

Do we need to use the class STFT instead of TorchSTFT during the training in this case?

You do not have to use STFT during training, only during export. See here
alphacep/MB-iSTFT-VITS2@29c91d4
see also
FENRlR/MB-iSTFT-VITS2#3

I used this code to export the provided pre-trained model to ONNX. Why do I get this error: AttributeError: 'ResidualCouplingLayer' object has no attribute 'remove_weight_norm'

No, the STFT class from this project, which is the same as the one from the iSTFTNet project, is not well suited to ONNX export.
It can help generate an ONNX model for inference, but that model fails for some inputs, and even when it succeeds, the generated waveform contains some noise.
