export model to onnx without FFT/IFFT #530

Open
netv1 opened this issue Jul 24, 2023 · 8 comments
Labels
question Further information is requested

Comments


netv1 commented Jul 24, 2023

❓ Questions

Hi guys, I've tried to export the model to ONNX, but the export seems to fail because of the FFT (real-to-complex) operations in the model. I'm using PyTorch 2.0.1, the latest supported version, and ONNX opset 18. I want to use Demucs from C/C++; I can do the FFT part directly in C/C++ and feed the result to the model along with the raw audio. However, I have no idea how to export the model without the FFT parts.

E.g., I want to supply the FFT and the raw audio myself, get back the data for the inverse FFT together with the time-domain data, and add them myself. PyTorch is not my main area of expertise, as you can tell, but I can do the DSP pre/post-processing in C/C++.

What's the best way to go about this (exporting a partial graph, or a model with multiple inputs/outputs)? I would appreciate any help or hints. Thanks.
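A minimal sketch of the "multiple inputs/outputs" option, for illustration only. `ToyHybridCore` is a hypothetical stand-in, not the real Demucs network: it only shows how a module whose `forward` takes a precomputed spectrogram (real and imaginary parts stacked as real-valued channels, so no complex tensors cross the graph boundary) plus the raw waveform, and returns a spectral output and a time-domain output, can be passed to `torch.onnx.export` with named inputs and outputs. The tensor names, shapes and file name are assumptions.

```python
import torch
from torch import nn


class ToyHybridCore(nn.Module):
    """Hypothetical stand-in for the network body; NOT the real Demucs layers."""

    def __init__(self, channels: int = 2, sources: int = 4):
        super().__init__()
        # Spectral branch: real and imaginary parts stacked on the channel axis.
        self.spec_branch = nn.Conv2d(2 * channels, 2 * channels * sources, 3, padding=1)
        # Time branch: operates directly on the raw waveform.
        self.time_branch = nn.Conv1d(channels, channels * sources, 3, padding=1)

    def forward(self, spec_ri, wave):
        # spec_ri: (batch, 2 * channels, freq, frames), computed by the caller
        #          (for example by an FFT library on the C/C++ side).
        # wave:    (batch, channels, samples)
        spec_out = self.spec_branch(spec_ri)  # caller applies the inverse FFT
        time_out = self.time_branch(wave)     # caller adds this to the result
        return spec_out, time_out


def export_core(path: str = "core.onnx") -> None:
    """Export the two-input, two-output graph; the shapes are illustrative."""
    model = ToyHybridCore().eval()
    spec_ri = torch.randn(1, 4, 64, 32)
    wave = torch.randn(1, 2, 4096)
    torch.onnx.export(
        model,
        (spec_ri, wave),  # one tuple entry per forward() argument
        path,
        input_names=["spec_ri", "wave"],
        output_names=["spec_out", "time_out"],
        opset_version=17,
    )
```

Calling `export_core()` would write a graph with two named inputs and two named outputs. Whether the real model can be split this cleanly depends on where its FFT and inverse FFT calls sit inside `forward`.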

@netv1 netv1 added the question Further information is requested label Jul 24, 2023

Bin-ze commented Aug 17, 2023

I want to convert the model to ONNX for deployment on embedded devices, but I can't do it with a simple export, because inference goes through the processing in the apply_model function. Can you tell me how to convert the model to ONNX? Also, have you implemented inference using an ONNX backend?
Looking forward to your reply; I would be very grateful.


netv1 commented Aug 21, 2023

Unfortunately I haven't made any progress on this. Maybe the community will be kind enough to at least steer us in the right direction.


Bin-ze commented Aug 22, 2023

Thanks for the replies, here are some possible relevant discussions I found in the community:

  1. [ONNX][Complex] Support view_as_complex pytorch/pytorch#49793
  2. https://github.com/adobe-research/convmelspec
  3. [ONNX] STFT Support pytorch/pytorch#92087
  4. [ONNX] Support opset 17 operators pytorch/pytorch#81075
  5. Exporting the operator stft to ONNX opset version 9 is not supported speechbrain/speechbrain#1455

I am currently trying to avoid the FFT part everywhere, to work around the fact that ONNX does not support some operators.

I don't know the audio field at all. I checked the code, but I can't follow the recursion in apply_model, and I can't tell what exactly BagOfModels implements. I think that if you want to convert this algorithm to ONNX completely, you have to turn the recursive implementation into a loop and then export the corresponding models separately. I hope to get your help: if you have already figured out this part, can you tell me how to do it?
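A rough sketch of what "recursion turned into a loop" could look like. It assumes the recursion boils down to two nested steps: average the outputs of several models, and run each model on fixed-length overlapping chunks that are cross-faded back together. `separate_chunked` and all of its parameters are hypothetical names; the real apply_model has further options (for example random time shifts and per-source weights) that are left out here. Each entry of `models` could then be a separately exported fixed-shape graph.

```python
import torch
import torch.nn.functional as F


def separate_chunked(models, weights, mix, segment, overlap=0.25):
    """Weighted average over `models`, each applied chunk by chunk.

    models:  callables mapping (batch, channels, n) -> (batch, sources, channels, n)
    weights: one float per model (an assumption; a real bag may weight per source)
    mix:     tensor of shape (batch, channels, length)
    segment: fixed chunk length in samples
    """
    length = mix.shape[-1]
    stride = max(1, int((1 - overlap) * segment))
    # Triangular window so that overlapping chunks cross-fade into each other.
    window = torch.cat([
        torch.arange(1, segment // 2 + 1),
        torch.arange(segment - segment // 2, 0, -1),
    ]).to(mix.dtype)

    total = None
    for model, weight in zip(models, weights):
        out = None
        norm = torch.zeros(length, dtype=mix.dtype)
        for start in range(0, length, stride):
            chunk = mix[..., start:start + segment]
            n = chunk.shape[-1]
            # Zero-pad the last chunk so the model always sees `segment`
            # samples, which is what a fixed-shape exported graph expects.
            est = model(F.pad(chunk, (0, segment - n)))[..., :n]
            if out is None:
                out = torch.zeros(est.shape[:-1] + (length,), dtype=mix.dtype)
            out[..., start:start + n] += window[:n] * est
            norm[start:start + n] += window[:n]
        out = out / norm  # undo the window weighting
        total = weight * out if total is None else total + weight * out
    return total / sum(weights)
```

With a single model and a single weight the outer loop collapses, and what remains is plain overlap-add over chunks.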

@alexvoina

I'm interested in doing this too


netv1 commented Sep 11, 2023

Unfortunately I still haven't had the time to revisit this (I'm still planning to, maybe next month). It is 100% doable as both VirtualDJ and Serato are using it, so ...

@alexvoina

Let's keep in touch; we could join forces! I have some experience with this kind of work.

@AdarshAcharya5

If you're using HTDemucs or HDemucs, you can actually move the STFT and ISTFT outside the model's forward call, as they are only used at the beginning and the end of the call. I did the same and almost managed to convert it; however, it seems ONNX doesn't support the nn.MultiheadAttention operator in its opset yet. Unfortunately the tracer doesn't show the exact line where it fails to parse. I think the only way around this problem is to write our own multi-head attention function.
For context, this is the exception it throws:

raise errors.UnsupportedOperatorError(
torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::_native_multi_head_attention' to ONNX opset version 17 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.
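A minimal sketch of the "write our own multi-head attention" idea, covering self-attention only and assuming the `batch_first` layout. `ManualSelfAttention` is a hypothetical class: it rebuilds the computation from `nn.Linear`, `matmul` and `softmax`, and copies its weights from an existing `nn.MultiheadAttention`. It assumes the packed `in_proj_weight` layout with biases enabled, and it ignores masks, dropout and cross-attention, so it is not a drop-in replacement for every use in a transformer.

```python
import math

import torch
from torch import nn


class ManualSelfAttention(nn.Module):
    """Self-attention built from Linear, matmul and softmax only."""

    def __init__(self, embed_dim: int, num_heads: int):
        super().__init__()
        assert embed_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.in_proj = nn.Linear(embed_dim, 3 * embed_dim)
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    @classmethod
    def from_torch(cls, mha: nn.MultiheadAttention) -> "ManualSelfAttention":
        # Assumes query, key and value share embed_dim (packed projection)
        # and that the layer was created with bias=True.
        new = cls(mha.embed_dim, mha.num_heads)
        with torch.no_grad():
            new.in_proj.weight.copy_(mha.in_proj_weight)
            new.in_proj.bias.copy_(mha.in_proj_bias)
            new.out_proj.weight.copy_(mha.out_proj.weight)
            new.out_proj.bias.copy_(mha.out_proj.bias)
        return new

    def forward(self, x):
        # x: (batch, seq, embed_dim)
        b, t, e = x.shape
        q, k, v = self.in_proj(x).chunk(3, dim=-1)

        def split_heads(z):
            # (batch, seq, embed) -> (batch, heads, seq, head_dim)
            return z.reshape(b, t, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split_heads(q), split_heads(k), split_heads(v)
        scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(self.head_dim)
        attn = torch.softmax(scores, dim=-1)
        out = torch.matmul(attn, v).transpose(1, 2).reshape(b, t, e)
        return self.out_proj(out)
```

A quick check is to build the replacement with `ManualSelfAttention.from_torch(mha)` and compare its output with `mha(x, x, x)` on random input before swapping it into the model. Since it uses only basic operators, the exporter should not run into the fused attention operator named in the error above.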

@jie-chen

It seems only HTDemucs uses MultiheadAttention, in its transformer. So HDemucs should be good to go?
