Add ONNX export for Pytorch Model #93

Open
H-G-11 opened this issue Sep 20, 2023 · 1 comment

H-G-11 commented Sep 20, 2023

Hello,

I have been using the PyTorch model on a Raspberry Pi. Every 200 ms, I run it on a 2-second audio window to detect a wake word (see this issue for why I am not running it on independent 200 ms chunks). Each run takes between 10 ms and 40 ms.

Performance is still good, but could be improved with ONNX. I therefore tried to export the model to ONNX (code below) and got several errors (also below).

This is not an issue with this repository. PyTorch simply does not support exporting either istft or stft to ONNX. See this issue, which tracks it.

Nonetheless, on our end, we could perhaps use the ONNX STFT operator directly. As for istft, it seems they have very recently started considering adding it (see this issue), but it is not there yet.

What are your thoughts on this?

Note: as far as I know, once a model is exported to ONNX, its parameters cannot be changed. So a good approach here would probably be to allow the export through a to_onnx method on an instantiated TorchGate object. If we find a solution for this istft and stft issue, I'd be willing to make a PR for it :)

Note 2: From here and here, it seems that we might just have to wait until torch.onnx supports opset 19, which should contain the other operators. Not sure, though.

Annex:

Code to export to ONNX (to be put here):

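if __name__ == "__main__":
    # Appended to the module that defines TorchGate, so TorchGate needs no import here.
    import torch
    import torchaudio

    # Example input used to trace the model; shape is (channels, samples).
    data, _ = torchaudio.load("path/to/test/file.wav")

    model = TorchGate(
        sr=16000,
        nonstationary=False,
        n_fft=1024,
        prop_decrease=0.8,
        n_std_thresh_stationary=2,
        freq_mask_smooth_hz=None,
        time_mask_smooth_ms=None,
    )

    # This call raises the error shown below: aten::stft cannot be exported at opset 17.
    torch.onnx.export(
        model,
        data,
        "noise_suppression.onnx",
        verbose=True,
        input_names=["x"],
        output_names=["y"],
        opset_version=17,
    )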

Error:

Exporting the operator 'aten::stft' to ONNX opset version 17 is not supported. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues.

nuniz commented Sep 23, 2023

Hi HuguesGallier,
Thanks for the feedback. I'm glad to see that you are using the spectral gate as an nn.Module :-).

There are two options. The STFT and iSTFT operations can be performed externally (you would need to remove the STFT and iSTFT computations from the spectral gating code), or you can implement the STFT as an nn.Module using conv1d and a precomputed Fourier basis and integrate it with the spectral gate (see this issue: pytorch/pytorch#31317). Since STFT export is scheduled to be supported in the next opset, we think it is unnecessary to add it to noisereduce.
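
To make the conv1d idea concrete, here is a minimal sketch of an STFT written as an nn.Module with a precomputed, Hann-windowed Fourier basis. The class name ConvSTFT, its defaults, and the (batch, 2, bins, frames) output layout are assumptions made for illustration; none of this is part of noisereduce or PyTorch, and whether it exports cleanly at a given opset would still need to be checked.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvSTFT(nn.Module):
    """STFT built from conv1d and a precomputed Fourier basis (hypothetical helper)."""

    def __init__(self, n_fft: int = 1024, hop_length: int = 256):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        self.n_bins = n_fft // 2 + 1

        # Angles 2*pi*k*n/n_fft for frequency bin k and sample n (float64 for accuracy).
        n = torch.arange(n_fft, dtype=torch.float64)
        k = torch.arange(self.n_bins, dtype=torch.float64).unsqueeze(1)
        angles = 2.0 * math.pi * k * n / n_fft
        window = torch.hann_window(n_fft, periodic=True, dtype=torch.float64)

        # First n_bins rows produce the real part, the remaining rows the imaginary part.
        basis = torch.cat([torch.cos(angles) * window, -torch.sin(angles) * window])
        # conv1d weights have shape (out_channels, in_channels, kernel_size).
        self.register_buffer("basis", basis.float().unsqueeze(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, samples) -> (batch, 2, n_bins, frames), real and imaginary stacked.
        pad = self.n_fft // 2
        x = F.pad(x.unsqueeze(1), (pad, pad), mode="reflect")  # mimics center=True
        out = F.conv1d(x, self.basis, stride=self.hop_length)
        return torch.stack([out[:, : self.n_bins], out[:, self.n_bins :]], dim=1)


if __name__ == "__main__":
    # Compare against torch.stft on random audio as a sanity check.
    audio = torch.randn(1, 32000)
    ours = ConvSTFT(n_fft=1024, hop_length=256)(audio)
    ref = torch.stft(
        audio, n_fft=1024, hop_length=256,
        window=torch.hann_window(1024), return_complex=True,
    )
    print((ours[:, 0] - ref.real).abs().max(), (ours[:, 1] - ref.imag).abs().max())
```

The forward pass uses only padding and a strided convolution, which is why this formulation is often suggested as a workaround when aten::stft cannot be exported. An inverse transform would need a matching transposed convolution plus window normalization, which this sketch does not cover.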

By the way, if you are running the spectral gating on only 2 seconds of audio, that may not be enough, since it expects both noise and speech to be present in the same recording. I suggest that you capture the noise profile separately and pass it to the y_noise argument. We may add the ability to continuously stream noise statistics to the streamer function in the future.

I hope this is helpful!
