onnx conversion #539

Open
Bin-ze opened this issue Aug 17, 2023 · 6 comments
Labels
question Further information is requested

Comments


Bin-ze commented Aug 17, 2023

❓ Questions

I want to convert the model to ONNX and deploy it on a development board. Is the conversion similar to that of a CNN model, that is, can the model be exported directly with torch.onnx.export?

Looking forward to your reply!

@Bin-ze Bin-ze added the question Further information is requested label Aug 17, 2023

Bin-ze commented Aug 22, 2023

I have put some effort into this. Here are some possibly relevant discussions I found in the community:

pytorch/pytorch#49793
https://github.com/adobe-research/convmelspec
pytorch/pytorch#92087
pytorch/pytorch#81075
speechbrain/speechbrain#1455

But there are some problems:

  1. The stft operator is not supported during export.
  2. The real_to_complex operator is not supported during export.
  3. I can't understand what the recursive call in apply_model is doing, or how to turn it into a logically clear loop, because I have no experience with audio algorithms.
  4. I don't understand how a BagOfModels should be broken down into ordinary models.
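For readers hitting problems 1 and 2: one workaround that comes up in discussions like those linked above is to replace the complex-valued STFT with fixed real-valued bases, so that the graph contains only multiply-adds and the real and imaginary parts travel as two separate real outputs. The sketch below is a pure-Python illustration of that idea, not code from this repository. The function names, the periodic Hann window and the absence of edge padding are all assumptions made for the example, so its framing will not match `torch.stft` defaults without further adjustment.

```python
import cmath
import math


def hann_window(n_fft):
    # Periodic Hann window (an assumption; use whatever window the model expects).
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]


def real_valued_stft(signal, n_fft, hop):
    """Return (real, imag): lists of frames, each with n_fft // 2 + 1 bins."""
    window = hann_window(n_fft)
    n_bins = n_fft // 2 + 1
    # Fixed, precomputed bases. In a network these could be the weights of two
    # 1-D convolutions with kernel size n_fft and stride hop.
    cos_basis = [[window[n] * math.cos(2 * math.pi * k * n / n_fft)
                  for n in range(n_fft)] for k in range(n_bins)]
    sin_basis = [[-window[n] * math.sin(2 * math.pi * k * n / n_fft)
                  for n in range(n_fft)] for k in range(n_bins)]
    real, imag = [], []
    # No padding at the edges: only frames that fit entirely are produced.
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft]
        real.append([sum(b * x for b, x in zip(row, frame)) for row in cos_basis])
        imag.append([sum(b * x for b, x in zip(row, frame)) for row in sin_basis])
    return real, imag


def complex_dft_bin(frame, window, k):
    # Reference: the usual complex-valued definition of one windowed DFT bin.
    n_fft = len(frame)
    return sum(w * x * cmath.exp(-2j * math.pi * k * n / n_fft)
               for n, (w, x) in enumerate(zip(window, frame)))
```

Comparing `real_valued_stft` against `complex_dft_bin` on a short test signal is a cheap way to check the bases before porting them into a model.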

To work around problems 1 and 2, I tried rewriting the network's forward function so that the STFT is executed outside the model, but I could not instantiate the model directly:

    from fractions import Fraction

    import torch
    from demucs.htdemucs import HTDemucs

    # Hyperparameters of the model I am trying to rebuild
    args = {'sources': ['drums', 'bass', 'other', 'vocals', 'guitar', 'piano'],
            'audio_channels': 2,
            'samplerate': 44100,
            'segment': Fraction(39, 5),
            'channels': 48,
            'channels_time': None,
            'growth': 2,
            'nfft': 4096,
            'wiener_iters': 0,
            'end_iters': 0,
            'wiener_residual': False,
            'cac': True, 'depth': 4,
            'rewrite': True, 'multi_freqs': [],
            'multi_freqs_depth': 3, 'freq_emb': 0.2,
            'emb_scale': 10, 'emb_smooth': True,
            'kernel_size': 8, 'stride': 4, 'time_stride': 2,
            'context': 1, 'context_enc': 0, 'norm_starts': 4,
            'norm_groups': 4, 'dconv_mode': 3, 'dconv_depth': 2,
            'dconv_comp': 8, 'dconv_init': 0.001, 'bottom_channels': 0,
            't_layers': 5, 't_hidden_scale': 4.0, 't_heads': 8, 't_dropout': 0.02,
            't_layer_scale': True, 't_gelu': True, 't_emb': 'sin', 't_max_positions': 10000,
            't_max_period': 10000.0, 't_weight_pos_embed': 1.0, 't_cape_mean_normalize': True,
            't_cape_augment': True, 't_cape_glob_loc_scale': [5000.0, 1.0, 1.4], 't_sin_random_shift': 0,
            't_norm_in': True, 't_norm_in_group': False, 't_group_norm': False, 't_norm_first': True,
            't_norm_out': True, 't_weight_decay': 0.0, 't_lr': None, 't_sparse_self_attn': False,
            't_sparse_cross_attn': False, 't_mask_type': 'diag', 't_mask_random_seed': 42,
            't_sparse_attn_window': 400, 't_global_window': 100, 't_sparsity': 0.95,
            't_auto_sparsity': False, 't_cross_first': False, 'rescale': 0.1}

    # Direct instantiation attempt (this is what does not work for me)
    model = HTDemucs(args)

    # Load the trained weights on CPU; checkpoint_path points to the checkpoint file
    checkpoint_dict = torch.load(checkpoint_path, map_location='cpu')
    state_dict = checkpoint_dict['state']
    model.load_state_dict(state_dict)
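One possible cause of the instantiation failure, offered as a guess: if the constructor takes `sources` as its first parameter followed by keyword options, then passing the whole dict as a single positional argument binds the dict to `sources` and leaves every other option at its default. Unpacking with `**` hands each key to the matching keyword parameter instead. The sketch below shows the difference with a hypothetical stand-in class, `TinyModel`, which is not part of any library, and the assumption about the real constructor's signature should be checked against the source.

```python
class TinyModel:
    # Hypothetical stand-in: "sources" first, then keyword options.
    def __init__(self, sources, audio_channels=2, samplerate=44100):
        self.sources = sources
        self.audio_channels = audio_channels
        self.samplerate = samplerate


args = {"sources": ["drums", "bass"], "audio_channels": 1, "samplerate": 16000}

# Positional: the whole dict lands in `sources`, the other options keep defaults.
wrong = TinyModel(args)

# Unpacked: each key is matched to the keyword parameter of the same name.
right = TinyModel(**args)
```

Note that unpacking raises a `TypeError` if the dict contains a key the constructor does not accept, so any such keys would have to be dropped first.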

The repository itself builds the model like this:

    model = get_model_from_args(args)

When the model is built through this call, my changes have no effect.

I have tried many things without making any progress.
I would be very grateful if someone could help me.
If this task is particularly difficult, please let me know.
@adefossez
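On problems 3 and 4, a hedged reading for anyone following along: recursive inference helpers of this kind usually peel off one concern per call (ensemble first, then chunking, then the plain forward pass), and that maps onto nested loops. The sketch below is not taken from the demucs source. It assumes a "model" is any callable that maps a list of samples to a list of the same length, that a bag is a list of such callables with one scalar weight each, and that chunks are blended with triangular weights. All names are hypothetical, and real separation models return one signal per source rather than a single list.

```python
def triangle_weights(length):
    # Larger weight in the middle of a chunk, smaller near its edges.
    return [min(i + 1, length - i) for i in range(length)]


def run_chunked(model, signal, segment, overlap):
    """Apply `model` chunk by chunk and blend the overlapping regions."""
    stride = segment - overlap  # requires overlap < segment
    out = [0.0] * len(signal)
    weight_sum = [0.0] * len(signal)
    for start in range(0, len(signal), stride):
        chunk = signal[start:start + segment]  # the last chunk may be shorter
        result = model(chunk)
        weights = triangle_weights(len(chunk))
        for i, (value, w) in enumerate(zip(result, weights)):
            out[start + i] += w * value
            weight_sum[start + i] += w
    # Normalise by the accumulated weights so overlaps are averaged, not summed.
    return [o / w for o, w in zip(out, weight_sum)]


def run_bag(models, model_weights, signal, segment, overlap):
    """Weighted average of several models, each run chunk by chunk."""
    total = [0.0] * len(signal)
    for model, mw in zip(models, model_weights):
        estimate = run_chunked(model, signal, segment, overlap)
        for i, value in enumerate(estimate):
            total[i] += mw * value
    norm = sum(model_weights)
    return [t / norm for t in total]
```

Under this reading, exporting would mean converting each sub-model's plain forward pass on a fixed-length chunk, and keeping the two loops in host code on the device.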

@alexvoina

I would also be interested in getting this model into C++, mainly to run inference faster.

@Bin-ze Bin-ze closed this as completed Sep 8, 2023
@Bin-ze Bin-ze reopened this Sep 8, 2023
@Achuttarsing

Interested too!

@alexvoina

It would be cool if all of us joined forces to get this done :D

@gongouveia

@alexvoina @Achuttarsing has this issue progressed any further? I am facing the same problem.

@Achuttarsing

I didn't pursue this problem any further.
