Equally Divided Subband and Complex Spectrogram #6

Closed
sophia1488 opened this issue Dec 7, 2021 · 3 comments

Comments

@sophia1488

Hi, I've read your paper and have a few questions:

  1. Since lower frequencies carry more information, would the results improve if the subbands were not equally divided, for example on a log scale? (Another transformation might then be needed so that the subbands have the same shape for concatenation.)
    I'd like to try it, but I'm not sure how the filters (models/filters/*.mat) are generated.
  2. If I understand correctly, the U-Net cannot see the phase information in
    sp, cos_in, sin_in = self.f_helper.wav_to_mag_phase_subband_spectrogram(input)

    since only sp is forwarded to the U-Net.
    I've tried adding phase as extra channels, so that the input to the U-Net is (batch, channel*2, time, frequency), with the rest of the code unchanged, but the result is worse. Do you have any thoughts on this?
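
For reference, the channel stacking described in question 2 might look roughly like the sketch below. It is written with NumPy rather than the repository's own tensors, the helper name `stack_magnitude_and_phase` is hypothetical, and it assumes that `sp`, `cos_in`, and `sin_in` all have shape (batch, channel, time, frequency) and that the phase is fed in as an angle.

```python
import numpy as np

def stack_magnitude_and_phase(sp, cos_in, sin_in):
    """Concatenate magnitude and phase angle along the channel axis.

    Hypothetical helper: all inputs are assumed to have shape
    (batch, channel, time, frequency).
    """
    phase = np.arctan2(sin_in, cos_in)  # recover the angle from its cos/sin pair
    return np.concatenate([sp, phase], axis=1)  # channel -> channel * 2

# Random stand-ins for the real features.
rng = np.random.default_rng(0)
batch, channel, time, freq = 2, 8, 10, 16
sp = rng.random((batch, channel, time, freq))
theta = rng.uniform(-np.pi, np.pi, size=sp.shape)

x = stack_magnitude_and_phase(sp, np.cos(theta), np.sin(theta))
print(x.shape)  # (2, 16, 10, 16)
```

The first half of the channel axis holds the magnitudes and the second half holds the phase angles.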

Thanks a million!

@haoheliu
Owner

haoheliu commented Dec 8, 2021

@sophia1488 Hi, you made very good points! Below is my understanding.

  1. Yes, definitely. I divide the band equally so that it can be easily modeled by a CNN with simple concatenation. If you divide the band unequally, like the filter below, you can use projection layers to transform the subbands into the same shape for concatenation. Alternatively, you can use an architecture other than a pure CNN to handle subbands with different dimensions. The filter can be built using this tool.

  2. Yes. We only forward sp to the U-Net because it only needs to predict phase variations rather than the phase itself, so the original phase information becomes less important. You can try adding phase information on each channel, but personally I suspect the original phase is quite complex and highly unstructured, making it hard to learn from.
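
To illustrate point 2, here is a minimal sketch of how a predicted phase variation could be applied to the input phase. It assumes the variation is an angle offset represented as a (cos, sin) pair and combined with `cos_in` and `sin_in` through the angle-sum identities; the function name and the normalization step are assumptions, not the repository's code.

```python
import numpy as np

def apply_phase_variation(cos_in, sin_in, delta_cos, delta_sin, eps=1e-8):
    """Rotate the input phase by a predicted offset (hypothetical helper).

    Uses cos(a + b) = cos a cos b - sin a sin b
    and  sin(a + b) = sin a cos b + cos a sin b.
    """
    # Normalize so the predicted pair lies on the unit circle.
    norm = np.sqrt(delta_cos**2 + delta_sin**2) + eps
    delta_cos, delta_sin = delta_cos / norm, delta_sin / norm
    cos_out = cos_in * delta_cos - sin_in * delta_sin
    sin_out = sin_in * delta_cos + cos_in * delta_sin
    return cos_out, sin_out

# Random stand-ins: theta is the input phase, delta the predicted offset.
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, size=(2, 8, 10, 16))
delta = rng.uniform(-0.5, 0.5, size=theta.shape)

# The 0.5 scale mimics an unnormalized network output.
cos_out, sin_out = apply_phase_variation(
    np.cos(theta), np.sin(theta), 0.5 * np.cos(delta), 0.5 * np.sin(delta)
)
```

Under this formulation the network only has to model the offset, which is one way to read why the raw input phase matters less.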

Further discussion is welcome!

image

image
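
The unequal split with projection layers from point 1 could be sketched as follows. This is only an illustration under several assumptions: the split is done on spectrogram frequency bins rather than with an analysis filter bank on the waveform, the function names are hypothetical, and a fixed random matrix stands in for each learned projection layer.

```python
import numpy as np

def log_band_edges(n_bins, n_bands, first_width):
    """Return bin edges whose upper bounds grow geometrically up to n_bins."""
    upper = np.round(np.geomspace(first_width, n_bins, n_bands)).astype(int)
    edges = np.concatenate([[0], upper])
    if np.any(np.diff(edges) <= 0):
        raise ValueError("band edges must be strictly increasing")
    return edges

def project_bands(spec, edges, out_width, rng):
    """Project each band of spec (batch, time, freq) to out_width bins.

    Returns an array of shape (batch, n_bands, time, out_width), so the
    bands can be treated as channels.
    """
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Random matrix as a stand-in for a learned linear projection.
        weight = rng.standard_normal((hi - lo, out_width)) / np.sqrt(hi - lo)
        bands.append(spec[..., lo:hi] @ weight)
    return np.stack(bands, axis=1)

rng = np.random.default_rng(0)
spec = rng.random((2, 10, 512))  # (batch, time, frequency)
edges = log_band_edges(n_bins=512, n_bands=4, first_width=64)
subbands = project_bands(spec, edges, out_width=64, rng=rng)

print(edges.tolist())  # [0, 64, 128, 256, 512]
print(subbands.shape)  # (2, 4, 10, 64)
```

Here the lowest bands keep a fine resolution (64 bins each) while the highest band compresses 256 bins into 64, which matches the idea that lower frequencies carry more information.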

@sophia1488
Author

Hello, thanks a lot for your quick reply and detailed comments! I'll dig into this.
Just one more question: could you provide the command used for the filter above? I'm not familiar with filters and would like to reproduce it.

Thank you 😊

@haoheliu
Owner

haoheliu commented Dec 9, 2021

@sophia1488 I've already lost the original code, but you can start by trying the following command:

[h,f,distortion,total_aliasing]=Filter_Bank_Design([1,2,3,4],[0.1,0.2,0.3,0.4],[0.02,0.04,0.08,0.16],64,32,100,20000,[1 1],[]);
