some question of pad_with_border #8
Sorry Many thanks, |
Hi Nick,
The picture you show is correct. pad_with_border simply extends the left and right borders.
You can obtain enhanced speech by running this code. ASR can then be applied to the enhanced speech afterwards.
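For readers following along, here is a minimal sketch of the border extension described above. It assumes the padding simply repeats the first and last frames `n_pad` times on each side; the function body is an illustrative re-implementation and may differ from the repository's code, and the toy array `x` is made up.

```python
import numpy as np

def pad_with_border(x, n_pad):
    """Repeat the first and last frame of x n_pad times on each side.

    x has shape (n_frames, n_freq). Illustrative sketch only; the
    repository's own function may differ in detail.
    """
    left = np.tile(x[0:1], (n_pad, 1))    # copies of the first frame
    right = np.tile(x[-1:], (n_pad, 1))   # copies of the last frame
    return np.concatenate([left, x, right], axis=0)

# Toy input (assumption): 3 frames, 2 frequency bins.
x = np.arange(6).reshape(3, 2)
padded = pad_with_border(x, n_pad=2)
print(padded.shape)  # (7, 2)
```

Under this assumption the result matches `np.pad(x, ((2, 2), (0, 0)), mode="edge")`.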
Best wishes,
Qiuqiang
…________________________________
From: Nickkk1124 <notifications@github.com>
Sent: 24 April 2018 09:57:30
To: yongxuUSTC/sednn
Cc: Subscribed
Subject: Re: [yongxuUSTC/sednn] some question of pad_with_border (#8)
Sorry
In addition, I would like to ask: if I want to use this speech enhancement system as a front end for ASR, how do I do this?
Many thanks,
Nick
|
Hello Qiuqiang, Mat_2d_to_3d converts features to (n_segs, n_concat, n_freq). The center frame of the first stacked segment is t=1, so shouldn't the center frame of the second stacked segment be t=2? But as shown in the following figure, why is the center frame of the second stacked segment t=4? Many thanks, Nick |
Hi Nick,
Yes, you can use the enhanced features for ASR. However, you may need to retrain, or jointly train, your back-end acoustic model for ASR.
Good luck.
Best regards,
yong
…--------------------------------------------------------
Dr. Yong XU
https://sites.google.com/view/xuyong/home
|
Hi Yong, Thank you for your reply!
Many thanks, |
Hi Nick,
The picture you drew is correct: the center frames are 1 and 4 in your drawing. Which frame is the center of each segment also depends on the hop.
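To illustrate how the hop determines the center frames, here is a small sketch of segmenting a 2-D feature matrix into (n_segs, n_concat, n_freq). The function body is an illustrative re-implementation, not necessarily the repository's code, and the values `agg_num=3` and `hop=3` are assumptions chosen because they would reproduce the centers 1 and 4 from the drawing.

```python
import numpy as np

def mat_2d_to_3d(x, agg_num, hop):
    """Cut (n_frames, n_freq) into segments of agg_num frames, hop frames apart.

    Returns an array of shape (n_segs, agg_num, n_freq). Illustrative
    sketch; assumes n_frames >= agg_num.
    """
    n_frames = x.shape[0]
    starts = range(0, n_frames - agg_num + 1, hop)
    return np.stack([x[s:s + agg_num] for s in starts], axis=0)

# Toy input (assumption): 9 frames, 1 bin, each value equal to its frame index.
x = np.arange(9).reshape(9, 1)

# With hop=3 the segments do not overlap: frames 0-2, 3-5, 6-8.
centers_hop3 = mat_2d_to_3d(x, agg_num=3, hop=3)[:, 1, 0]
print(centers_hop3.tolist())  # [1, 4, 7]

# With hop=1 each segment is shifted by one frame.
centers_hop1 = mat_2d_to_3d(x, agg_num=3, hop=1)[:, 1, 0]
print(centers_hop1.tolist())  # [1, 2, 3, 4, 5, 6, 7]
```

So a second center of t=2 would correspond to hop=1, while t=4 corresponds to a hop of 3 in this toy setting.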
"The "enhanced features for ASR" you mentioned, do you mean the magnitudes of log power spectrogram?"
- It means either the enhanced spectrogram or the log power spectrogram.
"Do you think using recover enhanced wav as ASR input is feasible?"
- It is feasible if the dataset is small. However, bear in mind that any speech denoising method will lose some information. Some work has done joint enhancement and recognition.
"What would you recommend about applying the enhancement system to dealing with the environmental noise?"
- I think applying it to environmental noise should be fine, as long as the noise used for training covers most environmental noise.
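On the first answer above, the two feature types are closely related. The sketch below shows one common way to go from a magnitude spectrogram to a log power spectrogram and back. The function names, the natural logarithm, and the `eps` value are assumptions for illustration, not taken from the repository.

```python
import numpy as np

def log_power_spectrogram(magnitude, eps=1e-8):
    """Log of the power spectrogram; eps avoids log(0). Hypothetical helper."""
    return np.log(magnitude ** 2 + eps)

def magnitude_from_log_power(log_power, eps=1e-8):
    """Invert log_power_spectrogram back to magnitudes. Hypothetical helper."""
    return np.sqrt(np.maximum(np.exp(log_power) - eps, 0.0))

# Toy magnitude spectrogram (assumption): 2 frames, 2 frequency bins.
mag = np.array([[1.0, 2.0], [0.5, 4.0]])
lps = log_power_spectrogram(mag)
recovered = magnitude_from_log_power(lps)
```

Either representation could be fed to an ASR front end; which one works better would depend on how the acoustic model was trained.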
Best wishes,
Qiuqiang
…________________________________
From: Nickkk1124 <notifications@github.com>
Sent: 24 April 2018 17:18:58
To: yongxuUSTC/sednn
Cc: Kong Q Mr (PG/R - Elec Electronic Eng); Comment
Subject: Re: [yongxuUSTC/sednn] some question of pad_with_border (#8)
Hi Yong,
Thank you for your reply!
There are some questions I'd like to ask:
1. The "enhanced features for ASR" you mentioned, do you mean the magnitudes of log power spectrogram?
2. Do you think using recover enhanced wav as ASR input is feasible?
3. What would you recommend about applying the enhancement system to dealing with the environmental noise?
Many thanks,
Nick
|
Hello Qiuqiang, This is great work. It would be a great help if you could elaborate on the point below, which you mentioned in the discussion above. "- method will lose some information. Some work did a joint enhancement and recognition." I get the point about information loss. Can you please tell me more about joint enhancement and recognition? Is it two interlinked DNN models, or preprocessing followed by ASR? Thank you. |
Hi Nick,
If speech enhancement and ASR are done separately, ASR performance might be reduced, because speech enhancement sometimes also removes useful information from the speech. However, combining them into a single neural network might help. For example, use speech enhancement as the lower layers of the network and ASR as the higher layers. The loss function can combine the ASR and speech enhancement objectives. This is just my conjecture, and I am not aware of whether such work exists.
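As a rough illustration of the combined loss idea above, the sketch below adds a weighted enhancement term and a recognition term. Everything specific here is an assumption: the name `joint_loss`, the weight `alpha`, the choice of mean squared error for enhancement, and softmax cross-entropy for recognition. It only shows the arithmetic of the combination, not a trained model.

```python
import numpy as np

def joint_loss(enhanced, clean, asr_logits, labels, alpha=0.5):
    """Weighted sum of an enhancement loss and a recognition loss.

    enhanced, clean: (n_frames, n_freq) feature arrays.
    asr_logits: (n_frames, n_classes) unnormalized class scores.
    labels: (n_frames,) integer class targets.
    alpha: hypothetical weight between the two terms.
    """
    # Enhancement term: mean squared error against the clean features.
    se_loss = np.mean((enhanced - clean) ** 2)

    # Recognition term: softmax cross-entropy, computed stably.
    shifted = asr_logits - asr_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    asr_loss = -np.mean(log_probs[np.arange(len(labels)), labels])

    return alpha * se_loss + (1.0 - alpha) * asr_loss

# Toy check (assumption): perfect enhancement, uninformative logits over 4 classes.
clean = np.ones((2, 3))
loss = joint_loss(clean, clean, np.zeros((2, 4)), np.array([0, 3]))
```

In a real joint network, gradients of both terms would flow back through the shared lower (enhancement) layers, which is what distinguishes this from running the two systems separately.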
Best wishes,
Qiuqiang
…________________________________
From: akshayaCap <notifications@github.com>
Sent: 05 July 2018 12:12:45
To: yongxuUSTC/sednn
Cc: Kong Q Mr (PG/R - Elec Electronic Eng); Comment
Subject: Re: [yongxuUSTC/sednn] some question of pad_with_border (#8)
|
Hi Nick,
Yes, there are joint SE & ASR training papers:
https://www.isca-speech.org/archive/interspeech_2014/i14_0616.html
https://ieeexplore.ieee.org/abstract/document/7178797/
Best regards,
yong
…----------------------------------------------------------
Yong XU
https://sites.google.com/view/xuyong/home
From: qiuqiangkong
Date: 2018-07-06 03:55
To: yongxuUSTC/sednn
CC: yong xu @ seattle; Comment
Subject: Re: [yongxuUSTC/sednn] some question of pad_with_border (#8)
|
Dear Yong, Thank you, |
Hello:
I'm really impressed by your work and have a few questions about how you process the data.
Does pad_with_border mean this?
Many thanks,
Nick