Sound source localization along with speech recognition #1
Comments
I haven't tried the speech recognition code of HARK, so I can't really tell
you.
If you use some other speech recognition engine you should be able to guess
who spoke with a bit of code to keep track of the last sound localization
results.
Good luck!
On Sep 19, 2017 22:50, "srinivasanviki" ***@***.***> wrote:
As part of our project we are trying to implement Sound Localization along
with Speech Recognition in ROS using an Xbox Kinect. We have run into a problem:
we need to find the position of the person who spoke (which
is handled by the Sound Localization module) along with what the person said
(Speech Recognition).
Can you please advise us on how to implement this? We are not
able to figure out whether HARK can provide the localization data along with
speech recognition (who said what, from which direction) in a single go.
But your localization result is a constant stream. How do we keep track of the last localization results on a ROS topic? Please suggest.
Keep a buffer of the last few seconds of localization results. When you
get a speech recognition result, estimate the duration of the speech from
the recognized text, account for the delay of getting the result, and average
the loudest localized sounds in that time frame.
For example, keep all the localization messages from the last 10 s.
Estimate how long it takes from when you stop speaking to when the recognition
engine gives a result, for example 100 ms.
When you get a callback from the recognition engine, for example because it
recognized "hello world", estimate the duration of those 3 syllables (this paper
https://www.google.com.au/url?sa=t&source=web&rct=j&url=http://www.asel.udel.edu/icslp/cdrom/vol4/301/a301.pdf&ved=0ahUKEwiyiezqv7HWAhWLQpQKHVHNBZkQFggdMAA&usg=AFQjCNHjHYikZy-oDmBYdZ04jNqRwjAYVg
says the average duration of a syllable is about 150 ms). That gives 3 * 150 =
450 ms.
Now go to your buffer and, from the end, go back 100 ms; from there take
all the messages back to -550 ms. Average the localization, probably using
only the messages with a louder volume.
That's how I would try it, from a kinda hacky perspective.
Other than that, learn how to use the HARK speech recognizer too.
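The buffering idea above could be sketched roughly like this in plain Python. This is only an illustration under stated assumptions: `LocalizationBuffer`, `count_syllables`, the `(timestamp, azimuth, power)` sample layout and the `loudest_fraction` parameter are hypothetical names made up for the example, not part of HARK or ROS. The syllable counter is a crude vowel-group heuristic, and the direction is averaged as a circular mean so that angles near +180/-180 degrees do not cancel out. Feeding the buffer from the localization topic and calling it from the recognizer callback is left to your own node.

```python
import math
import re
from collections import deque

SYLLABLE_SECONDS = 0.15  # average syllable duration suggested above
RECOGNIZER_DELAY = 0.10  # assumed delay between end of speech and the result
BUFFER_SECONDS = 10.0    # how much localization history to keep


def count_syllables(text):
    """Crude estimate: count groups of consecutive vowels in each word."""
    total = 0
    for word in re.findall(r"[a-z]+", text.lower()):
        total += max(1, len(re.findall(r"[aeiouy]+", word)))
    return total


class LocalizationBuffer:
    """Keeps recent (timestamp_s, azimuth_deg, power) localization samples."""

    def __init__(self, horizon=BUFFER_SECONDS):
        self.horizon = horizon
        self.samples = deque()

    def add(self, stamp, azimuth_deg, power):
        """Store one sample and drop everything older than the horizon."""
        self.samples.append((stamp, azimuth_deg, power))
        while self.samples and self.samples[0][0] < stamp - self.horizon:
            self.samples.popleft()

    def estimate_direction(self, result_stamp, text, loudest_fraction=0.5):
        """Average the loudest samples in the estimated speech window."""
        speech_end = result_stamp - RECOGNIZER_DELAY
        speech_start = speech_end - count_syllables(text) * SYLLABLE_SECONDS
        window = [s for s in self.samples if speech_start <= s[0] <= speech_end]
        if not window:
            return None
        # Keep only the loudest part of the window (at least one sample).
        window.sort(key=lambda s: s[2], reverse=True)
        keep = window[:max(1, int(len(window) * loudest_fraction))]
        # Circular mean of the azimuths, returned in degrees.
        x = sum(math.cos(math.radians(a)) for _, a, _ in keep)
        y = sum(math.sin(math.radians(a)) for _, a, _ in keep)
        return math.degrees(math.atan2(y, x))
```

In use, you would call `add()` for every localization message and `estimate_direction(now, recognized_text)` when the recognizer reports a result; a return value of `None` means no localization samples fell inside the estimated speech window.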
On Sep 20, 2017 00:46, "srinivasanviki" ***@***.***> wrote:
But your localization result is a constant stream. How do we keep track of
the last localization results on a ROS topic? Please suggest.
Thanks for the suggestion. I have a problem with localization too: when I run roslaunch pr2_kinect, I get a continuous stream of localization results on the HarkSource topic even when I am not speaking.