It depends what you mean. If you have labeled recordings of a speaker and want to recognize those exact recordings being played, then yes.
If, however, you want to recognize (1:N) or verify (1:1) a person by the particular idiosyncrasies of their speech, then no. dejavu works off of a fingerprinting (read: hashing) system. As with any good hashing scheme, a small perturbation of the input (in dejavu's case, in timing or frequency) produces very different fingerprints.
While the fingerprints are very robust to noise, recognizing a voice won't work, because a voice does not reliably reproduce the same timing or frequency each time. dejavu is meant for recognizing exact duplicates of previously recorded audio.
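To make the point concrete, here is a rough sketch of peak-pair hashing in Python. It is only an illustration, not dejavu's actual code: `pair_hash`, `fingerprint`, the `fan_out` value, and the peak lists are all made up for the example, and the peaks are assumed to be `(time_bin, frequency_bin)` pairs already extracted from a spectrogram.

```python
import hashlib


def pair_hash(f1, f2, dt):
    """Hash one pair of spectral peaks: two frequency bins and their time gap."""
    key = f"{f1}|{f2}|{dt}".encode("utf-8")
    return hashlib.sha1(key).hexdigest()[:20]


def fingerprint(peaks, fan_out=2):
    """Pair each peak with the next `fan_out` peaks in time and hash each pair."""
    peaks = sorted(peaks)
    hashes = set()
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            hashes.add(pair_hash(f1, f2, t2 - t1))
    return hashes


# Hypothetical (time_bin, frequency_bin) peaks for one recording.
original = [(0, 40), (3, 41), (7, 55), (12, 38)]

# The same recording played back later: every peak shifts by the same offset.
replayed = [(t + 100, f) for t, f in original]

# The same word spoken again: pitch and timing each drift by about one bin.
respoken = [(0, 41), (4, 42), (7, 56), (13, 39)]

fp = fingerprint(original)
print("replayed matches:", len(fp & fingerprint(replayed)), "of", len(fp))
print("respoken matches:", len(fp & fingerprint(respoken)), "of", len(fp))
```

Because each hash depends only on the time gap between two peaks, an exact replay at any offset reproduces every hash. A fresh utterance that drifts by even one bin in pitch or timing changes the hash inputs, so in this toy example none of its hashes line up with the original.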
I was thinking about having a long recording of an individual repeating their name many times, and then as input having them say their name once. The fingerprinting approach may be useful there. I or a friend will try it out when we get a chance and get back to you.
Could this system be used for speaker recognition?