Nnet3 online decoder endpointing doesn't use frame subsampling rate #1184

Closed
alumae opened this issue Nov 12, 2016 · 6 comments

Comments

@alumae
Contributor

alumae commented Nov 12, 2016

I believe the endpointing code, originally developed for nnet2, doesn't take into account the frame subsampling rate used in chain models. As a result, with chain models (frame subsampling factor 3) the silence has to be 3x as long as the configured threshold before it is identified as an endpoint.
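To make the arithmetic concrete, here is a minimal sketch of the mismatch. It is not Kaldi's endpointing code: the function names, the 0.5 s threshold, and the parameter names are hypothetical, chosen only to show what happens when a count of decoder output frames is converted to seconds using the 10 ms feature frame shift, while each decoder frame of a chain model actually covers 30 ms.

```python
def trailing_silence_seconds(num_silence_frames, frame_shift=0.01, subsampling_factor=1):
    """Convert a count of decoder output frames into seconds of silence.

    frame_shift is the feature frame shift in seconds (assumed 10 ms here);
    subsampling_factor is how many feature frames each decoder frame covers.
    """
    return num_silence_frames * frame_shift * subsampling_factor


def is_endpoint(num_silence_frames, min_trailing_silence,
                frame_shift=0.01, subsampling_factor=1):
    """Hypothetical endpoint rule: enough trailing silence, measured in seconds."""
    silence = trailing_silence_seconds(num_silence_frames, frame_shift, subsampling_factor)
    return silence >= min_trailing_silence


# 20 decoder frames of trailing silence from a chain model is about 0.6 s of audio.
frames = 20

# Ignoring the subsampling factor, the rule sees only about 0.2 s and does not fire.
print(is_endpoint(frames, min_trailing_silence=0.5))

# Accounting for the factor of 3, the rule sees about 0.6 s and fires.
print(is_endpoint(frames, min_trailing_silence=0.5, subsampling_factor=3))
```

Without the correction, the rule in this sketch only fires once roughly 1.5 s of real silence has accumulated, which is the 3x effect described above.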

@ognjentodic

Yes, same for the timings (e.g. word start times); they all need to be adjusted by the frame subsampling factor. I "solved" this at a higher level (outside Kaldi); I didn't see an easy way to access this config parameter as is.

@danpovey
Contributor

Tanel, do you have time to work on a fix for the endpointing issue? It
does seem closer to a bug than a mis-feature, because those times are
expressed in seconds, not in frames.

Ognjen: regarding things like word start times, I'm not quite sure what
tools you are referring to, but things like lattice-to-ctm-conf take a
--frame-shift parameter that should be set to 0.03 for chain systems.
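As a rough illustration of why the frame shift matters for word timings, the sketch below converts frame indices to seconds with both values. The helper name and the frame indices are made up for the example, and this is not how lattice-to-ctm-conf is implemented; it only shows the scaling.

```python
def frames_to_seconds(frame_index, frame_shift):
    """Convert a decoder frame index to a time in seconds (rounded to 10 ms)."""
    return round(frame_index * frame_shift, 2)


# Hypothetical word start positions, in decoder output frames.
word_start_frames = [0, 12, 40]

# With the 0.01 default, times for a chain model come out 3x too small.
default_times = [frames_to_seconds(f, 0.01) for f in word_start_frames]

# With a frame shift of 0.03, the times match the audio.
chain_times = [frames_to_seconds(f, 0.03) for f in word_start_frames]

print(default_times)
print(chain_times)
```

The word sequence is the same in both cases; only the time stamps differ, by a constant factor of 3.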


@alumae
Contributor Author

alumae commented Nov 13, 2016

Yes, I can work on this.

Ognjen probably refers to kaldi-gstreamer-server that can output timing information. I'll fix this too.

@ognjentodic

I was actually referring to methods that return timing information as a number of frames (for example, CompactLatticeToWordAlignment). Those "issues" are not of the same nature as the endpointing threshold issue, since the thresholds are specified in seconds rather than frames, so my comment is really a false alarm for this issue.

Perhaps just an explanation/note in various places in a method description on what the frame (rate) really means would have been useful. (and it's quite possible that's already nicely described somewhere, but I missed it)

@vince62s
Contributor

I quite often use the get_ctm.sh script, as well as a modified version of it with lattice-to-ctm-conf.
@alumae I also use a modified version of kaldi-offline-transcriber, which calls get_ctm.sh (just a reminder).

I didn't know about this --frame-shift parameter, so I still have the default 0.01 with my chain models. Why don't I get garbage, then? What exactly is the impact in lattice-to-ctm-conf, for instance?

@danpovey
Contributor

The only impact is that the times in the ctm would be wrong. This might
affect some NIST scoring scripts, for instance.

