About BK-tree you mentioned in the paper #142

Closed
JackSnowWolf opened this issue Dec 20, 2018 · 2 comments

Comments

@JackSnowWolf

Hi!

You mentioned that you used a BK-tree data structure to improve efficiency. Could you explain how you use that data structure? I was confused because I couldn't find any details about the BK-tree in your training or inference code.

Thanks!

@githubharald

This decoding strategy is pretty simple:

  1. Find an approximation of the recognized word using best path decoding.
  2. Find the dictionary words most similar to the approximation and put them in a list of candidates (this can be done using a BK-tree).
  3. Compute the probability (loss) of each candidate.
  4. Return the best-scoring word.
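
To make step 2 concrete, here is a minimal BK-tree sketch. It is an illustration only, not the code from the repository: the names `edit_distance`, `BKTree`, `add`, `query`, and `tolerance` are hypothetical, and Levenshtein distance is assumed as the similarity measure.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


class BKTree:
    """Hypothetical minimal BK-tree: each node is (word, {distance: child})."""

    def __init__(self, words):
        self.root = None
        for w in words:
            self.add(w)

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = edit_distance(word, node[0])
            if d == 0:
                return  # word is already stored
            if d not in node[1]:
                node[1][d] = (word, {})
                return
            node = node[1][d]  # descend along the edge labelled d

    def query(self, word, tolerance):
        """Return all stored words within `tolerance` edits of `word`."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node_word, children = stack.pop()
            d = edit_distance(word, node_word)
            if d <= tolerance:
                results.append(node_word)
            # Triangle inequality: matches can only sit under edges
            # labelled between d - tolerance and d + tolerance.
            for child_d, child in children.items():
                if d - tolerance <= child_d <= d + tolerance:
                    stack.append(child)
        return sorted(results)


tree = BKTree(["hello", "help", "hell", "world", "word", "held"])
candidates = tree.query("helo", 1)  # "helo" stands in for a best path approximation
```

The tree is built once from the dictionary. Each query then visits only the subtrees that can contain a match, which is why it is cheaper than comparing the approximation against every dictionary word.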

For a Python implementation, see: https://github.com/githubharald/CTCDecoder/blob/master/src/LexiconSearch.py
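
Steps 1, 3, and 4 can be sketched as follows. This is a simplified illustration under stated assumptions, not a copy of the linked file: `mat` is assumed to be a list of per-time-step probability rows, the CTC blank is assumed to be the last column, and the names `best_path`, `ctc_probability`, and `lexicon_search` are hypothetical. The candidate list is passed in directly; in practice it would come from the BK-tree lookup.

```python
def best_path(mat, chars):
    """Step 1: take the argmax per time step, collapse repeats, drop blanks."""
    blank = len(chars)  # assumption: blank is the last column of mat
    out, prev = [], blank
    for row in mat:
        k = max(range(len(row)), key=row.__getitem__)
        if k != prev and k != blank:
            out.append(chars[k])
        prev = k
    return "".join(out)


def ctc_probability(mat, word, chars):
    """Step 3: P(word | mat), computed with the CTC forward algorithm."""
    blank = len(chars)
    labels = [blank]
    for c in word:
        labels += [chars.index(c), blank]  # blank-extended label sequence
    T, S = len(mat), len(labels)
    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = mat[0][labels[0]]
    if S > 1:
        alpha[0][1] = mat[0][labels[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]
            if s >= 1:
                a += alpha[t - 1][s - 1]
            # A blank may be skipped only between two different characters.
            if s >= 2 and labels[s] != blank and labels[s] != labels[s - 2]:
                a += alpha[t - 1][s - 2]
            alpha[t][s] = a * mat[t][labels[s]]
    # Valid paths end in the last character or in the trailing blank.
    return alpha[-1][-1] + (alpha[-1][-2] if S > 1 else 0.0)


def lexicon_search(mat, chars, candidates):
    """Steps 3 and 4: score every candidate and return the most probable one."""
    return max(candidates, key=lambda w: ctc_probability(mat, w, chars))


# Toy network output: 2 time steps, columns are 'a', 'b', blank.
mat = [[0.4, 0.0, 0.6],
       [0.4, 0.0, 0.6]]
approx = best_path(mat, "ab")                # step 1
best = lexicon_search(mat, "ab", ["a", "b"])  # steps 3 and 4
```

In this toy example, best path decoding returns the empty string because the blank wins at every time step, while scoring the candidates picks "a": its three alignments (a followed by blank, blank followed by a, and a repeated) sum to a higher probability than the all-blank path.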

@JackSnowWolf
Author

@githubharald Thank you very much!
