FastVectorHighlighter fails with StackOverflow on terms with large TermFrequency #3486
FVH uses recursive logic to extract the terms that need to be highlighted from a document. For documents that contain a term with a very large term frequency, such as a document that repeats a single term very often, this can produce very deep call stacks while extracting the terms. Taken to an extreme, this causes stack overflow errors once the term frequency grows to >= 6000.
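To illustrate the general failure mode (this is not the FVH code itself), here is a minimal sketch assuming the term occurrences are chained in a linked structure and visited one recursive call per occurrence. The class `RecursionDepthSketch`, its `Node` type, and the method names are all hypothetical and exist only for this example; the depth at which the recursion actually overflows depends on the JVM thread stack size and on the frame size of the real method.

```java
// Hypothetical illustration only: none of these names come from Lucene or Elasticsearch.
class RecursionDepthSketch {

    /** One node per occurrence of a term; stands in for a chain of term positions. */
    static final class Node {
        final int position;
        final Node next;

        Node(int position, Node next) {
            this.position = position;
            this.next = next;
        }
    }

    /** Builds a chain with one node per occurrence (termFrequency nodes in total). */
    static Node build(int termFrequency) {
        Node head = null;
        for (int i = termFrequency - 1; i >= 0; i--) {
            head = new Node(i, head);
        }
        return head;
    }

    /** Recursive walk: uses one stack frame per occurrence, so depth equals term frequency. */
    static int countRecursive(Node node) {
        return node == null ? 0 : 1 + countRecursive(node.next);
    }

    /** Iterative walk: constant stack depth regardless of term frequency. */
    static int countIterative(Node node) {
        int count = 0;
        for (Node n = node; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Node head = build(1_000_000);
        System.out.println("iterative count: " + countIterative(head));
        try {
            System.out.println("recursive count: " + countRecursive(head));
        } catch (StackOverflowError e) {
            // Expected with typical default stack sizes, but not guaranteed on every JVM.
            System.out.println("recursive walk overflowed the stack");
        }
    }
}
```

The iterative variant returns the same count without growing the stack, which is the usual way to remove this kind of depth limit; whether the eventual fix for FVH takes that shape is not stated here.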
I will attach a possible fix and a test case that reproduces the problem in a bit.