Speed up views generation #7

Closed
benel opened this issue Oct 7, 2011 · 2 comments

benel commented Oct 7, 2011

Currently, several views (corpus_lexicometrics, document_lexicometrics, kwic, phrase) each run their own loop over the words. To speed up view generation, we could emit from a single loop the views that have similar keys and reduce functions.

This seems to be the case for phrase and corpus_lexicometrics:

for each word1 {
  get word2 and word3
  if you have word3 {
    emit([word1, word2, word3])
  } else {
    emit(word1)
  }
}
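
For illustration only, here is a minimal Python sketch of the loop above. It assumes the words of one document are already available as a list. The name `combined_keys` is hypothetical, and the returned list stands in for the successive calls to `emit` in the actual view.

```python
def combined_keys(words):
    """Return the keys the loop above would emit for one list of words."""
    keys = []
    for i, word1 in enumerate(words):
        # Look ahead for word2 and word3 (may be fewer near the end).
        following = words[i + 1:i + 3]
        if len(following) == 2:
            # word3 exists: emit the three-word key.
            keys.append((word1, following[0], following[1]))
        else:
            # No word3: emit the word alone.
            keys.append(word1)
    return keys


print(combined_keys(["the", "cat", "sat", "down"]))
```

Only the last two words of a document are emitted alone, since every earlier word has two successors.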

benel commented Oct 7, 2011

Note: This algorithm is just an illustration. A more efficient approach would read each word only once and remember the two immediately preceding words.
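
One possible reading of that single-pass idea, sketched in Python under assumptions: the words arrive as any iterable (even a one-shot iterator), a three-slot window holds the current word plus the two previous ones, and the hypothetical function `combined_keys_single_pass` returns a list in place of calling `emit`.

```python
from collections import deque


def combined_keys_single_pass(words):
    """Emit three-word keys while reading each word exactly once."""
    keys = []
    # Holds the current word and the two immediately preceding words.
    window = deque(maxlen=3)
    for word in words:
        window.append(word)
        if len(window) == 3:
            # The oldest word in the window now has both successors.
            keys.append(tuple(window))
    # Words left without two successors are emitted alone.
    leftover = list(window)[1:] if len(window) == 3 else list(window)
    keys.extend(leftover)
    return keys


print(combined_keys_single_pass(iter(["the", "cat", "sat", "down"])))
```

Because the window never looks ahead, this variant works on a stream of words and does no repeated slicing.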


benel commented Oct 10, 2011

No gain in generation time with the all-in-one view (c844940) compared to corpus_lexicometrics + document_lexicometrics + phrase (e46c11e). :'(

benel closed this as completed Oct 29, 2011