Fix tests for corrected tokenize_skip_ngrams() #106

lmullen · 2018-03-14T00:53:59Z

This pull request fixes two tests that fail because of the imminent release of v0.2 of the tokenizers package. Version 0.1.4 of tokenizers, currently on CRAN, has an incorrect implementation of skip n-grams. (See this issue for details.) The master branch of tokenizers has a fix. As a result, the indices used in the tests are now incorrect, and I have updated them.

Note that tokenize_skip_ngrams() now returns vastly more tokens, which is how it is supposed to work, but which might come as a surprise.
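To give a rough sense of why the token count grows, here is a minimal sketch in Python of one common definition of skip n-grams, in which each adjacent pair of words in an n-gram may skip at most k words. This is only an illustration under that assumption, not the tokenizers implementation; the helper name `skip_ngrams` is hypothetical.

```python
from itertools import combinations

def skip_ngrams(words, n, k):
    """Return n-word sequences in which each adjacent pair skips at most k words.

    Illustrative definition only; it is not taken from the tokenizers package.
    """
    out = []
    for idx in combinations(range(len(words)), n):
        # Keep the index tuple only if every gap between neighbours skips <= k words.
        if all(b - a - 1 <= k for a, b in zip(idx, idx[1:])):
            out.append(" ".join(words[i] for i in idx))
    return out

words = "the quick brown fox jumps".split()
plain = skip_ngrams(words, n=2, k=0)    # ordinary bigrams
skipped = skip_ngrams(words, n=2, k=1)  # bigrams allowing one skipped word

print(len(plain), len(skipped))
```

Even on a five-word input, allowing a single skip adds pairs such as "the brown" on top of the ordinary bigrams, and the gap widens quickly as n and k increase.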

lmullen · 2018-03-14T01:04:13Z

The CI checks failed because they are using the CRAN version of tokenizers.

juliasilge · 2018-03-14T01:27:57Z

Thank you so much, Lincoln! 🙌

juliasilge · 2018-03-14T01:29:07Z

Wowwwwwwwww, that is a lot more skip grams. 😯

lmullen · 2018-03-21T15:11:29Z

@juliasilge Thanks for the quick response to the CRAN maintainers, and for your flexibility in having to create a new release because of tokenizers changes.

juliasilge · 2018-03-21T15:24:26Z

@lmullen Do you think the CRAN maintainers intend for me to submit it right away (when my Travis builds and win-builder are still failing because binaries aren't built yet) or to wait until I can demonstrate that everything is passing?

lmullen · 2018-03-21T15:47:05Z

I think they probably want it right away. They will run their own tests on the current CRAN, I'd guess.

github-actions · 2022-03-24T00:09:30Z

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

Fix tests for corrected tokenize_skip_ngrams()

5dc82ad

juliasilge merged commit 517da8e into juliasilge:master Mar 21, 2018

github-actions bot locked and limited conversation to collaborators Mar 24, 2022
