Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor gensim.doctest to work for gensim 4.0.0 and up #2914

merged 3 commits into from Dec 19, 2021


Copy link

@tomaarsen tomaarsen commented Dec 14, 2021


Pull Request Overview

  • Update gensim.doctest allowing us to remove the <4.0.0 version lock of gensim.


See #2913 (comment) for a short discussion on gensim. It is currently locked to gensim < 4.0.0, and we'll want to start supporting newer versions. However, in 4.0.0, gensim made some changes in syntax, meaning that you either support before 4.0.0, or after 4.0.0.
This is us moving to the latter.

Beyond this PR

Lastly, I would like to propose the following, which is not included in this PR:
Let's deprecate gensim from pip-req.txt. gensim is not used anywhere in the codebase, except for this doctest, so we should remove it from pip-req.txt, and keep it only in requirements-ci.txt/requirements-test.txt. I have some other issues with these files at the moment, which I've brought up before in #2881:

  • Perhaps we can merge requirements-test.txt and requirements-ci.txt. They're both for testing, and are already very similar.
  • It somewhat bothers me that pip-req.txt and requirements-ci.txt/requirements-test.txt don't have the same naming structure. Perhaps we can change this? (To be honest, I don't remember ever seeing a pip-req.txt file before NLTK - I only encounter requirements.txt)

cc: @purificant This PR might be relevant for you and your work to help with #2913.

  • Tom Aarsen

requirements-ci.txt Show resolved Hide resolved
Copy link

@purificant purificant commented Dec 14, 2021

requirements-ci.txt was created as a clean slate for CI when moving to GitHub Actions, since then this has become our primary CI pipeline, we could definitely clean up other requirement files and consolidate. 👍

Copy link

@purificant purificant commented Dec 14, 2021

Amazing work @tomaarsen, thank you for doing this. One step closer to py 3.10

Copy link

@dannysepler dannysepler left a comment

+1, great work!

Copy link
Member Author

@tomaarsen tomaarsen commented Dec 19, 2021

I've decided it's best to remove the unused gensim dependency as it might mess with people who wish to use specific gensim versions alongside NLTK, and there is no need to potentially introduce unnecessary version conflict.

@tomaarsen tomaarsen merged commit 6b60213 into nltk:develop Dec 19, 2021
16 checks passed
@tomaarsen tomaarsen deleted the feature/gensim-refactor branch Dec 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet

Successfully merging this pull request may close these issues.

None yet

3 participants