Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove pattern dependency #3012

Merged
merged 18 commits into from
Jan 17, 2021
Merged

Remove pattern dependency #3012

merged 18 commits into from
Jan 17, 2021

Conversation

mpenkov
Copy link
Collaborator

@mpenkov mpenkov commented Dec 18, 2020

@mpenkov mpenkov added the breaks backward-compatibility Change breaks backward compatibility label Dec 18, 2020
Trying to work around "has no attribute '__reduce_cython__'" problem
Why was it removed? Parts of the code still need it.
gensim/scripts/segment_wiki.py Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
@mpenkov
Copy link
Collaborator Author

mpenkov commented Dec 26, 2020

Is there anything left to do here? I think this may be ready to merge.

gensim/scripts/segment_wiki.py Outdated Show resolved Hide resolved
gensim/scripts/segment_wiki.py Outdated Show resolved Hide resolved
@piskvorky
Copy link
Owner

Yes, looks ready, but all tests are failing.

@mpenkov
Copy link
Collaborator Author

mpenkov commented Dec 27, 2020

Those failures are unrelated to this PR. In between me making this PR one week ago, and the build triggered by your changes to segment_wiki.py yesterday, there has been a new sklearn release and they deprecated one of the modules we were relying on.

$ python
Python 3.8.5 (default, Jul 21 2020, 10:48:26)
[Clang 11.0.3 (clang-1103.0.32.62)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn.datasets.base
/Users/misha/envs/gensim/lib/python3.8/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.datasets.base module is  deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.datasets. Anything that cannot be imported from sklearn.datasets is now part of the private API.
  warnings.warn(message, FutureWarning)

I'll disable the failing tests and open a ticket to deal with the problem properly.

@mpenkov
Copy link
Collaborator Author

mpenkov commented Dec 27, 2020

#3016

@gojomo
Copy link
Collaborator

gojomo commented Dec 27, 2020

This is another case where for the sake of maintaining the tests a trustworthy indicator of whether the code is ready for release, I would not be so quick to disable them, even with a note-linking-to-a-pending-issue.

If previous functionality is now broken due to outside library changes – and also broken on develop, not just this PR branch where it's 1st noticed – the tests should remain failing until either (a) a true fix/update is made; or (b) an explicit decision is made/announced/added-to-release-notes that such functionality is no longer supported (or temporarily not supported).

If test-disablement is chosen, because that functionality can be ignored for a while, it'd be best done in a standalone PR that's applied to develop separately, rather than mixed with a different set of in-process changes.

@piskvorky
Copy link
Owner

piskvorky commented Dec 27, 2020

@mpenkov can you please look into removing sklearn, as the proper fix, independent of this PR? (part of #2852, due for 4.0.0 anyway).

I agree the changes to unit tests do not belong here.

@mpenkov
Copy link
Collaborator Author

mpenkov commented Dec 28, 2020

Sure, I'll have a look at it, most likely in the New Year.

I disabled the tests because it was least bad alternative out of the following list:

  1. Do not merge until tests are properly fixed (I don't know when that will be,, and chances are this PR will be stale by then)
  2. Merge with broken tests (this is bad form, because now other contributors will trip over the same failures)
  3. Disable unit tests temporarily

If you disagree with my decision, I can re-enable the tests in a separate PR. It's simple.

@gojomo
Copy link
Collaborator

gojomo commented Dec 29, 2020

  1. Merge with broken tests (this is bad form, because now other contributors will trip over the same failures)

If this PR's changes caused the test-failures (even via bumping dependency versions), I'd agree that merging it would be a problem. But, I'm pretty sure develop without this PR would also be failing, given the upstream changes. The develop trunk only looks 'healthy' because no build/test has been triggered, and this PR was only "unlucky" to be the 1st experiencing the side-effects of external changes.

Given that, merging this PR, with that known-but-unrelated problem, wouldn't create any new problems for others, so I think (2) would be OK. With regard to passing tests, I think the precise policy could be: "any PR shouldn't break any tests that were working before the PR - but it doesn't have to fix everything that's currently broken".

@mpenkov mpenkov changed the title get rid of pattern dependency remove pattern dependency Jan 17, 2021
@mpenkov mpenkov merged commit 67f45da into develop Jan 17, 2021
@mpenkov mpenkov deleted the rmpattern branch January 17, 2021 07:14
@piskvorky piskvorky changed the title remove pattern dependency Remove pattern dependency Mar 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaks backward-compatibility Change breaks backward compatibility
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot use gensim 3.8.x when nltk package is installed
3 participants