Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate 'return_str' parameter in NLTKWordTokenizer and TreebankWordTokenizer #2883

Merged
merged 2 commits into from Nov 18, 2021

Conversation

adamjhawley
Copy link
Contributor

@adamjhawley adamjhawley commented Nov 10, 2021

Resolves #2879

As per the discussion on #2879, I have deprecated the use of the 'return_str' parameter in the tokenize method of both NLTKWordTokenizer and TreebankWordTokenizer.

I have currently done this in a way that raises a DeprecationWarning if a non-default value is passed to the parameter.

Copy link
Member

@tomaarsen tomaarsen left a comment

I have two small comments on your work. Beyond that, looks great!

Eventually we'll want the deprecation of return_str to be described in the method documentation, but the PR for that documentation is still open: #2878. That's something for when both this and that PR are merged (or at least one of the two).

nltk/tokenize/destructive.py Outdated Show resolved Hide resolved
nltk/tokenize/treebank.py Outdated Show resolved Hide resolved
@adamjhawley adamjhawley force-pushed the feature/deprecate_return_str branch from 6b78e28 to 7f6ebf8 Compare Nov 10, 2021
@adamjhawley
Copy link
Contributor Author

@adamjhawley adamjhawley commented Nov 10, 2021

@tomaarsen thanks for the review! I have amended the commit to contain your suggestions

Copy link
Member

@tomaarsen tomaarsen left a comment

Looks good to me! Thank you.
I'll leave this open for a little bit so others can have a look at it.

@12mohaned
Copy link
Contributor

@12mohaned 12mohaned commented Nov 15, 2021

I have a nit comment, I would suggest refactoring the if's in Treebank and WordTokenizer to be:

if return_str:
  ..........

instead of

if return_str is not False:
  ..........

It is more readable and faster. Also, it is more pythonic and follows the PEP8 standard.

nltk/tokenize/destructive.py Outdated Show resolved Hide resolved
as suggested by 12mohaned
@stevenbird stevenbird merged commit e629d7e into nltk:develop Nov 18, 2021
16 checks passed
@stevenbird
Copy link
Member

@stevenbird stevenbird commented Nov 18, 2021

Thanks @adamjhawley, @12mohaned, @tomaarsen

@adamjhawley adamjhawley deleted the feature/deprecate_return_str branch Nov 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants