Use chain.from_iterable in _text.py #333

Merged
sloria merged 2 commits into sloria:dev on Jun 23, 2020

Conversation

cool-RR
Contributor

@cool-RR cool-RR commented Jun 17, 2020

This is a faster and more idiomatic way of using itertools.chain. Instead of all the items in the iterable being computed up front and stored in memory, they are computed one by one and never stored as a huge list. This can save both runtime and memory.
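
To illustrate the difference, here is a minimal sketch. It assumes the replaced pattern was `chain(*iterable)`, which the PR title suggests but this thread does not show. The `sentences` generator and its log are hypothetical and are not taken from _text.py.

```python
from itertools import chain


def sentences(log):
    """Hypothetical generator of token lists; records each list it produces."""
    for i in range(5):
        log.append(i)
        yield [f"s{i}w0", f"s{i}w1"]


# chain(*iterable): argument unpacking exhausts the generator before
# chain() is even called, so all five token lists exist at once.
eager_log = []
eager = chain(*sentences(eager_log))
eager_first = next(eager)
eager_produced = len(eager_log)

# chain.from_iterable(iterable): each token list is requested only when
# the previous one has been consumed.
lazy_log = []
lazy = chain.from_iterable(sentences(lazy_log))
lazy_first = next(lazy)
lazy_produced = len(lazy_log)

print(eager_first, eager_produced)
print(lazy_first, lazy_produced)
```

Both forms yield the same items in the same order. The only difference is when the outer iterable is consumed, which matters most when it is a generator or is very long.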

@sloria
Owner

sloria commented Jun 17, 2020

Thanks! _text.py is a vendored module from pattern.en. Usually I'd suggest making the change upstream as well, but it seems that the pattern library isn't actively maintained, so I think it's OK for things to diverge here.

No action is necessary other than adding yourself to AUTHORS.rst. Can you do that, please?

@cool-RR
Contributor Author

cool-RR commented Jun 17, 2020

I didn't notice that, thanks for checking.

I ran the tests and got an error on test_tokenize_with_multiple_punctuation, but I see the same error on the dev branch, so I'm guessing it's unrelated.

@cool-RR
Contributor Author

cool-RR commented Jun 20, 2020

As far as I know we can move forward with this PR.

@sloria sloria merged commit f3affab into sloria:dev Jun 23, 2020