Add a maxsplit, reverse parameters to split_ #381

Closed
jferard opened this issue Jan 30, 2020 · 2 comments

Comments

@jferard
Contributor

jferard commented Jan 30, 2020

maxsplit

more-itertools has various split_ functions (split_at, split_after, split_before, split_when, split_into), but none of them offers a way to stop splitting after a given number of splits. Of course, we can take the first chunks from a split_... iterator and then switch back to the original iterator:

>>> import more_itertools as mi, itertools
>>> s = '0,1,2,3,4,5,6,7,8,9'
>>> it1 = iter(s)
>>> it2 = mi.split_after(it1, lambda x: x == ",")
>>> list(itertools.islice(it2, 3)) # it *2*
[['0', ','], ['1', ','], ['2', ',']]
>>> list(it1) # it *1*
['3', ',', '4', ',', '5', ',', '6', ',', '7', ',', '8', ',', '9']

But it would be more convenient to write, as we do with str.split:

>>> import more_itertools as mi, itertools
>>> s = '0,1,2,3,4,5,6,7,8,9'
>>> it2 = mi.split_after(s, lambda x: x == ",", maxsplit=3)
>>> list(it2) # it *2*
[['0', ','], ['1', ','], ['2', ','], ['3', ',', '4', ',', '5', ',', '6', ',', '7', ',', '8', ',', '9']]

(The last chunk of the result could be an iterator rather than a list, so that the split remains lazy.)
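
To make the proposed behavior concrete, here is a minimal sketch in plain Python. The name split_after_maxsplit is hypothetical and this is not the library's code; it assumes that maxsplit=-1 means "no limit", as with str.split, and it returns the remainder as a list rather than an iterator.

```python
def split_after_maxsplit(iterable, pred, maxsplit=-1):
    """Hypothetical helper: split after each item matching pred,
    performing at most maxsplit splits (-1 means no limit)."""
    it = iter(iterable)
    buf = []
    if maxsplit != 0:
        for item in it:
            buf.append(item)
            if pred(item):
                yield buf
                buf = []
                maxsplit -= 1
                if maxsplit == 0:
                    # Split budget used up: stop looking for separators.
                    break
    # Whatever is left in the source goes into one final chunk.
    buf.extend(it)
    if buf:
        yield buf


s = '0,1,2,3,4,5,6,7,8,9'
chunks = list(split_after_maxsplit(s, lambda x: x == ',', maxsplit=3))
print(chunks)
```

With maxsplit=3 this yields three short chunks followed by one chunk holding the rest of the input, which is the shape shown in the example above.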

reverse

We could have another parameter, named reverse, to mimic str.rsplit:

>>> it2 = mi.split_after(s, lambda x: x == ",", maxsplit=3, reverse=True)
>>> list(it2) # it *2*
[['0', ',', '1', ',', '2', ',', '3', ',', '4', ',', '5', ',', '6', ','], ['7', ','], ['8', ','], ['9']]

(Or create rsplit_at, rsplit_after, ... but that would be a little cumbersome)
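
A rough sketch of the right-to-left variant follows, again with a hypothetical name (rsplit_after). It assumes the whole input may be materialized, because counting splits from the right requires knowing where the input ends, so this version cannot be lazy.

```python
def rsplit_after(iterable, pred, maxsplit=-1):
    """Hypothetical helper: split after items matching pred, keeping only
    the last maxsplit split points (-1 means keep all of them)."""
    items = list(iterable)  # the end must be known, so consume everything
    # Positions where a new chunk starts. A separator in last position
    # creates no new chunk, hence items[:-1].
    cuts = [i + 1 for i, x in enumerate(items[:-1]) if pred(x)]
    if maxsplit == 0:
        cuts = []
    elif maxsplit > 0:
        cuts = cuts[-maxsplit:]
    bounds = [0] + cuts + [len(items)]
    return [items[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


s = '0,1,2,3,4,5,6,7,8,9'
chunks = rsplit_after(s, lambda x: x == ',', maxsplit=3)
print(chunks)
```

Here the first chunk absorbs everything before the last three split points, mirroring what str.rsplit does with its leftmost field.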

Implementation should not be difficult.

@bbayles
Collaborator

bbayles commented Jan 30, 2020

I'd entertain a PR for the maxsplit additions, and a separate one for reverse.

@jferard
Contributor Author

jferard commented Feb 7, 2020

@bbayles Would you allow the following?

  • a keep_separator parameter for split_at (useful for the implementation)
  • the name rsplit for the parameter instead of reverse, because reverse might be confusing

Examples (handwritten):

>>> list(mi.split_at("a,bc,def", lambda c: c==',', keep_separator=True))
[['a'], [','], ['b', 'c'], [','], ['d', 'e', 'f']]

>>> list(mi.split_at("a,bc,def", lambda c: c==',', maxsplit=1, rsplit=True))
[['a', ',', 'b', 'c'], ['d', 'e', 'f']]
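
For reference, the keep_separator idea can be sketched in a few lines of plain Python. The function name split_at_keep is hypothetical and this is only one possible reading of the proposal: each separator is emitted as its own one-item chunk. Since every chunk is a list of items, a string input produces lists of single characters.

```python
def split_at_keep(iterable, pred, keep_separator=False):
    """Hypothetical helper: split at items matching pred; when
    keep_separator is true, yield each separator as its own chunk."""
    buf = []
    for item in iterable:
        if pred(item):
            yield buf
            if keep_separator:
                yield [item]
            buf = []
        else:
            buf.append(item)
    # The trailing chunk is always yielded, even when it is empty.
    yield buf


kept = list(split_at_keep("a,bc,def", lambda c: c == ',', keep_separator=True))
dropped = list(split_at_keep("a,bc,def", lambda c: c == ','))
print(kept)
print(dropped)
```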

@jferard jferard mentioned this issue Feb 8, 2020
@bbayles bbayles closed this as completed Dec 29, 2020