itertools.grouper #50271

Closed
lieryan mannequin opened this issue May 14, 2009 · 4 comments
@lieryan
Mannequin

lieryan mannequin commented May 14, 2009

BPO 6021
Nosy @rhettinger, @ericsnowcurrently

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/rhettinger'
closed_at = <Date 2009-05-14.17:13:30.205>
created_at = <Date 2009-05-14.17:05:29.836>
labels = ['type-feature', 'library']
title = 'itertools.grouper'
updated_at = <Date 2012-06-29.22:20:33.591>
user = 'https://bugs.python.org/lieryan'

bugs.python.org fields:

activity = <Date 2012-06-29.22:20:33.591>
actor = 'eric.snow'
assignee = 'rhettinger'
closed = True
closed_date = <Date 2009-05-14.17:13:30.205>
closer = 'rhettinger'
components = ['Library (Lib)']
creation = <Date 2009-05-14.17:05:29.836>
creator = 'lieryan'
dependencies = []
files = []
hgrepos = []
issue_num = 6021
keywords = []
message_count = 4.0
messages = ['87743', '87745', '87750', '87756']
nosy_count = 4.0
nosy_names = ['rhettinger', 'lieryan', 'cvrebert', 'eric.snow']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = None
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue6021'
versions = ['Python 3.1', 'Python 2.7']

@lieryan
Mannequin Author

lieryan mannequin commented May 14, 2009

An itertool to Group-by-n

>>> lst = range(15)
>>> itertools.grouper(lst, 5)
[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]

This function is often asked for in c.l.py discussions, such as these:
http://comments.gmane.org/gmane.comp.python.general/623377
http://comments.gmane.org/gmane.comp.python.general/622763

There are several open questions. What should be done if the number of
items in the original list is not exactly divisible by the group size?

  • raise an error by default
  • pad with a value given as the 3rd argument
  • make the last group shorter, maybe selected by a keyword argument or
    by a sentinel passed as the 3rd argument

Or should there be a separate function for each of them?

What about infinite iterables? Most recipes for this function use zip,
which breaks with infinite input.
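
A minimal sketch of the "last group shorter" behaviour, for illustration only. The name `chunked` is hypothetical and not part of itertools; the sketch uses `itertools.islice` instead of zip, so it pulls only `n` items at a time and also works on an infinite iterable such as `itertools.count()`.

```python
from itertools import count, islice


def chunked(iterable, n):
    """Yield tuples of up to n items; the final tuple may be shorter.

    Hypothetical helper, not an itertools function.
    """
    it = iter(iterable)
    while True:
        group = tuple(islice(it, n))  # consume at most n items lazily
        if not group:                 # source exhausted
            return
        yield group


print(list(chunked(range(7), 3)))            # [(0, 1, 2), (3, 4, 5), (6,)]
print(list(islice(chunked(count(), 3), 2)))  # [(0, 1, 2), (3, 4, 5)]
```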

@lieryan lieryan mannequin added the stdlib (Python modules in the Lib dir) and type-feature (A feature request or enhancement) labels May 14, 2009
@rhettinger
Contributor

This has been rejected before.

  • It is not a fundamental itertool primitive. The recipes section in
    the docs shows a clean, fast implementation derived from zip_longest().

  • There is some debate on a correct API for odd lengths. Some people
    want an exception, some want fill-in values, some want truncation, and
    some want a partially filled-in tuple. That alone is reason enough not
    to set one behavior in stone.

  • There is an issue with having too many itertools. The module taken as
    a whole becomes more difficult to use as new tools are added.
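
To illustrate how a small change in the recipe changes the odd-length behaviour, here is a hedged sketch. `grouper` follows the zip_longest-based recipe with the `(n, iterable, fillvalue)` signature used elsewhere in this thread; `grouper_truncate` is a hypothetical variant that swaps in plain `zip()` and is not a documented recipe.

```python
from itertools import zip_longest


def grouper(n, iterable, fillvalue=None):
    """Fill-in behaviour: pad the last group with fillvalue."""
    args = [iter(iterable)] * n  # n references to the same iterator
    return zip_longest(*args, fillvalue=fillvalue)


def grouper_truncate(n, iterable):
    """Truncation behaviour (hypothetical variant): drop an incomplete last group."""
    args = [iter(iterable)] * n
    return zip(*args)


print(list(grouper(3, "ABCDEFG", "x")))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
print(list(grouper_truncate(3, "ABCDEFG")))
# [('A', 'B', 'C'), ('D', 'E', 'F')]
```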

@rhettinger rhettinger self-assigned this May 14, 2009
@lieryan
Mannequin Author

lieryan mannequin commented May 14, 2009

All implementations relying on zip or zip_longest break with infinite
iterables (e.g. itertools.count()).

And it is not impossible to define a clean, flexible, and familiar API
similar to open()'s mode argument or the Unicode error modes. The modes
would be 'error' (default), 'pad', 'truncate', and 'partial' (though
'partial' could use a better name).
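
A rough sketch of what such a mode-based function might look like. The four mode names come from the proposal above; the signature, the `mode` and `fillvalue` parameter names, and the islice-based implementation are assumptions made for illustration and do not describe any existing itertools API.

```python
from itertools import count, islice

_MODES = {"error", "pad", "truncate", "partial"}


def grouper(iterable, n, mode="error", fillvalue=None):
    """Hypothetical mode-flag grouper; not part of itertools."""
    if mode not in _MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    it = iter(iterable)
    while True:
        group = tuple(islice(it, n))
        if not group:
            return
        if len(group) < n:  # incomplete final group
            if mode == "error":
                raise ValueError("length is not divisible by n")
            if mode == "pad":
                yield group + (fillvalue,) * (n - len(group))
            elif mode == "partial":
                yield group
            # mode == "truncate": drop the leftover items
            return
        yield group


print(list(grouper(range(7), 3, mode="pad")))       # [(0, 1, 2), (3, 4, 5), (6, None, None)]
print(list(grouper(range(7), 3, mode="truncate")))  # [(0, 1, 2), (3, 4, 5)]
print(list(grouper(range(7), 3, mode="partial")))   # [(0, 1, 2), (3, 4, 5), (6,)]
```

Because this is a generator, an invalid mode or an incomplete group in 'error' mode is only reported once iteration reaches that point, not when `grouper()` is called.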

> There is an issue with having too many itertools.
> The module taken as a whole becomes more
> difficult to use as new tools are added.

It should also be weighed that a lot of people expect this kind of
function in itertools. I think there are other functions in itertools
whose value is more questionable than a grouper's, such as starmap.

@rhettinger
Contributor

> All implementations relying on zip or zip_longest break
> with infinite iterables (e.g. itertools.count()).

How is it broken?
Infinite in, infinite out.

>>> from itertools import count, zip_longest
>>> def grouper(n, iterable, fillvalue=None):
...     args = [iter(iterable)] * n   # n references to one shared iterator
...     return zip_longest(*args, fillvalue=fillvalue)
...
>>> g = grouper(3, count())
>>> next(g)
(0, 1, 2)
>>> next(g)
(3, 4, 5)
>>> next(g)
(6, 7, 8)
>>> next(g)
(9, 10, 11)

> And it is not impossible to define a clean, flexible,
> and familiar API which will be similar to open()'s mode
> or unicode error mode. The modes would be 'error'
> (default), 'pad', 'truncate', and 'partial'

Of course, it's possible. I find that to be bad design. Generally, we
follow Guido's advice and create separate functions rather than overload
a single function with flags -- that is why we have filterfalse()
instead of a flag on filter(). When people suggest an API with multiple
flags, it can be a symptom of hyper-generalization where API complexity
gets substituted for writing a simple function that does what you want
in the first place. IMO, it is easier to learn the zip(g, g, g) idiom
and customize it to your own needs than to learn a new tool with four
flag options that control its output signature.
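
As an illustration of the zip(g, g, g) idiom and of customizing it into a separate single-purpose function, here is a minimal sketch. The name `grouper_strict` and the private `_MISSING` sentinel are hypothetical; the function is just one possible way to get the "raise an exception" behaviour from the same building blocks.

```python
from itertools import zip_longest

# The idiom: passing the SAME iterator three times makes zip() pull
# three consecutive items for every output tuple.
g = iter(range(9))
print(list(zip(g, g, g)))  # [(0, 1, 2), (3, 4, 5), (6, 7, 8)]

_MISSING = object()  # private sentinel that cannot appear in user data


def grouper_strict(n, iterable):
    """Hypothetical variant: raise if the last group is incomplete."""
    args = [iter(iterable)] * n
    for group in zip_longest(*args, fillvalue=_MISSING):
        if group[-1] is _MISSING:  # padding only ever lands at the end
            raise ValueError("length is not divisible by n")
        yield group


print(list(grouper_strict(3, range(6))))  # [(0, 1, 2), (3, 4, 5)]
```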

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022