itertools: clearer all_equal recipe #92628
Comments
Thank you for the suggestion, but I find the current one to be more clear (perhaps that's just me). Also, part of the purpose of the examples is to show various techniques for using the language. The use of "and" is one such technique. Python's "and" is different from some other languages in that it can return non-boolean values. It is okay to use that feature as designed. Thanks again for the suggestion, but I will decline. |
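As a small illustration of the point about `and` returning non-boolean values, here is a minimal sketch. The sample values are arbitrary assumptions and not part of the recipe.

```python
# `and` returns one of its operands, not necessarily a bool.
falsy_first = 0 and "unused"        # the first falsy operand is returned as-is
all_truthy = "first" and "second"   # all operands truthy: the last one is returned
short_circuit = [] and 1 / 0        # 1 / 0 is never evaluated

print(repr(falsy_first), repr(all_truthy), repr(short_circuit))

# In the recipe's expression the final operand is `not ...`, so whenever the
# first operand is truthy the overall result is still a real bool.
g = iter([("a", None)])             # stand-in for an iterator of (key, group) pairs
result = next(g, True) and not next(g, False)
print(result)  # True
```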
Not sure why you're saying that. That's not what happens there. The result of that The result of the That I've certainly done that myself, I'm no stranger to it. But that was mostly inside list comprehensions, where I don't have the "luxury" of being able to write a separate statement. I'm interested to see what others think. Should I post this in some mailing list for discussion instead? |
One more thing, what do you think about these?

def proposal3(iterable):
    g = groupby(iterable)
    return not any(g) or not any(g)

def proposal4(iterable):
    g = groupby(iterable)
    return not (any(g) and any(g))

(same speed as |
I would like to respond to @rhettinger by pointing out that the main problem with the current recipe is that it does not short-circuit when given an empty iterable, making an unnecessary second call to I would also like to add that the wording of the current docstring makes it unclear that the function would return |
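The short-circuit behavior described here can be checked by counting `next` calls. The sketch below is only an illustration: `CountingIterator` and `count_next_calls` are hypothetical helpers, and the two checks take the groups iterator as a parameter (unlike the recipe) so that it can be wrapped.

```python
from itertools import groupby

class CountingIterator:
    """Hypothetical helper: wraps an iterator and counts next() calls on it."""
    def __init__(self, iterator):
        self._iterator = iterator
        self.calls = 0

    def __iter__(self):
        return self

    def __next__(self):
        self.calls += 1
        return next(self._iterator)

def current_recipe(g):
    # Expression from the recipe under discussion.
    return next(g, True) and not next(g, False)

def short_circuiting(g):
    # Expression from the "not next or not next" proposal.
    return not next(g, False) or not next(g, False)

def count_next_calls(check, iterable):
    g = CountingIterator(groupby(iterable))
    result = check(g)
    return result, g.calls

print(count_next_calls(current_recipe, []))    # (True, 2)
print(count_next_calls(short_circuiting, []))  # (True, 1)
```

For non-empty input both expressions call `next` twice, so the difference only shows up for an empty iterable.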
FWIW I find the use of |
@blhsing To be clear, for me it's really a clarity issue, I find squeezing the always-true I don't see the unnecessary second I disagree with the docstring change. The current docstring is fine, matches the function name, and is "positive" instead of "doubly negative" like the suggestion. (For clarity: I'm the issue creator, new account due to current PC trouble.) |
Hmm, I agree that And actually, people being less familiar with Yet another, which I think reads particularly easily, as it tells you in words exactly what it does:

def all_equal(iterable):
    groups = groupby(iterable)
    for first in groups:
        for second in groups:
            return False
    return True

But I already know this isn't everyone's cup of tea. Again, speed is not my point, but I'm curious, so here's an updated benchmark:
Benchmark script

def original(iterable):
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

def next__not_next(iterable):
    g = groupby(iterable)
    next(g, None)
    return not next(g, False)

def not_next_or_not_next(iterable):
    g = groupby(iterable)
    return not next(g, False) or not next(g, False)

def not_any_or_not_any(iterable):
    g = groupby(iterable)
    return not any(g) or not any(g)

def not__any_and_any(iterable):
    g = groupby(iterable)
    return not (any(g) and any(g))

# blhsing's from https://discuss.python.org/t/slight-improvement-to-the-all-equal-recipe-in-itertools-doc/46390
def not__next_and_next(iterable):
    g = groupby(iterable)
    return not (next(g, False) and next(g, False))

def for_for_False_True(iterable):
    groups = groupby(iterable)
    for first in groups:
        for second in groups:
            return False
    return True

def for_for_False_True_True(iterable):
    groups = groupby(iterable)
    for first in groups:
        for second in groups:
            return False
        return True
    return True

funcs = [original, next__not_next, not_next_or_not_next, not_any_or_not_any, not__any_and_any, not__next_and_next, for_for_False_True, for_for_False_True_True]

iterables = (), (1,), (1, 2)

from timeit import timeit
from statistics import mean, stdev
import sys
import random
from itertools import groupby

for iterable in iterables:
    print(f'{iterable = }')
    times = {f: [] for f in funcs}
    def stats(f):
        ts = [t * 1e9 for t in sorted(times[f])[:10]]
        return f' {mean(ts):3.0f} ± {stdev(ts):3.1f} ns '
    for _ in range(100):
        random.shuffle(funcs)
        for f in funcs:
            t = timeit(lambda: f(iterable), number=1000) / 1000
            times[f].append(t)
    for f in sorted(funcs, key=stats):
        print(f(iterable), stats(f), f.__name__)
    print()

print('Python:', sys.version)
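The benchmark only measures speed. A quick agreement check for a few of the variants might look like the sketch below; the sample inputs and the set-based expected value are assumptions chosen for illustration (they require hashable elements, which the recipe itself does not).

```python
from itertools import groupby

def original(iterable):
    g = groupby(iterable)
    return next(g, True) and not next(g, False)

def not_next_or_not_next(iterable):
    g = groupby(iterable)
    return not next(g, False) or not next(g, False)

def not__any_and_any(iterable):
    g = groupby(iterable)
    return not (any(g) and any(g))

def for_for_False_True(iterable):
    groups = groupby(iterable)
    for first in groups:
        for second in groups:
            return False
    return True

variants = [original, not_next_or_not_next, not__any_and_any, for_for_False_True]
samples = [(), (1,), (1, 1, 1), (1, 2), (1, 1, 2), "aaa", "aab"]

for sample in samples:
    # All elements are equal exactly when there is at most one distinct value.
    expected = len(set(sample)) <= 1
    for f in variants:
        assert f(sample) is expected, (f.__name__, sample)
```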
Use of |
In my opinion, those things are all worth knowing. Do you not agree? No, neither trick puzzle nor code golf is the intention. Clarity is. Like I said at Stack Overflow, it reads like "All elements are equal iff ... there is no first group or there is no second group.". Or specifically for the |
Knowing, yes. Having to think about all four at once just to work out the code is checking the length of an iterator? Disagree.
That "second" you parenthesized is kind of the point though. The code doesn't say what you say it says. It says "x and x". And "any" is not normally read like that anyway, it's read as "any true". So what you have actually written reads at first as "is not (any group true or is any group true)" which naturally converts to "is any group false or is any group false" to anyone used to booleans. Relying on side-effects and truthiness to turn that into what you meant is what I consider unreadable code golf about it. I don't think I'm going to convince you at this point, though :) |
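For readers following the disagreement about `any(g)`: the variants rely on `groupby` yielding non-empty `(key, group)` tuples, which are always truthy, so each `any(g)` call consumes at most one group. A minimal sketch with arbitrary sample inputs:

```python
from itertools import groupby

# Each item from groupby is a (key, group) tuple. A non-empty tuple is truthy
# even when the key itself is falsy, so any(g) stops at the first item.
g = groupby("aab")
has_first = any(g)                       # consumes only the 'a' group
remaining_keys = [key for key, _ in g]   # what is left after that
print(has_first, remaining_keys)         # True ['b']

g = groupby("aab")
three_calls = (any(g), any(g), any(g))
print(three_calls)                       # (True, True, False)

falsy_keys = any(groupby([0, 0]))
print(falsy_keys)                        # True
```

So `any(g)` here effectively means "is there another group", which is the side effect the two readings in this exchange disagree about.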
Oh just to add! I really appreciate your taking the time to explain, even when I don't agree. Thanks! |
@alicederyn To me it does read like I said, but that's indeed with me already understanding it. Not how it reads "at first", as you said. Thanks for your thoughtful feedback. I'd like to ask for another, about the original issue, which still bothers me. The current recipe and my proposal 1 and proposal 2. How would you judge those three regarding clarity? |
I think proposal 2 is the clearest. I like that the empty case is handled explicitly rather than "falling through" the first next call, which is a little bit more mental load otherwise. |
Thanks. For me, proposal 1 is clearest. Its standalone But yes, I also agree with your reason for finding proposal 2 clearest. For me, it's in the middle of the three. Meaningfully using the first @blhsing Do you also find proposal 2 clearer than the current recipe? It's equivalent to yours (except you De Morgan'ed it), but your point was efficiency, you didn't comment on its clarity. |
This was also discussed on Discourse and the recipe has been changed to the following, making this issue obsolete:

def all_equal(iterable, key=None):
    "Returns True if all the elements are equal to each other."
    return len(take(2, groupby(iterable, key))) <= 1
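The new recipe depends on `take`, which is itself one of the itertools recipes rather than a library function. A self-contained sketch, with sample calls that are illustrative assumptions:

```python
from itertools import groupby, islice

def take(n, iterable):
    "Return first n items of the iterable as a list."
    return list(islice(iterable, n))

def all_equal(iterable, key=None):
    "Returns True if all the elements are equal to each other."
    # At most two groups are ever materialized; zero or one group means "all equal".
    return len(take(2, groupby(iterable, key))) <= 1

print(all_equal(""))                        # True
print(all_equal("aaa"))                     # True
print(all_equal("aab"))                     # False
print(all_equal("Aaa", key=str.casefold))   # True
```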
PR for reference: #116081 |
Very clean. Love it! |
The Itertools Recipes have this recipe:
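The recipe's code is missing at this point in the page. Judging from the `original` function in the benchmark script and the description that follows, it was presumably along these lines (the docstring wording is an assumption, borrowed from the later revision quoted in this thread):

```python
from itertools import groupby

def all_equal(iterable):
    "Returns True if all the elements are equal to each other"
    g = groupby(iterable)
    # next(g, True) is always truthy; it is evaluated only to skip the first group.
    return next(g, True) and not next(g, False)
```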
The next(g, True) tries to read a first group and if there isn't one, then... uh... actually next(g, True) is always true, so that's squeezed into the return expression just for the side effect of skipping the first group. Psych!

I think for official recipes, it would be better to write clearer code. Two ideas for replacing it:
Proposal 1:
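The code for this proposal is missing here. Based on the description below and on `next__not_next` in the benchmark script, it presumably looked like this sketch (the function name and comment are assumptions):

```python
from itertools import groupby

def all_equal(iterable):
    g = groupby(iterable)
    next(g, None)  # skip the first group, if any; the result is deliberately unused
    return not next(g, False)
```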
This makes the first next call a statement that makes it obvious we're not using its result and just do it for skipping the first group. I also replaced True with None to further indicate this.

Proposal 2:
This reads as "There is no first group or there is no second group". This actually uses the result of the first next(g, False): If it shows that there is no first group, then the function returns True immediately. It doesn't even try to look for a second group (unlike the original recipe and my first proposal, which both do). Also, it's simpler and nice to have the exact same expression not next(g, False) twice.

Really my point is clarity, but for completeness let me also show benchmarks. My first proposal was faster than the original in every case, and the most significant difference is my second proposal being faster for empty iterables.
Python 3.10.4 on Debian (on a Google Compute Engine instance):
Python 3.10.4 on Windows (my laptop):
The benchmark code: