-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Precompute range length and enhance range subscript support #46942
Comments
Attached patch makes range objects precompute their length on creation. See discussion starting at http://mail.python.org/pipermail/python- |
So with this patch, range(10**100) produces an OverflowError: is that I don't much like this aspect of the change: there are uses for for i in range(ridiculously_large_number):
...
if condition_that_occurs_early_in_practice:
break |
On Fri, Apr 25, 2008 at 12:55 PM, Mark Dickinson <report@bugs.python.org> wrote:
For this application, I would use "for i in itertools.count():" |
I guess there needs to be a decision on whether to make range objects of I can see three options, besides leaving things as they are: (1) make large ranges illegal, as with this patch Option 3 seems to me like the ideal from the users' point of view, but I'm Option 2 seems messy: half of one thing and half of the other, but I If Option 1 is indeed the preferred option, then the patch looks good to Whatever happens, we probably also need a documentation update explaining |
Option (3) would require changing both sq_item and sq_length signatures, Option (2) would either require a change for the sq_length signature, or What are the use cases for ranges with length greater than maxsize? Note |
I am currently working on a patch that allows large ranges and large There is still a limit with len(), which seems bound by the size_t limit. I join the current version of the patch. Note: I found more useful to store a "range->end" member, which is the |
Yeah---I'm a bit short on use cases. The one that originally bit me with for i in range(1, p):
if (i_is_a_nonresidue_modulo_p):
break Here p might be a 200-digit prime number, and the situation is that half Of course, it's not hard to rewrite this with a while loop instead; it I'd also note that it's not completely out of the question that something I agree it's a bit strange to have a semi-functional range object, that |
On Fri, Apr 25, 2008 at 4:37 PM, Mark Dickinson <report@bugs.python.org> wrote:
Hmm, AFAIKT there is always at least one non-residue between 1 and p for i in itertools.count(1):
if (i_is_a_nonresidue_modulo_p):
break maybe with an additional check for p > 1. |
Sure. It's just uglier that way. :-) And I feel it would be mildly These really aren't serious objections---just mild preferences. I'll stop |
Given that range() produced a list in the 2.x series (hence limited to So my preference is to mimic the 2.x range's behaviour in this case by (From Python 2.5.1) >>> range(2**99, 2**100)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: range() result has too many items
>>> range(2**99, 2**99+5)
[633825300114114700748351602688L, 633825300114114700748351602689L,
633825300114114700748351602690L, 633825300114114700748351602691L,
633825300114114700748351602692L]
>>> xrange(2**99, 2**99+5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to int |
I've implemented range slicing and x in range(..) in range-sequence.diff |
Reviewing my own patch (range-sequence.diff), I've realized that it is |
My 2 cents: for me is more useful to have a unbound range() than to be For me, range() should mimic a number generator: no limit, no length. |
That's itertools.count() |
It also isn't what range() and xrange() are used for now in 2.x. range() With itertools.count() available for the unbounded iterator case, I |
Fair enough, specially if in the documentation of range() we put "if you Thanks! |
Has a resolution been made on this? |
On the issue of whether len(range()) should be allowed to be > I also think ranges should be introspectable, exposing their start, stop Probably all those changes are for post 3.0. |
Guido, does that mean we can apply this patch to get the cleaner |
This issue was missing a priority setting. Alexander's range-sequence patch still applies cleanly to the Py3k As Alexander notes above, range_contains does still need slightly better Since that explanation got kind of complicated, I've added a modified |
It looks like your range_subscript() forgets to compute the length field... |
My initial patch had a few problems - I removed it and uploaded a |
Good catch Antoine (I missed that in Alexander's patch) - working on |
v2 of my updated patch attached to fix the issue Antoine noted. Also gets rid of some tab-instead-of-spaces indenting issues in the However, the patch still has issues, as can be seen with the new tests I |
By the way, why is this release blocker? do we have performance numbers? |
I wonder if we shouldn't hold off on this. It's a substantial amount of |
Given the problems with it, I'm dropping this to normal priority and (the release blocker status was just temporary to ensure we got a |
I'd still like to have another look at this before beta 1, but can't promise I'll get to it. Unassigning in case someone else has some roundtuits to spare over the next few weeks. |
I brought the patch up to date for the Py3k branch, but realised just before checking it in that it may run afoul of the language moratorium (since it alters the behaviour of builtin range objects). However, the .count() and .index() methods (along with the Sequence ABC registration) as well as the O(1) containment testing for integers were already put in place. (I also noticed that the new methods from issue bpo-9213 are not mentioned in the range() docs, unlike the O(1) optimisation) I've gone ahead and checked it in as r86970, as I see this patch as filling out the promise of the Sequence ABC registration by adding support for slicing and negative indices (with the length caching as more of an implementation detail). However, I'm also leaving the tracker issue open and assigning to Georg in case he wants to revert it before the beta goes out. Note that I also fixed the patch so that OverflowError occurs only when encountering an affected operation (primarily indexing and retrieval of the length). If you don't do any of those things, you can make your ranges as large as you like. (The indexing could fairly easily be fixed to eliminate the overflow errors - I just didn't do it in this patch, since it is a separate problem). |
Wasn't that fixed in bpo-9746? |
On Sat, Dec 4, 2010 at 2:26 AM, Daniel Stutzbach <report@bugs.python.org> wrote:
Ah, I see what you mean. I was more referring to the lack of a Some of my new additions to the range documentation could probably be Cheers, |
Daniel Stutzbach pointed out that range() is also mentioned under: The descriptions of range's limitations there is no longer accurate (slicing is supported following this patch and containment testing is now efficient) |
Want to open a new issue for that? (or is there one already?) |
(Oops, didn't mean to reclose this yet) I want to wait for Georg's verdict on including the functionality in 3.2 before stressing too much about correct documentation of it :) |
I'm fine with it: as with the other changes for .count and .index, consistency with the protocols/ABCs the types are members of is not exclusively a new feature. |
The descriptions of range's limitations in the docs still needs an update. |
Not sure this worth a patch, to me it looks like a removal of a single word. But here it goes anyway. |
Applied in r87162 |
The other major change for ranges is that "in" and "not in" are no longer inefficient for actual instances of int (it does an arithmetic calculation instead of a linear search). |
Is the in/not-in fast path in 2.7? |
No, I believe it was added as part of the .index() and .count() implementation. Checking the source, there's definitely no sq_contains implementation in 3.1 or 2.7. |
here is the patch for the py3k docs. |
Doc change committed to py3k in r87346. Thanks, SilentGhost! I also committed r87349 to reverse r87162 (which was in the wrong branch). |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: