-
-
Notifications
You must be signed in to change notification settings - Fork 29.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug in range() function for large values #45874
Comments
Range accepts arguments coerce-able into ints via __int__, but rejects The problem is in handle_range_longs in bltinmodule.c:1527-1541. If they Attached is a file that reproduces this bug. |
Is this a bug? print range(int(MyInt(2**64)), int(MyInt(2**64+10))) |
Yes, that is a workaround, but range(MyInt(n), MyInt(n+10)) should work for any valid value of n, not just some of them. |
FWIW, using xrange() it seems to give the proper error message: Traceback (most recent call last):
File "bad_range.py", line 12, in <module>
print xrange(MyInt(2**64), MyInt(2**64+10))
OverflowError: long int too large to convert to int |
Yes, the error for xrange is more illustrative of the problem, but just |
According to the documentation |
This is related to bpo-1540617 and bpo-1546078. bpo-1540617 contains a simple patch that extend acceptable range of bpo-1546078 contains a complete long integer support implementation and It looks like all three issues can be closed by either accepting or |
I'm -10 on the patch in bpo-1540617 ( +/-2**63). Reason: It's a good thing that the range of "range" is limited since it |
Christian, I was probably a bit sloppy using "range" instead of "xrange," but bpo-1540617 is limited to xrange only. Are you still -10 on extending |
Christian, Are you working on the memory problem or on this issue? I think I have a |
Attached patch addresses OP's issue: $ ./python.exe bad_range.py
[8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
here
[18446744073709551616L, 18446744073709551617L, 18446744073709551618L,
18446744073709551619L, 18446744073709551620L, 18446744073709551621L,
18446744073709551622L, 18446744073709551623L, 18446744073709551624L,
18446744073709551625L]
[18446744073709551616L, 18446744073709551617L, 18446744073709551618L,
18446744073709551619L, 18446744073709551620L, 18446744073709551621L,
18446744073709551622L, 18446744073709551623L, 18446744073709551624L,
18446744073709551625L] The only existing test that fails is range(1e100, 1e101, 1e101) >>> range(1e100, 1e101, 1e101)
__main__:1: DeprecationWarning: integer argument expected, got float
[10000000000000000159028911097599180468360808563945281389781327557747838
772170381060813469985856815104L]
Note that range(1e100, 1e101) would still fail:
>>> range(1e100, 1e101)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: range() result has too many items An alternative solution would be to disallow non-ints regardless of |
Alexander Belopolsky's patch looks like the right fix for range() to me. The xrange limits still hold, but that should probably be a separate |
I'm just curious to know which is the right behavior. # Python 3.0
Traceback (most recent call last):
File "bad_range.py", line 7, in <module>
print(range(MyInt(2**3), MyInt(2**3+10)))
TypeError: 'MyInt' object cannot be interpreted as an integer
# Python 2.7a0
[8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
here
[18446744073709551616L, 18446744073709551617L, 18446744073709551618L,
18446744073709551619L, 18446744073709551620L, 18446744073709551621L,
18446744073709551622L, 18446744073709551623L, 18446744073709551624L,
18446744073709551625L]
Traceback (most recent call last):
File "bad_range.py", line 11, in <module>
print(range(MyInt(2**64), MyInt(2**64+10)))
TypeError: range() integer start argument expected, got instance. |
I think *both* behaviors are wrong, the 3.0 one is backwards For our particular use case, it is very annoying to not be able to use |
Updating versions. |
This change is out of scope for 2.5.3 (plus, the desired behavior still |
As far as I can tell there's no bug in 3.x: the 3.x range happily accepts an instance of a class that defines __index__. |
Currently, in trunk, I get: >>> range(0.0, 11.0, 1.1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: range() integer start argument expected, got float. But with Alexander's patch on trunk, I get: >>> range(0.0, 11.0, 1.1)
[0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L] I'm not sure whether this is intentional or not, but IIRC it was a very deliberate choice not to allow float arguments to range (especially when, as here, the arguments are simply being truncated). I don't think this is an acceptable change for 2.7 (still less for 2.6). Any patch for this issue should not change the behaviour for small arguments. IMO, the *right* solution is to convert arguments via __index__ when possible (as 3.x appears to already do). However, this would be a new feature. I suggest closing this as a 'won't fix'. |
Ignore this bit. IDRC. It was a deliberate choice not to let something range(0.0, 1.0, 0.1) work to give [0.0, 0.1, ...], since the usual floating-point difficulties give uncertainty about the endpoint. That float arguments were allowed (and silently truncated to integers) is merely unfortunate. :) And it's no longer permitted in 2.7; I wouldn't want to go back to permitting float arguments here. I'll set this to pending; it should be closed unless someone comes up with a simple fix in the near future. |
I agree that this issue should be closed with no further action, but for historical accuracy the resolution should be "out of date" rather than "won't fix". The original bug was about range() behavior when it get arguments that are not ints, but "coerce-able into ints via __int__". Since range() no longer accepts such arguments, the issue is moot and there is nothing to fix or not fix here. As a pie in the sky idea, I always wanted a range function that would work on any arguments that support addition and ordering. For example range(date(2010,1,1), date(2010, 2, 1), timedelta(7)) to return all Fridays in January, 2010. However, since advent of generator functions, this became simply def range(start, stop, step):
while start < stop:
yield start
start += step and thus unnecessary for stdlib. |
Alexander: range *does* still accept such arguments (in 2.7); just not floats: >>> from decimal import Decimal
>>> range(Decimal(20), Decimal(20))
[]
>>> range(Decimal('1e100'), Decimal('1e100'))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: range() integer start argument expected, got Decimal. |
BTW, I've changed the various "nb_int should return int object" error messages in Objects/intobject.c to the more meaningful and accurate message: "__int__ method should return an integer", in r80695. |
Attached patch, bpo-1533.diff, simplifies reference acrobatics at the expense of extra local variables. For the first time the first compilation did not have reference counting errors! |
On Sun, May 2, 2010 at 4:37 AM, Mark Dickinson <report@bugs.python.org> wrote:
Working on it ... |
Done. |
Thanks for the fixes. The latest patch looks good to me. Alexander, is it okay for me to commit this? |
On May 4, 2010, at 7:27 AM, Mark Dickinson <report@bugs.python.org>
Sure. Why do you have to ask? |
Hi Alexander, I took the liberty of messing with your patch slightly; I didn't want to ask you to make further changes since the patch was fine, and my messing was mostly just to satisfy my own fussiness (only the first two items were really necessary): Minor updates:
Any objections or comments? Can I apply this version of the patch? |
I see. Let me take a look. BTW, I tried to use tabs for indentation in the patch, but emacs was On May 4, 2010, at 8:34 AM, Mark Dickinson <report@bugs.python.org>
|
Re emacs: C-c . python should set a python 2.x-friendly indentation mode. There's also a python-new style floating around somewhere on the web (not part of emacs as standard), suitable for the 4-space indent style that's supposed to be used for new code. |
On Tue, May 4, 2010 at 8:34 AM, Mark Dickinson <report@bugs.python.org> wrote:
Thank you for doing it. The patch is good to go, questions and comments below are just for my education.
What is the standard practice on this? I thought Misc/NEWS entry was similar to commit message and would be written by a committer. What are the guidelines for writing such entries? I see that some entries simply repeat the title of the issues while others got into greater details.
I've looked at my patch through cat -et and couldn't find any tab/space issues. I did notice one empty line with a single space that you removed. Do you use an automated checking tool for this? I just did grep " $". BTW, the current trunk version (r80752) of Python/bltinmodule.c has two lines with trailing white space: $ cat -n Python/bltinmodule.c | grep " $" | cat -et
641^I^I^IPyErr_SetString(PyExc_TypeError, $
2966^I $
With arguments processing segregated in a common function, I am not sure whether additional tests actually increase coverage. This brings a question: what is the recommended way to produce coverage statistics for C modules? regrtest.py --coverage seems to be for python modules only.
Good catch. Wouldn't the same argument apply to ilow? Wouldn't static code checkers complain about redundant initialization?
I actually disagree that your regrouping makes the code clearer. In my version, all idiosyncratic argument processing is done with borrowed references first followed by common processing in three similar blocks. This, however, is purely matter of taste. Note that I considered changing i* names to raw* names, but decided not to introduce more changes than necessary. In your grouping, however, the similarity of variable names is more of an issue. This said, I don't really have any problem with your choice.
Yes, I considered that. A further micro-optimization would be to initialize a static variable in module initialization and reuse it in get_len_of_range_longs as well. I decided to put it next to zero instead to simplify the logic.
Is it really worth it? Default start is probably not that common in case of long arguments.
The above are not objections. Please apply this version of the patch. |
[Some of the Alexander's questions about procedures aren't really related to this issue; I've answered those offline. Here are the answers to the others.]
ilow doesn't need to be initialized because the PyArgs_ParseTuple is Do static code checkers really complain about redundant
Okay, fair enough. I agree it's a matter of taste. I like the three
Hmm. Possibly. I have an unhealthy and probably irrational aversion
Yes, possibly not. :) Partly I made this change because the |
Applied to trunk in r80758. Do people want this to go into 2.6 as well? The patch would need to be modified to produce a warning for floats instead of giving a TypeError (and the tests would need to be modified to test for that warning). |
On Tue, May 4, 2010 at 12:32 PM, Mark Dickinson <report@bugs.python.org> wrote:
Also, should additional unit tests forward ported to 3.x? If so, let |
+1 for forward-porting/adapting relevant tests to py3k. |
I am attaching a py3k patch that adds new tests. Since there are no end user visible changes, I don't believe a Misc/NEWS entry is needed. A commit message may read: Issue bpo-1533: Tests only. Added tests for consistency in range function argument processing. Note that the first chunk: - # Reject floats when it would require PyLongs to represent.
- # (smaller floats still accepted, but deprecated)
+ # Reject floats.
+ self.assertRaises(TypeError, range, 1., 1., 1.) is applicable to the trunk as well. |
Perfect! Committed in r80836 (py3k); fixed that one test and comment in r80842 in trunk. Alexander, do you want to tackle the 2.6 backport? BTW, I think in most cases it's unnecessary to add Python 3.3 to the Versions field above, since there's no corresponding svn branch for 3.3. (But it's useful if there's a task that's specifically aimed at 3.3 and not earlier versions---e.g. if something's been deprecated in 3.2 and needs to be removed in 3.3.) |
That should have been r80839, not r80842. |
I've never done a maintenance branch backport, but here is my attempt:
I could probably use svn merge instead of svn diff + patch. Did I miss anything important? BTW, I've discovered "make patchcheck", does it check C files white space issues or only python files? Mark, I've sent you two emails off the tracker, but it looks like they did not make it through your spam filters. |
Yes, that sounds about right. But after all that, you'll still need to modify the patch somewhat, since the requirements are different for 2.6: floats should give a DeprecationWarning rather than a TypeError. I think that's a straightforward change for Python/bltinmodule.c. The trickier bit is coming up with tests that work properly---i.e., check that the appropriate warnings are produced, *and* that the the appropriate values are returned. Look into the 'catch_warnings' function in the warnings module; (there's also 'check_warnings' in test_support, but I think that doesn't exist in 2.6). 'make patchcheck' only checks Python files and ReST files, as far as I can tell. [I got your off-tracker emails; will respond anon.] |
I thought about it, but the comment in test_builtin.py,
convinced me that raising TypeError on large floats is a feature. I don't have a strong opinion on this issue, but I think a conservative approach is not to change current behavior in the maintenance branch unless it is clearly a bug. I did add a test checking that "smaller floats still accepted, but deprecated." |
Yes, okay---that makes some sense; I'm happy to leave floats as they are (i.e., DeprecationWarning for small floats; TypeError for larger floats) and just fix use of __int__ for non-floats. I'll look at the patch. |
The backport looks fine. Applied in r80917. Thanks, Alexander. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: