-
-
Notifications
You must be signed in to change notification settings - Fork 31.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
create a numbits() method for int and long types #47689
Comments
Python 3.0b2 (r30b2:65106, Jul 18 2008, 18:44:17) [MSC v.1500 32 bit
(Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import math
>>> math.frexp(10**100)
(0.5714936956411375, 333)
>>> math.frexp(10**1000)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C double
>>> (Same behavior in Python 2.5.2, and presumably 2.6 although I haven't I think it should be easy to make frexp work for large integers by My reason for requesting this change is that math.frexp is the fastest Actually, it would be even more useful to have a standard function to |
Would you like to work on a patch? |
I prefer your idea to expose PyLong_Numbits(). IMO, frexp() is very |
Another reason to leave frexp() untouched is that it is tightly assert ldexp(*frexp(pi)) == pi This relationship is bound to get mucked-up or confused if frexp starts |
Raymond, yes, I think that a separate numbits function would better, Good point about roundtripping, but the problem with that argument is >>> ldexp(*frexp(10**100)) == 10**100
False Anyway, if there is support for exposing _PyLong_Numbits, should it be a I'm attaching a patch (for the trunk) that adds a numbits method to the A slight problem is that _PyLong_NumBits is restricted to a size_t, so >>> (1<<3*10**9).numbits()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to int This would need to be fixed somehow. If numbits becomes a method, should it also be added to the Integral If it's too late to add this method/function for 2.6 and 3.0, please |
numbers.Integral is already way too fat of an API. Am -1 on expanding FWIW, in Py2.6 you can already write: def numbits(x):
return len(bin(abs(x))) - 2 |
I'd also be interested in having _PyLong_NumBits exposed to Python in some My favorite workaround uses hex instead of bin: 4*len('%x'%x) - correction_dictionary[first_hexdigit_of_x] but this is still O(log x) rather than O(1). |
With the patch, the following code causes a
[... interpreter hang ...] The culprit is, of course, the statement if (n < 0) in int_numbits: LONG_MIN is negated to itself (this may The patch also needs documentation, and that documentation One could make a case for (0).numbits() raising ValueError: Other than those two things, I think the patch looks fine. |
One possible fix would be to compute the absolute value unsigned long absn; Might this work? Perhaps it would also be worth changing the tests self.assertEqual((-a).numbits(), i+1) to self.assertEqual(int(-a).numbits(), i+1) This would have caught the -LONG_MAX error. |
Wow, newbie error. Thanks for spotting! In every case I can think of, I've wanted (0).numbits() to be 0. The |
Me too, in most cases, though I've encountered the occasional case where So I agree that (0).numbits() should be 0, but I think there's enough
The docstring looked okay to me. There should be more comprehensive |
To add support to the proposal: there is currently yet another thread on |
Some elaboration (that perhaps could be adapted into the documentation There are two primary uses for numbits, both of which justify The first is that for positive k, n = k.numbits() gives the minimum In Python terms, one could say that self.numbits() "returns the smallest Second, k.numbits() (plus/minus 1, or perhaps multiplied by a constant In particular, if L is a sorted list, then len(L).numbits() exactly Finally, the convention (-k).numbits() == k.numbits() is useful in |
I changed the title since I agree that numbits() with long integer is |
Accidentally removed the following message from Victor Stinner;
One other note: in Fredrik's patch there's commented out code for a Are there general guidelines for making things properties? |
No problem.
Since numbits() cost is O(n) with n: number of digits. I prefer a |
Aesthetically, I think numbits as a function would make more sense.
Unless I missed something, numbits() is O(1). Only the topmost word in a
O(1) is necessary but not sufficient. My sense is that an attribute |
Ooops, you're right. I looked quickly at the patch and I |
I consider .numbits to be an internal property of ints and would prefer Guido has said that the nuisance of tacking on otherwise unnecessary Another tack is to notice that numbits is the length of the bit sequence |
FWIW, I'm one of the people who'd additionally find indexing and slicing |
A property /looks/ like an attribute and an user might try to change |
Properties can be made read-only. Also, there is a good precedent: |
One more minor deficiency in the patch: it gives incorrect results for >>> x = 1 << 2**31-1
>>> x <<= 2**31-1
>>> x.numbits() # expect 4294967295
4294967295L
>>> x <<= 2
>>> x.numbits() # expect 4294967297
4294967295L It would be nicer if the OverflowError from _PyLong_NumBits were Alternatively, in case of OverflowError one could recompute numbits |
Why not, but I prefer your second proposition: return a long integer. >>> x=1<<(2**31-1)
>>> n=x.numbits(); n, n.numbits()
(2147483648L, 32L)
>>> x<<=(2**31-1)
>>> n=x.numbits(); n, n.numbits()
(4294967295L, 32L)
>>> x<<=1
>>> n=x.numbits(); n, n.numbits()
(4294967296L, 33L) # yeah! With my patch, there are two functions:
|
Hi, Victor! Thanks for the updated patch. Your patch still hangs on:
on my 32-bit machine. |
Ooops, nice catch! It was an integer overflow, already present in |
The latest patch from Victor looks good. A few comments: (1) the number of bits should be computed first directly using C (2) Just as a matter of style, I think "if (x == NULL)" is preferable (3) the reference counting all looks good. (4) I quite like the idea of having numbits be a property rather than a |
Okay; I don't have strong feelings about the form the Python code takes; |
IMO, the choices are something like my version or none at all. The Anyone who disagrees needs to show both code fragments to some junior No fair trying this experiment on assembly language programmers ;-) |
Why would finance people be interested in bit_length()? I think this discussion begins to feel like bikeshedding. Documentation |
Antoine, it's not bike-shedding at all. Communicative docs are Yes, you're correct, I can fix-up the docs after the patch is posted. |
Updated patch. |
Bah. Fix test_int so that it actually works. |
...and use proper unittest methods instead of asserts... |
Looks good. Marking as accepted. Before applying, consider adding back the part of the docs with the '1 + |
My only (minor) objection to this definition is that a straight Python >>> from math import floor, log
>>> n = 2**101
>>> n.bit_length()
102
>>> 1 + floor(log(n)/log(2))
101.0
>>> n = 2**80-1
>>> n.bit_length()
80
>>> 1 + floor(log(n)/log(2))
81.0 But as you say, it provides another perspective; I'm fine with |
Also, consider writing it in the two argument form: 1 + floor(log(n, 2)) and using the word "approximately" or somesuch. |
Committed to trunk in r67822, py3k in r67824. |
Really? YEAH! Great job everybody ;-) I read the code and it looks |
32 bytes isn't worth sharing between modules. |
Posted some doc cleanups in r67850 and r67851. |
About the footnote: floor(log(n, 2)) is poor code. This is not supposed to be a dramatic If 1 + floor(log(n, 2)) happens to give the correct result in the common In the case of IEEE 754 doubles, a large part of the luck is that the So I don't like seeing this poor code in the Python reference manual, for IMO, either: (1) the warning needs to be stronger, or (2) the formulation Mark |
Other possible wording: ... so that |
I guess that could work. One other thing: the docs for the trunk seem to suggest that we should http://docs.python.org/dev/library/stdtypes.html#numeric-types-int- and particularly the note (2), that says of int(x) and long(x): "Deprecated since version 2.6: Instead, convert floats to long |
There is a small mistake in the docs: def bit_length(x):
'Number of bits necessary to represent self in binary.'
s = bin(x) # binary representation: bin(-37) --> '-0b100101'
s = s.lstrip('-0b') # remove leading zeros and minus sign
return len(s) # len('100101') --> 6 is probably supposed to be: def bit_length(x):
'Number of bits necessary to represent x in binary.'
s = bin(x) # binary representation: bin(-37) --> '-0b100101'
s = s.lstrip('-0b') # remove leading zeros and minus sign
return len(s) # len('100101') --> 6 |
Yes there was. Fixed in 82146. |
Minor addendum to Mark's last message: in the near release version of 2.7 (rc2), note 2 in 5.4. "Numeric Types — int, float, long, complex" now starts "Conversion from floats using int() or long() truncates toward zero like the related function, math.trunc()" and no longer marks the usage of long as deprecated. |
So there are two issues here:
I'm not sure which Terry is referring to here. On the first, I don't think use of int() with float arguments actually *is* deprecated in any meaningful way. At one point there was a push (related to PEP-3141) to deprecate truncating uses of int and introduce a new builtin trunk, but it never really took hold (and trunc ended up being relegated to the math module0; I certainly don't expect to see such deprecation happen within the lifetime of Python 3.x, so I don't think it would be appropriate to mention it in the 2.x docs. On the second, it's possible that there should be a mention somewhere in the 2.x docs that long() no longer exists in 3.x, and that for almost all uses int() works just as well. A separate issue should probably be opened for this. |
Aargh! |
Please open a new issue for the documentation problems, it's no more related to "numbits()" method and this issue is closed. |
Whoops, sorry to create confusion when I was trying to put this issue to rest completely. Let me try better. In his last message, Raymond said "Other possible wording: That is what the current 2.7rc2 doc says. In response, Mark said "I guess that could work." but quoted footnote 2, which implied that 'int' should be changed to 'trunc' in the example above. This implied to me that there was a lingering .numbits doc issue. I hope this is the end of this. |
Ah; sorry for misunderstanding. Thanks for the explanation, Terry! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: