gh-111140: Adds PyLong_AsNativeBytes and PyLong_FromNative[Unsigned]Bytes functions (GH-114886)
zooba committed Feb 12, 2024
1 parent a82fbc1 commit 7861dfd
Showing 14 changed files with 533 additions and 26 deletions.
66 changes: 66 additions & 0 deletions Doc/c-api/long.rst
@@ -113,6 +113,28 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
retrieved from the resulting value using :c:func:`PyLong_AsVoidPtr`.
.. c:function:: PyObject* PyLong_FromNativeBytes(const void* buffer, size_t n_bytes, int endianness)
Create a Python integer from the value contained in the first *n_bytes* of
*buffer*, interpreted as a two's-complement signed number.
*endianness* may be passed ``-1`` for the native endian that CPython was
compiled with, or else ``0`` for big endian and ``1`` for little.
.. versionadded:: 3.13
.. c:function:: PyObject* PyLong_FromUnsignedNativeBytes(const void* buffer, size_t n_bytes, int endianness)
Create a Python integer from the value contained in the first *n_bytes* of
*buffer*, interpreted as an unsigned number.
*endianness* may be passed ``-1`` for the native endian that CPython was
compiled with, or else ``0`` for big endian and ``1`` for little.
.. versionadded:: 3.13
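
The behavior described for the two constructors above can be modelled in pure Python with `int.from_bytes`. This is only a rough sketch of the documented semantics, not the C implementation. The helper names `from_native_bytes`, `from_unsigned_native_bytes` and `_byteorder` are hypothetical, and raising `ValueError` for an unknown *endianness* is a choice made for this model only.

```python
import sys


def _byteorder(endianness):
    # Hypothetical helper: map the documented endianness codes
    # (-1 native, 0 big, 1 little) to the names int.from_bytes expects.
    if endianness == -1:
        return sys.byteorder
    if endianness == 0:
        return "big"
    if endianness == 1:
        return "little"
    raise ValueError("endianness must be -1, 0 or 1")


def from_native_bytes(buffer, n_bytes, endianness):
    # Model of the signed variant: the first n_bytes are read as a
    # two's-complement signed number.
    return int.from_bytes(buffer[:n_bytes], _byteorder(endianness), signed=True)


def from_unsigned_native_bytes(buffer, n_bytes, endianness):
    # Model of the unsigned variant: the same bytes read as unsigned.
    return int.from_bytes(buffer[:n_bytes], _byteorder(endianness), signed=False)
```

With this model the byte `b'\xff'` reads as `-1` when signed and `255` when unsigned, which matches the cases exercised in this commit's `test_long_fromnativebytes`.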
.. XXX alias PyLong_AS_LONG (for now)
.. c:function:: long PyLong_AsLong(PyObject *obj)
@@ -332,6 +354,50 @@ distinguished from a number. Use :c:func:`PyErr_Occurred` to disambiguate.
Returns ``NULL`` on error. Use :c:func:`PyErr_Occurred` to disambiguate.
.. c:function:: Py_ssize_t PyLong_AsNativeBytes(PyObject *pylong, void* buffer, Py_ssize_t n_bytes, int endianness)
Copy the Python integer value to a native *buffer* of size *n_bytes*::
int value;
Py_ssize_t bytes = PyLong_AsNativeBytes(pylong, &value, sizeof(value), -1);
if (bytes < 0) {
    // Error occurred
    return NULL;
}
else if (bytes > (Py_ssize_t)sizeof(value)) {
    // Overflow occurred, but 'value' contains as much as could fit
}
*endianness* may be passed ``-1`` for the native endian that CPython was
compiled with, or ``0`` for big endian and ``1`` for little.
Return ``-1`` with an exception raised if *pylong* cannot be interpreted as
an integer. Otherwise, return the size of the buffer required to store the
value. If this is equal to or less than *n_bytes*, the entire value was
copied.
Unless an exception is raised, all *n_bytes* of the buffer will be written
with as much of the value as can fit. This allows the caller to ignore all
non-negative results if the intent is to match the typical behavior of a
C-style downcast.
Values are always copied as two's-complement, and sufficient size will be
requested for a sign bit. For example, this may cause a value that fits into
8 bytes when treated as unsigned to request 9 bytes, even though all eight
bytes were copied into the buffer. What has been omitted is the zero sign
bit, which is redundant when the intention is to treat the value as unsigned.
Passing *n_bytes* of zero will always return the requested buffer size.
.. note::
When the value does not fit in the provided buffer, the requested size
returned from the function may be larger than necessary. Passing 0 to this
function is not an accurate way to determine the bit length of a value.
.. versionadded:: 3.13
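
The truncation and sign-bit rules described above can be illustrated with a small pure-Python model. This is a sketch under stated assumptions rather than the C implementation: the helper name `as_native_bytes` is hypothetical, it returns the buffer contents instead of writing through a pointer, and it reports the smallest size that holds the value plus a sign bit, whereas the real function may report a larger size, as the note above says.

```python
import sys


def as_native_bytes(value, n_bytes, endianness=-1):
    """Hypothetical model: return (minimum_size, buffer) for a Python int."""
    byteorder = {-1: sys.byteorder, 0: "big", 1: "little"}[endianness]

    # Keep only the low n_bytes * 8 bits of the two's-complement form.
    # Masking a negative Python int yields exactly those bits.
    mask = (1 << (8 * n_bytes)) - 1
    buffer = (value & mask).to_bytes(n_bytes, byteorder)

    # Smallest signed size: the magnitude bits plus one sign bit,
    # rounded up to whole bytes.
    if value >= 0:
        magnitude_bits = value.bit_length()
    else:
        magnitude_bits = (~value).bit_length()
    minimum_size = (magnitude_bits + 1 + 7) // 8
    return minimum_size, buffer
```

In this model `2**63` fills an 8-byte buffer completely yet reports a size of 9, because the zero sign bit does not fit. The HRESULT-style value `-2147467259` fits a 4-byte buffer exactly, as `80 00 40 05` in big-endian order.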
.. c:function:: int PyUnstable_Long_IsCompact(const PyLongObject* op)
Return 1 if *op* is compact, 0 otherwise.
7 changes: 6 additions & 1 deletion Doc/whatsnew/3.13.rst
@@ -587,6 +587,7 @@ Tier 2 IR by Mark Shannon and Guido van Rossum.
Tier 2 optimizer by Ken Jin.)



Deprecated
==========

@@ -1526,6 +1527,11 @@ New Features

(Contributed by Victor Stinner and Petr Viktorin in :gh:`110850`.)

* Add :c:func:`PyLong_AsNativeBytes`, :c:func:`PyLong_FromNativeBytes` and
:c:func:`PyLong_FromUnsignedNativeBytes` functions to simplify converting
between native integer types and Python :class:`int` objects.
(Contributed by Steve Dower in :gh:`111140`.)


Porting to Python 3.13
----------------------
@@ -1585,7 +1591,6 @@ Porting to Python 3.13
platforms, the ``HAVE_STDDEF_H`` macro is only defined on Windows.
(Contributed by Victor Stinner in :gh:`108765`.)


Deprecated
----------

36 changes: 35 additions & 1 deletion Include/cpython/longobject.h
@@ -4,6 +4,40 @@

PyAPI_FUNC(PyObject*) PyLong_FromUnicodeObject(PyObject *u, int base);

/* PyLong_AsNativeBytes: Copy the integer value to a native variable.
buffer points to the first byte of the variable.
n_bytes is the number of bytes available in the buffer. Pass 0 to request
the required size for the value.
endianness is -1 for native endian, 0 for big endian or 1 for little.
Big endian mode will write the most significant byte into the address
directly referenced by buffer; little endian will write the least significant
byte into that address.
If an exception is raised, returns a negative value.
Otherwise, returns the number of bytes that are required to store the value.
To check that the full value is represented, ensure that the return value is
less than or equal to n_bytes.
All n_bytes are guaranteed to be written (unless an exception occurs), and
so ignoring a positive return value is the equivalent of a downcast in C.
In cases where the full value could not be represented, the returned value
may be larger than necessary - this function is not an accurate way to
calculate the bit length of an integer object.
*/
PyAPI_FUNC(Py_ssize_t) PyLong_AsNativeBytes(PyObject* v, void* buffer,
Py_ssize_t n_bytes, int endianness);

/* PyLong_FromNativeBytes: Create an int value from a native integer
n_bytes is the number of bytes to read from the buffer. Passing 0 will
always produce the zero int.
PyLong_FromUnsignedNativeBytes always produces a non-negative int.
endianness is -1 for native endian, 0 for big endian or 1 for little.
Returns the int object, or NULL with an exception set. */
PyAPI_FUNC(PyObject*) PyLong_FromNativeBytes(const void* buffer, size_t n_bytes,
int endianness);
PyAPI_FUNC(PyObject*) PyLong_FromUnsignedNativeBytes(const void* buffer,
size_t n_bytes, int endianness);

PyAPI_FUNC(int) PyUnstable_Long_IsCompact(const PyLongObject* op);
PyAPI_FUNC(Py_ssize_t) PyUnstable_Long_CompactValue(const PyLongObject* op);

@@ -50,7 +84,7 @@ PyAPI_FUNC(PyObject *) _PyLong_FromByteArray(
*/
PyAPI_FUNC(int) _PyLong_AsByteArray(PyLongObject* v,
unsigned char* bytes, size_t n,
int little_endian, int is_signed);
int little_endian, int is_signed, int with_exceptions);

/* For use by the gcd function in mathmodule.c */
PyAPI_FUNC(PyObject *) _PyLong_GCD(PyObject *, PyObject *);
145 changes: 145 additions & 0 deletions Lib/test/test_capi/test_long.py
@@ -1,5 +1,6 @@
import unittest
import sys
import test.support as support

from test.support import import_helper

@@ -423,6 +424,150 @@ def test_long_asvoidptr(self):
self.assertRaises(OverflowError, asvoidptr, -2**1000)
# CRASHES asvoidptr(NULL)

def test_long_asnativebytes(self):
import math
from _testcapi import (
pylong_asnativebytes as asnativebytes,
SIZE_MAX,
)

# Abbreviate sizeof(Py_ssize_t) to SZ because we use it a lot
SZ = int(math.ceil(math.log(SIZE_MAX + 1) / math.log(2)) / 8)
MAX_SSIZE = 2 ** (SZ * 8 - 1) - 1
MAX_USIZE = 2 ** (SZ * 8) - 1
if support.verbose:
print(f"SIZEOF_SIZE={SZ}\n{MAX_SSIZE=:016X}\n{MAX_USIZE=:016X}")

# These tests check that the requested buffer size is correct
for v, expect in [
(0, SZ),
(512, SZ),
(-512, SZ),
(MAX_SSIZE, SZ),
(MAX_USIZE, SZ + 1),
(-MAX_SSIZE, SZ),
(-MAX_USIZE, SZ + 1),
(2**255-1, 32),
(-(2**255-1), 32),
(2**256-1, 33),
(-(2**256-1), 33),
]:
with self.subTest(f"sizeof-{v:X}"):
buffer = bytearray(1)
self.assertEqual(expect, asnativebytes(v, buffer, 0, -1),
"PyLong_AsNativeBytes(v, NULL, 0, -1)")
# Also check via the __index__ path
self.assertEqual(expect, asnativebytes(Index(v), buffer, 0, -1),
"PyLong_AsNativeBytes(Index(v), NULL, 0, -1)")

# We request as many bytes as `expect_be` contains, and always check
# the result (both big and little endian). We check the return value
# independently, since the buffer should always be filled correctly even
# if we need more bytes
for v, expect_be, expect_n in [
(0, b'\x00', 1),
(0, b'\x00' * 2, 2),
(0, b'\x00' * 8, min(8, SZ)),
(1, b'\x01', 1),
(1, b'\x00' * 10 + b'\x01', min(11, SZ)),
(42, b'\x2a', 1),
(42, b'\x00' * 10 + b'\x2a', min(11, SZ)),
(-1, b'\xff', 1),
(-1, b'\xff' * 10, min(11, SZ)),
(-42, b'\xd6', 1),
(-42, b'\xff' * 10 + b'\xd6', min(11, SZ)),
# Extracts 255 into a single byte, but requests sizeof(Py_ssize_t)
(255, b'\xff', SZ),
(255, b'\x00\xff', 2),
(256, b'\x01\x00', 2),
# Extracts successfully (unsigned), but requests 9 bytes
(2**63, b'\x80' + b'\x00' * 7, 9),
# "Extracts", but requests 9 bytes
(-2**63, b'\x80' + b'\x00' * 7, 9),
(2**63, b'\x00\x80' + b'\x00' * 7, 9),
(-2**63, b'\xff\x80' + b'\x00' * 7, 9),

(2**255-1, b'\x7f' + b'\xff' * 31, 32),
(-(2**255-1), b'\x80' + b'\x00' * 30 + b'\x01', 32),
# Request extra bytes, but result says we only needed 32
(-(2**255-1), b'\xff\x80' + b'\x00' * 30 + b'\x01', 32),
(-(2**255-1), b'\xff\xff\x80' + b'\x00' * 30 + b'\x01', 32),

# Extracting 256 bits of integer will request 33 bytes, but still
# copy as many bits as possible into the buffer. So we *can* copy
# into a 32-byte buffer, though negative numbers may be unrecoverable
(2**256-1, b'\xff' * 32, 33),
(2**256-1, b'\x00' + b'\xff' * 32, 33),
(-(2**256-1), b'\x00' * 31 + b'\x01', 33),
(-(2**256-1), b'\xff' + b'\x00' * 31 + b'\x01', 33),
(-(2**256-1), b'\xff\xff' + b'\x00' * 31 + b'\x01', 33),

# The classic "Windows HRESULT as negative number" case
# HRESULT hr;
# PyLong_AsNativeBytes(<-2147467259>, &hr, sizeof(HRESULT), -1)
# assert(hr == E_FAIL)
(-2147467259, b'\x80\x00\x40\x05', 4),
]:
with self.subTest(f"{v:X}-{len(expect_be)}bytes"):
n = len(expect_be)
buffer = bytearray(n)
expect_le = expect_be[::-1]

self.assertEqual(expect_n, asnativebytes(v, buffer, n, 0),
f"PyLong_AsNativeBytes(v, buffer, {n}, <big>)")
self.assertEqual(expect_be, buffer[:n], "<big>")
self.assertEqual(expect_n, asnativebytes(v, buffer, n, 1),
f"PyLong_AsNativeBytes(v, buffer, {n}, <little>)")
self.assertEqual(expect_le, buffer[:n], "<little>")

# Check a few error conditions. These are validated in code, but are
# unspecified in docs, so if we make changes to the implementation, it's
# fine to just update these tests rather than preserve the behaviour.
with self.assertRaises(SystemError):
asnativebytes(1, buffer, 0, 2)
with self.assertRaises(TypeError):
asnativebytes('not a number', buffer, 0, -1)

def test_long_fromnativebytes(self):
import math
from _testcapi import (
pylong_fromnativebytes as fromnativebytes,
SIZE_MAX,
)

# Abbreviate sizeof(Py_ssize_t) to SZ because we use it a lot
SZ = int(math.ceil(math.log(SIZE_MAX + 1) / math.log(2)) / 8)
MAX_SSIZE = 2 ** (SZ * 8 - 1) - 1
MAX_USIZE = 2 ** (SZ * 8) - 1

for v_be, expect_s, expect_u in [
(b'\x00', 0, 0),
(b'\x01', 1, 1),
(b'\xff', -1, 255),
(b'\x00\xff', 255, 255),
(b'\xff\xff', -1, 65535),
]:
with self.subTest(f"{expect_s}-{expect_u:X}-{len(v_be)}bytes"):
n = len(v_be)
v_le = v_be[::-1]

self.assertEqual(expect_s, fromnativebytes(v_be, n, 0, 1),
f"PyLong_FromNativeBytes(buffer, {n}, <big>)")
self.assertEqual(expect_s, fromnativebytes(v_le, n, 1, 1),
f"PyLong_FromNativeBytes(buffer, {n}, <little>)")
self.assertEqual(expect_u, fromnativebytes(v_be, n, 0, 0),
f"PyLong_FromUnsignedNativeBytes(buffer, {n}, <big>)")
self.assertEqual(expect_u, fromnativebytes(v_le, n, 1, 0),
f"PyLong_FromUnsignedNativeBytes(buffer, {n}, <little>)")

# Check native endian when the result would be the same either
# way and we can test it.
if v_be == v_le:
self.assertEqual(expect_s, fromnativebytes(v_be, n, -1, 1),
f"PyLong_FromNativeBytes(buffer, {n}, <native>)")
self.assertEqual(expect_u, fromnativebytes(v_be, n, -1, 0),
f"PyLong_FromUnsignedNativeBytes(buffer, {n}, <native>)")


if __name__ == "__main__":
unittest.main()
@@ -0,0 +1,2 @@
Adds :c:func:`PyLong_AsNativeBytes`, :c:func:`PyLong_FromNativeBytes` and
:c:func:`PyLong_FromUnsignedNativeBytes` functions.
2 changes: 1 addition & 1 deletion Modules/_io/textio.c
@@ -2393,7 +2393,7 @@ textiowrapper_parse_cookie(cookie_type *cookie, PyObject *cookieObj)
return -1;

if (_PyLong_AsByteArray(cookieLong, buffer, sizeof(buffer),
PY_LITTLE_ENDIAN, 0) < 0) {
PY_LITTLE_ENDIAN, 0, 1) < 0) {
Py_DECREF(cookieLong);
return -1;
}
3 changes: 2 additions & 1 deletion Modules/_pickle.c
@@ -2162,7 +2162,8 @@ save_long(PicklerObject *self, PyObject *obj)
pdata = (unsigned char *)PyBytes_AS_STRING(repr);
i = _PyLong_AsByteArray((PyLongObject *)obj,
pdata, nbytes,
1 /* little endian */ , 1 /* signed */ );
1 /* little endian */ , 1 /* signed */ ,
1 /* with exceptions */);
if (i < 0)
goto error;
/* If the int is negative, this may be a byte more than
3 changes: 2 additions & 1 deletion Modules/_randommodule.c
@@ -342,7 +342,8 @@ random_seed(RandomObject *self, PyObject *arg)
res = _PyLong_AsByteArray((PyLongObject *)n,
(unsigned char *)key, keyused * 4,
PY_LITTLE_ENDIAN,
0); /* unsigned */
0, /* unsigned */
1); /* with exceptions */
if (res == -1) {
goto Done;
}
2 changes: 1 addition & 1 deletion Modules/_sqlite/util.c
@@ -162,7 +162,7 @@ _pysqlite_long_as_int64(PyObject * py_val)
sqlite_int64 int64val;
if (_PyLong_AsByteArray((PyLongObject *)py_val,
(unsigned char *)&int64val, sizeof(int64val),
IS_LITTLE_ENDIAN, 1 /* signed */) >= 0) {
IS_LITTLE_ENDIAN, 1 /* signed */, 0) >= 0) {
return int64val;
}
}