gh-140557: Force alignment of empty `bytearray` and `array.array` buffers #140559

jakelishman · 2025-10-24T16:37:44Z

This ensures the buffers used by the empty bytearray and array.array are aligned the same as a pointer returned by the allocator. This is a more convenient default for interop with other languages that have stricter requirements of type-safe buffers (e.g. Rust's &[T] type) even when empty.

I tried to do the same for bytes, but I think its default buffer is only forcibly aligned on an 8 because of the uint64_t member in PyBytesObject, and it ends up dependent on where bytes_empty gets laid out. If that's desirable too, I might need some help figuring out a strategy for it.

I'm not sure where's appropriate to put a test for this, or if it can/should be documented as reliable.

Issue: Align default empty buffers of bytearray and array.array #140557

python-cla-bot · 2025-10-24T16:37:48Z

All commit authors signed the Contributor License Agreement.

serhiy-storchaka

LGTM. 👍

cmaloney · 2025-10-26T01:03:08Z

I'm 👎 on this for bytearray; I've been looking at making the empty bytearray point to an empty bytes object where this wouldn't hold true.

See gh-139871 for ways that having a bytes inside can make things faster (often fewer copies of data).

cmaloney · 2025-10-26T02:44:47Z

re: bytearray, it also supports a "fast" start-delete which moves the start offset (ob_start) inside the allocated space by an arbitrary count of bytes which, to me, implies that unaligned data access from other languages needs to be supported in accessing/manipulating its internal buffers/data.
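A minimal Python sketch of the start-delete behaviour described above, assuming CPython (it reaches the C API through `ctypes.pythonapi`). `PyBufferStruct` and `buffer_address` are hypothetical helpers written for this illustration, not part of CPython or of this PR, and the size of the shift is an implementation detail rather than documented behaviour.

```python
import ctypes


class PyBufferStruct(ctypes.Structure):
    """ctypes mirror of CPython's Py_buffer struct (illustration only)."""
    _fields_ = [
        ("buf", ctypes.c_void_p),
        ("obj", ctypes.c_void_p),
        ("len", ctypes.c_ssize_t),
        ("itemsize", ctypes.c_ssize_t),
        ("readonly", ctypes.c_int),
        ("ndim", ctypes.c_int),
        ("format", ctypes.c_char_p),
        ("shape", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
        ("suboffsets", ctypes.POINTER(ctypes.c_ssize_t)),
        ("internal", ctypes.c_void_p),
    ]


_get_buffer = ctypes.pythonapi.PyObject_GetBuffer
_get_buffer.argtypes = [ctypes.py_object, ctypes.POINTER(PyBufferStruct), ctypes.c_int]
_get_buffer.restype = ctypes.c_int
_release_buffer = ctypes.pythonapi.PyBuffer_Release
_release_buffer.argtypes = [ctypes.POINTER(PyBufferStruct)]
_release_buffer.restype = None


def buffer_address(obj):
    """Return the pointer that obj exports through the buffer protocol, as an int."""
    view = PyBufferStruct()
    _get_buffer(obj, ctypes.byref(view), 0)  # 0 == PyBUF_SIMPLE
    try:
        return view.buf or 0  # c_void_p fields read back as None when NULL
    finally:
        # Release straight away so the bytearray can be resized afterwards.
        _release_buffer(ctypes.byref(view))


data = bytearray(b"0123456789abcdef")
before = buffer_address(data)
del data[:1]  # delete from the front of the bytearray
after = buffer_address(data)
print(f"pointer moved by {after - before} byte(s); contents: {bytes(data)!r}")
```

On the CPython versions I am aware of, the pointer moves forward inside the existing allocation instead of the data being copied back to the start, so the exported pointer can end up at any byte offset.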

jakelishman · 2025-10-26T06:58:31Z

This doesn't enforce that every buffer always has aligned access - you can easily take out a view of a bytes or bytearray that you offset by a byte - so other languages have to still handle the case of an unaligned pointer. It just makes the default empty object aligned at zero cost, and that object is common.

The empty bytes internal pointer still ends up aligned on an 8, which would in practice still make the pointer aligned for most data types you might be casting to, so swapping to that would still be an improvement over the status quo.

jakelishman · 2025-10-26T07:03:25Z

We do still have to handle unaligned access everywhere, I agree there's no escaping it. The goal here is only around making unaligned objects less common - bytearray being aligned on a 1 (and actually turning up on a 1) is a lot more commonly visible in Python 3.14, now that pickle protocol 5 uses it in the data stream to represent PickleBuffer in band, so more libraries (e.g. Numpy) want a zero-copy view onto the recreated buffer.
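A small sketch of the in-band PickleBuffer round trip mentioned above, using only the standard `pickle` module. With no `buffer_callback`, a writable buffer comes back as a `bytearray` and a read-only one as `bytes`; the variable names are just for illustration.

```python
import pickle

# A writable, empty buffer wrapped for protocol 5 pickling.
original = bytearray()
payload = pickle.dumps(pickle.PickleBuffer(original), protocol=5)

# No buffer_callback was given, so the data travelled in band and is
# rebuilt as a new object on load.
restored = pickle.loads(payload)
print(type(restored).__name__, len(restored))

# For comparison, a read-only source is rebuilt as bytes.
readonly_restored = pickle.loads(
    pickle.dumps(pickle.PickleBuffer(b"abc"), protocol=5)
)
print(type(readonly_restored).__name__)
```

The restored empty `bytearray` is the object whose default buffer pointer this PR is about.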

cmaloney · 2025-10-26T07:38:00Z

I'm still concerned here:

Both array and bytearray externalize their item storage. The PyObject* that is the array object (or bytearray) is a separate allocation from the storage buffer for the elements. As an optimization, CPython doesn't allocate space if the length is 0, and these two values are used to handle a couple otherwise hard to handle return cases. The array one is only used in array_buffer_getbuf so that memoryview() / buffer protocol works on an array with no storage allocated itself. The bytearray one shows up in the C API PyByteArray_AS_STRING only when bytearray has no internal storage.

These are just placeholders to fill corner cases which seem like they shouldn't be common in code (to use the returned pointer without doing out of bounds access you'd need to look at the len() / size).
This adds a new CPython "guarantee" that it sounds like you want to depend on in code, but no test that would break if changes affect it. It should be possible to write one for alignment with https://docs.python.org/3/library/stdtypes.html#memoryview
The default "buffer" for bytearray (and array.array) are both immutable and shouldn't really be indexed into, read from, or written to; they're 0 bytes long, code can't set / modify data in them, and there is no data to read from them.

jakelishman · 2025-10-26T08:36:31Z

I'm not aiming to make this a guarantee at all, just that by default the zero allocation case is already aligned. In Rust and other languages we do still have to handle unaligned buffers, just like Numpy already has to in pure C.

> The default "buffer" for bytearray (and array.array) are both immutable and shouldn't really be indexed into, read from, or written to; they're 0 bytes long, code can't set / modify data in them, and there is no data to read from them.

Right, this is where other languages have stronger guarantees - I mentioned it in gh-140557 as the motivation that in Rust, creating a &[T] primitive slice requires an aligned non-null pointer even if it's invalid for any reads. We can't convert every buffer to a slice with zero copies, so we already have alternative handling, but this patch makes the empty bytearray object go down a common path rather than a colder path by default. For similar fast-path alignment reasons, Numpy internally sets its "aligned" flag on empty buffers even if the pointer isn't actually aligned for the data type it purportedly points to, to avoid triggering copying code / extra handling in ufuncs that require alignment.

jakelishman · 2025-10-26T08:49:39Z

About rarity of appearance: both of these pointers show through the buffer protocol like you mentioned, and that's where I come across them in language interop - that's the defined way to get zero-copy access to a data buffer owned by Python space.

We can't have zero-copy access to misaligned buffers, but in practice, the vast majority of buffers that are backed by an allocation end up aligned anyway, so requiring copies is rare (since you have to deliberately offset an allocated pointer by a sub-unit amount). As an extra optimisation to avoid a copy, we can add special handling in Rust that, if the pointer we get from the Python buffer protocol is misaligned, produces an empty slice that doesn't refer to that pointer, so we don't require CPython support. But if the default is actually aligned, then there's less point propagating this hypothetical optimising special-handling code through a lot of downstream packages, rather than just using the slower "copy to force alignment" paths that (should) already exist for it.

When I wrote this patch, it was zero cost to CPython to achieve that for the defaults. If you have additional work that would seriously raise the cost to CPython then the calculus is different, though even reliably having the empty buffer aligned on an 8, like the empty bytes buffer is in practice, would be enough for the majority of cases I care about.

cmaloney · 2025-10-26T08:58:27Z

Can you link to the code or provide a rust sample which needs to special-case handling a zero-length bytearray? I think that would help me understand here.

serhiy-storchaka · 2025-10-27T10:41:47Z

It is not guaranteed that the start of the bytearray buffer has some alignment. This is a CPython implementation detail. But some code depends on this, and it may not work on non-x86 platforms if this is not aligned. There are exceptions: if there was a deletion from the beginning of the bytearray (we cannot do anything with this, but the user can guarantee that this did not happen), and when the bytearray is empty. The latter case is easy to fix for us, it costs nothing.

jakelishman · 2025-10-27T14:01:30Z

cmaloney: Let's say I've got FFI code that wraps the Py_buffer interface, first by making (not precise - I'm just including the illustrative stuff):

struct PyBuffer {
  buf: *mut (),
  itemsize: isize,
  ndim: ::std::ffi::c_int,
  shape: *const isize,
  strides: *const isize,
}
impl PyBuffer {
  /// Is the slice contiguous in memory?
  fn is_contiguous(&self) -> bool { /* ... */ }
  /// How many bytes can be read from it?
  fn len_bytes(&self) -> usize { /* ... */ }
}

Let's say I've got one of these structs that I then initialised with PyObject_GetBuffer, and now I want to expose a Rust-native slice view for a specific Rust type, which might not match the "native" type of the buffer, since I might have been given an array created from a bytearray object whose itemsize is 1 but actually represents storage of uint64_t (u64 in Rust). I return a Result here to signify to the caller that the function is fallible¹:

enum SliceError {
  Unaligned,
  Noncontiguous,
  Nullptr,
}

fn slice_from_buffer<T>(buf: &PyBuffer) -> Result<&[T], SliceError> {
  if buf.buf.is_null() {
    return Err(SliceError::Nullptr);
  }
  if !buf.is_contiguous() {
    return Err(SliceError::Noncontiguous);
  }
  // `buf.buf` is an untyped `*mut ()`, so cast to `*mut T` before checking alignment for T.
  if !buf.buf.cast::<T>().is_aligned() {
    return Err(SliceError::Unaligned);
  }
  // SAFETY: pointer is non-null, aligned, and valid for this many contiguous reads:
  Ok(unsafe { std::slice::from_raw_parts(
    buf.buf.cast::<T>(),
    buf.len_bytes() / std::mem::size_of::<T>()
  ) })
}

I have to do the alignment and contiguous checks before I call std::slice::from_raw_parts, because it's undefined behaviour in Rust to create a slice backed by an unaligned or null pointer, even if the length is zero. (The reason is to enable specific type-niche optimisations in the compiler to save space in compound types that contain &[T].)

A Rust caller of this function is still responsible for handling the case of an unaligned pointer, which might cause them to do something like

match slice_from_buffer::<u64>(&buf) {
  Ok(slice) => use_slice(slice),
  Err(SliceError::Noncontiguous | SliceError::Unaligned) => {
    let aligned_contiguous = /* copy buffer to somewhere aligned */;
    use_slice(aligned_contiguous.as_slice())
  }
  Err(SliceError::Nullptr) => panic!(),
}

The idea of this PR is just to make it so that slightly more stuff by default gets to go down the happy path, in a way that doesn't cost CPython anything. It's still the Rust user's responsibility to handle the unhappy path since that's totally valid Python code still, and the Rust library's responsibility to make sure everything is safe for FFI use in Rust². This PR isn't intending to add any restrictions on what CPython is allowed to do or what other Python implementations may do.
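As a rough Python analogue of the happy path / copy fallback above (a sketch only, assuming CPython, and not PyO3's or rust-numpy's implementation): `as_u64_view` returns a zero-copy view when the exported pointer is aligned for `uint64_t` and copies otherwise. `PyBufferStruct`, `buffer_address` and `as_u64_view` are hypothetical names introduced for this illustration.

```python
import ctypes


class PyBufferStruct(ctypes.Structure):
    """ctypes mirror of CPython's Py_buffer struct (illustration only)."""
    _fields_ = [
        ("buf", ctypes.c_void_p),
        ("obj", ctypes.c_void_p),
        ("len", ctypes.c_ssize_t),
        ("itemsize", ctypes.c_ssize_t),
        ("readonly", ctypes.c_int),
        ("ndim", ctypes.c_int),
        ("format", ctypes.c_char_p),
        ("shape", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
        ("suboffsets", ctypes.POINTER(ctypes.c_ssize_t)),
        ("internal", ctypes.c_void_p),
    ]


_get_buffer = ctypes.pythonapi.PyObject_GetBuffer
_get_buffer.argtypes = [ctypes.py_object, ctypes.POINTER(PyBufferStruct), ctypes.c_int]
_get_buffer.restype = ctypes.c_int
_release_buffer = ctypes.pythonapi.PyBuffer_Release
_release_buffer.argtypes = [ctypes.POINTER(PyBufferStruct)]
_release_buffer.restype = None


def buffer_address(obj):
    """Return the pointer that obj exports through the buffer protocol, as an int."""
    view = PyBufferStruct()
    _get_buffer(obj, ctypes.byref(view), 0)  # 0 == PyBUF_SIMPLE
    try:
        return view.buf or 0  # c_void_p fields read back as None when NULL
    finally:
        _release_buffer(ctypes.byref(view))


def as_u64_view(obj):
    """Return (view, copied), where view is a 'Q'-typed memoryview of obj's bytes.

    The byte length must be a multiple of 8, otherwise the final cast raises.
    """
    raw = memoryview(obj).cast("B")  # raises TypeError for non-contiguous buffers
    if buffer_address(raw) % ctypes.alignment(ctypes.c_uint64) == 0:
        return raw.cast("Q"), False  # happy path: no copy
    # Unhappy path: copy into a fresh allocation, assumed to be allocator-aligned.
    copy = bytearray(raw)
    return memoryview(copy).cast("Q"), True


storage = bytearray(17)
_, copied_aligned = as_u64_view(memoryview(storage)[:16])
_, copied_offset = as_u64_view(memoryview(storage)[1:])
print("copied when aligned:", copied_aligned)
print("copied when offset by one byte:", copied_offset)
```

The point of the PR, in these terms, is that an empty `bytearray` or `array.array` would take the first branch by default.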

Your #139871 looks to me to also achieve the same goal I was going for here, just as a side effect (since the empty bytes buffer happens to be 8-byte aligned in CPython), so if that merged, it'd improve the bytearray situation as well. This PR is slightly stronger for bytearray, but that's largely immaterial (it most likely only affects SIMD types).

¹ This code is usually in Python interface libraries like PyO3, whose implementation is here, but I wrote the example manually because PyO3's has additional complications and spreads the required checks into a few separate places. Also, the same sort of code comes up in lots of specialised places too, like rust-numpy, which provides FFI access to Numpy arrays, etc.
² There are safe Rust-side optimisations that can be done around empty slices (which I'm planning to contribute in several Rust projects), such as creating a slice out of a dangling pointer magicked from thin air, which help these cases as well.

cmaloney · 2025-10-27T18:59:31Z

I'm 👍 for doing array here as it supports larger than 1-byte element objects. Does still need a test so if the code is broken/removed accidentally CPython devs will notice. That should be implementable with https://docs.python.org/3/library/ctypes.html#ctypes.alignment.
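One possible shape for such a check, as a hedged sketch rather than the test that was added to the PR: it assumes CPython, reads the exported pointer through `ctypes.pythonapi`, and compares it against `ctypes.alignment` for each typecode. `PyBufferStruct`, `buffer_address`, `CTYPES_BY_TYPECODE` and `misaligned_typecodes` are hypothetical names, and the typecode mapping is my assumption ("u" and "w" are left out for simplicity).

```python
import array
import ctypes


class PyBufferStruct(ctypes.Structure):
    """ctypes mirror of CPython's Py_buffer struct (illustration only)."""
    _fields_ = [
        ("buf", ctypes.c_void_p),
        ("obj", ctypes.c_void_p),
        ("len", ctypes.c_ssize_t),
        ("itemsize", ctypes.c_ssize_t),
        ("readonly", ctypes.c_int),
        ("ndim", ctypes.c_int),
        ("format", ctypes.c_char_p),
        ("shape", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
        ("suboffsets", ctypes.POINTER(ctypes.c_ssize_t)),
        ("internal", ctypes.c_void_p),
    ]


_get_buffer = ctypes.pythonapi.PyObject_GetBuffer
_get_buffer.argtypes = [ctypes.py_object, ctypes.POINTER(PyBufferStruct), ctypes.c_int]
_get_buffer.restype = ctypes.c_int
_release_buffer = ctypes.pythonapi.PyBuffer_Release
_release_buffer.argtypes = [ctypes.POINTER(PyBufferStruct)]
_release_buffer.restype = None


def buffer_address(obj):
    """Return the pointer that obj exports through the buffer protocol, as an int."""
    view = PyBufferStruct()
    _get_buffer(obj, ctypes.byref(view), 0)  # 0 == PyBUF_SIMPLE
    try:
        return view.buf or 0  # c_void_p fields read back as None when NULL
    finally:
        _release_buffer(ctypes.byref(view))


# Assumed mapping from array typecodes to the ctypes type with the same C type.
CTYPES_BY_TYPECODE = {
    "b": ctypes.c_byte, "B": ctypes.c_ubyte,
    "h": ctypes.c_short, "H": ctypes.c_ushort,
    "i": ctypes.c_int, "I": ctypes.c_uint,
    "l": ctypes.c_long, "L": ctypes.c_ulong,
    "q": ctypes.c_longlong, "Q": ctypes.c_ulonglong,
    "f": ctypes.c_float, "d": ctypes.c_double,
}


def misaligned_typecodes(make_array):
    """Return the typecodes whose exported pointer is not aligned for the item type."""
    bad = []
    for typecode, ctype in CTYPES_BY_TYPECODE.items():
        if buffer_address(make_array(typecode)) % ctypes.alignment(ctype) != 0:
            bad.append(typecode)
    return bad


empty_bad = misaligned_typecodes(lambda tc: array.array(tc))
nonempty_bad = misaligned_typecodes(lambda tc: array.array(tc, [0]))
print("misaligned when empty:", empty_bad)
print("misaligned when non-empty:", nonempty_bad)
```

A real test would assert that both lists are empty; the sketch only reports the empty case because its result depends on whether the interpreter includes this change.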

bytearray should be covered by my CPython internals change, which will help your case. For bytes-like objects and slicing in general across Rust and other languages, I found https://davidben.net/2024/01/15/empty-slices.html really helpful to improve my knowledge. I don't think it makes sense for CPython to adopt the Rust semantics for its "bytes-like" objects generally; the "buffer protocol" / memoryview across the ecosystem is based on C "bytes", for better and worse.

cmaloney

Reduce scope to just array.array + add a test; empty buffer + non-empty array.array storage being aligned I think is a nice improvement

serhiy-storchaka · 2025-10-29T22:15:07Z

I do not think that bytearray should be excluded. If not for the special optimization of using NULL for an empty array, it would have standard alignment (as returned by malloc()). The fact that it can currently have worse alignment is the result of an optimization. An optimization should not cause a regression.

cmaloney · 2025-10-29T22:20:26Z

re: bytearray gh-140557 / GH-140128 makes it aligned as a side effect because bytes is aligned (and removes usage of _PyByteArray_empty_string; ob_start is always set)

serhiy-storchaka · 2025-10-29T22:24:25Z

Then I do not see reasons to object to this change.

cmaloney · 2025-10-29T22:51:55Z

I don't like having implementation details relied on by external code/systems without a test / validation. To me that creates a potential future release blocker if the assertion gets broken. I don't see adding a test here as a really onerous requirement.
Aligning _PyByteArray_empty_string is going to be a no-op / just waste space shortly, so there isn't a lot of value in doing it in this PR, as this will just show up in 3.15.

I have a slight nit on the wording of the NEWS entry; I think it would be good to mention Rust and to make it more concise.

serhiy-storchaka · 2025-10-29T23:01:47Z

I think there is a good chance that code that relies on this already exists. It works only because the Intel platform is tolerant of unaligned access (and there is no actual access to memory here, because it is an empty buffer). But on other platforms pointers of different types can use different registers.

This is not limited to Rust. In C, casting a pointer with the wrong alignment can be undefined behavior. If bytearray is used instead of array as a collection of 16-, 32- or 64-bit integers, there may be problems.

jakelishman · 2025-10-30T00:46:23Z

Hi both - I'd been busy at work the last few days and not had a chance to check in, sorry.

Given the above comments: I've added some tests of the new behaviour (open to suggestions if I've missed some API that'd make them simpler), and I had a go at shortening the NEWS entry and stressing that it's just about the empty default, not a guarantee for all buffers. Happy to change anything more if there's consensus, or to pause this and take it to Discourse if need be?

I'm definitely not trying to make all buffers follow Rust semantics - that's never possible anyway, given that we can always do memoryview(b"01234567")[1:] to get a buffer backed by a pointer that'd be unaligned for anything other than a byte, let alone whatever downstream implementers of the buffer protocol do. I care about bytearray because it appears more now because of the default handling of PickleBuffer.
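The `memoryview(b"01234567")[1:]` case can be checked directly. This is a small sketch assuming CPython; `PyBufferStruct` and `buffer_address` are hypothetical helpers that read the exported pointer through `ctypes.pythonapi`, and they are not part of this PR.

```python
import ctypes


class PyBufferStruct(ctypes.Structure):
    """ctypes mirror of CPython's Py_buffer struct (illustration only)."""
    _fields_ = [
        ("buf", ctypes.c_void_p),
        ("obj", ctypes.c_void_p),
        ("len", ctypes.c_ssize_t),
        ("itemsize", ctypes.c_ssize_t),
        ("readonly", ctypes.c_int),
        ("ndim", ctypes.c_int),
        ("format", ctypes.c_char_p),
        ("shape", ctypes.POINTER(ctypes.c_ssize_t)),
        ("strides", ctypes.POINTER(ctypes.c_ssize_t)),
        ("suboffsets", ctypes.POINTER(ctypes.c_ssize_t)),
        ("internal", ctypes.c_void_p),
    ]


_get_buffer = ctypes.pythonapi.PyObject_GetBuffer
_get_buffer.argtypes = [ctypes.py_object, ctypes.POINTER(PyBufferStruct), ctypes.c_int]
_get_buffer.restype = ctypes.c_int
_release_buffer = ctypes.pythonapi.PyBuffer_Release
_release_buffer.argtypes = [ctypes.POINTER(PyBufferStruct)]
_release_buffer.restype = None


def buffer_address(obj):
    """Return the pointer that obj exports through the buffer protocol, as an int."""
    view = PyBufferStruct()
    _get_buffer(obj, ctypes.byref(view), 0)  # 0 == PyBUF_SIMPLE
    try:
        return view.buf or 0  # c_void_p fields read back as None when NULL
    finally:
        _release_buffer(ctypes.byref(view))


base = memoryview(b"01234567")
offset = base[1:]  # same storage, starting one byte further in

base_address = buffer_address(base)
offset_address = buffer_address(offset)
print("slice starts", offset_address - base_address, "byte(s) after the base pointer")
```

Whatever the alignment of the base pointer, the sliced view's pointer cannot share it for any type wider than one byte, which is why consumers need the unaligned path regardless of this change.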

cmaloney · 2025-10-30T20:25:52Z

I think we're all moving towards consensus and actually really near it :). Will do a more thorough review later today

cmaloney · 2025-10-31T01:22:59Z

Lib/test/support/__init__.py

    return _linked_to_musl
+
+
+try:


I think this will be easier to do in Lib/test/test_capi/test_misc.py. In particular, you only need to:

Add a new _testcapimodule.c entry point that fills a Py_buffer (C API) from the PyObject passed to it, gets the buffer pointer, and turns that pointer into a PyObject * (https://docs.python.org/3/c-api/long.html#c.PyLong_FromVoidPtr), which it returns.

Write one alignment test across the range of types and constructions you care about.

I like how your existing test iterates through / tests all the different array typecodes.

I think it would be good to extend the test to both test empty (the size that caused this bug) + non-empty arrays (they should also be aligned)

Actually found a better test file for the alignment pieces to live in: Lib/test/test_buffer.py; still should implement "get the pointer" in _testcapimodule.c

bedevere-app bot mentioned this pull request Oct 24, 2025

Align default empty buffers of bytearray and array.array #140557

Open

bedevere-app bot added the awaiting review label Oct 24, 2025

jakelishman mentioned this pull request Oct 24, 2025

BUG: Data pointer is unaligned for empty arrays after pickle roundtrip numpy/numpy#30062

Closed

serhiy-storchaka approved these changes Oct 24, 2025

bedevere-app bot added awaiting merge and removed awaiting review labels Oct 24, 2025

serhiy-storchaka requested a review from encukou October 24, 2025 16:58

cmaloney requested changes Oct 29, 2025

jakelishman and others added 3 commits October 30, 2025 00:36

Add tests of buffer pointer alignment

1fea8e5

Make NEWS more concise

b7eed2b

Merge remote-tracking branch 'python/main' into max-align-buffers

dd9e50b

jakelishman force-pushed the max-align-buffers branch from 719e0e2 to dd9e50b Compare October 30, 2025 00:38

Avoid ctypes import on unsupported platforms

9d454de

cmaloney reviewed Oct 31, 2025
