gh-139871: Optimize bytearray unique bytes iconcat #141862
Conversation
If the bytearray is empty and a uniquely referenced bytes object is
being concatenated (e.g. one just received from `read`), just use its
storage as the backing for the bytearray rather than copying it.
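A quick sketch (my illustration, not code from the PR) of the observable invariant the fast path must preserve: after `ba += data`, later mutation of the bytearray must never be visible through any surviving reference to the original bytes object.

```python
# Invariant the optimization must preserve: adopting the bytes storage
# is only safe when no other reference to `data` can observe mutation.
ba = bytearray()            # empty: eligible for the proposed fast path
data = b'x' * 16
ba += data                  # `data` is still referenced here, so NOT unique
assert ba == data
ba[0] = ord('y')            # mutate the bytearray...
assert data[0] == ord('x')  # ...the original bytes must be untouched
```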
build_bytes_unique: Mean +- std dev: [base] 383 ns +- 11 ns -> [iconcat_opt] 342 ns +- 5 ns: 1.12x faster
build_bytearray: Mean +- std dev: [base] 496 ns +- 8 ns -> [iconcat_opt] 471 ns +- 13 ns: 1.05x faster
encode: Mean +- std dev: [base] 482 us +- 2 us -> [iconcat_opt] 13.8 us +- 0.1 us: 34.78x faster
Benchmark hidden because not significant (1): build_bytes
Geometric mean: 2.53x faster
note: Performance of build_bytes is expected to stay constant.
```python
import pyperf

runner = pyperf.Runner()

count1 = 1_000
count2 = 100
count3 = 10_000

CHUNK_A = b'a' * count1
CHUNK_B = b'b' * count2
CHUNK_C = b'c' * count3

def build_bytes():
    # Bytes not uniquely referenced.
    ba = bytearray()
    ba += CHUNK_A
    ba += CHUNK_B
    ba += CHUNK_C

def build_bytes_unique():
    ba = bytearray()
    # Repeating inline results in uniquely referenced bytes.
    ba += b'a' * count1
    ba += b'b' * count2
    ba += b'c' * count3

def build_bytearray():
    # Each bytearray appended is uniquely referenced.
    ba = bytearray()
    ba += bytearray(CHUNK_A)
    ba += bytearray(CHUNK_B)
    ba += bytearray(CHUNK_C)

runner.bench_func('build_bytes', build_bytes)
runner.bench_func('build_bytes_unique', build_bytes_unique)
runner.bench_func('build_bytearray', build_bytearray)

runner.timeit(
    name="encode",
    setup="a = 'a' * 1_000_000",
    stmt="bytearray(a, encoding='utf8')")
```
Objects/bytearrayobject.c (Outdated)
```c
// optimization: Avoid copying the bytes coming in when possible.
if (self->ob_alloc == 0 && _PyObject_IsUniquelyReferenced(other)) {
    // note: ob_bytes_object is always the immortal empty bytes here.
```
Can you replace the comment with an assertion?
```c
// optimization: Avoid copying the bytes coming in when possible.
if (self->ob_alloc == 0 && _PyObject_IsUniquelyReferenced(other)) {
    // note: ob_bytes_object is always the immortal empty bytes here.
    if (!_canresize(self)) {
```
Why not also run this check when `self->ob_alloc == 0 && _PyObject_IsUniquelyReferenced(other)` is false? That is, move it before the outer `if`?
For the general case it's currently checked by `bytearray_resize_lock_held`.
Checking here would technically change the order of operations in which an error can happen: in the main codepath the buffer protocol is always called first, and it can error before `_canresize` does. I'm not certain how important that ordering is to keep in this case. Preserving it would mean moving the `_canresize` check inside the type checks here.
Co-authored-by: Victor Stinner <vstinner@python.org>
```c
// optimization: Avoid copying the bytes coming in when possible.
if (self->ob_alloc == 0 && _PyObject_IsUniquelyReferenced(other)) {
    assert(_Py_IsImmortal(self->ob_bytes_object));
```
I was thinking of something like:

```diff
-assert(_Py_IsImmortal(self->ob_bytes_object));
+assert(self->ob_bytes_object == Py_GetConstantBorrow(Py_CONSTANT_EMPTY_BYTES));
```
Here's a test that should pass, but doesn't:

```c
// make some bytes
PyObject *bytes = PyBytes_FromString("aaB");
assert(bytes);
// make an empty bytearray
PyObject *ba = PyByteArray_FromStringAndSize("", 0);
assert(ba);
// append bytes to bytearray (in place, getting a new reference)
PyObject *new_ba = PySequence_InPlaceConcat(ba, bytes);
assert(new_ba == ba);
Py_DECREF(new_ba);
// pop from bytearray
Py_DECREF(PyObject_CallMethod(ba, "pop", ""));
// check that our bytes was not modified
assert(memcmp(PyBytes_AsString(bytes), "aaB", 3) == 0);
Py_DECREF(bytes);
Py_DECREF(ba);
```

AFAIK, you need to use …
```c
PyObject *taken = PyObject_CallMethodNoArgs(other,
                                            &_Py_ID(take_bytes));
```
This looks unsafe to me. If you call a method, you may invalidate the assumptions you verified earlier
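A hypothetical illustration of the hazard (not code from the PR; `take_bytes` here is a stand-in for any method dispatched on `other`): if `other` is a `bytes` subclass, the method call can run arbitrary Python code that stashes a new reference, so a uniqueness check performed before the call no longer holds afterwards.

```python
# Hypothetical sketch: a subclass override runs arbitrary code during
# the method call and creates a fresh reference to the object.
leaked = []

class SneakyBytes(bytes):
    def take_bytes(self):        # assumed method name, for illustration only
        leaked.append(self)      # new reference appears mid-call
        return bytes(self)

obj = SneakyBytes(b'abc')
# Suppose C code verified "uniquely referenced" just before this call...
obj.take_bytes()
# ...after the call, that assumption is stale: another reference exists.
assert leaked[0] is obj
```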
If the bytearray is empty and a uniquely referenced bytes object is being concatenated (e.g. one just received from `read`), just use its storage as the backing for the bytearray rather than copying it. The bigger the bytes object, the bigger the saving.
From my understanding of reference counting I think this is safe to do for
`iconcat` (and would be safe to do for `ba[:] = b'\0' * 1000`; see the discuss topic). The refcount briefly being 2 isn't ideal, but I think it's good enough for the performance delta. I'm hoping that if I can ship an implementation of gh-87613, I can do the same optimization for `bytearray(b'\0' * 4096)`. If the iconcat refcount-2 part isn't good, I can tweak it to keep the
`encode` + `bytearray` performance improvement without changing `iconcat` generally.

cc: @vstinner, @encukou
`.take_bytes([n])`: a zero-copy path from `bytearray` to `bytes` #139871