Skip to content

Fix Android memfd buffer registration rollback#82

Merged
dallison merged 2 commits into
mainfrom
fix-android-memfd-buffer-registration-rollback
Jun 4, 2026
Merged

Fix Android memfd buffer registration rollback#82
dallison merged 2 commits into
mainfrom
fix-android-memfd-buffer-registration-rollback

Conversation

@dallison

@dallison dallison commented Jun 4, 2026

Copy link
Copy Markdown
Owner

Bug and impact

On Android non-split memfd channels, PublisherImpl::CreateOrAttachBuffers advanced ccb_->num_buffers before registering the new buffer FD with the server. If RegisterClientBuffer failed after that CAS, subscribers could observe a new buffer generation that the server could not provide, causing reload failures and eventual abort when a slot referenced the missing buffer.

Root cause

The Android FD broker requires server registration before other clients can map a buffer, but the error path returned without rolling back the shared num_buffers count or clearing the publisher's local unregistered buffer state.

Fix

Roll back ccb_->num_buffers on registration failure when it still points at the failed generation, clear the corresponding BCB sizes, and unmap/drop local buffers beyond the last fully registered generation. Also make SUBSPACE_SHMEM_MODE overridable so the Android memfd path can be tested on Linux, and update memfd includes accordingly.

Validation

  • bazelisk test //client:client_test --test_filter=AndroidBufferRegistrationTest.FailedRegistrationRollsBackNumBuffers --cxxopt=-DSUBSPACE_SHMEM_MODE=SUBSPACE_SHMEM_MODE_ANDROID
  • bazelisk test //client:client_test --test_filter=ClientTest.Resize1:ClientTest.ResizeCallback
  • bazelisk test //client:client_test
  • bazelisk test //client:client_test --cxxopt=-DSUBSPACE_SHMEM_MODE=SUBSPACE_SHMEM_MODE_ANDROID

Re-created from #81 so the commit is authored under a CLA-covered account; supersedes #81.

dallison added 2 commits June 4, 2026 12:34
On Android non-split memfd channels, PublisherImpl::CreateOrAttachBuffers
advanced ccb_->num_buffers before registering the new buffer FD with the
server. If RegisterClientBuffer failed after that CAS, subscribers could
observe a new buffer generation the server could not provide, causing
reload failures and an eventual abort when a slot referenced the missing
buffer.

Roll back ccb_->num_buffers on registration failure when it still points
at the failed generation, clear the corresponding BCB sizes, and
unmap/drop local buffers beyond the last fully registered generation.
Also make SUBSPACE_SHMEM_MODE overridable so the Android memfd path can
be tested on Linux, and update memfd includes accordingly.
…create

CreateSplitSharedMemoryBuffer referenced the memfd `fd` after the
#ifdef __NR_memfd_create / #else block, so building the Android shmem
path on a platform without memfd_create (e.g. macOS) failed to compile.
Move the descriptor handling inside the #ifdef so the unsupported path
just returns UnimplementedError.

Also skip AndroidBufferRegistrationTest when memfd_create is unavailable
instead of failing, so the Android shmem mode can be built and exercised
on macOS.
@dallison dallison merged commit ad6d598 into main Jun 4, 2026
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant