Tier 1: pool curl resources, RDMA RAII + decline-fallback, fix RemoveObjectsResult UB#215
Merged
Conversation
…oadPart headers http.cc: - Replace the per-request curlpp::Cleanup with a function-local static so curl_global_init() runs exactly once per process. The previous code re-ran the (non-thread-safe, OpenSSL-touching) global init on every HTTP call. - Attach a process-wide CURLSH to each curlpp::Easy with one std::mutex per CURL_LOCK_DATA_* slot, sharing the connection cache, DNS cache, and TLS session cache. A single mutex covering all slots deadlocked because libcurl can hold one slot's lock while acquiring another on the same thread. - Enable CURLOPT_TCP_KEEPALIVE so pooled sockets survive idle gaps between S3 calls. baseclient: - Guard region_map_ with a std::shared_mutex. Reads in HandleRedirectResponse and GetRegion now go through find() instead of operator[], so they no longer mutate the map (the prior code silently inserted empty entries on every lookup miss) and can take a shared lock; the erase and write sites take a unique lock. - UploadPart now propagates args.headers into the wrapped PutObjectApiArgs so caller-supplied per-part headers reach the signer.
…t, fix RemoveObjectsResult UB request.cc: - BuildHeaders now honors a caller-supplied x-amz-content-sha256. When a Put/Post caller pre-sets it (e.g. "UNSIGNED-PAYLOAD" for a GPU-resident body) the signer uses that value and skips utils::Sha256Hash(body), avoiding a host-side read of device memory just to compute a signing hash that TLS already authenticates. client.cc: - Wrap the part-buffer allocation and cuObj registration in two small RAII guards (AlignedBuffer, ScopedRDMARegistration) so deregistration always runs before the buffer is freed and is enforced structurally, not by careful early-return ordering. - When the cuObj layer is connected but cuMemObjGetDescriptor declines to register the buffer (e.g. nvidia_peermem.ko not loaded, IB device unhealthy, GPUDirect misconfigured), silently skip the RDMA path and fall through to HTTP multipart instead of failing the whole call. - The GPU-buffer HTTP fallback in PutObject now sets x-amz-content-sha256: UNSIGNED-PAYLOAD, and the multipart loop propagates that header into every UploadPart, so the signer does not have to hash device memory. - Fix UB in RemoveObjectsResult::Populate: when the caller's func returned false on the very first call (empty batch), done_ was set true but itr_ stayed uninitialized, so operator bool() compared an uninit iterator and the first dereference segfaulted. Pin itr_ to errors.end() when the batch is empty. tests/tests.cc: - SelectObjectContent now detects MethodNotAllowed (the server does not implement S3 Select) and prints "skipped:" instead of failing the suite.
ea3bf62 to
c0b2cc8
Compare
This was referenced May 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Tier 1 of the C++-native modernization audit, plus two latent bug fixes uncovered while running the full test suite against AIStor + RDMA. End-to-end validated on coe09 (client) against a single-node coe08 RDMA MinIO build of latest eos.
Changes
Performance (HTTP transport)
http.cc: movecurl_global_initto a one-time function-local static (was per-request viacurlpp::Cleanup; non-thread-safe, expensive — OpenSSL init etc.)http.cc: attach a process-wideCURLSHto eachcurlpp::Easyso the connection cache, DNS cache, and TLS session cache survive the Easy handle's lifetime. One mutex perCURL_LOCK_DATA_*slot — a single mutex covering all slots self-deadlocks because libcurl can hold one slot's lock while acquiring another on the same thread.http.cc: enableCURLOPT_TCP_KEEPALIVEso pooled sockets survive idle gaps.Thread safety
baseclient.{h,cc}: guardregion_map_withstd::shared_mutex. Read sites now usefind()rather thanoperator[], so they no longer silently insert empty entries on every miss and can take a shared lock; erase + write sites take a unique lock.baseclient.cc:UploadPartpropagatesargs.headersinto the wrappedPutObjectApiArgsso caller-supplied per-part headers reach the signer.GPU-direct friendliness
request.cc:BuildHeadershonors a caller-suppliedx-amz-content-sha256. When the caller pre-sets it (e.g.UNSIGNED-PAYLOADfor a device-resident body) the signer uses that value and skipsSha256Hash(body)— avoids dragging GPU memory through OpenSSL just to compute a signing hash that TLS already authenticates.client.cc: the GPU-buffer HTTP fallback inPutObjectsetsUNSIGNED-PAYLOAD, and the multipart loop propagates it into everyUploadPart.RDMA path robustness
client.cc: wrap the page-aligned part buffer andcuMemObjregistration inAlignedBuffer+ScopedRDMARegistrationRAII guards so deregister-before-free ordering is enforced structurally, not by careful early-return ordering.client.cc: whencuMemObjGetDescriptordeclines to register (peermem not loaded, IB unhealthy, GPUDirect misconfigured), silently skip the RDMA path and fall through to HTTP multipart instead of failing the whole call.Bug fixes uncovered during validation
client.cc: fix UB inRemoveObjectsResult::Populate— when the caller'sfuncreturned false on the first call (empty batch),done_flipped to true butitr_stayed uninitialized.operator bool()then compared an uninit iterator → SIGSEGV the first time a caller copied/dereferenced the result. Confirmed pre-existing inmain(reproduces with Tier 1 reverted). Pinitr_toerrors.end()on empty batches.tests/tests.cc:SelectObjectContentnow detectsMethodNotAllowed(AIStor doesn't implement S3 Select) and printsskipped:instead of failing the suite.Validation
Built with
MINIO_CPP_ENABLE_RDMA=ONagainst vendored cuObj + libcufile, on coe09 (mlx5_0 IB NIC, 400 Gbps). Server: single-node AIStor at15.15.15.59:9200from latestminiohq/eoswithmake install-cgo TAGS=rdma.RDMA round-trip (
./GetPutRDMA 15.15.15.59:9200 minioadmin minioadmin 10485760) also passes: PUT + GET of a 10 MiB host-allocated, page-aligned, cuObj-registered buffer succeed against the AIStor RDMA endpoint.Test plan
MINIO_CPP_ENABLE_RDMA=ONbuild (release)testsbinary exit 0 against AIStorGetPutRDMAround-trip against AIStor RDMA endpointMINIO_CPP_ENABLE_RDMA=OFF(non-RDMA consumers)