Skip to content

fix(arrow/cdata): make nativeCRecordBatchReader deterministic#793

Merged
zeroshade merged 4 commits intoapache:mainfrom
zeroshade:deterministic-cdata
May 1, 2026
Merged

fix(arrow/cdata): make nativeCRecordBatchReader deterministic#793
zeroshade merged 4 commits intoapache:mainfrom
zeroshade:deterministic-cdata

Conversation

@zeroshade
Copy link
Copy Markdown
Member

Rationale for this change

Instead of relying on a finalizer, make the nativeCRecordBatchReader use atomic Retain and Release to make releasing deterministic.

What changes are included in this PR?

Remove the finalizer, call C.ArrowArrayRelease and C.ArrowArrayStreamRelease based on refcount.

Are these changes tested?

Yes, tests cover this already.

Are there any user-facing changes?

Only that retain and release now properly control the determinism of releasing the C memory instead of relying on a finalizer.

@zeroshade zeroshade requested a review from lidavidm April 30, 2026 17:15
Comment thread arrow/cdata/cdata.go Outdated
lidavidm pushed a commit to apache/arrow-adbc that referenced this pull request May 1, 2026
Looking into
adbc-drivers/mysql#99 (comment)
resulted in finding some of these issues, along with
apache/arrow-go#793

## Summary
Three bugs found and fixed in the CGO driver template
(`_tmpl/driver.go.tmpl`) and all generated drivers
(`flightsql`, `snowflake`, `panicdummy`).
### Bug 1 — off-by-one in `exportStringOption` (buffer overwrite)
`exportStringOption` wrote the null terminator to
`sink[lenWithTerminator]`
(`= sink[len(val)+1]`) instead of `sink[len(val)]`. When the caller
supplied
exactly the minimum buffer size (`len(val)+1`), this wrote one byte past
the
end of the allocated buffer.
### Bug 2 — fragile `cgo.Handle` recovery in Release functions
                                                            
The four Release functions (`ArrayStreamRelease`, `DatabaseRelease`,
  `ConnectionRelease`, `StatementRelease`) recovered the handle from
`private_data` using `(*(*cgo.Handle)(ptr))` — reinterpreting the
C-allocated `uintptr_t` wrapper as a `*cgo.Handle`. This worked by
coincidence (both are `uintptr`-sized) but was inconsistent with
`getFromHandle` and relied on an undocumented type-size coincidence.
Introduces `handleFromPtr(ptr unsafe.Pointer) cgo.Handle` as the single
canonical read-back path, used by both `getFromHandle` and all Release
functions. The misleading comment claiming the GC would corrupt the
handle
is replaced with an accurate explanation of the actual CGO rule being
satisfied.
### Bug 3 — unnecessary C allocation for handle storage; wrong `Delete`
ordering
                                                            
`createHandle` allocated a `uintptr_t` via `C.calloc` to hold the
handle's
numeric value, then stored a pointer to that allocation in
`private_data`.
This was not necessary: `cgo.Handle` is `type Handle uintptr` — an
integer,
not a Go heap pointer — so the CGO checker does not object to storing it
directly in a pointer-sized `void*` field.
The C allocation is eliminated entirely. `createHandle` now stores the
handle
value directly via `unsafe.Pointer(uintptr(hndl))`, and `handleFromPtr`
casts
it back with `cgo.Handle(uintptr(ptr))`. This removes a `calloc`/`free`
pair
from every New/Release call path and eliminates any possibility of a
leak from
an early return between allocation and free.
Additionally, the Release functions were calling `h.Delete()` before
`h.Value()` in some paths, which would panic — `Delete` removes the
entry from
the handle map, invalidating any subsequent `Value` call. The correct
sequence
is now applied consistently in all four Release functions:
1. Nil `private_data` (idempotence guard for double-release)
2. `h.Value()` — extract the Go object while the handle is still live
3. `h.Delete()` — remove from the map
  4. Use the extracted object
@zeroshade zeroshade merged commit e332733 into apache:main May 1, 2026
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants