Describe the bug, including details regarding any error messages, version, and platform.
Surfaced while implementing #825 (closes #815). See the PR's "Out of scope" section and this comment.
*scalar.Extension does not implement Release() or Retain(). As a result, when compute.ScalarDatum.Release() is called on a *ScalarDatum whose Value is an extension scalar, the storage scalar's Release() is never invoked and any underlying buffers leak.
Mechanism
compute.ScalarDatum.Release() only forwards Release() if the wrapped scalar satisfies the unexported releasable interface (arrow/compute/datum.go#L122-L130):
type releasable interface {
Release()
}
func (d *ScalarDatum) Release() {
if v, ok := d.Value.(releasable); ok {
v.Release()
}
}
*scalar.Extension (arrow/scalar/scalar.go#L397-L467) defines value(), equals(), Validate(), ValidateFull(), CastTo(), and String() — but neither Release() nor Retain(). The type assertion in ScalarDatum.Release() therefore fails and the inner storage scalar is never released, even though the storage scalar itself (e.g. *scalar.FixedSizeList) carries a buffer-bearing array.
Peer scalar types implement the pair correctly by delegating to their inner storage. For example, *scalar.Binary (arrow/scalar/binary.go#L43-L53):
func (b *Binary) Retain() {
if b.Value != nil {
b.Value.Retain()
}
}
func (b *Binary) Release() {
if b.Value != nil {
b.Value.Release()
}
}
*scalar.List, *scalar.Struct, *scalar.Dictionary, *scalar.SparseUnion, *scalar.DenseUnion, and *scalar.RunEndEncoded all follow the same pattern in arrow/scalar/nested.go. *scalar.Extension is the lone outlier among scalar types whose storage holds buffers.
Affected paths
Anything that produces a *scalar.Extension and reaches ScalarDatum.Release(). Concretely in this repo:
Reproducer
Drive any of the affected paths with memory.NewCheckedAllocator and call Release() on the resulting *ScalarDatum. AssertSize(t, 0) will fail because the storage scalar's buffer is never released. (#825's TestLiteralToDatumIntervalYearToMonth deliberately uses memory.DefaultAllocator for exactly this reason and explicitly defers the leak fix to this issue.)
Expected behavior
*scalar.Extension should implement Release() and Retain() by delegating to s.Value when the inner storage scalar implements Releasable (arrow/scalar/scalar.go#L68-L71), mirroring *scalar.Binary and the nested scalars. Suggested implementation:
func (s *Extension) Retain() {
if r, ok := s.Value.(Releasable); ok {
r.Retain()
}
}
func (s *Extension) Release() {
if r, ok := s.Value.(Releasable); ok {
r.Release()
}
}
This is purely additive:
- Storage scalars that don't implement
Releasable (e.g. primitive numeric storage) become no-ops, matching today's behavior.
- Storage scalars that do (e.g.
*scalar.FixedSizeList for the interval extension types) get properly released — closing the leak on the existing *types.IntervalYearToMonthType / *types.IntervalDayType paths and on any future extension types whose storage carries buffers.
- The
Retain() half restores symmetry: today, a caller that takes ownership of an extension scalar has no way to bump the storage refcount.
A follow-up test using memory.NewCheckedAllocator with AssertSize(t, 0) around the interval extension paths in arrow/compute/exprs/exec_internal_test.go would lock in the fix.
Component(s)
Other (arrow/scalar, arrow/compute)
Describe the bug, including details regarding any error messages, version, and platform.
Surfaced while implementing #825 (closes #815). See the PR's "Out of scope" section and this comment.
*scalar.Extensiondoes not implementRelease()orRetain(). As a result, whencompute.ScalarDatum.Release()is called on a*ScalarDatumwhoseValueis an extension scalar, the storage scalar'sRelease()is never invoked and any underlying buffers leak.Mechanism
compute.ScalarDatum.Release()only forwardsRelease()if the wrapped scalar satisfies the unexportedreleasableinterface (arrow/compute/datum.go#L122-L130):*scalar.Extension(arrow/scalar/scalar.go#L397-L467) definesvalue(),equals(),Validate(),ValidateFull(),CastTo(), andString()— but neitherRelease()norRetain(). The type assertion inScalarDatum.Release()therefore fails and the inner storage scalar is never released, even though the storage scalar itself (e.g.*scalar.FixedSizeList) carries a buffer-bearing array.Peer scalar types implement the pair correctly by delegating to their inner storage. For example,
*scalar.Binary(arrow/scalar/binary.go#L43-L53):*scalar.List,*scalar.Struct,*scalar.Dictionary,*scalar.SparseUnion,*scalar.DenseUnion, and*scalar.RunEndEncodedall follow the same pattern inarrow/scalar/nested.go.*scalar.Extensionis the lone outlier among scalar types whose storage holds buffers.Affected paths
Anything that produces a
*scalar.Extensionand reachesScalarDatum.Release(). Concretely in this repo:*types.IntervalYearToMonthTypeand*types.IntervalDayTypebranches in*expr.ProtoLiteralhandling, factored intointervalYearToMonthDatum/ the inline*types.IntervalDayTypebranch inarrow/compute/exprs/exec.go(theintervalYearToMonthDatumhelper is added in feat(arrow/compute/exprs): support expr.IntervalYearToMonthLiteral in literalToDatum #825; the underlying leak predates that PR).expr.IntervalYearToMonthLiteralvalue/pointer cases added in feat(arrow/compute/exprs): support expr.IntervalYearToMonthLiteral in literalToDatum #825.intervalYear()/intervalDay()extension type constructors inarrow/compute/exprs/extension_types.go#L142-L147.Reproducer
Drive any of the affected paths with
memory.NewCheckedAllocatorand callRelease()on the resulting*ScalarDatum.AssertSize(t, 0)will fail because the storage scalar's buffer is never released. (#825'sTestLiteralToDatumIntervalYearToMonthdeliberately usesmemory.DefaultAllocatorfor exactly this reason and explicitly defers the leak fix to this issue.)Expected behavior
*scalar.Extensionshould implementRelease()andRetain()by delegating tos.Valuewhen the inner storage scalar implementsReleasable(arrow/scalar/scalar.go#L68-L71), mirroring*scalar.Binaryand the nested scalars. Suggested implementation:This is purely additive:
Releasable(e.g. primitive numeric storage) become no-ops, matching today's behavior.*scalar.FixedSizeListfor the interval extension types) get properly released — closing the leak on the existing*types.IntervalYearToMonthType/*types.IntervalDayTypepaths and on any future extension types whose storage carries buffers.Retain()half restores symmetry: today, a caller that takes ownership of an extension scalar has no way to bump the storage refcount.A follow-up test using
memory.NewCheckedAllocatorwithAssertSize(t, 0)around the interval extension paths inarrow/compute/exprs/exec_internal_test.gowould lock in the fix.Component(s)
Other (
arrow/scalar,arrow/compute)