GH-38281: [Go] Ensure CData imported arrays are freed on release #38314

Merged
merged 6 commits into from
Oct 18, 2023

Conversation

zeroshade
Member

@zeroshade zeroshade commented Oct 17, 2023

Rationale for this change

Because imported data is cleaned up through SetFinalizer, calling Release() on an imported Record or Array is not guaranteed to actually free the memory during the lifetime of the process. Instead, we can use a shared buffer count, atomic ref counting, and a custom allocator to ensure proper and more timely release of memory when importing through the C Data Interface.
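
As a rough illustration of the atomic ref counting idea, the sketch below frees a resource at the moment the last reference is released rather than whenever the garbage collector happens to run a finalizer. The names `refCounted` and `newRefCounted` are hypothetical and use only the standard library; this is not the code from this PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCounted is a hypothetical stand-in for imported array data.
// It only illustrates deterministic release; it is not Arrow's type.
type refCounted struct {
	refs    atomic.Int64
	release func() // invoked once, when the count reaches zero
}

// newRefCounted returns an object whose creator holds the first reference.
func newRefCounted(release func()) *refCounted {
	r := &refCounted{release: release}
	r.refs.Store(1)
	return r
}

// Retain adds a reference.
func (r *refCounted) Retain() { r.refs.Add(1) }

// Release drops a reference and frees the resource on the last one.
func (r *refCounted) Release() {
	n := r.refs.Add(-1)
	if n < 0 {
		panic("too many releases")
	}
	if n == 0 {
		r.release()
	}
}

func main() {
	freed := false
	data := newRefCounted(func() { freed = true })

	data.Retain()  // a second owner appears
	data.Release() // first owner is done; memory must stay alive
	fmt.Println("freed after first Release:", freed)

	data.Release() // last owner is done; memory is freed immediately
	fmt.Println("freed after last Release:", freed)
}
```

With a finalizer, the second `Release` would only make the object eligible for cleanup at some later garbage collection, which may never happen before the process exits.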

What changes are included in this PR?

  • Simplifying code by using unsafe.Slice instead of the deprecated reflect.SliceHeader handling, which improves readability
  • Updating tests to use mallocator.Mallocator so they can easily verify that memory is being cleaned up and freed
  • Fixing a series of memory leaks that were uncovered once mallocator.Mallocator was used to track the allocations for test arrays

Are these changes tested?

Yes, unit tests are updated and included.

@github-actions

⚠️ GitHub issue #38281 has been automatically assigned in GitHub to PR creator.

Comment on lines -433 to -439
runtime.SetFinalizer(imp.data, func(arrow.ArrayData) {
	defer C.free(unsafe.Pointer(arr))
	C.ArrowArrayRelease(arr)
	if C.ArrowArrayIsReleased(arr) != 1 {
		panic("did not release C mem")
	}
})
Contributor

Might be a good idea to keep the finalizer in debug mode to assert that the ref count that's maintained manually via Retain/Release is indeed zero when the object is not referenced anymore.

Member Author

We don't actually expose any API to externally check the current refcount on the objects, so there's not really a way for the finalizer to check that. That's why I added the new tests to confirm that Release does actually free the memory. Users can also use the CheckedAllocator if they want to ensure that things are being properly released.
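
For illustration, the general idea behind an allocator that checks for leaks can be sketched with the standard library alone: count the bytes handed out, subtract on free, and assert that the total returns to zero. The type `trackingAllocator` and its methods are hypothetical names for this sketch and are not the CheckedAllocator API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// trackingAllocator is a hypothetical allocator that records how many
// bytes are currently outstanding.
type trackingAllocator struct {
	inUse atomic.Int64
}

// Allocate returns a buffer of the given size and records it.
func (a *trackingAllocator) Allocate(size int) []byte {
	a.inUse.Add(int64(size))
	return make([]byte, size)
}

// Free records that the buffer has been given back.
func (a *trackingAllocator) Free(b []byte) {
	a.inUse.Add(-int64(len(b)))
}

// CurrentAlloc reports the number of bytes not yet freed.
func (a *trackingAllocator) CurrentAlloc() int64 {
	return a.inUse.Load()
}

func main() {
	mem := &trackingAllocator{}

	buf := mem.Allocate(64)
	fmt.Println("in use:", mem.CurrentAlloc())

	mem.Free(buf)
	fmt.Println("in use after free:", mem.CurrentAlloc())
}
```

A test built on this pattern would fail whenever a Release path forgets to return a buffer, because the outstanding count would stay above zero at the end of the test.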

Contributor

Ahh. Makes sense. I do remember unit tests complaining about my refcounts when I was implementing ListView.

Contributor

@felipecrv felipecrv left a comment

LGTM

@zeroshade zeroshade merged commit 0428c5e into apache:main Oct 18, 2023
25 checks passed
@zeroshade zeroshade removed the awaiting changes Awaiting changes label Oct 18, 2023
@zeroshade zeroshade deleted the c-data-release-ensure branch October 18, 2023 17:23
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 0428c5e.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 14 possible false positives for unstable benchmarks that are known to sometimes produce them.

JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 23, 2023
apache#38314)

### Rationale for this change
The usage of `SetFinalizer` means that it's not *guaranteed* that calling `Release()` on an imported Record or Array will actually free the memory during the lifetime of the process. Instead we can leverage a shared buffer count, atomic ref counting and a custom allocator to ensure proper and more timely memory releasing when importing from C Data interface.

### What changes are included in this PR?
* Some simplifications of code to use `unsafe.Slice` instead of the deprecated handling of `reflect.SliceHeader` to improve readability
* Updating tests using `mallocator.Mallocator` in order to easily allow testing to ensure that memory is being cleaned up and freed
* Fixing a series of memory leaks subsequently found by the previous change of using the `mallocator.Mallocator` to track the allocations used for testing arrays.

### Are these changes tested?
Yes, unit tests are updated and included.

* Closes: apache#38281

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
apache#38314)

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
apache#38314)

Successfully merging this pull request may close these issues.

[Go] Release() on C Data-imported arrays and batches does not ensure release() is called
3 participants