Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Go] Release() on C Data-imported arrays and batches does not ensure release() is called #38281

Closed
pitrou opened this issue Oct 16, 2023 · 11 comments · Fixed by #38314
Closed
Assignees
Milestone

Comments

@pitrou
Copy link
Member

pitrou commented Oct 16, 2023

Describe the bug, including details regarding any error messages, version, and platform.

The current implementation of array and record batch importation in Go sets a finalizer to trigger the release callback when any reference to the underlying ArrayData is collected.

Unfortunately, Go finalizers are unreliable and even a GC run is not necessarily sufficient to get them invoked.

Relevant excerpts from https://pkg.go.dev/runtime#SetFinalizer :

The finalizer is scheduled to run at some arbitrary time after the program can no longer reach the object to which obj points. There is no guarantee that finalizers will run before a program exits, so typically they are useful only for releasing non-memory resources associated with an object during a long-running program.

A single goroutine runs all finalizers for a program, sequentially.

Component(s)

Go

@pitrou
Copy link
Member Author

pitrou commented Oct 16, 2023

cc @zeroshade

@pitrou
Copy link
Member Author

pitrou commented Oct 16, 2023

It seems we could instead use the same approach as C++ and Java do: when importing buffers, bind them to a ref-counted Allocator that will decref on Free and call the release callback when the decref drops to zero.

@zeroshade
Copy link
Member

While there is no guarantee that a particular finalizer will run before a program exits, it will eventually be called over the lifetime of a program or the program will exit (at which point it becomes moot since it would be released like any other memory from the program). Even though it's not guaranteed on any individual GC-run, in practice I've found that running the GC is enough to trigger the finalizer (and is how the python <--> Go integration tests in that dir work).

It seems we could instead use the same approach as C++ and Java do: when importing buffers, bind them to a ref-counted Allocator that will decref on Free and call the release callback when the decref drops to zero.

I can see a way in which we could possibly do this because Allocator is an interface, i'll see if i can sketch something out that can work for this.

@pitrou
Copy link
Member Author

pitrou commented Oct 16, 2023

Even though it's not guaranteed on any individual GC-run, in practice I've found that running the GC is enough to trigger the finalizer (and is how the python <--> Go integration tests in that dir work).

Indeed. However, in the real world, it can be annoying that a large piece of data doesn't get collected even after a GC run. My experience with C Data integration is that I have to run the GC up to 10 times.

(and I suppose Arrow Go users would like Release() to work correctly?)

@pitrou pitrou added this to the 15.0.0 milestone Oct 16, 2023
@zeroshade
Copy link
Member

That's absolutely interesting and not what I expected (though it's also not unexpected), I'll see what I can come up with.

@felipecrv
Copy link
Contributor

That's absolutely interesting and not what I expected (though it's also not unexpected), I'll see what I can come up with.

This is a classic issue with GC finalizers: they can't run at a deterministic time because the GC has to prove it's not referenced anymore (by the end of a tracing cycle) and even then it can decide to delay the finalization/collection to a later collection cycle to keep the cost of GC pauses low. This is a fundamental issue of tracing-based GCs, so calling finalizers "unreliable" is a bit unfair. :-)

Go's GC is also incremental: a tracing cycle doesn't traverse the whole graph at once. So these 10 invocations are necessary due to the size of the object graph. The bigger the graph, the higher the number of incremental steps necessary to reach an specific object in the graph.

(and I suppose Arrow Go users would like Release() to work correctly?)

That's very unsafe because code retaining references to the object won't know whether Release has been already called on the shared reference.

@pitrou
Copy link
Member Author

pitrou commented Oct 17, 2023

This is a fundamental issue of tracing-based GCs, so calling finalizers "unreliable" is a bit unfair.

I don't think so. Python finalizers work reliably after a GC pass, they don't stay pending until several other GC passes, waiting for a goroutine to pick them up.

(Python is reference-counted but also has a mark-and-sweep GC for reference cycles)

Go finalizers being executed "eventually" is a design choice, not a fundamental obligation.

Go's GC is also incremental: a tracing cycle doesn't traverse the whole graph at once. So these 10 invocations are necessary due to the size of the object graph.

Go's GC may be incremental, but Runtime.GC is not supposed to be. At least I couldn't find a source stating that Runtime.GC only performs an incremental GC pass.

That's very unsafe because code retaining references to the object won't know whether Release has been already called on the shared reference.

Code retaining references to the object should have called Retain() to ensure they own the reference to the object.

@felipecrv
Copy link
Contributor

(Python is reference-counted but also has a mark-and-sweep GC for reference cycles)

If you're keeping ref-counts, your GC can be precise. Even though you trace to break cycles, "tracing GC" refers to GCs that don't have ref-counts and always have to trace.

Go's GC may be incremental, but Runtime.GC is not supposed to be. At least I couldn't find a source stating that Runtime.GC only performs an incremental GC pass.

Interesting. Docs indeed say that, but apparently there is work being delayed if we're observing that an arbitrary number of cycles is necessary to get a collection.

Code retaining references to the object should have called Retain() to ensure they own the reference to the object.

Well... sure. If that is a public API this will be easier said than done.

@pitrou
Copy link
Member Author

pitrou commented Oct 17, 2023

Interesting. Docs indeed say that, but apparently there is work being delayed if we're observing that an arbitrary number of cycles is necessary to get a collection.

This is where we disagree on the interpretation :-). You are saying that Runtime.GC does not reliably ensure a full GC pass occurred. I could not find a reference to support that point. Other languages like Java or C# with sophisticated GCs also expose functions to ensure a full GC pass, so it's at least conceptually possible to do so.

My interpretation reading the Runtime.SetFinalizer doc is that, even if the GC detects that an object does not have any external references anymore, its finalization might be arbitrarily delayed (until the end of the process if you're unlucky).

Well... sure. If that is a public API this will be easier said than done.

AFAICT, all Go Arrow objects that expose a refcounted Release method also expose a Retain counterpart. But @zeroshade should be able to confirm or infirm this.

@felipecrv
Copy link
Contributor

felipecrv commented Oct 17, 2023

Haha. I wasn't interpreting the docs, just commenting about tracing GCs in general (the ones without ref-counting) based on what I've read in the past about Go's GC. You convinced me that the docs give very mixed signals.

Other languages like Java or C# with sophisticated GCs also expose functions to ensure a full GC pass, so it's at least conceptually possible to do so.

It's possible, but extremely costly, so people developing these runtimes tend to cheat to prevent their customers from killing their servers' throughput by abusing the manual GC API. Designs that rely on finalizers are very frowned upon.

AFAICT, all Go Arrow objects that expose a refcounted Release method also expose a Retain counterpart. But @zeroshade should be able to confirm or infirm this.

Yes. It's a common pattern. It would be a problem if the API was already being used publicly relying solely on GC's semantics.

@zeroshade
Copy link
Member

The reason why the Release/Retain API was put forth for the Arrow objects is because we have the Allocator interface to allow for allocations which don't rely on the Go GC, so we maintain the reference counts in order to ensure proper cleanup calls to Free on the allocator.

I leveraged this for #38314 (review)

zeroshade added a commit that referenced this issue Oct 18, 2023
)

### Rationale for this change
The usage of `SetFinalizer` means that it's not *guaranteed* that calling `Release()` on an imported Record or Array will actually free the memory during the lifetime of the process. Instead we can leverage a shared buffer count, atomic ref counting and a custom allocator to ensure proper and more timely memory releasing when importing from C Data interface.

### What changes are included in this PR?
* Some simplifications of code to use `unsafe.Slice` instead of the deprecated handling of `reflect.SliceHeader` to improve readability
* Updating tests using `mallocator.Mallocator` in order to easily allow testing to ensure that memory is being cleaned up and freed
* Fixing a series of memory leaks subsequently found by the previous change of using the `mallocator.Mallocator` to track the allocations used for testing arrays.

### Are these changes tested?
Yes, unit tests are updated and included.

* Closes: #38281

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
JerAguilon pushed a commit to JerAguilon/arrow that referenced this issue Oct 23, 2023
apache#38314)

### Rationale for this change
The usage of `SetFinalizer` means that it's not *guaranteed* that calling `Release()` on an imported Record or Array will actually free the memory during the lifetime of the process. Instead we can leverage a shared buffer count, atomic ref counting and a custom allocator to ensure proper and more timely memory releasing when importing from C Data interface.

### What changes are included in this PR?
* Some simplifications of code to use `unsafe.Slice` instead of the deprecated handling of `reflect.SliceHeader` to improve readability
* Updating tests using `mallocator.Mallocator` in order to easily allow testing to ensure that memory is being cleaned up and freed
* Fixing a series of memory leaks subsequently found by the previous change of using the `mallocator.Mallocator` to track the allocations used for testing arrays.

### Are these changes tested?
Yes, unit tests are updated and included.

* Closes: apache#38281

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
apache#38314)

### Rationale for this change
The usage of `SetFinalizer` means that it's not *guaranteed* that calling `Release()` on an imported Record or Array will actually free the memory during the lifetime of the process. Instead we can leverage a shared buffer count, atomic ref counting and a custom allocator to ensure proper and more timely memory releasing when importing from C Data interface.

### What changes are included in this PR?
* Some simplifications of code to use `unsafe.Slice` instead of the deprecated handling of `reflect.SliceHeader` to improve readability
* Updating tests using `mallocator.Mallocator` in order to easily allow testing to ensure that memory is being cleaned up and freed
* Fixing a series of memory leaks subsequently found by the previous change of using the `mallocator.Mallocator` to track the allocations used for testing arrays.

### Are these changes tested?
Yes, unit tests are updated and included.

* Closes: apache#38281

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
apache#38314)

### Rationale for this change
The usage of `SetFinalizer` means that it's not *guaranteed* that calling `Release()` on an imported Record or Array will actually free the memory during the lifetime of the process. Instead we can leverage a shared buffer count, atomic ref counting and a custom allocator to ensure proper and more timely memory releasing when importing from C Data interface.

### What changes are included in this PR?
* Some simplifications of code to use `unsafe.Slice` instead of the deprecated handling of `reflect.SliceHeader` to improve readability
* Updating tests using `mallocator.Mallocator` in order to easily allow testing to ensure that memory is being cleaned up and freed
* Fixing a series of memory leaks subsequently found by the previous change of using the `mallocator.Mallocator` to track the allocations used for testing arrays.

### Are these changes tested?
Yes, unit tests are updated and included.

* Closes: apache#38281

Authored-by: Matt Topol <zotthewizard@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants