Go binding: do not automatically close database objects #11394

Open: 1 commit to merge into base: main

@gm42 gm42 commented May 14, 2024

Setting a finalizer on the database makes it unsafe for the user to call Close(): doing so randomly results in a SIGSEGV or in some other, silent memory corruption.

See #11383 (comment) for an example.

This change sets no finalizer for the database and instead expects the user to call Close() to avoid memory leaks.

NOTE: this must be considered a breaking change, since clients that do not update their usage of the binding will automatically "upgrade" to a memory leak if they open many databases without closing any.
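
A minimal sketch of the usage pattern this change expects. It does not use the real binding: the `database` type, `openDatabase`, and the `liveHandles` counter are hypothetical stand-ins for a native handle owned by the C client library.

```go
package main

import "fmt"

// liveHandles counts simulated native handles. It stands in for memory
// owned by the native client library (hypothetical, illustration only).
var liveHandles int

// database is a hypothetical stand-in for the binding's Database type.
type database struct {
	id int
}

// openDatabase simulates creating a native database handle.
// No finalizer is registered, so the caller owns the handle.
func openDatabase() *database {
	liveHandles++
	return &database{id: liveHandles}
}

// Close releases the simulated handle. It must be called exactly once
// per database and is not safe for concurrent use.
func (d *database) Close() {
	liveHandles--
}

// useDatabase shows the expected pattern: open, defer Close, do work.
func useDatabase() {
	db := openDatabase()
	defer db.Close()
	// ... run transactions against db here ...
}

func main() {
	useDatabase()
	fmt.Println("live handles after explicit Close:", liveHandles)

	_ = openDatabase() // never closed: without a finalizer this handle leaks
	fmt.Println("live handles after a forgotten Close:", liveHandles)
}
```

The second half of `main` is the breaking part: code that relied on the garbage collector to release the handle now has to call Close() itself.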

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

gm42 commented May 14, 2024

Cc @johscheuer

An alternative approach would be to use an atomic.Value or a mutex to make sure that calling destroy() multiple times is safe.
I can change the PR for that approach, if it's preferable.

In the Go binding there are only three objects that have a finalizer:

  • transactions
  • futures
  • database

Currently only database has a user-reachable Close(), so the other two are unaffected by this issue. Since databases are (usually) not created with high frequency, introducing a mutex or atomic.Value is acceptable: the small performance cost is not expected to matter.
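
A rough sketch of that alternative, assuming a hypothetical `guardedDatabase` type. The `destroyCalls` counter stands in for the native destroy call and none of these names come from the binding; only `runtime.SetFinalizer` and `sync.Mutex` are real library APIs.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// guardedDatabase is a hypothetical stand-in for the binding's database.
type guardedDatabase struct {
	mu           sync.Mutex
	destroyed    bool
	destroyCalls int // counts simulated native destroy calls
}

// destroy is shared by the finalizer and by Close. The mutex makes it
// safe to call concurrently, and the flag makes repeated calls a no-op.
func (d *guardedDatabase) destroy() {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.destroyed {
		return
	}
	d.destroyed = true
	d.destroyCalls++ // the native destroy call would go here
}

// Close lets the user release the handle before the finalizer runs.
func (d *guardedDatabase) Close() {
	d.destroy()
}

// newGuardedDatabase keeps the finalizer as a safety net.
func newGuardedDatabase() *guardedDatabase {
	d := &guardedDatabase{}
	runtime.SetFinalizer(d, (*guardedDatabase).destroy)
	return d
}

func main() {
	db := newGuardedDatabase()

	// Several goroutines race to close the same database.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			db.Close()
		}()
	}
	wg.Wait()

	fmt.Println("native destroy calls:", db.destroyCalls)
}
```

The guard only protects the destroy path. It does nothing for a goroutine that is still using the database while another one closes it, so the caller remains responsible for that ordering.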

@johscheuer johscheuer left a comment

I have to take some time to go over the changes.

In general I would prefer that all structs like Database, Future and Transaction implement a Close() method. That would give the user more control over the life cycle of the structs and would fit well into the Go model, where you just defer a call to clean up the resources.

gm42 commented May 14, 2024

> I have to take some time to go over the changes.

Understood, thanks for the info. I would also prefer that you take all the time needed, since rushing can only favor the introduction of bugs.

> In general I would prefer that all structs like Database, Future and Transaction implement a Close() method. That would give the user more control over the life cycle of the structs and would fit well into the Go model, where you just defer a call to clean up the resources.

I think that is precisely the more idiomatic Go way, i.e. letting the user call Close() in a defer instead of setting finalizers. If in the future the binding exposes Close() methods for transactions and futures as well, the same issue I raised here would appear: the function called by the finalizer and the function called by the user (Close()) must be safe to call concurrently and multiple times.

Given that constraint I can see only two possible approaches:

  1. removing the finalizer (as done in this PR)
  2. using a mutex or atomic.Value to make sure that multiple calls are safe

Let me know if you prefer to retool for (2), after you have time to look into the changes.
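
For reference, option (2) could also be sketched with an atomic compare-and-swap instead of a mutex. This is an illustration only: `atomicDatabase` and its fields are hypothetical names, and it uses `atomic.CompareAndSwapInt32` rather than the `atomic.Value` mentioned above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// atomicDatabase is a hypothetical stand-in for the binding's database.
type atomicDatabase struct {
	closed       int32 // 0 = open, 1 = destroyed; accessed atomically
	destroyCalls int32 // counts simulated native destroy calls
}

// destroy can be called by both a finalizer and Close. Only the caller
// that wins the compare-and-swap performs the (simulated) native call.
func (d *atomicDatabase) destroy() {
	if !atomic.CompareAndSwapInt32(&d.closed, 0, 1) {
		return // another caller already destroyed the handle
	}
	atomic.AddInt32(&d.destroyCalls, 1) // the native destroy call would go here
}

// Close is the user-facing entry point.
func (d *atomicDatabase) Close() {
	d.destroy()
}

func main() {
	db := &atomicDatabase{}

	// Several goroutines race to close the same database.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			db.Close()
		}()
	}
	wg.Wait()

	fmt.Println("native destroy calls:", atomic.LoadInt32(&db.destroyCalls))
}
```

One difference from a mutex: a caller that loses the compare-and-swap returns immediately, possibly before the winner has finished destroying the handle.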

johscheuer previously approved these changes May 27, 2024
@johscheuer johscheuer left a comment

> Given that constraint I can see only two possible approaches:
>
> 1. removing the finalizer (as done in this PR)
> 2. using a mutex or atomic.Value to make sure that multiple calls are safe
>
> Let me know if you prefer to retool for (2), after you have time to look into the changes.

Removing the finalizer is, from my point of view, the best option. I wouldn't try to make it more complex than needed. Most (if not all) of the types in the Go standard library that implement Close() are not safe to call concurrently either: https://cs.opensource.google/go/go/+/refs/tags/go1.22.3:src/os/file_posix.go;l=15-24.

// You have to ensure that you're not resuing this database.
// It must be called exactly once for each created database.
// You have to ensure that you're not reusing this database
// after it has been closed.
Contributor

Should we return an error in the close method if the DB is already closed?

Contributor Author

In that case the Close() function should set d.ptr to nil, which it currently doesn't, and calling Close() on an improperly constructed Database would have an undefined effect (most likely a SIGSEGV) instead of the current behavior (a no-op).

I can make the change and add some test coverage. However, that likely means that when Close() is added for futures and transactions in the future, they would also need this return value for consistency, and that might clutter the calling code a bit.

Since having this error on Close() still does not do anything regarding concurrency-safety (caller must make sure of that), we could leave it without error; let me know what you prefer.

The error return value can always be added later in its own PR.
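
A sketch of what the error-returning variant might look like, under stated assumptions: `errClosed`, `handle`, and this `database` type are hypothetical and only mirror the `d.ptr` field mentioned above. In this sketch a zero-value database is reported as already closed, which may differ from what the binding would do.

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is a hypothetical sentinel error; the binding does not define it.
var errClosed = errors.New("database already closed")

// handle stands in for the native pointer held in d.ptr.
type handle struct{}

// database is a hypothetical stand-in for the binding's database.
type database struct {
	ptr *handle
}

// Close releases the handle and reports a repeated call as an error.
// It is still not safe for concurrent use: two goroutines can both
// observe a non-nil ptr before either one sets it to nil.
func (d *database) Close() error {
	if d.ptr == nil {
		return errClosed
	}
	// The native destroy call would go here.
	d.ptr = nil
	return nil
}

func main() {
	db := &database{ptr: &handle{}}
	fmt.Println("first Close:", db.Close())
	fmt.Println("second Close:", db.Close())

	var zero database // never given a handle: ptr is nil
	fmt.Println("zero-value Close:", zero.Close())
}
```

With this shape the usual `defer db.Close()` still compiles, but callers who want to see the error have to check it explicitly, which is the clutter mentioned above.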

@johscheuer johscheuer left a comment

@jzhou77 this PR would introduce a breaking change in the Go bindings. Is there a good place to announce this, e.g. in the release notes (https://github.com/apple/foundationdb/tree/main/documentation/sphinx/source/release-notes)?

@johscheuer johscheuer closed this May 27, 2024
@johscheuer johscheuer reopened this May 27, 2024
@foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 9fd1f5a
  • Duration 0:21:42
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 9fd1f5a
  • Duration 0:34:41
  • Result: ❌ FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 9fd1f5a
  • Duration 0:35:50
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 9fd1f5a
  • Duration 0:46:59
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 9fd1f5a
  • Duration 0:47:12
  • Result: ❌ FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 9fd1f5a
  • Duration 0:51:29
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

Setting the finalizer prevents user from calling Close(),
as it would randomly result in SIGSEGV or some other silent memory corruption.
No finalizer is set for the database and user is expected to call Close() to
avoid memory leaks.
@foundationdb-ci

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 399c62a
  • Duration 0:22:51
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 399c62a
  • Duration 0:34:54
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 399c62a
  • Duration 0:46:02
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 399c62a
  • Duration 0:47:56
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 399c62a
  • Duration 0:53:33
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 399c62a
  • Duration 1:01:19
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
