Not all binding implementations follow the requirements of fdb_stop_network #3015

Open
ajbeamon opened this issue Apr 23, 2020 · 6 comments

@ajbeamon
Contributor

In the current C API, it is required that fdb_stop_network be called and the network thread allowed to complete before terminating the program. It seems we aren't doing this in all of our bindings, and absent a change to this requirement as proposed in #2978, this can lead to undefined behavior.

It looks like the current state of our in-tree bindings is:

  • Ruby and Python follow the requirements of the API (stop then join)
  • Java stops the network thread but doesn't join it
  • Go doesn't stop the network thread automatically or provide any API to do it manually as far as I can tell

If #2978 is done, then this won't be an issue. However, I suspect that solving this problem is easier than the other one, so it may be worth updating the two bindings in the meantime.
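The stop-then-join ordering described above can be sketched in Go without touching the real binding. This is a minimal illustration only: the `network` type, `startNetwork`, and `StopAndJoin` are hypothetical names invented for this sketch, and the goroutine stands in for the thread that runs the blocking run-network call.

```go
package main

import (
	"fmt"
	"sync"
)

// network is a hypothetical stand-in for the FDB network thread.
// It is not part of any real binding.
type network struct {
	stop     chan struct{} // closed to request shutdown
	done     chan struct{} // closed once the run loop has fully returned
	stopOnce sync.Once
}

// startNetwork launches the simulated run loop in its own goroutine.
func startNetwork() *network {
	n := &network{stop: make(chan struct{}), done: make(chan struct{})}
	go func() {
		defer close(n.done) // lets the caller "join" the loop
		<-n.stop            // stand-in for the blocking run-network call
	}()
	return n
}

// StopAndJoin requests shutdown, then blocks until the run loop has returned.
// It is safe to call more than once.
func (n *network) StopAndJoin() {
	n.stopOnce.Do(func() { close(n.stop) }) // step 1: stop
	<-n.done                                // step 2: join
}

func main() {
	n := startNetwork()
	n.StopAndJoin()
	fmt.Println("network loop finished; safe to exit")
}
```

The point of the sketch is the ordering: requesting the stop is not enough on its own, and the caller should not proceed to program exit until the loop has been observed to finish.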

@etschannen etschannen added this to the 6.3 milestone Apr 29, 2020
@ajbeamon
Contributor Author

It seems I may have been mistaken about the Java case: it looks like our implementation of stop network does block until the run network call terminates.

@vishesh
Contributor

vishesh commented May 20, 2020

In Go land, it seems a bit harder to do.

Ruby, Python, and Java use the hooks their languages provide (atexit/onShutdownHook, respectively) to register functions that are called when the program ends. Go doesn't seem to have an equivalent. There is SetFinalizer, but it doesn't seem helpful in this case, as the documentation says:

The finalizer is scheduled to run at some arbitrary time after the program can no longer reach the object to which obj points. There is no guarantee that finalizers will run before a program exits

So it seems Go really emphasizes handling cleanup explicitly, and the solution has to be a change in the API itself.
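As a rough sketch of what explicit cleanup could look like from the caller's side, the pattern below keeps all program logic in a `run` function so that a deferred shutdown call executes before `main` exits. The `stopNetwork` variable is a hypothetical placeholder for whatever shutdown function the binding might expose; it is not an existing API.

```go
package main

import (
	"fmt"
	"os"
)

// stopNetwork is a hypothetical placeholder for an explicit shutdown call
// that the binding would need to expose.
var stopNetwork = func() { fmt.Println("network stopped") }

// run holds the program logic. Its deferred calls execute when run returns,
// which happens before main calls os.Exit.
func run() (code int) {
	defer stopNetwork()
	// ... application work using the database would go here ...
	return 0
}

func main() {
	// os.Exit does not run deferred functions, so cleanup lives in run.
	os.Exit(run())
}
```

This only covers normal returns from `run`. A direct `os.Exit` or `log.Fatal` call elsewhere in the program would still skip the deferred shutdown.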

@sfc-gh-anoyes
Collaborator

It could be that the real requirement is that you join the network thread before returning from main, and atexit is too late to avoid the undefined behavior (atexit is also when global destructors are run)

@ajbeamon
Contributor Author

Or maybe the fact that we are waiting for fdb_run_network to stop but not actually joining the thread that it's running in is a problem.

@ajbeamon
Contributor Author

@vishesh So are you saying that we should expose the stopNetwork function in Go and that's it for now?

@gm42
Contributor

gm42 commented May 14, 2024

> In Go land, it seems a bit harder to do.

Indeed it is. This is the best I could come up with:

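import (
	_ "unsafe" // required for the go:linkname directive below

	"github.com/apple/foundationdb/bindings/go/src/fdb"
)

// runtime_addExitHook binds to the unexported runtime.addExitHook, which has
// no compatibility guarantee and may change or be unavailable in other Go versions.
//go:linkname runtime_addExitHook runtime.addExitHook
func runtime_addExitHook(f func(), runOnNonZeroExit bool)

func init() {
	// Mitigation for https://github.com/apple/foundationdb/issues/3015:
	// keeps tests run with -race from crashing with SIGSEGV when destructors
	// are invoked while the network thread is still running.
	runtime_addExitHook(fdb.StopNetwork, true)
}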

I am exposing StopNetwork() in a PR here, and will test whether this approach works for tests with -race.

5 participants