runtime: support dlclose with -buildmode=c-shared #11100

Open
mattn opened this issue Jun 6, 2015 · 48 comments

@mattn mattn commented Jun 6, 2015

package main

import (
    "C"
    "fmt"
)

var (
    c chan string
)

func init() {
    c = make(chan string)
    go func() {
        n := 1
        for {
            switch {
            case n%15 == 0:
                c <- "FizzBuzz"
            case n%3 == 0:
                c <- "Fizz"
            case n%5 == 0:
                c <- "Buzz"
            default:
                c <- fmt.Sprint(n)
            }
            n++
        }
    }()
}

//export fizzbuzz
func fizzbuzz() *C.char {
    return C.CString(<-c)
}

func main() {
}

Build this with:

$ go build -buildmode=c-shared -o libfizzbuzz.so libfizzbuzz.go

Then run it from Python:

from ctypes import *
import _ctypes
lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
_ctypes.dlclose(lib._handle)

Output:

1
2
Fizz
4
Buzz
Fizz
Segmentation fault
@minux minux closed this Jun 6, 2015

@minux minux commented Jun 6, 2015

I don't understand. What do you expect otherwise?
The code is still running, and you've unmapped its
pages.

In general, it's impossible to dlclose a Go shared
library (or a plugin).

@mikioh mikioh changed the title c-shared crash with unload runtime: dlclose in shared library causes segmentation fault Jun 6, 2015

@mattn mattn commented Jun 7, 2015

Even if I close(c) and wait for the goroutine to exit, it still reproduces.

@minux minux commented Jun 7, 2015

@ianlancetaylor ianlancetaylor commented Jun 7, 2015

I think this is a legitimate feature request. Although we currently do not support calling dlclose on a Go shared library, and it would be difficult to make it work, it is not fundamentally impossible.

@ianlancetaylor ianlancetaylor reopened this Jun 7, 2015
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Jun 7, 2015

@minux minux commented Jun 7, 2015

@mattn mattn commented Jun 8, 2015

This version also reproduces the crash:

package main

import (
        "C"
        "fmt"
)

var (
        c chan string
        q chan struct{}
)

func init() {
        c = make(chan string)
        q = make(chan struct{})
        go func() {
                defer func() {
                        recover()
                        q <- struct{}{}
                }()
                n := 1
                for {
                        switch {
                        case n%15 == 0:
                                c <- "FizzBuzz"
                        case n%3 == 0:
                                c <- "Fizz"
                        case n%5 == 0:
                                c <- "Buzz"
                        default:
                                c <- fmt.Sprint(n)
                                println("stop")
                        }
                        n++
                }
        }()
}

//export fizzbuzz
func fizzbuzz() *C.char {
        return C.CString(<-c)
}

//export finish
func finish() {
        close(c) // makes the goroutine above panic with a send on a closed channel
        <-q      // wait goroutine
}

func main() {
}

Python driver:

from ctypes import *
import _ctypes
lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
print lib.fizzbuzz()
lib.finish()
_ctypes.dlclose(lib._handle)
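
The `finish` function above stops the goroutine by forcing a send-on-closed-channel panic and recovering from it. A gentler variant is sketched below, assuming a `done` channel and a `select` are acceptable; the `generator` type and its method names are made up for illustration, and cgo is left out so the sketch runs standalone. It only stops the user goroutine. As later comments explain, the runtime's own threads keep running, so this alone would not be expected to make dlclose safe.

```go
package main

import "fmt"

// generator owns one background goroutine that can be asked to stop.
// (Hypothetical type, not part of the original report.)
type generator struct {
	out  chan string   // values produced by the goroutine
	done chan struct{} // closed to request shutdown
	quit chan struct{} // closed by the goroutine once it has returned
}

// fizzbuzzValue returns the FizzBuzz string for n.
func fizzbuzzValue(n int) string {
	switch {
	case n%15 == 0:
		return "FizzBuzz"
	case n%3 == 0:
		return "Fizz"
	case n%5 == 0:
		return "Buzz"
	default:
		return fmt.Sprint(n)
	}
}

// newGenerator starts the background goroutine.
func newGenerator() *generator {
	g := &generator{
		out:  make(chan string),
		done: make(chan struct{}),
		quit: make(chan struct{}),
	}
	go func() {
		defer close(g.quit)
		for n := 1; ; n++ {
			select {
			case g.out <- fizzbuzzValue(n):
			case <-g.done:
				return // stop request: exit without panicking
			}
		}
	}()
	return g
}

// next returns the next value; it would back the exported fizzbuzz function.
func (g *generator) next() string { return <-g.out }

// finish asks the goroutine to stop and waits until it has returned.
// It must be called at most once.
func (g *generator) finish() {
	close(g.done)
	<-g.quit
}

func main() {
	g := newGenerator()
	for i := 0; i < 5; i++ {
		fmt.Println(g.next())
	}
	g.finish()
	fmt.Println("goroutine stopped")
}
```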

@mattn mattn commented Jun 8, 2015

I added runtime.LockOSThread() at the top of init(), but that did not help.

@gcatlin gcatlin commented Jul 13, 2015

I have a need for this functionality too. My goal is to be able to make changes to a running Go game engine without needing to completely reload the game.

The implementation idea is to have 2 layers, the platform layer and the game layer.

The platform layer is GOOS/GOARCH-specific and is written in C. It is responsible for the game loop wherein it gathers controller input and manages image and sound buffers that it provides to the game layer for writing. It uses OS-provided functionality to output graphics and sound.

The game layer is platform agnostic, contains game logic only, and is written in Go. It is called by the platform layer once per iteration of the game loop. The game layer takes the controller input and buffers provided by the platform layer, updates the game state, and writes to the buffers.

The game layer is a shared library, built using -buildmode=c-shared, that can be edited and recompiled at any time. When the platform layer detects that the shared library was modified, it unloads the previous version of the library (dlclose) then loads the new version of the library (dlopen).

See https://gist.github.com/gcatlin/e09359f6e53f37e74a82

I'm trying this on darwin/amd64 and getting the following error when calling dlopen on the shared library for the second time (i.e. after detecting that the shared library has changed):

runtime/cgo: could not obtain pthread_keys
    tried 0x101 0x102 0x103 0x104 0x105 0x106 0x107 0x108 0x109 0x10a 0x10b 0x10c 0x10d 0x10e 0x10f 0x110 0x111 0x112 0x113 0x115 0x116 0x117 0x118 0x119 0x11a 0x11b 0x11c 0x11d 0x11e 0x11f 0x120 0x121 0x122 0x123 0x124 0x125 0x126 0x127 0x128 0x129 0x12a 0x12b 0x12c 0x12d 0x12e 0x12f 0x130 0x131 0x132 0x133 0x134 0x135 0x136 0x137 0x138 0x139 0x13a 0x13b 0x13c 0x13d 0x13e 0x13f 0x140 0x141 0x142 0x143 0x144 0x145 0x146 0x147 0x148 0x149 0x14a 0x14b 0x14c 0x14d 0x14e 0x14f 0x150 0x151 0x152 0x153 0x154 0x155 0x156 0x157 0x158 0x159 0x15a 0x15b 0x15c 0x15d 0x15e 0x15f 0x160 0x161 0x162 0x163 0x164 0x165 0x166 0x167 0x168 0x169 0x16a 0x16b 0x16c 0x16d 0x16e 0x16f 0x170 0x171 0x172 0x173 0x174 0x175 0x176 0x177 0x178 0x179 0x17a 0x17b 0x17c 0x17d 0x17e 0x17f 0x180 0x181
fatal error: cgo callback before cgo call

Is there a different way to achieve this with Go?

@crawshaw crawshaw changed the title runtime: dlclose in shared library causes segmentation fault runtime: support dlclose with -buildmode=c-shared Jul 17, 2015
@crawshaw crawshaw added this to the Go1.6 milestone Jul 17, 2015
@crawshaw crawshaw removed this from the Unplanned milestone Jul 17, 2015
ianlancetaylor added a commit that referenced this issue Oct 8, 2015
Go shared libraries do not support dlclose, and there is no likelihood
that they will support dlclose in the future.  Set the DF_1_NODELETE
flag to tell the dynamic linker to not attempt to remove them from
memory.  This makes the shared library act as though every call to
dlopen passed the RTLD_NODELETE flag.

Fixes #12582.
Update #11100.
Update #12873.

Change-Id: Id4b6e90a1b54e2e6fc8355b5fb22c5978fc762b4
Reviewed-on: https://go-review.googlesource.com/15605
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
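
To check whether a particular c-shared build carries the flag this commit describes, one can inspect the DT_FLAGS_1 entry in the ELF dynamic section. The sketch below assumes an ELF platform and Go 1.21 or later, which is where `File.DynValue` and the `DF_1_*` constants in `debug/elf` became available. `hasNoDelete` is a helper name invented here, not something from the Go tree.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// hasNoDelete reports whether any DT_FLAGS_1 value has DF_1_NODELETE set.
// (Hypothetical helper for illustration.)
func hasNoDelete(vals []uint64) bool {
	for _, v := range vals {
		if elf.DynFlag1(v)&elf.DF_1_NODELETE != 0 {
			return true
		}
	}
	return false
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: nodelete <shared-library>")
		return
	}
	f, err := elf.Open(os.Args[1])
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	defer f.Close()

	// DynValue returns every value recorded under the given dynamic tag.
	vals, err := f.DynValue(elf.DT_FLAGS_1)
	if err != nil {
		fmt.Println("read dynamic section:", err)
		return
	}
	fmt.Println("DF_1_NODELETE set:", hasNoDelete(vals))
}
```

Running it against a library such as `libfizzbuzz.so` prints whether the bit is set; `readelf -d` on the same file should show the same information in its FLAGS_1 line.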
@rsc rsc added this to the Unplanned milestone Nov 5, 2015
@rsc rsc removed this from the Go1.6 milestone Nov 5, 2015

@z505 z505 commented May 1, 2017

Minux said:

It's possible if the user can be certain it doesn't hold onto
any Go objects and all resources allocated by Go code
has been freed (esp. no background goroutines).

However, as there is no way to kill a goroutine, I think all
sufficiently sophisticated Go shared library will not be unloadable.

Hi, so how does go know it is safe for the exe/elf to close if a user kills the application or hits the close button on a win32 gui window... There must be some way of finding out if it is safe? or go program just exits badly with hanging data around?
Or how do you halt/exit a go app safely (an exe, not a dll)..? Would exiting a go app (exe) safely be the same as unloading a dll safely?

The go runtime does not keep a reference count of the number of goroutines currently open? This should not be a performance hit to keep track of N number of goroutines open? Each goroutine opened increments a ref count by 1. But maybe there is already something implemented like this, that could be tapped into and used for dll's to know if all goroutines are finished.

Minux says:

For example, the os/signal package contains a background
goroutine to check for newly arrived signals.

Can you increment a reference counter and unload the library later once all signals are finished (goroutine ref count hit zero)? How does the exe/elf normally exit safely and kill the program if goroutines could be still running? Simply make the DLL unload function act like how a normal go exe program safely exits? easier said than done, likely :-)
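
The counting idea can be tried at the library level today, without runtime changes. The sketch below is one possible shape, assuming every goroutine the library starts goes through a small wrapper; the `tracker` type and its methods are hypothetical names. It only covers goroutines started through the wrapper. Goroutines started by imported packages and the runtime's own OS threads are outside its view, which is the gap the replies below point to.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tracker counts goroutines started through Go so a caller can wait for them.
// (Hypothetical type for illustration.)
type tracker struct {
	wg sync.WaitGroup
}

// Go runs f in a new goroutine and keeps it counted until f returns.
func (t *tracker) Go(f func()) {
	t.wg.Add(1)
	go func() {
		defer t.wg.Done()
		f()
	}()
}

// WaitTimeout reports whether every tracked goroutine finished within d.
// On timeout the helper goroutine below stays blocked until they do finish.
func (t *tracker) WaitTimeout(d time.Duration) bool {
	finished := make(chan struct{})
	go func() {
		t.wg.Wait()
		close(finished)
	}()
	select {
	case <-finished:
		return true
	case <-time.After(d):
		return false
	}
}

func main() {
	var tr tracker
	tr.Go(func() { time.Sleep(10 * time.Millisecond) })
	fmt.Println("all tracked goroutines done:", tr.WaitTimeout(time.Second))
}
```

An exported "is it safe to unload" function could return the result of `WaitTimeout`, but a `true` answer would say nothing about threads the runtime itself still owns.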

@glycerine glycerine commented May 1, 2017

Minux wrote,

You can't stop the runtime, which uses its own OS threads.

This is the root of the problem, @z505. I don't believe there is any graceful shutdown of the runtime threads under normal process exit/termination. Hence they may still have references to DLL memory. So currently there's no safety problem with process termination; just doing an exit() call, it simply unceremoniously stops the process, and doesn't need to do any graceful cleanup.

or go program just exits badly with hanging data around?

Yep.

@brutestack brutestack commented May 16, 2018

I'm using a Go c-shared library in my Unity3D game.
Unity itself does not support unloading libraries, but when you try to exit the Unity Editor or a Unity-based game, it waits until all libraries complete their work (that is, until all threads stop).

If the Go c-shared library creates at least one goroutine, Unity waits forever and cannot exit. This does not happen if the library does not use goroutines.
So Go creates a thread for my goroutine and never terminates it, even when there is nothing left to do in that thread.

Here is sample library code to reproduce:

package main

import "C"

import (
	"fmt"
	"time"
)

//export TestLib
func TestLib() {
	go func() {
		for i := 0; i < 10; i++ {
			fmt.Println("testlib.dll is here")
			time.Sleep(3 * time.Second)
		}
		fmt.Println("testlib.dll thread Done")
	}()
}

func main() {
}

I'm compiling the library on Linux this way:
env GOOS=windows GOARCH=amd64 CGO_ENABLED=1 CXX=x86_64-w64-mingw32-g++ CC=x86_64-w64-mingw32-gcc go build -o $(go env GOPATH)/lib/win64/testlib.dll -buildmode c-shared unity3d.com/testlib

Here is the C# class that uses this library:

using System;
using System.Runtime.InteropServices;
using UnityEngine;

public class TestLibrary : IDisposable {
	public TestLibrary()
	{
		ServerProcess();
	}

	private void ServerProcess()
	{
		TestLib();
	}
		
	public void Dispose()
	{
	}


	[DllImport ("testlib")]
	private extern static void TestLib ();

}

Unity is based on Mono, which is why this is C#.

There is a very dirty trick that fixes my problem and lets my game exit without waiting forever: I implemented a function in my Go library that panics, crashing the library, when I need to stop everything and exit the game. Nobody should really do that, but it works.
Additional code in the Go library:

//export Panic
func Panic() {
	var panicChan chan bool
	close(panicChan)
}

The worst of it is that I don't know what exactly crashes after calling Panic() from C#: Unity (Mono) or the Go library. I suspect Unity.
Unfortunately, crashing is the only way to exit a Unity game that uses a Go c-shared library with goroutines...

Another workaround is to compile the Go code as an executable and run it in a separate process, but this is not suitable for the iOS version of my game, because iOS does not allow launching separate processes (even ones included in the same application bundle) without using private APIs.
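
For reference, the `Panic` export above works because closing a nil channel is a run-time panic in Go, and an unrecovered panic terminates the whole process, host included. The sketch below shows the same operation with a `recover` added so that it can run standalone; `closeNil` is a name invented for this illustration.

```go
package main

import "fmt"

// closeNil closes a nil channel and returns the recovered panic message.
// (Hypothetical helper; the Panic export above has no recover, so there
// the panic takes the process down.)
func closeNil() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	var ch chan bool // nil: declared but never created with make
	close(ch)        // run-time panic: closing a nil channel
	return "no panic"
}

func main() {
	fmt.Println("recovered:", closeNil())
}
```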

@glycerine glycerine commented Apr 18, 2019

I'll attempt this. Background from https://groups.google.com/forum/#!topic/golang-nuts/L-tby34r5Gs

Ian wrote

Jason wrote> Where does the source for the runtime scheduler and garbage collector live these days?
The central locations are runtime/proc.go and runtime/mgc.go.
Jason wrote> Wherefore, I need to locate all runtime background threads and add in a means to shut them down upon request.
That's not currently supported, but it may be possible to modify the
scheduler to do it. There is no simple way.

Ian

@glycerine glycerine commented Apr 23, 2019

After some thinking, I realized that the main use case is really during development: to unload a Go DLL, and then re-load a modified version of that DLL; this being done during coding and evolving the DLL. We don't really want to stop the runtime, because we'll just then need to restart it upon the re-load of the newer version of the DLL.

So I propose the following, more general, approach to supporting dlclose with -buildmode=c-shared:

a) On Windows, when building Go from source, additionally build the runtime as a distinct DLL. Once the runtime DLL is loaded, it will never be unloaded, even if client Go DLLs that depend on the runtime are unloaded. This solves the tricky part of trying to halt the runtime cleanly, because we don't need to do that after all.

b) On Windows, build Go DLLs as libraries that are clients of the Go runtime DLL. Each client Go DLL will dynamically load the Go runtime DLL if it is not already loaded, taking care to prevent a race on load by some means when two or more Go DLLs are loaded during process start (I expect this to be a common race; perhaps the first to claim a pre-agreed-upon localhost port wins the race and gets to load the runtime DLL from some well-known location in GOROOT). Each Go DLL will increment the reference count on the runtime twice, so that Windows never unloads the runtime DLL, even if the client Go DLL is unloaded by a dlclose() call. A distinct new buildmode may be indicated, to distinguish it from c-shared, which currently bundles another copy of the runtime into every DLL and probably needs to continue to do so for backwards compatibility. Suggestion: buildmode=common-runtime-dll.

c) the extra benefit: now multiple DLLs when loaded all share the same runtime, and so the possibility of communicating via channels between DLLs becomes viable.

@ianlancetaylor @alexbrainman @minux and anyone else with wisdom to contribute: Feedback welcome.

@dchest dchest commented Apr 23, 2019

@glycerine that may be the main use case for some, but others are writing shared libraries in Go to plug into other programs (e.g. see @brutestack's case above, or from my personal experience and @mattn's description for this issue, a library that is used from other programming languages) and run them in production, where they don't control how the library is used. I think your proposal, while useful for some cases, doesn't solve the issue.

@glycerine glycerine commented Apr 23, 2019

@dchest: This issue is about dlclose() support. The plan is perfectly compatible with plugging into non Go host programs. If you don’t understand why, ask about the specific point where you are confused, so that we can clarify. Also divide and conquer wise, we can’t do everything at once, so I’m okay with no runtime shutdown just prior to program termination. Especially as a workaround is already posted above.

@zdjones zdjones commented Apr 30, 2019

@brutestack I don't have any experience with it, but have you looked into declaring a routine in your DLL specifying __attribute__((destructor))? It sounds like it may fit your use case.

dlclose man
gcc function attributes

@brutestack brutestack commented Apr 30, 2019

@zdjones, what should I do inside the destructor to stop all Go threads legally?
The Unity3D engine hangs before exit because it waits for all child threads (even those in libraries) to stop, but the threads serving goroutines never stop, even when all goroutines are done and even after dlclose.
@glycerine may be right, Unity3D should not wait and this is a bug.
But I'm sure Go must have some legal way to stop all thread pools and do other clean up things legally (without panic).

@glycerine glycerine commented Apr 30, 2019

But I'm sure Go must have some legal way to stop all thread pools and do other clean up things legally (without panic).

@brutestack There is not, presently. Panic is the closest we've got. The Go runtime is just not very used to being inside a library.

So I suspect that we'll have to add a method to indicate whether full shutdown is needed, or if the runtime should keep going, upon dlclose(). But one step at a time. We're currently trying to figure out how to get the runtime initialized after it has been dynamically loaded at runtime.

If manual halt of threads is required, then we actually have to figure out how to do that, which as you can probably guess from the above, will take some surgery. Linux C libraries use pthread_cancel, pthread_exit, or have the thread return from its starting routine. I'm not sure what the equivalent raw system calls are that don't use pthreads. Probably have to read the pthreads source. On Windows it is more obvious, as the OS provides a TerminateThread() API.

Note to self: we may well also need to add a mechanism to have the runtime not squash the host's signal handlers. I had to deal with this before when embedding a Go .so library inside R. R, as a C host, expects to receive SIGINT which the Go runtime, at least by default, overwrites. (c.f. https://github.com/glycerine/rmq/blob/master/src/cpp/interface.cpp#L41 through L80)

@ianlancetaylor ianlancetaylor commented Apr 30, 2019

On Unix systems you can kill a thread (not, of course, a goroutine) by sending it a SIGKILL signal, via whatever system call is invoked by pthread_kill (on GNU/Linux this is tgkill). Or you can kill it less aggressively by sending it some other signal.

The guidelines for signal handlers can be seen at https://golang.org/pkg/os/signal/#hdr-Non_Go_programs_that_call_Go_code . It should work already, and it's unlikely that adding a knob will help.

@glycerine glycerine commented Apr 30, 2019

Thanks Ian!

The guidelines for signal handlers can be seen at https://golang.org/pkg/os/signal/#hdr-Non_Go_programs_that_call_Go_code .

"Go code built with -buildmode=c-archive or -buildmode=c-shared will not install any other signal handlers by default."

Since this doesn't specifically mention -buildmode=shared, I assume that we will just need to make shared work like c-shared? I see a lot of places in runtime/proc.go where c-shared and c-archive are special-cased, but little to no mention of shared handling.

@ianlancetaylor ianlancetaylor commented Apr 30, 2019

For -shared I think you have to ensure that all the signal handlers are installed by a runtime that will never be closed.

@glycerine glycerine commented Apr 30, 2019

For -shared I think you have to ensure that all the signal handlers are installed by a runtime that will never be closed.

Ian, would you mind elaborating on this? Why would signal handlers prevent a -shared runtime from being shut down? Could the signal handlers not be uninstalled first?

@ianlancetaylor ianlancetaylor commented May 1, 2019

What I mean is: a -buildmode=shared runtime is intended to be run by a Go program. The main Go program should be handling the signal handlers. The -buildmode=shared shared library should not be handling the signal handlers. (Unless the -buildmode=shared library is itself the instance of the runtime package used by the main program, in which case shutting it down is tantamount to exiting the program.)

@glycerine glycerine commented May 1, 2019

a -buildmode=shared runtime is intended to be run by a Go program.

Ah. There's where I was thinking differently. We used -buildmode=shared to build separate go-runtime.dll and go-client.dll, so I was thinking that these would both be loaded by the host C program. I suppose that's not viable? Is it possible to have separate go-runtime.dll and goclient.dll that are buildmode=c-shared ?

Windows is truly bizarre. Apparently it is indeed possible to lock a DLL permanently in memory, even after the host program exits... The author of this blog uses my double-reference count suggestion to keep his DLL in memory... and then says there is even a stronger approach -- one immune to double FreeLibrary calls -- by pinning the DLL to thread 0 so it stays in memory while the current desktop is alive... just wow.

https://blogs.msmvps.com/vandooren/2006/10/09/preventing-a-dll-from-being-unloaded-by-the-app-that-uses-it/

"The right way that I described is indeed the right way on pre-XP systems. On XP or later you can use GetModuleHandleEx with the GET_MODULE_HANDLE_EX_FLAG_PIN flag to prevent unloading of the DLL.
The advantage is that the calling app cannot unload the DLL by calling FreeLibrary twice. The disadvantage is that you cannot unload the DLL anymore, even if you should want to."

@ianlancetaylor ianlancetaylor commented May 1, 2019

It sounds like you are thinking about using -buildmode=shared to build a shared library that is then shared by other Go packages built using -buildmode=c-shared -linkshared. That might work to some extent but I agree that at present signal handlers will not work well.

There is no way at present to separate the Go runtime and Go client into separate shared libraries built with -buildmode=c-shared. The intent with c-shared is to provide a self-contained library that a C program can use.

I guess I'm not sure what real advantage you get by splitting out the Go runtime into a separate shared library. It's theoretically interesting but I don't know who would want to do that in practice.

@glycerine glycerine commented May 1, 2019

I guess I'm not sure what real advantage you get by splitting out the Go runtime into a separate shared library. It's theoretically interesting but I don't know who would want to do that in practice.

The main advantage I was after in building the runtime as a separate library was that two or more Go DLLs (both clients of the go-runtime.dll) loaded into one process could communicate (say over channels), because they would share the same runtime.

@ianlancetaylor ianlancetaylor commented May 1, 2019

But since they are c-shared, they can only provide a C API. Returning a channel from a Go function exported by a c-shared library is not permitted by the pointer-passing rules, so the only way you could get a channel from one library to the other would be through shenanigans.

@glycerine glycerine commented May 1, 2019

I was thinking of shared libraries that would provide Go API to Go callers in the same address space.

The other big reason for thinking along these lines -- of having the Go runtime library separate from the user-written guest library -- is dlclose(); the idea being that if the guest/client Go DLLs can be closed separately, but the runtime persists, that might solve the crashes (presumably due to the runtime having threads that are still running when suddenly their code pages disappear). I think this would work as long as the guest library can shut down its own goroutines. For example, when DllMain() is called with DLL_PROCESS_DETACH, the guest/user library shuts down all its goroutines before returning. So long as the runtime still has its code pages mapped, its code would presumably not crash.

Addendum: and then if the guest reloads (expected often), the guest could just plug into the same runtime again, presumably starting up faster.

@ianlancetaylor ianlancetaylor commented May 1, 2019

If there is a single shared runtime library, how can we decide which goroutines are associated with which library? Say we send a function across a channel and start a goroutine that runs a loop calling that function. In general a single goroutine might cross between code in different libraries. Once we start permitting that, I don't see how we can ever release a library completely.
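
The scenario can be sketched in ordinary Go. Here both "libraries" are just functions in one program, an assumption made so that the example runs on its own; `startWorker`, `greetFromB`, and `run` are illustrative names. The goroutine is started by code standing in for library A, yet what it executes is code standing in for library B, so neither library clearly owns it.

```go
package main

import "fmt"

// startWorker stands in for "library A": it starts a goroutine that runs
// every function it receives, wherever that function's code lives.
func startWorker(funcs <-chan func() string, results chan<- string) {
	go func() {
		for f := range funcs {
			results <- f()
		}
	}()
}

// greetFromB stands in for code that would live in "library B".
func greetFromB() string { return "hello from library B" }

// run hands library B's function to library A's goroutine and returns
// what that goroutine produced.
func run() string {
	funcs := make(chan func() string)
	results := make(chan string)
	startWorker(funcs, results)
	funcs <- greetFromB
	close(funcs)
	return <-results
}

func main() {
	fmt.Println(run())
}
```

If library B were unmapped while the worker was inside `greetFromB`, the goroutine would be executing code that no longer exists, even though library A started it.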

@glycerine glycerine commented May 3, 2019

That's hard to argue with.

So at the moment the solution/workaround for "supporting" dlclose() seems to be to allow it but to not really close. On Windows, at any rate, we can do this by pinning the shared library into memory using one or more of the technique(s) cited above from
https://blogs.msmvps.com/vandooren/2006/10/09/preventing-a-dll-from-being-unloaded-by-the-app-that-uses-it/
and then, on actual full process shutdown, if need be to address Unity bugs, use the panic() approach, and hope we are last to unload since nobody else will get to unload after us (the process is gone after a panic).

Since this approach addresses my needs, I'm not going to attempt anything more elaborate.

@typeless typeless commented May 6, 2019

go install -buildmode=shared runtime
go build -buildmode=shared -linkshared PKG

I tested with go install -buildmode=shared runtime sync/atomic and go build -buildmode=shared -linkshared log, and both built successfully. What surprised me was the size of liblog.so:

-rw-rw-r-- 1 mura mura 8.7M May  6 10:45 liblog.so

@Dids Dids commented Jul 1, 2020

Any updates on this?

In my specific case, I was hoping to use the plugin system for a kind of "hot module replacement" system, with the idea being that you can edit any "module", which will automatically trigger a recompilation of the plugin, as well as "reloading" the plugin after it has successfully compiled. All without having to restart the main Go application.
I know I could achieve the same with various scripting languages (or even RPC), but I'd much rather do everything with Go.

So far the only workaround I've found is to version the plugins, even just incrementing them works.
This of course has the downside of previous plugin versions not being unloaded (as far as I know?), but considering these "modules" would be very constrained and small in my case, I'm wondering if this would still be acceptable?

@ianlancetaylor ianlancetaylor commented Jul 1, 2020

There are no updates. Any updates will appear on this issue.

@jtarchie jtarchie commented Aug 31, 2021

Is this a platform specific issue? I'm on Mac Intel and have not been able to reproduce this at all.

I've tried the original implementation (using go 1.17) and using it as a shared library in Ruby, LuaJIT, and Python.
It works fine, no Segfault.

package main

import (
    "C"
    "fmt"
)

var (
    c chan string
)

func init() {
    c = make(chan string)
    go func() {
        n := 1
        for {
            switch {
            case n%15 == 0:
                c <- "FizzBuzz"
            case n%3 == 0:
                c <- "Fizz"
            case n%5 == 0:
                c <- "Buzz"
            default:
                c <- fmt.Sprint(n)
            }
            n++
        }
    }()
}

//export fizzbuzz
func fizzbuzz() *C.char {
    return C.CString(<-c)
}

func main() {}

Compile into shared Library.

go build -buildmode=c-shared -o libfizzbuzz.so libfizzbuzz.go

Then run in Ruby 3.0.0

require 'fiddle'

libfb = Fiddle.dlopen("libfizzbuzz.so")
fb    = Fiddle::Function.new(
  libfb['fizzbuzz'],
  [],
  Fiddle::TYPE_VOIDP
)

puts fb.call
puts fb.call
puts fb.call
puts fb.call
puts fb.call

libfb.close

@nightlark nightlark commented Aug 31, 2021

Is this a platform specific issue? I'm on Mac Intel and have not been able to reproduce this at all.

I tried it on WSL2, and it didn't segfault -- though it doesn't seem to be closing properly.

Here's a simple C shared library I compiled to compare against the behavior of the Go fizzbuzz program:

int c = 0;
int notfizzbuzz() {
        c = c + 1;
        return c;
}

With this C shared library, I'm able to dlclose the library in Python3 and then reopen it, and the numbers start counting from 0 again. If I dlclose the library and then try to call the function imported from the library, I get a segfault (as expected).

With the Go shared library, after calling dlclose if I open the library again, the numbers returned from fizzbuzz don't restart, they just keep counting from where it left off previously. In addition to that, if I dlclose the shared library then I am still able to call lib.fizzbuzz() and it just keeps counting as if the library had not been closed.

@jtarchie jtarchie commented Sep 1, 2021

This might be Windows specific.

@nightlark nightlark commented Sep 1, 2021

@jtarchie it doesn't appear to be Windows specific, I tried the same test on Linux -- on both platforms, calling dlclose is not actually closing the handle to the shared library written in Go, but does close the handle for the library written in C.

I don't use Ruby, but I think the equivalent for Ruby of what I tried in Python would be:

require 'fiddle'

libfb = Fiddle.dlopen("libfizzbuzz.so")
fb    = Fiddle::Function.new(
  libfb['fizzbuzz'],
  [],
  Fiddle::TYPE_VOIDP
)

puts fb.call
puts fb.call
puts fb.call
puts fb.call
puts fb.call

libfb.close

puts fb.call

What I did in Python was:

from ctypes import *
import _ctypes
lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print(lib.fizzbuzz())
print(lib.fizzbuzz())
print(lib.fizzbuzz())
print(lib.fizzbuzz())
_ctypes.dlclose(lib._handle)

print(lib.fizzbuzz())

lib = CDLL("./libfizzbuzz.so")
lib.fizzbuzz.restype = c_char_p
print(lib.fizzbuzz())

And the result was:

1
2
Fizz
4
Buzz
Fizz

The C version segfaults as expected after calling dlclose.

@tmm1 tmm1 commented Sep 30, 2021

Looks like dlclose is currently ignored:

// Pass -z nodelete to mark the shared library as
// non-closeable: a dlclose will do nothing.
argv = append(argv, "-Wl,-z,nodelete")

It sounds like from this thread, an approximate solution would be to:

  • revert above lines
  • stop all user goroutines
  • invoke panic from golang code and catch/ignore from C
  • send SIGKILL to GC thread
  • call dlclose()

Does that sound right?

@ianlancetaylor ianlancetaylor commented Sep 30, 2021

@tmm1 Not really. Consider #11100 (comment). Consider what should happen for a goroutine that is currently sitting in C code; if we remove the Go shared library then when the C code returns it will crash or (if some other shared library has been opened) behave unpredictably. Note that there is no separate GC thread; in the current Go runtime GC is handled by ordinary goroutines.

@tmm1 tmm1 commented Oct 1, 2021

I understand there are a lot of complexities and edge cases.

In my case I am working in a plugin environment where I have a lot of control and can ensure that no goroutines will be busy calling into C code or other shared libraries. I have the ability to make sure all my code is done before calling dlclose, so I'm wondering what's required/possible in that more limited scenario.

I suppose parts of the runtime internally could be calling into libc even if my user code is not. What I observe currently is that many of the OS threads related to the runtime start busy looping after I dlclose. I'll try to make a list of them and kill them to see if that helps.

@tmm1 tmm1 commented Oct 1, 2021

After looking more into this today, it seems there is no way to kill a single thread. Even if you use pthread_kill (i.e. tgkill), the signal is delivered to that particular thread, but the process as a whole is affected if it's a stop/terminate/kill signal.

What I observe currently is that many of the OS threads related to the runtime start busy looping after I dlclose.

I observed this behavior on macOS, where ps -M showed several threads at 100% cpu after dlclose. Using sample on the pid shows unknown backtraces, and lsof -nPp confirmed that the golang dylib was no longer mapped causing those runtime threads to be very confused.

It turns out -Wl,-z,nodelete does not work on macOS.

I found instead that using dlopen with RTLD_NODELETE has the same effect, and fixed the cpu usage issues I was having.

On Windows, a similar GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_PIN, ...) is required for the same effect.

My solution for now is to simply allow old copies of the runtime and my code to stay resident in memory. When my plugin updates, I load a new dylib/so with a fresh copy of the runtime and my code. These copies of the golang runtime seem to be happy residing side by side. The old runtimes are basically doing nothing anyway.

Note for anyone else attempting something similar: on macOS, be sure to pass RTLD_LOCAL (which is the default on Linux but not macOS). Without this flag, the handle returned by dlopen will be stale after renaming a dylib and moving a new one in its place. Similarly on Windows, LoadLibraryEx will return an old handle for a given full path to a dll, even if the dll has changed on disk. You must use a unique filename per version of the dll to be able to load multiple copies at the same time.
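
One way to get a unique filename per version is to derive it from the library's contents. The sketch below is an assumption about how that could look, not what the comment's author does: `versionedName` is a hypothetical helper that appends a short SHA-256 prefix to the base name, so each rebuilt library gets its own path before it is handed to dlopen or LoadLibraryEx.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
	"strings"
)

// versionedName returns path with a short content hash inserted before the
// extension, so different builds of the same library get different names.
// (Hypothetical helper for illustration.)
func versionedName(path string, contents []byte) string {
	sum := sha256.Sum256(contents)
	ext := filepath.Ext(path)
	base := strings.TrimSuffix(filepath.Base(path), ext)
	name := fmt.Sprintf("%s-%s%s", base, hex.EncodeToString(sum[:6]), ext)
	return filepath.Join(filepath.Dir(path), name)
}

func main() {
	// The byte slices stand in for the contents of two successive builds.
	fmt.Println(versionedName("plugins/libfizzbuzz.so", []byte("build 1")))
	fmt.Println(versionedName("plugins/libfizzbuzz.so", []byte("build 2")))
}
```

A loader would copy the freshly built library to the returned path and open that copy, leaving earlier versions resident as described above.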
