runtime: panic when executing from multiple c-shared libraries #65050

Closed
cavokz opened this issue Jan 10, 2024 · 8 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
@cavokz

cavokz commented Jan 10, 2024

Go version

go version go1.21.5 darwin/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/Users/cavok/Library/Caches/go-build'
GOENV='/Users/cavok/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/cavok/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/cavok/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/Cellar/go/1.21.6/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/Cellar/go/1.21.6/libexec/pkg/tool/darwin_amd64'
GOVCS=''
GOVERSION='go1.21.6'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/9y/hlpdgn0s10s5c3_k60jc0d5c0000gn/T/go-build4292255947=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

I encountered this issue on macOS while importing two Python extensions written in Go using Pygolo. I could reduce the problem to the following repro, completely removing Python from the picture.

This is a minimal shared library exporting a dummy function:

package main

import "C"

func main() {
}

//export fun
func fun() {
}

This test C program loads two libraries built from the above source code and invokes the exported fun function of each.

#include <assert.h>
#include <stdio.h>
#include <dlfcn.h>

typedef void (*fun)(void);

int main(int argc, char* argv[])
{
	void *lib1 = dlopen("./lib1.so", RTLD_NOW);
	if (!lib1) {
		printf("%s\n", dlerror());
	}
	assert(lib1);

	void *lib2 = dlopen("./lib2.so", RTLD_NOW);
	if (!lib2) {
		printf("%s\n", dlerror());
	}
	assert(lib2);

	fun fun1 = dlsym(lib1, "fun");
	assert(fun1);
	fun1();

	fun fun2 = dlsym(lib2, "fun");
	assert(fun2);
	fun2();

	dlclose(lib1);
	dlclose(lib2);
	return 0;
}

This Makefile builds the two libraries and the test program, then runs the test repeatedly. Within a handful of runs the runtime crashes.

GO ?= go
LIBS := lib1.so lib2.so

ITERATIONS ?= 1000

all: $(LIBS) test
	for n in `seq $(ITERATIONS)`; do ./test || exit 1; printf .; done; echo ok

test: export LDFLAGS := -ldl

%.so: lib.go FORCE
	$(GO) build -buildmode=c-shared -o $@ $<

clean:
	rm -rf test $(LIBS) $(LIBS:.so=.h)

FORCE:
.PHONY: FORCE

What did you see happen?

After a few executions of the test program, the Go runtime panics. For example:

fatal error: bad sweepgen in refill

goroutine 17 [running, locked to thread]:
runtime.throw({0x1505830c6?, 0x0?})
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/panic.go:1077 +0x5c fp=0xc00006ec10 sp=0xc00006ebe0 pc=0x15055341c
runtime.(*mcache).refill(0x1096e6a68, 0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mcache.go:157 +0x20d fp=0xc00006ec50 sp=0xc00006ec10 pc=0x150537b6d
runtime.(*mcache).nextFree(0x1096e6a68, 0x10)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/malloc.go:929 +0x85 fp=0xc00006ec98 sp=0xc00006ec50 pc=0x150531645
runtime.mallocgc(0x58, 0x1505a6ec0, 0x1)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/malloc.go:1116 +0x448 fp=0xc00006ed00 sp=0xc00006ec98 pc=0x150531c08
runtime.newobject(0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/malloc.go:1328 +0x25 fp=0xc00006ed28 sp=0xc00006ed00 pc=0x150532145
runtime.acquireSudog()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:437 +0x229 fp=0xc00006ed90 sp=0xc00006ed28 pc=0x150556489
runtime.chanrecv(0x1c0000a0000, 0x0, 0x1)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/chan.go:563 +0x225 fp=0xc00006ee08 sp=0xc00006ed90 pc=0x15052bc25
runtime.chanrecv1(0x10984e950?, 0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/chan.go:442 +0x12 fp=0xc00006ee30 sp=0xc00006ee08 pc=0x15052b9f2
runtime.cgocallbackg1(0x150580620, 0xc00006efe0?, 0x0)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/cgocall.go:306 +0x214 fp=0xc00006ef00 sp=0xc00006ee30 pc=0x15052a7f4
runtime.cgocallbackg(0x0?, 0x0?, 0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/cgocall.go:245 +0x109 fp=0xc00006ef90 sp=0xc00006ef00 pc=0x15052a549
runtime.cgocallbackg(0x150580620, 0x7ff7b685ff50, 0x0)
	<autogenerated>:1 +0x29 fp=0xc00006efb8 sp=0xc00006ef90 pc=0x15057e4e9
runtime.cgocallback(0x0, 0x0, 0x0)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/asm_amd64.s:1035 +0xcc fp=0xc00006efe0 sp=0xc00006efb8 pc=0x15057bfcc
runtime: g 17: unexpected return pc for runtime.cgocallback called from 0x1098211e1
stack: frame={sp:0xc00006efb8, fp:0xc00006efe0} stack=[0xc00006e000,0xc00006f000)
0x000000c00006eeb8:  0x000000015057a46e <runtime.exitsyscall+0x000000000000012e>  0x000000c000006680
0x000000c00006eec8:  0x0000000200000003  0x000000c000006680
0x000000c00006eed8:  0x0000000000000000  0x0000000000000000
0x000000c00006eee8:  0x00000001505a99e8  0x000000c00006ef80
0x000000c00006eef8:  0x000000015052a549 <runtime.cgocallbackg+0x0000000000000109>  0x0000000150580620 <_cgoexp_47f08e3a3bbd_fun+0x0000000000000000>
0x000000c00006ef08:  0x000000c00006efe0  0x0000000000000000
0x000000c00006ef18:  0x00000001098211e1  0x0000000000000000
0x000000c00006ef28:  0x0000000000000000  0x0000000000000000
0x000000c00006ef38:  0x0000000000000000  0x0000000000000000
0x000000c00006ef48:  0x0000000000000000  0x0000000000000000
0x000000c00006ef58:  0x000000c00006efe0  0x000000c000006680
0x000000c00006ef68:  0x000000c000060000  0x0000000150580620 <_cgoexp_47f08e3a3bbd_fun+0x0000000000000000>
0x000000c00006ef78:  0x00007ff7b685ff50  0x000000c00006efa8
0x000000c00006ef88:  0x000000015057e4e9 <runtime.cgocallbackg+0x0000000000000029>  0x0000000000000000
0x000000c00006ef98:  0x0000000000000000  0x0000000000000000
0x000000c00006efa8:  0x00007ff7b685fef0  0x000000015057bfcc <runtime.cgocallback+0x00000000000000cc>
0x000000c00006efb8: <0x0000000150580620 <_cgoexp_47f08e3a3bbd_fun+0x0000000000000000>  0x00007ff7b685ff50
0x000000c00006efc8:  0x0000000000000000  0x0000000000000000
0x000000c00006efd8:  0x00000001098211e1 >0x0000000000000000
0x000000c00006efe8:  0x0000000000000000  0x0000000000000000
0x000000c00006eff8:  0x0000000000000000

goroutine 1 [runnable, locked to thread]:
runtime.gcTrigger.test({0x0?, 0x0?, 0x0?})
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgc.go:569 +0xdc fp=0x1c00006ac70 sp=0x1c00006ac68 pc=0x15053947c
runtime.mallocgc(0x38, 0x15059f6e0, 0x1)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/malloc.go:1245 +0x75d fp=0x1c00006acd8 sp=0x1c00006ac70 pc=0x150531f1d
runtime.newobject(0x1c00005a590?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/malloc.go:1328 +0x25 fp=0x1c00006ad00 sp=0x1c00006acd8 pc=0x150532145
syscall.nametomib({0x150582afe, 0x14})
	/usr/local/Cellar/go/1.21.6/libexec/src/syscall/syscall_darwin.go:50 +0x28 fp=0x1c00006ad60 sp=0x1c00006ad00 pc=0x15057fbe8
syscall.SysctlUint32({0x150582afe?, 0x1c00005a5f0?})
	/usr/local/Cellar/go/1.21.6/libexec/src/syscall/syscall_bsd.go:465 +0x1c fp=0x1c00006adb8 sp=0x1c00006ad60 pc=0x15057fb1c
syscall.adjustFileLimit(0x1c00006adf0)
	/usr/local/Cellar/go/1.21.6/libexec/src/syscall/rlimit_darwin.go:13 +0x25 fp=0x1c00006add8 sp=0x1c00006adb8 pc=0x15057fa05
syscall.init.0()
	/usr/local/Cellar/go/1.21.6/libexec/src/syscall/rlimit.go:37 +0x73 fp=0x1c00006ae10 sp=0x1c00006add8 pc=0x15057f9b3
runtime.doInit1(0x1505f8ee0)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:6740 +0xd8 fp=0x1c00006af40 sp=0x1c00006ae10 pc=0x150562a38
runtime.doInit(...)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:6707
runtime.main()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:249 +0x374 fp=0x1c00006afe0 sp=0x1c00006af40 pc=0x150555e54
runtime.goexit()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/asm_amd64.s:1650 +0x1 fp=0x1c00006afe8 sp=0x1c00006afe0 pc=0x15057c1e1

goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:398 +0xce fp=0x1c00005afa8 sp=0x1c00005af88 pc=0x1505561ee
runtime.goparkunlock(...)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:404
runtime.forcegchelper()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:322 +0xb3 fp=0x1c00005afe0 sp=0x1c00005afa8 pc=0x150556073
runtime.goexit()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/asm_amd64.s:1650 +0x1 fp=0x1c00005afe8 sp=0x1c00005afe0 pc=0x15057c1e1
created by runtime.init.6 in goroutine 1
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:310 +0x1a

goroutine 3 [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:398 +0xce fp=0x1c00005b778 sp=0x1c00005b758 pc=0x1505561ee
runtime.goparkunlock(...)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:404
runtime.bgsweep(0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgcsweep.go:280 +0x94 fp=0x1c00005b7c8 sp=0x1c00005b778 pc=0x150543f34
runtime.gcenable.func1()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgc.go:200 +0x25 fp=0x1c00005b7e0 sp=0x1c00005b7c8 pc=0x1505392c5
runtime.goexit()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/asm_amd64.s:1650 +0x1 fp=0x1c00005b7e8 sp=0x1c00005b7e0 pc=0x15057c1e1
created by runtime.gcenable in goroutine 1
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgc.go:200 +0x66

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x1c00007c000?, 0x1505988a8?, 0x1?, 0x0?, 0x1c000007520?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:398 +0xce fp=0x1c00005bf70 sp=0x1c00005bf50 pc=0x1505561ee
runtime.goparkunlock(...)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/proc.go:404
runtime.(*scavengerState).park(0x1505fa360)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgcscavenge.go:425 +0x49 fp=0x1c00005bfa0 sp=0x1c00005bf70 pc=0x1505417e9
runtime.bgscavenge(0x0?)
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgcscavenge.go:653 +0x3c fp=0x1c00005bfc8 sp=0x1c00005bfa0 pc=0x150541d7c
runtime.gcenable.func2()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgc.go:201 +0x25 fp=0x1c00005bfe0 sp=0x1c00005bfc8 pc=0x150539265
runtime.goexit()
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/asm_amd64.s:1650 +0x1 fp=0x1c00005bfe8 sp=0x1c00005bfe0 pc=0x15057c1e1
created by runtime.gcenable in goroutine 1
	/usr/local/Cellar/go/1.21.6/libexec/src/runtime/mgc.go:201 +0xa5

What did you expect to see?

I expect to see no panics. Given the repro, I expect a simple . for each execution of the test program.

Various Go versions

I used goenv to try different Go versions on the same system (macOS 14.2.1), starting from 1.10.8.

1.10.8: fails to build, no messages

1.11.13, 1.12.17, 1.13.15, 1.14.15, 1.15.15, 1.16.15: build of the shared library fails with combining dwarf failed: Unknown load command 0x80000034 (2147483700)

1.16.15, 1.17.13, 1.18.10, 1.19.13, and 1.20.12: tested 1000 iterations, no panics

1.21.0, 1.21.1, 1.21.2, 1.21.3, 1.21.4, 1.21.5, 1.21.6: fail after a few iterations with fatal error: bad sweepgen in refill as above.

1.22-8db131082d: same as 1.21.x.

On Debian unstable, 1.21.5 works without issues. Did not try all the versions above but the test seems to pass on various distributions like Ubuntu, Debian, Alpine, RHEL, SLE. See https://gitlab.com/pygolo/py/-/blob/main/docs/TEST-MATRIX.md.

I will complete the map of success/failures at go-multi-c-shared.

@dmitshur changed the title from "Panic when executing from multiple c-shared libraries" to "runtime: panic when executing from multiple c-shared libraries; used to work in Go 1.16—1.20" on Jan 10, 2024
@dmitshur added this to the Backlog milestone on Jan 10, 2024
@dmitshur added the NeedsInvestigation and compiler/runtime labels on Jan 10, 2024
@dmitshur
Contributor

CC @golang/runtime.

@cherrymui
Member

Using multiple c-shared libraries in the same process has never really been supported. Currently a c-shared library assumes it is the only copy of the Go runtime in the process. Unlike plugins, it doesn't check whether another Go runtime is loaded in the process and integrate with it. Having multiple c-shared libraries in the same process might work in some simple cases, where each shared library mostly works in isolation. But if the program passes pointers around, weird things can happen.

You could try building them into a single c-shared library, or using plugins. Thanks.
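For reference, a minimal sketch of what the plugin route looks like from a Go host program, using the standard `plugin` package. The path `./ext1.so` and the symbol name `Fun` are hypothetical placeholders, and the helper `loadFun` is introduced here only for illustration. The plugin itself would be built with `go build -buildmode=plugin`.

```go
package main

import (
	"fmt"
	"plugin"
)

// loadFun opens a Go plugin and returns the exported func() named sym.
// It is an illustrative helper, not part of any existing API.
func loadFun(path, sym string) (func(), error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open %s: %w", path, err)
	}
	v, err := p.Lookup(sym)
	if err != nil {
		return nil, fmt.Errorf("lookup %s: %w", sym, err)
	}
	// For an exported function, Lookup returns the function value itself.
	f, ok := v.(func())
	if !ok {
		return nil, fmt.Errorf("%s has type %T, want func()", sym, v)
	}
	return f, nil
}

func main() {
	// Hypothetical plugin path and symbol name.
	f, err := loadFun("./ext1.so", "Fun")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	f()
}
```

Note that the host here is a Go program; the `plugin` package is not reachable from a host written in C.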

@cherrymui closed this as not planned on Jan 10, 2024
@cavokz
Author

cavokz commented Jan 10, 2024

Speaking with our use case in mind: writing Python extensions in Go.

The c-shared build mode is the basis of Python extension loading; we cannot build all the Go extensions into a single library. It's not even possible to know in advance which extensions will be loaded.

The plugin build mode is available only to Go applications, so it is not usable by the Python interpreter, which is written in C.

At this point I'd like to understand what plugins do that cannot be done by c-shared, what kind of 'passing pointers around' gets the runtime in trouble. Could you please elaborate?

Substantially, what can be done to support multiple c-shared libraries? What's the problem at the root that cannot be solved?

If this has been already discussed, please give me a pointer.

@cherrymui
Member

The underlying problem is that currently each Go runtime assumes it has complete information about all Go code in the process, and that no other runtime or Go code exists in the same process. Specifically, say there are two c-shared libraries A and B. A doesn't know B exists, including all functions and types in B (that are not in A). So if the program has a call stack that has functions from both A and B on it (say, by passing func values), A's runtime doesn't know how to unwind B's frames and the garbage collector doesn't know how to scan B's frames. Similarly, A's runtime doesn't know how to handle a type in B (e.g. for the garbage collector to scan it, if, say, an object from A points to an object from B). Also, type identity may not be handled correctly.

For plugins, at load time it takes extra steps to find existing Go runtime(s) and attach various metadata (tables) to the existing ones. For the c-shared build mode, it currently doesn't do that. I think there is no fundamental reason that this couldn't be done. But it needs time to make it work.
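The difference can be pictured with a toy model. Everything in this sketch (`module`, `toyRuntime`, `attach`, `knows`) is invented for illustration and does not mirror the real runtime data structures; it only shows the idea of registering a newcomer's metadata with a runtime that is already running, versus keeping two isolated runtimes.

```go
package main

import "fmt"

// module is a toy stand-in for the per-binary metadata (function and
// type tables) a Go runtime consults. It is not the real structure.
type module struct {
	name  string
	funcs map[string]bool
}

// toyRuntime models one copy of the Go runtime and the modules it knows.
type toyRuntime struct {
	modules []module
}

// attach models registering new code's metadata with a running runtime,
// which is what plugin loading is described as doing.
func (r *toyRuntime) attach(m module) {
	r.modules = append(r.modules, m)
}

// knows reports whether any registered module describes fn.
func (r *toyRuntime) knows(fn string) bool {
	for _, m := range r.modules {
		if m.funcs[fn] {
			return true
		}
	}
	return false
}

func main() {
	libA := module{name: "A", funcs: map[string]bool{"A.fun": true}}
	libB := module{name: "B", funcs: map[string]bool{"B.call": true}}

	// c-shared today: each library brings its own isolated runtime.
	rtA := &toyRuntime{modules: []module{libA}}
	rtB := &toyRuntime{modules: []module{libB}}
	fmt.Println(rtA.knows("B.call"), rtB.knows("A.fun")) // false false

	// Plugin-style loading: B's metadata is attached to A's runtime.
	rtA.attach(libB)
	fmt.Println(rtA.knows("B.call")) // true
}
```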

Another possibility is probably to make it possible to load plugins from a c-shared library, which may be less work. Then you can build one c-shared object that just does the plugin loading, and build the other extensions as plugins.

@cherrymui
Member

Also, currently, as the c-shared build mode is designed and implemented with the assumption that it is the only copy of Go, there is no care taken for unifying/deduplicating symbols from multiple copies. So if multiple c-shared libraries are loaded, currently whether a symbol of a function or global variable with the same name is deduplicated depends on the specific behavior of the system's dynamic linker. It may behave differently from one platform to another (e.g. on Linux and on macOS). Plugins are designed carefully with that in mind, so global variables are properly deduplicated.

@cavokz
Author

cavokz commented Jan 10, 2024

The underlying problem is that currently each Go runtime assumes it has complete information about all Go code in the process, and that no other runtime or Go code exists in the same process. Specifically, say there are two c-shared libraries A and B. A doesn't know B exists, including all functions and types in B (that are not in A). So if the program has a call stack that has functions from both A and B on it (say, by passing func values), A's runtime doesn't know how to unwind B's frames and the garbage collector doesn't know how to scan B's frames. Similarly, A's runtime doesn't know how to handle a type in B (e.g. for the garbage collector to scan it, if, say, an object from A points to an object from B). Also, type identity may not be handled correctly.

This clearly points in the direction of keeping only a single runtime around. It also sounds reasonable for efficiency and resources consumption.

For plugins at load time it takes extra steps to find existing Go runtime(s) and attach various metadata (tables) to the existing ones. For c-shared build mode, it currently doesn't do that. I think there is no fundamental reason that this couldn't be done. But it needs time to make it work.

I like this approach. Would you supervise/support me if I try it? I know nothing of Go internals and surely I don't have enough time, but I already have some entries in the hall of shame so I've nothing to lose.

Another possibility is probably to make it possible to load plugins from a c-shared library, which may be less work. Then you can build one c-shared object that just does the plugin loading, and build the other extensions as plugins.

This is interesting but does not seem attractive, at least for my use case.

@cavokz
Author

cavokz commented Jan 10, 2024

Also, currently, as the c-shared build mode is designed and implemented with the assumption that it is the only copy of Go, there is no care taken for unifying/deduplicating symbols from multiple copies. So if multiple c-shared libraries are loaded, currently whether a symbol of a function or global variable with the same name is deduplicated depends on the specific behavior of the system's dynamic linker. It may behave differently from one platform to another (e.g. on Linux and on macOS). Plugins are designed carefully with that in mind, so global variables are properly deduplicated.

This is quite cryptic; I think I already read it in another similar issue.

Isn't it a generic problem of every shared library? A clash with some other library's global symbol may well happen, and there is no alternative to leaving this in the hands of the library developer.

Are you maybe referring to global symbols of the runtime? Could you clarify?

In general it seems that plugins already solved most of the problems, I would try to adapt/reuse/generalize the solutions also to the c-shared case.

@cavokz
Author

cavokz commented Jan 12, 2024

I modified the repro, and indeed it also gets into trouble on Go 1.20.12.

Here fun prints something and then panics, while call invokes the C function pointer passed as its argument:

package main

// inline void call2(void *p)
// {
//     void (*f)(void) = p;
//     f();
// }
import "C"
import (
	"fmt"
	"unsafe"
)

func main() {
}

//export fun
func fun() {
	fmt.Println("fun!")
	panic("fun!")
}

//export call
func call(p unsafe.Pointer) {
	fmt.Printf("calling %p\n", p)
	C.call2(p)
}

The test now gets fun from lib1.so and passes it to call of lib2.so:

#include <assert.h>
#include <stdio.h>
#include <dlfcn.h>

typedef void (*fun)(void);
typedef void (*call)(void*);

int main(int argc, char* argv[])
{
	void *lib1 = dlopen("./lib1.so", RTLD_NOW);
	assert(lib1);

	void *lib2 = dlopen("./lib2.so", RTLD_NOW);
	assert(lib2);

	fun fun = dlsym(lib1, "fun");
	assert(fun);

	call call = dlsym(lib2, "call");
	assert(call);

	printf("fun: %p\n", fun);
	call(fun);

	dlclose(lib1);
	dlclose(lib2);
	return 0;
}

The result is this panic:

fun: 0x104b5ca60
calling 0x104b5ca60
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x104ade7bd]

goroutine 17 [running, locked to thread]:
runtime.throw({0x12c16bd7a?, 0x0?})
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/panic.go:1047 +0x5d fp=0x1c00006e810 sp=0x1c00006e7e0 pc=0x12c104add
runtime: g 17: unexpected return pc for runtime.sigpanic called from 0x104ade7bd
stack: frame={sp:0x1c00006e810, fp:0x1c00006e870} stack=[0x1c00006e000,0x1c00006f000)
0x000001c00006e710:  0x0000000000000000  0x0000000000000000
0x000001c00006e720:  0x0000000000000000  0x0000000000000000
0x000001c00006e730:  0x0000000000000000  0x0000000000000000
0x000001c00006e740:  0x0000000000000000  0x0000000000000000
0x000001c00006e750:  0x0000000000000000  0x0000000000000000
0x000001c00006e760:  0x0000000000000000  0x0000000000000000
0x000001c00006e770:  0x0000000000000000  0x0000000000000000
0x000001c00006e780:  0x0000000000000000  0x0000000000000000
0x000001c00006e790:  0x000000012c13080e <runtime.systemstack+0x000000000000002e>  0x000000012c104e2c <runtime.fatalthrow+0x000000000000006c>
0x000001c00006e7a0:  0x000001c00006e7b0  0x000001c000006680
0x000001c00006e7b0:  0x000000012c104e60 <runtime.fatalthrow.func1+0x0000000000000000>  0x000001c000006680
0x000001c00006e7c0:  0x000000012c104add <runtime.throw+0x000000000000005d>  0x000001c00006e7e0
0x000001c00006e7d0:  0x000001c00006e800  0x000000012c104add <runtime.throw+0x000000000000005d>
0x000001c00006e7e0:  0x000001c00006e7e8  0x000000012c104b00 <runtime.throw.func1+0x0000000000000000>
0x000001c00006e7f0:  0x000000012c16bd7a  0x000000000000002a
0x000001c00006e800:  0x000001c00006e860  0x000000012c1195e9 <runtime.sigpanic+0x00000000000003e9>
0x000001c00006e810: <0x000000012c16bd7a  0x0000000000000000
0x000001c00006e820:  0x0000000000000000  0x0000000000000000
0x000001c00006e830:  0x0000000000000000  0x0000000000000000
0x000001c00006e840:  0x000001c000006680  0x0000000000000000
0x000001c00006e850:  0x0000000000000000  0x0000000000000000
0x000001c00006e860:  0x000001c00006e880 !0x0000000104ade7bd
0x000001c00006e870: >0x0000000000000000  0x0000000000000000
0x000001c00006e880:  0x000001c00006e938  0x0000000104adefff
0x000001c00006e890:  0x0000000000000000  0x0000000000000000
0x000001c00006e8a0:  0x0000000000000000  0x0000000000000000
0x000001c00006e8b0:  0x0000000000000000  0x0000000000000000
0x000001c00006e8c0:  0x0000000000000040  0x0000000000000034
0x000001c00006e8d0:  0x0000000000000003  0x0000000000000000
0x000001c00006e8e0:  0x000e000e000e000e  0x0000000000000000
0x000001c00006e8f0:  0x0000000000000000  0x0000000000000000
0x000001c00006e900:  0x000001c00006e968  0x0000000104ad83ca
0x000001c00006e910:  0x0000000000000000  0x0000000000000000
0x000001c00006e920:  0x0000000000000000  0x0000000000000000
0x000001c00006e930:  0x0000000000000000  0x000001c00006e9a0
0x000001c00006e940:  0x0000000104ad8387  0x000001c0001ae800
0x000001c00006e950:  0x0000000000000800  0x0000000000000800
0x000001c00006e960:  0x0000000104b8bd80  0x000000c00006e9c8
runtime.sigpanic()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/signal_unix.go:825 +0x3e9 fp=0x1c00006e870 sp=0x1c00006e810 pc=0x12c1195e9

goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:381 +0xd6 fp=0x1c00005cfb0 sp=0x1c00005cf90 pc=0x12c1077b6
runtime.goparkunlock(...)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:387
runtime.forcegchelper()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:305 +0xb0 fp=0x1c00005cfe0 sp=0x1c00005cfb0 pc=0x12c1075f0
runtime.goexit()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/asm_amd64.s:1598 +0x1 fp=0x1c00005cfe8 sp=0x1c00005cfe0 pc=0x12c132a01
created by runtime.init.6
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:293 +0x25

goroutine 3 [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:381 +0xd6 fp=0x1c00005d780 sp=0x1c00005d760 pc=0x12c1077b6
runtime.goparkunlock(...)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:387
runtime.bgsweep(0x0?)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mgcsweep.go:278 +0x8e fp=0x1c00005d7c8 sp=0x1c00005d780 pc=0x12c0f4b6e
runtime.gcenable.func1()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mgc.go:178 +0x26 fp=0x1c00005d7e0 sp=0x1c00005d7c8 pc=0x12c0e9fe6
runtime.goexit()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/asm_amd64.s:1598 +0x1 fp=0x1c00005d7e8 sp=0x1c00005d7e0 pc=0x12c132a01
created by runtime.gcenable
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mgc.go:178 +0x6b

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x1c00007c000?, 0x12c183238?, 0x1?, 0x0?, 0x0?)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:381 +0xd6 fp=0x1c00005df70 sp=0x1c00005df50 pc=0x12c1077b6
runtime.goparkunlock(...)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:387
runtime.(*scavengerState).park(0x12c21d4c0)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mgcscavenge.go:400 +0x53 fp=0x1c00005dfa0 sp=0x1c00005df70 pc=0x12c0f2a53
runtime.bgscavenge(0x0?)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mgcscavenge.go:628 +0x45 fp=0x1c00005dfc8 sp=0x1c00005dfa0 pc=0x12c0f3025
runtime.gcenable.func2()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mgc.go:179 +0x26 fp=0x1c00005dfe0 sp=0x1c00005dfc8 pc=0x12c0e9f86
runtime.goexit()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/asm_amd64.s:1598 +0x1 fp=0x1c00005dfe8 sp=0x1c00005dfe0 pc=0x12c132a01
created by runtime.gcenable
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mgc.go:179 +0xaa

goroutine 19 [finalizer wait]:
runtime.gopark(0x1a0?, 0x12c21d900?, 0xa0?, 0x61?, 0x1c00005c770?)
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/proc.go:381 +0xd6 fp=0x1c00005c628 sp=0x1c00005c608 pc=0x12c1077b6
runtime.runfinq()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mfinal.go:193 +0x107 fp=0x1c00005c7e0 sp=0x1c00005c628 pc=0x12c0e9027
runtime.goexit()
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/asm_amd64.s:1598 +0x1 fp=0x1c00005c7e8 sp=0x1c00005c7e0 pc=0x12c132a01
created by runtime.createfing
	/Users/cavok/.goenv/versions/1.20.12/src/runtime/mfinal.go:163 +0x45

If test is modified to take both call and fun from lib1.so (or lib2.so), the output is much nicer:

fun: 0x10aa86a60
calling 0x10aa86a60
fun!
panic: fun!

goroutine 17 [running, locked to thread]:
main.fun(...)
	/Users/cavok/devel/go-multi-c-shared.git/lib.go:20
main._Cfunc_call2(0x10aa86a60)
	_cgo_gotypes.go:39 +0x45
main.call.func1(0x10aac10f8?)
	/Users/cavok/devel/go-multi-c-shared.git/lib.go:26 +0x3a
main.call(0x10aa86a60)
	/Users/cavok/devel/go-multi-c-shared.git/lib.go:26 +0x67

What I don't get is that in all of the Pygolo pipelines (except Go 1.10.4 on Ubuntu 18.04, the only 1.10.x in the batch), which also include some macOS versions (??), the runtime does not panic in any of the 1000 iterations. fun is correctly executed and indeed it panics as expected with fun!.

What happens is that call (from lib2.so) is totally absent from the stack traces. In another round, where fun and call both come from lib1.so, I see the nicer stack trace (here too Ubuntu 18.04 fails, with runtime: address space conflict; let's ignore it).

So, although the cross c-shared function invocation seems to work, I do see the holes you described above.

@cherrymui, what should I add to this test in order to have a complete picture of what needs to be implemented?

You mentioned type identity, but Go types do not cross the C API barrier. Even when the underlying data of a void * pointer is actually a Go value, once it crosses the C API it eventually needs to be cast back, and from that point on Go can only trust the developer to do the casts correctly.
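To make the cast point concrete, here is a minimal pure-Go sketch that uses unsafe.Pointer as a stand-in for a void * travelling through a C API. The type `point` and the helpers `roundTrip` and `misread` are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// point is an arbitrary Go type used for the illustration.
type point struct{ x, y int64 }

// roundTrip erases the static type the way a void * parameter would,
// then converts back. The conversion is unchecked: Go trusts the caller.
func roundTrip(p *point) *point {
	opaque := unsafe.Pointer(p) // static type information is gone here
	return (*point)(opaque)
}

// misread converts the same opaque pointer to a different type of the
// same size and layout. It compiles and runs; nothing verifies that
// this is the type the data was created with.
func misread(p *point) [2]int64 {
	opaque := unsafe.Pointer(p)
	return *(*[2]int64)(opaque)
}

func main() {
	p := &point{x: 3, y: 4}
	fmt.Println(*roundTrip(p)) // the original value, recovered
	fmt.Println(misread(p))    // the same bytes seen through another type
}
```

Neither conversion consults any type descriptor, so in this scenario a type mismatch between two libraries would go unnoticed rather than be reported.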

@cavokz changed the title from "runtime: panic when executing from multiple c-shared libraries; used to work in Go 1.16—1.20" to "runtime: panic when executing from multiple c-shared libraries" on Jan 12, 2024