New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/link: loading c-shared into Go program crashes on Windows #22192
Comments
If calling from C, works fine for me. #include <windows.h>
typedef void (*pfn)(void);
int
main(int argc, char* argv[]) {
pfn fn;
HMODULE h = LoadLibrary("foo.dll");
fn = (pfn) GetProcAddress(h, "Foo");
fn();
return 0;
} |
I suspect what is happening here is that you cannot have 2 Go runtimes coexist in a single process. In particular the way we access TLS has to be changed to allow 2 or more runtimes in one process. I will let Ian decide what to do here. Alex |
The intent of c-shared is to permit loading a Go DLL into a C program. You seem to be using c-shared to load a Go DLL into a Go program. That is problematic. The expectation is that you would be using -buildmode=plugin or -buildmode=shared here. But I can see why Windows developers would expect this to work, especially since neither -buildmode=plugin nor -buildmode=shared currently work on Windows. But I don't know how to make it work. I'm not sure what we should do. |
@ianlancetaylor This is caused by that runtime.islibrary is set to 1 when it called from exe built with Go. When it called from Go, all of global variables are not initialized (include os.Stdout). As alex mentioned, add way to access TLS or put global variables into bss section? |
In general, Go requires that the complete executable image have only a single copy of the runtime package in the overall executable image. I don't know how we can make it work if there are multiple copies. A c-shared is intended to have its own copy of the runtime, since a c-shared is expected to be loaded by a C program that will not, of course, have a copy of the runtime. Loading a c-shared by a Go program inevitably means multiple copies of the runtime. Windows does not have ELF-style interposition, so I see no simple way to make the c-shared used the Go program's copy of the runtime. |
How about to install main-runtime into copies using function exported from DLL? For example, DLL export runtime.install like below. type runtime_share strcut {
name string
ptr uintptr
}
var runtime_shared []runtime_share
func runtime_install(data []runtime_share)
for _, rs := range runtime_share
*((*unsafe.Pointer)(find_runtime_share(rs[i].name).ptr)) =
*((*unsafe.Pointer)(rs.ptr))
}
} And runtime_shared is appended from each packages used in DLL. os/file.go func init() {
runtime_install([]runtime_share{
{"os.Stdout", &os.Stdout},
{"os.Stdin", &os.Stdin},
...
})
} |
That suggestion seems to get the initial values the same, which is a start, but we really need to have a single instance of each variable. When one instance of the runtime changes a variable, the other instance of the runtime has to see that change. |
Yes, also runtime functions in DLL should be replated to runtime functions in main. |
This is talking about -buildmode=shared. Let's focus to c-shared. At least, about c-shared, we need to initialize runtime as same as call from C. |
Sorry, I was confused. runtime.islibrary is not related on this issue. This seems to be conflicting of bss section? |
What is involved implementing either of those? Do I start with correspondent /misc/cgo/test... and see where it takes me? Do you think it is hard? Do you think it is possible to implement them on WIndows? I will also try to see why @mattn example fails (when I have time). Maybe it is something simple. I also wonder what happens, if two or more Go DLLs are loaded from C program. I suspect we might get a similar crash too. Alex |
Unfortunately I don't remember all of the details of Windows DLLs. My vague recollection is that a Windows DLL needs to have a clear notion of whether a symbol is defined in a different DLL at compile time. If that is correct, it doesn't fit will with the current approach, in which we only really know whether a symbol is defined in a different DLL at link time. That would be the first question to resolve. |
Windows DLL is similar to Windows EXE regarding external symbols - all external symbols (functions) are kept in a DLL file. There are 2 ways that external DLL functions are found:
Perhaps you are talking about something else. Alex |
The question is not so much where functions are found, as how the compiler needs to handle the code that refers to those functions. When compiling a reference to a variable defined in a different DLL, does the compiler need to know that the variable is defined elsewhere? |
Compiler does need to know that function lives in external DLL. Like I said before both Windows executables and DLLs have "import" section that is used by Windows to resolve all external call sites when program starts. So as long as Go compiler arranges all those external calls as required, it all just works. That is what we do for Windows executables already. Alex |
I suspect this issue might be due to us using unsupported TLS slot. https://devblogs.microsoft.com/oldnewthing/20190418-00/?p=102428 Windows amd64 is using MOVQ 0x28(GS), CX instruction to read But we need to read Our windows-arm port is already doing correct hing - see runtime·save_g and runtime·load_g. But, I still could not copy that code onto amd64. Perhaps amd64 and arm assemblers are different. @ianlancetaylor can you help? Thank you. Alex |
I don't have any simple solutions. I would look into the way that Android supports TLS on amd64 before Android Q. |
Hi,how is this problem now? This is another repo that can reproduce this problem,and random errors. |
Another example here: #36157 Funny thing: adding a MessageBox before calling the DLL's exported function makes it work. |
I think this is your illusion. |
Maybe but the test program does not crash. It runs, prints, opens the dialog and returns fine. |
Is there any progress? |
@mattn Can this problem be dealt with |
@86speed This problem is hard to fix since current implementation can not use external runtime's goroutine from main goroutine. |
1.5 yrs on.... where we at... |
hello, how to solve the problem? |
@alexbrainman I can confirm that this issue is not reproduced on windows/arm64, at least the basic reproducer in the issue description works well. I'm also leaning towards a TLS issue. Will give it a try. |
I have a (yet-not-submitted) change that fixes this issue by using a proper TLS slot instead of the arbitrary pointer memory. Still fighting a couple of sharp edges, I'll try to submit it for review during the next week if I time allows. |
Change https://go.dev/cl/431775 mentions this issue: |
Sorry for being late here (just saw this). But with the current design, I don't think the c-shared build mode works well (and intended to) with loading from a Go program. Currently it is designed to be the only instance of the Go runtime in the program. It doesn't know other instances of the Go runtime, and doesn't do anything to combine the various metadata the runtime needs (some needs to be deduplicated, some needs to be merged, etc.) It is possible that it works for some simple use cases, if we carefully keep the two Go world completely separated. (And the TLS patch would help on that.) But it wouldn't be surprised that things would go wrong with more complex programs, especially with passing references around between two Go worlds. I'm not sure it is the right direction to pursue this path. |
Your call. The patch is there in case someone wants to keep pursuing this path. Additionally, the test coverage for loading Go shared libraries on other Go processes is next to nothing, so I wouldn't feel confortable saying that this scenario is supported, even if my patch is merged and the runtime metadata is combined. |
Thanks.
Right. It is nothing because it is not currently intended to be supported. If in the future we support this mode we will need more tests. |
Regardless of your decision if calling Go DLL from Go executable, I think you should consider using TlsAlloc Windows API to allocate TLS slots ( what https://go.dev/cl/431775 does ). Currently Go uses unsupported way to store TLS register ( #22192 (comment) ). If any DLL imported by Go program decides to use the same unsupported trick, chaos ensues. Go program does not have control of what DLLs it imports, because these DLLs are installed by companies IT departments and people who don't realise what they are installing (think of antivirus and any other tricky software). So there is a good chance we fix some unexplained bugs by replacing current TLS implementation with TlsAlloc. Just MHO. Alex |
Yeah, I'm not opposing using TlsAlloc, which I think is probably a good thing to do by itself. I'm just saying this doesn't necessarily make loading c-shared by a Go program (this issue) work. Removed the Hold on the CL. Thanks. |
Please try https://go-review.googlesource.com/c/go/+/431775 to see if it fixes your problem. Thank you. Alex |
@aarzilli @derekparker take a look at https://go.dev/cl/431775. If it is finally merged, Delve assumption about G always being at 0x28 bytes from TLS will no longer be true. It will have to be inferred using |
@alexbrainman @qmuntal I confirmed the CL fixes my problem. Thank you. |
It seems that cl431775 fixes #11100 too. Right? |
Ah, issue #1110 is not for only Windows. Ignore this. |
I don't think this issue is constrained to Windows only. It also occurs on Darwin. Reproducing isn't the same, but any loading a c-shared dynamic library from Go will have issues. |
This CL redesign how we get the TLS pointer on windows/amd64. We were previously reading it from the [TEB] arbitrary data slot, located at 0x28(GS), which can only hold 1 TLS pointer. With this CL, we will read the TLS pointer from the TEB TLS slot array, located at 0x1480(GS). The TLS slot array can hold multiple TLS pointers, up to 64, so multiple Go runtimes running on the same thread can coexists with different TLS. Each new TLS slot has to be allocated via [TlsAlloc], which returns the slot index. This index can then be used to get the slot offset from GS with the following formula: 0x1480 + index*8 The slot index is fixed per Go runtime, so we can store it in runtime.tls_g and use it latter on to read/update the TLS pointer. Loading the TLS pointer requires the following asm instructions: MOVQ runtime.tls_g, AX MOVQ AX(GS), AX Notice that this approach is also implemented on windows/arm64. [TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block [TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc Updates #22192 Change-Id: Idea7119fd76a3cd083979a4d57ed64b552fa101b Reviewed-on: https://go-review.googlesource.com/c/go/+/431775 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
5 years... and were all still waiting for this to get sorted... |
@Linkangyis proposing alternative solutions is always welcome, but having intensive discussion about your project is probably not going to solve this issue. If you have any idea that helps this issue, you're welcome to discuss or contribute a CL. Otherwise, please keep the discussion about your project on your own issue tracker or a forum. Thanks. |
@cherrymui Okay, I have deleted the relevant information. Sorry for the inconvenience 以上为机翻 |
What version of Go are you using (
go version
)?go version devel +bb0bfd002a Tue Oct 10 01:02:27 2017 +0000 windows/amd64
Does this issue reproduce with the latest release?
Only tip
What operating system and processor architecture are you using (
go env
)?What did you do?
foo.go
creating foo.dll
call Foo in foo.dll
What did you see instead?
output
foo
What did you expect to see?
panic
As far as I can see, panic occur at
(*File) write(b []byte)
Maybe, poller in dll is conflicting with main process.
The text was updated successfully, but these errors were encountered: