New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/cgo: don't use syscall.Errno type as errno return on Windows #23468

Open
technicalviking opened this Issue Jan 18, 2018 · 14 comments

Comments

Projects
None yet
5 participants
@technicalviking

technicalviking commented Jan 18, 2018

Apologies in advance If I'm explaining anything poorly.

What did you do?

I made an Go binding library for a static C lib, then wrote a go application to use that. During the course of developing that application, I managed to trigger the following error:

The process cannot access the file because another process has locked a portion of the file.

I have a stripped down testable chunk of code (complete with a more in depth readme) here: https://github.com/technicalviking/cgotest

I'd like to stress: I'm not looking to debug the code itself. I know the outputs are not what I expected (casting the C **double variable back to [][]float64 in 'extractOutputs' shows a significant number of NaN entries, but meh), but rather why the functionality puts the code in a state where a windows error populates the error return value of the go reference to the C function in this case.

EDIT TO INVESTIGATION

On further digging I found that calling sqrt on a negative number, even in a function defined in the preamble, is sufficient to trigger this behavior. The test code referenced in this issue has been updated accordingly.

What did you expect to see?

No errors

What did you see instead?

calling the c function using the mechanism
doResponse, doError = C.bridgeIndicatorFunction(...)
to leverage a C function pointer results in doError containing the following value: "The process cannot access the file because another process has locked a portion of the file."

System details

go version go1.9.2 windows/amd64
GOARCH="amd64"
GOBIN=""
GOEXE=".exe"
GOHOSTARCH="amd64"
GOHOSTOS="windows"
GOOS="windows"
GOPATH="C:\Users\dmurker\Documents\Dev\DM\Go Projects"
GORACE=""
GOROOT="C:\Program Files Dev\Go"
GOTOOLDIR="C:\Program Files Dev\Go\pkg\tool\windows_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\D-DANI~1\AppData\Local\Temp\1\go-build223821631=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOROOT/bin/go version: go version go1.9.2 windows/amd64
GOROOT/bin/go tool compile -V: compile version go1.9.2
gdb --version: GNU gdb (GDB) 7.10.1

technicalviking pushed a commit to technicalviking/tulipindicators that referenced this issue Jan 18, 2018

Removing error handling for error from c function due to issue on
windows described here: golang/go#23468.

Outputs should still populate as expected.
@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jan 18, 2018

I don't know what that error means. It's not coming from Go.

@ianlancetaylor ianlancetaylor changed the title from CGO windows unable to return control of application to Go from C land (unless I ignore the error)? to runtime: CGO windows unable to return control of application to Go from C land (unless I ignore the error)? Jan 18, 2018

@ianlancetaylor ianlancetaylor added this to the Go1.11 milestone Jan 18, 2018

@technicalviking

This comment has been minimized.

technicalviking commented Jan 18, 2018

@technicalviking

This comment has been minimized.

technicalviking commented Jan 18, 2018

I've managed to get a MUCH simpler version of the code that demonstrates the problem.

https://github.com/technicalviking/cgotest2/blob/master/main.go

package main

/*
#include <math.h>
double getSqrt(const double test) {
    return sqrt(test);
}
*/
import (
	"C"
)
import "fmt"

func main() {

	for i := 10; i > -10; i-- {
		res, err := C.getSqrt(C.double(float64(i)))

		fmt.Printf("SQRT of %d is %f\n", i, res)

		if err != nil {
			//when i < 0, err contains the string:
			//"The process cannot access the file because another process has locked a portion of the file."
			fmt.Printf("Error for finding the sqrt of %d: %v\n", i, err)
		}
	}

}
@slrz

This comment has been minimized.

slrz commented Jan 19, 2018

Isn't this just the documented cgo behaviour of returning errno when calling a C function in multiple assignment context? If your C function does not set errno, don't expect to find anything meaningful there. It's likely just leftover garbage from whatever its last user happened to put there. Try setting errno to 0 before returning from your C function.

Any C function (even void functions) may be called in a multiple assignment context to retrieve both the return value (if any) and the C errno variable as an error (use _ to skip the result value if the function returns void). For example:

n, err = C.sqrt(-1)
_, err := C.voidFunc()
var n, err = C.sqrt(1)

https://golang.org/cmd/cgo/#hdr-Go_references_to_C

edit: yes it is. The Windows error ERROR_LOCK_VIOLATION is the integer 33, which I guess your libc is using for EDOM, which is what sqrt(3) returns on a domain error (i.e. calling it with a negative argument).

@technicalviking

This comment has been minimized.

technicalviking commented Jan 19, 2018

That... tells me everything I need to know (as disappointing as it is...). The ansi c documentation I found does indeed indicate that the EDOM macro is also the integer 33.

However the error interface value being returned outputs as the string mentioned above (when using %v), not the actual integer errno itself. (Further digging reveals that using the %#v verb outputs 0x21, and using the %d verb outputs 33). Given the mismatch between the ansi c errno description, and the OS description for that error code, would it be reasonable to ask that using the %v verb to output the errno value give the ansi c macro name (or some other related value, such as the output of the %#v verb), rather than the host OS interpretation of that code?

@slrz

This comment has been minimized.

slrz commented Jan 19, 2018

The issue is that the Go side doesn't know the error code convention of the called C function. If you call a Windows API function through cgo, that same integer 33 actually means ERROR_LOCK_VIOLATION. You probably don't want the error returned from C.CreateFile to print as "numerical argument out of domain".

You can probably make a list of C standard library functions and special-case them (making the error message from sqrt look right), but there is lots of code outside of libc that uses C-style error conventions. It would just add even more confusion.

It's easier on Unix as those just incorporate the ANSI C errno definitions into the system interface, so this kind of ambiguity doesn't occur there.

@technicalviking

This comment has been minimized.

technicalviking commented Jan 19, 2018

That seems like an even better reason to have the %v verb translate the errno to either the integer value (in this case 33) as output by the %d verb, or the hex value (in this case 0x21) as would be output by the %#v verb, instead of the host OS interpretation of that code. If the Go side doesn't know the error code convention of the called C function, then the 'default format' (as described in the documentation of the fmt package for the %v verb) of the error code should be unambiguous as well. I would submit that translating the errno to a useable string should be an exercise for the programmer with a priori knowledge of the C function being called, not an assumption made by Go.

@technicalviking

This comment has been minimized.

technicalviking commented Jan 25, 2018

given the investigative work done here, would it be better to close this and open a new issue related to the default output (%v verb) of the cgo error, or should I edit the title and description of this issue? Or is this discussion chain sufficient?

@zhangyoufu

This comment has been minimized.

zhangyoufu commented May 21, 2018

related code in cgo:

go/src/cmd/cgo/out.go

Lines 511 to 513 in be012e1

if n.AddError {
fmt.Fprintf(fgo2, "\tif errno != 0 { r2 = syscall.Errno(errno) }\n")
}

Maybe we need a separate C.Errno that follows the meaning of errno in C language instead of using syscall.Errno for all platforms.

@technicalviking

This comment has been minimized.

technicalviking commented Jun 4, 2018

@zhangyoufu IMHO the issue is more concerning the difference in functionality between

https://golang.org/src/syscall/syscall_windows.go?#L90
and
https://golang.org/src/syscall/syscall_unix.go?#L103

Converting the errno to a syscall.Errno makes sense, as far as it goes, but the obfuscation of what happens between operating systems when trying to output the syscall.Errno using the %v verb (which I think calls the .Error() method to get the string value if it's available) feels like there's an assumption that C code used by a developer on windows will always be code accessing the Windows API. Silly me, I stumbled across a bug in the C code I was writing bindings for which had no such goal, and ran face first into a (potentially uncommon) case where this assumption wasn't valid. Part of this is admittedly my lack of experience with the language (I'd been using it for only a paltry few months at the time).

Potential fixes:

  1. Bring the two syscall.Errno.Error() implementations more in line each other, deferring to the unix implementation on a best effort basis (33 EDOM in this case), and leave the translation to a WinAPI error as an exercise for the coder.
  2. Modify the Windows version of syscall.Errno.Error() to output both the ANSI-C error code (33 EDOM in this case), as well as the Windows specific implementation if it should apply, and leave differentiating the actual meaning of the error as an exercise for the user under the assumption that they are cognizant of the C code the coder is writing CGO code for.
  3. Make zero assumptions about the operating system, and simply output the strconv.ITOA value of the underlying uint, leaving ALL interpretation of the code as an exercise for the coder.

Any of these potential solutions (or anythinig like them) would preclude the need for special interpretation of the %v verb when outputting a syscall.Errno with the fmt package.

@ianlancetaylor thoughts?

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Jun 5, 2018

If I'm following this correctly, the problem is that C functions on Windows set errno to a value that is unrelated to WIndows error codes. The syscall.Errno type is used for Windows error codes, as printed by the Win32 FormatMessageW function. It follows that it's wrong for cgo to convert the C errno value to the type syscall.Errno. We should change cgo to do something better. Perhaps it would be possible to a define a new type that implements the Error method using strerror, but it would also be OK to use a type that implements Error via strconv.Itoa.

Once we have that type, @zhangyoufu already pointed out the line in cgo that needs to change on Windows.

@technicalviking

This comment has been minimized.

technicalviking commented Jun 6, 2018

@ianlancetaylor That is a much better explanation, yes.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jul 6, 2018

@ianlancetaylor ianlancetaylor changed the title from runtime: CGO windows unable to return control of application to Go from C land (unless I ignore the error)? to cmd/cgo: don't use syscall.Errno type as errno return on Windows Jul 6, 2018

@technicalviking

This comment has been minimized.

technicalviking commented Aug 27, 2018

@ianlancetaylor Kinda sad this got pulled from go 1.11. Is this issue going to be resolved before 1.12 or 2.0?

@ianlancetaylor

This comment has been minimized.

Contributor

ianlancetaylor commented Aug 28, 2018

Someone needs to actually work on it. Want to send a change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment