Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/cgo: C.GoStringN should document that it copies the entire length of the buffer #12427

Closed
siebenmann opened this issue Sep 1, 2015 · 5 comments

Comments

@siebenmann
Copy link

commented Sep 1, 2015

A naive reader of the minimal documentation for C.GoStringN() would be led to believe that it is a bounded version of C.GoString(), ie it copies a null-terminated string but stops after at most N bytes whether or not a null was found. This is not the actual behaviour, which is that GoStringN() always copies N bytes regardless of whether or not the 'C string' is null terminated before then. Since I recently made this mistake myself, I think that the cmd/cgo documentation should be clear about this.

(In an ideal world I think that C.GoStringN()'s first argument would be an unsafe.Pointer instead of a *C.char, but I assume it's far too late to make that change even if people were inclined to do so.)

@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Sep 1, 2015

@ianlancetaylor ianlancetaylor changed the title cmd/cgo: C.GoStringN() should document that it copies the entire length of the buffer cmd/cgo: C.GoStringN should document that it copies the entire length of the buffer Sep 1, 2015

@minux

This comment has been minimized.

Copy link
Member

commented Sep 1, 2015

One way to understand is that Go string can have embedded NULs,
or if C.GoStringN stop on first NUL byte, it's hard to ever convert
arbitrary binary data from C to a Go string.

I don't think changing the type of the first argument to C.GoString/
C.GoStringN would matter much as then you almost always need
to cast a pointer to unsafe.Pointer whereas right now if the input
is really a C string, you already have *C.char. The only case that
it helped is when you have a C void * pointer.

@siebenmann

This comment has been minimized.

Copy link
Author

commented Sep 1, 2015

When you say that something takes a *C.char as its argument and describe that as a C string, you are creating the expectation in many readers that it will actually be treated as a C string. C strings are null terminated (and so by definition do not contain nulls). Making an argument an unsafe.Pointer (and not calling it a C string in documentation) makes it clear that it is not being treated as a C string and so nulls do not matter. Contrast the lack of ambiguity in C.GoBytes versus the current ambiguity in C.GoStringN.

I agree that C.GoStringN serves a useful purpose today with its current semantics and anyway, the current behaviour is what it is and people have written code to that behaviour. I just want those semantics to be clear to everyone who reads godoc cmd/cgo.

@minux

This comment has been minimized.

Copy link
Member

commented Sep 1, 2015

No, char * in C doesn't necessary represent a C string.
Arbitrary binary data is usually also represented as char *
(or unsigned char *, but sometimes utf-8 strings are
also represented as unsigned char *, e.g. ICU)

I don't think taking a *C.char parameter alone implies
the function is taking a C string.

And in other languages that supports embedded NULs
in string, they also use similar APIs and generally
document it as taking a string and size (length).

e.g.
Lua:
const char *lua_pushlstring (lua_State *L, const char *s, size_t len);
Pushes the string pointed to by s with size len onto the stack.

CPython:
PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)
Return a new string object with a copy of the string v as value and
length len on success, and NULL on failure. If v is NULL, the contents
of the string are uninitialized.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Oct 16, 2015

It seems fine to me to change:

// C string, length to Go string
func C.GoStringN(*C.char, C.int) string

// C pointer, length to Go []byte
func C.GoBytes(unsafe.Pointer, C.int) []byte

to say

// C data with explicit length to Go string

// C data with explicit length to Go []byte
@gopherbot

This comment has been minimized.

Copy link

commented Oct 16, 2015

CL https://golang.org/cl/15997 mentions this issue.

@minux minux closed this in 75a423a Oct 23, 2015

@golang golang locked and limited conversation to collaborators Oct 24, 2016

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
5 participants
You can’t perform that action at this time.