proposal: Go 2: permit converting a string constant to a byte array type #36890
The latter you can do with |
Most systems nowadays are 64-bit. A string takes 16 bytes and a slice takes 24 bytes. It is especially useful if you want to:
|
I don't see how this matters. Your first proposal would happen at compile time, and the second one just needs an extra 8 bytes of stack space (which could be optimized away if we cared).
That seems like a reasonable thing to want. You could do
The only thing that |
How often does code want to initialize a byte array (not slice) with a constant string? |
I have, on occasion, wanted a byte-array from a constant string in order to have something addressable to pass to a C function (https://play.golang.org/p/vTuzoWnwRhA). I typically end up working around it by using a variable and I do not know how representative that use-case is. |
this will use extra 24 bytes on the stack compared to a byte array:
If buf is a fixed-size buffer and Initializing a byte array from a string can be really convenient: It can be used for enumerations of strings with a specific format like above, among other useful cases.
Ignoring having to type an extra line, yes automatic sizing is also a convenient gain ;) |
Minor nit but it seems like it should be |
@jfcg no "v2.0" is planned. The "Go2" label means "defer until language changes resume" (they resumed in 1.13). |
Comparing, say,
and
We could easily optimize the expression |
Having to call a function like Ok, this is different than having a fixed size buffer and seems the proposal has a slight 8 byte advantage, and three characters |
That is true,
It always has to do at least a copy of the data, because the compiler currently has no analysis to prove that the byte array is not modified subsequently (#31506, #2205). If the result doesn't escape, it can allocate the backing array on the stack. But I don't think the details of that are much different in the two cases. |
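A minimal sketch of the copy described above, assuming the expression under discussion is a `[]byte(s)` conversion. The helper name `mutateCopy` is hypothetical. It shows that writing to the converted slice leaves the source string untouched, which is why the conversion cannot simply alias the string's bytes.

```go
package main

import "fmt"

// mutateCopy converts s to a byte slice, overwrites the first byte of the
// slice, and returns both values so the caller can compare them.
func mutateCopy(s string) (string, []byte) {
	b := []byte(s) // the conversion copies the string's bytes
	if len(b) > 0 {
		b[0] = 'X'
	}
	return s, b
}

func main() {
	s, b := mutateCopy("foo")
	fmt.Println(s, string(b)) // prints "foo Xoo"
}
```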
I've started on removing unnecessary cap arguments to a few runtime calls. It's mostly done. I'll plan to finish it up and mail during 1.15. (Writing here to try to avoid duplicate work.) |
As noted above, we can already convert byte arrays to strings by adding How often does this really come up in practice? For example, can you find examples of existing code that would be simplified if we added this conversion to the language? |
In the top 15 trending Go repos this month
yields 74 results:
This is an understated statistic. It is pretty clear this feature is direly needed... |
This compiles on the playground:

    const staticname = "foo\x00"
    var name [len(staticname)]byte
    copy(name[:], staticname)

Do we really need it to be more convenient than this? |
Are you seriously comparing that to:
C and C++ have had this for years:
|
@jfcg I think he was serious, yes. Please be polite. Although you may disagree, it is not an unreasonable position; the bar for language changes is very high. |
The C/C++ version is not directly relevant to the discussion:
The question is not "is |
The discussion about "24 bytes of waste" is assuming specific compiler behavior. It's misleading because compilers can change. In the example above it's true that In general, we should avoid making language decisions based on compiler behavior. (Which is not to say that I am opposed to this change, but if there is a reason to do it it has to do with making code simpler and more readable, and is not because of saving space on the stack.) |
Your example is very artificial and hard to draw conclusions from. In this case, fmt.Println accepts a slice of interface{}, so you're already "wasting" many more bytes on that. But the program probably isn't memory constrained, so there's no reason to optimize it either. |
@carlmjohnson, any fixed-size local buffer that does not escape the function is better as a byte array. @ianlancetaylor, byte slices are officially separate types that refer to memory buffers, am I wrong? How can a compiler choose not to allocate space for that slice, can you explain? |
Both byte slices and byte arrays have an array of bytes. We can disregard that when comparing them. A byte slice is three different values: a pointer to the array of bytes, a length, and a capacity. A byte array will internally be represented using a pointer to the array, and a constant length. When the compiler sees The 24 bytes only arise when the compiler has to construct the slice in memory. In practice that only happens if the slice is converted to an interface type. When the slice is indexed, the compiler can compare the index with the length, in this case a constant. When the slice is passed to a function, the compiler will pass three separate values, where in this case two of them are constants. The compiler does not need to construct a 24 byte value in order to do that (the arguments to the function will take 24 bytes, but exactly the same would happen when passing a byte array to a function that expects a byte slice, by slicing the array). In other words, the compiler doesn't think of a byte slice as a single value of size 24 bytes. It thinks of it as three different values, one pointer and two ints. Only when the slice must be stored into memory, as when converting to an interface type or setting a global variable, does the compiler need to assemble the 24 byte value. |
Ok I see, so the compiler is smart for many cases. |
@bcmills, you are right, that is another valid use case for byte arrays. I wasn't trying to ignore your comment from a few weeks back, but rather wanted to emphasize that focussing on the (supposedly) saved space might not be the way to gain enough support for the proposal. Sorry, if I didn't make that clear enough. |
As far as I know the gc compiler doesn't have the "24 byte problem." And, if it does, we should treat it as something to fix in the compiler, rather than something to fix in the language. (As I mentioned above I'm not opposed to this change, I just don't think that optimization is a reason for it.) |
Hi. To better understand slice/array initialization differences, I disassembled
array.go:
with
I think both of these simplified versions use the same amount of stack, as @ianlancetaylor said. Could someone please explain these little differences? |
I’m on my phone, but I suspect that after https://go-review.googlesource.com/c/go/+/220499 goes in, those differences may disappear. That change description may help you understand the difference as well. If you want to know more, look for array and slice initialization in src/cmd/compile/internal/gc/sinit.go. |
Thanks for finding the examples where this could be used. I note that many of the cases are one map in x/text/internal/language, which appears twice in the examples. Many of the other cases are only in tests. This doesn't seem to be used often enough to justify changing the language, per the criteria for language changes at https://blog.golang.org/go2-here-we-come. It's a bit odd to restrict this conversion only to constant strings. But it's also odd to permit non-constant strings, as it's not obvious what should happen if the string is too long. Given that it only applies to constant strings, it's syntactic sugar for listing out the bytes. As discussed above, the effects on the compiler should be fixed in the compiler, not in the language. For these reasons, this is a likely decline. Leaving open for four weeks for final comments. |
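To make "syntactic sugar for listing out the bytes" concrete, here is a small sketch comparing the two spellings available today. The names `greeting`, `listed`, and `copied` are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

const greeting = "foo\x00"

// listed spells out the bytes one by one; [...] lets the compiler count them.
var listed = [...]byte{'f', 'o', 'o', 0}

// copied builds the same array from the constant string with copy.
// len(greeting) is a constant expression, so it can size the array.
func copied() [len(greeting)]byte {
	var a [len(greeting)]byte
	copy(a[:], greeting)
	return a
}

func main() {
	// Arrays of the same type are comparable with ==.
	fmt.Println(listed == copied()) // prints "true"
}
```

Both forms produce a `[4]byte`; the proposed conversion would be a third way to write the same value.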
If this proposal does not satisfy the conditions for inclusion, I don't see what does. This is very wrong. Go is a programming language, not a religion. I don't get it. |
@jfcg The language can't rely only tests like The blog post lists three criteria for language changes. One of those is that a language change must "address an important issue for many people." The cases in x/text/internal/language/lookup.go we can effectively treat as a single example, as they are all the same. Ignoring cases in tests, I count 12 examples in #36890 (comment). That isn't many, especially considering that you looked at the whole standard library. And each of those 12 examples is easy to write today; this language change would make it slightly more convenient, but it wouldn't add a feature that is not already available. I don't see any particular reason to assume that any cases of converting a string to a slice could instead convert a string to a byte array. Perhaps some could, perhaps not. Slices and arrays are different and serve different purposes. I don't agree that adding this feature would close an asymmetry in the language. You can't convert an array of bytes to a string. |
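As a small illustration of the last point: a byte array has to be sliced before it can be converted to a string. The helper name `arrayToString` is hypothetical.

```go
package main

import "fmt"

// arrayToString converts a byte array to a string by slicing it first.
// Writing string(a) on the array itself does not compile.
func arrayToString(a [3]byte) string {
	return string(a[:])
}

func main() {
	a := [3]byte{'f', 'o', 'o'}
	fmt.Println(arrayToString(a)) // prints "foo"
}
```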
Any string whose size can be determined at compile-time can be converted to a byte array. How is
Ian, how many
we can write |
The Go language is defined by a language spec, not by an implementation. There are multiple Go compilers. If we implement this feature, all compilers must agree on exactly when it is permitted to convert a string to a byte array, and when it is not. Otherwise a program might compile with one compiler but not with another. Therefore, we can't just say "size can be determined at compile-time." We must write down the precise rules by which the size can be determined at compile time.
I understand what you are asking, but I don't think it's comparable. Before we added the ability to convert a
Yes.
That is one slice expression and one conversion. |
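One rule the spec already states precisely is that `len(s)` is a constant when `s` is a string constant, so it can be used as an array length. The sketch below relies on that, and also shows what `copy` does today when a non-constant string does not fit: it silently truncates. The names `label` and `fill` are hypothetical.

```go
package main

import "fmt"

const label = "v1.0"

// fill copies as many bytes of s as fit into an array sized by the constant
// label, and reports how many bytes were copied.
func fill(s string) ([len(label)]byte, int) {
	var a [len(label)]byte // len of a constant string is a constant
	n := copy(a[:], s)     // copies min(len(a), len(s)) bytes
	return a, n
}

func main() {
	a, n := fill("toolong")
	fmt.Println(n, string(a[:])) // prints "4 tool"
}
```

A conversion from a non-constant string would need its own rule for the too-long case, which is the difficulty noted earlier in the thread.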
Just curious, why would you want to rewrite them to use arrays when they currently work fine with slices? |
Change https://golang.org/cl/227163 mentions this issue: |
Some runtime calls accept a slice, but only use ptr and len.
This change modifies most such routines to accept only ptr and len.

After this change, the only runtime calls that accept an unnecessary cap arg are concatstrings and slicerunetostring. Neither is particularly common, and both are complicated to modify.

Negligible compiler performance impact. Shrinks binaries a little. There are only a few regressions; the one I investigated was due to register allocation fluctuation.

Passes 'go test -race std cmd', modulo #38265 and #38266. Wow, does that take a long time to run.

Updates #36890

file       before     after      Δ       %
compile    19655024   19655152   +128    +0.001%
cover      5244840    5236648    -8192   -0.156%
dist       3662376    3658280    -4096   -0.112%
link       6680056    6675960    -4096   -0.061%
pprof      14789844   14777556   -12288  -0.083%
test2json  2824744    2820648    -4096   -0.145%
trace      11647876   11639684   -8192   -0.070%
vet        8260472    8256376    -4096   -0.050%
total      115163736  115118808  -44928  -0.039%

Change-Id: Idb29fa6a81d6a82bfd3b65740b98cf3275ca0a78
Reviewed-on: https://go-review.googlesource.com/c/go/+/227163
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
No change in consensus. |
Change https://golang.org/cl/245099 mentions this issue: |
This ports https://golang.org/cl/227163 to the Go frontend. This is a step toward moving up to the go1.15rc1 release.

Original CL description:

cmd/compile,runtime: pass only ptr and len to some runtime calls

Some runtime calls accept a slice, but only use ptr and len.
This change modifies most such routines to accept only ptr and len.

After this change, the only runtime calls that accept an unnecessary cap arg are concatstrings and slicerunetostring. Neither is particularly common, and both are complicated to modify.

Negligible compiler performance impact. Shrinks binaries a little. There are only a few regressions; the one I investigated was due to register allocation fluctuation.

Passes 'go test -race std cmd', modulo golang/go#38265 and golang/go#38266. Wow, does that take a long time to run.

file       before     after      Δ       %
compile    19655024   19655152   +128    +0.001%
cover      5244840    5236648    -8192   -0.156%
dist       3662376    3658280    -4096   -0.112%
link       6680056    6675960    -4096   -0.061%
pprof      14789844   14777556   -12288  -0.083%
test2json  2824744    2820648    -4096   -0.145%
trace      11647876   11639684   -8192   -0.070%
vet        8260472    8256376    -4096   -0.050%
total      115163736  115118808  -44928  -0.039%

For golang/go#36890

Change-Id: I1dc1424ccb092a9ad70472e560a743c35dd27bfc
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/245099
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Byte slices and strings are convertible to each other:
I propose extending this to allow converting:
Also:
I believe this is backward-compatible with Go 1. What do you think?