encoding/binary: add var NativeEndian; also x/sys/cpu.IsBigEndian #57237

Open
bradfitz opened this issue Dec 11, 2022 · 39 comments
Comments

@bradfitz
Contributor

bradfitz commented Dec 11, 2022

I'd like to revisit #35398 (proposal: encoding/binary: add NativeEndian). Sorry. I know the history & people's opinions already. But I come with a story! 😄

Go 1.19 added support for GOARCH=loong64 (https://go.dev/doc/go1.19#loong64)

So naturally somebody wanted to compile tailscale.com/cmd/tailscaled with GOOS=linux GOARCH=loong64. Compilation failed.

It turned out we have four different native endian packages in our dependency tree:

So we had to update all four, along with the various requisite go.mod bumps.

Some observations:

  • they're all very similar
  • people don't like taking dependencies on other big packages. @josharian's github.com/josharian/native that he mentioned in #35398 (comment) is closest, but lacks the constant that we ended up needing in Tailscale. So everybody makes their own local copies instead. That works until a new GOARCH comes along. Maybe that's rare enough? But I'm sure more riscv* variants will come along at some point.

x/sys/cpu already has this code:

https://cs.opensource.google/go/x/sys/+/refs/tags/v0.3.0:cpu/byteorder.go;l=44

And it has even more GOARCH values (for gccgo) than any other package has!

So everybody has a different subset of GOARCH values it seems.

I know people don't want to encourage thinking about or abusing endianness, but it's a reality when talking to kernel APIs. And this is kinda ridiculous, having this duplicated incompletely everywhere.

It would've been neat if Go could've added loong64 and had a bunch of code in the Go ecosystem just work right away and not require adjusting build tags.

Alternatively, if std and x/sys/cpu are too objectionable: what about new build tags?

/cc @josharian @mdlayher @hugelgupf @zx2c4 @yetist @jwhited @raggi

@gopherbot gopherbot added this to the Proposal milestone Dec 11, 2022
@mdlayher
Member

Yes please. Another option: unsafe.NativeEndian to imply that you shouldn't use this unless you are aware of the implications?

@ianlancetaylor
Contributor

CC @robpike

@josharian
Contributor

@josharian's github.com/josharian/native that he mentioned in #35398 (comment) is closest, but lacks the constant that we ended up needing in Tailscale.

FWIW, I know the author of that package, and he’s amenable to adding a constant.

But an x/sys/cpu or build tag option would be better. :)

bradfitz added a commit to bradfitz/native that referenced this issue Dec 13, 2022
josharian pushed a commit to josharian/native that referenced this issue Dec 13, 2022
bradfitz added a commit to tailscale/tailscale that referenced this issue Dec 13, 2022
See josharian/native#3

Updates golang/go#57237

Change-Id: I238c04c6654e5b9e7d9cfb81a7bbc5e1043a84a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
bradfitz added a commit to bradfitz/uio that referenced this issue Dec 13, 2022
Fixes u-root#7
Updates golang/go#57237

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
hugelgupf pushed a commit to u-root/uio that referenced this issue Dec 13, 2022
Fixes #7
Updates golang/go#57237

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
@robpike
Contributor

robpike commented Dec 13, 2022

You know my position about byte order. Yes, it's sometimes necessary but it's used far more than it should be. I'd prefer to take up @josharian's first suggestion and not have the standard library promote the concept.

bradfitz added a commit to bradfitz/dhcp that referenced this issue Dec 13, 2022
To pick up this change:
u-root/uio@c353755

Updates golang/go#57237

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
@bradfitz
Contributor Author

I'd prefer to take up @josharian's first suggestion and not have the standard library promote the concept.

What about const IsBigEndian = false in x/sys/cpu? That's not the standard library. And that cpu package is very much about telling you what the CPU does.

@robpike
Contributor

robpike commented Dec 13, 2022

I'm not going to die on this hill, but know too that big/little endian is not the full story for some architectures. Maybe Go will never support a nutty layout, but who knows?

In other words, is a boolean sufficient? It might be nowadays, but I honestly don't know. It's probably sufficient to support the architectures encoding/binary does, so maybe it's enough.

At least your suggestion doesn't promote the idea of "native" byte order explicitly.

But: A single third-party package that just told you the byte order by computing it (this can only be done unsafely, which is a BIG part of why I dislike the concept) seems like the real right answer to me.

@raggi
Contributor

raggi commented Dec 13, 2022

Is part of the problem here more one of naming? Rather than "NativeEndian", perhaps something closer to what this is used for in most good cases: the system's ABI endianness. I'm not sure of a good short name to describe that, but perhaps if we had one it would be less objectionable?

@robpike
Contributor

robpike commented Dec 14, 2022

@raggi Not especially. It's not what it's called, it's what it represents, an internal detail that is almost always (not always, but almost always) irrelevant. A long history of bad C code has taught people that it matters more than it does.

Only unsafe code can care.

@DeedleFake

DeedleFake commented Dec 14, 2022

I made this exact mistake recently. I'm working on a pure Go Wayland protocol implementation. Wayland, despite being essentially a network protocol, is constrained to Unix domain sockets for various practical reasons. Since everything's on the same machine anyways, I assume, the developers decided to just use native endianness for data being sent. Early on in my project, I created a global var byteOrder binary.ByteOrder that I then set in an init() by detecting the native endianness via unsafe. It was only later that I realized that this was 100% pointless, serving literally no purpose other than hurting the performance ever so slightly. It's pointless because all it's actually doing is getting the raw bytes of various types. This is just as easily, and far more efficiently, accomplished by just, for example, doing something like *(*[4]byte)(unsafe.Pointer(&v)) to get the raw bytes of either an int32 or a uint32. I wrote a couple of functions, such as func Bytes[T ~int32 | ~uint32](v T) [4]byte, to simplify it a bit further and that was the end of it.
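A possible body for the Bytes helper described above might look like the sketch below. Only the signature comes from the comment; the implementation is an assumption built on the unsafe conversion quoted in the same paragraph.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Bytes returns the in-memory representation of v, in whatever byte order
// the current CPU uses. The body is an assumed implementation; only the
// signature appears in the discussion.
func Bytes[T ~int32 | ~uint32](v T) [4]byte {
	// v is a local copy, so taking its address is fine. Converting to a
	// *[4]byte only relaxes the alignment requirement.
	return *(*[4]byte)(unsafe.Pointer(&v))
}

func main() {
	b := Bytes(uint32(0x01020304))
	// Prints the bytes in memory order, which differs between
	// little-endian and big-endian machines.
	fmt.Printf("% x\n", b)
}
```

Because the argument is passed by value, the conversion always starts from a properly aligned variable, which sidesteps the alignment concern raised later in the thread for pointers into byte slices.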

I have to agree with @robpike. I used to wonder why this was missing but now I just don't see the point. All it would do would be to obscure stuff that probably should be unsafe. I took a look through several packages that use the above-referenced native endianness packages and none of them require a native endianness. For example, Tailscale doesn't use it much, and every usage could easily be replaced with an unsafe conversion; the same goes for Cloudflare's usage of it, and I'm not even sure what's going on in this one. On the other hand, wireguard-go has a discussion that talks about unsafe casting, but only seems to consider casting an entire struct as an alternative instead of casting just the slices that they're already handling individually anyways, which seems a bit odd to me if the goal is to remove the dependency.

Unless I'm missing something, none of these seem like they actually require any kind of NativeEndian and their insistence on using a library to detect how their system reads and writes memory instead of just reading and writing that memory and letting the computer do what it's designed to do is causing issues that they just simply wouldn't otherwise have. It's also notable that every usage that I saw directly calls the methods on the ByteOrder implementation, rather than passing it to binary.Read() or binary.Write(). Calling the methods directly is even more pointless, since you have to know the size in advance anyways.

Endianness is generally not something that should affect anything outside of network protocols and file formats. In other words, it only really matters for stuff where data could be read by two processes that might be running on different computers. In those cases, the endianness needs to be explicitly declared as something standardized between the two, hence binary.LittleEndian and binary.BigEndian. I don't know what the point of a native endian implementation of binary.ByteOrder would be other than to confuse people into thinking that there's a similarity in the usage between native endianness and predefined endianness.

@zx2c4
Contributor

zx2c4 commented Dec 14, 2022

You have to be careful with alignment on some archs with that sort of unsafe cast.

Typically what I've done in C is memcpy to the destination and let gcc figure out whether that has to be byte by byte or can be word-wise, with load-store folding removing the inefficiency.
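A rough Go analogue of the memcpy approach, offered as a sketch rather than as anything proposed in this thread, is to copy the bytes into an aligned local variable instead of dereferencing a pointer into the slice. The name loadUint16 is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// loadUint16 reads two bytes from b in native byte order. It copies them
// into an aligned local variable rather than dereferencing a possibly
// misaligned pointer into b. The name is illustrative only.
func loadUint16(b []byte) uint16 {
	var v uint16
	dst := (*[2]byte)(unsafe.Pointer(&v))
	copy(dst[:], b[:2]) // panics if b holds fewer than two bytes
	return v
}

func main() {
	buf := []byte{0xAA, 0x01, 0x02, 0x03}
	// buf[1:] starts at an odd offset, so a direct *(*uint16) cast of
	// &buf[1] could be misaligned on strict-alignment architectures.
	fmt.Printf("%#04x\n", loadUint16(buf[1:]))
}
```

Whether the copy compiles down to a single load is up to the compiler and the target; the sketch only guarantees that the access itself is aligned.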

@DeedleFake

DeedleFake commented Dec 14, 2022

You have to be careful with alignment on some archs with that sort of unsafe cast.

How so, and how does a NativeEndian implementation help with that? If you can reference the memory, the computer shouldn't care what type you're treating the bytes as I would think. For example, what's the difference between

v.hdrLen = endian.Native.Uint16(b[2:])

and

v.hdrLen = *(*uint16)(unsafe.Pointer(&b[2]))

Genuinely curious. It's reading the same bytes in the same order. If there are issues with reading those bytes that seems like it should be something the compiler should handle transparently.

@josharian
Contributor

@DeedleFake the compiler only coalesces byte-by-byte memory reads/writes if it knows that it is safe to do so, based on the architecture, alignment, whatnot. The unsafe version says to the compiler: just do it, whether or not it is safe.
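For contrast with the unsafe cast, a byte-by-byte read of the kind the compiler is free to coalesce might be sketched as follows. The function names leUint16 and beUint16 are hypothetical; the shape is similar in spirit to how a ByteOrder method decodes a value, and whether the two loads are actually merged depends on the architecture and compiler version.

```go
package main

import "fmt"

// leUint16 decodes a little-endian uint16 using only safe byte loads.
func leUint16(b []byte) uint16 {
	_ = b[1] // early bounds check: panics if b is too short
	return uint16(b[0]) | uint16(b[1])<<8
}

// beUint16 decodes a big-endian uint16 using only safe byte loads.
func beUint16(b []byte) uint16 {
	_ = b[1] // early bounds check: panics if b is too short
	return uint16(b[1]) | uint16(b[0])<<8
}

func main() {
	fmt.Printf("%#04x\n", leUint16([]byte{0x34, 0x12}))
	fmt.Printf("%#04x\n", beUint16([]byte{0x12, 0x34}))
}
```

Nothing here tells the compiler to perform a wide load at a given address, so it can pick a single load where that is safe and fall back to individual byte loads where it is not.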

@DeedleFake

DeedleFake commented Dec 14, 2022

What actually happens if you attempt to do

v.hdrLen = *(*uint16)(unsafe.Pointer(&b[2]))

on something like MIPS and the alignment isn't correct? Do you just get junk data? Does it panic? That's not something I've run into before, despite working with some relatively low-level stuff on MIPS a fair while back.

Edit: Never mind. I think I answered my own question.

@cuiweixie
Contributor

cuiweixie commented Jan 24, 2023

I believe that we could avoid duplicating the full implementation with a touch of embedding:

var NativeEndian nativeEndian

type nativeEndian struct {
  littleEndian // or bigEndian
}

This works if the endianness is known at compile time, but there could be a problem when cross-compiling. For example, if the host machine is AMD64, we get

type nativeEndian struct {
    littleEndian
}

Now suppose this code has been generated (or otherwise added) into encoding/binary. That is fine if the target is GOARCH=arm64. But what if we need to cross-compile a program for a big-endian machine like arm64be? This may cause a problem: the native endianness becomes littleEndian when it should be bigEndian.

@ianlancetaylor
Contributor

@cuiweixie There isn't going to be any generated code here. We will use build tags to select which variant of nativeEndian to use. Cross-compilation won't be an issue.
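To illustrate the build-tag approach in general terms, here is a sketch of the little-endian half of a hypothetical two-file package. The GOARCH list is illustrative and may be incomplete (which is the very problem this issue describes), and the names Native and IsBigEndian are assumptions for the example, not the declarations that were added to the standard library. A sibling file with the complementary constraint would declare the big-endian variant.

```go
//go:build 386 || amd64 || arm || arm64 || loong64 || mips64le || mipsle || ppc64le || riscv64 || wasm

// This file is only compiled when GOARCH matches the constraint above.
// The constraint is evaluated against the target, not the build host,
// so cross-compilation selects the right file.
package main

import (
	"encoding/binary"
	"fmt"
)

// Native is the byte order of the target architecture (hypothetical name).
var Native binary.ByteOrder = binary.LittleEndian

// IsBigEndian is false for every GOARCH listed in the constraint.
const IsBigEndian = false

func main() {
	fmt.Println(Native.Uint16([]byte{0x01, 0x00}), IsBigEndian)
}
```

Because the selection happens per target at build time, no code generation step is involved and nothing is probed at run time.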

@cuiweixie
Contributor

@cuiweixie There isn't going to be any generated code here. We will use build tags to select which variant of nativeEndian to use. Cross-compilation won't be an issue.

Wow, that's a good idea. Is someone working on this? If not, maybe I can try to implement the proposal.

@gopherbot

Change https://go.dev/cl/463335 mentions this issue: cpu: add IsBigEndian

@robpike
Contributor

robpike commented Jan 25, 2023

Why do you need a special function IsBigEndian? You could just ask if NativeEndian == BigEndian.

@ianlancetaylor
Contributor

Part of this proposal is to add a new constant (not function) IsBigEndian in the x/sys/cpu package. Yes, one could do the same thing by importing the encoding/binary package and writing NativeEndian == BigEndian. The x/sys/cpu package already defines some processor-specific information.

@gopherbot

Change https://go.dev/cl/463218 mentions this issue: encoding/binary: add var NativeEndian

gopherbot pushed a commit to golang/sys that referenced this issue Jan 27, 2023
Copy the definition of x/sys/unix.isBigEndian to x/sys/cpu.

Updates golang/go#57237

Change-Id: Iefbf4303720445611de93b0a3ea365f8208c033b
Reviewed-on: https://go-review.googlesource.com/c/sys/+/463335
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Rob Pike <r@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Jan 27, 2023
Updates #57237

Change-Id: I149c8b7eeac91b87b5810250f96d48ca87135807
Reviewed-on: https://go-review.googlesource.com/c/go/+/463218
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: xie cui <523516579@qq.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
@gopherbot

Change https://go.dev/cl/463985 mentions this issue: encoding/binary: add String and GoString method to nativeEndian

matzf added a commit to matzf/scion that referenced this issue Jan 30, 2023
No longer needed.
Native byte order is not often needed, but will eventually show up in
standard library anyway (golang/go#57237).
gopherbot pushed a commit that referenced this issue Jan 30, 2023
Updates #57237

Change-Id: Ib626610130cae9c1d1aff5dd2a5035ffde0e127f
Reviewed-on: https://go-review.googlesource.com/c/go/+/463985
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: xie cui <523516579@qq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@dmitshur dmitshur modified the milestones: Backlog, Go1.21 Jan 30, 2023
bradfitz added a commit to josharian/native that referenced this issue Feb 2, 2023
bradfitz added a commit to tailscale/tailscale that referenced this issue Feb 2, 2023
See golang/go#57237

Change-Id: If47ab6de7c1610998a5808e945c4177c561eab45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
coadler pushed a commit to coder/tailscale that referenced this issue Feb 2, 2023
See josharian/native#3

Updates golang/go#57237

Change-Id: I238c04c6654e5b9e7d9cfb81a7bbc5e1043a84a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
coadler pushed a commit to coder/tailscale that referenced this issue Feb 2, 2023
To pull in insomniacslk/dhcp#484 to pull in u-root/uio#8

Updates golang/go#57237

Change-Id: I1e56656e0dc9ec0b870f799fe3bc18b3caac1ee4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
coadler pushed a commit to coder/tailscale that referenced this issue Feb 2, 2023
See golang/go#57237

Change-Id: If47ab6de7c1610998a5808e945c4177c561eab45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
matzf added a commit to matzf/scion that referenced this issue Feb 3, 2023
No longer needed.
Native byte order is not often needed, but will eventually show up in
standard library anyway (golang/go#57237).
matzf added a commit to scionproto/scion that referenced this issue Feb 9, 2023
The pkg/private/common, util and xtest packages have rather fuzzy scope,
and have accumulated a bit of cruft and unused or outdated
functionality. Clean this up a bit:

* pkg/private/common: 
    * remove unused constants
    * remove outdated error handling helpers and replace remaining use
    * remove NativeOrder and IsBigEndian: No longer needed.
      Native byte order is not often needed, but will eventually show up
      in standard library anyway (golang/go#57237).
* pkg/private/util:
    * remove unused helper functionality
    * remove Checksum: only used to compute reference value in slayers
      test cases. Use a simpler, non-optimized implementation for this.
      Closes #4262.
    * move RunsInDocker to private/env
    * move ASList to tools/integration
* pkg/private/xtest: 
    * remove unused helpers
    * remove unused Callback and MockCallback
    * replace FailOnErr with require.NoError
    * replace AssertErrorsIs with assert.ErrorIs


There are still more things to clean up in `pkg/private`, in future PRs,
in particular: 
* `common.ErrMsg` should be integrated in `serrors`
* `common.IFIDType` should be removed or renamed and moved somewhere
  more appropriate
* Merge the remainder of `util` and `common` 
* Clean up  `LinkType` and `RevInfo` from `pkg/private/ctrl`
johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 12, 2023
Updates golang#57237

Change-Id: Ib626610130cae9c1d1aff5dd2a5035ffde0e127f
Reviewed-on: https://go-review.googlesource.com/c/go/+/463985
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: xie cui <523516579@qq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Projects
Status: Accepted