Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syscall: EpollEvent struct incorrect for ppc64/ppc64le #15135

Closed
laboger opened this issue Apr 5, 2016 · 20 comments
Closed

syscall: EpollEvent struct incorrect for ppc64/ppc64le #15135

laboger opened this issue Apr 5, 2016 · 20 comments
Milestone

Comments

@laboger
Copy link
Contributor

@laboger laboger commented Apr 5, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    Fails on both golang master and golang 1.6
  2. What operating system and processor architecture are you using (go env)?
    ppc64le Ubuntu 15.04 and ppc64 RHEL 7.2 same failures
  3. What did you do?
    If possible, provide a recipe for reproducing the error.
    A complete runnable program is good.
    A link on play.golang.org is best.

Tried to build and run fsnotify tests from github.com/fsnotify.

  1. What did you expect to see?
    No build or test failures.
  2. What did you see instead?
    Many test failures with golang; most golang failures will pass with gccgo but one hang there too.

I can see the first problem is that the epoll_event structure being passed to the epoll syscalls is not the same size or contents between gccgo and golang but continuing to investigate and understand the hang mentioned.

To reproduce:

go get -u github.com/fsnotify/fsnotify
cd src/fsnotify/fsnotify
go test -c

On master I see this output:

./fsnotify.test -test.run=Poller
--- FAIL: TestPollerWithData (0.00s)
inotify_poller_test.go:85: expected poller to return true
--- FAIL: TestPollerWithClose (0.00s)
inotify_poller_test.go:119: expected poller to return true
--- FAIL: TestPollerWithWakeupAndData (0.00s)
inotify_poller_test.go:140: expected poller to return true
--- FAIL: TestPollerConcurrent (0.05s)
inotify_poller_test.go:197: expected true
FAIL

If the 'go' tool used corresponds to gccgo then none of these failures occur. Through debugging I have found that argument size for the event information passed to the syscall is incorrect in golang.

Interestingly, there is also some epoll syscalls made in the runtime package and those use a different structure (from defs_linux_ppc64le.go) which looks right:

type epollevent struct {
events uint32
pad_cgo_0 [4]byte
data [8]byte // unaligned uintptr
}
Here is the declaration in syscall/ztypes_linux_ppc64le.go which is not correct:

type EpollEvent struct {
Events uint32
Fd int32
Pad int32
}
I tried to make a change to linux_defs.go to my_epoll_event but that didn't change the generated structure for me so thought it was time to open the issue.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Apr 5, 2016

Try adding:

diff --git a/src/syscall/types_linux.go b/src/syscall/types_linux.go
index 9bccfca..3a105a4 100644
--- a/src/syscall/types_linux.go
+++ b/src/syscall/types_linux.go
@@ -117,6 +117,9 @@ struct my_epoll_event {
        // alignment requirements of EABI
        int32_t padFd;
 #endif
+#ifdef __powerpc64__
+       int32_t _pad;
+#endif
        int32_t fd;
        int32_t pad;
 };

And then regenerating it, then send a CL with both changes.

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 5, 2016

OK, this is what I tried, but didn't know I had to do something
explicitly to
regenerate the file. Do I just run mkall.sh with the right GOOS and GOARCH?

Also, this fixes the size but doesn't it matter that the members of the
structure don't really describe the fields correctly?

In runtime/defs_linux_ppc64le.go it looks right:

type epollevent struct {
events uint32
pad_cgo_0 [4]byte
data [8]byte // unaligned uintptr
}

On 04/05/2016 01:41 PM, Brad Fitzpatrick wrote:

Try adding:

diff --git a/src/syscall/types_linux.go b/src/syscall/types_linux.go
index 9bccfca..3a105a4 100644
--- a/src/syscall/types_linux.go
+++ b/src/syscall/types_linux.go
@@ -117,6 +117,9 @@ struct my_epoll_event {
// alignment requirements of EABI
int32_t padFd;
#endif
+#ifdef powerpc64

  • int32_t _pad;
    +#endif
    int32_t fd;
    int32_t pad;
    };

And then regenerating it, then send a CL with both changes.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#15135 (comment)

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Apr 5, 2016

Do I just run mkall.sh with the right GOOS and GOARCH?

Probably. Or whatever it says at the top of the file.

Also, this fixes the size but doesn't it matter that the members of the
structure don't really describe the fields correctly?

It's a C union. Can't do much better. Users can manipulate the two 32-bit fields as needed, using unsafe if necessary. It's consistent with what we do elsewhere.

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 6, 2016

I created this CL https://go-review.googlesource.com/#/c/21582/
One note is that when I tried to use mkall.sh it was flaky. I tried just running the script and it changed multiple files and even the only file I cared about had extra changes unrelated to my change in types_linux.go. In addition, it removed all the +build ppc64le tags in the generated z files which I believe must be there. So what I did was just manually update the one file that mattered.

I just realized I still need to add the ppc64 change.... will do that now.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Apr 6, 2016

Okay, we really need to figure out why the auto-generation isn't working then. Hand-editing the z* files is just asking for pain later.

/cc @minux @ianlancetaylor

@bradfitz bradfitz added this to the Go1.7 milestone Apr 6, 2016
@gopherbot
Copy link

@gopherbot gopherbot commented Apr 6, 2016

CL https://golang.org/cl/21582 mentions this issue.

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 6, 2016

If I just run mkall.sh like this:

GOOS=linux GOARCH=ppc64le ./mkall.sh from the syscall directory then there are changes made to these files:

    modified:   zerrors_linux_ppc64le.go
    modified:   zsyscall_linux_ppc64le.go
    modified:   zsysnum_linux_ppc64le.go
    modified:   ztypes_linux_ppc64le.go

Instead if I use -n to show me the commands, I think I only need this one:

GOARCH=ppc64le go tool cgo -godefs types_linux.go |go run mkpost.go >ztypes_linux_ppc64le.go

But if I run that one alone it adds more changes than just to the epoll struct and removes the build tag. I see that the mkpost.go change was recent so I tried without that and got a lot more changes to the file.

Would I get different changes to these files if I ran this on different distros?

@mundaym
Copy link
Member

@mundaym mundaym commented Apr 6, 2016

Would I get different changes to these files if I ran this on different distros?

I think so. I saw quite a few changes moving between RHEL 7 and Ubuntu 16.04. In particular I think the names of the padding fields has a habit of changing and there may be constants added/removed/changed.

But if I run that one alone it adds more changes than just to the epoll struct and removes the build tag. I see that the mkpost.go change was recent so I tried without that and got a lot more changes to the file.

I don't think you should see any fewer changes because of mkpost.go. It should only do 'gofmt' on ppc64. It would be nice to add the build tag generation to it. I think these been added manually: https://go-review.googlesource.com/#/c/10113/

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 6, 2016

Here is patch showing the changes to ztypes_linux_ppc64le.go if I just pull latest master and run:
GOARCH=ppc64 go tool cgo -godefs types_linux.go |go run mkpost.go >ztypes_linux_ppc64le.go
without changing any files.

golang-mkall.txt

This was done on RHEL7.2 ppc64, but the same kind of patch shows up on Ubuntu 14.04 & 14.10 ppc64le. I see if I can try a later Ubuntu.

If I remove the pipe through mkpost.go, the differences are due to formatting.

@mundaym
Copy link
Member

@mundaym mundaym commented Apr 6, 2016

The struct ptrace* {} declarations were added for s390x but are harmless because they don't get exported on ppc64.

The fields that are removed are of length 0... I guess technically they shouldn't be removed due to API promises, but I don't see what practical use they could have (marker for the end of the struct perhaps?).

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 6, 2016

@bradfitz I missed your earlier request to update the CLs with the regenerated files. I just did the update as you requested earlier but I did change the patch so it didn't remove the build tags.

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 6, 2016

@bradfitz In mkpost.go there is this code, only done for s390x:

            // Export the types of PtraceRegs fields.
            re := regexp.MustCompile("ptrace(Psw|Fpregs|Per)")
            s = re.ReplaceAllString(s, "Ptrace$1")

            // Replace padding fields inserted by cgo with blank identifiers.
            re = regexp.MustCompile("Pad_cgo[A-Za-z0-9_]*")
            s = re.ReplaceAllString(s, "_")

            // Replace other unwanted fields with blank identifiers.
            re = regexp.MustCompile("X_[A-Za-z0-9_]*")
            s = re.ReplaceAllString(s, "_")

            // Force the type of RawSockaddr.Data to [14]int8 to match
            // the existing gccgo API.
            re = regexp.MustCompile("(Data\\s+\\[14\\])uint8")
            s = re.ReplaceAllString(s, "${1}int8")

I believe you were asking in the CL to replace unwanted fields with _ as is done in the 3rd regexp above, so I can move that below the if test and then it will be done for all GOARCHes instead of just s390x. But what about the replacement of padding fields with _ as in the 2nd regexp above? Should that be done for all GOARCHes too?

@clnperez
Copy link

@clnperez clnperez commented Apr 7, 2016

FYI, I've just rebuilt go using Lynn's patch to see if it fixed a docker hang we were seeing (on ppc64le), and it did. 👍

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 12, 2016

I was out for a few days.... This is a high priority for us because it affects the upstream Docker CI build on ppc64le and would like to get the golang fix upstream so Ubuntu can pick it up. I have updated the mkpost.go change according to the earlier suggestion by @bradfitz. Let me know what else needs to be done so we can get this in ASAP. Thanks.

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 13, 2016

Based on Ian's latest comments I have reverted my patch back to the minimum change needed to ztypes_linux_ppc64le.go and ztypes_linux_ppc64.go to fix the EpollEvent structure on these GOARCHes along with the change in types_linux.go. Trying to use the file generation scripts result in changes unrelated to this problem and introduce incompatibilities. I can open a new issue for the file generation problem if that is what should be done.

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Apr 13, 2016

(Sorry, just asked on the CL what you answered here.)

I don't really mind if you revert unwanted changes to the ztypes files, but I do want to ensure that the changes you do make are ones produced by mkall.sh. Otherwise you are introducing yet another problem to deal with. I'm OK with leaving the old ones alone, but not with introducing new ones.

@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 13, 2016

Not quite. Let me change the ztypes_* files so they match what gets generated, revert what shouldn't be there, and retest on both to make sure it's all good.

@gopherbot gopherbot closed this in 44f80f6 Apr 13, 2016
@laboger
Copy link
Contributor Author

@laboger laboger commented Apr 13, 2016

Can this be backported to go1.6?

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Apr 13, 2016

@laboger, no promises, but filed #15288 to consider it.

@gopherbot
Copy link

@gopherbot gopherbot commented Apr 19, 2016

CL https://golang.org/cl/22207 mentions this issue.

gopherbot pushed a commit that referenced this issue Apr 19, 2016
The existing epoll_event structure used by many of
the epoll_* syscalls was defined incorrectly
for use with ppc64le & ppc64 in the syscall
directory.  This resulted in the caller getting
incorrect information on return from these
syscalls.  This caused failures in fsnotify as
well as builds with upstream Docker.  The
structure is defined correctly in gccgo.

This adds a pad field that is expected for
these syscalls on ppc64le, ppc64.
Fixes #15135

Fixes #15288

Change-Id: If7e8ea9eb1d1ca5182c8dc0f935b334127341ffd
Reviewed-on: https://go-review.googlesource.com/21582
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-on: https://go-review.googlesource.com/22207
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@golang golang locked and limited conversation to collaborators Apr 19, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
6 participants
You can’t perform that action at this time.