net: pipe2: too many open files on s390x builder #22759

Closed
mundaym opened this issue Nov 16, 2017 · 6 comments

@mundaym
Member

commented Nov 16, 2017

The build on the s390x builder has, with one exception, been failing since CL 77670. The net package tests fail with the error "pipe2: too many open files", which implies the tests are hitting the per-process open file descriptor limit. I don't see any obvious reason why this CL would increase the number of open file descriptors.

ulimit -n is 1024 on the builder, which I think should be enough (and is presumably the limit on other builders too?). I haven't managed to recreate the issue by hand, either on the builder or locally. The net package tests still pass when run manually on the builder with ulimit -n 32 (other tests do start to fail when ulimit -n is reduced to 128 and below). Perhaps I am missing some part of the environment?

I'm not sure what to try next other than increasing ulimit -n on the builder.
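
As a diagnostic aid (an illustration, not something used in the thread), the following Go sketch reports the limit a process actually runs under and how many descriptors it currently holds. The function names fdLimit and openFDs are hypothetical, and the /proc/self/fd approach is Linux-specific.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// fdLimit returns the soft and hard RLIMIT_NOFILE values for this process.
func fdLimit() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, 0, err
	}
	return lim.Cur, lim.Max, nil
}

// openFDs counts the entries in /proc/self/fd (Linux only). The count
// includes the descriptor opened to read the directory itself.
func openFDs() (int, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	soft, hard, err := fdLimit()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getrlimit:", err)
		os.Exit(1)
	}
	n, err := openFDs()
	if err != nil {
		fmt.Fprintln(os.Stderr, "counting fds:", err)
		os.Exit(1)
	}
	fmt.Printf("soft limit %d, hard limit %d, open descriptors %d\n", soft, hard, n)
}
```

Logging these values from inside a failing test run would show whether the environment differs from a manual run, which is the open question here.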

@mundaym

Member Author

commented Nov 16, 2017

/cc @bradfitz

@mundaym

Member Author

commented Nov 16, 2017

I doubled the file descriptor limit to 2048 and rebooted the machine, but unfortunately it is still failing. Will investigate further tomorrow.

@mundaym

Member Author

commented Nov 20, 2017

Reverting CL 77670 (in CL 78515) fixed the builder, so the failures do appear to be directly related to that CL somehow.

I'm still confused as to why the issue is only reproducible when the tests are executed by the buildlet.

A couple of questions about the buildlet I haven't managed to figure out yet:

  • How is the buildlet actually executing the tests (just fork/exec or some sort of containerization)?
  • What version of Go is the buildlet built with? (I'm wondering if it is still the original development version we used to bootstrap the tests)
@bradfitz

Member

commented Nov 20, 2017

> How is the buildlet actually executing the tests (just fork/exec or some sort of containerization)?

Just uses os/exec: https://github.com/golang/build/blob/3da79c2/cmd/buildlet/buildlet.go#L866
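
Since the buildlet starts commands with os/exec, the child inherits resource limits from the buildlet process rather than from an interactive shell. A minimal sketch for checking what a child started this way sees follows; it is an illustration, not the buildlet's code, and it assumes a POSIX sh is on PATH. The name childFDLimit is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// childFDLimit runs `ulimit -n` in a child shell started via os/exec
// and returns whatever the shell prints (a number or "unlimited").
func childFDLimit() (string, error) {
	out, err := exec.Command("sh", "-c", "ulimit -n").Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	limit, err := childFDLimit()
	if err != nil {
		fmt.Fprintln(os.Stderr, "running child:", err)
		os.Exit(1)
	}
	fmt.Println("child sees ulimit -n =", limit)
}
```

Running this from the same service environment as the buildlet, and comparing it with the value from a login shell, is one way to see whether the two environments differ.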

> What version of Go is the buildlet built with? (I'm wondering if it is still the original development version we used to bootstrap the tests)

It was last updated Apr 2, 2017.

$ curl -I https://storage.googleapis.com/go-builder-data/buildlet.linux-s390x
HTTP/1.1 200 OK
X-GUploader-UploadID: AEnB2Upqi7klVyV65VOqt9Ghr_PY0pyM2O4doDnxva0YppF_icz4oeRv2YsNwa_hgTn3wEh_MZgJLo1NhWU9BDB1MyXzMTVScg
Date: Mon, 20 Nov 2017 17:37:49 GMT
Cache-Control: no-cache
Expires: Tue, 20 Nov 2018 17:37:49 GMT
Last-Modified: Sun, 02 Apr 2017 16:45:08 GMT
ETag: "71a92bad25e0b97d292528488686e781"
x-goog-generation: 1491151508765000
x-goog-metageneration: 1
x-goog-stored-content-encoding: identity
x-goog-stored-content-length: 12244931
Content-Type: application/octet-stream
x-goog-hash: crc32c=TtVQug==
x-goog-hash: md5=cakrrSXguX0pJShIhobngQ==
x-goog-storage-class: STANDARD
Accept-Ranges: bytes
Content-Length: 12244931
Server: UploadServer

Looks like Go 1.8 era:

$ strings buildlet.linux-s390x  | grep -i go1.8 | wc -l
928
$ strings buildlet.linux-s390x  | grep -i go1.9 | wc -l
0

Want me to update it from master?

@bradfitz

Member

commented Nov 20, 2017

Btw, you can run the buildlet locally and then test against it by using the golang.org/x/build/cmd/gomote command. There's a special case for a builder type name to refer to a specific buildlet process for development:

// clientAndConfig returns a buildlet.Client and its build config for
// a named remote buildlet (a buildlet connection owned by the build
// coordinator).
//
// As a special case, if name contains '@', the name is expected to be
// of the form <build-config-name>@ip[:port]. For example,
// "windows-amd64-race@10.0.0.1".

So you could say gomote push linux-s390x-ibm@localhost:8080 and gomote run -debug linux-s390x-ibm@localhost:8080 go/src/all.bash, etc.

But updating the binary and hoping it works might be easier, especially if you remember some old fd leak that's probably fixed since then.

@ianlancetaylor

Contributor

commented Mar 29, 2018

The problematic CL 77670 was reverted by CL 78515. The overall problem was, I believe, fixed by CL 100840. I'm going to close this as fixed.

@golang golang locked and limited conversation to collaborators Mar 29, 2019
