runtime: go1.8 build fails when run as a Jenkins job (was #18551) #19203

Closed
hartzell opened this issue Feb 20, 2017 · 17 comments

@hartzell
commented Feb 20, 2017

This is a continuation of #18551. I can now reproduce the problem on a smaller system and w/out involving Spack.

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

Using go-1.4 (built via Spack's go-bootstrap package) to build v1.8.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/isilon/Analysis/scratch/hartzelg/linuxbrew/Cellar/go/1.7.5/libexec"
GOTOOLDIR="/isilon/Analysis/scratch/hartzelg/linuxbrew/Cellar/go/1.7.5/libexec/pkg/tool/linux_amd64"
CC="/usr/bin/cc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build542396042=/tmp/go-build -gno-record-gcc-switches"
CXX="/usr/bin/c++"
CGO_ENABLED="1"

What did you do?

I had been using Spack's go package to build go and seeing the crashes reported in #18551. I initially attributed it to the size of the machine and considered Spack a second candidate culprit.

I have replaced Spack with this shell script (stop giggling...):

#!/bin/bash

set -x
set -e
set -u

cd /path/to/tmp
rm -rf poodle
mkdir poodle
cd poodle
wget https://storage.googleapis.com/golang/go1.8.src.tar.gz
tar -xzf go1.8.src.tar.gz
cd go/src
export GOROOT_BOOTSTRAP=/path/to/go-bootstrap-1.4
./all.bash

I can run this script from the command line successfully.

I have a Jenkins job set up with this snippet of Job DSL:

job('go-shell-script') {
  concurrentBuild(true)
  label('themachine')
  steps {
    shell '''#!/bin/bash
~me/build-go.sh
'''
  }
  publishers {
    mailer('me@work.com', false, true)
  }
}

The Jenkins master->slave connection is via SSH and it uses my account.

When I build it, I see the same symptoms I've been seeing in #18551. The output is in this gist.

What did you expect to see?

All tests passing in both command line and Jenkins build.

What did you see instead?

See the gist referenced above.

Conclusions

Running the build remotely via Jenkins is now my primary suspect.

@ALTree

Member

commented Feb 20, 2017

Another runtime test timeout: #18442 (this one has more details regarding the probable cause of the issue)
Another runtime test timeout: #19196 (also jenkins)

We probably should keep the first one and close the other (and this one)

@hartzell

Author

commented Feb 20, 2017

@ALTree -- what convinces you that the Gentoo sandboxes have the same root problem as running via an ssh connection from Jenkins?

The Common Pitfalls section of the SSH Slaves plugin page says that in the end the Jenkins job script ends up being run like:

[...] but OpenSSH runs this with "bash -c command ..." (or whatever your login shell is.)

#18442 seems to be fixated on the sandboxing restrictions, but I don't think the shell invocation above is sandboxing anything. I'm still trying to understand what is happening, though...

@ALTree

Member

commented Feb 20, 2017

shell invocation above is sandboxing anything

My suspicion was that Jenkins was transparently sandboxing executions in a similar manner, but I'm absolutely not a Jenkins expert, so please disregard my comment if you're convinced it's not useful.

@ALTree ALTree changed the title from "go v1.8 build fails when run as a Jenkins job (was #18551)" to "runtime: go1.8 build fails when run as a Jenkins job (was #18551)" Feb 20, 2017

@ALTree ALTree added this to the Go1.9 milestone Feb 20, 2017

@hartzell

Author

commented Feb 20, 2017

I'm not convinced either way, just worried. I'd be surprised if Jenkins SSH slaves were intentionally sandboxing things (you'd think they'd brag about it).

@ALTree

Member

commented Feb 20, 2017

Can you add a t.Skip() to TestCrashDumpsAllThreads in go/src/runtime/crash_unix_test.go and try to build?

@ALTree

Member

commented Feb 20, 2017

Yeah this is a dup of #19196. Closing since that one is older.

@ALTree ALTree closed this Feb 20, 2017

@hartzell

Author

commented Feb 20, 2017

@ALTree -- Testing as you requested.

The stack traces that I get don't always mention TestCrashDumpsAllThreads, e.g. this one (from #18551):

https://gist.github.com/hartzell/232cf1a1624be49e278ea4f641eea01f

@ALTree ALTree reopened this Feb 20, 2017

@hartzell

Author

commented Feb 20, 2017

Skipping that test gave me two successful builds in a row from Jenkins.

@ALTree

Member

commented Feb 20, 2017

OK, leaving this open until we're sure that the underlying issue is the same as in the other threads that I linked here.

@hartzell

Author

commented Feb 20, 2017

The gist that does not mention TestCrashDumpsAllThreads does mention TestGdbBacktrace, which seems to be involved in #18442.

@aclements

Member

commented Jun 8, 2017

@hartzell, are you still having this issue? If so, I'd like to figure out if the test is actually hanging or just being slow. Could you set the environment variable GO_TEST_TIMEOUT_SCALE=10 and see if you can still reproduce?
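
For reference, here is a minimal sketch of launching the build with that variable set while leaving the rest of the environment untouched. The `withEnv` helper and the command-line argument naming the `go/src` directory are assumptions for illustration. Only `GO_TEST_TIMEOUT_SCALE=10` and `./all.bash` come from this thread:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withEnv returns a copy of env with key set to value, dropping any
// earlier entries for the same key. Hypothetical helper for this sketch.
func withEnv(env []string, key, value string) []string {
	prefix := key + "="
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if !strings.HasPrefix(kv, prefix) {
			out = append(out, kv)
		}
	}
	return append(out, prefix+value)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pass the path to the extracted go/src directory")
		return
	}
	cmd := exec.Command("./all.bash")
	cmd.Dir = os.Args[1] // assumed to be the go/src directory from the build script
	cmd.Env = withEnv(os.Environ(), "GO_TEST_TIMEOUT_SCALE", "10")
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "all.bash failed:", err)
		os.Exit(1)
	}
}
```

From a shell, the same effect comes from prefixing the command: `GO_TEST_TIMEOUT_SCALE=10 ./all.bash`.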

The gist that does not mention TestCrashDumpsAllThreads does mention TestGdbBacktrace, which seems to be involved in #18442.

It mentions TestGdbBacktrace, but TestGdbBacktrace isn't actually the running test. It's just blocked on testing-internal mechanisms waiting to be skipped.

@aclements aclements self-assigned this Jun 8, 2017

@hartzell

Author

commented Jun 9, 2017

@aclements -- Repeating my comment from another thread: I'm on vacation and miles/worlds away from work and access to this. I'll see what I can do when I return in a couple of Mondays.

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9 Jun 9, 2017

@bradfitz

Member

commented Jun 9, 2017

Punting to Go 1.10. This isn't a regression from Go 1.8 to Go 1.9 anyway.

@aclements

Member

commented Jun 9, 2017

@hartzell, no worries. Thanks in advance!

@hartzell

Author

commented Jul 16, 2017

@aclements -- I have access to this large machine again, but at the moment it is involved in a bug hunt. I'll keep looking for an opportunity to recreate the symptoms and follow up.

@chlunde

Contributor

commented Jul 16, 2017

I think this can be closed as a duplicate of #19196, as this is also when spawning from a Java process.

@chlunde chlunde marked this as a duplicate of #19196 Jul 16, 2017

@hartzell

Author

commented Jul 16, 2017

Closing it seems reasonable to me too.

@chlunde has a more amenable test case too.

@golang golang locked and limited conversation to collaborators Jul 16, 2018
