Raspberry Pi infra failures #1895

Closed
Trott opened this issue Aug 21, 2019 · 8 comments

Member

Trott commented Aug 21, 2019

I posted this to IRC but am posting here for tracking/visibility:

Four of the 10 Pi fanned sub-jobs are perma-red for reasons that look build-related, but I'm not 100% sure what's going on or how to fix it. It's 11 PM in eastern Australia, so I'm guessing @rvagg may not be available to take a look right now. Anyone else?

Here's a console from one of the failures:

02:40:43 Started by upstream project "node-test-binary-arm-12+" build number 1947
02:40:43 originally caused by:
02:40:43  Started by upstream project "node-test-commit-arm-fanned" build number 10267
02:40:43  originally caused by:
02:40:43   Started by upstream project "node-test-commit" build number 30883
02:40:43   originally caused by:
02:40:43    Started by upstream project "node-daily-master" build number 1652
02:40:43    originally caused by:
02:40:43     Started by timer
02:40:43 Running as SYSTEM
02:40:43 [EnvInject] - Loading node environment variables.
02:40:43 Building remotely on test-requireio_jasnell-debian9-armv7l_pi2-1 (arm-Raspbian-9.9 pi2-docker Raspbian 9.9 pi2-raspbian-stretch arm-Raspbian arm Raspbian-9.9) in workspace /home/iojs/build/workspace/node-test-binary-arm
02:40:46 [node-test-binary-arm] $ /bin/sh -xe /tmp/jenkins2748324996346902169.sh
02:40:46 + set +x
02:40:46 Wed Aug 21 09:40:46 UTC 2019
02:40:49 + pgrep node
02:40:49 12292
02:40:50 [node-test-binary-arm] $ /bin/bash -ex /tmp/jenkins3993577041375848650.sh
02:40:50 + rm -rf RUN_SUBSET
02:40:50 + case $label in
02:40:50 + REF=cc-armv7
02:40:50 + git --version
02:40:50 git version 2.11.0
02:40:50 + git init
02:40:50 Reinitialized existing Git repository in /home/iojs/build/workspace/node-test-binary-arm/.git/
02:40:50 + git fetch --no-tags file:///home/iojs/.ccache/node.shared.reference +refs/heads/jenkins-node-test-commit-arm-fanned-f70261fb3089b8acf5f68584ef0285a1c11f54fd-binary-pi1p/cc-armv7:refs/remotes/jenkins_tmp
02:41:02 From file:///home/iojs/.ccache/node.shared.reference
02:41:02  + 806988af32...80e4fe38b8 jenkins-node-test-commit-arm-fanned-f70261fb3089b8acf5f68584ef0285a1c11f54fd-binary-pi1p/cc-armv7 -> jenkins_tmp  (forced update)
02:41:02 
02:41:02 real	0m11.807s
02:41:02 user	0m9.117s
02:41:02 sys	0m1.152s
02:41:02 + ps -ef
02:41:02 + awk '{print $2}'
02:41:02 + grep '\[node\] <defunct>'
02:41:02 + xargs -rl kill
02:41:02 + rm -f ****
02:41:03 + git checkout -f refs/remotes/jenkins_tmp
02:41:19 Warning: you are leaving 2 commits behind, not connected to
02:41:19 any of your branches:
02:41:19 
02:41:19   806988af32 added binaries
02:41:19   ee113765a1 deps: update npm to 6.11.1
02:41:19 
02:41:19 If you want to keep them by creating a new branch, this may be a good time
02:41:19 to do so with:
02:41:19 
02:41:19  git branch <new-branch-name> 806988af32
02:41:19 
02:41:19 HEAD is now at 80e4fe38b8... added binaries
02:41:19 
02:41:19 real	0m16.738s
02:41:19 user	0m2.589s
02:41:19 sys	0m13.116s
02:41:19 + git reset --hard
02:41:24 HEAD is now at 80e4fe38b8 added binaries
02:41:24 
02:41:24 real	0m4.732s
02:41:24 user	0m2.337s
02:41:24 sys	0m1.473s
02:41:24 + git clean -fdx
02:41:28 warning: failed to remove out/Release/.nfs000000000038260500000701
02:41:28 Removing out/junit
02:41:28 
02:41:28 real	0m4.485s
02:41:28 user	0m0.587s
02:41:28 sys	0m1.012s
02:41:29 Build step 'Execute shell' marked build as failure
02:41:30 [Text Finder] Looking for pattern '^not ok' in the files at '*.tap'
02:41:30 [Text Finder] File set '*.tap' is empty
02:41:30 Performing Post build task...
02:41:30 Match found for : : True
02:41:30 Logical operation result is TRUE
02:41:30 Running script  : #/bin/bash -x +e
02:41:30 
02:41:30 mkdir out/junit || true
02:41:30 tap2junit -i test.tap -o out/junit/test.xml || true
02:41:30 tap2junit -i cctest.tap -o out/junit/cctest.xml || true
02:41:31 [node-test-binary-arm] $ /bin/sh -xe /tmp/jenkins4434279691825285073.sh
02:41:31 + mkdir out/junit
02:41:31 + tap2junit -i test.tap -o out/junit/test.xml
02:41:32 usage: tap2junit [-h] --input [INPUT] --output [OUTPUT]
02:41:32 tap2junit: error: argument --input/-i: can't open 'test.tap': [Errno 2] No such file or directory: 'test.tap'
02:41:32 + true
02:41:32 + tap2junit -i cctest.tap -o out/junit/cctest.xml
02:41:33 usage: tap2junit [-h] --input [INPUT] --output [OUTPUT]
02:41:33 tap2junit: error: argument --input/-i: can't open 'cctest.tap': [Errno 2] No such file or directory: 'cctest.tap'
02:41:33 + true
02:41:34 POST BUILD TASK : SUCCESS
02:41:34 END OF POST BUILD TASK : 0
02:41:34 Recording test results
02:41:34 ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error?
02:41:34 Collecting metadata...
02:41:34 Metadata collection done.
02:41:34 Notifying upstream projects of job completion
02:41:34 Finished: FAILURE
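In the console above, the last traced command before "Build step 'Execute shell' marked build as failure" is `git clean -fdx`, which warned that it could not remove an `out/Release/.nfs…` file. That suggests, though the log alone does not prove it, that the non-zero exit from `git clean` under `bash -ex` is what ended the step. Here is a minimal sketch for pulling that command out of a saved console. It assumes the console text has been saved locally and that lines carry the `HH:MM:SS` prefix and `+ ` shell-trace prefix seen above. The function name `last_command_before_failure` is hypothetical.

```python
import re

# Console lines in this thread look like "HH:MM:SS <text>" (assumption
# based on the log quoted above; other Jenkins setups may differ).
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2} ?")
FAILURE_MARKER = "marked build as failure"


def last_command_before_failure(console_text):
    """Return (command, output_lines) for the last `+ cmd` trace line
    printed before the failure marker, or None if no marker is found."""
    command = None
    output = []
    for raw in console_text.splitlines():
        # Drop the leading timestamp so the shell trace prefix is at column 0.
        line = TIMESTAMP.sub("", raw, count=1)
        if FAILURE_MARKER in line:
            return command, output
        if line.startswith("+ "):
            # A new traced command starts; forget the previous one's output.
            command = line[2:]
            output = []
        elif command is not None and line.strip():
            output.append(line)
    return None
```

Fed the console above, this would report `git clean -fdx` together with its `warning: failed to remove out/Release/.nfs000000000038260500000701` line, which is a quicker starting point than reading the whole log.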
Trott added the incident label Aug 21, 2019
Member Author

Trott commented Aug 21, 2019

Anyone knowledgeable and available to look at this? @nodejs/build

Member Author

Trott commented Aug 21, 2019

Actually, this seems to be confined to specific hosts, maybe? I'm going to start taking them offline and listing them here...

Member Author

Trott commented Aug 21, 2019

Took these offline:

test-requireio_jasnell-debian9-armv7l_pi2-1
test-requireio_svincent-debian9-armv7l_pi2-1
test-requireio_kahwee-debian9-arm64_pi3-1
test-requireio_davglass-debian9-arm64_pi3-1

Starting to doubt taking the nodes offline is going to help, but weirder stuff has happened....

Member Author

Trott commented Aug 21, 2019

Next up, compiles for the fanned jobs are failing.

16:02:07 Started by upstream project "node-cross-compile" build number 25261
16:02:07 originally caused by:
16:02:07  Started by upstream project "node-test-commit-arm-fanned" build number 10297
16:02:07  originally caused by:
16:02:07   Started by upstream project "node-test-commit" build number 30912
16:02:07   originally caused by:
16:02:07    Started by upstream project "node-test-pull-request" build number 25167
16:02:07    originally caused by:
16:02:07     Started by user Rich Trott
16:02:07     Started by upstream project "node-test-pull-request" build number 25155
16:02:07     originally caused by:
16:02:07      Started by user Rich Trott
16:02:07      Started by upstream project "node-test-pull-request" build number 25147
16:02:07      originally caused by:
16:02:07       Started by user Luigi Pinca
16:02:07 Running as SYSTEM
16:02:07 [EnvInject] - Loading node environment variables.
16:02:07 Building remotely on test-joyent-ubuntu1604_arm_cross-x64-1 (Ubuntu cc-armv7 cross-compiler-armv7-gcc-6 16.04 cross-compiler-armv7-gcc-4.8 cross-compiler-armv7-gcc-4.9 Ubuntu-16.04 amd64-Ubuntu amd64-Ubuntu-16.04 cross-compiler-armv7-gcc-4.9.4 cross-compiler-armv6-gcc-4.8 cross-compiler-armv6-gcc-4.9 cc-armv6 cross-compiler-armv6-gcc-4.9.4 amd64) in workspace /home/iojs/build/workspace/node-cross-compile
16:02:07 using credential dea9092d-214b-471a-be5d-5343dd7755c1
16:02:07  > git rev-parse --is-inside-work-tree # timeout=10
16:02:07 Fetching changes from the remote Git repository
16:02:08  > git config remote.jenkins_tmp.url binary_tmp@169.60.150.88:binary_tmp.git # timeout=10
16:02:08 Cleaning workspace
16:02:08  > git rev-parse --verify HEAD # timeout=10
16:02:08 Resetting working tree
16:02:08  > git reset --hard # timeout=10
16:02:08  > git clean -fdx # timeout=10
16:02:09 Fetching upstream changes from binary_tmp@169.60.150.88:binary_tmp.git
16:02:09  > git --version # timeout=10
16:02:09 using GIT_SSH to set credentials ci@iojs.org GitHub
16:02:09  > git fetch --tags --progress binary_tmp@169.60.150.88:binary_tmp.git +refs/heads/jenkins-node-test-commit-arm-fanned-6c9a53b07ae255c056a5fa3f4f0562006871ee52:refs/remotes/jenkins_tmp/_jenkins_local_branch # timeout=20
16:02:16 Checking out Revision cb92ba333d673a36cdc6dbb3748f92c25166379e (refs/remotes/jenkins_tmp/_jenkins_local_branch)
16:02:16  > git config core.sparsecheckout # timeout=10
16:02:16  > git checkout -f cb92ba333d673a36cdc6dbb3748f92c25166379e # timeout=10
16:02:23 java.lang.OutOfMemoryError: GC overhead limit exceeded
16:02:23 Caused: java.io.IOException: Remote call on JNLP4-connect connection from 165.225.136.6/165.225.136.6:42752 failed
16:02:23 	at hudson.remoting.Channel.call(Channel.java:963)
16:02:23 	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283)
16:02:23 	at com.sun.proxy.$Proxy96.withRepository(Unknown Source)
16:02:23 	at org.jenkinsci.plugins.gitclient.RemoteGitImpl.withRepository(RemoteGitImpl.java:237)
16:02:23 	at hudson.plugins.git.GitSCM.printCommitMessageToLog(GitSCM.java:1263)
16:02:23 	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1236)
16:02:23 	at hudson.scm.SCM.checkout(SCM.java:504)
16:02:23 	at hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
16:02:23 	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
16:02:23 	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
16:02:23 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
16:02:23 	at hudson.model.Run.execute(Run.java:1818)
16:02:23 	at hudson.matrix.MatrixRun.run(MatrixRun.java:153)
16:02:23 	at hudson.model.ResourceController.execute(ResourceController.java:97)
16:02:23 	at hudson.model.Executor.run(Executor.java:429)
16:02:23 Skipped archiving because build is not successful
16:02:23 Collecting metadata...
16:02:23 Metadata collection done.
16:02:23 Notifying upstream projects of job completion
16:02:23 Finished: FAILURE

I'll try rebooting test-joyent-ubuntu1604_arm_cross-x64-1 to see if that fixes it, unless someone who knows what they're doing is around and says something before I get to it...
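The two consoles quoted in this thread fail in visibly different ways: the Pi job stops at a `git clean` warning about an `.nfs…` file, while the cross-compile job dies with `java.lang.OutOfMemoryError` during a remote call. Here is a rough sketch for sorting saved console output by those signatures. The labels, the `SIGNATURES` table and the `classify` function are all hypothetical. Only the matched substrings come from the logs above.

```python
import re

# Hypothetical signature table. Labels are illustrative; the patterns
# match text taken from the two console logs quoted in this thread.
SIGNATURES = [
    ("stale-nfs-file", re.compile(r"failed to remove .*\.nfs[0-9a-f]+")),
    ("java-oom", re.compile(r"java\.lang\.OutOfMemoryError")),
    ("remoting-call-failed", re.compile(r"Remote call on .* failed")),
]


def classify(console_text):
    """Return the labels whose pattern appears anywhere in the console
    text, in the order they are listed in SIGNATURES."""
    return [label for label, pattern in SIGNATURES
            if pattern.search(console_text)]
```

A match only says which message appeared, not why. For instance, it cannot tell whether rebooting the host is the right fix for the out-of-memory case.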

Member Author

Trott commented Aug 21, 2019

OK, after reboot, the compiler job is getting further along with the compiling, so I'll take that as a good sign....

Member Author

Trott commented Aug 21, 2019

OMG, it's green! A beautiful green! https://ci.nodejs.org/job/node-test-pull-request/25168/

Not sure if it makes sense to bring the four Pi devices back online and see what happens?

Member

rvagg commented Aug 22, 2019

Having a bunch of problems with those Pis and more, so let's leave them offline for now. I think some dodgy stuff has come into recent Raspbian updates that is messing up the NFS boot. I'll have to allocate some quality time to figuring it out, but for now I'm even afraid to reboot because they don't come back up!

Member Author

Trott commented Dec 14, 2019

I don't think this is an issue anymore so I'll close it.

2 participants