5 of 5 linux/ppc64 builders are missing #41742

Closed
dmitshur opened this issue Oct 1, 2020 · 7 comments

Member

@dmitshur dmitshur commented Oct 1, 2020

Contributor

@laboger laboger commented Oct 2, 2020

I am trying to contact OSU for information on this.

Contributor

@ceseo ceseo commented Oct 2, 2020

I see the system is responding to ping:

PING 140.211.169.164 (140.211.169.164): 56 data bytes
64 bytes from 140.211.169.164: icmp_seq=0 ttl=48 time=245.184 ms
64 bytes from 140.211.169.164: icmp_seq=1 ttl=48 time=256.996 ms
64 bytes from 140.211.169.164: icmp_seq=2 ttl=48 time=302.820 ms

Maybe it's just a matter of logging in and restarting the builder?

Contributor

@laboger laboger commented Oct 2, 2020

OSU ticket #31316 was created for this. They say the machine is up and reachable over SSH. There were some issues yesterday that may have caused some instances to be rebooted. As Carlos noted, the ppc64 builder might just need to be restarted. I don't have a key to get on those machines.


@ramereth ramereth commented Oct 2, 2020

@bradfitz please let me know if I can add anyone's key to the machine so that we can get it going again.

Contributor

@cagedmantis cagedmantis commented Oct 2, 2020

@ramereth Thanks for your work on this. We are able to log in.


@ramereth ramereth commented Oct 2, 2020

The service seems to be running; however, it's erroring out with the following:

Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Creating go-be-%d01 ...
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Error creating go-be-%d01: exit status 125,
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: See 'docker run --help'.
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Creating go-be-%d02 ...
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Error creating go-be-%d02: exit status 125,
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: See 'docker run --help'.
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Creating go-be-%d03 ...
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Error creating go-be-%d03: exit status 125,
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: See 'docker run --help'.
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Creating go-be-%d04 ...
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:25 Error creating go-be-%d04: exit status 125,
Oct 02 14:30:25 go-be-xenial-3 rundockerbuildlet[2975]: See 'docker run --help'.
Oct 02 14:30:26 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:26 Creating go-be-%d05 ...
Oct 02 14:30:26 go-be-xenial-3 rundockerbuildlet[2975]: 2020/10/02 14:30:26 Error creating go-be-%d05: exit status 125,
Oct 02 14:30:26 go-be-xenial-3 rundockerbuildlet[2975]: See 'docker run --help'.

Here's what the systemd unit for that service looks like:

[Unit]
Description=Run Buildlets in Docker
After=network.target

[Install]
WantedBy=network-online.target

[Service]
Type=simple
# The (-n * -cpu) values must currently be <= number of host cores.
# The host has 10 cores, so the -n=5 (five containers) * -cpu=2 (two CPUs per container) == 10.
# -memory=3.9g doesn't work with crun; TODO: tiborvass is investigating
ExecStart=/usr/local/bin/rundockerbuildlet -basename=go-be-%d -image=golang/builder -n=5 -cpu=2 -memory= --env=host-linux-ppc64-osu
Restart=always
RestartSec=2
StartLimitInterval=0

Contributor

@cagedmantis cagedmantis commented Oct 2, 2020

@ramereth I was able to get it back up and running by changing the -basename flag in ExecStart from go-be-%d to ppc64_:

ExecStart=/usr/local/bin/rundockerbuildlet -basename=ppc64_ -image=golang/builder -n=5 -cpu=2 -memory= --env=host-linux-ppc64-osu

Thanks again for your help.
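
For anyone landing here later, a minimal sketch of why the old basename plausibly failed. It rests on two assumptions that are not confirmed in this thread: that container names are built as the basename followed by a two-digit index (inferred from the `go-be-%d01` names in the log above), and that Docker only accepts names matching `[a-zA-Z0-9][a-zA-Z0-9_.-]`, so a literal `%` makes `docker run` refuse the name. `containerName` and `validName` are hypothetical helpers for illustration, not functions from rundockerbuildlet.

```go
package main

import (
	"fmt"
	"regexp"
)

// restrictedName approximates the character set Docker allows in container
// names. Assumption: the exact pattern may differ between Docker versions.
var restrictedName = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9_.-]+$`)

// containerName is a hypothetical helper: basename plus a two-digit index,
// which reproduces the names seen in the journal output.
func containerName(basename string, i int) string {
	return fmt.Sprintf("%s%02d", basename, i)
}

// validName reports whether name fits the assumed Docker naming rule.
func validName(name string) bool {
	return restrictedName.MatchString(name)
}

func main() {
	// Compare the original basename with the one that fixed the builders.
	for _, basename := range []string{"go-be-%d", "ppc64_"} {
		for i := 1; i <= 5; i++ {
			name := containerName(basename, i)
			fmt.Printf("%-12s valid=%v\n", name, validName(name))
		}
	}
}
```

Under these assumptions every `go-be-%dNN` name is rejected because of the `%`, while `ppc64_01` through `ppc64_05` pass, which is consistent with the five "Error creating" lines and with the fix above.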
