x/build: linux-arm trybot/slowbot failure #35628

Open
cherrymui opened this issue Nov 15, 2019 · 11 comments

@cherrymui (Contributor) commented Nov 15, 2019

What version of Go are you using (go version)?

tip (d856e05)

What did you do?

Ran the linux-arm trybot on tip (with a dummy change, CL https://go-review.googlesource.com/c/go/+/207440).
It failed with https://storage.googleapis.com/go-build-log/09ec0769/linux-arm_4add410f.log

I also got a few failures on a different CL (https://go-review.googlesource.com/c/go/+/207350)
https://storage.googleapis.com/go-build-log/3c197532/linux-arm_4a2eca13.log
https://storage.googleapis.com/go-build-log/3c197532/linux-arm_e2868c28.log
https://storage.googleapis.com/go-build-log/3472bd16/linux-arm_0d7459e0.log

I cannot reproduce it with gomote. Also the linux-arm builder is happy.

I think it may be related to the fact that the trybot cross-compiles and then ships the binaries to the ARM machine.

cc @bradfitz

@gopherbot added this to the Unreleased milestone Nov 15, 2019
@gopherbot added the Builders label Nov 15, 2019

@bradfitz (Contributor) commented Nov 15, 2019

I suspect this was "broken" by https://go-review.googlesource.com/c/build/+/205603 which upgraded the ARM environment (where tests run) to Buster, but the compilation half on x86 was unchanged.

I say "broken" in quotes because this was never a problem until SlowBots, which let you do this again.

But it used to work with trybots on by default a few years ago, so it does work if we configure both halves the same.

Will fix.

@bradfitz self-assigned this Nov 15, 2019

@cherrymui (Contributor, Author) commented Nov 15, 2019

Thanks @bradfitz

@bradfitz (Contributor) commented Nov 16, 2019

Hmm, this isn't as obvious as I'd hoped.

Both environments are Debian buster:

Cross-compilation:
https://github.com/golang/build/blob/master/env/crosscompile/linux-armhf-cross/Dockerfile

Linux-arm on Scaleway:
https://github.com/golang/build/blob/master/env/linux-arm/scaleway/Dockerfile

@rsc, you probably remember me debugging similar issues in the past. I'd love some better tooling to help debug these sorts of issues. It doesn't affect many people, though. But if we could do this sort of cross-compilation (make.bash in one place, run tests elsewhere) for more builders (riscv, mips, arm, etc) we could get much better throughput from our slower builders. But I'm always scared of doing it due to issues like this.

@mundaym (Member) commented Feb 24, 2020

Ping. Still seeing this and it means we can't run the trybots on arm targets:

https://storage.googleapis.com/go-build-log/986a3d00/linux-arm_a2dcef07.log

@mundaym (Member) commented Feb 24, 2020

Unassigning @bradfitz since I don't think he's working on x/build at all anymore.

@cherrymui (Contributor, Author) commented Feb 24, 2020

In the meantime, I have been using gomote for ARM. Using android-arm is another option; it is not the same as linux-arm but has many similarities.

@cagedmantis (Contributor) commented Feb 24, 2020

@josharian (Contributor) commented Feb 27, 2020

This cost me a chunk of time this morning. Can we disable linux/arm slowbots until this is fixed? Or have gopherbot emit a warning and reference to this issue when it recognizes linux/arm?

@cherrymui (Contributor, Author) commented Aug 19, 2020

If this cannot be fixed soon, can we just disable/reject linux-arm for trybot? And maybe use android-arm for "TRY=arm"?

The failure has confused people several times (e.g. https://go-review.googlesource.com/c/go/+/248684).

@dmitshur (Member) commented Aug 19, 2020

I've closed a newer #40872, which seems to be the same issue.

> If this cannot be fixed soon, can we just disable/reject linux-arm for trybot? And maybe use android-arm for "TRY=arm"?

Based on #40872 (comment), this is closer to being resolved now.

It seems reasonable to reject/ignore it during TRY= slowbot requests until then.

@gopherbot commented Aug 19, 2020

Change https://golang.org/cl/249420 mentions this issue: cmd/coordinator: warn about known linux-arm SlowBot issue

@dmitshur changed the title from "x/build: linux-arm trybot failure" to "x/build: linux-arm trybot/slowbot failure" Aug 19, 2020
gopherbot pushed a commit to golang/build that referenced this issue Aug 20, 2020
The current linux-arm builder is known to have trouble when used as
a SlowBot. Start warning about it when the builder is requested via
the TRY= SlowBot UI.

I've considered also removing or disabling the "arm" SlowBot alias,
but that would make it easier to miss that there's an issue, since
SlowBots don't warn about unknown builders:

	If you specify an unknown TRY= token, it'll just ignore it
	and won't report an error.

We can consider making further changes as this situation evolves.
The goal here is to start notifying about a known problem sooner.

For golang/go#35628.
For golang/go#40872.

Change-Id: Ibc1205720c44ec4823c632c04fc2f887368258c1
Reviewed-on: https://go-review.googlesource.com/c/build/+/249420
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>