vendor,partition: fix panics from uenv #3325

Merged
merged 3 commits into from May 16, 2017

Conversation

Projects
None yet
3 participants
Contributor

zyga commented May 16, 2017

This branch aims to fix an error reported by @fgimenez during testing on raspberry pi 2/3.

The relevant part of the panic is:

May 15 14:51:13 localhost snapd[1207]: panic: runtime error: index out of range
May 15 14:51:13 localhost snapd[1207]: goroutine 29 [running]:
May 15 14:51:13 localhost snapd[1207]: panic(0x55471de8, 0x34ade010)
May 15 14:51:13 localhost snapd[1207]: #011/usr/lib/go-1.6/src/runtime/panic.go:481 +0x370
May 15 14:51:13 localhost snapd[1207]: github.com/snapcore/snapd/vendor/github.com/mvo5/uboot-go/uenv.parseData(0x34dfc005, 0x1fffb, 0x3fdfb, 0x7913abd9)
May 15 14:51:13 localhost snapd[1207]: #011/build/snapd-EbqrYA/snapd-2.26.1/_build/src/github.com/snapcore/snapd/vendor/github.com/mvo5/uboot-go/uenv/env.go:95 +0x230
May 15 14:51:13 localhost snapd[1207]: github.com/snapcore/snapd/vendor/github.com/mvo5/uboot-go/uenv.Open(0x34c158e0, 0x15, 0x0, 0x0, 0x0)
May 15 14:51:13 localhost snapd[1207]: #011/build/snapd-EbqrYA/snapd-2.26.1/_build/src/github.com/snapcore/snapd/vendor/github.com/mvo5/uboot-go/uenv/env.go:80 +0x3c0
May 15 14:51:13 localhost snapd[1207]: github.com/snapcore/snapd/partition.(*uboot).GetBootVars(0x557bac6c, 0x34c36298, 0x1, 0x1, 0x34dd7d01, 0x0, 0x0)
May 15 14:51:13 localhost snapd[1207]: #011/build/snapd-EbqrYA/snapd-2.26.1/_build/src/github.com/snapcore/snapd/partition/uboot.go:86 +0x80
May 15 14:51:13 localhost snapd[1207]: github.com/snapcore/snapd/boot.SetNextBoot(0x34cb4000, 0x0, 0x0)

While we don't have the uenv data from the affected devices we suspect this was a data corruption issue and uenv could no longer parse the existing config. In the revision we were using prior to this branch this would cause uenv to panic.

This branch pulls in changes that landed in uboot-go master since last sync, as well as two new improvements that are aimed directly at the issue we discovered mvo5/uboot-go#2 and mvo5/uboot-go#3

The vendor sync is coupled with a small change that makes us use best-effort parser that kindly ignores malformed data without panicking or returning an error. This should hopefully improve reliability in face of certain kinds of data corruption.

Signed-off-by: Zygmunt Krynicki zygmunt.krynicki@canonical.com

zyga added some commits May 16, 2017

vendor: update github.com/mvo5/uboot-go
This patch updates uboot-go to include latest bug fixes.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>
partition: use uenv.OpenWithFlags and OpenBestEffort
We saw some panics in on-hardware testing of raspberry pi 2 and 3 that
look like data corruption on the SD card. Currently the code would just
panic (uenv.Open panicks) but with the new updates it would return an
error instead.

Errors are better but they are still quite fatal since this is not a A/B
resource that we can fall back on. To combat this we can pass the
OpenBestEffort flag. In this case parsing errors are ignored and
malformed data is discarded but other data is retained. This should let
us fare better in cases like this.

Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki@canonical.com>

mvo5 approved these changes May 16, 2017

Collaborator

mvo5 commented May 16, 2017

This is great, thank you. We still need to get to the bottom of the issue that @fgimenez is seeing, i.e. we need a dump of the environment that caused this panic (or the binary file).

Contributor

zyga commented May 16, 2017

We traced the issue down to stale vendor.json that contained a reference to uboot-go that was outdated, pre-dating any changes I made in the two follow-up PRs.

@mvo5 mvo5 merged commit de8ce96 into snapcore:master May 16, 2017

2 of 7 checks passed

artful-amd64 autopkgtest running
Details
xenial-amd64 autopkgtest running
Details
xenial-i386 autopkgtest running
Details
yakkety-amd64 autopkgtest running
Details
zesty-amd64 autopkgtest running
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
xenial-ppc64el autopkgtest finished (success)
Details

@zyga zyga deleted the zyga:fix/panic-in-ubootenv-open branch May 16, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment