Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init: Add upgrade check to avoid incompatible upgrades [RFC,WIP] #535

Merged
merged 1 commit into from
Sep 1, 2016
Merged

init: Add upgrade check to avoid incompatible upgrades [RFC,WIP] #535

merged 1 commit into from
Sep 1, 2016

Conversation

MilhouseVH
Copy link
Contributor

@MilhouseVH MilhouseVH commented Jul 12, 2016

This is a first stab/RFC/WIP at adding a degree of protection to the upgrade process, so that users can't so easily upgrade using incompatible .tar/.img/.img.gz files.

Note that this will NOT prevent users installing the wrong LibreELEC upgrade file when upgrading from any OpenELEC build, or any current LibreELEC build (to date) for that matter - this ship has already sailed, and there's nothing we can do to prevent users from installing anything of their choice on such systems. The change proposed in this PR will only protect users once they are already booting a version of LibreELEC that includes this PR.

Essentially, this PR uses the LIBREELEC_ARCH (from /etc/os-release) of the currently running system and also the upgrade system, and determines if the Project/Arch of the current system is valid for the new upgrade.

It will do this in one of two ways - first it will compare the current and upgrade Project/Arch values. If they match, the upgrade is considered compatible with no further checks necessary.

If they don't match, then the new SYSTEM image will be mounted and (if it exists) a script called canupdate.sh will be executed - this is passed the current and new Project/Arch values as parameters. The exit code of this script should be 0 for success (ie. "is compatible"), and non-zero for incompatible.

If the upgrade is compatible, the upgrade can proceed as normal. Otherwise, the upgrade will be aborted and the system will boot normally.

The purpose of the canupdate.sh script is to allow more complex validation for a given upgrade, should this be necessary.

As of right now, this script is not required by any project. However let's say in the future we were to introduce an RPi3.aarch64 project/arch. In this case, we would only want users with RPi3 hardware to upgrade to this build, and not users with RPi2 (32-bit only) hardware. The following canupdate.sh proof-of-concept script - when added to packages/tools/bcm2835-bootloader/scripts/canupdate.sh - should accomplish this (although for now, the default validation is entirely sufficient):

#!/bin/sh
#
# Validate RPi upgrades (RPi/RPi2/RPi3)
#

# Detect RPi hardware from revision code
# https://github.com/AndrewFromMelbourne/raspberry_pi_revision
getcpu() {
  cat /proc/cpuinfo | awk '/Revision/ { r=sprintf("%d", "0x" $3); print(and(rshift(r,12),15)) }'
}
is_rpi3() {
  [ "$(getcpu)" = "2" ] && return 0 || return 1
}

CURRENT=${1}
UPGRADE=${2}
COMPATIBLE=no

# If the current project is not RPi*, this upgrade will not succeed
if echo "${CURRENT}" | grep -qE "^RPi.*"; then
  case ${UPGRADE} in
    # Upgrading to RPi3.aarch64 from another project/arch, eg. RPi.arm or RPi2.arm
    # If the hardware supports RPi3, then that's OK
    RPi3.aarch64)
      if [ "${CURRENT}" = "RPi3.aarch64" ]; then
        COMPATIBLE=yes
      elif is_rpi3; then
        COMPATIBLE=yes
      fi
      ;;
    # Upgrading to RPi2.arm, possibly from RPi3.aarch64 or RPi.arm
    RPi2.arm)
      [ "${CURRENT}" = "RPi2.arm" -o "${CURRENT}" = "RPi3.aarch64" ] && COMPATIBLE=yes
      ;;
    # Upgrading to RPi.arm, possibly from RPi3.aarch64 or RPi2.arm
    RPi.arm)
      [ ${CURRENT} = "RPi.arm" ] && COMPATIBLE=yes
      ;;
  esac
fi

[ ${COMPATIBLE} = yes ] && exit 0 || exit 1

About the only downside I can see with this whole approach is that when cross-grading to an older build that lacks canupdate.sh, eg. from RPi3.aarch64 to an older RPi2.arm build, or perhaps a more realistic example is a user downgrading (for test purposes) from Generic.x86_64 to an ancient Nvidia_Legacy.x86_64 build, the change in Project/Arch will result in the upgrade failing. To overcome this issue, the user can create /storage/.update/.nocompat which will disable this entire compatability check for the duration of the next upgrade (the .nocompat file will be automatically removed during clean up).

Definitely need comments on this one... :)

@MilhouseVH
Copy link
Contributor Author

Would be good if projects other than RPi/RPi2/Generic can be tested with this. By default, all current projects (once they include this PR) should only be able to upgrade to builds based on the same project/arch as the current system. So, WeTek_Core.arm can only upgrade to WeTek_Core.arm, it cannot upgrade to WeTek_Play.arm etc.

@lrusak
Copy link
Member

lrusak commented Jul 12, 2016

looks good to me, perhaps a 60s countdown is a bit long though? 20s should be adequate.

@chewitt
Copy link
Member

chewitt commented Jul 12, 2016

I'd approach this differently. Look at the update file to extract project/arch, compare to current. If different write a .cleanup file to /storage/.update, and pause the updater for 30 seconds with an on-screen warning that they are different and if you would like to avoid the update please power cycle now. If the 30 seconds expires, remove the .cleanup file and continue with update. If the system boots with .cleanup present (user power cycled) remove the update files and continue normal boot. I'd be concerned about trying to map project/arch combinations when Amlogic breeds new ones daily.. "there be dragons"

@MilhouseVH
Copy link
Contributor Author

MilhouseVH commented Jul 12, 2016

@lrusak: I chose 60 seconds because I imagined the following scenario: we're dealing with a user who may not have upgraded in a long while, who may not be used to the improved LE upgrade/boot process, may not be a native English speaker, and who might now be a little surprised (ie. spooked/panicked) that their little box is telling them all this stuff about an upgrade not working as expected.

Giving them 60 seconds rather than 20 seconds might be enough time for them to digest what is happening on screen, to calm down, and get it right the second time around. To be honest, I'd even change it to 120 seconds - the frequency of this happening should hopefully be low (so a long delay isn't a big deal) while it's also important the user completely understands what has happened.

@chewitt: That's a very interesting alternative, and I almost like it. :) However my only concern is that if someone attempts an upgrade and isn't paying attention (or is a slow reader - ie. non-native English speaker) we're potentially blowing their system away whether they like it or not, and recovering from such a mess is a total PITA. In the interests of safety I'd much prefer that the overwriting of "incompatible" systems can only happen once the user instructs us to do so by creating the file that disables the compatibility check, much as we do when disabling the MD5 integrity check. We don't continue when the integrity checks fail (unless the user creates the .nocheck file), and likewise I don't think we should automatically continue when there's a possibility the user might end up with an unbootable system.

I could be wrong though, which is why this is an RFC. I've tried to go through the various upgrade/downgrade/crossgrade scenarios in my head and I might have missed something obvious but with our current projects I don't think we should hit this "incompatible" issue unless the user is genuinely making a mistake, or it's a test scenario (installing a really old build - in which case create the .nocompat file). Your proposal would make life easier for the latter but could put the former at risk and protecting the former is the entire purpose of this change.

Note also that any OE Nvidia_Legacy user crossgrading to LE Generic will not see any compatibility problem as they'll be completing the upgrade process using their original OE system which has no compatibility checks. It's only after the upgrade is complete that the LE Generic system will boot for the first time. If the user then decided they wanted to return to OE Nvidia_Legacy then sure, they'd hit the compatibility issue, in which case they'd have to create the .nocompat file (the solution is given to them on-screen).

I'd be concerned about trying to map project/arch combinations when Amlogic breeds new ones daily..

I checked on Slack and apparently users of WeTek_Core won't want to upgrade with WeTek_Play, how likely is it that users of our exisiting (non-RPi/RPi2/Generic) projects will want to crossgrade (that is, upgrade from one project to a different one)? If this is likely, and more importantly, compatible (ie. the updated system will still boot and be usable) then it should be handled by a suitable canupdate.sh script (similar to the RPi POC in the first post).

Also, the canupdate.sh file is in the new upgrade file, so should be able to adapt as new projects are added, which means there's no problem when crossgrading to a new project, but there's still an issue when crossgrading to an older project that isn't aware of the newer project installed on the current system (solution: create .nocompat).

@CvH
Copy link
Member

CvH commented Jul 12, 2016

What about adding some lines in the main foreign languages? A bit messy but maybe helpful.

@MilhouseVH
Copy link
Contributor Author

It's not something we've done before (adding foreign languages) and could make it even more complicated, plus what languages would we choose? There are several reasons why an upgrade (or even boot) might fail - MD5 check failed, fsck failure, missing SYSTEM etc. etc., should we translate these messages too? That could be a lot of text in many different languages, very confusing for the user who might already be a bit panicked that their system has just gone tits up.

Hopefully the English I've used is understandable by someone with a basic grasp of the English language, but it's tough for me to say for sure, being a native speaker.

At the end of the day, with this change if the user boots their system with a duff upgrade file and does absolutely nothing, they'll still be left with a booting system so that they can have another go at trying to break it. If they don't understand what they see on screen they can always post a picture on a forum somewhere, but at the very least they'll still have a working system!

@lrusak
Copy link
Member

lrusak commented Jul 20, 2016

LGTM, any objections?

@chewitt
Copy link
Member

chewitt commented Jul 21, 2016

In the ~5 years that I've been kicking around OE/LE i'm not sure I've seen an example of a user hosing their box by using the wrong update file. Picking the wrong file to install with happens regularly, but that's a different issue. So I have an impression this is trying to fix a problem that doesn't really exist while adding scripts and complexity to the boot/update process. I'm feeling the need to resist the change but can't articulate the specific reason. It's a 6th sense thing.

@lrusak
Copy link
Member

lrusak commented Jul 21, 2016

@chewitt, I think the whole reason @MilhouseVH did this is because a user messed up their system in the first place. I think it is a warranted change.

@CvH
Copy link
Member

CvH commented Jul 21, 2016

It is not happen regularly but it happens and without proper Linux knowledge you cant repair it (it happend me in the past 2times due selecting the wrong file).
I assume that it didn't breaks anything so I would be a nice protection against mistakes without draw backs.

@MilhouseVH
Copy link
Contributor Author

In the ~5 years that I've been kicking around OE/LE i'm not sure I've seen an example of a user hosing their box by using the wrong update file.

@chewitt One recent example of a user upgrading with the wrong tar file: http://forum.kodi.tv/showthread.php?tid=282367

Granted it was from OE to LE so even with this fix we couldn't have prevented it, but once we have this implemented we will at least be able to save LE users from their own mistakes in the future.

@MilhouseVH
Copy link
Contributor Author

Merge or close?

# If the old project/arch is not compatible with the new project/arch then abort...
if ! is_compatible "${2}" "${old_project_arch}" "${new_project_arch}"; then
echo ""
echo "ERROR: This update is not compatible with the current system - update aborted!"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to pass an error message and the filename from canupdate.sh back to the update script so the error message is more explicit? e.g.

ERROR: LibreELEC-RPi3.aarch64-9.0.0.tar is not compatible with RPi2 hardware. Update cancelled.

@MilhouseVH
Copy link
Contributor Author

Is it possible to pass an error message and the filename from canupdate.sh back to the update script so the error message is more explicit? e.g.

ERROR: LibreELEC-RPi3.aarch64-9.0.0.tar is not compatible with RPi2 hardware. Update cancelled.

Updated based on comment.

Imgur

@chewitt
Copy link
Member

chewitt commented Aug 26, 2016

I would like this to go into v7.90.005 but the change will be picked-up by Amlogic builds. I can see future challenges with box-named Amlogic projects created by far-east manufacturers who don't keep their builds updated, and users who subsequently discover our generic 'S905' and similar releases are kept updated and have all the missing add-ons they were looking for - the transition will be denied due to the non-matching project names and there's going to be more of these box-named projects than we can keep tabs on. To play safe (at least for now) I think the check should be limited to the official projects we release for and have knowledge of.

I'd also remove the instructions on how to defeat the check from the screen (as people will just use them without thinking) but perhaps show a link to a wiki page that provides more details (the bypass can be listed there, with more details).

@MilhouseVH
Copy link
Contributor Author

Any project, official or unofficial, can include a canupdate.sh script to avoid the check entirely (it simply needs to exit with success regardless of the currently installed project) or widen support for as yet unknown boxes/projects. If a vendor or individual is going to be creating their own projects then they will soon work out the need for the canupdate.sh script. Any user upgrading to an official build/project from one of these unofficial projects will hit the incompatibility issue whether we restrict support to official projects or not, hence the workaround.

As for providing details of the workaround I'd have thought the warning that is shown would have been sufficient for people to not blindly defeat the check, while providing details of the workaround goes some way towards alleviating your concern for random unsupported boxes. A Wiki page would be useful (we could document ..nocheck if it's not already there) but when faced with this error I think we should give the user the basic information up front to resolve the issue without having to load a browser.

echo ""
echo "If you know what you are doing, create \"$UPDATE_ROOT/.nocompat\" to avoid this check."
echo "However, doing so may result in an unbootable system."
echo ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Create /storage/.update/.nocompat to disable compatibility checks and risk a non-booting system"

^^ less wordy one-liner for non-English users

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@chewitt chewitt merged commit d416dad into LibreELEC:master Sep 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants