-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
boot: helper for setting up a try recover system #9921
boot: helper for setting up a try recover system #9921
Conversation
1c1d4b0
to
63794c5
Compare
Improve some error messages in the sealing code. Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
…BootVars errors This gives slightly more flexibility in mocking different failure modes. Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
Add a new helper for setting up a new recovery system to be tried. A request to reboot into given recovery system must be issued separately. Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
63794c5
to
879e3c8
Compare
…overy-mgmt-boot-set-try
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, some minor comments and a question
) | ||
|
||
func clearTryRecoverySystem(dev Device, systemLabel string) error { | ||
if !dev.HasModeenv() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we expect to call this in other contexts more standalone to have to do this check again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this eventually becomes ClearTryRecoverySystem and gets called from a devicestate handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if indeed this function will be used in more places than just to specifically cleanup in failure modes of SetTryRecoverySystem we need to think carefully about the order of operations here to make sure we don't accidentally brick things with an unexpected reboot in between operations
boot/systems.go
Outdated
} | ||
vars := map[string]string{ | ||
"try_recovery_system": systemLabel, | ||
"recovery_system_status": "try", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these will not appear on the kernel command line, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only bootenv. Well, at least our boot script is not passing that on the command line.
boot/seal.go
Outdated
@@ -592,6 +592,10 @@ func runModeBootChains(rbl, bl bootloader.Bootloader, model *asserts.Model, mode | |||
// hashes from modeenv plus it returns separately the last BootFile | |||
// which is for the kernel. | |||
func buildBootAssets(bootFiles []bootloader.BootFile, modeenv *Modeenv) (assets []bootAsset, kernel bootloader.BootFile, err error) { | |||
if len(bootFiles) == 0 { | |||
// useful in testing, when mocking is insufficient | |||
return nil, bootloader.BootFile{}, fmt.Errorf("internal error: no boot files") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it doesn't matter because it will show up only in tests, so there's context. I'm just slightly worried that as a message in doesn't make too clear where it is coming from. Though at least atm we don't have any ohter "no boot files" error
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, some questions about the ordering of operations though as it pertains to getting rebooted in the middle of things
boot/systems.go
Outdated
if found { | ||
// we may be repeating the cleanup, in which case the system may | ||
// not be present in modeenv already | ||
if err := m.Write(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if found { | |
// we may be repeating the cleanup, in which case the system may | |
// not be present in modeenv already | |
if err := m.Write(); err != nil { | |
// we may be repeating the cleanup, in which case the system was | |
// already removed from the modeenv and we don't need to rewrite | |
// the modeenv | |
if found { | |
if err := m.Write(); err != nil { |
boot/systems.go
Outdated
resealErr := resealKeyToModeenv(dirs.GlobalRootDir, dev.Model(), m, expectReseal) | ||
|
||
// clear both variables, no matter the values they hold | ||
vars := map[string]string{ | ||
"try_recovery_system": "", | ||
"recovery_system_status": "", | ||
} | ||
// try to clear regardless of reseal failing | ||
blErr := bl.SetBootVars(vars) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the ordering here safe? if we end up resealing before we manage to write the boot vars and got rebooted in between, would the system try to try booting back into that recovery system which is no longer trusted since we resealed to remove it? I would think that we would want the opposite order, writing bootloader variables first, then resealing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
speaking of this kind of ordering, what is your opinion on how hard it would be to make something in resealKeyToModeenv
panic()
thus simulating a reboot in the middle of the operation to ensure that operations are in a consistent state? too much work / too little reward since it's a fairly simple set of operations right now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Ian is right here about the order unless we are confused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Booting into the system is controlled by snapd_recovery_mode
and snapd_recovery_system
, which are set separately. This cleanup is invoked if SetTryRecoverySystem fails, which is before we try to reboot to a new system, or after we're back to run mode, in which case the bootloader variables selecting the mode/system would have been modified and would no longer boot into the tried system. Need to think about it a bit, maybe I missed something.
boot/systems.go
Outdated
} | ||
}() | ||
const expectReseal = true | ||
if err := resealKeyToModeenv(dirs.GlobalRootDir, dev.Model(), m, expectReseal); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw I think the ordering of operations here is safe, we are resealing first, but only indicate to the bootloader that it is safe to boot into those by setting the vars after resealing is successful
boot/systems_test.go
Outdated
err := boot.SetTryRecoverySystem(s.uc20dev, "1234") | ||
c.Assert(err, IsNil) | ||
|
||
c.Check(mtbl.BootVars, DeepEquals, map[string]string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's try and use GetBootVars as much as possible so we can stop exporting BootVars at some point and make the mock bootloaders act more like real bootloaders
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may still need to have something line GetAllBootVars() to make sure nothing unexpected was set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes agreed on that
boot/systems_test.go
Outdated
|
||
modeenvRead, err := boot.ReadModeenv("") | ||
c.Assert(err, IsNil) | ||
c.Check(modeenvRead.CurrentRecoverySystems, DeepEquals, []string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick, but I also have slight preference for doing a full DeepEquals comparison to ensure that nothing else was modified accidentally on the object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DeepEquals pick up the read
flag that gets set when it's read from a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah right, since we are in the boot pkg though you could use modeenvRead.DeepEqual(exp)
I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint, I've updated the tests.
}) | ||
} | ||
|
||
func (s *systemsSuite) TestSetTryRecoverySystemSetBootVarsErr(c *C) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice test!
}) | ||
} | ||
|
||
func (s *systemsSuite) TestSetTryRecoverySystemCleanupOnErrorAfterReseal(c *C) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also a great test, thanks for this
) | ||
|
||
func clearTryRecoverySystem(dev Device, systemLabel string) error { | ||
if !dev.HasModeenv() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if indeed this function will be used in more places than just to specifically cleanup in failure modes of SetTryRecoverySystem we need to think carefully about the order of operations here to make sure we don't accidentally brick things with an unexpected reboot in between operations
…overy-mgmt-boot-set-try
…ith mocked bootloader Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
boot/systems.go
Outdated
} | ||
|
||
// SetTryRecoverySystem sets up the boot environment for trying out a recovery | ||
// system with given label. Once done, the caller should request switching to a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"switching to the..." ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed that when I pushed the first few fixed, but added it now.
…t' into bboozzoo/uc20-recovery-mgmt-boot-set-try
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
…overy-mgmt-boot-set-try
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
} | ||
|
||
const expectReseal = true | ||
return resealKeyToModeenv(dirs.GlobalRootDir, dev.Model(), m, expectReseal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have altered the order of operations here, to match what we have in the document, that is:
- set try_recovery_system, recovery_system_status boot variables
- reseal
If for any reason we reboot, and unexpectedly boot into the tried recovery system before resealing, with the changes in #9940 accessing data will be impossible and we will reboot back to run mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is fairly strange to do it differently than everything else though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine I think but it merits a comment explaining what you said but also that is up to the caller to also switch the mode touching the env var that actually affect where we boto, I also wonder if it means that we should have a helper actually doing both? but maybe that's clearer when one sees the full code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that these things should be called out in comments, but the ordering and logic here makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comments
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is ok but still a bit confusing so if nothing else it needs more comment, I also wonder if we need a helper that actually does more, basically an EnsureNextBootToTryRecoverySystem ? maybe is clear in the full code
boot/systems.go
Outdated
|
||
// SetTryRecoverySystem sets up the boot environment for trying out a recovery | ||
// system with given label. Once done, the caller should request switching to | ||
// the given recovery system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should describe that this adds to the current_recovery_systems and also reseals to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
} | ||
|
||
const expectReseal = true | ||
return resealKeyToModeenv(dirs.GlobalRootDir, dev.Model(), m, expectReseal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine I think but it merits a comment explaining what you said but also that is up to the caller to also switch the mode touching the env var that actually affect where we boto, I also wonder if it means that we should have a helper actually doing both? but maybe that's clearer when one sees the full code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @pedronis about adding a comment, but otherwise lgtm, thanks for the work here. One nitpicky thing about tests that you can take in a followup or disregard as you prefer.
} | ||
|
||
const expectReseal = true | ||
return resealKeyToModeenv(dirs.GlobalRootDir, dev.Model(), m, expectReseal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that these things should be called out in comments, but the ordering and logic here makes sense
boot/systems_test.go
Outdated
|
||
modeenvRead, err := boot.ReadModeenv("") | ||
c.Assert(err, IsNil) | ||
c.Check(modeenvRead.CurrentRecoverySystems, DeepEquals, []string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah right, since we are in the boot pkg though you could use modeenvRead.DeepEqual(exp)
I think
Thanks to @anonymouse64 for the suggestion. Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
Add a new helper for setting up a new recovery system to be tried. A request to
reboot into given recovery system must be issued separately.
Based on top of #9918, plus some trivial enhancements in seal and bootloader test.