Shutdown units before issuing a reboot #6570

Merged
merged 2 commits into from Nov 30, 2016

Conversation

Projects
None yet
6 participants
Member

gabriel-samfira commented Nov 15, 2016

The first implementation of the reboot functionality worked
under the assumption that we use file based hook locks. This meant
that the machine agent could acquire a lock and choose not to break
it until it was restarted, thus blocking any unit agents from running
hooks.

This decision was made under the pretext that the unit agent would
soon be merged with the machine agent, and switching from file locks
to mutexes/semaphores would not break it.

Juju has moved away from file locks, and we can no longer guarantee
a lock will be held by the machine agent until the machine is fully
rebooted. On reboot/shutdown the init system can now stop the machine
agent first, allowing unit agents to start executing hooks.

This change stops all deployed juju unit agents prior to issuing the
reboot (or shutdown) command.

cmd/jujud/reboot/reboot.go
+ return err
+ }
+ logger.Debugf("Stopping unit agent: %q", svcName)
+ err = svc.Stop()
@hoenirvili

hoenirvili Nov 15, 2016

Contributor

Could you please, make this in one line ?

 if err = scv.Stop(); err != nil {
    return err
 }
Owner

mitechie commented Nov 15, 2016

!!build!!

Contributor

hoenirvili commented Nov 15, 2016

!!build!!

Thanks for the PR @gabriel-samfira , LGTM !

I don't see any QA steps in the PR, nor any updated/new tests to show it works as expected.

Also, there's no bug or issue referenced in the PR. If it is a bugfix can we add a link to the respective bug?

A couple code review nits, but overall looks pretty straightforward and well executed.

@@ -21,6 +23,8 @@ import (
"github.com/juju/juju/container"
"github.com/juju/juju/container/factory"
"github.com/juju/juju/instance"
+ "github.com/juju/juju/service"
+ "github.com/juju/juju/service/common"
)
var logger = loggo.GetLogger("juju.cmd.jujud.reboot")
@reedobrien

reedobrien Nov 15, 2016

Contributor

This seems to violate the no global variables mandate. However, it predates these changes so not sure it merits an update in this PR.

@howbazaar

howbazaar Nov 15, 2016

Owner

The package level logger has become somewhat idiomatic.

@hoenirvili

hoenirvili Nov 15, 2016

Contributor

@reedobrien This is not the only place where a logger instance is used like this. As @howbazaar said, if you take a look at the juju-core source code, its used pretty often.

@reedobrien

reedobrien Nov 30, 2016

Contributor

Added a note to the checklist. Thanks for clarifying.

cmd/jujud/reboot/reboot.go
+func (r *Reboot) stopDeployedUnits() error {
+ osVersion, err := series.HostSeries()
+ if err != nil {
+ return err
@reedobrien

reedobrien Nov 15, 2016

Contributor

errors.Trace(err) the errors in this func, please.

Member

gabriel-samfira commented Nov 16, 2016

Hi @reedobrien

Will add some tests as soon as I get a chance.

There is no bug opened for this, it is something we ran into while deploying a bunch of co-located charms that need to reboot (Windows loves rebooting). This happens on machines without any co-located charms as well, although there is a higher chance of this happening if you have multiple unit agents running on the same machine.

If you wish I can open a bug and reference this PR in it.

Contributor

reedobrien commented Nov 16, 2016

@gabriel-samfira thanks for the tests and the PR. We appreciate it very much!

No worries about the bug, I've just done a search on reboot and didn't see anything related. @AlexisBruemmer @mitechie should we create a bug for this or just merge it when the tests land?

gabriel-samfira added some commits Nov 15, 2016

Shutdown units before issuing a reboot
The first implementation of the reboot functionallity worked
under the assumption that we use file based hook locks. This meant
that the machine agent could acquire a lock and choose not to break
it until it was restarted, thus blocking any unit agents from running
hooks.

This decission was made under the pretext that the unit agent would
soon be merged with the machine agent, and switching from file locks
to mutexes/semaphores would not break it.

Juju has moved away from file locks, and we can no longer guarantee
a lock will be held by the machine agent until the machine is fully
rebooted. On reboot/shutdown the init system can now stop the machine
agent first, allowing unit agents to start executing hooks.

This change stops all deployed juju unit agents prior to issuing the
reboot (or shutdown) command.
Added tests
* this also redefines service.ListServices and service.NewService as a variable to allow patching
* fixes logic in FakeService.Start() and FakeService.Stop()
Member

gabriel-samfira commented Nov 28, 2016

@reedobrien Please have another look. This includes some fixes for FakeService. I have some doubts about converting ListServices and NewService to variables, but if you have any other suggestions vis-a-vis stubbing out the service package, let me know.

@@ -98,7 +98,7 @@ type RestartableService interface {
// and several helper functions.
// NewService returns a new Service based on the provided info.
-func NewService(name string, conf common.Conf, series string) (Service, error) {
+var NewService = func(name string, conf common.Conf, series string) (Service, error) {
@hoenirvili

hoenirvili Nov 28, 2016

Contributor

Why not simple functions instead of lambdas?

@hoenirvili

hoenirvili Nov 28, 2016

Contributor

Ok, so this sounds reasonable enough.

@@ -137,7 +138,7 @@ func newService(name string, conf common.Conf, initSystem, series string) (Servi
}
// ListServices lists all installed services on the running system
-func ListServices() ([]string, error) {
+var ListServices = func() ([]string, error) {
@hoenirvili

hoenirvili Nov 28, 2016

Contributor

Again, like in the upper comment.

Contributor

reedobrien commented Nov 30, 2016

LGTM. Thanks.

LGTM. Thanks for adding the tests.

Member

gabriel-samfira commented Nov 30, 2016

$$merge$$

Contributor

jujubot commented Nov 30, 2016

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

Contributor

jujubot commented Nov 30, 2016

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/9747

Member

gabriel-samfira commented Nov 30, 2016

$$merge$$

Contributor

jujubot commented Nov 30, 2016

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

@jujubot jujubot merged commit bc3dc0e into juju:develop Nov 30, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment