Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add supervise subcommand (supervisor mode) #327

Merged
merged 46 commits into from
Feb 22, 2017
Merged

Conversation

Songmu
Copy link
Contributor

@Songmu Songmu commented Feb 6, 2017

Usage

mackerel-agent supervise -conf ...

Description

The supervisor mode launches the mackerel-agent as a child process and controls it.
In paticular, it does the following things.

  • reloading mackerel-agent with HUP signal
  • crash recovery
    • However, if the child process dies within 30 seconds, it is regarded as launching failure and terminate the process without crash recovery

Copy link
Contributor

@mattn mattn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[super nit] Let's discuss how we implement this on Windows in later.

fork.go Outdated
signal.Notify(c, os.Interrupt, syscall.SIGTERM, syscall.SIGQUIT, syscall.SIGHUP)
go func() {
for sig := range c {
if sig == syscall.SIGHUP {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syscall.SIGHUP, SIGTERM, SIGQUIT, SIGHUP doesn't work on Windows.

fork.go Outdated
func (cm *cmdManager) reload() error {
// TODO configtest
cm.hupped = true
return cm.cmd.Process.Signal(syscall.SIGTERM)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@Songmu
Copy link
Contributor Author

Songmu commented Feb 7, 2017

@mattn I have no plan to support supervise mode on windows.
Because wix/wrapper should do supervise on windows.

@mattn
Copy link
Contributor

mattn commented Feb 7, 2017

Okay, then please put build-constraint to avoid compilation errors.

@mattn
Copy link
Contributor

mattn commented Feb 7, 2017

Ah, sorry. I thought Windows didn't have definition of SIGQUIT or some signals (which doesn't work on Windows). Then, your change is best for fixing my issue. Thanks!

supervisor.go Outdated
cmd *exec.Cmd
startAt time.Time
signaled bool
hupped bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to guard signaled and hupped with mutex?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and we should run go test with -race in CI, i guess...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at 1356a53

supervisor.go Outdated
}

func (sv *supervisor) supervise() error {
sv.start()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's better to check err from sv.start() here in case the command is something wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at a225f87


err := <-done
if err != nil {
t.Errorf("error should be nil but: %s", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] should be %v for err format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at 52d2cb1

time.Sleep(50 * time.Millisecond)
ch <- os.Interrupt

err := <-done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can remove done and just use <- sv.wait() instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed interface of supervisor.supervise() at a85cbd2.

How about it?

Copy link
Contributor

@haya14busa haya14busa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments, but it works great!

I confirmed child process will be restarted with kill -HUP {parent process} or kill -INT {child process}, kill -INT {parent process} successfully terminates the supervisor processes... and so on.

util/pid.go Outdated
package util

// ExistsPid check if pid exists
func ExistsPid(pid int) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] s/check/checks

And comment sentences is preferred to be end with period. It may help godoc formatting or other tools. but, nevermind.

https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at e13d0fd

util/pid.go Outdated
@@ -0,0 +1,6 @@
package util
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... there were already "util" package, so I think it's ok to some extent...
But, how about creating "pidutil" package which only handles pid files?
Maybe we can puts ExistsPid, createPidFile and removePidFile funcs into pidutil package.

We should avoid almost-meaning less package name and do not make meaning less package bigger as much as possible, i think.
https://github.com/golang/go/wiki/CodeReviewComments#package-names

Second option. How about putting back existsPid to main package and just little copy-pasting existsPid func to supervisor package? supervisor package uses it in only test files and it's really small function.

A little copying is better than a little dependency.
https://go-proverbs.github.io/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at bc8bba7.


// Supervise the mackerel-agent
func (sv *Supervisor) Supervise(c chan os.Signal) error {
err := sv.start()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Since if accepts an initialization statement, it's common to see one used to set up a local variable. We can keep "err" variable scope short.

	if err := sv.start(); err != nil {
		return err
	}

https://golang.org/doc/effective_go.html#if

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at e13d0fd

return err
}
if c == nil {
c = make(chan os.Signal, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to close c not to leak go sv.handleSignal(c) goroutine.

defer close(c)

I also confirmed there is a leak with https://github.com/fortytw2/leaktest

Also, it's better to document c chan os.Signal arg can be nil and if it's not nil, the caller has responsibility to close it. Or, we can just say "do not use this package as library", i guess?

e.g.

// Supervise starts a child mackerel-agent process and supervises it.
// 'c' can be nil and it's tipically nil. When you pass signal channel to this
// method, you have to close the channel to stop internal goroutine.
func (sv *Supervisor) Supervise(c chan os.Signal) error {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at 5966779.

}

// Supervise the mackerel-agent
func (sv *Supervisor) Supervise(c chan os.Signal) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interface can be something like this and it looks simple...?

func Supervise(agentProg string, argv []string, c chan os.Signal) error {

Also, with this interface, we don't have to care the caller side modifies sv.Prog or sv.ArgV.

But, maybe.... it's ok as it is since we uses *Supervisor in test and we don't have to care caller side so much because we expects it's not used as library (right?)....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed at 5966779.

Copy link
Contributor

@astj astj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I wrote some questions and comments.

@@ -0,0 +1,61 @@
// +build !windows
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For which reason this test is skipped on windows?
Create and Remove is performed in windows too, so it's better test them on windows

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this skip is due to missing implementation of existsPid, but even so it's better there's some tests for Windows (maybe in another file like pidfile_windows_test.go).

logger.Warningf("failed to reload: %s", err.Error())
}
} else {
sv.stop(sig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[imo]
Isn't it better supervisor outputs a log here like L149?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Especially here, logging which signal the supervisor received is maybe good)

}

func (sv *supervisor) supervise(c chan os.Signal) error {
if err := sv.start(); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[imo]
same as L155

@@ -0,0 +1,171 @@
package supervisor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since supervisor.go is not used in Windows, so is it possible to add // +build !windows and add dummy supervisor_windows.go?
(I don't know the best way. One might think current way is better because of simplicity).

@@ -0,0 +1,45 @@
package main
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cute stub ♥️

@Songmu Songmu merged commit 01eab3d into master Feb 22, 2017
@Songmu Songmu deleted the mackerel-agent_safe branch February 22, 2017 08:09
haya14busa added a commit to haya14busa/mackerel-agent that referenced this pull request Feb 24, 2017
The comment meant following code

	if c == nil {
		c = make(chan os.Signal, 1)
		defer close(c)
	}

but it actually looks like this

	if c == nil {
		c = make(chan os.Signal, 1)
	}
	defer close(c)

I think both are fine but comment should be true.

ref: mackerelio#327 (comment)
This was referenced Mar 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants