implement Runner.Worker method #2

Merged
rogpeppe merged 1 commit into juju:v1 on Mar 8, 2017

Conversation

rogpeppe (Owner) commented Mar 3, 2017

There are places in Juju that implement reasonably
complicated wrappers to gain access to running
workers. The Worker method provides access
to the currently running worker within a Runner,
which should reduce the need for such things.
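
A minimal usage sketch of what this enables (illustrative only: the import path and the Worker signature, an id plus a stop channel, are assumed from context rather than taken from the diff):

package example

import (
	worker "gopkg.in/juju/worker.v1" // import path assumed for illustration
)

// lookupAPIWorker shows the intended call pattern: instead of keeping a
// custom wrapper that tracks the running worker, callers ask the Runner
// for it directly and block until it is available or the runner stops.
func lookupAPIWorker(runner *worker.Runner, stop <-chan struct{}) (worker.Worker, error) {
	return runner.Worker("api", stop)
}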

mhilton approved these changes Mar 7, 2017

LGTM

runner.go
+ // run goroutine.
+
+ // start holds the function to create the worker.
+ // If this is nil,
mhilton (Member) commented Mar 7, 2017

please add a then clause.

runner.go
+
+// killWorkerUnlocked is like killWorker except that it expects
+// the runner.mu mutex to be held already.
+func (runner *Runner) killWorkerUnlocked(id string) {
mhilton (Member) commented Mar 7, 2017

this looks locked
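
For context (not something stated in this thread): the more common Go convention is the opposite suffix, where "Locked" marks the variant that expects the caller to already hold the mutex, which is presumably what prompted the quip. A sketch of that convention, illustrative only:

// killWorker takes the lock itself and delegates to the locked variant.
func (runner *Runner) killWorker(id string) {
	runner.mu.Lock()
	defer runner.mu.Unlock()
	runner.killWorkerLocked(id)
}

// killWorkerLocked expects runner.mu to be held by the caller.
func (runner *Runner) killWorkerLocked(id string) {
	// ... mutate state guarded by runner.mu ...
}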

runner.go
+
+ // finalError holds the error that will be returned
+ // when the runner finally exits.
+ finalError error
mjs commented Mar 7, 2017

Would you add some comments noting that isDying and finalError should only be touched by the run goroutine and/or things it calls?

rogpeppe (Owner) commented Mar 7, 2017

Added a comment to finalError. isDying already says "isDying is maintained by the run goroutine", and since it's not protected by a mutex, I think it should be clear that "maintained by" implies ownership and hence that it shouldn't be modified outside of that goroutine.

I'm open to alternative comment wordings.
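
One possible wording along those lines (illustrative only; not necessarily what was committed):

type Runner struct {
	// ...

	// finalError holds the error that will be returned when the
	// runner finally exits. Like isDying, it is maintained by the
	// run goroutine and must only be touched by that goroutine (or
	// functions it calls), which is why no mutex guards it.
	finalError error
}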

runner.go
select {
- case runner.startc <- startReq{id, startFunc}:
+ case runner.startc <- startReq{id, startFunc, reply}:
+ <-reply
mjs commented Mar 7, 2017

This naked channel read makes me nervous. Are you really sure there will always be a reply? Can it be inside a select which also checks the Tomb?

rogpeppe (Owner) commented Mar 7, 2017

It's OK because the startc channel is synchronous: if we succeed in sending on it, we know that the run goroutine has entered the startc arm of the select. That arm calls startWorker, which never blocks (trivially verifiable) and then closes the reply channel, so we can be completely sure we'll always get a reply.
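
A self-contained toy version of that argument (simplified names, not the PR's code): because the request channel is unbuffered, a successful send proves the receiver is already in the matching select arm, and that arm closes the reply channel without blocking, so the subsequent receive always completes.

package main

import "fmt"

type startReq struct {
	id    string
	reply chan struct{}
}

func main() {
	startc := make(chan startReq) // unbuffered: a send only succeeds while the run goroutine is receiving
	done := make(chan struct{})

	// The "run" goroutine: handles each request without blocking and
	// then closes reply, so a sender that got through always gets an answer.
	go func() {
		defer close(done)
		for req := range startc {
			// register the worker here (never blocks), then signal the caller
			close(req.reply)
		}
	}()

	reply := make(chan struct{})
	startc <- startReq{id: "w1", reply: reply}
	<-reply // guaranteed to return: the receiver holds the request and closes reply
	fmt.Println("worker registered")

	close(startc)
	<-done
}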

rogpeppe (Owner) commented Mar 7, 2017

I added a comment to that effect.

runner.go
+ // Worker should not block almost all of the time.
+ runner.mu.Lock()
+ stopped = true
+ runner.mu.Unlock()
mjs commented Mar 7, 2017

The handling of runner.mu and stopped in this func is somewhat hard to follow (it feels bug-prone). Can it be structured more clearly?

+ stopped = true
+ runner.mu.Unlock()
+ runner.workersChangedCond.Broadcast()
+ return nil, ErrStopped
mjs commented Mar 7, 2017

As discussed IRL, please explore options for making the synchronisation/handling of stopped clearer. Also consider making the second half a separate method/func.

rogpeppe (Owner) commented Mar 7, 2017

I just looked into factoring out the second half into its own function, but it doesn't work out so well - we'd have to call it with the lock held (so asymmetrical locking) and the getWorker function would either need to be duplicated or passed as a parameter, neither of which I'm that keen on.

Can you remind me of the suggestions/conversation around stopped, please?

mjs commented Mar 7, 2017

The issue with the handling of mu and stopped in this method is that it's quite hard to reason about when the lock is held and how/when it's released. I feel like there's got to be a better way to structure the code.
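
For what it's worth, one shape that tends to keep this sort of code easy to follow is pairing Lock/Unlock with defer and doing all the waiting in a single condition-variable loop. A sketch only, with field names taken or guessed from the diff above and the abort/tomb handling omitted:

// lookup is a hypothetical illustration, not the PR's implementation.
func (runner *Runner) lookup(id string) (Worker, error) {
	runner.mu.Lock()
	defer runner.mu.Unlock()
	for {
		if runner.stopped {
			return nil, ErrStopped
		}
		if w, ok := runner.workers[id]; ok { // workers map assumed for illustration
			return w, nil
		}
		// Wait atomically releases runner.mu and re-acquires it before
		// returning, so the checks above always run with the lock held.
		runner.workersChangedCond.Wait()
	}
}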

- info.start = req.start
- info.restartDelay = 0
+ logger.Debugf("start %q", req.id)
+ runner.startWorker(req)
mjs commented Mar 7, 2017

Nice idea to extract these out. run is now much more readable.

+
+ // Wait long enough to be reasonably confident that the
+ // worker has been started and registered.
+ time.Sleep(10 * time.Millisecond)
mjs commented Mar 7, 2017

No! This kind of thing always results in flaky tests.

A more robust approach is to poll for up to some much longer time (many seconds) until the desired result is seen.
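
A sketch of that style of test helper (illustrative; not part of the PR):

package runner_test

import (
	"testing"
	"time"
)

// waitFor polls cond until it returns true or the timeout expires,
// failing the test if the condition is never met. Polling with a long
// deadline avoids the flakiness of a single fixed sleep.
func waitFor(t *testing.T, timeout time.Duration, cond func() bool) {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if cond() {
			return
		}
		time.Sleep(10 * time.Millisecond)
	}
	t.Fatalf("condition not met within %v", timeout)
}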

rogpeppe (Owner) commented Mar 7, 2017

The result is the same whether we sleep for a long time or not - there's no externally visible difference and the test will pass either way.

We could insert test synchronisation points into the production code, I guess, but I'd prefer not.

If occasionally the code takes longer than 10ms to add the worker to the map, we'll just go through the slow path (the path that TestWorkerWithWorkerNotImmediatelyAvailable tests).

mjs commented Mar 7, 2017

Hmmm, ok. If the test will pass without the sleep, why bother with it?

rogpeppe (Owner) commented Mar 7, 2017

Because we test a different path through the code (verifiable with go test -cover).

+ }()
+ // Sleep long enough that we're pretty sure that Worker
+ // will be blocked.
+ time.Sleep(10 * time.Millisecond)
mjs commented Mar 7, 2017

Again, polling here would be good (and elsewhere)

rogpeppe (Owner) commented Mar 7, 2017

Same deal - there's nothing externally visible that tells us whether Worker has blocked or not.

mjs approved these changes Mar 7, 2017

On a re-read of the change I'm ok with it, as long as there's a comment explaining the relationship between the mutex and the cond variable.
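
One way such a comment might read, together with the way a sync.Cond is usually tied to its mutex (illustrative wording only, not the committed text):

// workersChangedCond is used to wait for changes to the set of running
// workers. It shares runner.mu as its Locker: Wait releases runner.mu
// while blocked and re-acquires it before returning, and Broadcast is
// called after any change made while holding runner.mu.
runner.workersChangedCond.L = &runner.mu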

implement Runner.Worker method
There are places in Juju that implement reasonably
complicated wrappers to gain access to running
workers. The Worker method provides access
to the currently running worker within a Runner,
which should reduce the need for such things.

@rogpeppe rogpeppe merged commit c279cb0 into juju:v1 Mar 8, 2017
