Make the state pool own the main txn log watcher. #8185

howbazaar (Owner) commented Dec 7, 2017

Back in the dawn of time there was only a single model and a single State instance. As the controller concept became a thing, we gained an additional State instance for each model. A core aspect of how the event model works in Juju is the watching of the txns.log collection: as documents are added, updated, and removed by the transaction system, entries are appended to the txns.log capped collection, and each entry refers to the documents touched by a single transaction. Each State instance has its own txn watcher, and each of these tails the txns.log collection to watch for changes.

As the number of models in a single controller grows, so does the number of things watching the txns.log collection. If the controller is HA, there are n x m watchers, where n is the number of models and m is the number of API servers (for example, 200 models on a 3-node controller means 600 separate tails of the same collection). This creates significant idle load on mongo once the model count grows beyond a small number, and the load becomes severe as it approaches 200 models, especially when an action triggers a burst of transactions that touch many documents, such as removing an application, particularly one with many subordinates.

So... this branch changes how the polling of the database is done. The state pool now owns a single watcher that publishes all document changes to a hub, which is also owned by the state pool. Whenever the state pool creates a new State instance, the base watcher for that State instance listens to the hub for changes rather than polling the database itself.

This drastically reduces the mongo load.
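
To make the shape of the change concrete, here is a minimal sketch of the fan-out being described. The type and topic names are illustrative only, not the exact code in state/watcher:

// Sketch only: one pool-owned watcher tails txns.log and publishes batches
// of changes to a hub; per-model watchers subscribe to the hub instead of
// polling mongo themselves.
package watcher

// Change describes a single document touched by a transaction.
type Change struct {
	Collection string      // collection the document lives in
	Id         interface{} // document _id
	Revno      int64       // latest revno, or -1 if the document was removed
}

// Hub is the minimal publish/subscribe surface this sketch relies on.
type Hub interface {
	Publish(topic string, data interface{})
	Subscribe(topic string, handler func(topic string, data interface{})) func()
}

// Illustrative topic name.
const txnWatcherCollection = "txnwatcher.collection"

// publishChanges is what the single txns.log watcher does after each poll:
// fan the batch of changes out through the hub.
func publishChanges(hub Hub, changes []Change) {
	hub.Publish(txnWatcherCollection, changes)
}

// subscribeModelWatcher is what each State's base watcher does instead of
// tailing txns.log: listen on the hub and forward changes into its own loop.
func subscribeModelWatcher(hub Hub, changeCh chan<- Change) (unsubscribe func()) {
	return hub.Subscribe(txnWatcherCollection, func(_ string, data interface{}) {
		changes, ok := data.([]Change)
		if !ok {
			return
		}
		for _, c := range changes {
			changeCh <- c
		}
	})
}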

QA steps

I verified this by starting with a 2.2.6 controller, deploying 55 units to each of four models, and adding 700 empty models. This showed significant I/O load on the juju.txns.log collection.

After upgrading the controller, the load dropped significantly: from ~1000 ms of reads per 10 s to ~15 ms of reads per 10 s.

Documentation changes

This is internal only, no documentation changes are necessary.

Bug reference

https://bugs.launchpad.net/juju/+bug/1733708

I think a lot of this is just copied from the old txn watcher code, so it is probably fine.
The general idea of publishing and consuming from a hub to spread it across models is good.
The performance difference is tremendous.
I'm a little concerned about whether everything comes back up correctly if one piece of this puzzle starts bouncing.

+
+// HubWatcher listens to events from the hub and passes them on to the registered
+// watchers.
+type HubWatcher struct {
jameinel (Owner), Dec 7, 2017

I'm going to assume that most of this code is just copied from the old TXN Watcher code.

howbazaar (Owner), Dec 8, 2017

Most of it was, yes.

state/watcher/hubwatcher.go
+ }
+ go func() {
+ // TODO: consider an event to restart the hub watcher
+ // if the txnwatcher restarts.
jameinel (Owner), Dec 7, 2017

How are we handling this? Just killing the hub watcher completely, and assuming the watcher is restarted at a higher level?

howbazaar (Owner), Dec 8, 2017

Ah... yes, we are doing that now. I'll remove the TODO.

state/watcher/hubwatcher.go
+ // TODO: consider an event to restart the hub watcher
+ // if the txnwatcher restarts.
+ unsub := hub.SubscribeMatch(
+ func(string) bool { return true }, w.recieveEvent,
jameinel (Owner), Dec 7, 2017

Can we have any sort of filter, a bit earlier in the stack, that would avoid evaluating all events?
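
For illustration only, the sort of filter being asked about might look something like this; the SubscribeMatch shape and topic constants mirror the diff above, but the exact names and values are assumptions, not the real hub API:

// Sketch only: accept just the txn watcher topics so unrelated hub traffic
// is never delivered to the handler, instead of matching everything with
// func(string) bool { return true }.
package watcher

// matchHub is a stand-in for the hub type used in the diff.
type matchHub interface {
	SubscribeMatch(match func(topic string) bool, handler func(topic string, data interface{})) func()
}

// Illustrative topic values.
const (
	txnWatcherStarting   = "txnwatcher.starting"
	txnWatcherCollection = "txnwatcher.collection"
)

func subscribeTxnTopics(hub matchHub, handler func(topic string, data interface{})) (unsubscribe func()) {
	return hub.SubscribeMatch(func(topic string) bool {
		return topic == txnWatcherStarting || topic == txnWatcherCollection
	}, handler)
}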

state/watcher/hubwatcher.go
+ return w, started
+}
+
+func (w *HubWatcher) recieveEvent(topic string, data interface{}) {
jameinel (Owner), Dec 7, 2017

receiveEvent

state/watcher/hubwatcher.go
+ switch topic {
+ case txnWatcherStarting:
+ // This message is published when the main txns.log watcher starts. If
+ // this message is recieved here it means that the main watcher has
jameinel (Owner), Dec 7, 2017

recieved => received

howbazaar (Owner), Dec 7, 2017

I always get that one wrong.

state/watcher/hubwatcher.go
+// period is the delay between each sync.
+func (w *HubWatcher) loop() error {
+ for {
+ w.logger.Warningf("loop() select")
jameinel (Owner), Dec 7, 2017

Warningf?

howbazaar (Owner), Dec 7, 2017

Yeah... I would have marked them critical, but the logger interface I defined didn't have it. These were temporary log statements that should be removed.

state/watcher/hubwatcher.go
+ // syncEvents are stored first in first out.
+ for i := 0; i < len(w.syncEvents); i++ {
+ e := &w.syncEvents[i]
+ w.logger.Warningf("flushing syncEvent %#v", e)
jameinel (Owner), Dec 7, 2017

Shouldn't be a Warningf

howbazaar (Owner), Dec 7, 2017

Same as above

+ // we have started.
+ w.hub.Publish(txnWatcherStarting, nil)
+ backoff := PollStrategy.NewTimer(w.clock.Now())
+ next := w.clock.After(txnWatcherShortWait)
jameinel (Owner), Dec 7, 2017

Why would we always wait for the first event? Shouldn't this just be After(0)?

howbazaar (Owner), Dec 8, 2017

In reality, the difference between waiting 10ms and 0 is unnoticeable, but for testing, where you are manually advancing the clock, it gives a lot more control.
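
To illustrate the point (the Clock interface and constant here are stand-ins, not Juju's clock package): with a nonzero first wait, a test driving a fake clock decides exactly when the first sync fires, whereas After(0) would fire as soon as the goroutine is scheduled.

package watcher

import "time"

// Clock is the small surface the watcher needs; in tests it would be backed
// by a manually advanced fake clock.
type Clock interface {
	Now() time.Time
	After(d time.Duration) <-chan time.Time
}

// Illustrative value; in production a 10ms first wait is unnoticeable.
const txnWatcherShortWait = 10 * time.Millisecond

// waitThenSync blocks until the short wait elapses, then runs one sync.
// A test advances its fake clock past txnWatcherShortWait only when it
// wants the sync to happen, which keeps the timing deterministic.
func waitThenSync(clock Clock, sync func()) {
	<-clock.After(txnWatcherShortWait)
	sync()
}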

state/watcher/txnwatcher.go
+ if qerr, ok := err.(*mgo.QueryError); ok {
+ // CappedPositionLost is code 136.
+ // Just in case that changes for some reason, we'll also check the error message.
+ if qerr.Code == 136 || strings.Contains(qerr.Message, "CappedPositionLost") {
jameinel (Owner), Dec 7, 2017

I don't think this is your code, but this is soo lovely...
:(

howbazaar (Owner), Dec 7, 2017

Yeah, it is a bit ick, and to be honest, I don't think we need to care for this now.
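
For what it's worth, the check could be pulled out into a small helper; this is only a sketch of the logic already in the diff (mgo's QueryError is the real type, the helper name is made up):

package watcher

import (
	"strings"

	mgo "gopkg.in/mgo.v2"
)

// isCappedPositionLost reports whether err means the tailing cursor lost its
// position in the capped txns.log collection, in which case the watcher has
// to restart its tail rather than retry the same query.
func isCappedPositionLost(err error) bool {
	qerr, ok := err.(*mgo.QueryError)
	if !ok {
		return false
	}
	// CappedPositionLost is code 136; also check the message in case the
	// numeric code ever changes.
	return qerr.Code == 136 || strings.Contains(qerr.Message, "CappedPositionLost")
}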

- ws.StartWorker(txnLogWorker, func() (worker.Worker, error) {
- return watcher.New(st.getTxnLogCollection()), nil
- })
+ if hub == nil {
jameinel (Owner), Dec 7, 2017

Would it be better to pass a func() => hub rather than the raw hub?
I'm just wondering about the lifetime of various objects and the chance that something gets restarted.

howbazaar (Owner), Dec 8, 2017

The hub is owned by the state pool, so it will always outlast any state object in the pool.
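
A rough sketch of that ownership, with illustrative names rather than the actual Juju types: the pool creates the hub (and the single txns.log watcher) once, and every State it opens borrows the same hub, so the hub necessarily outlives any State in the pool.

package state

// Hub is the publish/subscribe surface the pool owns; illustrative only.
type Hub interface {
	Publish(topic string, data interface{})
	Subscribe(topic string, handler func(topic string, data interface{})) func()
}

type State struct {
	modelUUID string
	hub       Hub // borrowed from the pool; never closed by the State
}

type StatePool struct {
	hub    Hub               // owned by the pool: outlives every State in it
	states map[string]*State // open State instances, keyed by model UUID
}

func NewStatePool(hub Hub) *StatePool {
	// The pool would also start the single txns.log watcher here and point
	// it at the hub; every State below shares the resulting event stream.
	return &StatePool{hub: hub, states: make(map[string]*State)}
}

func (p *StatePool) Get(modelUUID string) *State {
	if st, ok := p.states[modelUUID]; ok {
		return st
	}
	st := &State{modelUUID: modelUUID, hub: p.hub}
	p.states[modelUUID] = st
	return st
}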

Introduce two new watcher types.
The first watches the txns.log collection and publishes to the hub owned by the state pool.
The second watches the hub for changes and handles the watcher code.
jameinel (Owner) commented Dec 11, 2017

Unfortunately you rebased your changes, and I can't find the old commit hash to easily diff just what has changed since your previous revision.
https://bugs.launchpad.net/juju/+bug/1734903
is the bug I was referencing earlier that your patch is likely to make worse, because now there is only one connection, so anything going wrong kills everything.
I'll try to do some of my own stress testing of this, but offhand I think it will be very nice.
Adding and removing units was also something that was over-stressing Prodstack, so it would be good to see whether that also seems to be addressed by this change.

jameinel (Owner) commented Dec 18, 2017

I'm landing this into 2.3 via 2.2. I think that means this PR can be closed.

jameinel closed this Dec 18, 2017
