Make the state pool own the main txn log watcher. #8185
Conversation
jameinel
reviewed
Dec 7, 2017
I think a lot of this is just copied from the old txn watcher code, so probably is fine.
The general idea of publishing and consuming from a hub to spread it across models is good.
The performance difference is tremendous.
I'm a little concerned about whether everything comes back correctly if one piece of this puzzle starts bouncing.
+
+// HubWatcher listens to events from the hub and passes them on to the registered
+// watchers.
+type HubWatcher struct {
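For illustration, here is a minimal, self-contained Go sketch of the consuming side of this design, i.e. a per-model watcher fed from the shared hub instead of polling mongo. This is not the actual Juju code: the hub interface mirrors the SubscribeMatch call visible in the diff below, while the Change type, channel size, and method names are assumptions.

```go
// Package hubwatcher is a sketch only, not Juju's implementation.
package hubwatcher

import "fmt"

// Change is an assumed event payload: "document ID in collection C changed".
type Change struct {
	Collection string
	ID         interface{}
	Revno      int64
}

// eventHub mirrors the SubscribeMatch call shown in the diff: a topic matcher
// plus a handler, returning an unsubscribe func.
type eventHub interface {
	SubscribeMatch(match func(topic string) bool, handler func(topic string, data interface{})) func()
}

// hubWatcher consumes events from the shared hub rather than polling mongo.
type hubWatcher struct {
	changes chan Change
}

func newHubWatcher(hub eventHub) *hubWatcher {
	w := &hubWatcher{changes: make(chan Change, 100)}
	// Subscribe to everything; receiveEvent decides what it cares about.
	unsub := hub.SubscribeMatch(func(string) bool { return true }, w.receiveEvent)
	_ = unsub // a real watcher would unsubscribe when it is killed
	go w.loop()
	return w
}

func (w *hubWatcher) receiveEvent(topic string, data interface{}) {
	if ch, ok := data.(Change); ok {
		w.changes <- ch
	}
}

func (w *hubWatcher) loop() {
	for ch := range w.changes {
		// Here the real watcher would dispatch to the per-document watchers
		// registered against this State instance.
		fmt.Printf("change: %s/%v revno %d\n", ch.Collection, ch.ID, ch.Revno)
	}
}
```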
jameinel
Dec 7, 2017
Owner
I'm going to assume that most of this code is just copied from the old TXN Watcher code.
+    }
+    go func() {
+        // TODO: consider an event to restart the hub watcher
+        // if the txnwatcher restarts.
jameinel
Dec 7, 2017
Owner
How are we handling this? Just killing the hub watcher completely, and assuming the watcher is restarted at a higher level?
+    // TODO: consider an event to restart the hub watcher
+    // if the txnwatcher restarts.
+    unsub := hub.SubscribeMatch(
+        func(string) bool { return true }, w.recieveEvent,
jameinel
Dec 7, 2017
Owner
Can we have some sort of filter, a bit earlier in the stack, that would avoid evaluating all events?
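One way that could look, as a sketch only: give SubscribeMatch a real matcher so the hub never invokes the handler for unrelated topics. The topic names used here are assumptions, since the actual topic scheme is not shown in this diff.

```go
package hubwatcher

import "strings"

// topicFilter is a hypothetical subscription matcher: only the "starting"
// notification and change topics get through, so the hub can skip the handler
// for everything else. Both topic names are assumptions.
func topicFilter(topic string) bool {
	return topic == "txn.watcher.starting" || strings.HasPrefix(topic, "txn.change.")
}
```

Such a matcher would then replace the `func(string) bool { return true }` argument in the SubscribeMatch call below.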
+    return w, started
+}
+
+func (w *HubWatcher) recieveEvent(topic string, data interface{}) {
+    switch topic {
+    case txnWatcherStarting:
+        // This message is published when the main txns.log watcher starts. If
+        // this message is received here it means that the main watcher has
+// period is the delay between each sync.
+func (w *HubWatcher) loop() error {
+    for {
+        w.logger.Warningf("loop() select")
howbazaar
Dec 7, 2017
Owner
Yeah... I would have marked them critical, but the logger interface I defined didn't have that level. These were temporary logging statements that should be removed.
+    // syncEvents are stored first in, first out.
+    for i := 0; i < len(w.syncEvents); i++ {
+        e := &w.syncEvents[i]
+        w.logger.Warningf("flushing syncEvent %#v", e)
+    // we have started.
+    w.hub.Publish(txnWatcherStarting, nil)
+    backoff := PollStrategy.NewTimer(w.clock.Now())
+    next := w.clock.After(txnWatcherShortWait)
jameinel
Dec 7, 2017
Owner
Why would we always wait for the first event? Shouldn't this just be After(0)?
howbazaar
Dec 8, 2017
Owner
In reality, the difference between waiting 10ms and 0 is unnoticeable, but for testing, where you are manually advancing the clock, it gives a lot more control.
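A rough illustration of that trade-off (the Clock interface and shortWait value are assumptions; Juju has its own clock abstraction): because the loop only acts on clock.After, a test that owns the clock decides exactly when the first and every subsequent poll fires, whereas After(0) would make the first poll race with test setup.

```go
package txnwatcher

import "time"

// Clock is a minimal stand-in for Juju's clock abstraction so tests can
// substitute a manually advanced fake clock.
type Clock interface {
	Now() time.Time
	After(d time.Duration) <-chan time.Time
}

const shortWait = 10 * time.Millisecond

// pollLoop waits shortWait before the first poll instead of polling at once.
// With a real clock the 10ms is invisible; with a test clock the test advances
// time explicitly, so each poll happens exactly when the test says so.
func pollLoop(clock Clock, poll func() error, done <-chan struct{}) error {
	next := clock.After(shortWait)
	for {
		select {
		case <-done:
			return nil
		case <-next:
			if err := poll(); err != nil {
				return err
			}
			next = clock.After(shortWait)
		}
	}
}
```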
+    if qerr, ok := err.(*mgo.QueryError); ok {
+        // CappedPositionLost is code 136.
+        // Just in case that changes for some reason, we'll also check the error message.
+        if qerr.Code == 136 || strings.Contains(qerr.Message, "CappedPositionLost") {
howbazaar
Dec 7, 2017
Owner
Yeah, it is a bit ick, and to be honest, I don't think we need to care for this now.
-    ws.StartWorker(txnLogWorker, func() (worker.Worker, error) {
-        return watcher.New(st.getTxnLogCollection()), nil
-    })
+    if hub == nil {
jameinel
Dec 7, 2017
Owner
Would it be better to pass a func() => hub rather than the raw hub?
I'm just wondering about the lifetime of various objects and the chance that something gets restarted.
howbazaar
Dec 8, 2017
Owner
The hub is owned by the state pool, so it will always outlast any state object in the pool.
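A lifetime sketch of that ownership, with illustrative types and method names rather than Juju's actual API: the pool creates the hub and the single txns.log watcher once, and every State it hands out borrows that same hub, so the hub necessarily outlives each State in the pool.

```go
package statepool

// Hub stands in for the pubsub hub; TxnLogWatcher for the single shared
// txns.log poller. Both are owned by the pool.
type Hub struct{}

type TxnLogWatcher struct{ hub *Hub }

type State struct{ hub *Hub }

type StatePool struct {
	hub     *Hub
	watcher *TxnLogWatcher
	states  map[string]*State
}

func NewStatePool() *StatePool {
	hub := &Hub{}
	return &StatePool{
		hub:     hub,
		watcher: &TxnLogWatcher{hub: hub}, // tails txns.log and publishes to hub
		states:  make(map[string]*State),
	}
}

// Get returns the State for a model, creating it if needed. The State's hub
// watcher subscribes to the pool's hub rather than polling mongo itself.
func (p *StatePool) Get(modelUUID string) *State {
	if st, ok := p.states[modelUUID]; ok {
		return st
	}
	st := &State{hub: p.hub}
	p.states[modelUUID] = st
	return st
}
```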
Unfortunately you rebased your changes, and I can't find the old commit hash, so I can't easily diff just what has changed since your previous version.
I'm landing this into 2.3 via 2.2. I think that means this PR can be closed.
howbazaar commented Dec 7, 2017
Back in the dawn of time, there was only a single model and a single State instance. As the controller concept became a thing, we gained additional State instances, one for each model. A core aspect of how the event model works in Juju is watching the txns.log collection. As documents are added, updated, and removed by the transaction system, documents are added to the txns.log capped collection; each of these refers to the documents touched by a single transaction.

Each State instance has its own txn watcher, and each one of these tails the txns.log collection to watch for changes. As the number of models in a single controller grows, so does the number of things watching the txns.log collection. If the controller is HA, there are n x m watchers, where n is the number of models and m is the number of API servers. This creates quite a bit of idle load on mongo as the number of models grows above a small number. As numbers approach 200, there is a LOT of load, especially when actions occur that trigger a bunch of transactions touching a bunch of documents, such as removing an application, particularly one that has a bunch of subordinates.
So... this branch changes how the polling of the database is done. The state pool now has a watcher that publishes all document changes to a hub, which is also owned by the state pool. Whenever the state pool creates a new State instance, the base watcher for that State instance listens to the hub for changes rather than polling the database itself.
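As a rough sketch of the publishing side of that design (the tailer is faked here and the topic name is an assumption; the real code tails juju.txns.log with mgo): one goroutine reads the log and publishes each change once, and every model's hub watcher hears about it from the hub rather than issuing its own mongo query.

```go
package txnwatcher

// Change is the assumed payload: a document in a collection changed.
type Change struct {
	Collection string
	ID         interface{}
	Revno      int64
}

// hub is the publish half of the pubsub hub owned by the state pool.
type hub interface {
	Publish(topic string, data interface{})
}

// publishChanges is the heart of the shared watcher: each change read from the
// txns.log tail is published exactly once, regardless of how many models (and
// hence hub watchers) exist on this API server.
func publishChanges(h hub, tail <-chan Change) {
	for ch := range tail {
		h.Publish("txn.change", ch)
	}
}
```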
This drastically reduces the mongo load.
QA steps
I verified this by starting with a 2.2.6 controller, deploying 55 units to each of four models, and adding 700 empty models. This showed significant i/o load on the juju.txns.log collection.
After upgrading the controller, the load dropped significantly: from ~1000ms of read time per 10s to ~15ms per 10s.
Documentation changes
This is internal only, no documentation changes are necessary.
Bug reference
https://bugs.launchpad.net/juju/+bug/1733708