Buffer mongo log record insertions #7502

Merged
merged 7 commits into juju:2.2 on Jun 15, 2017

Conversation

Member

axw commented Jun 14, 2017

Description of change

  • Refactor and simplify the state.DbLogger code.
  • Introduce a state/logdb package, which I intend to move the existing DbLogger and friends into, but first things first. This package initially holds a BufferedLogger, which buffers log records and flushes them to an underlying logger (e.g. DbLogger) when either the buffer capacity or a time interval is exceeded.
  • Update state.DbLogger.Log to insert records in bulk.
  • Update the "release" function returned by StatePool.Get, and StatePool.Remove, to report whether or not they remove/close the State.
  • Update the apiserver logsink/logtransfer code to share a single DbLogger for all uses of a shared state pool entry, to cut down on the number of sockets used for logging.
  • Update the apiserver logsink/logtransfer code to use state/logdb.BufferedLogger to buffer log writes. The parameters are currently hard-coded: a buffer capacity of 1024 and a flush interval of 2 seconds. We can make these configurable as needed; a rough wiring sketch follows this list.
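
For orientation, here is a rough sketch of how these pieces are intended to fit together. It assumes a constructor along the lines of logdb.NewBufferedLogger(dest, capacity, interval); the real constructor and apiserver wiring may differ in detail.

```go
package sketch

import (
	"time"

	"github.com/juju/juju/state"
	"github.com/juju/juju/state/logdb"
)

// newModelLogger wires one DbLogger per model behind a BufferedLogger.
// The NewBufferedLogger signature is an assumption for illustration only.
func newModelLogger(st *state.State) *logdb.BufferedLogger {
	dbLogger := state.NewDbLogger(st) // shared by all logsink connections for the model
	return logdb.NewBufferedLogger(
		dbLogger,
		1024,          // dbLoggerBufferSize: flush once 1024 records are buffered...
		2*time.Second, // ...or after 2 seconds, whichever comes first
	)
}
```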

Note to reviewers: there are a few refactoring commits, which you may wish to review first. If it makes your life easier, I'll split this up - I just wanted to get it in front of people ASAP.

QA steps

  1. juju bootstrap localhost
  2. juju add-machine
  3. juju debug-log -m controller

Watch the logs roll in; perform some activity and observe that, unless there is a flurry of logs, debug-log will take ~2 seconds to update. The number of mongo sockets and insertions should be reduced from before. I added temporary instrumentation to prove this to myself, but it should be possible to eyeball the improvements via mongo stats with a larger deployment.

Documentation changes

None.

Bug reference

None.

axw added some commits Jun 14, 2017

Don't pass pointers to state.LogTailerParams
Passing around by pointer is a microoptimisation for code
that is called infrequently, at the expense of immutability.
state: drop EntityDbLogger
Drop the EntityDbLogger abstraction, and use DbLogger everywhere,
reusing the LogRecord structure.
Owner

jameinel commented Jun 14, 2017

Some comments. I'd really like to have real perf tests so we know this has a genuine positive impact versus the added buffering/complexity.

@@ -596,7 +599,7 @@ func (srv *Server) endpoints() []apihttp.Endpoint {
)
add("/model/:modeluuid/applications/:application/resources/:resource", &ResourcesHandler{
- StateAuthFunc: func(req *http.Request, tagKinds ...string) (ResourcesBackend, func(), names.Tag, error) {
+ StateAuthFunc: func(req *http.Request, tagKinds ...string) (ResourcesBackend, state.StatePoolReleaser, names.Tag, error) {
@jameinel

jameinel Jun 14, 2017

Owner

This is also just a drive-by, right? Not related to the dblogger changes? You just saw we weren't using the state pool and releasing the States?

@axw

axw Jun 15, 2017

Member

This is from 5bc3e7a, but no functional change in this part of the code. The StatePoolReleaser has a boolean result that tells the caller whether or not releasing caused the State object to be closed and removed from the pool.
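
As a hedged illustration of those semantics (the helper below is invented, and it assumes the releaser is roughly a func() bool, per the description above):

```go
package sketch

import (
	"github.com/juju/errors"

	"github.com/juju/juju/state"
)

// withPooledState runs f against a pooled State and releases it afterwards.
// The boolean from the releaser reports whether releasing closed the State
// and removed it from the pool.
func withPooledState(pool *state.StatePool, modelUUID string, f func(*state.State) error) error {
	st, release, err := pool.Get(modelUUID)
	if err != nil {
		return errors.Trace(err)
	}
	defer func() {
		if removed := release(); removed {
			// The *State is no longer usable; anything cached against it
			// (e.g. a shared DbLogger) should be disposed of here.
		}
	}()
	return f(st)
}
```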

@@ -32,7 +32,7 @@ type debugLogHandler struct {
@jameinel

jameinel Jun 14, 2017

Owner

no actual changes here, just a *struct to struct, right?

@axw

axw Jun 15, 2017

Member

Yep, there's a few refactoring commits up front.

@@ -18,7 +18,7 @@ func newDebugLogDBHandler(ctxt httpContext) http.Handler {
func handleDebugLogDBRequest(
st state.LogTailerState,
- reqParams *debugLogParams,
+ reqParams debugLogParams,
@jameinel

jameinel Jun 14, 2017

Owner

I assume the same thing here, just changing from *struct to struct, along with the tests.

If otherwise, please ask for a deeper review.

@axw

axw Jun 15, 2017

Member

Yep.

@@ -116,7 +120,10 @@ func checkPermissions(tag names.Tag, acceptFunc common.GetAuthFunc) (bool, error
// has admin permissions on the controller model. The method also gets the
// model uuid for the model being migrated from a request header, and returns
// the state instance for that model.
-func (ctxt *httpContext) stateForMigration(r *http.Request, requiredMode state.MigrationMode) (st *state.State, returnReleaser func(), err error) {
+func (ctxt *httpContext) stateForMigration(
+ r *http.Request,
@jameinel

jameinel Jun 14, 2017

Owner

nit: seems odd to wrap the params but not the returns

+ // dbLoggerBufferSize is the capacity of the log buffer.
+ // When the buffer fills up, it will be flushed to the
+ // database.
+ dbLoggerBufferSize = 1024
@jameinel

jameinel Jun 14, 2017

Owner

How did we derive this constant? Was there testing showing that 10k was too big, or that 100 was too small?
Should we try to expose knobs in agent.conf, as we did for other arbitrary constants we may want to tune?

@axw

axw Jun 15, 2017

Member

It's ~the same as what's used in the juju/txn pruning code. I did do some testing, and saw that in a small deployment the buffer capacity was basically never reached, which means we'll generally wait 2s to flush. Yes, I think we'll want to expose a tuning knob, but I'll probably do that in a follow-up. I'll do some more analysis first.

@jameinel

jameinel Jun 15, 2017

Owner

I should note that Bulk has a limit of 1000 records (according to the txn pruning code), so we probably want to limit the chunk size to something slightly smaller.
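
A sketch of the kind of chunking being suggested, keeping each Bulk run under the ~1000-operation limit mentioned above. The 1000 figure and helper shape are illustrative, not the PR's code.

```go
package sketch

import (
	"github.com/juju/errors"
	mgo "gopkg.in/mgo.v2"
)

// maxBulkOps caps the number of documents per mgo Bulk run; the exact
// value to use is an assumption here.
const maxBulkOps = 1000

// bulkInsert inserts docs in chunks so that no single Bulk run exceeds
// maxBulkOps operations.
func bulkInsert(coll *mgo.Collection, docs []interface{}) error {
	for len(docs) > 0 {
		n := len(docs)
		if n > maxBulkOps {
			n = maxBulkOps
		}
		bulk := coll.Bulk()
		bulk.Unordered()
		bulk.Insert(docs[:n]...)
		if _, err := bulk.Run(); err != nil {
			return errors.Trace(err)
		}
		docs = docs[n:]
	}
	return nil
}
```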

+
+// dispose closes all dbloggers in the map, and clears the memory. This
+// must not be called concurrently with any other dbloggers methods.
+func (d *dbloggers) dispose() {
@jameinel

jameinel Jun 14, 2017

Owner

disposeAll ?

@axw

axw Jun 15, 2017

Member

"dispose of the dbloggers" is how it reads to me

- s.dbLogger = state.NewEntityDbLogger(st, s.entity, s.version)
+ s.dblogger = s.dbloggers.get(st)
+ s.releaser = func() {
+ if removed := releaseState(); removed {
@jameinel

jameinel Jun 14, 2017

Owner

so is the return value of releaseState "has this model been deleted" ?

@axw

axw Jun 15, 2017

Member

More that the *State object is no longer usable, but effectively yes, because we don't remove them from the pool until the model is dead.

+// the specified capacity and duration; after either of which is exceeded,
+// the records will be flushed to the underlying logger.
+func (b *BufferedLogger) Log(in []state.LogRecord) error {
+ b.mu.Lock()
@jameinel

jameinel Jun 14, 2017

Owner

I'm curious if the logic around this would be clearer (or more confusing) if Log just sent a message onto a channel for a worker loop to consume, rather than having locks, etc.
Most of Juju code is based around worker loops.

@axw

axw Jun 15, 2017

Member

I did consider doing it that way, but having a goroutine looping for work means an always-on timer or a more complicated condition variable or something. Using a timer means waking up and burning CPU to find there's nothing to do. Probably doesn't matter much, but I'd like to get away from doing it where we can, and this is pretty simple and contained.

@jameinel

jameinel Jun 15, 2017

Owner

You can actually do the same tricks of not having the timer active when there isn't work, etc.
That said, in anything where we're actually active and care about the buffering, do we really think we'll be burning much CPU on timers checking?
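
For readers following this exchange, here is a minimal, self-contained sketch of the lock-plus-lazily-armed-timer approach being discussed. It is not the PR's BufferedLogger; all names here are invented for illustration.

```go
package sketch

import (
	"sync"
	"time"
)

// record stands in for state.LogRecord.
type record struct{}

// flushingBuffer flushes either when capacity is reached or when a timer
// fires; the timer is only armed while there is buffered work, so nothing
// wakes up periodically when the buffer is empty.
type flushingBuffer struct {
	mu       sync.Mutex
	buf      []record
	timer    *time.Timer
	capacity int
	interval time.Duration
	flushFn  func([]record) // e.g. DbLogger.Log; errors ignored here for brevity
}

func newFlushingBuffer(capacity int, interval time.Duration, flushFn func([]record)) *flushingBuffer {
	return &flushingBuffer{
		buf:      make([]record, 0, capacity),
		capacity: capacity,
		interval: interval,
		flushFn:  flushFn,
	}
}

func (b *flushingBuffer) Add(recs ...record) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.buf = append(b.buf, recs...)
	// ">=" per the review comment below; the real code also chunks its
	// input so a single flush never exceeds the buffer capacity.
	if len(b.buf) >= b.capacity {
		b.flush()
		return
	}
	if b.timer == nil {
		b.timer = time.AfterFunc(b.interval, b.onTimer)
	}
}

func (b *flushingBuffer) onTimer() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.flush()
}

// flush assumes b.mu is already held.
func (b *flushingBuffer) flush() {
	if b.timer != nil {
		b.timer.Stop()
		b.timer = nil
	}
	if len(b.buf) == 0 {
		return
	}
	// Pass a copy so the callee may retain the slice.
	b.flushFn(append([]record(nil), b.buf...))
	b.buf = b.buf[:0]
}
```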

state/logdb/buf.go
+ }
+ b.buf = append(b.buf, in[:n]...)
+ in = in[n:]
+ if len(b.buf) == cap(b.buf) {
@jameinel

jameinel Jun 14, 2017

Owner

while this is arguably correct, I'm always a fan of ">=" just to avoid "something caused it to go past the target, buffer everything forever after and never stop"

@axw

axw Jun 15, 2017

Member

Sure, done.

+ }
+ }
+ }
+ if len(b.buf) > 0 && b.flushTimer == nil {
@jameinel

jameinel Jun 14, 2017

Owner

This generally feels like a re-usable pattern that would be nice to use for Pinger. It is independent of what buf is and what function is called to flush the buffer.
TimeFlushingBuffer?
It's not a lot of code, so maybe it's easier to have it inlined, but it does seem like a pattern we're going to be repeating.

@axw

axw Jun 15, 2017

Member

It might be possible to do with an interface, but I can't see how at the moment. IMO it's simple and small enough to inline.

state/logdb/buf.go
+ b.flush()
+}
+
+func (b *BufferedLogger) flush() error {
@jameinel

jameinel Jun 14, 2017

Owner

a comment here about "flush assumes you are already holding the mutex" would be helpful for readers

@axw

axw Jun 15, 2017

Member

Done.

state/logdb/buf_test.go
+ case <-time.After(coretesting.LongWait):
+ c.Fatal("timed out waiting for logs to be flushed")
+ }
+}
@jameinel

jameinel Jun 14, 2017

Owner

The other parts you're missing are that a flush doesn't get triggered on its own for future logs, and when we will flush after future logging (the timer being set back to nil, and then being set again once we see another log).

Also, we're missing a test that if you Log() 1010 items, we flush 1000 of them immediately but wait the 1-minute interval to flush the remaining 10, and that those 10 are guaranteed to get flushed eventually.

And potentially the "log 2 items, then N > buffer" case. Maybe that is just combined with the previous test, but we want the general "some stuff is queued, we overflow the buffer, but still have stuff queued afterwards" case.

@axw

axw Jun 15, 2017

Member

Done.
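
A sketch of the sort of overflow test being asked for, under heavy assumptions: a constructor along the lines of logdb.NewBufferedLogger(dest, capacity, interval) and an underlying logger with a Log([]state.LogRecord) error method. The real test and signatures may differ, and checking the eventual flush of the remainder would additionally need a controllable clock.

```go
package logdb_test

import (
	"sync"
	"testing"
	"time"

	jc "github.com/juju/testing/checkers"
	gc "gopkg.in/check.v1"

	"github.com/juju/juju/state"
	"github.com/juju/juju/state/logdb"
)

func Test(t *testing.T) { gc.TestingT(t) }

type overflowSuite struct{}

var _ = gc.Suite(&overflowSuite{})

// recordingLogger records each flush it receives from the BufferedLogger.
type recordingLogger struct {
	mu      sync.Mutex
	flushes [][]state.LogRecord
}

func (r *recordingLogger) Log(in []state.LogRecord) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.flushes = append(r.flushes, append([]state.LogRecord(nil), in...))
	return nil
}

func (s *overflowSuite) TestOverflowFlushesFullChunkOnly(c *gc.C) {
	dest := &recordingLogger{}
	// Assumed constructor; the real one may also take a clock.
	b := logdb.NewBufferedLogger(dest, 1000, time.Minute)

	// 1010 records: the first 1000 fill the buffer and are flushed
	// immediately; the remaining 10 stay buffered until the interval fires.
	in := make([]state.LogRecord, 1010)
	c.Assert(b.Log(in), jc.ErrorIsNil)

	dest.mu.Lock()
	defer dest.mu.Unlock()
	c.Assert(dest.flushes, gc.HasLen, 1)
	c.Assert(dest.flushes[0], gc.HasLen, 1000)
}
```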

state/logs.go
+ }
+ }
+ // TODO(axw) copy session here and close after?
+ bulk := logger.logsColl.Bulk()
@jameinel

jameinel Jun 14, 2017

Owner

If we have a separate logger per State object, I think re-using the same session is better than creating sessions per Log call.

@axw

axw Jun 15, 2017

Member

Agreed, I should have dropped that - will do so now.

+ // TODO(axw) Use a controller-global int
+ // sequence for Id, so we can order by
+ // insertion.
+ Id: bson.NewObjectId(),
@jameinel

jameinel Jun 14, 2017

Owner

Object IDs have a per-State counter and an embedded timestamp; is that not sufficient? I guess the timestamp only has 1s resolution, which is why we have nanosecond timestamps. (Probably ms is enough, but something more than 1s.)

@axw

axw Jun 15, 2017

Member

It needs to be monotonic in the face of HA, so it's not sufficient if we want to preserve insertion order exactly. There may be an alternative to this; here I was just updating the name of the person responsible in the TODO.

@@ -445,7 +440,7 @@ func (t *logTailer) processReversed(query *mgo.Query) error {
return errors.Errorf("too many lines requested (%d) maximum is %d",
t.params.InitialLines, maxInitialLines)
}
- query.Sort("-e", "-t", "-_id")
+ query.Sort("-t", "-_id")
@jameinel

jameinel Jun 14, 2017

Owner

The index we have required (e, t, _id); did that change with the split-out, or did we just not change the query?

@axw

axw Jun 15, 2017

Member

Yeah, we no longer store the model UUID (what was "e") as part of the record. Each model's logs now go into a separate collection.

state/pool.go
@@ -104,33 +111,34 @@ func (p *StatePool) Get(modelUUID string) (*State, func(), error) {
// release indicates that the client has finished using the State. If the
// state has been marked for removal, it will be closed and removed
// when the final Release is done.
@jameinel

jameinel Jun 14, 2017

Owner

update this comment for what the boolean means?

@axw

axw Jun 15, 2017

Member

Done.

Member

axw commented Jun 15, 2017

> Do we have concrete information that this is better? If we changed logging
> to aggregate into time-based log objects (rather than every log being a
> separate object), then this would seem like a bigger win. And while
> Insert(100)*10 is probably better than Insert(1)*1000, it would be good to
> have a concrete understanding of what the buffering is going to get us.

This PR is just about improving the performance of log insertion, and I don't see how aggregating into time-based objects would help that. That would be helpful for querying and pruning.

I've eyeballed a definite CPU usage reduction, but I'll run again with Prometheus and Telegraf and provide a graph.

Member

axw commented Jun 15, 2017

The graphs aren't terribly easy to read without the annotations and such, so here are some highlights. I deployed 10 units of "fill-logs" from the acceptance-tests charm repo and ran "fill-machine" with 500MB on each machine, 10 times in a row, with the controller model's log level set to TRACE. This is not supposed to be super realistic, just driving logging up high enough that it impacts CPU.

  • without my changes, the number of in-use sockets sits at about 30; with my changes, it fluctuates between 6 and 7.
  • without my changes, CPU ranges between 25-30% user and 6-7% system; with my changes, between 20-25% user and 3-5% system. iowait% is roughly halved with my changes.
  • without my changes, the 1-minute load average is ~1.8; with my changes, just under 1.
Member

axw commented Jun 15, 2017

!!build!!

worker/dblogpruner/worker_test.go
@@ -140,11 +140,19 @@ func (s *suite) TestPrunesLogsBySize(c *gc.C) {
}
func (s *suite) addLogs(c *gc.C, t0 time.Time, text string, count int) {
- dbLogger := state.NewEntityDbLogger(s.state, names.NewMachineTag("0"), version.Current)
+ dbLogger := state.NewDbLogger(s.state) // names.NewMachineTag("0"), version.Current)

Based on jam's +1, lgtm with typos fixed

axw added some commits Jun 14, 2017

state: have DbLogger.Log insert records in bulk
DbLogger.Log now takes a slice of LogRecord, and
inserts them in bulk. The callers still only call
with one record for now.
state: introduce StatePoolReleaser
Change StatePool.Get to return a StatePoolReleaser,
rather than func(), and have it report whether or
not releasing caused the State object to be removed
from the pool. Also update the Remove method to
report the same (i.e. that removing actually caused
the removal immediately due to 0 references.)
state/logdb: add BufferedLogger
Add a BufferedLogger type, which buffers log records
and flushes when the buffer capacity is reached, or
when a timer expires.

In a follow-up I intend to move the existing log-related
code out of the state package, and into this new one.
apiserver: use a buffered database log writer
Instead of maintaining a state.DbLogger for
each agent, maintain one for each model, and
buffer writes. This should limit the number
of sockets required and reduce the write
pressure on mongo.
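
A hedged sketch of the "one logger per model" idea in that last commit; the apiserver's actual dbloggers type differs in detail, and the key and interface shapes here are assumptions for illustration.

```go
package sketch

import "sync"

// flusher stands in for the buffered, per-model database logger.
type flusher interface {
	Flush() error
	Close() error
}

// dbloggers caches one logger per model so that every logsink connection
// for that model shares a single mongo socket.
type dbloggers struct {
	mu      sync.Mutex
	newFor  func(modelUUID string) flusher // how to build a logger (assumption)
	loggers map[string]flusher
}

func (d *dbloggers) get(modelUUID string) flusher {
	d.mu.Lock()
	defer d.mu.Unlock()
	if l, ok := d.loggers[modelUUID]; ok {
		return l
	}
	if d.loggers == nil {
		d.loggers = make(map[string]flusher)
	}
	l := d.newFor(modelUUID)
	d.loggers[modelUUID] = l
	return l
}

// dispose closes all cached loggers and clears the map; as in the PR's
// doc comment, it must not be called concurrently with other methods.
// Errors are ignored here for brevity.
func (d *dbloggers) dispose() {
	for k, l := range d.loggers {
		l.Flush()
		l.Close()
		delete(d.loggers, k)
	}
	d.loggers = nil
}
```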
Member

axw commented Jun 15, 2017

$$merge$$

Contributor

jujubot commented Jun 15, 2017

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

@jujubot jujubot merged commit f18d7c0 into juju:2.2 Jun 15, 2017

1 check passed

github-check-merge-juju Ran tests against PR. Use !!.*!! to request another build. IE, !!build!!, !!retry!!
+ // dbLoggerFlushInterval is the amount of time to allow
+ // a log record to sit in the buffer before being flushed
+ // to the database.
+ dbLoggerFlushInterval = 2 * time.Second
@jameinel

jameinel Jun 15, 2017

Owner

I should have noticed this as well; we should definitely have this tweakable too. It does mean a specific delay between "juju do-something" and "juju debug-log" reporting that the thing is going on, and 2s is actually a rather long time when you're doing live debugging.
Given things like watcher limitations, maybe it's sane enough.

@axw

axw Jun 15, 2017

Member

Both are configurable in #7505.
