Make sampler counts faster, and also binned by level #252

Merged
merged 28 commits into dev from jcorbin.sampler_keying on Feb 8, 2017

Conversation

@jcorbin (Contributor) commented Jan 24, 2017

Towards #176:

  • benchmark the prior counters
  • implement faster, memory-bounded counters
  • add leveled keying

@mention-bot:

@jcorbin, thanks for your PR! By analyzing the history of the files in this pull request, we identified @akshayjshah, @prashantv and @creiht to be potential reviewers.

@jcorbin jcorbin changed the base branch from master to dev January 24, 2017 20:10
func (c *counters) Reset(key string) {
c.Lock()
count := c.counts[key]
c.Unlock()
Contributor:

My fault - I think that this should be an RLock/RUnlock, right? (We're getting rid of this counters implementation anyway, but it's possible that this'll affect some of the benchmarks.)

type bucket2 struct {
sync.Mutex
counts [_counters2PerLock]uint64
}
Contributor:

Now that we're not keeping track of the unhashed messages, what's the benefit of using counts + mutexes instead of a flat array of atomics?


// xorstring converts a string into a uint64 by xoring together its
// codepoints. It works by accumulating into a 64-bit "ring" which gets
// rotated by the apparent "byte width" of each codepoint.
Contributor:

Comment's no longer accurate (in case you forget when un-WIP-ing this PR.)


// xshrr computes a "randomly" rotated xorshift; this is the "XSH RR"
// transformation borrowed from the PCG family of random generators. It
// returns a 32-bit output from a 64-bit state.
Contributor:

I'm probably a reasonable stand-in for a user of this library, and I hadn't heard of "PCG random generators" until talking to you. If we're going to name-check this paper, let's include a link :)
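
For reference, a minimal Go sketch of the XSH RR output step from the PCG paper (M.E. O'Neill, https://www.pcg-random.org/) -- this is just the 64-bit-state-to-32-bit-output transformation, not the PR's full string hash, which also threads the input bytes through the generator state:

package pcg

// xshrr is the PCG "XSH RR" output step: xorshift the high bits down,
// then rotate right by an amount taken from the top 5 bits of the state.
func xshrr(state uint64) uint32 {
	xorshifted := uint32(((state >> 18) ^ state) >> 27)
	rot := uint32(state >> 59)
	return (xorshifted >> rot) | (xorshifted << ((32 - rot) & 31))
}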

"conjecture",
"Arctomys",
"guapilla",
"Sassenach",
Contributor:

Perhaps overly sensitive, but let's not include vaguely racist epithets (even if they're randomly selected from the system dictionary). In my understanding, "Sassenach" is in the same ballpark as "honky" and "bouzhoi."

"raphidiferous",
"pooh",
"Tyndallize",
},
Contributor:

Can we use this same test data to generate a much larger set of n-grams, and use that to sanity check the hash function? Just want to make sure that we don't mess up and end up with way more hash collisions than expected.
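
A hedged sketch of what such a check could look like (the word list and hash.XSHRR follow this PR; the bucket count and collision threshold are illustrative smoke-test values, not a rigorous statistical bound):

package hash_test

import (
	"testing"

	"go.uber.org/zap/internal/hash"
)

// TestBigramCollisions crosses the test words with themselves to build
// bigrams, then checks that no hash bucket is wildly overloaded.
func TestBigramCollisions(t *testing.T) {
	words := []string{"conjecture", "Arctomys", "guapilla", "raphidiferous", "pooh", "Tyndallize"}
	const buckets = 4096
	counts := make(map[uint32]int)
	for _, a := range words {
		for _, b := range words {
			counts[hash.XSHRR(a+" "+b)%buckets]++
		}
	}
	// 36 bigrams spread over 4096 buckets: more than a few landing in
	// any one bucket suggests a badly broken hash.
	for bucket, n := range counts {
		if n > 4 {
			t.Errorf("bucket %d holds %d bigrams; want a near-uniform spread", bucket, n)
		}
	}
}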

Contributor:

👏 for taking this idea much further than I would have. Dunno about Prashant, but I'm satisfied that XSHRR is a reasonable option.

@jcorbin (Contributor, Author) commented Feb 1, 2017

WIP: updated, broke out, benchmarked, and compared distributions on hash function alternatives; the only clear win to me is taking over the implementation:

  • the real win is dropping the string -> []byte alloc that using the stdlib hash/fnv would incur
  • fnv and my quick riff on pcg32-xsh-rr are about equally performant and well distributed
  • xshrr is marginally slower, but may be marginally "better" (not that I think it matters currently)

I'm going to continue on and actually implement the level-bucketing work, and reduce locking in counters2 itself.

Future directions:

  • a better testing corpus (with such things as sugared-logger-esque lines that are 99% the same but for one number)
  • a count-min sketch would probably be better than a single array of hash buckets; the xshrr hash should be easy to parameterize (by re-introducing the LCG additive constant and varying it)

Notebook: https://github.com/uber-go/zap/blob/d94151cd4490bb8014fec7d4f3e442b67dd0606e/internal/hash/hash_dist.ipynb

@jcorbin jcorbin force-pushed the jcorbin.sampler_keying branch 2 times, most recently from 54b7c28 to a338d78 on February 2, 2017 00:06
@jcorbin jcorbin changed the title WIP: sampler counts and keying Make sampler counts faster, and also binned by level Feb 2, 2017
@jcorbin (Contributor, Author) commented Feb 2, 2017

Updated:

  • now benchmarks Sampler.Check
  • dropped the old locked-map counters
  • added level binning

return &counters{}
}

type counters [_numLevels][_countersPerLevel]uint64
Collaborator:

If we're using these atomically, can we use atomic.Uint64 here? We already have a dependency on the atomic package.

Contributor (Author):

Well:

  • I suspect that it will be less memory-efficient, if not actually slower, due to atomic.Uint64 being a struct with a pointer field, rather than the actual uint64 values living in this array
  • the only use we're getting from go.uber.org/atomic is the AtomicLevel type, which is already a struct, and arguably would be simpler if it used the standard sync/atomic functions directly

I'll benchmark to test any supposed difference, but more towards point 2: I really don't think that go.uber.org/atomic is adding much value for zap at this point.

Collaborator:

It's not a struct with a pointer field; the field is a value.
https://github.com/uber-go/atomic/blob/master/atomic.go#L175

The memory layout will be exactly the same, there should be no indirection either way.

The only performance difference will be because of the extra function call, which Go should be inlining but doesn't yet (functions that call other functions are not inlined). Unfortunately that will probably add a handful of nanoseconds.

We should probably have benchmarks in the atomic package itself to measure the overhead.

The atomic package only really provides a couple of things:

  • safety: since you can't access the underlying value directly, you have to use the atomic accessors. This avoids accidentally doing a non-atomic read of an atomic value, which causes a data race.
  • readability: rather than atomic.AddUint64(&c[i][j], 1), the code becomes c[i][j].Inc()
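
To make the two variants concrete, a side-by-side sketch (the constants are illustrative; go.uber.org/atomic's Uint64 stores its value directly, so both arrays have identical memory layout):

package counters

import (
	syncatomic "sync/atomic"

	"go.uber.org/atomic"
)

const (
	_numLevels        = 7    // illustrative
	_countersPerLevel = 4096 // illustrative
)

// Plain uint64s, incremented through the sync/atomic free functions.
type rawCounters [_numLevels][_countersPerLevel]uint64

func (c *rawCounters) inc(i, j int) uint64 { return syncatomic.AddUint64(&c[i][j], 1) }

// Wrapped values: same layout, but the accessor methods make an
// accidental non-atomic read impossible.
type wrappedCounters [_numLevels][_countersPerLevel]atomic.Uint64

func (c *wrappedCounters) inc(i, j int) uint64 { return c[i][j].Inc() }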

Contributor:

Didn't realize this would be a point of debate. I'd also strongly prefer that we use the atomic package.

func (c *counters) Inc(lvl Level, key string) uint64 {
i := lvl - minLevel
j := c.hash(key) % _countersPerLevel
return atomic.AddUint64(&c[i][j], 1)
Collaborator:

can we have a method to return the specific counter?

func (c *counters) counterFor(lvl Level, key string) *atomic.Uint64

Contributor (Author):

Sure, makes sense (modulo the return type, per my reservations above about the atomic package).


func (c *counters) Inc(lvl Level, key string) uint64 {
i := lvl - minLevel
j := c.hash(key) % _countersPerLevel
Collaborator:

We pick _countersPerLevel ourselves, and we've picked a nice power of 2, so we can use bit masking here rather than a mod:

const _countersPerLevelMask = _countersPerLevel - 1

[...]
j = c.hash(key) & _countersPerLevelMask

Contributor (Author):

Good point. I was hoping that the Go compiler knew this trick, but I've not checked or verified; will do ;-)

}

func (c *counters) hash(key string) uint32 {
return hash.XSHRR(key) % _countersPerLevel
Collaborator:

We're modding here, and we're modding in Reset and Inc. I think hash should not do any hashing and just return a uint32 and the counterFor method should be the only place we do any hashing.

Contributor:

Assuming you mean that hash shouldn't do any modding, and counterFor should be the only place we mod.

Contributor (Author):

Yup, this is a holdover from a prior form of the code; will fix. Once we have counterFor, it can just call hash.Foo directly rather than go through a one-line func (*counters) hash, imo.

Facility

tick time.Duration
counts *counters
Collaborator:

any advantage to using a pointer here rather than just a value, and then being able to remove newCounters?

Contributor:

We want the counts to be shared between parent and child loggers (so that sampling imposes a predictable limit on global logging load), so a pointer makes sense to me.

Collaborator:

Ah of course, makes sense.
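
A runnable sketch of that sharing (the struct contents and clone method are illustrative, not this PR's exact code):

package main

import "time"

type counters struct{} // stand-in for the per-level bucket table

type sampler struct {
	tick   time.Duration
	counts *counters // pointer: shared, never copied per logger
}

// clone mimics deriving a child logger: the shallow copy still points
// at the parent's counters table, so the sampling budget stays global.
func (s *sampler) clone() *sampler {
	c := *s
	return &c
}

func main() {
	parent := &sampler{tick: time.Second, counts: &counters{}}
	child := parent.clone()
	println(parent.counts == child.counts) // true
}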

// FNV32 computes the 32-bit FNV-1 hash of s.
func FNV32(s string) uint32 {
const (
offset32 = 2166136261
Collaborator:

since the constants are the same, should we make them top-level constants?

Contributor (Author):

Sure, but we'll probably remove the generality of the hash module and just stick with either XSHRR or FNV32a to taste (probably inlined in zapcore/sampler.go).

This internal/hash was a demonstrative step for this PR, unless you'd rather we keep it?

Collaborator:

Ahh, I see, makes sense. This analysis is great, so I'd love to keep it somehow: either in this PR as-is (using a separate PR to merge just the hash function we want), or in the internal/hash folder.
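
For concreteness, a sketch of the string-keyed 32-bit FNV-1a that the discussion converged on (the constants are the standard FNV offset basis and prime; indexing the string's bytes directly avoids the string -> []byte allocation that the stdlib hash/fnv would force):

package main

import "fmt"

// fnv32a computes the 32-bit FNV-1a hash of s without converting it to
// a []byte. (FNV-1 multiplies then xors; FNV-1a xors then multiplies.)
func fnv32a(s string) uint32 {
	const (
		offset32 = 2166136261
		prime32  = 16777619
	)
	hash := uint32(offset32)
	for i := 0; i < len(s); i++ {
		hash ^= uint32(s[i])
		hash *= prime32
	}
	return hash
}

func main() {
	fmt.Printf("%08x\n", fnv32a("sampler key"))
}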

"go.uber.org/zap/internal/hash"
)

// XXX limit to go 1.7+, or stop using sub-benchmarks...
Contributor:

At this point, I'm comfortable with a separate pre-PR to drop Go 1.6 support. 1.8 is just around the corner, so we'd then support the latest release and one older.

rand100s4 = randCross(rand100s3, rand100, 100000)
)

var testCorpii = []struct {
Contributor:

Latin nit: "testCorpora"

Contributor (Author):

Sure, I didn't check or verify my pluralization; I'll borg yours ;-)


Or render it into English, "corpses". Or "bodies" in the Anglish.

zapcore/level.go Outdated
@@ -49,6 +49,9 @@ const (
PanicLevel
// FatalLevel logs a message, then calls os.Exit(1).
FatalLevel

minLevel = DebugLevel
maxLevel = FatalLevel
Contributor:

I'm not always good about remembering, but we should underscore-prefix unexported globals.


package hash

// XSHRR hashes a string using the by threading each byte of it thru the PCG32
Contributor:

missing word here.

}
return s.Facility.Check(ent, ce)
}

Contributor:

Can we also move the unit tests for the sampling facility to sampler_test.go?

@akshayjshah (Contributor):
Love the IPython notebook - I had no idea GitHub rendered them so well. Some small nits aside, I'm satisfied with using either an internal FNV hash or XSHRR.

@jcorbin (Contributor, Author) commented Feb 2, 2017

Okay, so unless @prashantv disagrees, I'll just roll with xshrr; I don't feel strongly, other than in theory:

  • the MCG constant is bigger, so it probably leads to more mixing for short strings
  • it has a final xorshift and random-rotation "hardening" step (the entire difference between an [LM]CG and a PCG, btw)
  • it parameterizes easily, if we ever need to move up to a count-min sketch (see the sketch after this list)
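
To illustrate that last point, a hypothetical parameterization (the byte-mixing step is a guess at the flavor of the PR's helper, not its actual code; pcgMult is the standard PCG32 multiplier):

package main

import "fmt"

const pcgMult = 6364136223846793005 // standard PCG32 LCG multiplier

// xshrrSeeded re-introduces the LCG additive constant ("stream") so
// that varying inc yields a family of hash functions, as a count-min
// sketch would need. LCG increments must be odd to be full-period.
func xshrrSeeded(s string, inc uint64) uint32 {
	inc |= 1
	var state uint64
	for i := 0; i < len(s); i++ {
		state = (state+uint64(s[i]))*pcgMult + inc
	}
	// XSH RR output step: xorshift the high bits down, rotate right.
	xorshifted := uint32(((state >> 18) ^ state) >> 27)
	rot := uint32(state >> 59)
	return (xorshifted >> rot) | (xorshifted << ((32 - rot) & 31))
}

func main() {
	// Two streams act as two near-independent hashes of the same key.
	fmt.Println(xshrrSeeded("msg", 2), xshrrSeeded("msg", 4))
}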

@prashantv (Collaborator):
Thanks for the awesome IPython notebook @jcorbin!

I don't have any strong opinions on hashes; I've seen fnv32 used more, which is why I originally suggested it. Looking at the output, it seems to me that:

  • fnv32a has a better distribution than xshrr for this corpus (higher min, lower max, lower stddev)
  • fnv32a is very slightly faster

The data seems to point towards fnv32a -- do you believe that xshrr will be better in more real-world log scenarios, Josh?

@jcorbin (Contributor, Author) commented Feb 3, 2017

Yup, the ordering of fnv32a and xshrr varies per experiment run (recall that the notebook creates a new random shuffling of /usr/share/dict/words every time you run it), so on distribution I'd call them "competitive".

Micro-perf is certainly a ding against xshrr.

While the larger constant and extra hardening step help me "sleep better at night", they've not proven material for the (arguably unrealistic) test corpora.

I'm more than happy to just use fnv32a for now, writing down the pcg32-xsh-rr trick for later consideration if we decide that a count-min sketch is better (since I'd be more confident parameterizing pcg32-xsh-rr to yield a family of hash functions than fnv).

@jcorbin (Contributor, Author) commented Feb 3, 2017

Updated:

  • dropped xshrr; switched to an inlined fnv32a
  • switched to go.uber.org/atomic
  • addressed other feedback


func (c *counters) get(lvl Level, key string) *atomic.Uint64 {
i := lvl - _minLevel
j := fnv32a(key) % _countersPerLevel
Contributor (Author):

@prashantv confirmed that the 1.7 compiler is sufficiently clever here.

The modulo compiles to & (4096-1):

         .          .      8a701: ANDL $0xfff, AX

As seen in:

(pprof) disasm counters..get
Total: 34.89s
ROUTINE ======================== go.uber.org/zap/zapcore.(*counters).get
     1.33s      3.81s (flat, cum) 10.92% of Total
     120ms      120ms      8a6a0: GS MOVQ GS:0x8a0, CX
      10ms       10ms      8a6a9: CMPQ 0x10(CX), SP
         .          .      8a6ad: JBE 0x8a728
      80ms       80ms      8a6af: SUBQ $0x28, SP
      40ms       40ms      8a6b3: MOVQ BP, 0x20(SP)
      10ms       10ms      8a6b8: LEAQ 0x20(SP), BP
         .          .      8a6bd: MOVQ 0x40(SP), AX
      60ms       60ms      8a6c2: MOVQ AX, 0(SP)
      50ms       50ms      8a6c6: MOVQ 0x48(SP), AX
      60ms       60ms      8a6cb: MOVQ AX, 0x8(SP)
      70ms       70ms      8a6d0: MOVZX 0x38(SP), AX
      30ms       30ms      8a6d5: INCL AX
      40ms       40ms      8a6d7: MOVB AL, 0x1f(SP)
      60ms      2.54s      8a6db: CALL go.uber.org/zap/zapcore.fnv32a(SB)
      50ms       50ms      8a6e0: MOVL 0x10(SP), AX
     240ms      240ms      8a6e4: MOVQ 0x30(SP), CX
      20ms       20ms      8a6e9: TESTB AL, 0(CX)
     250ms      250ms      8a6eb: MOVZX 0x1f(SP), DX
         .          .      8a6f0: MOVSX DL, DX
         .          .      8a6f4: CMPQ $0x6, DX
         .          .      8a6f8: JAE 0x8a721
         .          .      8a6fa: SHLQ $0xf, DX
      40ms       40ms      8a6fe: ADDQ CX, DX
         .          .      8a701: ANDL $0xfff, AX
         .          .      8a706: CMPQ $0x1000, AX
         .          .      8a70c: JAE 0x8a721
         .          .      8a70e: LEAQ 0(DX)(AX*8), AX
     100ms      100ms      8a712: MOVQ AX, 0x50(SP)
         .          .      8a717: MOVQ 0x20(SP), BP
         .          .      8a71c: ADDQ $0x28, SP
         .          .      8a720: RET
         .          .      8a721: CALL runtime.panicindex(SB)
         .          .      8a726: UD2
         .          .      8a728: CALL runtime.morestack_noctxt(SB)
         .          .      8a72d: JMP go.uber.org/zap/zapcore.(*counters).get(SB)
         .          .      8a732: INT $0x3
         .          .      8a733: INT $0x3
         .          .      8a734: INT $0x3
         .          .      8a735: INT $0x3
         .          .      8a736: INT $0x3
         .          .      8a737: INT $0x3
         .          .      8a738: INT $0x3
         .          .      8a739: INT $0x3
         .          .      8a73a: INT $0x3
         .          .      8a73b: INT $0x3
         .          .      8a73c: INT $0x3
         .          .      8a73d: INT $0x3
         .          .      8a73e: INT $0x3

Collaborator:

Thanks for confirming Josh!

@akshayjshah (Contributor) left a comment:

Generally looks great; just the import order and that commented-out test.

"lithotint",
"bargander",
"opisthographical",
"cacography",
Contributor:

I'm adding "cacography" to my day-to-day vocabulary.

"github.com/stretchr/testify/assert"
"go.uber.org/zap/internal/observer"
"go.uber.org/zap/testutils"
. "go.uber.org/zap/zapcore"
Contributor:

nit: import ordering. assert should be in a separate group.


// assert.Equal(t, []spy.Log(nil), sink.Logs(), "Unexpected output when sampling with Check.")
// }
// }
Contributor:

Can we restore this test now? We should be able to use the internal exit package to make sure that we're actually sampling messages at panic and fatal too.

@jcorbin (Contributor, Author) commented Feb 3, 2017

Turns out that the commented-out test is just wholly obsolete:

  • always panicking or fatal-exiting is zap.Logger's concern
  • we already have tests for that, e.g. TestLoggerAlwaysPanics



func TestSampler(t *testing.T) {
for _, lvl := range []Level{DebugLevel, InfoLevel, WarnLevel, ErrorLevel, DPanicLevel} {
sampler, logs := fakeSampler(DebugLevel, time.Minute, 2, 3)
Collaborator:

I don't think we have any tests that ensure that separate levels use separate counters. I was able to check out the code and apply the following patch:

--- a/zapcore/sampler.go
+++ b/zapcore/sampler.go
@@ -48,6 +48,7 @@ func (c *counters) Reset(lvl Level, key string) {
 func (c *counters) get(lvl Level, key string) *atomic.Uint64 {
        i := lvl - _minLevel
        j := fnv32a(key) % _countersPerLevel
+       j = 0
        return &c[i][j]
 }

And all the tests still passed. Can we add a test for this case?
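
A hedged sketch of such a test (newCounters, get, and the level constants follow this PR; Inc and Load come from go.uber.org/atomic):

func TestCountersSeparateLevels(t *testing.T) {
	c := newCounters()
	c.get(InfoLevel, "msg").Inc()
	if got := c.get(DebugLevel, "msg").Load(); got != 0 {
		t.Errorf("debug counter for the same key = %d; want 0 (levels must not share bins)", got)
	}
	if got := c.get(InfoLevel, "msg").Load(); got != 1 {
		t.Errorf("info counter = %d; want 1", got)
	}
}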

}

if n := s.counts.Inc(ent.Level, ent.Message); n > s.first {
if n == s.first+1 {
Collaborator:

Changing this s.first+1 to s.first doesn't affect any tests, which means:

  • we should add more test coverage
  • we should better define what the behaviour of "first" should be. I assume "first log every second" means that if we log once a second, all logs would be output. However, that's not the behaviour I'm seeing:
func TestSamplerReset(t *testing.T) {
       sampler, logs := fakeSampler(DebugLevel, time.Second, 1, 10000)
       writeIter(sampler, 1, DebugLevel)
       time.Sleep(2 * time.Second)
       writeIter(sampler, 2, DebugLevel)

       expected := buildExpectation(DebugLevel, 1, 2)
       assert.Equal(t, expected, logs.All())
}

For this PR, I think we should just add a bit more test coverage around this statement (so we know exactly what behaviour we currently have), and then open an issue describing the behaviour we want. The ideal behaviour may be expensive with the current implementation, since we'd have to start the timer as soon as we see the first message, which is definitely not ideal.

Contributor:

Good eyes - this is wrong.

In practice, I don't think that the correct behavior will be so bad, since logging only the first message each second is pretty extreme. I'm planning to make the default 100 first and 1000 thereafter.

Collaborator:

Depends on which correct behaviour we're talking about I think. We have a few options:

  1. the current option of s.first+1, which I don't think has any benefits
  2. s.first, which will start the timer as soon as we're about to start dropping, and handles my example above better
  3. n == 1

Let's consider another example: we're logging every 0.5 seconds, and the parameters are s.first = 2, s.tick = time.Second, s.thereafter = 100 (or some high number).

Option 3 would log every message (which seems like the ideal behaviour), option 2 would drop every 3rd message, option 1 would drop every 3rd and 4th message.

Option 3 would also start a timer every s.tick for every log message, which seems really bad. I think option 2 is a good compromise between performance and sampling.

@akshayjshah (Contributor):

I don't think that the test is obsolete, since the other tests aren't exercising Panic and Fatal at all (though level coverage is pretty spotty). We should be able to use the internal exit package to safely ensure that we're not exiting.

@akshayjshah (Contributor) commented Feb 4, 2017 via email

@prashantv (Collaborator):

Option 3 is definitely my preferred solution, but making it efficient will require rethinking the sampling logic (using timestamps). However, I don't think that change should be made as part of this PR; the refactoring to use bins + hashing is unrelated to the sampling-tick logic, so we should fix that separately.

@jcorbin (Contributor, Author) commented Feb 6, 2017

Updated TestSamplerCheck to cover all levels (and added ObservedLogs.TakeAll in passing).

@jcorbin jcorbin merged commit 10589ed into dev Feb 8, 2017
@jcorbin jcorbin deleted the jcorbin.sampler_keying branch February 8, 2017 01:05
prashantv added commits that referenced this pull request Feb 13, 2017

See discussion on how the current sampler is wrong:
#252 (comment)

Instead of using `time.AfterFunc` and resetting timers, use timestamps
on the entries, and maintain a sliding window for the current "tick"
with atomics to track when the counters should be reset.

The alternate approach is to start the timer as soon as the first
message for any bucket is logged, but that would cause more timers to be
started, which costs an allocation as well as requiring a global lock on
the timer heap.
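
A sketch of that sliding-window reset (names are illustrative, using go.uber.org/atomic): each bucket pairs its count with a reset deadline and resets itself lazily when a log entry arrives in a new tick.

import (
	"time"

	"go.uber.org/atomic"
)

type counter struct {
	resetAt atomic.Int64  // unix-nano deadline of the current tick
	counter atomic.Uint64 // count within the current tick
}

func (c *counter) IncCheckReset(t time.Time, tick time.Duration) uint64 {
	tn := t.UnixNano()
	resetAfter := c.resetAt.Load()
	if resetAfter > tn {
		return c.counter.Inc() // still inside the current tick
	}
	c.counter.Store(1)
	newResetAfter := tn + tick.Nanoseconds()
	if !c.resetAt.CAS(resetAfter, newResetAfter) {
		// Lost a race with a concurrent reset, which already stored 1,
		// so count this entry on top of it.
		return c.counter.Inc()
	}
	return 1
}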
akshayjshah pushed a commit that referenced this pull request Feb 15, 2017
* Break out the zapcore/sampler module
* Benchmark sampling
* Re-implement the counts table to be lockless and static-memory
* Use two-level count binning to separate levels