
different counts, documentation? #22

Closed
cjbottaro opened this issue Jun 20, 2011 · 15 comments

Comments

@cjbottaro

So there are 3 different counts:

stats.mystat
stats.timers.mystat.count
stats_counts.mystat

What's the difference between all of them? A quick look at the source code suggests the first two are "occurrences per second" and the last is "occurrences per minute", but I'm not so well versed in JavaScript or in what Graphite expects to be sent.

Is that right? Can someone update the readme to explain the differences between different counts?

Thanks.

@octoberman

I was wondering the same exact thing...

@loe

loe commented Aug 11, 2011

I do not understand the difference between stats and stats_counts.

@blysik

blysik commented Sep 13, 2011

I'm in the same boat. I'm using a client increment function, but I see floats in the graph output for stats.mystat; stats_counts.mystat looks like a saner value. I'd really like to know what each of these is.

@cwu

cwu commented Oct 27, 2011

From a quick look it seems like stats.counter is normalised by the flush interval, so it's a count per second, while stats_counts is the total count received in the flushInterval (default 10 s).

It's a bit confusing since it's not explained anywhere...
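A sketch of that normalization (my own illustration based on the behavior described in this thread, not the actual statsd source):

```python
FLUSH_INTERVAL = 10  # seconds, the statsd default

def flush_counter(name, count, flush_interval=FLUSH_INTERVAL):
    """Return the two metrics statsd emits for one counter on each flush."""
    return {
        f"stats.{name}": count / flush_interval,  # rate: events per second
        f"stats_counts.{name}": count,            # raw count in this interval
    }

print(flush_counter("mystat", 50))
# {'stats.mystat': 5.0, 'stats_counts.mystat': 50}
```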

@cjbottaro
Author

I think the count under stats.timers (i.e. stats.timers.blah.count) is per flush interval, so you have to divide by 10 (the default) to get per second.

@cwu

cwu commented Oct 27, 2011

From what I understand:

Statsd sends different values for these 'counts':

  • stats.timers.foo.count from issuing "foo:1|ms" is the absolute count for a flushInterval
  • stats.foo from issuing "foo:1|c" is the rate of counts / second.
  • stats_counts.foo from issuing "foo:1|c" is the absolute count for a flushInterval.
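For reference, those quoted wire messages are plain statsd line-protocol packets over UDP. A self-contained sketch, using a local socket on an ephemeral port to stand in for statsd (the real daemon listens on UDP 8125):

```python
import socket

# Receiver standing in for statsd so the example runs anywhere.
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))   # ephemeral port instead of statsd's 8125
recv.settimeout(5)
addr = recv.getsockname()

send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send.sendto(b"foo:1|c", addr)     # counter -> stats.foo and stats_counts.foo
send.sendto(b"foo:320|ms", addr)  # timer   -> stats.timers.foo.*

print(recv.recvfrom(1024)[0])     # b'foo:1|c'
print(recv.recvfrom(1024)[0])     # b'foo:320|ms'
```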

However the stats_counts values differ from the counter values when displayed in graphite.

Unlike timings, counters are constantly fed into graphite (with a 0 value if nothing happens). However, graphite averages the values it receives, which can be a pain since it really should sum counts. Because it averages by default, a rate works fine, since you're just taking an average of an average. But the way points get binned by the storage schema at longer time ranges messes up the absolute counts displayed by graphite.

So basically with the two values graphed you get:

stats.foo bin value = counts / second
stats_counts.foo bin value = counts / flushInterval

With stats.timers.foo.count, the count you're measuring is slightly different, since statsd doesn't send a 0 for it every flush interval. Null values aren't included in the average, so you only divide by the number of timing measurements you took. So this would be

stats.timers.foo.count bin value = `counts / # timing measurements`.

If you record timings every interval this will equal counts / flushInterval, but that is not always the case.
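To make the averaging problem concrete, a toy calculation with made-up numbers (my own illustration, not graphite code):

```python
flush_interval = 10          # seconds
counts = [50, 70, 60]        # raw counts in three consecutive flushes (30 s)

rates = [c / flush_interval for c in counts]   # the stats.foo datapoints
avg_rate = sum(rates) / len(rates)             # graphite bins by averaging
print(avg_rate)              # 6.0 events/sec -- still a meaningful rate

avg_count = sum(counts) / len(counts)          # stats_counts.foo, averaged
print(avg_count)             # 60.0 -- not the 180 events that actually happened
```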

tl;dr
statsd sends summed counts, but graphite graphs averages

How to interpret graphite graphs:

  • stats.foo = counts of foo / second
  • stats_counts.foo = counts of foo / flushInterval
  • stats.timers.foo.count = counts of foo / # timing measurements in bin

This is just speculation from my usage so correct me if anything seems wrong =).

@recursify

@cwu - Your comments were helpful...

Let's take the example of user logins. I want to log every time a user logs in. I'd like to be able to view the rate of logins/sec, as well as do aggregate rollups to see things like "how many users logged into the system, every hour, for the last 2 days". I'm unable to do the latter, and I think it's because whisper uses an average function to aggregate data.

As a test, write a script that logs a stat every second. View stats_counts.<stat_name> in graphite. Now apply the "Summarize" function to the data (Apply Function -> Transform -> Summarize). I would expect summarizing to 1min to result in a straight line at y=60. Instead I see a straight line at y=10.
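The arithmetic behind that y=10 line, assuming one event logged per second and statsd's default 10 s flush interval:

```python
flush_interval = 10
events_per_flush = flush_interval                      # 1 event/s * 10 s
points = [events_per_flush] * (60 // flush_interval)   # six stats_counts points per minute

print(sum(points))                 # 60 -- the true per-minute total
print(sum(points) / len(points))   # 10.0 -- what an averaging rollup shows
```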

Looks like you can configure Whisper with a different rollup function... Might be worth testing out...

http://readthedocs.org/docs/graphite/en/latest/whisper.html#rollup-aggregation

@recursify

And the new version of Graphite supports this... see this issue:

https://bugs.launchpad.net/graphite/+bug/853955

@recursify

Update: Here is what you need in your storage-aggregation.conf

[stats_counts]
pattern = ^stats_counts\..*
aggregationMethod = sum

Which means that everything in stats_counts will get summed when moved to larger retention periods.
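A quick sanity check that the pattern above matches the stats_counts tree and nothing else (the metric names here are made-up examples):

```python
import re

pattern = re.compile(r"^stats_counts\..*")

print(bool(pattern.match("stats_counts.mystat")))  # True
print(bool(pattern.match("stats.mystat")))         # False
```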

@dropwhile

@recursify great info! Thanks.

@mrtazz
Member

mrtazz commented Apr 13, 2012

This seems resolved. Thanks folks. If there is still something unclear, please reopen the ticket.

@mrtazz mrtazz closed this as completed Apr 13, 2012
@runa

runa commented Jul 19, 2012

Am I crazy, or should @recursify 's comment be written in huge letters in the README?

@mrtazz
Member

mrtazz commented Jul 19, 2012

An example with different aggregation methods is already in the README.

@runa

runa commented Jul 19, 2012

yep, but it doesn't include the default `pattern = ^stats_counts\..*`

@spolu

spolu commented Jul 19, 2012

This is why we built that: stats aggregation as a service. Just use it and it works without configuration / deployment / maintenance. It seemed worth commenting here since we are among the biggest fans of statsd (despite its limits, namely the maintenance it incurs) and of how it inspired and helped numbers of devs solve their day-to-day problems! Of course, sorry in advance for squatting this closed discussion!
