Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Selecting "last 10" or "last 30" minutes in Timeseries gauge gives incorrect window of data #155

Closed
JangoSteve opened this issue Aug 4, 2013 · 1 comment

Comments

@JangoSteve
Copy link

I think I figured out why going from one range to the next sometimes excludes data. I updated my FnordMetric::GaugeCalculations#fraction_values_in method with some debugging code as this:

  def fraction_values_in(range, _append=nil)
    Hash.new{ |h,k| h[k] = [0,0] }.tap do |vals|
      allticks = ticks_in(range, retention)
      puts "Ticks range: #{allticks.min} - #{allticks.max}"
      ticks_in(range, retention).each do |_tick|
        puts "retention key: #{retention_key(_tick, _append)}"
        sync_redis.hgetall(retention_key(_tick, _append)).each do |k, v|
          puts "k: #{k}, v: #{v}"
          kx = k.split("-")
          vals[kx.first.to_i][kx.last == "denominator" ? 1 : 0] += v.to_f
        end
      end
    end
  end

Now, when I click "last 10 minutes", I see this:

append: all_events
Ticks range: 1375602600 - 1375603200
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375602600-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375603200-all_events
k: 1375603680-numerator, v: 1
k: 1375603740-numerator, v: 7

Then when I click "last 15 minutes", I see:

append: all_events
Ticks range: 1375602600 - 1375603800
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375602600-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375603200-all_events
k: 1375603680-numerator, v: 1
k: 1375603740-numerator, v: 7
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375603800-all_events
k: 1375604160-numerator, v: 5

I may be wrong, but shouldn't going from last 10 minutes to last 15 minutes make the time range go back 5 more minutes, not forward?

Then I go up to "last 30 minutes", and I get:

append: all_events
Ticks range: 1375601400 - 1375603200
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375601400-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375602000-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375602600-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375603200-all_events
k: 1375603680-numerator, v: 1
k: 1375603740-numerator, v: 7

And then "last 45 minutes" and the same thing happens again:

append: all_events
Ticks range: 1375600800 - 1375603800
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375600800-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375601400-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375602000-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375602600-all_events
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375603200-all_events
k: 1375603680-numerator, v: 1
k: 1375603740-numerator, v: 7
retention key: fnordmetric-mysite-gauge-events_per_minute-60-1375603800-all_events
k: 1375604160-numerator, v: 5

In other words, it's 4:29, and...

selectinggives range
last 10 minutes4:00-4:10
last 15 minutes4:00-4:20
last 30 minutes3:40-4:10
last 45 minutes3:30-4:20

So, the problem is that "last 10 minutes" actually selects 10 minutes starting from 15 minutes ago, and "last 30 minutes" actually selects 30 minutes starting from 45 minutes ago, rather than selecting the last 10 or 30 minutes, respectively.

@JangoSteve
Copy link
Author

I finally figured out why the ranges are being selected as they are. It's from the GaugeCalculations#ticks_in method. It's doing something strange:

(((r.last-r.first)/_tick.to_f).ceil+1+overflow).times.map{ |n| tick_at(r.first + _tick*(n-1), _tick) }

The first part calculates the number of tick intervals contained in the current range, and the .ceil rounds up. Then it loops through that number of intervals from to call tick_at each interval within the range. But I think it's calling tick_at for the wrong values.

It should be calling tick_at for the range from r.first up to r.first + [number of tick intervals rounded up], but instead it's calling it from r.first - [1 tick interval] up to r.first + [one less than number of tick intervals rounded up].

This explains why it was working for the "last 15" and "last 45" minute intervals in my example above. Since tick was set to 60, then retention=600 (which is 10 minutes), meaning the selections in which retention divided unevenly would get rounded up by .ceil, causing ticks_in to loop one extra time, making up for the fact that it was starting the range loop one interval too early.

At first I thought it was intentionally starting at one interval less than the start of the range, to be sure we aren't leaving out values in the range. But this doesn't make since, because tick_at finds the closest tick at or below the value given, so there'd never be a need to go one interval lower, since tick_at will always find the floor to start with.

On the other hand, since we're starting from 0, we would need to add one to n in order to go up to the number of tick intervals in the loop (which explains why it's doing .ceil+1 before looping.

So, if we just change _tick*(n-1) to _tick*n, it solves the issue described in this ticket.

I'll see if I can create a test for this and submit a patch. Basically though, I had the intervals as such (with a tick of 60 seconds => retention of 600 seconds):

30 minutes:

actual range: 1375752990 to 1375754790
expected tick range: 1375752600 to 1375754400
calculated tick range: 1375752000 to 1375753800

45 minutes:

actual range: 1375752034 to 1375754734
expected tick range: 1375752000 to 1375754400
calculated tick range: 1375751400 to 1375754400

Making the above change fixes both of these cases, returning the expected range.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants