Add functions resample() and smooth() #92

Open
wants to merge 3 commits into from

5 participants

@danielbeardsley

Functions: add resample() for performance and sanity

  • Drastically speeds up further calculations on the returned series
  • Makes it much easier to have a consistent datapoints / pixels ratio for movingAverage() and friends.

Functions: add smooth() as a movingAverage over pixels

  • Internally does a movingAverage(resample()).
  • Provides consistent smoothing over a given number of graph pixels, instead of a number of datapoints of arbitrary time width.
  • Still provides consistent smoothing when there are fewer datapoints than pixels.
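The averaging idea behind resample() can be sketched on a plain list of values. This is a minimal standalone sketch, not the code from this PR: `resample_values` and its parameters are hypothetical names, and it assumes evenly spaced values with `None` marking gaps.

```python
def resample_values(values, step, width, points_per_px=1):
    """Average `values` (one per `step` seconds) into about
    width * points_per_px buckets. Hypothetical helper for illustration."""
    span = len(values) * step
    new_step = float(span) / (width * points_per_px)

    # Never upsample: return the input unchanged if buckets would be
    # narrower than the original step.
    if new_step < step:
        return list(values), step

    out = []
    covered = 0.0   # seconds accumulated in the current bucket
    total = 0.0     # sum of non-None values in the current bucket
    count = 0       # number of non-None values in the current bucket
    for value in values:
        if value is not None:
            total += value
            count += 1
        covered += step
        if covered >= new_step:
            out.append(total / count if count else None)
            covered -= new_step
            total, count = 0.0, 0

    # Flush a partially filled trailing bucket.
    if count:
        out.append(total / count)
    return out, new_step


# Eight points at 10s resolution drawn on a 4px-wide graph.
resampled, new_step = resample_values([0, 1, 2, 3, 4, 5, 6, 7], 10, 4)
```

With these inputs each output point is the average of two input points, so a later movingAverage() window always spans a predictable number of pixels.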
danielbeardsley added some commits
@danielbeardsley danielbeardsley Functions: Add resample() for performance and sanity
* Drastically speeds up further calculations on the returned series
* Makes it much easier to have a consistent datapoints / pixels ratio
  for movingAverage() and friends.
3c9f856
@danielbeardsley danielbeardsley Functions: add smooth() as a movingAverage over pixels
* Internally does a movingAverage(resample()).
* Provides consistent smoothing over a given number of graph pixels,
  instead of a number of datapoints of arbitrary time width.
* Still provides consistent smoothing when there are fewer datapoints
  than pixels.
f22fc25
@danielbeardsley danielbeardsley Composer: add resample() and smooth() functions 51909b8
@danielbeardsley

We (iFixit) have been using this in production for a while and it works great.
Using smooth(some.stat.rate) makes far more sense (and is much faster) than movingAverage(some.stat.rate, 40), where you have to keep adjusting the 40 as you view different time scales.
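To illustrate why a fixed point count needs constant adjusting, here is a rough calculation of how many pixels a 40-point window covers at different time ranges. The function name, the 60-second step, and the 600px width are assumptions chosen for the example, not values from this PR.

```python
def window_width_in_pixels(window_points, step_seconds, range_seconds, graph_width_px):
    """Pixels covered by a moving-average window of `window_points` datapoints.
    Hypothetical helper; assumes one datapoint every `step_seconds`."""
    return float(window_points) * step_seconds * graph_width_px / range_seconds


# Same 40-point window, 60s data, 600px graph, two different time ranges.
one_hour = window_width_in_pixels(40, 60, 3600, 600)
one_week = window_width_in_pixels(40, 60, 7 * 24 * 3600, 600)
```

Under these assumptions the window spans 400 pixels on a one-hour graph but only about 2.4 pixels on a one-week graph, which is the inconsistency smooth() is meant to remove.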

@Dieterbe

looks interesting, but I wonder if there's any overlap with consolidateBy()

@client9

FYI --
@Dieterbe Right now, changing the width of a graph (&width) has no effect on JSON output. consolidateBy therefore has no effect there either, because resampling happens in a separate code path that is used only for images.

With a large number of points, I'm able to crash a few browsers using client-side rendering, so a re-sampling would be great.

If the width handling and the current resampling code could be moved out and reused for the JSON output (i.e. no width = current behavior; with a width, resample to that many points), then consolidateBy could be reused and no new functions would be needed.

Thoughts? I might be able to find some time to do this.

nickg

@client9

See also #153

@Dieterbe

I was having the same problem. I'm using https://github.com/vimeo/timeserieswidget/ to do client-side rendering, and over a long time range there are indeed too many datapoints: performance degrades, and it can lead to browser crashes.

I think it's totally reasonable to have a width argument for json output (and other data outputs), as it gives a hint as to how many pixels/datapoints the client side renderer wants to draw.

I think all current consolidation should also work for raw outputs (see #153). Your suggestion of sampling instead of consolidating seems reasonable (for both png and raw) as that will allow even better (backend) performance, at the expense of some accuracy.
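The accuracy trade-off between sampling and consolidating can be seen in a small sketch. Both helpers below are hypothetical and assume a plain list with no `None` gaps. Note that resample() in this PR aggregates by average, so the sampling variant here only illustrates the alternative being discussed.

```python
def consolidate_average(values, factor):
    """Average each run of `factor` consecutive values (last run may be shorter)."""
    chunks = [values[i:i + factor] for i in range(0, len(values), factor)]
    return [sum(chunk) / float(len(chunk)) for chunk in chunks]


def sample_every(values, factor):
    """Keep only every `factor`-th value; cheaper, but skips the rest entirely."""
    return values[::factor]


spiky = [1, 1, 9, 1, 1, 1]
averaged = consolidate_average(spiky, 3)  # the spike still raises the first bucket
sampled = sample_every(spiky, 3)          # the spike at index 2 is never read
```

Sampling touches fewer points, which is where the backend saving comes from, but a short spike can disappear from the output altogether.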

@drawks
Collaborator

The consolidation portion of this for raw/json outputs appears to be addressed in 7adc7f4. I think that offering a sample mode vs avg could be useful though. I'll tag this for the 0.10.0 milestone. I'd like to see some tests for this before merging though.

@drawks drawks added this to the 0.10.0 milestone
@danielbeardsley

We've been using these functions in production for a long time now and they've been great and address a real issue: speeding up slow calculations with large numbers of points, and time-scale independent smoothing of a series over a given number of pixels.

Three things I would like to find the time to do:

  • Properly rewrite the series names so they come out as smooth(blah) instead of movingAverage(resample(blah))
  • Add some tests
  • Have the smooth() function respect a query param like smoothPixels=5 so smoothing can be adjusted on the whole graph at once instead of editing the argument in each data series.
Commits on Nov 3, 2012
  1. @danielbeardsley

    Functions: Add resample() for performance and sanity

    danielbeardsley authored
    * Drastically speeds up further calculations on the returned series
    * Makes it much easier to have a consistent datapoints / pixels ratio
      for movingAverage() and friends.
  2. @danielbeardsley

    Functions: add smooth() as a movingAverage over pixels

    danielbeardsley authored
    * Internally does a movingAverage(resample()).
    * Provides consistent smoothing over a given number of graph pixels,
      instead of a number of datapoints of arbitrary time width.
    * Still provides consistent smoothing when there are fewer datapoints
      than pixels.
Commits on Nov 4, 2012
  1. @danielbeardsley
6 webapp/content/js/composer_widgets.js
@@ -917,7 +917,8 @@ function createFunctionsMenu() {
{text: 'Absolute Value', handler: applyFuncToEach('absolute')},
{text: 'timeShift', handler: applyFuncToEachWithInput('timeShift', 'Shift this metric ___ back in time (examples: 10min, 7d, 2w)', {quote: true})},
{text: 'Summarize', handler: applyFuncToEachWithInput('summarize', 'Please enter a summary interval (examples: 10min, 1h, 7d)', {quote: true})},
- {text: 'Hit Count', handler: applyFuncToEachWithInput('hitcount', 'Please enter a summary interval (examples: 10min, 1h, 7d)', {quote: true})}
+ {text: 'Hit Count', handler: applyFuncToEachWithInput('hitcount', 'Please enter a summary interval (examples: 10min, 1h, 7d)', {quote: true})},
+ {text: 'Smooth', handler: applyFuncToEachWithInput('smooth', 'Please enter the number of pixels to average for each point')}
]
}, {
text: 'Calculate',
@@ -930,7 +931,8 @@ function createFunctionsMenu() {
{text: 'Holt-Winters Aberration', handler: applyFuncToEach('holtWintersAberration')},
{text: 'As Percent', handler: applyFuncToEachWithInput('asPercent', 'Please enter the value that corresponds to 100% or leave blank to use the total', {allowBlank: true})},
{text: 'Difference (of 2 series)', handler: applyFuncToAll('diffSeries')},
- {text: 'Ratio (of 2 series)', handler: applyFuncToAll('divideSeries')}
+ {text: 'Ratio (of 2 series)', handler: applyFuncToAll('divideSeries')},
+ {text: 'Resample', handler: applyFuncToEachWithInput('resample', 'Please enter the desired points-per-pixel for the new resolution')}
]
}, {
text: 'Filter',
108 webapp/graphite/render/functions.py
@@ -645,6 +645,112 @@ def consolidateBy(requestContext, seriesList, consolidationFunc):
series.name = 'consolidateBy(%s,"%s")' % (series.name, series.consolidationFunc)
return seriesList
+
+def resample(requestContext, seriesList, pointsPerPx = 1):
+  """
+  Resamples the given series according to the requested graph width and
+  $pointsPerPx aggregating by average. Total number of points after this
+  function == graph width * pointsPerPx.
+
+  This has three significant uses:
+
+  * Drastically speeds up render time when graphing high resolution data
+    or many metrics.
+  * Allows movingAverage() to have a consistent smoothness across timescales.
+    * Example: movingAverage(resample(metric,2),20) would end up with
+      a 10px moving average no matter what the scale of your graph.
+  * Allows a consistent number-of-samples to be returned from JSON requests
+    * the number of samples returned == graph width * points per pixel
+
+  Example:
+
+  .. code-block:: none
+
+    &target=resample(metric, 2)
+    &target=movingAverage(resample(metric, 2), 20)
+  """
+  newSampleCount = requestContext['width']
+
+  for seriesIndex, series in enumerate(seriesList):
+    newValues = []
+    seriesLength = (series.end - series.start)
+    newStep = (float(seriesLength) / float(newSampleCount)) / float(pointsPerPx)
+
+    # Leave this series alone if we're asked to do upsampling
+    if newStep < series.step:
+      continue
+
+    sampleWidth = 0
+    sampleCount = 0
+    sampleSum = 0
+
+    for value in series:
+      if (value is not None):
+        sampleCount += 1
+        sampleSum += value
+      sampleWidth += series.step
+
+      # If the current sample covers the width of a new step, add it to the
+      # result
+      if (sampleWidth >= newStep):
+        if sampleCount > 0:
+          newValues.append(sampleSum / sampleCount)
+        else:
+          newValues.append(None)
+        sampleWidth -= newStep
+        sampleSum = 0
+        sampleCount = 0
+
+    # Process and add the left-over sample if it's not empty
+    if sampleCount > 0:
+      newValues.append(sampleSum / sampleCount)
+
+    newName = "resample(%s, %s)" % (series.name, pointsPerPx)
+    newSeries = TimeSeries(newName, series.start, series.end, newStep, newValues)
+    newSeries.pathExpression = newName
+    seriesList[seriesIndex] = newSeries
+
+  return seriesList
+
+
+def smooth(requestContext, seriesList, windowPixelSize = 5):
+  """
+  Resample and smooth a set of metrics. Provides line smoothing that is
+  independent of time scale (windowPixelSize ~ movingAverage over pixels)
+
+  A shorter and safer way of calling:
+    movingAverage(resample(seriesList, 2), windowPixelSize * 2)
+
+  The windowPixelSize is effectively the number of pixels over which to perform
+  the movingAverage.
+
+  Note: This is safer in that if a series has fewer data points than pixels,
+  the metric won't be upsampled. Instead the movingAverage window size will be
+  adjusted to cover the same number of pixels.
+  """
+  pointsPerPixel = 2
+  resampled = resample(requestContext, seriesList, pointsPerPixel)
+
+  sampleSize = int(windowPixelSize * pointsPerPixel)
+  expectedSamples = requestContext['width'] * pointsPerPixel
+
+  for index, series in enumerate(resampled):
+    # if we have fewer samples than expected, adjust the movingAverage sample
+    # size so it covers the same number of pixels
+    if (len(series) < expectedSamples * 0.95):
+      movingAverageSize = int((float(len(series)) / (expectedSamples)) * sampleSize)
+    else:
+      movingAverageSize = sampleSize
+
+    # If we are being asked to do a movingAverage over one point or less,
+    # don't bother
+    if (movingAverageSize <= 1):
+      continue
+
+    resampled[index] = movingAverage(requestContext, [series], movingAverageSize)[0]
+
+  return resampled
+
def derivative(requestContext, seriesList):
"""
This is the opposite of the integral function. This is useful for taking a
@@ -2488,6 +2594,8 @@ def pieMinimum(requestContext, series):
'summarize' : summarize,
'smartSummarize' : smartSummarize,
'hitcount' : hitcount,
+ 'resample' : resample,
+ 'smooth' : smooth,
'absolute' : absolute,
# Calculate functions
2  webapp/graphite/render/views.py
@@ -49,6 +49,8 @@ def renderView(request):
'startTime' : requestOptions['startTime'],
'endTime' : requestOptions['endTime'],
'localOnly' : requestOptions['localOnly'],
+ 'width' : graphOptions['width'],
+ 'height' : graphOptions['height'],
'data' : []
}
data = requestContext['data']
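The window-size adjustment inside smooth() in functions.py can be checked in isolation. The sketch below copies only that arithmetic into a hypothetical standalone helper, `effective_window`. The 600px width is an assumed example value, and the defaults mirror the diff (windowPixelSize = 5, pointsPerPixel = 2).

```python
def effective_window(series_len, width, window_pixel_size=5, points_per_pixel=2):
    """Number of points smooth() would pass to movingAverage().
    Hypothetical helper mirroring the arithmetic in the diff."""
    sample_size = int(window_pixel_size * points_per_pixel)
    expected_samples = width * points_per_pixel
    # Fewer samples than expected: shrink the window so it still covers
    # the same number of pixels.
    if series_len < expected_samples * 0.95:
        return int((float(series_len) / expected_samples) * sample_size)
    return sample_size


full = effective_window(1200, 600)    # series fills the graph at 2 points/px
sparse = effective_window(300, 600)   # a quarter of the expected points
tiny = effective_window(60, 600)      # far fewer points than pixels
```

Under these assumptions the full series gets a 10-point window, the sparse one a 2-point window, and the tiny one a window of 0, which smooth() skips because it is not greater than 1.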