Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
tag: v1.2.0
Fetching contributors…

Octocat-spinner-32-eaf2f5

Cannot retrieve contributors at this time

file 94 lines (66 sloc) 2.838 kb

Trimmed aggregates

This PostgreSQL extension provides several aggregate functions that trim the input data set before applying the function, i.e. remove lowest/highest values. The number of values to be removed is configured using the parameters.

WARNING: Those aggregates require the whole set, as they need to collect and sort the whole data set ((to trim low/high values). This may be a time consuming process and require a lot of memory. Keep this in mind when using those functions.

Available aggregates

The extension implements aggregates that resemble those described here: http://www.postgresql.org/docs/9.1/static/functions-aggregate.html, i.e. AVG, VARIANCE, VAR_POP, VAR_SAMP, STDDEV, STDDEV_POP and STDDEV_SAMP

  • AVG

    avg_trimmed(value, low_cut, high_cut)
    
  • VARIANCE

    var_trimmed(value, low_cut, high_cut);
    var_pop_trimmed(value, low_cut, high_cut)
    var_samp_trimmed(value, low_cut, high_cut)
    
  • STDDEV (standard deviation)

    stddev_trimmed(value, low_cut, high_cut)
    stddev_pop_trimmed(value, low_cut, high_cut)
    stddev_samp_trimmed(value, low_cut, high_cut)
    
  • combined aggregate (computes all seven values at once)

    trimmed(value, low_cut, high_cut)
    

All those functions are overloaded for numeric, double precision, int32 and int64 data types.

Using the aggregates

All the aggregates are used the same way so let's see how to use the avg_trimmed aggregate. For example this

SELECT avg_trimmed(i, 0.1, 0.1) FROM generate_series(1,1000) s(i);

means 10% of the values will be removed on both ends, and the average will be computed using the middle 80%. On the other hand this

SELECT avg_trimmed(i, 0.2, 0.1) FROM generate_series(1,1000) s(i);

means 20% of the lowest and 10% of the highest values will be removed, so the average will be computed using the remaining 70% of values.

The combined aggregate computes and returns all values at once as an array. The values are stored in this order

  • average
  • var_pop
  • var_samp
  • variance
  • stddev_pop
  • stddev_samp
  • stddev

If you need more of the values at once this may be much more efficient as it shares the memory and can compute the values with only two passes through the data (to compute exact variance and stddev).

Installation

Installing this is very simple - if you're on 9.1 you can install it like any other extension, i.e.

$ make install
$ psql dbname -c "CREATE EXTENSION trimmed_averages"

and if you're on an older version, you have to run the SQL script manually

$ psql dbname < trimmed_averages--1.0.sql

That's all.

License

This software is distributed under the terms of BSD 2-clause license. See LICENSE or http://www.opensource.org/licenses/bsd-license.php for more details.

Something went wrong with that request. Please try again.