Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: add various statistical functions #2454

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Closed

SQL: add various statistical functions #2454

monetdb-team opened this issue Nov 30, 2020 · 0 comments

Comments

@monetdb-team
Copy link

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2008-06-08 08:32:08 +0200
From: @mlkersten
To: SQL devs <>
Version: -- development
CC: ajdamico, alexandr.kulik1988, hs.addington, @njnes, percy, yzan

Last updated: 2019-01-21 15:36:55 +0100

Comment 13624

Date: 2008-06-08 08:32:08 +0200
From: @mlkersten

MonetDB should support the median and quantile functions
over groups. Perhaps even more of the standard statistical operators found in e.g. spreadsheets.

Comment 13625

Date: 2008-06-15 12:22:51 +0200
From: @mlkersten

Logged In: YES
user_id=490798
Originator: YES

MySQL and Oracle also provide stddev() and variance()

Comment 13626

Date: 2010-05-04 09:32:09 +0200
From: Pseudo user for Sourceforge import <>

This bug was previously known as tracker item 1987928 at http://sourceforge.net/support/tracker.php?aid=1987928

Comment 14933

Date: 2010-09-23 23:03:13 +0200
From: Alexandr Kulik <<alexandr.kulik1988>>

Created attachment 36
Patch for OCT-2010

Implementation of standard deviation for sample (stdev) and population (stdevp); variance for sample(vars) and population(varp). This function also available from sql as aggregation function.

Attached file: stdev+var (text/plain, 13050 bytes)
Description: Patch for OCT-2010

Comment 14934

Date: 2010-09-24 09:40:58 +0200
From: @njnes

The patch looks nice, but it misses the very important version's for group-ed aggregation. For example, there is an implementation in group.mx (GRPvar*), which seems close.

Comment 14935

Date: 2010-09-24 09:45:05 +0200
From: Alexandr Kulik <<alexandr.kulik1988>>

(In reply to comment 4)

The patch looks nice, but it misses the very important version's for group-ed
aggregation. For example, there is an implementation in group.mx (GRPvar*),
which seems close.

So should i create something similar to GRPvar_ for stdev and variance for sample?

Comment 14936

Date: 2010-09-24 09:58:39 +0200
From: @grobian

Problem with the patch is that it changes the catalog. Doesn't that cause trouble Niels? Probably the functions should be added with a .sql file which declares them as external (=MAL) functions.

Comment 14937

Date: 2010-09-24 10:13:30 +0200
From: Alexandr Kulik <<alexandr.kulik1988>>

(In reply to comment 6)

Problem with the patch is that it changes the catalog. Doesn't that cause
trouble Niels? Probably the functions should be added with a .sql file which
declares them as external (=MAL) functions.

I try do do it from sql, but for a some reason i can't do it. SQL doesn't allow to create aggregation function.

Comment 15004

Date: 2010-10-06 23:31:38 +0200
From: Alexandr Kulik <<alexandr.kulik1988>>

Created attachment 43
Standard deviation and variance function for MonetDB2010

http://bugs.monetdb.org/attachment.cgi?id=36 was updated now it support group by.

Attached file: stdev_var_oct2010_v2 (text/plain, 108675 bytes)
Description: Standard deviation and variance function for MonetDB2010

Comment 15005

Date: 2010-10-06 23:33:15 +0200
From: Alexandr Kulik <<alexandr.kulik1988>>

I updated functions Standard deviation and variance, so now they support group by, could you please review it. http://bugs.monetdb.org/attachment.cgi?id=43

(In reply to comment 4)

The patch looks nice, but it misses the very important version's for group-ed
aggregation. For example, there is an implementation in group.mx (GRPvar*),
which seems close.

Comment 15125

Date: 2010-10-27 16:33:55 +0200
From: Alexandr Kulik <<alexandr.kulik1988>>

Any comments on latest patch?

Comment 15429

Date: 2011-01-25 15:24:22 +0100
From: @grobian

Niels, ping?

Comment 15505

Date: 2011-03-03 01:03:05 +0100
From: hsaddington <<hs.addington>>

Fabian:

We have been using the statistical functions that Alex wrote for us and they are working well for us. When the functions were used with "group by", the Oct2010 Release instantly crashed. But, the functions work just fine in Oct2010-SP1 release. Is it possible to have these functions included in the next release? Thanks.

Comment 15506

Date: 2011-03-03 08:52:29 +0100
From: @grobian

Niels has to review it. It won't make it for the Mar2011 release, perhaps the Apr2011 release.

(Don't CC me, I'm on the alias)

Comment 15678

Date: 2011-03-28 17:36:29 +0200
From: @sjoerdmullender

The Mar2011 version has been released.

Comment 15712

Date: 2011-04-01 10:03:17 +0200
From: Alexandr Kulik <<alexandr.kulik1988>>

Did you already commit this patch into repo? Because i try to find functions by name and can't find them.

Comment 16033

Date: 2011-07-29 11:00:02 +0200
From: @sjoerdmullender

Apr2011-SP2 has been released.

Comment 16271

Date: 2011-09-16 15:10:37 +0200
From: @sjoerdmullender

The Aug2011 version has been released.

Comment 17246

Date: 2012-05-09 17:38:34 +0200
From: @njnes

stddev and median have been added.

Comment 17877

Date: 2012-11-05 12:32:22 +0100
From: Anthony Damico <>

Hi Niels,

The stddev() function does not appear to be available in the Windows version of MonetDB?

When I run these commands, I get the following error when I hit stddev..

sql>select * from tables limit 3;
...
3 tuples (21.684ms)
sql>select sum(id) from tables;
...
1 tuple (22.006ms)
sql>select avg(id) from tables;
...
1 tuple (76.176ms)
sql>select median(id) from tables;
...
1 tuple (7.618ms)
sql>select stddev(id) from tables;
SELECT: no such unary operator 'stddev(int)'

Here's a copy of my server window, which does have a release date after comment 18 below (when this issue was marked as resolved) --

MonetDB 5 server v11.11.11 "Jul2012-SP2"
Serving database 'demo', using 2 threads
Compiled for x86_64-pc-winnt/64bit with 64bit OIDs dynamically linked
Found 3.905 GiB available main-memory.
Copyright (c) 1993-July 2008 CWI.
Copyright (c) August 2008-2012 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
MonetDB/SQL module loaded

On this topic, is it possible to also add some sort of quantile() function so that users can access something like the 25th, 75th, 90th percentiles quickly? This seems like just a small variation on the median() function (which does work), since quantile( colname , 50 ) would be the same as median( colname )?

My apologies if I'm overlooking something obvious, I am pretty new to MonetDB.

Thanks for the amazing software!!

Anthony Damico

Comment 17920

Date: 2012-11-08 12:42:22 +0100
From: Percy Wegmann <>

It appears that the problem is in sql/scripts/39_analytics.sql. It's missing bindings for stdev and stdevp.

For stdev, at least, it seems that one can mimic median with something like the following:

create aggregate stdev(val TINYINT) returns TINYINT
external name "aggr"."stddev";
create aggregate stdev(val SMALLINT) returns SMALLINT
external name "aggr"."stddev";
create aggregate stdev(val INTEGER) returns INTEGER
external name "aggr"."stddev";
create aggregate stdev(val BIGINT) returns BIGINT
external name "aggr"."stddev";
create aggregate stdev(val REAL) returns REAL
external name "aggr"."stddev";
create aggregate stdev(val DOUBLE) returns DOUBLE
external name "aggr"."stddev";
create aggregate stdev(val DATE) returns DATE
external name "aggr"."stddev";
create aggregate stdev(val TIME) returns TIME
external name "aggr"."stddev";
create aggregate stdev(val TIMESTAMP) returns TIMESTAMP
external name "aggr"."stddev";

This can of course be run through the mclient, so anyone needing to patch their environment prior to this bug being fixed might be able to just run these manually.

Comment 20760

Date: 2015-04-04 11:09:49 +0200
From: yzan

I hate to reopen this bug but statistical functions stopped to work in MonetDB-11.19.9-x86_64-Darwin-9-bin.tar.bz2.

For example following queries:
CREATE TABLE t1 ( col1 INT, col2 DOUBLE );
INSERT INTO t1 VALUES ( 1, 3.0 ) , ( 2, 4.0 );
SELECT stddev_samp( col1 ) FROM t1;
SELECT stddev_samp( col2 ) FROM t1;
SELECT stddev_pop( col1 ) FROM t1;
SELECT stddev_pop( col2 ) FROM t1;

Return:
Error: SELECT: no such unary operator 'stddev_samp(int)'
SQLState: 22000
ErrorCode: 0
Respectively:
Error: SELECT: no such unary operator 'stddev_samp(double)'
SQLState: 22000
ErrorCode: 0

Similar error messages appear with other statistical messages mentioned at:
https://www.monetdb.org/Documentation/SQLreference/StatisticFunctions

The queries were tested on:
Version: MonetDB Database Server Toolkit v1.1 (Oct2014-SP2)
OS: OS X 10.9.5
The database was installed from archive: MonetDB-11.19.9-x86_64-Darwin-9-bin.tar.bz2

Interestingly, the queries worked with MonetDB-11.19.7-x86_64-Darwin-9-bin.tar.bz2.

Comment 20761

Date: 2015-04-04 11:22:31 +0200
From: Anthony Damico <>

this re-introduced bug is platform-specific. that code works fine for me on

MonetDB 5 server v11.19.9 "Oct2014-SP2"
Serving database 'demo', using 8 threads
Compiled for x86_64-pc-winnt/64bit with 64bit OIDs dynamically linked
Found 15.914 GiB available main-memory.
Copyright (c) 1993-July 2008 CWI.
Copyright (c) August 2008-2015 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
MonetDB/SQL module loaded

Comment 26820

Date: 2019-01-21 15:36:55 +0100
From: @sjoerdmullender

Closing since as far as I am aware, the statistical functions work on all supported platforms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant