Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow execution of COUNT in combination with DISTINCT #2890

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments
Closed

Slow execution of COUNT in combination with DISTINCT #2890

monetdb-team opened this issue Nov 30, 2020 · 0 comments

Comments

@monetdb-team
Copy link

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2011-09-26 15:56:49 +0200
From: Joris Slob <>
To: SQL devs <>
Version: 11.5.3 (Aug2011-SP1) [obsolete]
CC: @njnes

Last updated: 2011-10-26 13:22:05 +0200

Comment 16329

Date: 2011-09-26 15:56:49 +0200
From: Joris Slob <>

User-Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.186 Safari/535.1
Build Identifier:

For the following table:
CREATE TABLE "sys"."facts" (
"subject" CHAR(8192),
"predicate" CHAR(8192),
"object" CHAR(8192)
);

With 10M rows a query posed like this:
SELECT COUNT(DISTINCT(predicate)) from facts;
takes 4.5s

While a query posed like this:
SELECT COUNT(*) FROM (SELECT DISTINCT(predicate) from facts) AS b;
takes 142ms

I asked about this on the mailing list and they said I should file a bug report, because it is probably the optimisers taking a wrong turn.

======================
Additional information

PLAN SELECT COUNT(DISTINCT(predicate)) from facts;
+-------------------------------------------------------------------------------+
| rel |
+===============================================================================+
| project ( |
| | group by ( |
| | | table(sys.facts) [ facts.subject, facts.predicate, facts.%TID% NOT NULL ] |
| | ) [ ] [ sys.count unique no nil (facts.predicate) as L4 ] |
| ) [ L4 ] |
+-------------------------------------------------------------------------------+

PLAN SELECT COUNT(*) FROM (SELECT DISTINCT(predicate) from facts) AS b;
+------------------------------------------------------------------+
| rel |
+==================================================================+
| project ( |
| | group by ( |
| | | distinct project ( |
| | | | table(sys.facts) [ facts.predicate, facts.%TID% NOT NULL ] |
| | | ) [ facts.predicate as b.predicate ] |
| | ) [ ] [ sys.count() NOT NULL as L5 ] |
| ) [ L5 NOT NULL ] |
+------------------------------------------------------------------+

EXPLAIN SELECT COUNT(DISTINCT(predicate)) from facts;
+---------------------------------------------------------------------------------------------------------------------------+
| mal |
+===========================================================================================================================+
| function user.s2_1{autoCommit=true}():void; |
| X_17 := nil:wrd; |
| barrier X_61 := language.dataflow(); |
| X_2 := sql.mvc(); |
| X_44:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",0,0,2); |
| X_48 := algebra.selectNotNil(X_44); |
| X_39:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",2,0,2); |
| X_50 := algebra.kdifference(X_48,X_39); |
| X_46 := algebra.selectNotNil(X_39); |
| X_52 := algebra.kunion(X_50,X_46); |
| X_6:bat[:oid,:oid] := sql.bind_dbat(X_2,"sys","facts",1); |
| X_9 := bat.reverse(X_6); |
| X_54 := algebra.kdifference(X_52,X_9); |
| X_55 := bat.reverse(X_54); |
| X_45:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",0,1,2); |
| X_49 := algebra.selectNotNil(X_45); |
| X_43:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",2,1,2); |
| X_51 := algebra.kdifference(X_49,X_43); |
| X_47 := algebra.selectNotNil(X_43); |
| X_53 := algebra.kunion(X_51,X_47); |
| X_57 := bat.reverse(X_53); |
| X_10:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",1); |
| X_11 := algebra.selectNotNil(X_10); |
| X_58 := bat.reverse(X_11); |
| X_14 := mat.pack(X_55,X_57,X_58); |
| X_15 := algebra.kunique(X_14); |
| X_16 := bat.reverse(X_15); |
| X_17 := aggr.count(X_16); |
| exit X_61; |
| sql.exportValue(1,"sys.facts","L6","wrd",32,0,6,X_17,""); |
| end s2_1; |
+---------------------------------------------------------------------------------------------------------------------------+

EXPLAIN SELECT COUNT(*) FROM (SELECT DISTINCT(predicate) from facts) AS b;
+---------------------------------------------------------------------------------------------------------------------------+
| mal |
+===========================================================================================================================+
| function user.s3_1{autoCommit=true}():void; |
| X_17 := nil:wrd; |
| barrier X_71 := language.dataflow(); |
| X_2 := sql.mvc(); |
| X_44:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",0,0,2); |
| X_39:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",2,0,2); |
| X_46 := algebra.kdifference(X_44,X_39); |
| X_48 := algebra.kunion(X_46,X_39); |
| X_6:bat[:oid,:oid] := sql.bind_dbat(X_2,"sys","facts",1); |
| X_9 := bat.reverse(X_6); |
| X_50 := algebra.kdifference(X_48,X_9); |
| (X_51,X_52) := group.done(X_50); |
| X_57 := bat.mirror(X_51); |
| X_59 := algebra.leftjoin(X_57,X_50); |
| X_45:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",0,1,2); |
| X_43:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",2,1,2); |
| X_47 := algebra.kdifference(X_45,X_43); |
| X_49 := algebra.kunion(X_47,X_43); |
| (X_53,X_54) := group.done(X_49); |
| X_60 := bat.mirror(X_53); |
| X_62 := algebra.leftjoin(X_60,X_49); |
| X_10:bat[:oid,:str] := sql.bind(X_2,"sys","facts","predicate",1); |
| (X_55,X_56) := group.done(X_10); |
| X_63 := bat.mirror(X_55); |
| X_65 := algebra.leftjoin(X_63,X_10); |
| X_66 := mat.pack(X_59,X_62,X_65); |
| (ext30,grp28) := group.done(X_66); |
| X_15 := bat.mirror(ext30); |
| X_68 := algebra.leftjoin(X_15,X_50); |
| X_69 := algebra.leftjoin(X_15,X_49); |
| X_70 := algebra.leftjoin(X_15,X_10); |
| X_67 := mat.pack(X_68,X_69,X_70); |
| X_16 := algebra.leftjoin(X_15,X_67); |
| X_17 := aggr.count(X_16); |
| exit X_71; |
| sql.exportValue(1,"sys.b","L7","wrd",32,0,6,X_17,""); |
| end s3_1; |
+---------------------------------------------------------------------------------------------------------------------------+

Reproducible: Always

Steps to Reproduce:

  1. Start mclient with the database with the table described in the details.
  2. Execute the queries (as stated in the details).
  3. Compare the time it takes.

Actual Results:

Query 1
+------+
| L4 |
+======+
| 277 |
+------+
1 tuple (4.5s)

Query 2
+------+
| L5 |
+======+
| 277 |
+------+
1 tuple (141.970ms)

Expected Results:

I hoped that the two queries would be answered in the same time.

mclient, the MonetDB interactive terminal (Aug2011)
support for command-line editing compiled-in

MonetDB 5 server v11.5.1 "Aug2011" (32-bit, 32-bit oids)
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2011 MonetDB B.V., all rights reserved
Visit http://www.monetdb.org/ for further information
Found 1.0GiB available memory, 2 available cpu cores
Libraries:
libpcre: 7.8 2008-09-05 (compiled with 7.8)
openssl: OpenSSL 0.9.8k 25 Mar 2009 (compiled with OpenSSL 0.9.8k 25 Mar 2009)
libxml2: 2.7.6 (compiled with 2.7.6)
Compiled by: root@dev.monetdb.org (i486-pc-linux-gnu)
Compilation: gcc -O3 -fomit-frame-pointer -pipe -Wall
Linking : /usr/bin/ld -Wl,-Bsymbolic-functions

Comment 16341

Date: 2011-09-28 14:22:30 +0200
From: @njnes

checked in a fix in Aug2011, please test.

Comment 16358

Date: 2011-09-30 11:26:32 +0200
From: @sjoerdmullender

Please reopen if you still encounter issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant