Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Count ( DISTINCT ... ) is too slow [CORE214] #542

Closed
firebird-issue-importer opened this issue Jan 17, 2003 · 7 comments
Closed

Count ( DISTINCT ... ) is too slow [CORE214] #542

firebird-issue-importer opened this issue Jan 17, 2003 · 7 comments

Comments

@firebird-issue-importer

Submitted by: skopalik (skopalik)

SFID: 669770#⁠
Submitted By: skopalik

select distinct is about 10x to 100x faster then
select count(distinct ).
Tested version: FB 1.0.2.908 and FB 1.5.0.1761

test case:
create table a ( a int);

create procedure aset( cnt int) as
declare variable i int;
begin
i=0;
while(i<cnt)do begin
insert into a VALUES (:cnt/1000)
i=i+1;
end
end

execute procedure aset (10000)

PC: Notebokk Celeron 500/ 128 MB RAM
There are result captured from wisql:

select distinct * from a

PLAN SORT ((A NATURAL))

      A 

===========

     10 

Current memory = 0
Delta memory = 0
Max memory = 0
Elapsed time= 0.13 sec
Buffers = 75
Reads = 0
Writes 0
Fetches = 20072

select count (distinct a) from a

PLAN (A NATURAL)

  COUNT 

===========

      1 

Current memory = 0
Delta memory = 0
Max memory = 0
Elapsed time= 10.86 sec
Buffers = 75
Reads = 0
Writes 0
Fetches = 20071

Commits: 329fe9d 14ba585 FirebirdSQL/fbt-repository@3d35567 FirebirdSQL/fbt-repository@90e412b

====== Test Details ======

Multiple measures show that ratio in 2.5.5 and 3.0 is about 1.05 - 1.10 (this is how query "select count(*) from ( select distinct id from ... )" is SLOWER than "select count(distinct id) from ..."). In rare cases, when number of unique values is 100'000, this ratio was ~1.5.
Threshold was selected to be 1.8.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Jun 14, 2006

Commented by: Alice F. Bird (firebirds)

Date: 2005-10-14 18:27
Sender: fuerchau
Logged In: YES
user_id=1300926

The time of "count(distinct xxx)" depends on the number of
different values in the database.
So i have now tested with a table of 30000 records.

If each record consists only 1 value, the time is ca. 10
seconds.
If i have perhaps 20 different values, the time is ca. 6
seconds.
if i have 4000 different values the result is immediatly
avalable !

Firebird 1.5.2.4731 and tested with IBExpert.

Because my application must work with millions of records
and i need this function, i would by happy to get a solution.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Jun 14, 2006

Commented by: Alice F. Bird (firebirds)

Date: 2005-10-14 18:10
Sender: fuerchau
Logged In: YES
user_id=1300926

The time of "count(distinct xxx)" depends on the number of
different values in the database.
So i have now tested with a table of 30000 records.

If each record consists only 1 value, the time is ca. 10
seconds.
If i have perhaps 20 different values, the time is ca. 6
seconds.
if i have 4000 different values the result is immediatly
avalable !

Firebird 1.5.2.4731 and tested with IBExpert.

Because my application must work with millions of records
and i need this function, i would by happy to get a solution.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Jul 19, 2006

Modified by: @dyemanov

assignee: Dmitry Yemanov [ dimitr ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Jan 28, 2008

Modified by: @pcisar

Workflow: jira [ 10238 ] => Firebird [ 14466 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented May 31, 2015

Modified by: @dyemanov

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 3.0 Beta 2 [ 10586 ]

Fix Version: 2.5.5 [ 10670 ]

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Jun 2, 2015

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Resolved [ 5 ]

QA Status: Done successfully

Test Details: Multiple measures show that ratio in 2.5.5 and 3.0 is about 1.05 - 1.10 (this is how query "select count(*) from ( select distinct id from ... )" is SLOWER than "select count(distinct id) from ..."). In rare cases, when number of unique values is 100'000, this ratio was ~1.5.
Threshold was selected to be 1.8.

@firebird-issue-importer
Copy link
Author

@firebird-issue-importer firebird-issue-importer commented Jun 2, 2015

Modified by: @pavel-zotov

status: Resolved [ 5 ] => Closed [ 6 ]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants