Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong result on using limit after order by on null data #3015

Closed
2 tasks
Jedi18 opened this issue Feb 1, 2022 · 7 comments · Fixed by #3040
Closed
2 tasks

Wrong result on using limit after order by on null data #3015

Jedi18 opened this issue Feb 1, 2022 · 7 comments · Fixed by #3040
Assignees

Comments

@Jedi18
Copy link

Jedi18 commented Feb 1, 2022

What happens?

Several row values are missing (when compared with the same query on other SQL engines) when using limit in the below given query. I'm not a 100% sure whether this is a bug or this is the intended behaviour. Note that this happens on data with null values.

To Reproduce

        select o_orderkey, o_clerk, o_orderstatus, o_totalprice from orders
                    order by o_orderkey NULLS FIRST,
                    o_clerk NULLS FIRST, o_orderstatus NULLS FIRST,
                    o_totalprice DESC NULLS LAST limit 540

This gives the wrong result. To get the correct result, I had to change the query to the one given below

        WITH result as (
          select o_orderkey, o_clerk, o_orderstatus, o_totalprice from orders
            order by o_orderkey NULLS FIRST,
            o_clerk NULLS FIRST, o_orderstatus NULLS FIRST,
            o_totalprice DESC NULLS LAST
        )
        SELECT * from result limit 540

Environment (please complete the following information):

  • OS: Ubuntu 20.04
  • DuckDB Version: 0.3.1
  • DuckDB Client: Python

Before Submitting

  • Have you tried this on the latest master branch?
  • Python: pip install duckdb --upgrade --pre
  • R: install.packages("https://github.com/duckdb/duckdb/releases/download/master-builds/duckdb_r_src.tar.gz", repos = NULL)
  • Other Platforms: You can find binaries here or compile from source.
  • Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
@lnkuiper
Copy link
Contributor

lnkuiper commented Feb 1, 2022

Several row values are missing
Do you mean by this that you are getting an incorrect number of rows , or that some of the values in the rows that are returned are missing, i.e., have become NULL?

I have tried some stuff to reproduce, but haven't succeeded yet.

There is a difference between the two queries: the first one is optimized into a TopN operator, the second one is an Order with a Limit on top. It could be that there is an issue with the TopN operator with NULLS.

@Mytherin
Copy link
Collaborator

Mytherin commented Feb 1, 2022

@Jedi18 could you try to make a reproducible example?

@Jedi18
Copy link
Author

Jedi18 commented Feb 1, 2022

Sure, here you go https://drive.google.com/file/d/1exn7shxeXxUQWmIVpXZjSljtZK4cDuhX/view?usp=sharing . I've included the data I encountered this bug on. If you diff the two output files, you'll see they have a few different rows.

@Mytherin
Copy link
Collaborator

Mytherin commented Feb 5, 2022

Thanks for the reproducible example! I think I have tracked down the bug and fixed it in #3040.

@Jedi18
Copy link
Author

Jedi18 commented Feb 6, 2022

@Mytherin That's great! Thanks a lot for fixing it!

@Jedi18 Jedi18 closed this as completed Feb 6, 2022
@Jedi18 Jedi18 reopened this Feb 6, 2022
@Jedi18
Copy link
Author

Jedi18 commented Feb 6, 2022

Whoops sorry for closing the issue, I thought the PR was merged.

Mytherin added a commit that referenced this issue Feb 7, 2022
Fix #3015: fix edge case issue with many null values in physical Top N
@Mytherin
Copy link
Collaborator

Mytherin commented Feb 7, 2022

Should be fixed now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants