Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subqueries keep their orderBy clauses and the query optimizer is .. #305

Merged
merged 2 commits into from Jan 11, 2020

Conversation

joka921
Copy link
Member

@joka921 joka921 commented Jan 9, 2020

aware of cheap filters

  • Previously all OrderBy clauses that occured in subqueries were deleted. This gained speed when their were superfluous orderBys that were overwritten by ancestor's orderBy clauses
    but plain wrong if there is an orderBy clause in a subquery but not in the main query

  • Also makes the query planner aware that filters are even cheapier if their input is sorted.

Fixes #304

…e of cheap filters

- Previously all OrderBy clauses that occured in subqueries were deleted. This gained speed when their were superfluous orderBys that were overwritten by ancestor's orderBy clauses
  but plain wrong if there is an orderBy clause in a subquery but not in the main query

- Also makes the query planner aware that filters are even cheapier if their input is sorted.
@joka921
Copy link
Member Author

joka921 commented Jan 9, 2020

I have build a Docker container called qlever-ordered-subqueries on galera, so if
@hannahbast or @niklas88 could (at least temporarily) replace the running wikidata-full with this
image we could immediately verify if this

  • fixes the problem
  • enables us to speed up the simply autocompletion queries that currently take ~150ms to basically 0.

@niklas88
Copy link
Member

niklas88 commented Jan 9, 2020

My account seems to have been deactivated, I'll clear that with Hannah and Frank

@niklas88
Copy link
Member

niklas88 commented Jan 9, 2020

Do you have a test query on scientists, this would be a good candidate for the end to end tests.

@hannahbast
Copy link
Member

I started it and first got an Exception: ParseException, cause: The literal: "^"Xylo" was not terminated properly. I then replaced the \" in the regular expression by \\\" and then it worked. It seems you are not working with the master but with some more advanced branch. Anyway, the good news is:

  1. The query above now produces results in sorted order.

  2. The time needed for the filter operation is now 0ms.

Great!

@joka921
Copy link
Member Author

joka921 commented Jan 9, 2020

@hannahbast This is indeed the master branch, we currently fixed this Escaping in literals because it is actually required.
There is also the problem, that it should be FILTER regex(?x, "^something") that should match literals,
but this is something I will tackle after merging the unicode branch to not get too many conflicts

@joka921
Copy link
Member Author

joka921 commented Jan 10, 2020

@hannahbast Thanks for the feedback, Indeed you found a bug in the current master branch for which I partially take responsibility. I just added a commit that should fix this issue to this PR.

It is a symptom of a much bigger problem: Currently everything is a std::string inside QLever.
This makes it hard to reason about "Is this a Literal, an entity, A Prefix regex that was escaped on a certain level, a floating point value" etc. which has led to several bugs in the past and makes maintaining the code hard.
I will work on encoding this information into the Type system soon by making all those distinct types without implicit conversion.

@joka921
Copy link
Member Author

joka921 commented Jan 10, 2020

Ok, so now we are back to:

FILTER regex(?x, "^\"xylo) works. (previous status of the code, fixing this is a separate issue.
FILTER regex(?x, "^\\\"xylo) throws an understandable Error that the user overescaped the " which is not necessary and not supported in QLever.

isSimple = false;
} else if (escaped && !isControlChar) {
const std::string error =
"Escaping the character "s + c +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you have the std::literals::string_literals::operator""s only on the first literal?

Copy link
Member

@niklas88 niklas88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with just a small clariffication question, feel free to merge

- The previous escaping fix had the internal SparqlParser::parseLiteral function called after parsing the literal inside
  the Lexer. This is unwanted and led to strange behavior. This is just a small fix for a much bigger issue:
  All kinds of different types of Triple and Query elements are encoded als std::string, this makes it very easy to mix things up and create bugs.

- The prefix filter now for ages has retrieved one element too much.
@joka921 joka921 merged commit f8eac5b into ad-freiburg:master Jan 11, 2020
@joka921 joka921 deleted the f.FilterCostEstimates branch January 11, 2020 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ORDER BY in subquery is wrongly ignored
3 participants