New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement IF and COALESCE #1070
Conversation
…ents and possibly some performance improvements.
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #1070 +/- ##
==========================================
+ Coverage 78.47% 78.59% +0.11%
==========================================
Files 275 277 +2
Lines 25917 26088 +171
Branches 3174 3190 +16
==========================================
+ Hits 20338 20503 +165
+ Misses 4372 4369 -3
- Partials 1207 1216 +9
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of comments and questions. It already runs on Wikidata and works like a charm :-)
PS: Having this functions will make our AC queries much simpler, which is great!
namespace detail::conditional_expressions { | ||
using namespace sparqlExpression::detail; | ||
[[maybe_unused]] auto ifImpl = [](EffectiveBooleanValueGetter::Result condition, | ||
auto&& i, auto&& e) -> IdOrString { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need auto&&
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll check.
: children_{std::move(children)} {}; | ||
ExpressionResult evaluate(EvaluationContext* context) const override { | ||
std::vector<ExpressionResult> childResults; | ||
std::ranges::for_each(children_, [&](const auto& child) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't the algorithm in the following simply as follows: evaluate one child after the other until the result is undefined, then return that as a result, UNDEF otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, some of the children return vectors of elements. So your algorithm would have to work basically on rows of (one entry per vector). And this evaluation would be inefficient.
What we can do and probably should (also for the other expressions) is to not evaluate the complete children, but for examples always chunks of N. Then we have both the efficient evaluation of the single children as well as bounded space consumption. Then we can do something like "If in the current block everything is already defined, then we can skip it entirely".
return results; | ||
} | ||
|
||
std::string getCacheKey(const VariableToColumnMap& varColMap) const override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand it correctly that the main reason why the implementation of COALESCE
is so much more involved than that of IF
is that COALESCE
has a variable number of arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. And as soon as that is the case, we have to choose a column based implementation for efficiency reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once I also implement CONCAT (will look somewhat similar concerning the boilerplate structure) I will reason on whether it is beneficial to factor out some common code for such variadic expressions.
// This rule is only used by the `RelationExpression` and `BuiltInCall` rules | ||
// which also are not supported and should already have thrown an exception. | ||
AD_FAIL(); | ||
std::vector<ExpressionPtr> Visitor::visit(Parser::ExpressionListContext* ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this not needed so far? Because we didn't have any functions yet with a variable number of arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this grammar rule is only used by a few expressions, all of which were unsupported so far (IN
, NOT IN
, COALESCE
, CONCAT
).
return createTernary(&makeIfExpression); | ||
} else if (functionName == "coalesce") { | ||
AD_CORRECTNESS_CHECK(ctx->expressionList()); | ||
return makeCoalesceExpression(visit(ctx->expressionList())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not a more generic createVariadic(&makeCoalesceExpression)
? This connects to the earlier question about why the CoalesceExpression
implementation is so (relatively) complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be, maybe I'll do it like this. But there is only one other variadic expression in this function (CONCAT), because ([NOT] IN
) is part of a different grammar rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks a lot!
Maybe consider adding BOUND to this PR.
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
This is a rough draft so far.