Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SparqlExpressions! And a properParser! (both at least partially, but with a general, extendable framework) #405

Closed
wants to merge 72 commits into from

Conversation

joka921
Copy link
Member

@joka921 joka921 commented May 18, 2021

A first stub of a proper Parser based on ANTLR4, used to parse Expressions.

A first implementation of expressions with an extendable framework.

Currently, we only support + - * / ! || and && , and those only in BIND expressions, but this is a good start.

joka921 added 30 commits May 7, 2021 17:59
TODO:
- Actually implement a parser
- Write a readme on how to reproduce the .h and .cpp files once the Grammar changes or ANTLR is updated.
…e manually created files in separate folders.
TODO: proper AND and OR bor Binary Ranges.
Next step: Actually parse/Create these...
First draft of the Visitor rules for sparql expressions.
Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Began review in 1-1 with Johannes

}

/// Construct at runtime from a string_view
ConstexprSmallString(std::string_view input) : _size(input.size()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make explicit and have another operator== for string_view?

Comment on lines 32 to +35
#define DEBUG 4
#define INFO 3
#define WARN 2
#define ERROR 1
static constexpr size_t ERROR = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not do it for all then? Add comment

// This used to be #define, but there was a collision with a #define for ERROR in our third-party code.

Comment on lines +15 to +26
struct ParserAndVisitor {
private:
string input;
ANTLRInputStream stream{input};
SparqlAutomaticLexer lexer{&stream};
CommonTokenStream tokens{&lexer};

public:
SparqlAutomaticParser parser{&tokens};
SparqlQleverVisitor visitor;
explicit ParserAndVisitor(string toParse) : input{std::move(toParse)} {}
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need this several times?

Comment on lines +112 to +118
LOG(INFO) << context->getText() << std::endl;
LOG(INFO) << p.parser.getTokenStream()
->getTokenSource()
->getInputStream()
->toString()
<< std::endl;
LOG(INFO) << p.parser.getCurrentToken()->getStartIndex() << std::endl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOG(DEBUG) or LOG(TRACE) ?

auto result = p.visitor.visitExpression(context);
auto expr = std::move(result.as<sparqlExpression::SparqlExpression::Ptr>());

QueryExecutionContext* ctxt = nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

context


using namespace sparqlExpression;

struct DummyExpression : public SparqlExpression {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TextExpression

Comment on lines 19 to 20
std::vector<double> d{1.0, 2.0, 0.0, 0.0};
std::vector<bool> b{false, false, true, false};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d -> input1
b -> input2


// Generated from SparqlAutomatic.g4 by ANTLR 4.9.2

#include "SparqlQleverVisitor.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything in the header?

: input{std::move(toParse)}, visitor{std::move(prefixMap)} {}

template <typename ResultType, typename ContextType>
auto parseStuff(const std::string& input,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parseStuff -> parseWithGivenRule

auto actualResult = p.parseStuff<sparqlExpression::SparqlExpression::Ptr>(
input, &SparqlAutomaticParser::expression);

return ParseResultAndRemainingText{sparqlExpression::SparqlExpressionWrapper{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SparqlExpressionPimpl

Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of comments from 1-1 with Johannes

COPY . /app/

# Check formatting with the .clang-format project style
WORKDIR /app/
ENV DEBIAN_FRONTEND=noninteractive

WORKDIR /app/build/
RUN cmake -DCMAKE_BUILD_TYPE=Release -DLOGLEVEL=DEBUG -DUSE_PARALLEL=true .. && make -j $(nproc) && make test
RUN cmake -DCMAKE_BUILD_TYPE=Release -DLOGLEVEL=DEBUG -DUSE_PARALLEL=true .. && make -j $(nproc)
# RUN make test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uncomment

# - num_rows: 97
# - selected: [ "?x", "?y", "?z" ]
# - contains_row: [ "<Albert_Einstein>", "<Mileva_Marić>", "<Al_Gore>" ]
# - order_string: { "dir": "ASC", "var": "?x" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this query?

Comment on lines +20 to +22
/// Virtual base class for an arbitrary Sparql Expression which holds the
/// structure of the expression as well as the logic to evaluate this expression
/// on a given intermediate result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move comment

/// structure of the expression as well as the logic to evaluate this expression
/// on a given intermediate result

template <typename T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Like std::vector, but obtaining memory via AllocatorWithLimits and without copy constructor.

Suggestion: rename to VectorWithLimit (for consistency)

Comment on lines +29 to +30
// Copying is expensive, we never want to do this in the expression
// evaluation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Disable copy constructor and copy assignment operator (copying is too expensive in the setting where we want to use this class and not necessary).

TODO: move the two lines with = delete together

"&&">;

/// Unary Negation
inline auto unaryNegate = [](bool a) -> bool { return !a; };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no range?

return a * b;
};
inline auto divide = [](const auto& a, const auto& b) -> double {
return static_cast<double>(a) / b;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when dividing by zero? quiet_NaN or error?

decltype(detail::noop), "SUM">;

inline auto averageFinalOp = [](const auto& aggregation, size_t numElements) {
return numElements ? static_cast<double>(aggregation) /
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above (regarding division by zero)

template <>
struct hash<sparqlExpression::StrongId> {
size_t operator()(const sparqlExpression::StrongId& x) const {
return std::hash<Id>{}(x._value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use absl::hash instead of std::hash

Comment on lines +769 to +771
return std::hash<sparqlExpression::StrongId>{}(x._id) ^
std::hash<size_t>{}(static_cast<size_t>(x._type));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dito

Copy link
Member

@hannahbast hannahbast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second round of comments from 1-1 with Johannes

EvaluationInput* input) const {
const Id id = strongId._value;
// This code is borrowed from the original QLever code.
if (type == ResultTable::ResultType::VERBATIM) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch instead of if - else - else ... ?

// Every knowledge base value that is bound converts to "True"
// TODO<joka921> check for the correct semantics of the error handling and
// implement it in a further version.
return type != ResultTable::ResultType::KB || strongId._value != ID_NO_VALUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!(... && ...) would be clearer, I think

}
};

// Convert a NaryOperation that works on single elements (e.g. two doubles ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

n-ary operation

};

// Convert a NaryOperation that works on single elements (e.g. two doubles ->
// double) into an operation works on all the possible input types (single
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that

// with arbitrary n and is currently used for unary and binary operations.
template <typename RangeCalculation, typename ValueExtractor,
typename NaryOperation>
auto liftBinaryCalculationToEvaluateResults(RangeCalculation rangeCalculation,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

liftNary...


std::optional<std::string>
SparqlExpressionWrapper::isNonDistinctCountOfSingleVariable() const {
// TODO<joka921> pass this down to enable pattern trick!!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure that the basic version of the pattern trick (as we used it so far) still works

struct EvaluationInput;
class SparqlExpressionWrapper {
public:
static constexpr const char* Name = "ComplexArithmeticExpression";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed?

class SparqlExpressionWrapper {
public:
static constexpr const char* Name = "ComplexArithmeticExpression";
std::string getDescriptor() const { return "Arithmetic Bind"; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper descriptor for runtime tree

@@ -50,7 +50,7 @@ class Engine {
dynFilter.asStaticView<FILTER_WIDTH>();
IdTableStatic<IN_WIDTH> result = dynResult->moveToStatic<IN_WIDTH>();

// Intersect both lists.
// Intersection both lists.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intersect

@@ -523,7 +523,7 @@ void Join::join(const IdTable& dynA, size_t jc1, const IdTable& dynB,
doGallopInnerJoin(RightLargerTag{}, a, jc1, b, jc2, &result);
} else {
auto checkTimeoutAfterNCalls = checkTimeoutAfterNCallsFactory();
// Intersect both lists.
// Intersection both lists.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intersect

@joka921 joka921 closed this May 14, 2022
@joka921 joka921 deleted the f.escapingInSparqlParser branch May 14, 2022 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants