Minus #340
Merged
29 commits
6064447
Initial (faulty) implementation of minus.
floriankramer 66274a5
Switched to a correct hashing based implementation for minus.
floriankramer fd630ae
Added Minus to the QueryPlanner.
floriankramer 5a51e14
Added Minus to the lexer.
floriankramer 303d321
Added an e2e query.
floriankramer 8f33bd7
Fixed formatting.
floriankramer 8b11c0e
Updated group and order by in the lexer test.
floriankramer fc34aaa
Removed support for ID_NO_VALUE from Minus.
floriankramer fe25ef6
Addressed review comments fixing several bugs.
floriankramer aac286f
Some cleanup for the review.
floriankramer 149c1d6
Switched minus to curly brace initialization for the Id type.
floriankramer 532d17b
Reordered the minus header.
floriankramer 9b34c48
Merge branch 'master' into minus
floriankramer db5c96f
Merge branch 'minus' of https://github.com/floriankramer/QLever into …
joka921 387c563
Clang-format
joka921 6c6543f
Added Werror AND normal build.
joka921 4ca96a7
correct json version again
joka921 9337b9a
No fast fail on github actions.
joka921 aa3100b
correct single include json version
joka921 d6c7f36
Input to Minus is now enforced to be sorted!
joka921 c11f208
clang-format.
joka921 8f2d24e
We now have a single-header nlohmann
joka921 c8a40d5
Fixed the runtime INFO single json.
joka921 c94c622
trying g++-10 and clang-10
joka921 b790137
g++ 10 and clang 11
joka921 d296ea1
Fixed Compiler warning.
joka921 d7dacb1
Changes from self-review + included Timeout support
joka921 46fd433
Fixed compilation
joka921 642a6fb
Fixed Compilation (checkTimeout requires a non-static compute method)
joka921
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,230 @@ | ||
// Copyright 2018, University of Freiburg, | ||
// Chair of Algorithms and Data Structures. | ||
// Author: Florian Kramer (florian.kramer@netpun.uni-freiburg.de) | ||
|
||
#include "Minus.h" | ||
|
||
#include "../util/Exception.h" | ||
#include "CallFixedSize.h" | ||
|
||
using std::string; | ||
|
||
// _____________________________________________________________________________ | ||
Minus::Minus(QueryExecutionContext* qec, | ||
std::shared_ptr<QueryExecutionTree> left, | ||
std::shared_ptr<QueryExecutionTree> right, | ||
std::vector<array<size_t, 2>> matchedColumns) | ||
: Operation{qec}, | ||
_left{std::move(left)}, | ||
_right{std::move(right)}, | ||
_matchedColumns{std::move(matchedColumns)} { | ||
// Check that the invariant (inputs are sorted on the matched columns) holds. | ||
auto l = _left->resultSortedOn(); | ||
auto r = _right->resultSortedOn(); | ||
AD_CHECK(_matchedColumns.size() <= l.size()); | ||
AD_CHECK(_matchedColumns.size() <= r.size()); | ||
for (size_t i = 0; i < _matchedColumns.size(); ++i) { | ||
AD_CHECK(_matchedColumns[i][0] == l[i]); | ||
AD_CHECK(_matchedColumns[i][1] == r[i]); | ||
} | ||
} | ||
|
||
// _____________________________________________________________________________ | ||
string Minus::asString(size_t indent) const { | ||
std::ostringstream os; | ||
for (size_t i = 0; i < indent; ++i) { | ||
os << " "; | ||
} | ||
os << "MINUS\n" << _left->asString(indent) << "\n"; | ||
os << _right->asString(indent) << " "; | ||
return os.str(); | ||
} | ||
|
||
// _____________________________________________________________________________ | ||
string Minus::getDescriptor() const { return "Minus"; } | ||
|
||
// _____________________________________________________________________________ | ||
void Minus::computeResult(ResultTable* result) { | ||
AD_CHECK(result); | ||
LOG(DEBUG) << "Minus result computation..." << endl; | ||
|
||
RuntimeInformation& runtimeInfo = getRuntimeInfo(); | ||
result->_sortedBy = resultSortedOn(); | ||
result->_data.setCols(getResultWidth()); | ||
|
||
const auto leftResult = _left->getResult(); | ||
const auto rightResult = _right->getResult(); | ||
|
||
runtimeInfo.addChild(_left->getRootOperation()->getRuntimeInfo()); | ||
runtimeInfo.addChild(_right->getRootOperation()->getRuntimeInfo()); | ||
|
||
LOG(DEBUG) << "Minus subresult computation done." << std::endl; | ||
|
||
// We have the same output columns as the left input, so we also | ||
// have the same output column types. | ||
result->_resultTypes = leftResult->_resultTypes; | ||
|
||
LOG(DEBUG) << "Computing minus of results of size " << leftResult->size() | ||
<< " and " << rightResult->size() << endl; | ||
|
||
int leftWidth = leftResult->_data.cols(); | ||
int rightWidth = rightResult->_data.cols(); | ||
CALL_FIXED_SIZE_2(leftWidth, rightWidth, computeMinus, leftResult->_data, | ||
rightResult->_data, _matchedColumns, &result->_data); | ||
LOG(DEBUG) << "Minus result computation done." << endl; | ||
} | ||
|
||
// _____________________________________________________________________________ | ||
ad_utility::HashMap<string, size_t> Minus::getVariableColumns() const { | ||
return _left->getVariableColumns(); | ||
} | ||
|
||
// _____________________________________________________________________________ | ||
size_t Minus::getResultWidth() const { return _left->getResultWidth(); } | ||
|
||
// _____________________________________________________________________________ | ||
vector<size_t> Minus::resultSortedOn() const { return _left->resultSortedOn(); } | ||
|
||
// _____________________________________________________________________________ | ||
float Minus::getMultiplicity(size_t col) { | ||
// This is an upper bound on the multiplicity as an arbitrary number | ||
// of rows might be deleted in this operation. | ||
return _left->getMultiplicity(col); | ||
} | ||
|
||
// _____________________________________________________________________________ | ||
size_t Minus::getSizeEstimate() { | ||
// This is an upper bound on the size as an arbitrary number | ||
// of rows might be deleted in this operation. | ||
return _left->getSizeEstimate(); | ||
} | ||
|
||
// _____________________________________________________________________________ | ||
size_t Minus::getCostEstimate() { | ||
size_t costEstimate = _left->getSizeEstimate() + _right->getSizeEstimate(); | ||
return _left->getCostEstimate() + _right->getCostEstimate() + costEstimate; | ||
} | ||
|
||
// _____________________________________________________________________________ | ||
template <int A_WIDTH, int B_WIDTH> | ||
void Minus::computeMinus(const IdTable& dynA, const IdTable& dynB, | ||
const vector<array<Id, 2>>& joinColumns, | ||
IdTable* dynResult) const { | ||
// Subtract dynB from dynA. The result contains every solution mapping mu | ||
// in dynA for which every mapping mu' in dynB is either incompatible with | ||
// mu (some variable bound in both has different values) or has a domain | ||
// disjoint from that of mu (mu and mu' share no bound variable). | ||
|
||
// The output always has the same width as the left input | ||
constexpr int OUT_WIDTH = A_WIDTH; | ||
|
||
// check for trivial cases | ||
if (dynA.size() == 0) { | ||
return; | ||
} | ||
|
||
if (dynB.size() == 0 || joinColumns.empty()) { | ||
// B is empty or shares no columns with A, so no row of A can be removed | ||
// and the result is simply a copy of A. | ||
*dynResult = dynA; | ||
|
||
return; | ||
} | ||
|
||
IdTableView<A_WIDTH> a = dynA.asStaticView<A_WIDTH>(); | ||
IdTableView<B_WIDTH> b = dynB.asStaticView<B_WIDTH>(); | ||
IdTableStatic<OUT_WIDTH> result = dynResult->moveToStatic<OUT_WIDTH>(); | ||
|
||
std::vector<size_t> rightToLeftCols(b.cols(), | ||
std::numeric_limits<size_t>::max()); | ||
for (const auto& jc : joinColumns) { | ||
rightToLeftCols[jc[1]] = jc[0]; | ||
} | ||
|
||
/** | ||
* @brief A function to copy a row from a to the end of result. | ||
* @param ia The index of the row in a. | ||
*/ | ||
auto writeResult = [&result, &a](size_t ia) { | ||
result.template push_back(a[ia]); | ||
}; | ||
|
||
auto checkTimeout = checkTimeoutAfterNCallsFactory(); | ||
|
||
size_t ia = 0, ib = 0; | ||
while (ia < a.size() && ib < b.size()) { | ||
// Join column 0 is the primary sort column | ||
while (a(ia, joinColumns[0][0]) < b(ib, joinColumns[0][1])) { | ||
// Write a result | ||
writeResult(ia); | ||
ia++; | ||
checkTimeout(); | ||
if (ia >= a.size()) { | ||
goto finish; | ||
} | ||
} | ||
while (b(ib, joinColumns[0][1]) < a(ia, joinColumns[0][0])) { | ||
ib++; | ||
checkTimeout(); | ||
if (ib >= b.size()) { | ||
goto finish; | ||
} | ||
} | ||
|
||
while (b(ib, joinColumns[0][1]) == a(ia, joinColumns[0][0])) { | ||
// check if the rest of the join columns also match | ||
RowComparison rowEq = isRowEqSkipFirst(a, b, ia, ib, joinColumns); | ||
switch (rowEq) { | ||
case RowComparison::EQUAL: { | ||
ia++; | ||
if (ia >= a.size()) { | ||
goto finish; | ||
} | ||
} break; | ||
case RowComparison::LEFT_SMALLER: { | ||
// ib does not discard ia, and there can not be another ib that | ||
// would discard ia. | ||
writeResult(ia); | ||
ia++; | ||
if (ia >= a.size()) { | ||
goto finish; | ||
} | ||
} break; | ||
case RowComparison::RIGHT_SMALLER: { | ||
ib++; | ||
if (ib >= b.size()) { | ||
goto finish; | ||
} | ||
} break; | ||
default: | ||
AD_CHECK(false); | ||
} | ||
checkTimeout(); | ||
} | ||
} | ||
finish: | ||
result.reserve(result.size() + (a.size() - ia)); | ||
while (ia < a.size()) { | ||
writeResult(ia); | ||
ia++; | ||
} | ||
|
||
*dynResult = result.moveToDynamic(); | ||
} | ||
|
||
template <int A_WIDTH, int B_WIDTH> | ||
Minus::RowComparison Minus::isRowEqSkipFirst( | ||
const IdTableView<A_WIDTH>& a, const IdTableView<B_WIDTH>& b, size_t ia, | ||
size_t ib, const vector<array<size_t, 2>>& joinColumns) { | ||
for (size_t i = 1; i < joinColumns.size(); ++i) { | ||
Id va{a(ia, joinColumns[i][0])}; | ||
Id vb{b(ib, joinColumns[i][1])}; | ||
if (va < vb) { | ||
return RowComparison::LEFT_SMALLER; | ||
} | ||
if (va > vb) { | ||
return RowComparison::RIGHT_SMALLER; | ||
} | ||
} | ||
return RowComparison::EQUAL; | ||
} |
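
The merge loop in `computeMinus` can be hard to follow inside the diff, so here is a minimal standalone sketch of the same idea on plain `std::vector` rows. It is not the code from this pull request: `Row`, `Table`, `JoinColumns`, `compareOnJoinColumns` and `sortedMinus` are hypothetical names chosen for illustration, and the sketch assumes (as the constructor above checks) that both inputs are sorted on their join columns and contain no undefined values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Row = std::vector<std::uint64_t>;
using Table = std::vector<Row>;
// Each entry is {column in the left table, column in the right table}.
using JoinColumns = std::vector<std::array<std::size_t, 2>>;

// Three-way comparison of two rows restricted to the join columns.
// Returns -1 if `l` sorts first, 1 if `r` sorts first, 0 if they agree.
int compareOnJoinColumns(const Row& l, const Row& r, const JoinColumns& jcs) {
  for (const auto& jc : jcs) {
    if (l[jc[0]] < r[jc[1]]) return -1;
    if (l[jc[0]] > r[jc[1]]) return 1;
  }
  return 0;
}

// Hypothetical helper: keeps the rows of `a` that have no partner in `b`
// agreeing on every join column. Assumes both tables are sorted
// lexicographically on their join columns, in the order given by `jcs`.
Table sortedMinus(const Table& a, const Table& b, const JoinColumns& jcs) {
  if (b.empty() || jcs.empty()) return a;  // nothing can be removed
  Table result;
  std::size_t ia = 0, ib = 0;
  while (ia < a.size() && ib < b.size()) {
    const int cmp = compareOnJoinColumns(a[ia], b[ib], jcs);
    if (cmp < 0) {
      result.push_back(a[ia]);  // every remaining row of b is larger
      ++ia;
    } else if (cmp > 0) {
      ++ib;  // b[ib] is too small to match a[ia] or any later row of a
    } else {
      ++ia;  // a[ia] is matched and dropped; ib stays for duplicate keys
    }
  }
  // Rows of a that remain are larger than every row of b, so they survive.
  result.insert(result.end(), a.begin() + static_cast<std::ptrdiff_t>(ia),
                a.end());
  return result;
}
```

Calling `sortedMinus(a, b, {std::array<std::size_t, 2>{0, 0}})` subtracts on the first column of both tables. The sketch folds the three inner `while` loops of the diff into one three-way comparison per step; the timeout check and the fixed-width `IdTable` views are left out on purpose.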
Add some more MINUS tests.
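
One possible way to approach additional tests is to compare the merge-based result against a deliberately simple oracle. The sketch below is an assumption about how such an oracle might look, not part of this pull request or of QLever's test suite: `naiveMinus`, `Row`, `Table` and `JoinColumns` are hypothetical names, and the function works on plain vectors rather than `IdTable`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Row = std::vector<std::uint64_t>;
using Table = std::vector<Row>;
// Each entry is {column in the left table, column in the right table}.
using JoinColumns = std::vector<std::array<std::size_t, 2>>;

// Hypothetical test oracle: a quadratic MINUS that does not rely on any
// sort order. A row of `a` is removed if some row of `b` agrees with it
// on every join column.
Table naiveMinus(const Table& a, const Table& b, const JoinColumns& jcs) {
  if (jcs.empty()) return a;  // no shared columns: nothing is removed
  Table result;
  for (const Row& ra : a) {
    const bool removed =
        std::any_of(b.begin(), b.end(), [&](const Row& rb) {
          return std::all_of(jcs.begin(), jcs.end(), [&](const auto& jc) {
            return ra[jc[0]] == rb[jc[1]];
          });
        });
    if (!removed) result.push_back(ra);
  }
  return result;
}
```

A test could then build small sorted inputs, run the operation under test, and check that its output equals `naiveMinus` on the same inputs. Cases worth covering include an empty right side, no join columns, duplicate keys on the left, and join columns that sit at different positions in the two tables.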