-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactored the IdTable class to make interfaces simpler + more correct #320
Conversation
src/engine/Join.cpp
Outdated
// this fixes a bug that was not discovered by testing due to 0 | ||
// initialization of IdTables used for testing and should not occur in | ||
// typical use cases but it is still wrong. | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we maybe want to do this as an independent PR now? Seems simple enough
87232e3
to
2fb6a45
Compare
- Removed a lot of code and implemented const_iterators and read-only views.
2fb6a45
to
29b1579
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with just a few nitpicks and questions
src/engine/Comparators.h
Outdated
@@ -16,8 +16,8 @@ class OBComp { | |||
: _sortIndices(sortIndices) {} | |||
|
|||
template <int I> | |||
bool operator()(const typename IdTableImpl<I>::Row& a, | |||
const typename IdTableImpl<I>::Row& b) const { | |||
bool operator()(const typename IdTableStatic<I>::Row& a, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's rename the I
to WIDTH
as in Engine.h
@@ -231,7 +231,7 @@ void CountAvailablePredicates::computePatternTrick( | |||
const CompactStringVector<Id, Id>& hasPredicate, | |||
const CompactStringVector<size_t, Id>& patterns, const size_t subjectColumn, | |||
RuntimeInformation* runtimeInfo) { | |||
const IdTableStatic<I> input = dynInput.asStaticView<I>(); | |||
const IdTableView<I> input = dynInput.asStaticView<I>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
src/engine/IdTable.h
Outdated
return *this; | ||
} | ||
// This class cannot use move semantics if at least one of the two | ||
// rows invovlved in an assigment does not manage it's data, but rather |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"its data"
return *this; | ||
// Copy over the data from the other row to this row | ||
if (_cols == other._cols && _data != nullptr && other._data != nullptr) { | ||
std::memcpy(_data, other._data, sizeof(Id) * _cols); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't oter._data
be the one we previously set nullptr
in line 418? I.e. we should only do the std::memcpy
if other was a view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because line 420 returns.
I just checked again (this logic has been here for quite some time)
and all 4 cases (this and other can both be allocated or views) are handled correctly.
If both are allocated, we just redirect pointers and return (line 411).
In any other case we have to do the memcpy
either to the IdTable directly (this is a view) or to newly
allocated storage (this is allocated).
src/engine/IdTable.h
Outdated
return matches; | ||
bool operator==(const Row& other) const { | ||
bool matches = _cols == other._cols; | ||
for (size_t i = 0; matches && i < _cols; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh this is one of these cases that make you wonder if a shortcut exit at first difference is faster or slower, depending on how clever the hardware people made branch prediction and pipelining. Maybe we should use stdd:memcmp
and have the libc deal with that? Fun fact s390x of course has a CPU instruction for that :D https://github.com/torvalds/linux/blob/8f3d9f354286745c751374f5f1fcafee6b3f3136/arch/s390/lib/string.c#L343
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memcmp
is not the solution, because it immediately breaks the code as soon as Ids
are not plain ints and have a custom comparison.
return _cols == other._cols && std::equal(_data, _data + _cols, other._data);
(std::equal
should become std::memcmp
for builtin types.)
Also this fixes the bug, that in case the two sizes don't match (which should never happen in calling code), there is an out of bounds read.
src/engine/IdTable.h
Outdated
|
||
const value_type* operator->() const { return &_rowView; } | ||
|
||
// access the element the is i steps ahead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"that is"
src/engine/IdTable.h
Outdated
template <int NEW_COLS, typename = std::enable_if_t<DATA::ManagesStorage>> | ||
IdTableTemplated<NEW_COLS, DATA> moveToStatic() { | ||
return IdTableTemplated<NEW_COLS, DATA>( | ||
std::move(*this)); // let the private conversions do all the work | ||
}; | ||
|
||
/** | ||
* @brief Creates an IdTable that now owns this | ||
* id tables data. This is effectively a move operation that also | ||
* changes the type to an equibalent one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equivalent
What's the status of this? |
@joka921 @floriankramer ping for my last question |
I just fixed all the small nitpicks from your last review and fixed the conflict. |
The underlying storage is now
std::vector<Id>
instead ofId*
+malloc
+reinterpret_cast
(whichwas as I understand it technically
UB!
We now have two really separate types
IdTableStatic<COLS>
which owns its storage andIdTableView<COLS>
which is a readOnly, non-owning view of an IdTable. this makes the interface much more explicit and helped me track a small bug. (Previously both wereIdTableStatic
where some where internallymanaging
there storage and some not.Cleaned up the code of the
IdTable
module by introduing someusing
directives to get rid of a lot ofIdTableStatic<COLS>::
namespace usages which made the code somewhat hard to read.