Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactored the IdTable class to make interfaces simpler + more correct #320

Merged
merged 4 commits into from
Nov 29, 2020

Conversation

joka921
Copy link
Member

@joka921 joka921 commented Mar 10, 2020

  • The underlying storage is now std::vector<Id> instead of Id* + malloc + reinterpret_cast (which
    was as I understand it technically UB!

  • We now have two really separate types IdTableStatic<COLS> which owns its storage and
    IdTableView<COLS> which is a readOnly, non-owning view of an IdTable. this makes the interface much more explicit and helped me track a small bug. (Previously both were IdTableStatic where some where internally managing there storage and some not.

  • Cleaned up the code of the IdTable module by introduing some using directives to get rid of a lot of IdTableStatic<COLS>:: namespace usages which made the code somewhat hard to read.

// this fixes a bug that was not discovered by testing due to 0
// initialization of IdTables used for testing and should not occur in
// typical use cases but it is still wrong.
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we maybe want to do this as an independent PR now? Seems simple enough

- Removed a lot of code and implemented const_iterators
  and read-only views.
Copy link
Member

@niklas88 niklas88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with just a few nitpicks and questions

@@ -16,8 +16,8 @@ class OBComp {
: _sortIndices(sortIndices) {}

template <int I>
bool operator()(const typename IdTableImpl<I>::Row& a,
const typename IdTableImpl<I>::Row& b) const {
bool operator()(const typename IdTableStatic<I>::Row& a,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename the I to WIDTH as in Engine.h

@@ -231,7 +231,7 @@ void CountAvailablePredicates::computePatternTrick(
const CompactStringVector<Id, Id>& hasPredicate,
const CompactStringVector<size_t, Id>& patterns, const size_t subjectColumn,
RuntimeInformation* runtimeInfo) {
const IdTableStatic<I> input = dynInput.asStaticView<I>();
const IdTableView<I> input = dynInput.asStaticView<I>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

return *this;
}
// This class cannot use move semantics if at least one of the two
// rows invovlved in an assigment does not manage it's data, but rather
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"its data"

return *this;
// Copy over the data from the other row to this row
if (_cols == other._cols && _data != nullptr && other._data != nullptr) {
std::memcpy(_data, other._data, sizeof(Id) * _cols);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't oter._data be the one we previously set nullptr in line 418? I.e. we should only do the std::memcpy if other was a view.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because line 420 returns.
I just checked again (this logic has been here for quite some time)
and all 4 cases (this and other can both be allocated or views) are handled correctly.

If both are allocated, we just redirect pointers and return (line 411).
In any other case we have to do the memcpy either to the IdTable directly (this is a view) or to newly
allocated storage (this is allocated).

return matches;
bool operator==(const Row& other) const {
bool matches = _cols == other._cols;
for (size_t i = 0; matches && i < _cols; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh this is one of these cases that make you wonder if a shortcut exit at first difference is faster or slower, depending on how clever the hardware people made branch prediction and pipelining. Maybe we should use stdd:memcmp and have the libc deal with that? Fun fact s390x of course has a CPU instruction for that :D https://github.com/torvalds/linux/blob/8f3d9f354286745c751374f5f1fcafee6b3f3136/arch/s390/lib/string.c#L343

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memcmp is not the solution, because it immediately breaks the code as soon as Ids are not plain ints and have a custom comparison.

return _cols == other._cols && std::equal(_data, _data + _cols, other._data);

(std::equal should become std::memcmp for builtin types.)
Also this fixes the bug, that in case the two sizes don't match (which should never happen in calling code), there is an out of bounds read.


const value_type* operator->() const { return &_rowView; }

// access the element the is i steps ahead
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"that is"

template <int NEW_COLS, typename = std::enable_if_t<DATA::ManagesStorage>>
IdTableTemplated<NEW_COLS, DATA> moveToStatic() {
return IdTableTemplated<NEW_COLS, DATA>(
std::move(*this)); // let the private conversions do all the work
};

/**
* @brief Creates an IdTable that now owns this
* id tables data. This is effectively a move operation that also
* changes the type to an equibalent one.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

equivalent

@niklas88
Copy link
Member

What's the status of this?

@niklas88
Copy link
Member

@joka921 @floriankramer ping for my last question

@joka921
Copy link
Member Author

joka921 commented Nov 29, 2020

I just fixed all the small nitpicks from your last review and fixed the conflict.
So I think this is now ready to roll.

@niklas88 niklas88 merged commit 93db17f into ad-freiburg:master Nov 29, 2020
@joka921 joka921 deleted the f.SerializeResults branch May 8, 2021 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants