Sprint3 #3
…ribute vectors value range
- Adding some comments
- Giving longer names
- Starting the names of protected members with an underscore
- Removing the unused / unimplemented `TableScan::check_scan_condition`
namespace opossum {

AbstractOperator::AbstractOperator(const std::shared_ptr<const AbstractOperator> left,
                                   const std::shared_ptr<const AbstractOperator> right)
    : _left_input(left), _right_input(right) {}
very nice, maybe we should have something similar
void AbstractOperator::execute() {
  Assert(!_was_executed, "Operators shall not be executed twice.");
This should probably be a debug assert, as users must never see this error and correct code should not trigger this.
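A minimal sketch of that suggestion, reusing the `DebugAssert` macro that already appears in this diff (assuming it compiles to a no-op outside debug builds):

```cpp
void AbstractOperator::execute() {
  // Only checked in debug builds; correct code never triggers it, and release users never see it.
  DebugAssert(!_was_executed, "Operators shall not be executed twice.");
  // ... rest of execute() unchanged ...
}
```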
const auto value_segment = std::dynamic_pointer_cast<ValueSegment<Type>>(segment);
const auto dictionary_segment = std::dynamic_pointer_cast<DictionarySegment<Type>>(segment);
const auto reference_segment = std::dynamic_pointer_cast<ReferenceSegment>(segment);

DebugAssert(value_segment || dictionary_segment || reference_segment,
            "Segment has to be ValueSegment, DictionarySegment or ReferenceSegment.");

if (value_segment) {
  _scan_value_segment(chunk_id, *value_segment, *pos_list);
} else if (dictionary_segment) {
  _scan_dictionary_segment(chunk_id, *dictionary_segment, *pos_list);
} else if (reference_segment) {
  _scan_reference_segment(*reference_segment, *pos_list);
  referenced_table = reference_segment->referenced_table();
  ++reference_segment_count;
}
weird structure
How so? How do you think it can be improved?
Sorry, this comment seems to have slipped through. We meant to remove it, as we could not think of better ways to structure it. The initial comment was made by me due to the placement of the assert between the casts and the accesses. It could also be moved to another else branch with a `Fail()`, but we decided to omit the comment.
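For illustration, the variant mentioned above (moving the check into a final else branch with `Fail()`) would look roughly like this, based on the hunk shown earlier:

```cpp
if (value_segment) {
  _scan_value_segment(chunk_id, *value_segment, *pos_list);
} else if (dictionary_segment) {
  _scan_dictionary_segment(chunk_id, *dictionary_segment, *pos_list);
} else if (reference_segment) {
  _scan_reference_segment(*reference_segment, *pos_list);
  referenced_table = reference_segment->referenced_table();
  ++reference_segment_count;
} else {
  // Replaces the DebugAssert that currently sits between the casts and the accesses.
  Fail("Segment has to be ValueSegment, DictionarySegment or ReferenceSegment.");
}
```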
});
}

Assert(reference_segment_count == 0 || (chunk_count == 1 && reference_segment_count == 1),
Consider changing this to a debug assert.
ValueID DictionarySegment<T>::null_value_offset() const {
  return _is_nullable ? ValueID{1} : ValueID{0};
  // The maximum value representable within the attribute vector's width.
  return ValueID{(1u << attribute_vector()->width() * 8) - 1};
Consider using the length of the dictionary and refactoring your table scan. This works because this is cast to `INVALID_VALUE_ID`, which is not matched by your table scan predicate in any case. This solution relies a lot on implicit behavior. Also: when using 32 bit, you return `INVALID_VALUE_ID` in both branches.
I think using this value as the value id for NULL is fine; I would go as far as saying that I prefer it over all other approaches. As for the behavior in the table scan: we at some point just made the decision to live with this behavior, because the task didn't require any particular handling yet. I think a proper solution would involve checking for NULL explicitly in the scan, regardless of which particular value id is used for it. I am not sure I understand what you mean by implicit behavior; we did not intend to hide any math around this special value id anywhere (note that, in case you meant that place, the comment in `_scan_dictionary_segment` refers to the documented interface of `DictionarySegment::{lower,upper}_bound`). Did we use this anywhere without noticing?
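To make the "check for NULL explicitly" idea concrete, here is a small standalone sketch that does not use the project's segment interfaces (the names are illustrative only): one value id is reserved for NULL, and the scan skips those rows regardless of which id was chosen.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using ValueID = std::uint32_t;

// Returns the offsets whose value id equals search_value_id, skipping NULL rows explicitly
// instead of relying on the reserved id never matching the predicate.
std::vector<std::size_t> scan_equals(const std::vector<ValueID>& attribute_vector,
                                     const ValueID search_value_id, const ValueID null_value_id) {
  auto matching_offsets = std::vector<std::size_t>{};
  for (auto offset = std::size_t{0}; offset < attribute_vector.size(); ++offset) {
    const auto value_id = attribute_vector[offset];
    if (value_id == null_value_id) continue;  // NULL never satisfies any comparison
    if (value_id == search_value_id) matching_offsets.push_back(offset);
  }
  return matching_offsets;
}
```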
  }
}

void TableScan::_scan_reference_segment(const ReferenceSegment& segment, PosList& pos_list) {
Same as above, no check for NULL of search value.
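One possible way to address this, sketched against the `_scan_reference_segment` shown in these hunks; `variant_is_null` and `_search_value` are assumptions about the code base and may be named differently:

```cpp
void TableScan::_scan_reference_segment(const ReferenceSegment& segment, PosList& pos_list) {
  // A NULL search value matches nothing, so bail out before scanning instead of relying on
  // the NULL encoding never satisfying the predicate.
  if (variant_is_null(_search_value)) return;  // assumed helper and member name
  // ... existing scan logic ...
}
```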
void TableScan::_scan_reference_segment(const ReferenceSegment& segment, PosList& pos_list) {
  const auto predicate = predicate_for_scantype<AllTypeVariant>(_scan_type);
  for (const auto row_id : *segment.pos_list()) {
This reads a bit strange. Consider moving this reference into its own variable.
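One way to read that suggestion, naming the dereferenced pos list before the loop (the loop body is not part of the hunk and is elided):

```cpp
const auto predicate = predicate_for_scantype<AllTypeVariant>(_scan_type);
const auto& referenced_pos_list = *segment.pos_list();  // give the dereferenced pos list its own name
for (const auto row_id : referenced_pos_list) {
  // ... loop body unchanged ...
}
```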
    return NULL_VALUE;
  }
  const auto chunk = _referenced_table->get_chunk(row_id.chunk_id);
  return (*chunk->get_segment(_referenced_column_id))[row_id.chunk_offset];
Please do not use `operator[]()` (it is meant for debugging only). Also consider using your accessor methods here.
What would you suggest instead? We are in a function that is supposed to return `AllTypeVariant`, so we used the accessor that does just that, delegating the conversion from `T` to `AllTypeVariant` to whoever has the `T`.
}

size_t ReferenceSegment::estimate_memory_usage() const {
  // Implementation goes here
  Fail("Implementation is missing.");
  return size() * sizeof(RowID);
`PosList` is shared between `ReferenceSegment`s, beware of counting its memory twice. You also do not count `this` into the size.
We are counting the entries of the one `PosList` just once, or am I not seeing this correctly? As for adding the size of `this`: we just stuck to what the tests for `ValueSegment` demanded back in sprint 1, which was just to account for the contained tuples. Either way seems reasonable to me.
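For reference, a sketch of a variant that also counts the object itself; it deliberately keeps attributing the (possibly shared) pos list entries to every referencing segment, which is the trade-off discussed above:

```cpp
size_t ReferenceSegment::estimate_memory_usage() const {
  // sizeof(*this) covers the segment object; the pos list entries are still attributed to this
  // segment, so summing the estimates of several ReferenceSegments sharing one PosList counts
  // those entries more than once.
  return sizeof(*this) + size() * sizeof(RowID);
}
```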
TEST_F(OperatorsTableScanTest, ScanType) {
  auto scan = std::make_shared<TableScan>(_table_wrapper, ColumnID{0}, ScanType{-1}, 90000);
  EXPECT_THROW(scan->execute(), std::logic_error);
Nice test.
Merging to get this out of my PR list :^)
Note to reviewers: In addition to the task, we also changed the `DictionarySegment` to use the highest possible index as the ValueID instead of 0.
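For concreteness: with the formula from the hunk above, an attribute vector that is 1 byte wide reserves value id (1 << 8) - 1 = 255 for NULL, and a 2-byte vector reserves (1 << 16) - 1 = 65535, i.e. the highest index representable within the vector's width.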