Better result tables #193

Merged
merged 12 commits into from Mar 9, 2019

floriankramer
Member

This PR replaces std::vector<std::array> and std::vector<std::vector> as the data type used by result tables with a new IdTable type. This ensures that the result data is always stored densely and makes templating the operations easier.

Member

@niklas88 niklas88 left a comment

Wow, that's what I call a giant Pull Request. Reviewing and testing this will take some time, but here is what I noticed on a quick first pass. Overall this already looks very nice.

src/engine/CallFixedSize.h (outdated, resolved)
src/engine/CallFixedSize.h (outdated, resolved)
src/engine/CallFixedSize.h (outdated, resolved)
src/engine/ResultTable.h (resolved)
class OBComp {
public:
OBComp(const vector<pair<size_t, bool>>& sortIndices)
: _sortIndices(sortIndices) {}

bool operator()(const E& a, const E& b) const {
template <int I>
bool operator()(const typename IdTableImpl<I>::Row& a,
Member

I think this should take the Row type from IdTableStatic instead of from the …Impl

Member Author

That's what I tried initially, but because the Row type in IdTableStatic is declared via a using alias, the compiler had trouble inferring the value of I (which must be found through template argument deduction). Templating the entire comparator, on the other hand, means that I has to be determined inside the OrderBy operation, which would require a set of 5 if clauses to select the value of I for the comparator (as the CALL_WITH_FIXED... macros don't support this case).
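To make the "set of 5 if clauses" concrete, here is a minimal sketch of dispatching a runtime column count to a compile-time width. The names `WidthTag`, `dispatchOnWidth`, and `chosenWidth` are hypothetical and are not part of QLever or of the CALL_WITH_FIXED... macros; the limit of 5 fixed widths and the fallback to 0 for variable width are taken from the discussion above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tag type that carries a column count as a compile-time value.
template <int COLS>
struct WidthTag {
  static constexpr int value = COLS;
};

// Calls f with the WidthTag that matches the runtime width.
// Widths 1..5 get their own instantiation, everything else falls back to
// WidthTag<0>, which stands for "variable width" in this sketch.
template <typename F>
auto dispatchOnWidth(std::size_t cols, F&& f) {
  if (cols == 1) return f(WidthTag<1>{});
  if (cols == 2) return f(WidthTag<2>{});
  if (cols == 3) return f(WidthTag<3>{});
  if (cols == 4) return f(WidthTag<4>{});
  if (cols == 5) return f(WidthTag<5>{});
  return f(WidthTag<0>{});
}

// Example use: report which instantiation was selected.
inline int chosenWidth(std::size_t cols) {
  return dispatchOnWidth(cols,
                         [](auto tag) { return decltype(tag)::value; });
}
```

Inside the lambda, `decltype(tag)::value` is a constant expression, so it could be used as the template argument of a comparator that is templated as a whole.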

Member

Interesting that `using` changes the behavior like that. Do you think that is intended/correct, or rather a compiler problem?

Member Author

I'm actually not certain whether this should work, but I think it might fall under the 'Aliases are never deduced' rule (cppreference).

src/engine/IdTable.h (resolved)
src/engine/MultiColumnJoin.cpp (outdated, resolved)
src/engine/OptionalJoin.cpp (outdated, resolved)
src/engine/ResultTable.h (outdated, resolved)
src/engine/Sort.cpp (outdated, resolved)
src/engine/IdTable.h (outdated, resolved)
if (IdTableImpl<COLS>::_size + 1 >= IdTableImpl<COLS>::_capacity) {
grow();
}
std::memcpy(IdTableImpl<COLS>::_data +
Member

I think a normal assignment might be preferable to forcing a memcpy here?

Member Author

Using memcpy avoids the less efficient row-based access here (which would construct an object). As this is part of the internals of the IdTable, I went for efficiency over simplicity here.
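For readers without the diff at hand, this is roughly what appending a row with a single memcpy into a dense buffer looks like. It is a simplified sketch: `DenseTable` is a hypothetical class, `Id` is assumed to be a trivially copyable 64-bit integer, and the growth policy is made up. Only the `_size + 1 >= _capacity` check followed by `grow()` and `std::memcpy` mirrors the hunk above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <utility>

using Id = std::uint64_t;  // assumption: Id is a trivially copyable integer

// Hypothetical dense table: all rows live in one contiguous, row-major buffer.
class DenseTable {
 public:
  explicit DenseTable(std::size_t cols) : _cols(cols) {}

  // Appends one row by copying _cols Ids from `row` with a single memcpy,
  // without constructing an intermediate row object.
  void push_back(const Id* row) {
    if (_size + 1 >= _capacity) {
      grow();
    }
    std::memcpy(_data.get() + _size * _cols, row, _cols * sizeof(Id));
    ++_size;
  }

  Id operator()(std::size_t r, std::size_t c) const {
    return _data[r * _cols + c];
  }
  std::size_t size() const { return _size; }

 private:
  // Doubles the capacity (starting at 4 rows) and moves the old contents.
  void grow() {
    std::size_t newCapacity = _capacity == 0 ? 4 : _capacity * 2;
    std::unique_ptr<Id[]> larger(new Id[newCapacity * _cols]);
    if (_size > 0) {
      std::memcpy(larger.get(), _data.get(), _size * _cols * sizeof(Id));
    }
    _data = std::move(larger);
    _capacity = newCapacity;
  }

  std::unique_ptr<Id[]> _data;
  std::size_t _size = 0;
  std::size_t _capacity = 0;
  std::size_t _cols;
};
```

A plain element-wise assignment loop would give the same result; for a trivially copyable `Id` the difference is mostly about avoiding the row proxy, as described above.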

src/engine/IdTable.h (outdated, resolved)
src/engine/IdTable.h (outdated, resolved)

ghost commented Mar 1, 2019

DeepCode encountered a problem when analyzing this pull request. If you want to retry, create a comment: "Retry Deepcode".

/**
* This is simply an Id* which has some additional methods.
**/
class ConstRow {
Member

Shouldn't this have an Id* field, so that as a struct it does indeed have the same size as an Id*? Also, I think it should be declared final; then we can also drop the virtual destructor.

Member Author

I thought about this and then realized that the custom Row type for tables with a fixed width doesn't make any sense. The latest commit changes it to std::array<Id, COLS>, which should make operator[] behave in a significantly more standard-conforming way for IdTableStatic with fixed width.

Member

niklas88 commented Mar 1, 2019

@floriankramer somehow your commits look like GitHub sees two identities for you. You might want to check whether you have all your mail addresses in your GitHub profile.

Member

@niklas88 niklas88 left a comment

A few nitpicks/questions; other than that, I think this is almost ready to merge. Great work!

iterator(Id* data, size_t row, size_t cols)
: _data(data), _row(row), _rowView(data + (cols * row), cols) {}
iterator() : _data(nullptr), _row(0) {}
iterator(Id* data, size_t row, size_t) : _data(data), _row(row) {}
Member

What's with the last unnamed parameter here?

Member Author

The iterator has to take the number of columns as a parameter to stay compatible between the fixed-width and variable-width IdTable. To prevent the compiler from complaining about an unused parameter, we can either leave the parameter unnamed or cast it to void in the constructor's body (e.g. (void)cols;).
I've added a comment explaining the unnamed parameter.

Member

You could also use:

iterator(Id* data, size_t row, size_t  /* cols */) : _data(data), _row(row) {}
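The options mentioned in this thread, side by side. The three struct names are hypothetical and exist only to show each variant in isolation; `[[maybe_unused]]` is a standard C++17 attribute that was not brought up in the thread and is included as a further alternative.

```cpp
#include <cassert>
#include <cstddef>

using Id = std::size_t;  // stand-in for the real Id type

// Option 1: leave the parameter unnamed and keep the name as a comment.
struct IterUnnamed {
  IterUnnamed(Id* data, std::size_t row, std::size_t /* cols */)
      : _data(data), _row(row) {}
  Id* _data;
  std::size_t _row;
};

// Option 2: name the parameter and cast it to void in the body.
struct IterVoidCast {
  IterVoidCast(Id* data, std::size_t row, std::size_t cols)
      : _data(data), _row(row) {
    (void)cols;  // explicitly unused
  }
  Id* _data;
  std::size_t _row;
};

// Option 3 (C++17): mark the parameter with [[maybe_unused]].
struct IterMaybeUnused {
  IterMaybeUnused(Id* data, std::size_t row,
                  [[maybe_unused]] std::size_t cols)
      : _data(data), _row(row) {}
  Id* _data;
  std::size_t _row;
};
```

All three keep the constructor signature identical to the variable-width iterator, which is the point of the extra parameter.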

}
}

Row& operator=(const Row& other) {
Member

Do we need a check for self assignment here?

Member Author

I added the check

return *this;
}

Row& operator=(Row&& other) {
Member

Self assignment?

Member Author

I added the check
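For context, a minimal sketch of why the check matters when a row type releases or replaces its own storage during assignment. `OwningRow` is a hypothetical class that owns a heap buffer; the actual Row in this PR may manage its memory differently, so treat this as an illustration of the pattern only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

using Id = std::size_t;  // stand-in for the real Id type

// Hypothetical row that owns a heap-allocated buffer of `cols` Ids.
class OwningRow {
 public:
  explicit OwningRow(std::size_t cols) : _data(new Id[cols]()), _cols(cols) {}
  OwningRow(const OwningRow& other)
      : _data(new Id[other._cols]), _cols(other._cols) {
    std::copy(other._data, other._data + _cols, _data);
  }
  OwningRow(OwningRow&& other) noexcept
      : _data(other._data), _cols(other._cols) {
    other._data = nullptr;
    other._cols = 0;
  }
  ~OwningRow() { delete[] _data; }

  OwningRow& operator=(const OwningRow& other) {
    // Without this check, the delete[] below would free the very buffer
    // we are about to copy from.
    if (this == &other) return *this;
    delete[] _data;
    _data = new Id[other._cols];
    _cols = other._cols;
    std::copy(other._data, other._data + _cols, _data);
    return *this;
  }

  OwningRow& operator=(OwningRow&& other) noexcept {
    // Without this check, a self-move would delete the buffer and then
    // null out our own pointer.
    if (this == &other) return *this;
    delete[] _data;
    _data = other._data;
    _cols = other._cols;
    other._data = nullptr;
    other._cols = 0;
    return *this;
  }

  Id& operator[](std::size_t i) { return _data[i]; }
  std::size_t cols() const { return _cols; }

 private:
  Id* _data;
  std::size_t _cols;
};
```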

IdTableImpl<COLS>::setCols(cols);
}

void emplace_back() { push_back(); }
Member

Do we need the no param emplace_back() if we don't have one with params?

Member Author

I've added the no-params version for convenience (and because it was used in several places in the code, which made porting that code to IdTables easier). I haven't yet found a nice way of implementing an emplace with arguments that doesn't rely on C-style va_args.
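One possible direction for an emplace with arguments that avoids C-style variadics is a variadic template with a C++17 fold expression. The sketch below is an assumption about how that could look, not what this PR implements: `FixedTable` is a hypothetical class backed by a `std::vector`, and it only covers the fixed-width case with `COLS > 0`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Id = std::size_t;  // stand-in for the real Id type

// Hypothetical fixed-width table storing its rows densely in one vector.
// This sketch assumes COLS > 0.
template <int COLS>
class FixedTable {
 public:
  // Appends one row; the number of arguments is checked at compile time.
  template <typename... Args>
  void emplace_back(Args... values) {
    static_assert(sizeof...(Args) == static_cast<std::size_t>(COLS),
                  "emplace_back needs exactly one value per column");
    // Fold over the comma operator: pushes the values left to right.
    (_data.push_back(static_cast<Id>(values)), ...);
  }

  Id operator()(std::size_t r, std::size_t c) const {
    return _data[r * COLS + c];
  }
  std::size_t size() const { return _data.size() / COLS; }

 private:
  std::vector<Id> _data;
};
```

Passing the wrong number of values is then a compile error instead of a runtime problem, which is something `va_args` cannot offer.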

}

// _____________________________________________________________________________
void IndexScan::computeSOPfreeO(ResultTable* result) const {
result->_nofColumns = 2;
result->_data.setCols(result->_nofColumns);
Member

I think we could remove the _nofColumns member from ResultTable. That would only save 8 bytes, but it would also let us remove the line above this one.

Member Author

Removed

src/engine/OrderBy.cpp (outdated, resolved)
src/engine/OrderBy.cpp (resolved)
Member

niklas88 commented Mar 8, 2019

@floriankramer any idea what killed the QLever instance during the end-to-end tests with the latest changes?

Locally it seems to hang at

Sort result computation...

of the first query

Member

niklas88 commented Mar 8, 2019

@floriankramer I was able to reproduce this running under gdb

Fri Mar  8 17:14:42.340 - INFO:  ---------- WAITING FOR QUERY AT PORT "7,001" ...
Fri Mar  8 17:14:45.820 - INFO:  Incoming connection, processing...
Fri Mar  8 17:14:45.822 - INFO:  Query:
SELECT ?x SCORE(?t) TEXT(?t) WHERE {
          ?x <is-a> <Scientist> .
          ?t ql:contains-entity ?x .
          ?t ql:contains-word "relati*"
      }
      ORDER BY DESC(SCORE(?t))
ServerMain: /home/schnelle/projects/QLever/src/engine/../index/../engine/IdTable.h:643: void IdTableStatic<COLS>::insert(const iterator&, const iterator&, const iterator&) [with int COLS = 0; IdTableStatic<COLS>::iterator = IdTableImpl<0>::iterator]: Assertion `begin.cols() == cols()' failed.

Thread 2 "ServerMain" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff6395700 (LWP 26864)]
0x00007ffff7a91d7f in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7a91d7f in raise () from /usr/lib/libc.so.6
#1  0x00007ffff7a7c672 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff7a7c548 in __assert_fail_base.cold.0 () from /usr/lib/libc.so.6
#3  0x00007ffff7a8a396 in __assert_fail () from /usr/lib/libc.so.6
#4  0x0000555555dd73fa in IdTableStatic<0>::insert (this=0x7ffff0007b60, pos=..., begin=..., end=...)
    at /home/schnelle/projects/QLever/src/engine/../index/../engine/IdTable.h:643
#5  0x0000555556044245 in Sort::computeResult (this=0x7ffff000bdd0, result=0x7ffff0007b40)
    at /home/schnelle/projects/QLever/src/engine/Sort.cpp:52
#6  0x0000555555ae81b9 in Operation::getResult (this=0x7ffff000bdd0)
    at /home/schnelle/projects/QLever/src/engine/./././Operation.h:52
#7  0x0000555555ae8b81 in QueryExecutionTree::getResult (this=0x7ffff000d560)
    at /home/schnelle/projects/QLever/src/engine/././QueryExecutionTree.h:81
#8  0x0000555555e80c84 in Join::computeResult (this=0x7ffff000b9c0, result=0x7ffff00067a0)
    at /home/schnelle/projects/QLever/src/engine/Join.cpp:103
#9  0x0000555555ae81b9 in Operation::getResult (this=0x7ffff000b9c0)
    at /home/schnelle/projects/QLever/src/engine/./././Operation.h:52
#10 0x0000555555ae8b81 in QueryExecutionTree::getResult (this=0x7ffff000cf20)
    at /home/schnelle/projects/QLever/src/engine/././QueryExecutionTree.h:81
#11 0x0000555556062fca in OrderBy::computeResult (this=0x7ffff00096a0, result=0x7ffff0006590)
    at /home/schnelle/projects/QLever/src/engine/OrderBy.cpp:54
#12 0x0000555555ae81b9 in Operation::getResult (this=0x7ffff00096a0)
    at /home/schnelle/projects/QLever/src/engine/./././Operation.h:52
#13 0x0000555555ae8b81 in QueryExecutionTree::getResult (this=0x7ffff6394ac0)
    at /home/schnelle/projects/QLever/src/engine/././QueryExecutionTree.h:81
#14 0x0000555555adf919 in Server::composeResponseJson[abi:cxx11](ParsedQuery const&, QueryExecutionTree const&, unsigned long) const (this=0x7fffffffaff0, query=..., qet=..., maxSend=100) at /home/schnelle/projects/QLever/src/engine/Server.cpp:324
#15 0x0000555555adda1e in Server::process (this=0x7fffffffaff0, client=0x7ffff6394db0, qec=0x7fffffffadd0)
    at /home/schnelle/projects/QLever/src/engine/Server.cpp:183
#16 0x0000555555add068 in Server::runAcceptLoop (this=0x7fffffffaff0, qec=0x7fffffffadd0)
    at /home/schnelle/projects/QLever/src/engine/Server.cpp:80
#17 0x0000555555b00762 in std::__invoke_impl<void, void (Server::*)(QueryExecutionContext*), Server*, QueryExecutionContext*> (__f=
    @0x5555569fa4d8: (void (Server::*)(Server * const, QueryExecutionContext *)) 0x555555adcf2e <Server::runAcceptLoop(QueryExecutionContext*)>, __t=@0x5555569fa4d0: 0x7fffffffaff0, __args#0=@0x5555569fa4c8: 0x7fffffffadd0)
    at /usr/include/c++/8.2.1/bits/invoke.h:73
#18 0x0000555555afc732 in std::__invoke<void (Server::*)(QueryExecutionContext*), Server*, QueryExecutionContext*> (__fn=
    @0x5555569fa4d8: (void (Server::*)(Server * const, QueryExecutionContext *)) 0x555555adcf2e <Server::runAcceptLoop(QueryExecutionContext*)>, __args#0=@0x5555569fa4d0: 0x7fffffffaff0, __args#1=@0x5555569fa4c8: 0x7fffffffadd0)
    at /usr/include/c++/8.2.1/bits/invoke.h:95
#19 0x0000555555b09865 in std::thread::_Invoker<std::tuple<void (Server::*)(QueryExecutionContext*), Server*, QueryExecutionContext*> >::_M_invoke<0ul, 1ul, 2ul> (this=0x5555569fa4c8) at /usr/include/c++/8.2.1/thread:244
#20 0x0000555555b09732 in std::thread::_Invoker<std::tuple<void (Server::*)(QueryExecutionContext*), Server*, QueryExecutionContext*> >::operator() (this=0x5555569fa4c8) at /usr/include/c++/8.2.1/thread:253
#21 0x0000555555b094aa in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (Server::*)(QueryExecutionContext*), Server*, QueryExecutionContext*> > >::_M_run (this=0x5555569fa4c0) at /usr/include/c++/8.2.1/thread:196
#22 0x00007ffff7e79063 in std::execute_native_thread_routine (__p=0x5555569fa4c0)
    at /build/gcc/src/gcc/libstdc++-v3/src/c++11/thread.cc:80
#23 0x00007ffff7f66a9d in start_thread () from /usr/lib/libpthread.so.0
#24 0x00007ffff7b55b23 in clone () from /usr/lib/libc.so.6

This seems to indicate that somewhere the columns did indeed get mixed up:

../engine/IdTable.h:643: 
void IdTableStatic<COLS>::insert(const iterator&, const iterator&, const iterator&) 
[with int COLS = 0; IdTableStatic<COLS>::iterator = IdTableImpl<0>::iterator]: 
Assertion `begin.cols() == cols()' failed
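The assertion guards the invariant that rows can only be inserted into a table with the same number of columns. The sketch below shows that kind of check on a hypothetical variable-width table (`DynamicTable`, `appendAll` are made-up names) and throws where the real code asserts. A destination whose width was never set is just one way to trip such a check; the trace above does not say whether that is what happened in Sort::computeResult.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Id = std::size_t;  // stand-in for the real Id type

// Hypothetical variable-width table (comparable to the COLS = 0 case).
class DynamicTable {
 public:
  void setCols(std::size_t cols) { _cols = cols; }
  std::size_t cols() const { return _cols; }
  std::size_t size() const { return _cols == 0 ? 0 : _data.size() / _cols; }

  // Appends a single row; the row must match the table width.
  void push_back(const std::vector<Id>& row) {
    if (row.size() != _cols) {
      throw std::invalid_argument("row width mismatch");
    }
    _data.insert(_data.end(), row.begin(), row.end());
  }

  // Appends all rows of another table. Throws where the real code asserts.
  void appendAll(const DynamicTable& other) {
    if (other.cols() != cols()) {
      throw std::logic_error("column count mismatch");
    }
    _data.insert(_data.end(), other._data.begin(), other._data.end());
  }

 private:
  std::size_t _cols = 0;
  std::vector<Id> _data;
};
```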

@niklas88 niklas88 merged commit 6b71b66 into ad-freiburg:master Mar 9, 2019