Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat subquery #155

Merged
merged 5 commits into from Dec 5, 2018
Merged

Conversation

floriankramer
Copy link
Member

This pr adds support for subqueries. The handling of subqueries in the optimizer is currently very simplistic, as they are simply optimized separately, and the resulting QueryExecutionTree then added to the parents optimization. A simple improvement would be to compute the result of a separately optimized query part before optimizing the parent query to get correct size and multiplicity of the result (the same way scans are currently handled). As that is a change not only related to subqueries I wanted to create a separate pr for it.

Copy link
Member

@niklas88 niklas88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some clarification questions mainly because I'm not too familiar with C++ class instances in union. I've only ever used union in pure C.

Also congrats you really did get this done fast, color me truly impressed.

- selected: ["?a", "?o", "?h", "?avg"]
- contains_row: ['<Steve_Backshall>', '<Actor>', 1.8, 1.76627]
- contains_row: ['<Carl_Sagan>', '<Astrobiologist>', 1.8, 1.8]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is nothing in scientists which would allow to test this with ql:has-predicate? I'm currently looking into creating a smallish Wikidata subset which should allow us to create much better test queries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used bash (for p in $(cat ./scientists.nt | cut -d' ' -f 2 | sort | uniq); do cat ./scientists.nt | cut -d' ' -f 1 | uniq | grep $p; done) to check for predicates that occur as subjects int the scientists nt file, but didn't find any.

src/engine/QueryPlanner.cpp Show resolved Hide resolved
_subquery = other._subquery;
other._subquery = nullptr;
} else {
new (&_childGraphPatterns) std::vector<GraphPattern*>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need placement new to initialize an object in a union can't one just call the constructor explicitly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is, that only the vector or the subquery may be initialized due to the union. As far as I know, any form of calling the constructor explicitly includes then using the assignment operator, which would then operate on uninitialized memory (as _childGraphPattern does not get default constructed in the union). I haven't done anything like this before though, so I'm happy for feedback on how to handle this in a better way.

Copy link
Member

@niklas88 niklas88 Dec 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the second example here the std::string in the union is constructed using operator=() wand the std::vector<int> is constructed with placement new just as in your code. So maybe both works and operator=() doesn't care if the memory is uninitialized but maybe that's only true for std::string? Looking at that code we may also need to add explicit destructor calls for the std::vector<GraphPattern*> though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like S s = "str"; only works because it also constructs. The following fails:

#include <iostream>
#include <string>
#include <vector>
 
union S
{
    std::string str;
    std::vector<int> vec;
};

int main()
{
    S s;
    s.str = {"Hello, world"};
    // at this point, reading from s.vec is undefined behavior
    std::cout << "s.str = " << s.str << '\n';
    s.str.~basic_string();
    new (&s.vec) std::vector<int>;
    // now, s.vec is the active member of the union
    s.vec.push_back(10);
    std::cout << s.vec.size() << '\n';
    s.vec.~vector();
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the missing destructor calls for the vector

src/parser/ParsedQuery.cpp Show resolved Hide resolved
src/parser/SparqlParser.cpp Outdated Show resolved Hide resolved
@niklas88
Copy link
Member

niklas88 commented Dec 5, 2018

@floriankramer great. I think it's ready to merge

@floriankramer floriankramer merged commit 63a4212 into ad-freiburg:master Dec 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants