
Multi threaded version breaks on some queries (even with 1 thread) #56

Closed

niklas88 opened this issue May 23, 2018 · 7 comments

niklas88 (Member) commented May 23, 2018

While working on a query set, among about 50 queries that work fine with the multi-threaded version, I found the following query that returns no results on the multi-threaded version but 3,580 results on a8b495b (just before the addition of multi-threading):

PREFIX fb: <http://rdf.freebase.com/ns/>
PREFIX fbk: <http://rdf.freebase.com/key/>
PREFIX rdf: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?diocese_name WHERE {
  ?type rdf:label "Religious Jurisdiction"@en .
  ?diocese fb:type.object.type ?type .
  ?diocese fbk:wikipedia.en_title ?diocese_name .
}

I'm still investigating, but it seems that the join of fixed-size tables is somehow broken.

Here is debug output from the single-threaded version:
singlethread.txt

And from the multi-threaded version:
multithread.txt

Apart from the output being more verbose because of the LRUCache changes and it being in TRACE mode, the problem seems to occur at the following point (note the end):

Single:

DEBUG: Performing PSO scan for full relation: <http://rdf.freebase.com/ns/type.object.name>
DEBUG: Scan done, got 47,236,131 elements.
DEBUG: IndexScan result computation done.
DEBUG: Getting sub-result for Sort result computation...
DEBUG: Getting sub-results for join result computation...
DEBUG: IndexScan result computation...
DEBUG: Performing POS scan for full relation: <http://rdf.freebase.com/ns/type.object.type>
DEBUG: Scan done, got 266,316,909 elements.
DEBUG: IndexScan result computation done.
DEBUG: Join result computation...
DEBUG: Performing join between two fixed width tables.
DEBUG: A: witdth = 2, size = 266,316,909
DEBUG: B: witdth = 1, size = 3
DEBUG: Galloping case.
DEBUG: Join done.
DEBUG: Result: width = 2, size = 4,791

Multi:

DEBUG: Performing PSO scan for full relation: <http://rdf.freebase.com/ns/type.object.name>
DEBUG: Scan done, got 47,236,131 elements.
…
SCAN POS with P = "<http://rdf.freebase.com/ns/type.object.type>"
TRACE: We were the first to emplace, need to compute result
DEBUG: Performing POS scan for full relation: <http://rdf.freebase.com/ns/type.object.type>
DEBUG: Scan done, got 266,316,909 elements.
TRACE: Try to atomically emplace a new empty ResultTable
TRACE: Using key:
SCAN POS with P = "<http://www.w3.org/2000/01/rdf-schema#label>", O = ""Religious Jurisdiction"@en"
TRACE: Result already (being) computed
DEBUG: Performing join between two fixed width tables.
DEBUG: A: witdth = 2, size = 266,316,909
DEBUG: B: witdth = 1, size = 3
DEBUG: Galloping case.
DEBUG: Join done.
DEBUG: Result: width = 2, size = 0

Looking at the code, there is a somewhat dangerous-looking const-removal in the galloping code here, especially since the code is called with _fixedSizeData cast to std::vector<std::array> here, but I'm not sure about the root cause yet.
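For context, the sentinel trick that such galloping intersections typically rely on writes a stop value into one of the (logically const) inputs, which is why the const has to be cast away; with a shared result cache, that write can race with other readers of the same table. A bounds-checked variant needs no mutation at all. The sketch below is hypothetical (Id, the column layout, and gallopJoin are my naming, not QLever's actual code) and shows a galloping join over sorted inputs that stays within bounds:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Id = uint64_t;

// Join the (width 2) table a with the (width 1) table b on column 0.
// Both inputs are sorted by the join column. For each row of the small
// side b we gallop (exponential search, then binary search) through a.
// No sentinel is written, so both inputs can stay const and shared.
std::vector<std::array<Id, 2>> gallopJoin(
    const std::vector<std::array<Id, 2>>& a,
    const std::vector<std::array<Id, 1>>& b) {
  std::vector<std::array<Id, 2>> result;
  size_t i = 0;
  for (const auto& rowB : b) {
    // Exponential search: double the step until a[hi][0] >= rowB[0].
    size_t step = 1;
    size_t hi = i;
    while (hi < a.size() && a[hi][0] < rowB[0]) {
      i = hi;
      hi += step;
      step *= 2;
    }
    hi = std::min(hi, a.size());
    // Binary search inside the bracketed range [i, hi).
    i = std::lower_bound(a.begin() + i, a.begin() + hi, rowB[0],
                         [](const std::array<Id, 2>& row, Id v) {
                           return row[0] < v;
                         }) -
        a.begin();
    // Emit all rows of a matching this value of b's join column.
    while (i < a.size() && a[i][0] == rowB[0]) {
      result.push_back(a[i]);
      ++i;
    }
  }
  return result;
}
```

The bounds check (`hi < a.size()`, `i < a.size()`) replaces the sentinel, at the cost of one extra comparison per probe.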

niklas88 (Member, Author) commented

So I think the const_cast in the Join is somewhat fragile and might not be compatible with the multi-threading approach, but I really can't see how this breaks the case with a single thread. @floriankramer any ideas?

That said, the whole shenanigans with _fixedSizeData vs. _varSizeData look like premature optimization to me, and I'd like to know what kind of performance benefit this really brings (considering we are doing slow IO anyway). @Buchhold do you have performance data on this? The cost in terms of maintainability is certainly quite significant.

Buchhold (Member) commented

It depends, but extensive experiments were performed before I initially implemented the distinction. The speedup for joins once all data is in memory was significant. I don't have archived results, but the code should be in the Broccoli repository somewhere (I think the folder is called "trials" or something like that).

Sure, if you only consider benchmarks that have to read everything from disk (e.g. every query touches completely different relations), the speedup may be rather insignificant. But there are many cases where that isn't the bottleneck. To give a few examples:

(1) The initial motivation: Broccoli-style queries. Some lists (just think of type.object.name) are taken from cache across queries and intersected / merged with tiny lists that are read freshly. More than that: queries are built incrementally.

(2) Text operations that become leaves in the query tree. Writing variable size results somehow lead

(3) Queries building Cartesian products and thus producing long outputs from rather small (disk-read) input lists.

I'm sure there are more cases I cannot think of right now. If anything, one COULD re-evaluate the feature and maybe hope that compiler optimizations for fixed-size but not explicitly marked std::vectors have become even better, so that the feature has less impact overall now.
Anyway, I would disagree that the feature optimizes operations that never take a considerable amount of time.

Buchhold (Member) commented

PS: Unlike the split itself, though, changing the sentinel-based intersections (and thus the const_cast) to classic bounds checking could maybe be accomplished without significant time loss.

niklas88 (Member, Author) commented May 23, 2018

Interesting. Yes, I realize that there are significant in-memory performance considerations; still, the impact on maintainability seems very high. That said, nested std::vectors do have an additional indirection and probably much worse memory access patterns, as the rows aren't contiguous in memory. The latter should IMHO be fixable without complicating the code so much, though, for example by using manual row-major storage in a single vector plus some wrapping code, or a special allocator.
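The row-major idea could look like the following sketch: one contiguous std::vector<Id> plus a runtime width, instead of nested vectors or per-width std::array specializations. (RowTable is a hypothetical class for illustration, not QLever code.)

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <vector>

using Id = uint64_t;

// A table of Id rows stored row-major in one contiguous buffer.
// All rows share one allocation, so iterating rows is cache-friendly
// and needs no per-row pointer chase.
class RowTable {
 public:
  explicit RowTable(size_t width) : _width(width) {}

  // Append one row of exactly _width ids.
  void pushRow(std::initializer_list<Id> row) {
    assert(row.size() == _width);
    _data.insert(_data.end(), row.begin(), row.end());
  }

  size_t size() const { return _data.size() / _width; }
  size_t width() const { return _width; }

  // Row access returns a pointer into the contiguous buffer;
  // row[i] is the i-th column of that row.
  Id* operator[](size_t row) { return _data.data() + row * _width; }
  const Id* operator[](size_t row) const {
    return _data.data() + row * _width;
  }

 private:
  size_t _width;
  std::vector<Id> _data;
};
```

A wrapper like this keeps the cache-friendly layout of the fixed-width specializations while letting all join code be written once against a single, width-agnostic type.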

Anyway, back on topic: I've tried adding debug code here to print the 0-th values (both via the const and non-const refs), and it's really weird:

Single:

May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:07.679        - DEBUG: Performing POS scan for full relation: <http://rdf.freebase.com/ns/type.object.type>
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: Scan done, got 266,316,909 elements.
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: IndexScan result computation done.
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: Join result computation...
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: Performing join between two fixed width tables.
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: A: witdth = 2, size = 266,316,909
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: B: witdth = 1, size = 3
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: A[0][I] 287,364,587
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: B[0][J] 344,407,827
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: l1[0][I] 287,364,587
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: l2[0][J] 344,407,827
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.074        - DEBUG: Galloping case.
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.075        - DEBUG: Join done.
May 23 18:32:14 vulcano ServerMain[1876]: Wed May 23 18:32:14.075        - DEBUG: Result: width = 2, size = 4,791

Multi:

May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: Performing join between two fixed width tables.
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: A: witdth = 2, size = 266,316,909
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: B: witdth = 1, size = 3
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: A[0][I] 287,364,587
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: B[0][J] 344,407,827
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: l1[0][I] 287,364,587
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: l2[0][J] 344,407,827
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: Galloping case.
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: Join done.
May 23 18:05:07 vulcano ServerMain[32559]: Wed May 23 18:05:07.352        - DEBUG: Result: width = 2, size = 0

So the data looks to be at least partially the same and not complete garbage, and yet the join yields no results in the multi-threaded case. I also tried forcing the non-galloping case; that also doesn't yield a result. Interestingly, when debugging that, it does find the sentinel value.

floriankramer (Member) commented

I did some testing with the Freebase dataset and the query, and was able to reproduce the bug.
It appears that the data returned by the IndexScan for the type predicate is mostly zeroes, apart from a small part at the beginning of the vector.
Sometimes the query returns the correct results, though, independent of any modifications I made to the code during testing, so something outside of QLever must be influencing this (maybe something caching-related?).

floriankramer (Member) commented

It appears that pread does not read the entire object.type relation, leaving part of the ResultTable unfilled, as the File class currently does not check the return value of the pread call.

niklas88 (Member, Author) commented Jun 5, 2018

Looking at the return value of File.read() (which is ignored in scanPSO()), I think we are running into https://stackoverflow.com/questions/36565209/pread-for-very-large-files. What weird behavior on 64-bit systems...
