Strange behaviour with aggregate function #213
Comments
A few thoughts from Michael:
Tough one… best guess: a hash code collision issue, where we are doing
materialization at a different time in the case of LIMIT (and the hash codes
over materialized vs. unmaterialized values somehow differ)?
There is logic in AST2BOp that chooses between a materializing iterator and
materializing explicitly via an operator, depending on LIMIT and order
preservation requirements. I'd guess that this choice produces different plan
structures for the non-LIMIT vs. LIMIT cases, which triggers the hash problem.
You could likely verify this by inspecting the generated plan variants and
then drilling into the actual hash codes being used in the tail segment of
the plan for the variant without the LIMIT.
Bryan
On Wed, Nov 17, 2021 at 10:30 AM Giovanni Moretti wrote:
Hi,
I'm using a local instance of Blazegraph 2.1.6 RC and I've noticed that if
I submit a query with an aggregate function like GROUP_CONCAT, it hangs
unless I add a LIMIT clause.
The following query hangs:
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
SELECT ?lemma (GROUP_CONCAT(DISTINCT ?wr ; separator=", ") AS ?wrs)
WHERE {
  ?lemma ontolex:writtenRep ?wr
} GROUP BY ?lemma
This other one does not hang:
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
SELECT ?lemma (GROUP_CONCAT(DISTINCT ?wr ; separator=", ") AS ?wrs)
WHERE {
  ?lemma ontolex:writtenRep ?wr
} GROUP BY ?lemma
LIMIT 1000000000
Obviously my dataset is much smaller than 1000000000: it contains 138245
triples.
Any suggestions?
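To make the hash-collision guess concrete, here is a minimal, self-contained Java sketch (an editorial illustration, not code from this thread or from Blazegraph): MockedId is a hypothetical key class whose instances all report the same hash code, mimicking the behavior later confirmed for the mocked TermIds. Inserting many such keys into a HashMap degrades sharply, because every key lands in the same bucket.

import java.util.HashMap;
import java.util.Map;

// Editorial sketch only -- MockedId is a hypothetical class that mimics the
// suspected behavior: every instance reports the same hash code.
final class MockedId {
    final String cachedValue; // the materialized value
    MockedId(String cachedValue) { this.cachedValue = cachedValue; }
    @Override public int hashCode() { return 0; } // every key collides
    @Override public boolean equals(Object o) {
        return o instanceof MockedId
                && ((MockedId) o).cachedValue.equals(cachedValue);
    }
}

public class CollisionSketch {
    public static void main(String[] args) {
        Map<MockedId, String> idToConst = new HashMap<>();
        long t0 = System.nanoTime();
        for (int i = 0; i < 50_000; i++) {
            // All keys land in one bucket, so each insert must compare against
            // the keys already stored there; the cost grows with every insert,
            // and on a large dataset the query appears to hang.
            idToConst.put(new MockedId("term-" + i), "const-" + i);
        }
        System.out.printf("50k colliding inserts took %d ms%n",
                (System.nanoTime() - t0) / 1_000_000);
    }
}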
Thank you for your reply. OK, I understand the problem.
OK, it is definitely a materialization problem. I set the materializeProjectionInQuery variable in the AST2BOpContext class to false and thus implicitly forced the projection to be materialized outside the query plan.
Adding Michael.
Bryan
On Mon, Nov 22, 2021 at 11:02 AM Giovanni Moretti wrote:
I have another strange thing here.
The following query now works with the previously mentioned trick:
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
SELECT ?lemma (GROUP_CONCAT(DISTINCT ?wr ; separator=", ") AS ?wrs)
WHERE {
  ?lemma ontolex:writtenRep ?wr
} GROUP BY ?lemma
but this other one gets stuck again:
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
SELECT ?lemma (GROUP_CONCAT(DISTINCT ?wr ; separator=", ") AS ?wrs)
WHERE {
  ?lemma ontolex:writtenRep ?wr
} GROUP BY ?lemma
ORDER BY ?wrs
To fix this problem I have to write the query like this:
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
SELECT * {
  SELECT ?lemma (GROUP_CONCAT(DISTINCT ?wr ; separator=", ") AS ?wrs)
  WHERE {
    ?lemma ontolex:writtenRep ?wr
  } GROUP BY ?lemma
}
ORDER BY ?wrs
I think there is something wrong with the nesting optimization or
something related to the ORDER BY clause.
Can you please help me?
Looking at the first question again in more detail, I am attaching the two explains (for a data set with 20k triples, all having distinct subjects and objects but the same predicate). As suggested initially, there is a difference in the plan: the slow version (explain-wo-limit) contains a dedicated ChunkedMaterializationOp at the end, which clearly dominates the runtime.
Looking at the code, I guess the problem happens here: https://github.com/blazegraph/database/blob/master/bigdata-core/bigdata-rdf/src/java/com/bigdata/bop/rdf/join/ChunkedMaterializationOp.java#L375
We add the constructed TermIds (all with termId=0L, indicating that they are mocked) to the idToConstMap. I confirmed that all of them collide in the hash code (which is 0 as well). Possible ideas for fixes:
1.) Create a decorator for the idToConstMap (see e.g. https://stackoverflow.com/questions/41461515/hashmap-implementation-that-allows-to-override-hash-and-equals-method).
2.) Change hashCode() in TermId directly, i.e. if termId==0L, always compute the hash code based on the cached value. While this would be the "clean" way to fix the issue, it sounds quite risky and may cause problems with other code paths: I cannot come up with a scenario where it breaks, and I believe it is conceptually the right change, but it is really difficult to assess the blast radius of such an invasive change, both in terms of correctness and performance.
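A minimal sketch of fix idea 1.), assuming a small stand-in class that only captures the shape of TermId relevant here (the real Blazegraph types and accessors differ): wrap each map key in a decorator that hashes mocked ids on their cached value, leaving TermId itself untouched.

import java.util.HashMap;
import java.util.Map;

// Stand-in for the real TermId; an assumption for illustration only.
final class TermId {
    final long termId;        // 0L means "mocked"
    final String cachedValue; // the materialized value, when present
    TermId(long termId, String cachedValue) {
        this.termId = termId; this.cachedValue = cachedValue;
    }
    @Override public int hashCode() { return Long.hashCode(termId); } // mocked -> 0
    @Override public boolean equals(Object o) {
        if (!(o instanceof TermId)) return false;
        TermId t = (TermId) o;
        return termId == t.termId
                && java.util.Objects.equals(cachedValue, t.cachedValue);
    }
}

// Fix idea 1.): decorate the keys of idToConstMap so that mocked ids hash on
// the cached value instead of the constant 0.
final class TermKey {
    final TermId id;
    TermKey(TermId id) { this.id = id; }
    @Override public int hashCode() {
        return id.termId == 0L ? id.cachedValue.hashCode() : id.hashCode();
    }
    @Override public boolean equals(Object o) {
        return o instanceof TermKey && ((TermKey) o).id.equals(id);
    }
}

public class DecoratorSketch {
    public static void main(String[] args) {
        Map<TermKey, String> idToConstMap = new HashMap<>();
        for (int i = 0; i < 50_000; i++) {
            // Distinct cached values now produce distinct hash codes.
            idToConstMap.put(new TermKey(new TermId(0L, "term-" + i)), "c" + i);
        }
        System.out.println(idToConstMap.size()); // 50000, and it runs fast
    }
}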
Regarding your proposal ("I set the materializeProjectionInQuery variable in the AST2BOpContext class to false and thus I implicitly forced the projection to be materialized outside the plan"): yes, I think that should be okay as well, but it is more of a workaround, and it may negatively impact the performance of other queries.
Regarding your second question, here is the explain for the variant with ORDER BY (executed on some dummy data). As you can see, it contains two ChunkedMaterializationOps that are expensive, for the same reasons discussed above (with your change for materializeProjectionInQuery, there would probably be only one). In light of this variant, my recommendation would be to fix the root cause (the hash collisions) instead of changing materializeProjectionInQuery.
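For completeness, here is what fix idea 2.) could look like, again on a stand-in class rather than the real TermId: fold the fallback into hashCode() itself, so that every code path hashing a mocked id benefits. This is the variant described above as conceptually right but invasive, since it changes the hash of every mocked id everywhere.

// Sketch of fix idea 2.) on a stand-in class, not the real Blazegraph source:
// if the id is mocked (termId == 0L), always derive the hash code from the
// cached value.
final class TermIdFixed {
    final long termId;        // 0L means "mocked"
    final String cachedValue; // assumed to be set whenever termId == 0L
    TermIdFixed(long termId, String cachedValue) {
        this.termId = termId; this.cachedValue = cachedValue;
    }
    @Override public int hashCode() {
        return termId == 0L ? cachedValue.hashCode() : Long.hashCode(termId);
    }
    @Override public boolean equals(Object o) {
        if (!(o instanceof TermIdFixed)) return false;
        TermIdFixed t = (TermIdFixed) o;
        // Mocked ids compare by cached value; resolved ids by term id.
        return termId == 0L
                ? (t.termId == 0L && cachedValue.equals(t.cachedValue))
                : termId == t.termId;
    }
}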