perf: nexmark q7 become slower and slower #7244
I took a look at Nexmark q4. It seems it could be optimized by
SELECT
Q.category,
AVG(Q.final) as avg
FROM (
SELECT MAX(B.price) AS final, A.category
FROM auction A, bid B
WHERE A.id = B.auction AND B.date_time BETWEEN A.date_time AND A.expires
GROUP BY A.id, A.category
) Q
GROUP BY Q.category; |
I guess the problem is in Q7, not Q4, because Q7 contains the following join condition: risingwave/e2e_test/streaming/nexmark/views/q7.slt.part Lines 19 to 22 in 77534a9
This seems to be inefficient without interval join (risingwavelabs/rfcs#32). I'll investigate further and post the results here later. |
Yes, q7 looks like a perfect case for interval/band join. |
I ran Q4 and everything looks fine, so the problem is in Q7, as mentioned in #7244 (comment). I'll close this issue now. @chenzl25 Please help check Q7 after interval/band join is completed. |
Well let's leave it for tracking that 🤣 |
Please allow me to clarify first that I am only proposing an optimization for the special case in Nexmark q7; I think the interval/band join is necessary anyhow.
We can see that the SQL writer is racking their brains just to get the records with the highest bid price in each tumble window, so we can easily express it as a GroupTopN query.
And the plan looks much better
I think the two queries are equivalent. I checked the batch query against our e2e test data and it gives the correct result. Are they really equivalent? I know it is a little tricky, but can we rewrite this pattern to group top-n in the optimizer? 🤔 |
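For readers without the original snippets, here is a minimal sketch of the kind of equivalence check described above. It is not the query text from this thread. The join form is modeled on the usual Nexmark q7 shape, and the top-n form is one plausible way to write the GroupTopN rewrite. The toy rows, the integer timestamps, the `run_both` helper, and the use of SQLite (with integer arithmetic standing in for `TUMBLE`) are all assumptions made for illustration. `RANK()` is used rather than `ROW_NUMBER()` so that bids tied for the window maximum are all kept, as the join form does.

```python
import sqlite3

# Hypothetical toy data: (auction, bidder, price, date_time in seconds).
# Timestamps deliberately avoid window boundaries (multiples of 10).
BIDS = [
    (1, 10, 100, 1),
    (2, 11, 300, 4),
    (3, 12, 300, 7),   # ties with the previous bid for the window maximum
    (4, 13, 250, 12),
    (5, 14, 400, 18),
    (6, 15, 50, 23),
]

# Join form: MAX(price) per 10-second window, joined back with a range predicate.
JOIN_QUERY = """
SELECT B.auction, B.price, B.bidder, B.date_time
FROM bid B
JOIN (
    SELECT MAX(price) AS maxprice,
           (date_time / 10) * 10 + 10 AS window_end
    FROM bid
    GROUP BY window_end
) B1 ON B.price = B1.maxprice
WHERE B.date_time BETWEEN B1.window_end - 10 AND B1.window_end
ORDER BY B.auction
"""

# Group top-n form: rank bids by price inside each window and keep rank 1.
TOPN_QUERY = """
SELECT auction, price, bidder, date_time
FROM (
    SELECT *, RANK() OVER (
        PARTITION BY (date_time / 10) * 10
        ORDER BY price DESC
    ) AS price_rank
    FROM bid
)
WHERE price_rank <= 1
ORDER BY auction
"""


def run_both(bids):
    """Load the bids into an in-memory SQLite table and run both query forms."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bid (auction INTEGER, bidder INTEGER, "
        "price INTEGER, date_time INTEGER)"
    )
    conn.executemany("INSERT INTO bid VALUES (?, ?, ?, ?)", bids)
    join_rows = conn.execute(JOIN_QUERY).fetchall()
    topn_rows = conn.execute(TOPN_QUERY).fetchall()
    conn.close()
    return join_rows, topn_rows


if __name__ == "__main__":
    join_rows, topn_rows = run_both(BIDS)
    print("join form :", join_rows)
    print("top-n form:", topn_rows)
    print("same result:", join_rows == topn_rows)
```

One caveat on the sketch: the `BETWEEN` in the join form is inclusive on both ends, so a bid whose timestamp falls exactly on a window boundary can also match the preceding window. The two forms can differ in that case, which the toy data avoids. Window functions need SQLite 3.25 or newer.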
I think these two queries are equivalent, but the rewrite does not seem general enough and would be a little hard to implement. Let's see the performance comparison. If the performance improvement is large enough, we can try to implement this rewrite rule. |
FYI, the execution plan in Flink:
Seems no magic other than interval joins. 🥵 |
Yes, they are equivalent, but if we want to compare performance with Flink, I think we should run the rewritten query on both engines. |
But given the new Flink benchmark result, I kind of suspect that the interval join may not be the right choice here. Maybe we can force Flink to use a normal hash join and test once. |
Wait a minute, this one is a hash inner join. Why does #7244 (comment) show an interval join? Is it because that one is a logical plan? |
The throughput goes down from ~100K rows/s to ~3k rows/s. Still investigating.