New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIX] res_currency: add order by to _select_companies_rates query #98844
Conversation
Nice work, thanks! Minor: some useful information seems to be missing from the commit message (timings, reason why the fix was chosen, etc.), while being present in the PR desc (which is great, btw). Would you mind updating it? I have also 2 small questions, out of curiosity:
Thanks! |
nice job. |
👀 wow, amazing. |
51aa68e
to
3113adc
Compare
@odony (@ryv-odoo I think that you wanted to have a look at this one). Tl;dr: The PG10 vs PG12 differences comes from this commit. It forces
I investigated a bit deeper and it seems that the differences between PG10 and PG12 (I also tested in PG14 and it's the same as PG 12) are twofold. The first difference is the change in the way PG handles CTEs. This is refered to here under General performance improvements The issue comes from The second difference that impacts the performances is again linked to the inlining of the CTE. In the PG10 plan, the Merge Left Join has only two children, the outer relation and the inner relation. It behaves as it should by first sorting both relations and then joining them. In the PG12 plan, the same Merge Left Join has a third child, the scanning of the Subquery, SubPlan 1. This happens because SubPlan 1 is in the Join Filter of the Merge Left Join (the join conditions with cr.date_end in the initial query). The filter has to be run for each row that the Merge Left Join produces. One Index scan on the res_currency_rate table takes 0.004 ms locally. If we multiply that by the number of loops, i.e. the number of times the filter is applied, we get 0.004ms * 2327562 = 9310ms, which is the Actual Total Time reported in the # 36 Index Scan node of the PG 12 plan. The reason this didn't happen in PG 10 was because the CTE was materialized before the Merge Left Join. So the Join Filter didn't need to perform any additionnal scan since the currency_rate CTE is already scanned to be sorted as it is the inner relation of the Merge Left Join. Also, the date_end subquery that shows up in the JOIN has already been executed when materializing the CTE.
If we take the whole query and we manually inline the CTE by turning it into a subquery, the PG 10 plan closely ressemble the PG 12 one. Using ORDER BY (it also works with OFFSET 0) to give "hints" about the subquery is an old hack in PG versions prior to 12 (e.g. here or here). It seems that OFFSET 0 (or ORDER BY here) has been used as a kind of an optimization fence. When added to a subquery, it forces the optimizer to generate the subquery's own plan and to treat it independantly, instead of trying to pull the subquery up in the parent query or doing other inline optimizations. It's kind of hard to find the spot in postgres code where this OFFSET 0/ORDER BY changes the way the subquery is optimized but I did find a reference to OFFSET 0 being an optimization fence in the comments here. The ORDER BY works here because it kind of tells the optimizer So the ORDER BY solution is a bit of an hack and I think that using the |
PG12 introduced an optimization for CTEs that automatically inlines CTEs if they are only refered once in the parent query. Prior to that CTEs were always materialzed, meaning that PG created a sort of temp table on the fly to store the result of the CTE's evaluation. Whereas this leads to performance improvements in general, in the particular case of _select_companies_rates this inlining becomes a performance bottleneck. This is because while the currency_rate CTE is only refered once in both purchase_report and product_margin, the join condition (cr.date_end is null or cr.date_end > ...) requires evaluating the CTE's date_end subquery twice. This, combined with the fact that in PG12 the planner goes for a Nested Loop JOIN instead of a HASH Join in PG10 makes the performances of the whole query much worse in PG12 than in PG10. Adding an ORDER BY (or an OFFSET 0, the resulting plan is the same) creates a kind of optimization fence that forces PG to evaluate the subquery first using its own plan. This removes the need to rescan the subquery each time the Merge JOIN filter has to be applied, which is a good strategy in this specific situation. The same result could be achieved by adding the keyword "MATERIALIZED" in the CTE definition. The issue is that this keyword did not exist in PG 10 so using it would require to check the PG version at runtime from python. Examples of query timings change before and after PR: Number of POs | Before PR | After PR 2000 | 7s | 345ms 7000 | 23s | 1.1s opw-2930578
3113adc
to
cb9adb4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robodoo r+
PG12 introduced an optimization for CTEs that automatically inlines CTEs if they are only refered once in the parent query. Prior to that CTEs were always materialzed, meaning that PG created a sort of temp table on the fly to store the result of the CTE's evaluation. Whereas this leads to performance improvements in general, in the particular case of _select_companies_rates this inlining becomes a performance bottleneck. This is because while the currency_rate CTE is only refered once in both purchase_report and product_margin, the join condition (cr.date_end is null or cr.date_end > ...) requires evaluating the CTE's date_end subquery twice. This, combined with the fact that in PG12 the planner goes for a Nested Loop JOIN instead of a HASH Join in PG10 makes the performances of the whole query much worse in PG12 than in PG10. Adding an ORDER BY (or an OFFSET 0, the resulting plan is the same) creates a kind of optimization fence that forces PG to evaluate the subquery first using its own plan. This removes the need to rescan the subquery each time the Merge JOIN filter has to be applied, which is a good strategy in this specific situation. The same result could be achieved by adding the keyword "MATERIALIZED" in the CTE definition. The issue is that this keyword did not exist in PG 10 so using it would require to check the PG version at runtime from python. Examples of query timings change before and after PR: Number of POs | Before PR | After PR 2000 | 7s | 345ms 7000 | 23s | 1.1s opw-2930578 closes #98844 Signed-off-by: Raphael Collet <rco@odoo.com>
Just a random question: why do you need to check for SELECT r.currency_id,
COALESCE(r.company_id, c.id) AS company_id,
r.rate,
r.name AS date_start,
COALESCE(r2.name, NOW()) AS date_end
FROM res_currency_rate r
JOIN res_company c ON (r.company_id IS NULL OR r.company_id = c.id)
LEFT JOIN LATERAL (
SELECT name
FROM res_currency_rate
WHERE name > r.name
AND currency_id = r.currency_id
AND (company_id IS NULL OR company_id = c.id)
ORDER BY name ASC
LIMIT 1
) r2 ON 't' |
2 similar comments
Something else that can be done is adding the min/max dates in the context to avoid fetching too many currency rates. |
@Aurelienvd @rco-odoo this pull request has forward-port PRs awaiting action (not merged or closed):
|
PG12 introduced an optimization for CTEs that automatically inlines CTEs if they are only refered once in the parent query. Prior to that CTEs were always materialzed, meaning that PG created a sort of temp table on the fly to store the result of the CTE's evaluation. Whereas this leads to performance improvements in general, in the particular case of _select_companies_rates this inlining becomes a performance bottleneck. This is because while the currency_rate CTE is only refered once in both purchase_report and product_margin, the join condition (cr.date_end is null or cr.date_end > ...) requires evaluating the CTE's date_end subquery twice. This, combined with the fact that in PG12 the planner goes for a Nested Loop JOIN instead of a HASH Join in PG10 makes the performances of the whole query much worse in PG12 than in PG10. Adding an ORDER BY (or an OFFSET 0, the resulting plan is the same) creates a kind of optimization fence that forces PG to evaluate the subquery first using its own plan. This removes the need to rescan the subquery each time the Merge JOIN filter has to be applied, which is a good strategy in this specific situation. The same result could be achieved by adding the keyword "MATERIALIZED" in the CTE definition. The issue is that this keyword did not exist in PG 10 so using it would require to check the PG version at runtime from python. Examples of query timings change before and after PR: Number of POs | Before PR | After PR 2000 | 7s | 345ms 7000 | 23s | 1.1s opw-2930578 closes odoo/odoo#98844 Signed-off-by: Raphael Collet <rco@odoo.com>
PG12 introduced an optimization for CTEs that automatically inlines CTEs if they are only refered once in the parent query. Prior to that CTEs were always materialzed, meaning that PG created a sort of temp table on the fly to store the result of the CTE's evaluation. Whereas this leads to performance improvements in general, in the particular case of _select_companies_rates this inlining becomes a performance bottleneck. This is because while the currency_rate CTE is only refered once in both purchase_report and product_margin, the join condition (cr.date_end is null or cr.date_end > ...) requires evaluating the CTE's date_end subquery twice. This, combined with the fact that in PG12 the planner goes for a Nested Loop JOIN instead of a HASH Join in PG10 makes the performances of the whole query much worse in PG12 than in PG10. Adding an ORDER BY (or an OFFSET 0, the resulting plan is the same) creates a kind of optimization fence that forces PG to evaluate the subquery first using its own plan. This removes the need to rescan the subquery each time the Merge JOIN filter has to be applied, which is a good strategy in this specific situation. The same result could be achieved by adding the keyword "MATERIALIZED" in the CTE definition. The issue is that this keyword did not exist in PG 10 so using it would require to check the PG version at runtime from python. Examples of query timings change before and after PR: Number of POs | Before PR | After PR 2000 | 7s | 345ms 7000 | 23s | 1.1s opw-2930578 closes odoo/odoo#98844 Signed-off-by: Raphael Collet <rco@odoo.com>
PG12 introduced an optimization for CTEs that automatically inlines
CTEs if they are only refered once in the parent query. Prior
to that CTEs were always materialized, meaning
that PG created a sort of temp table on the fly to store the result
of the CTE's evaluation.
Whereas this leads to performance improvements in general, in the particular
case of
_select_companies_rates
this inlining becomes a performancebottleneck. This is because while the currency_rate CTE is
only refered once in both purchase_report and product_margin,
the Merge Join Filter
cr.date_end is null or cr.date_end > ...
requires evaluating the CTE's date_end subquery twice. This, combined
with the fact that in PG12 the planner goes for a Nested Loop Join instead
of a Hash Join in PG10 makes the performances of the whole query
much worse in PG12 than in PG10. PG10 query plan vs PG12 query plan.
We can see on the PG12 Query Plan that the Merge Join Node 7 scans the CTE twice (SubPlan 1 and SubPlan 2).
An eazy solution would be to add the keyword
Materialized
before the CTE definition. This keyword, introducedin PG12, forces the materialization of the CTE, avoiding its inlining and making the PG12 plan the same as the PG10 one. Unfortunately,
Materialized
wasn't there in PG10 so we cannot use this solution here since some Odoo instances still run on PG10.Instead, this PR adds an
ORDER BY date_end
clause at the end of the CTE. This works like an OFFSET 0as it is kind of an optimizer fence, forcing PG to evaluate the subquery first Ordered PG12 query plan
Speedup
Test DB with 7000 purchase_order, 2666 res_currency_rate and 6 active currencies.
Purchase_report query timings in PG12, changing the number of purchase_orders
Changing the number of res_currency_rate
--
I confirm I have signed the CLA and read the PR guidelines at www.odoo.com/submit-pr