Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace performance-killing double-left join postmeta queries with two separate queries #27746

Open
DavidAnderson684 opened this issue Sep 22, 2020 · 11 comments
Labels
Contributor Day - H1 2023 focus: performance The issue/PR is related to performance. plugin: woocommerce Issues related to the WooCommerce Core plugin. team: Proton type: enhancement The issue is a request for an enhancement.

Comments

@DavidAnderson684
Copy link
Contributor

DavidAnderson684 commented Sep 22, 2020

Because of the use of custom post types and postmeta, WooCommerce has several places that perform double left-joins on the post/postmeta tables in order to perform lookups. Probably the desire was to be efficient by only performing one SQL query. But the query itself, because of the double-left join, is hideously inefficient and worse by several orders of magnitude than an alternative performing an extra query.

As an example, every time a new order is placed by a new customer, since there's no _money_spent usermeta key/value, so WooCommerce will run this monstrosity in order to help populate it (WC_Customer_Data_Store::get_total_spent()): (perhaps that's triggered by one of our extensions, I haven't checked, but it is run for every new customer).

SELECT SUM(meta2.meta_value)
 FROM $wpdb->posts as posts
 LEFT JOIN {$wpdb->postmeta} AS meta ON posts.ID = meta.post_id
 LEFT JOIN {$wpdb->postmeta} AS meta2 ON posts.ID = meta2.post_id
 WHERE   meta.meta_key       = '_customer_user'
 AND     meta.meta_value     = '" . esc_sql( $customer->get_id() ) . "'
 AND     posts.post_type     = 'shop_order'
 AND     posts.post_status   IN ( 'wc-" . implode( "','wc-", $statuses ) . "' )
 AND     meta2.meta_key      = '_order_total'

The effect of creating a double-left join post/postmeta is to require MySQL to row-scan a number of rows that on a typical site amounts to a number of rows that is over 80% of the number of rows found in postmeta. That, of course, is an absolute performance-killer on any site with a substantial number of orders. On one site I'm looking at, it's a row scan of over 13,000,000 rows, every time there's a new order - ouch.

As a solution, I propose that you split the processing up over two much lower-impact SQL queries, with some processing in between in PHP. First get their orders - which you'd do via wc_get_orders() of course, but the resulting SQL will be like:

SELECT id FROM wp_posts p, wp_postmeta pm WHERE p.id=pm.post_id AND meta_key='_customer_user' AND meta_value='123456' AND post_status IN ( ... );

On the same site, that performs a row scan of only 1.3 % of rows - only 1/64th of the figure.

Then, armed with those IDs, grab the _order_total meta values with a simple query on the relevant post_id - since both meta_key and post_id are indexed, there's no further row-scanning involved.

This is just one example of those double left-joins - there are various others in the code.

@juliaamosova juliaamosova added the needs: developer feedback Issues that need feedback from one of the WooCommerce Core developers. label Sep 22, 2020
@lkraav
Copy link
Contributor

lkraav commented Sep 22, 2020

Looks like another duplicate of #20513

@DavidAnderson684 did you search issues for get_total_spent?

Not sure if this issue has separate value for the general proposal for overhauling all such queries still lurking in the codebase.

@DavidAnderson684
Copy link
Contributor Author

DavidAnderson684 commented Sep 22, 2020

@lkraav In my view it'd have value to provide an efficient solution now to each individual place, because the fixes described above are quick, easy and low impact, because they address a critical/widespread performance issue, and because the "move to using custom order tables" effort is about to approach its fifth birthday (#9735).

Yes, I'm aware of that effort - I took part in that thread (in which my concerns were successfully rebutted by Patrick Garman).

@DavidAnderson684
Copy link
Contributor Author

Store-owners having the same problem may be interested to note that for the specific instance of the problem above, they can deploy the woocommerce_customer_get_total_spent filter to over-ride that query with some code that implements the above suggestion (we haven't done so yet so I haven't attached any sample code).

@rrennick
Copy link
Contributor

@DavidAnderson684 @lkraav Are either of you interested in writing a PR for this?

Looks like another duplicate of #20513

It is. In the comment thread it was expected that WC Admin would address this. As WC Admin can be disabled with a filter, the fix for this would need to be done in core.

@rrennick rrennick added type: enhancement The issue is a request for an enhancement. focus: performance The issue/PR is related to performance. and removed needs: developer feedback Issues that need feedback from one of the WooCommerce Core developers. labels Sep 23, 2020
@DavidAnderson684
Copy link
Contributor Author

@rrennick We're currently working on suitable code, and will keep you informed.

By the way, is the wc_order_product_lookup table meant to have entries for all historical orders on a site? On the site I'm working on, the value of SELECT COUNT(*) FROM posts WHERE post_type='shop_order'; is about three times the value of SELECT COUNT(*) FROM wc_order_product_lookup;, which indicates that it's missing a huge amount of data?

@rrennick
Copy link
Contributor

By the way, is the wc_order_product_lookup table meant to have entries for all historical orders on a site?

wc_order_product_lookup is one record for each product on each order. wc_order_stats is one record per order + one per refund. *Only if the historical import was run on all of the order history.

@DavidAnderson684
Copy link
Contributor Author

"Only if the historical import was run on all of the order history."

@rrennick Is that something that ought to have happened if you've upgraded from prior WC versions? Or something you have to take additional steps to achieve? As I say, the numbers of rows I have don't at all match with what I'd expect if it was intended that all data gets imported automatically on upgrade.

@rrennick
Copy link
Contributor

Is that something that ought to have happened if you've upgraded from prior WC versions?

@DavidAnderson684 When a site is updated from WC 3.X to 4.X a notice is added that prompts to run the historical import. You can see the import screen in Analytics -> Settings. There is a dropdown that allows selecting a timeframe.

@DavidAnderson684
Copy link
Contributor Author

DavidAnderson684 commented Sep 28, 2020

@rrennick I see, thank you.

It appears that even though I pressed the button to import at that time (the WC 4.0 upgrade), and eventually that had seemed to finish (after a lot of pain - I temporarily had to multiply the CPU count by 20 ...), and even though the orders table is partially populated, on that page (when I go to Analytics -> Settings ) it's claiming that nothing has been imported. i.e. 0 customers, 0 orders currently imported.

So, I pressed it to start off again, and immediately started getting database deadlock errors. Thus I pressed "Stop". I'm not sure I want to upgrade all that CPU power to spend another couple of days running the import if it's likely to tell me that it succeeded at the end and then want to do it all over again again... I'm thinking that probably this - #26452 - is what needs addressing first?

@morvy
Copy link

morvy commented Apr 24, 2023

has this issue been discussed on Contributor Day on April 19th?

@nuil
Copy link

nuil commented Aug 30, 2023

What's the status on this?
We're also having some trouble with this. Our requests are taking more than 45seconds now.

I replaced it in our source-code for this query. Anyone interested in a PR?
Are there any other places where this issue is also happening?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Contributor Day - H1 2023 focus: performance The issue/PR is related to performance. plugin: woocommerce Issues related to the WooCommerce Core plugin. team: Proton type: enhancement The issue is a request for an enhancement.
Projects
Development

No branches or pull requests

8 participants