RUCSS - the wpr_rucss_used_css table can grow big on large websites #4802
Comments
@vmanthos It would be good to know how many URLs there are. For 18.4GB, it would be around 480,000 URLs when the Separate Cache for Mobiles is disabled, or 240,000 when it's enabled. Is this more or less accurate?
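As a rough check of the estimate above, the implied average row size can be worked out directly. This is only a back-of-the-envelope sketch: it assumes 1 GB = 10^9 bytes and that enabling the separate mobile cache stores two rows per URL, which is how the two figures in the comment relate. The function name is hypothetical.

```python
def implied_row_size(table_bytes: int, urls: int, separate_mobile_cache: bool = False) -> float:
    """Average bytes per stored row implied by a table size and a URL count.

    Assumption: with the separate mobile cache enabled, each URL has two rows
    (desktop and mobile); otherwise one row per URL.
    """
    rows = urls * (2 if separate_mobile_cache else 1)
    return table_bytes / rows


TABLE_BYTES = 18_400_000_000  # 18.4 GB, assuming decimal gigabytes

desktop_only = implied_row_size(TABLE_BYTES, 480_000)
with_mobile = implied_row_size(TABLE_BYTES, 240_000, separate_mobile_cache=True)
print(f"{desktop_only / 1000:.1f} KB per row, {with_mobile / 1000:.1f} KB per row")
```

Both cases imply the same average of roughly 38 KB per row, which is the assumption behind the two URL counts.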
@piotrbak One sub-sitemap of the specific site had 6 other sub-sitemaps containing more than 17,000 URLs each. We are talking about hundreds of thousands of URLs in this case. But the size of the table can be an issue even if it doesn't reach that size. Here is the comment from a customer:
Ticket: https://secure.helpscout.net/conversation/1814909181/331377/
@piotrbak I encountered the same problem. My website has about 160,000 (16w) product pages, and I use the Elementor Pro editor. I checked wp_wpr_rucss_used_css on my WooCommerce site: after only about a week, it is already 14GB. Although I am on a dedicated server, it cannot sustain this in the long run, and I am very worried that the server could crash at any time. Can you fix this as soon as possible or suggest a workaround? I also use WP Rocket's mobile optimization, along with the unused CSS removal and JS optimization on desktop. I have been looking everywhere for a solution recently, but so far I haven't found one.
Possible fix: The problem might be prevented if we stored the Used CSS in files instead of the database. Styles would still be added inline to the page, as is currently done. The content would just be fetched from a file instead of the database, by refactoring this function to check whether a page's Used CSS exists and fetch its content if it does - wp-rocket/inc/Engine/Optimization/RUCSS/Controller/UsedCSS.php Lines 578 to 584 in 15574e3
Using naming conventions for files (like we do for the cache), we won't need to connect to the database to find and fetch the content (less overhead on the database). On servers that support Gzip (most servers nowadays do), we can consider storing the Used CSS as gzipped files directly. That would save more disk space compared with the full size stored in the database: an 80% size gain on average. But it will have an overhead to uncompress the
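The idea above (a file per page, located by a naming convention and stored gzipped) could look roughly like the following. This is a minimal sketch in Python for illustration only, not WP Rocket's PHP code: the function names, the directory layout, and the choice of an MD5 of the URL path as the file name are all assumptions.

```python
import gzip
import hashlib
from pathlib import Path
from typing import Optional
from urllib.parse import urlsplit


def used_css_path(base_dir: Path, url: str, is_mobile: bool = False) -> Path:
    """Derive a deterministic file path from the URL (hypothetical convention)."""
    parts = urlsplit(url)
    # Hash path + query so any URL maps to a safe, fixed-length file name.
    key = hashlib.md5(f"{parts.path}?{parts.query}".encode("utf-8")).hexdigest()
    suffix = "-mobile" if is_mobile else ""
    return base_dir / (parts.hostname or "unknown-host") / f"{key}{suffix}.css.gz"


def save_used_css(base_dir: Path, url: str, css: str, is_mobile: bool = False) -> Path:
    """Write the Used CSS for a URL as a gzipped file and return its path."""
    path = used_css_path(base_dir, url, is_mobile)
    path.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        fh.write(css)
    return path


def load_used_css(base_dir: Path, url: str, is_mobile: bool = False) -> Optional[str]:
    """Return the stored Used CSS, or None if no file exists for this URL."""
    path = used_css_path(base_dir, url, is_mobile)
    if not path.exists():
        return None
    # Decompression happens on every read: this is the overhead mentioned above.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return fh.read()
```

Because the path is computed from the URL alone, checking for and reading a page's Used CSS needs no database query. The actual compression ratio depends on the CSS, so the 80% figure is not something this sketch demonstrates.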
Related: https://secure.helpscout.net/conversation/1824377026/333287/ Used CSS averages 200KB.
Related: https://secure.helpscout.net/conversation/1837014809/335667
Related: https://secure.helpscout.net/conversation/1851789174/337990/ Used CSS is ~0.85MB per URL; when I checked, the table had ~660 rows weighing ~560MB.
Related: https://secure.helpscout.net/conversation/1844337435/336865 The site has over 20k posts and the Used CSS is ~100KB for each URL.
Related: https://secure.helpscout.net/conversation/1851953603/338040?folderId=2683093 Used CSS is ~320KB.
Related: https://secure.helpscout.net/conversation/1850560070/337857?folderId=3864740 70,000 posts.
There has to be an expiry of some sort to discard the cache for posts that are no longer active. It should probably include a setting for the number of days since the last-viewed time. Mine is a current affairs website, so for me 7 days is long enough. But sites that aren't updated as frequently, and whose content remains relevant, may want a much longer period.
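The expiry suggested above could be sketched as follows. This is an illustration under assumptions, not an existing WP Rocket feature: it presumes a last-viewed timestamp is tracked per URL, uses in-memory dictionaries in place of real storage, and all names are hypothetical.

```python
from typing import Dict, List

SECONDS_PER_DAY = 86_400


def find_expired(last_viewed: Dict[str, float], now: float, max_age_days: int = 7) -> List[str]:
    """Return URLs whose last view is older than max_age_days."""
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(url for url, viewed_at in last_viewed.items() if viewed_at < cutoff)


def purge_expired(
    store: Dict[str, str],
    last_viewed: Dict[str, float],
    now: float,
    max_age_days: int = 7,
) -> int:
    """Drop the Used CSS of expired URLs and return how many were removed."""
    expired = find_expired(last_viewed, now, max_age_days)
    for url in expired:
        store.pop(url, None)        # stands in for deleting a row or a file
        last_viewed.pop(url, None)
    return len(expired)
```

Here `max_age_days` plays the role of the proposed setting: a news site might keep the default of 7, while a site with long-lived content would pass a larger value.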
@Tabrisrp whenever you're available, we need to think about moving the data to the filesystem here. Of course, using compression where we can, etc. Let's have a discussion about the approach we'll take.
Related ticket: https://secure.helpscout.net/conversation/1864520160/340083/ There are ~6,500 URLs and the database size is ~1.5GB.
Related: https://secure.helpscout.net/conversation/1866269415/340397?folderId=377611
Possibly related: https://secure.helpscout.net/conversation/1865916125/340328
I think we have two options to consider to possibly solve this issue:
@engahmeds3ed option 1 will still fail to fix the issue on sites with a high number of pages and a low ratio of duplicates. Option 2 is more efficient and would fit most shared hosting environments, where file storage is unlimited and database storage usually has a cap. It also gives the flexibility to store the CSS gzipped when possible, which should take less space. The hash part is still valid. Maybe it could be used for the stored CSS file names to avoid duplicates? The hash would be referenced in the Used CSS table to manage the interaction with the SaaS and to pick the right Used CSS file to inject. This will ultimately result in storing fewer Used CSS files. It will also result in storing less in the Used CSS table, as the hash weighs less than the full filesystem path.
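The hash-as-file-name idea can be illustrated with a small content-addressed store. This is a sketch under assumptions, written in Python for illustration: dictionaries stand in for the files on disk and for the Used CSS table, MD5 is an arbitrary choice of hash, and the class and method names are hypothetical.

```python
import hashlib
from typing import Dict, Optional


class UsedCssStore:
    """Stores each distinct Used CSS once, keyed by the hash of its content."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}        # hash -> CSS (stands in for files on disk)
        self.url_to_hash: Dict[str, str] = {}  # URL -> hash (stands in for the table)

    def save(self, url: str, css: str) -> str:
        digest = hashlib.md5(css.encode("utf-8")).hexdigest()
        # Identical CSS yields the same hash, so the content is written only once.
        self.files.setdefault(digest, css)
        self.url_to_hash[url] = digest
        return digest

    def load(self, url: str) -> Optional[str]:
        digest = self.url_to_hash.get(url)
        return None if digest is None else self.files.get(digest)
```

With this layout, the table holds only a short hash per URL, and pages that produce identical Used CSS share one stored file. On sites where almost every page yields different CSS, the deduplication gains little, which matches the concern raised about the low ratio of duplicates.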
Some thoughts:
@Tabrisrp In terms of hashing the filenames, we'd be storing them all in the same directory. If we had 70k or 100k CSS files there, wouldn't it slow down operations on those files?
@piotrbak could using the URL as a path, the same way we do with the cache, solve the problem? (I saw that we have the URL in the table, which we could use for that.)
@CrochetFeve0251 Yes, but then sharing the same CSS file between different posts will be a bit harder/misleading.
@piotrbak how about a tree system? Example:
How about we create subfolders from the hash itself? Let's say this is the hash of the generated RUCSS - We can use the leftmost characters as subfolders. The file structure would be something like: We can add as many subfolders as needed, depending on how many pages the website has, or by using a filter. The folders won't always contain the same number of files, but this avoids putting everything in the same folder without overcomplicating how it is managed.
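The subfolder scheme described above can be sketched as a small path function. This is a hypothetical illustration, not the plugin's code: the base directory name, the default depth of 3, and the choice to use the remaining characters as the file name are all assumptions.

```python
from pathlib import PurePosixPath


def sharded_path(css_hash: str, depth: int = 3, base: str = "used-css") -> PurePosixPath:
    """Map a hash to base/<c1>/<c2>/.../<rest>.css using its leftmost characters.

    `depth` is the number of leading characters turned into nested folders.
    It could be raised for very large sites (e.g. through a filter).
    """
    if depth < 0 or depth >= len(css_hash):
        raise ValueError("depth must be non-negative and shorter than the hash")
    folders = list(css_hash[:depth])  # one folder level per leading character
    return PurePosixPath(base, *folders, css_hash[depth:] + ".css")


# Example with a made-up hash value:
print(sharded_path("d41d8cd98f00b204e9800998ecf8427e"))
```

With a hexadecimal hash, each level has at most 16 subfolders, so a depth of 3 spreads the files over up to 4,096 leaf folders. The distribution is not perfectly even, as noted above, but no single directory has to hold every file.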
Likely related: https://secure.helpscout.net/conversation/1872967693/341323/
Scope a solution ✅
In …
Create a new …
This class will be used as a dependency for the …
Update the …
Add/Update tests to match all the changes.
Estimate the effort ✅
Effort [M]
Would it be possible to have this fix in 3.11.3? My host is giving me a temporary free database upgrade.
This table is 46GB on my client's site that has over 100,000 WooCommerce products in it. Even the postmeta table for that many products is only 2.4GB. I only just discovered this issue when trying to clone the database to my local dev environment (they installed the WP Rocket plugin on their own) and saw how huge it was. It's going to use all their disk space very quickly with the nightly backups.
@dbarproductions My host also makes daily automatic DB backups. Normally they keep 14 days of backups, but I see that in my case they only have 7 days. They are probably trying to keep it within the limits that way for a little longer.
@dbarproductions + @bwafels the issue will soon be fixed, as storing the Used CSS will move from the database to the filesystem.
For a site with 10k products (of 50 attributes), the table is 5GB. I do not think that creating CSS for each and every page/product does the trick, regardless of the storage method. Obviously, with the filesystem we will not have the increased memory requirements that we are now facing. Our site consumes 15GB of RAM even when "idle", thanks to your huge table, which is 8 (eight) times the size of the rest of the content. What you have to do, gentlemen, is follow TagDiv's lead and create ONE optimal CSS for each page template, NOT ONE PER SINGLE PAGE: e.g. ONE CSS file for the product page, ONE for the category page, ONE for the homepage, etc. Assume the "worst case scenario", e.g. a page that includes related products, cross-sells, upsells, etc., and apply this ONE template to all product pages. If unsure, just allow end users to run the optimizer manually on their most complex product page, as TagDiv does per template. I would really love to use your great work in CSS optimization in another project. Sadly, it has over 150k products and your architecture makes that prohibitive. Other than that, just keep up the good work in offering the best optimization on the market!
Sorry, but it's a no-go ;) Why? Check our website for a very good example. All our pages are totally different and don't contain the same blocks, so each of them needs a different Used CSS. By the way, the issue has been fixed since WP Rocket 3.11.4.
Describe the bug
The wpr_rucss_used_css table's size depends on the size of the used CSS for each page, and also the number of pages on a site. On large websites that have a lot of URLs, it can grow out of proportion. In one case, that table's size reached 18.4GB. After internal discussion, I'm creating this GitHub issue to monitor these cases, and see if there is anything we can do in the future.
To Reproduce
Not relevant.
Additional context
This is different from #4161 which will be resolved with the new implementation of the feature.
Tickets:
https://secure.helpscout.net/conversation/1810416819/330445/
https://secure.helpscout.net/conversation/1657346732/300522
https://secure.helpscout.net/conversation/1670431889/303126
https://secure.helpscout.net/conversation/1665033380/301876
https://secure.helpscout.net/conversation/1645065559/297803
Backlog Grooming (for WP Media dev team use only)