Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUCSS - database grows large when versions of the CSS/JS files change frequently #4161

Closed
3 of 4 tasks
NataliaDrause opened this issue Jul 15, 2021 · 59 comments
Closed
3 of 4 tasks
Assignees
Labels
module: remove unused css priority: high Issues which should be resolved as quickly as possible severity: major Feature is not working as expected and no work around available status: blocked Issue or PR is blocked by external factor. type: bug Indicates an unexpected problem or unintended behavior

Comments

@NataliaDrause
Copy link
Contributor

NataliaDrause commented Jul 15, 2021

Before submitting an issue please check that you’ve completed the following steps:

  • Made sure you’re on the latest version - 3.9.0.5
  • Used the search feature to ensure that the bug hasn’t been reported before - yes

Describe the bug
When a page contains a CSS or JS file with a version that is changing dynamically very often (for example on each page load), the database may grow very large even for smaller sites.

To Reproduce
The issue is reported in the ticket https://secure.helpscout.net/conversation/1571200331/279553/
where a version on the file like this: https://example.com/wp-content/uploads/essential-addons-elementor/cb70d11b8.min.css?ver=1626273953 was changing on every page load.

Expected behavior
Avoid regeneration of used CSS if the content of the file doesn't change.

Screenshots
Video shared in Slack: https://wp-media.slack.com/archives/C08N8J6VC/p1626265876027200?thread_ts=1626264564.026500&cid=C08N8J6VC

Additional context
Slack thread: https://wp-media.slack.com/archives/C08N8J6VC/p1626264564026500

@engahmeds3ed remarks:

The main problem here is the version, as for each url we search for it on the DB and then compare the hash of the content on the DB with the hash of the content on the file system so in our case we have different url in each cache generation, it's something like the WPR issue related to the huge size of min directory inside cache when any resource has timestamp on the version.
We may search with the hash not the url.

Ticket: https://secure.helpscout.net/conversation/1571200331/279553/

Backlog Grooming (for WP Media dev team use only)

  • Reproduce the problem
  • Identify the root cause
  • Scope a solution
  • Estimate the effort
@GeekPress GeekPress added module: remove unused css needs: grooming priority: high Issues which should be resolved as quickly as possible type: bug Indicates an unexpected problem or unintended behavior labels Jul 15, 2021
@GeekPress GeekPress added this to the 3.9.2 milestone Jul 16, 2021
@webtrainingwheels
Copy link

https://secure.helpscout.net/conversation/1575931995/281101?folderId=377611
Also related to the same Elementor Essential Addons file.

@iCaspar iCaspar added the GROOMING IN PROGRESS Use this label when the issue is currently being groomed. label Jul 19, 2021
@iCaspar
Copy link
Contributor

iCaspar commented Jul 19, 2021

Gooming Notes:

Can Reproduce:

No need to reproduce. The issue is visible in the code directly.

Root Cause:

Per @engahmeds3ed remarks in OP refs: We are checking whether a given URL exists already in the DB and then adding it if not. Thus, if a version changes in the URL, regardless of whether the content of that URL is identical to any previously saved version, we create a new resource for it in the DB.

Scope a Solution:

In our Database\Queries\ResourcesQuery class the create_or_update() method creates a new item if get_item_by( $url, $resource['url'] ) returns empty. We should instead check by matching the content hash:
get_item_by( 'hash', md5( $resource[ 'content' ] ) )

Note, It's worth considering that we could also use get_item_by( 'content', $resource[ 'content' ] ). I prefer the hash because the hash column will be indexed and provide much better performance for the comparison/lookup. The downside is that the hash will not be absolutely accurate about whether contents are actually changed -- we could get false matches in some rare instances (@see https://stackoverflow.com/questions/2444321/how-are-hash-functions-like-md5-unique) which would result in not saving a new resource when we should. Comparing contents directly would be 100% accurate, but a vastly more expensive query.

Tests and fixtures will need to be updated to reflect the new parameters and to add verifictation of not duplicating similar URLs with identical content.

Estimate Effort:

[S]

@iCaspar iCaspar added effort: [S] 1-2 days of estimated development time and removed GROOMING IN PROGRESS Use this label when the issue is currently being groomed. needs: grooming labels Jul 19, 2021
@engahmeds3ed
Copy link
Contributor

@iCaspar
What do you think if we checked firstly with the url and if not found, search with the hash, this will eliminate the error margin but may lead to some additional queries in our issue's case.

What do you think?

@iCaspar
Copy link
Contributor

iCaspar commented Jul 19, 2021

Indeed, We could run both checks in those cases.

@Alex43515
Copy link

@iCaspar iCaspar self-assigned this Jul 21, 2021
@Tabrisrp Tabrisrp added the severity: major Feature is not working as expected and no work around available label Jul 21, 2021
@ghost
Copy link

ghost commented Jul 24, 2021

All:

Thanks for addressing this. Our database table "wp_wpr_rucss_resources" (size) is HUGE. Taking up approximately 90% of our whole DB size.

Looking forward to your fix.

Cheers!

@NataliaDrause
Copy link
Contributor Author

@iCaspar iCaspar added status: blocked Issue or PR is blocked by external factor. and removed effort: [S] 1-2 days of estimated development time labels Aug 20, 2021
@iCaspar
Copy link
Contributor

iCaspar commented Aug 20, 2021

For Blocked status @see #4199

@viobru
Copy link
Contributor

viobru commented Aug 28, 2021

@NataliaDrause
Copy link
Contributor Author

@webtrainingwheels
Copy link

@vmanthos
Copy link
Contributor

@vmanthos
Copy link
Contributor

Related: https://secure.helpscout.net/conversation/1634776100/295569?folderId=2135277

In this case, it seems the culprit is the Salient theme's dynamic styles file:
/themes/salient/css/salient-dynamic-styles.css?ver=26257

@mifrero
Copy link

mifrero commented Jan 17, 2022

@NataliaDrause
Copy link
Contributor Author

@camilamadronero-zz
Copy link

@vmanthos
Copy link
Contributor

vmanthos commented Feb 7, 2022

@ecotechie
Copy link

Same as others here. I find this feature unusable since it makes the database size grow astronomically. Patiently awaiting a fix, till then asynchronous loading it is.

@G00golplex
Copy link

Same here, Identical issue. The site crashed because the maximum site of the database has been reached. is there really no fix for this?

@tbba
Copy link

tbba commented Feb 17, 2022

I also had a database going from 8 MB to 800 MB because of this. Such large database killed lots of actions like saving/restoring or doing seach/replaces in there. Timeouts, stalls, and so on.

Issue:
There is no way to empty that table other than purge the content with some database tool.

Expected / suggestion:
The table (which is cache and can be recreated anytime) should be purged

  • when switching the feature off in the ROCKET settings,
  • on deactivating the ROCKET module,
  • maybe also as a ROCKET admin menu entry
  • and maybe become a ROCKET database clean-up setting.
  • And there should be a warning when activating this about the possible effects to the database.

So at least this beast becomes more handleable.

@vmanthos
Copy link
Contributor

@camilamadronero-zz
Copy link

@NataliaDrause
Copy link
Contributor Author

@tbba
Copy link

tbba commented Feb 21, 2022

Are you sure you want to hide the discussion away from GitHub to some closed space via those "related" links?
This is not really encouraging users to further providing free feedback.
Or what is the reason? and what is the outcome?

@Chaiavi
Copy link

Chaiavi commented Feb 22, 2022

I don't understand the DEV here

This is a SEVERE bug, it breaks some of the sites the plugin is installed on and for moths they are not taking care of it.

This bug should have been addressed the moment it was found or at least within the week.

What is WP-ROCKET waiting for ?
Beats me

@piotrbak
Copy link
Contributor

Hello @tbba and @Chaiavi. Related URLs are for internal use only, they are pointing to the support tickets that are associated with the specific issue. There are a couple of reasons why we are doing that.

When it comes to the issue itself, we are working on it for a longer period of time already. The fix will be announced and released soon.

@NataliaDrause
Copy link
Contributor Author

@vmanthos
Copy link
Contributor

vmanthos commented Mar 1, 2022

@camilamadronero-zz
Copy link

@brandonjp
Copy link

brandonjp commented Mar 17, 2022

Until this is fixed, is there an authoritative solution (or at least information source) so we can know how to manually remedy this? Right now I am completely unable to even log into my /wp-admin because the wpr_rucss_resources table has grown to over 6gb (which is my db limit on this server)!

Specifically:

  1. Can I safely purge/empty/truncate the wpr_rucss_resources table? Will this have any negative impact on my public facing site?
  2. The docs state: "Every 30 days WP Rocket runs a cleanup to remove any unused resources from this table." Does this feature need to be toggled on? Are there any known bugs with the cleanup task? (My wpr_rucss_resources table has over 83k rows and at least 9k of them are at least 31+ days old.) I can't seem to find any related issues or info on the cleanup task.

@DahmaniAdame
Copy link
Contributor

DahmaniAdame commented Mar 17, 2022

@brandonjp 

Can I safely purge/empty/truncate the wpr_rucss_resources table? Will this have any negative impact on my public facing site?

Yes. 

The docs state: "Every 30 days WP Rocket runs a cleanup to remove any unused resources from this table." Does this feature need to be toggled on?

No. It's there out of the box. No setting is required. 

My wpr_rucss_resources table has over 83k rows and at least 9k of them are at least 31+ days old.

The process depends on the consistency of the CRON. It could be that. 

In all cases, it's best if you wait for the next iteration of Remove Unused CSS in your case where the resources table will no longer be used on the WordPress side.

@nb4cks
Copy link

nb4cks commented Mar 22, 2022

@piotrbak I encountered the same problem. My website has about 16w product pages. I use the editor of elementor pro. I checked wp_wpr_rucss_used_css in woocommerce. Now it has only been about a week and it has a size of 14G. Although I am a dedicated server , but this can not support long-term operation, I am very worried that the server will not be able to withstand the crash at any time, can you solve it as soon as possible or give a solution, I also use the compressed mobile phone side of wp-rocket and the useless css of the computer side and js, I have been looking for solutions everywhere recently, but so far, I don't seem to have found a solution!

@BasedoEcommerce
Copy link

@brandonjp 

Can I safely purge/empty/truncate the wpr_rucss_resources table? Will this have any negative impact on my public facing site?

Yes. 

The docs state: "Every 30 days WP Rocket runs a cleanup to remove any unused resources from this table." Does this feature need to be toggled on?

No. It's there out of the box. No setting is required. 

My wpr_rucss_resources table has over 83k rows and at least 9k of them are at least 31+ days old.

The process depends on the consistency of the CRON. It could be that. 

In all cases, it's best if you wait for the next iteration of Remove Unused CSS in your case where the resources table will no longer be used on the WordPress side.

My storage is almost full because of this issue and I can't just delete this table?

I've already disabled the "Optimize CSS delivery" option and still I can't delete the wpr_rucss_resources table?

@vmanthos
Copy link
Contributor

@alfonso100
Copy link
Contributor

@piotrbak piotrbak modified the milestone: 3.11.1 Apr 5, 2022
@piotrbak
Copy link
Contributor

This is fixed with 3.11 as we are not storing the data of CSS/JS files anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: remove unused css priority: high Issues which should be resolved as quickly as possible severity: major Feature is not working as expected and no work around available status: blocked Issue or PR is blocked by external factor. type: bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.