Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed: Don't empty node cache when creating a new node. #6081

Closed
danksearle opened this issue Apr 26, 2023 · 19 comments · Fixed by backdrop/backdrop#4414
Closed

Fixed: Don't empty node cache when creating a new node. #6081

danksearle opened this issue Apr 26, 2023 · 19 comments · Fixed by backdrop/backdrop#4414

Comments

@danksearle
Copy link

danksearle commented Apr 26, 2023

Description of the bug

In the core node.entity.inc is this line:

$this->resetCache($op == 'update' ? array($entity->{$this->idKey}): array());

when the $op is 'insert', it sends an empty array to resetCache, which results in it doing this:

cache_flush('cache_entity_' . $this->entityType);

Is there a reason for it to need to do that?

It seems unnecessary, and if you have a page that is very complex (perhaps badly written) and slow to load when the cache is empty, then that page loads slowly after new nodes are created.

I would suggest that the line in node.entity.inc be rewritten to this:

if($op == 'update') $this->resetCache(array($entity->{$this->idKey}));

So it doesn't clear the cache when inserting new nodes.

Steps To Reproduce

You need to have a page that is very complex, that loads slowly if there's no cache, but that runs fast with a cache. Have a way to add new data that affects that page content. In my case a dialog opens on top with a form, the form submission is via ajax, the node is saved and the page content is reloaded via ajax.

With core as it is now, that node save clears the cache and the reload takes 1 minute to complete. If I make the change I propose, the reload takes 1 second and the change is rendered (I don't need the cache to clear to see the change).

I'm afraid that I can't share my code or the data, and I don't have the time currently to spend creating an anonymous test set up for this.

Actual behavior

As I said, the cache is cleared on a node save, and the page content reload takes one minute to complete.

Expected behavior

I don't need the node cache to be cleared when a node is created.

Additional information

This bug report is not to optimise my code, even though that is of course a good thing to do. I am questioning that decision to clear the entire node cache when a new node is created - why does that need to happen?

My site is in development, hosted locally on my own PC with WSL2, Ubuntu and Lando.

Of course my code could be optimised, and perhaps that's the solution. I'm migrating code from Drupal 7 that worked for years.

Backdrop CMS version: 1.24.0
Web server: Apache/2.4.54 (Debian)
PHP version: 7.4.33
Database server: MySQL 5.7.29

@danksearle
Copy link
Author

danksearle commented Apr 26, 2023

It might help for me to explain that I am currently migrating a Drupal 7 project to Backdrop. The feature that is affected by this slowness has existed in that D7 project since 2014, so 9 years, and in that set up it worked. Yes the code is badly written and needs to be optimised - but it worked in D7. Now that I migrate to Backdrop, I find that it is too slow.

For the time being, I'm going to apply that code change as a core patch in my project here.

@indigoxela
Copy link
Member

indigoxela commented Apr 26, 2023

@danksearle many thanks for taking the time to report.

This code came in 10 years ago, from Drupal, and looking at the logic - this doesn't seem to be very useful. (Needless to say, this code has been changed in Drupal a long time ago. 😉)

$this->resetCache($op == 'update' ? array($entity->{$this->idKey}): array());

That fallback to flush all nodes of that type from cache, in case a new node of that type is created seems more like an oversight. Because flushing all is what actually happens here:

https://docs.backdropcms.org/api/backdrop/core%21modules%21entity%21entity.controller.inc/function/DefaultEntityController%3A%3AresetCache/1

My suggestion would be: if the operation is "create" do not attempt to flush anything. Or are there any concerns.

Calling out to our Entity expert - @argiepiano what do you think? Any concerns?

@klonos
Copy link
Member

klonos commented Apr 26, 2023

FTR, @indigoxela @yorkshire-pudding and myself were discussing this during the weekly office hours, and here were some of the thoughts that were thrown around the room:

  • If the operation is 'update', then the entity idKey should already exist (right?). So if that is the case, and if the idKey doesn't exist for some reason, then we should be perhaps throwing an exception (which is what Drupal does) instead of trying to clear ALL caches.
  • There is an inline comment above this code, that says // Reset general caches, but keep caches specific to certain entities., but it doesn't explain why we should be doing that. If there is an actual need to be resetting caches, then we should make sure that that is happening in batches or something (because that may actually be the cause of the slowness).
  • Perhaps the slowness is not caused during the cache reset, rather than the cache rebuild (?). If that is the case, can we then do something about that? Perhaps we could cache in the background while the page loads? (similar to what we did with Figure out a way to eliminate the "slow first load" problem caused by cron run. #1214 early on)

@klonos
Copy link
Member

klonos commented Apr 26, 2023

...besides @argiepiano, also pinging @hosef as our cache go-to person.

@kiamlaluno
Copy link
Member

When a node is not updated, the cache should not contain any data about that node; when a node is created, the cache should be updated to include it. Clearing all the cache seems a bit excessive, when a node is not updated.

@indigoxela
Copy link
Member

indigoxela commented Apr 27, 2023

To answer the question in this issue title: no, we should not flush the entity cache for nodes, when inserting a new node. 😏

A PR is ready for testing and review. I also extended functional tests to cover this situation.

@indigoxela
Copy link
Member

indigoxela commented Apr 27, 2023

Hm, interesting, this broke one of the Book module tests. Will investigate.

Edit: tests are passing now, but the one-liner fix might make code review a little trickier. 😁

@klonos
Copy link
Member

klonos commented Apr 27, 2023

Code LGTM, but I would like at least another pair of eyes on it before marking it RTBC. Lets see what our @backdrop/core-committers have to say about the approach as well.

@danksearle can you please test the PR and report back if it fixes the issue in your local/server? (and that it doesn't break anything else)

@argiepiano
Copy link

argiepiano commented Apr 27, 2023

@indigoxela, thanks for the PR

There is a problem with this approach. When you work with node trees (e.g. books), you need to clear the node disk cache. Otherwise the site will still show the outdated parent (which doesn't contain a reference to the child).

This can be seen as follows (after applying the patch):

  1. Enable Book
  2. Create a node or type "book page". Call it "My book" and select "" under Book outline -> Book
  3. Create a second node of type "book page". Call it "Chapter 1". Under Book outline > Book, select "My book" . Save

Now visit "My book" (first node). Even though the navigation at the bottom shows "Chapter 1", the body of the node doesn't show "Chapter 1". In fact if you load the node with Devel, you'll see that "has_children" is FALSE.

Screen Shot 2023-04-27 at 8 46 13 AM

Now clear caches and reload "My book". Now you'll see the referenced node.
Screen Shot 2023-04-27 at 8 46 25 AM

@herbdool
Copy link

It may require a book-specific fix by using hooks to clear the parent book node or all node caches. And take it out of the test in this pr.

@argiepiano
Copy link

argiepiano commented Apr 27, 2023

It may require a book-specific fix by using hooks to clear the parent book node or all node caches. And take it out of the test in this pr.

Agreed. The fix will need to take into account that the tree may have multiple levels, so it would have to recurse up and clear the disk cache for each individual parent. Or perhaps just opt for the easiest path and clear the whole node disk cache only after creating a book node, in book_node_insert() and others?

I also agree that clearing the whole node disk cache every time a new node is created is overkill.

@indigoxela
Copy link
Member

@argiepiano thank you so much for testing and helpful hints, and also thanks to @herbdool for having a look.

So my suspicion is confirmed, this excessive entity cache flushing actually obfuscated a problem in the Book module. 🤔

@indigoxela
Copy link
Member

So, this is what I came up with: whenever the book outline gets updated (not only on node creation), reset (entity-)caches for all affected nodes.

What do you think?

@klonos
Copy link
Member

klonos commented Apr 28, 2023

Thanks @indigoxela 🙏🏼 ...code and approach look good to me 👍🏼 ...I haven't tested this though, since it might be easier for @danksearle to do that (I have no easy way to reproduce the problem in the first place).

@argiepiano
Copy link

WFM.

@indigoxela
Copy link
Member

Rebased, to make phpcs check pass. All tests are green now.

@danksearle will you be able to test the change in your setup?

@klonos
Copy link
Member

klonos commented Apr 30, 2023

...and the code still looks good, so RTBC 👍🏼 😉

@quicksketch
Copy link
Member

Great work folks! I merged backdrop/backdrop#4414 into 1.x and 1.24.x. Thanks everyone!

@danksearle
Copy link
Author

@danksearle will you be able to test the change in your setup?

Yes, that code fixes it for me. Thanks!

@jenlampton jenlampton changed the title Should node cache be emptied when creating a new node? Fixed: Don't empty node cache when creating a new node. May 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants