Test that cache evictions propagate to parent queries. #6412

benjamn · 2020-06-08T19:00:54Z

When an object is evicted from the cache, common intuition says that any dangling references to that object should be proactively removed from elsewhere in the cache. Thankfully, this intuition is misguided, because a much simpler and more efficient approach to handling dangling references is already possible, without requiring any new cache features.

As the tests added in this commit demonstrate, the cleanup of dangling references can be postponed until the next time the affected fields are read from the cache, simply by defining a custom read function that performs any necessary cleanup, in whatever way makes sense for the logic of the particular field. This lazy approach is vastly more efficient than scanning the entire cache for dangling references would be, because it kicks in only for fields you actually care about, the next time you ask for their values.

For example, you might have a list of references that should be filtered to exclude the dangling ones, or you might want the dangling references to be nullified in place (without filtering), or you might have a single reference that should default to something else if it becomes invalid. All of these options are matters of application-level logic, so the cache cannot choose the right default strategy in all cases.

By default, references are left untouched unless you define custom logic to do something else. It may actually be unwise/destructive to remove dangling references from the cache, because the evicted data could always be written back into the cache at some later time, restoring the validity of the references. Since eviction is not necessarily final, dangling references represent useful information that should be preserved by default after eviction, but filtered out just in time to keep them from causing problems. Even if you ultimately decide to prune the dangling references, proactively finding and removing them is way more work than letting a read function handle them on-demand.

This system works because the result caching system (#3394, #5617) tracks hierarchical field dependencies in a way that causes read functions to be reinvoked any time the field in question is affected by updates to the cache, even if the changes are nested many layers deep within the field. It also helps that custom read functions are consistently invoked for a given field any time that field is read from the cache, so you don't have to worry about dangling references leaking out by other means.

I recommend reading through this test not only because it demonstrates important capabilities of InMemoryCache, but also because the mythological subject matter contains some fun jokes/references, IMHO.

When an object is evicted from the cache, common intuition says that any dangling references to that object should be proactively removed from elsewhere in the cache. Thankfully, this intuition is misguided, because a much simpler and more efficient approach to handling dangling references is already possible, without requiring any new cache features. As the tests added in this commit demonstrate, the cleanup of dangling references can be postponed until the next time the affected fields are read from the cache, simply by defining a custom read function that performs any necessary cleanup, in whatever way makes sense for the logic of the particular field. This lazy approach is vastly more efficient than scanning the entire cache for dangling references would be, because it kicks in only for fields you actually care about, the next time you ask for their values. For example, you might have a list of references that should be filtered to exclude the dangling ones, or you might want the dangling references to be nullified in place (without filtering), or you might have a single reference that should default to something else if it becomes invalid. All of these options are matters of application-level logic, so the cache cannot choose the right default strategy in all cases. By default, references are left untouched unless you define custom logic to do something else. It may actually be unwise/destructive to remove dangling references from the cache, because the evicted data could always be written back into the cache at some later time, restoring the validity of the references. Since eviction is not necessarily final, dangling references should be preserved by default after eviction, and filtered out just in time to keep them from causing problems. And even if you ultimately decide to prune the dangling references, proactively removing them is way more work than letting a read function handle them on-demand. This system works because the result caching system tracks hierarchical field dependencies in a way that causes read functions to be reinvoked any time the field in question is affected by updates to the cache, even if the changes are nested many layers deep within the field. It also helps that custom read functions are consistently invoked for a given field any time that field is read from the cache, so you don't have to worry about dangling references leaking out by other means. I recommend reading through this test not only because it demonstrates important capabilities of InMemoryCache, but also because the mythological subject matter contains some good jokes, IMHO.

hwillson

This looks awesome @benjamn, and thanks for the Greek mythology refresher! 😂

hwillson · 2020-06-08T20:02:12Z

src/cache/inmemory/__tests__/readFromStore.ts

+      // Fun fact: Apollo is the only major Greco-Roman deity whose name
+      // is the same in both traditions.


The story we're telling in #6412 about using custom read functions to filter out dangling references works best if there's an easy way to check whether a Reference points to existing data in the cache. Although we could have introduced a new options.isValidReference helper function, I think it makes sense to let the existing options.isReference function handle this use case as well. I ended up refactoring how the toReference function gets created and passed around as well, since I want toReference and isReference to remain together as much as possible. I considered making isReference a property of EntityStore (like toReference used to be), but that would not have worked because the new isReference(ref, true) functionality needs access to the topmost layer of the cache, which only InMemoryCache knows about. Long story short, both isReference and toReference are now methods of InMemoryCache, rather than EntityStore.

The story we're telling in #6412 about using custom read functions to filter out dangling references works best if there's an easy way to check whether a Reference points to existing data in the cache. Although we could have introduced a new options.isValidReference helper function, I think it makes sense to let the existing options.isReference function handle this use case as well.

…6413) The story we're telling in #6412 about using custom read functions to filter out dangling references works best if there's an easy way to check whether a Reference points to existing data in the cache. Although we could have introduced a new options.isValidReference helper function, I think it makes sense to let the existing options.isReference function handle this use case as well.

jitensuthar · 2020-06-10T02:12:34Z

Hi, thanks for this. Was struggling with best way to handle (intentional) dangling references without having the queries fail. I had settled on just using the returnPartialData flag before, which I didn't love. Think this should help.

Couple of questions
(1) There are some cases where I may have an array of items where one (or more) of the items are evicted. Is the recommendation here to just leave that dangling reference in the cache and instead just filter it out on read? Are there any issues with writing that filtered output back to the field from within the read function (of course, only if values were actually filtered so that a write isn't executed on every single read)? I see that we have access to the cache object in the options, so it should be possible to do but wasn't sure if that's a flow/use-case that is being proposed here.

(2) What is the recommended approach to checking deeply-nested objects for dangling references, ideally without having to create type policies all the way down/up? For example say I have a notifications field that is an array of notificaiton objects, and I want to filter based on whether notification.notificationData.friendRequest.sendingUser points to a valid reference? I.e. if anything along that chain is an invalid reference, the entire (top-level) notification should be filtered out of the array. RIght now, I'd favor using nested readField calls to access several levels down. It looks like your proposal for isReference would only work for an immediate field of an object.

Overall, I do think the lazy approach makes a lot more sense and I hadn't thought about just filtering in read.

darkbasic · 2020-06-10T08:40:08Z

src/cache/inmemory/__tests__/readFromStore.ts

+      }),
+    })).toBe(true);
+
+    // You didn't think we were going to let Apollo be garbage-collected,


@benjamn this test should go straight to the documentation :)

darkbasic · 2020-06-10T08:44:05Z

src/cache/inmemory/__tests__/readFromStore.ts

+      },
+    });
+
+    const apolloRulerResult = cache.readQuery<{


@benjamn maybe adding another expect(cache.extract()).toEqual() after readQuery to further show that reading a query with type policies does not write it to the cache?

benjamn · 2020-06-17T19:08:48Z

Is the recommendation here to just leave that dangling reference in the cache and instead just filter it out on read? Are there any issues with writing that filtered output back to the field from within the read function (of course, only if values were actually filtered so that a write isn't executed on every single read)?

Yes, I would say it's not necessary to update the existing list in the cache, as long as the canRead filtering on read works for you (note: PR #6425 introduced options.canRead(x) instead of options.isReference(x, true)). If new objects with the same IDs are written to the cache in the future, those dangling references might become healthy/sound/?? (I don't know a good opposite for "dangling") again, so there might actually be some benefit to leaving them where they are. That said, you definitely can use the options.cache object to prune the list permanently. If you do that, I would suggest using something like

cache.modify({
  // As @darkbasic has suggested before, this ID should be easier to obtain.
  id: theCurrentEntityID,
  // This will permanently update elements to exclude any dangling references.
  nameOfTheList(elements, { canRead }) {
    return elements.filter(canRead);
  },
  // Important if you want the removal to be a quiet side-effect.
  broadcast: false,
});

What is the recommended approach to checking deeply-nested objects for dangling references, ideally without having to create type policies all the way down/up?

You're right that the filtering should happen in the read function for the notifications field, and I'm afraid I do not currently have a better recommendation than what you suggested (using nested readField calls and canRead). I realize this approach makes it hard to scan the data in a generic way without knowing which fields to check, so I expect this will be an area for future improvement.

This commit implements the proposal for automatic filtering of dangling references that I described in #6425 (comment). The filtering functionality demonstrated by #6412 (and updated by #6425) seems useful enough that we might as well make it the default behavior for any array-valued field consumed by a selection set. Note: the presence of field.selectionSet implies the author of the query expects the elements to be objects (or references) rather than scalar values. By making .filter(canRead) automatic, we free developers from having to worry about manually removing any references after evicting entities from the cache. Instead, those dangling references will simply (appear to) disappear from cache results, which is almost always the desired behavior. In case this automatic filtering is not desired, a custom read function can be used to override the filtering, since read functions run before this filtering happens. This commit includes tests demonstrating several options for replacing/filtering dangling references in non-default ways.

This commit implements the proposal for automatic filtering of dangling references that I described in #6425 (comment). The filtering functionality demonstrated by #6412 (and updated by #6425) seems useful enough that we might as well make it the default behavior for any array-valued field consumed by a selection set. Note: the presence of field.selectionSet implies the author of the query expects the elements to be objects (or references) rather than scalar values. A list of scalar values should not be filtered, since it cannot contain dangling references. By making .filter(canRead) automatic, we free developers from having to worry about manually removing any references after evicting entities from the cache. Instead, those dangling references will simply (appear to) disappear from cache results, which is almost always the desired behavior. Fields whose values hold single (non-list) dangling references cannot be automatically filtered in the same way, but you can always write a custom read function for the field, and it's somewhat more likely that a refetch will fix those fields correctly. In case this automatic filtering is not desired, a custom read function can be used to override the filtering, since read functions run before this filtering happens. This commit includes tests demonstrating several options for replacing/filtering dangling references in non-default ways.

benjamn added this to the Release 3.0 milestone Jun 8, 2020

benjamn self-assigned this Jun 8, 2020

benjamn added 💡 idea ✍️ working-as-designed 🧪 has-tests labels Jun 8, 2020

benjamn requested a review from hwillson June 8, 2020 19:07

hwillson approved these changes Jun 8, 2020

View reviewed changes

benjamn force-pushed the test-eviction-propagation branch from 06dced3 to 018ca84 Compare June 8, 2020 22:40

benjamn merged commit 7f2b1b0 into master Jun 8, 2020

benjamn deleted the test-eviction-propagation branch June 8, 2020 22:44

benjamn mentioned this pull request Jun 8, 2020

Support options.isReference(ref, true) to check Reference validity. #6413

Merged

benjamn mentioned this pull request Jun 9, 2020

[3.0.rc2] cache.evict without id is not working correctly #6418

Closed

benjamn added a commit that referenced this pull request Jun 9, 2020

Mention PR #6412 in CHANGELOG.md.

594c8ae

darkbasic reviewed Jun 10, 2020

View reviewed changes

benjamn mentioned this pull request Jun 17, 2020

Automatically filter out dangling references from arrays. #6454

Merged

github-actions bot locked as resolved and limited conversation to collaborators Feb 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test that cache evictions propagate to parent queries. #6412

Test that cache evictions propagate to parent queries. #6412

benjamn commented Jun 8, 2020 •

edited

hwillson left a comment

hwillson Jun 8, 2020

jitensuthar commented Jun 10, 2020 •

edited

darkbasic Jun 10, 2020

darkbasic Jun 10, 2020

benjamn commented Jun 17, 2020 •

edited

		// Fun fact: Apollo is the only major Greco-Roman deity whose name
		// is the same in both traditions.

Test that cache evictions propagate to parent queries. #6412

Test that cache evictions propagate to parent queries. #6412

Conversation

benjamn commented Jun 8, 2020 • edited

hwillson left a comment

Choose a reason for hiding this comment

hwillson Jun 8, 2020

Choose a reason for hiding this comment

jitensuthar commented Jun 10, 2020 • edited

darkbasic Jun 10, 2020

Choose a reason for hiding this comment

darkbasic Jun 10, 2020

Choose a reason for hiding this comment

benjamn commented Jun 17, 2020 • edited

benjamn commented Jun 8, 2020 •

edited

jitensuthar commented Jun 10, 2020 •

edited

benjamn commented Jun 17, 2020 •

edited