Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Never merge fields in cache unless objects have same identity. #5603

Merged
merged 2 commits into from
Nov 21, 2019

Conversation

benjamn
Copy link
Member

@benjamn benjamn commented Nov 21, 2019

These changes realize the consequences of the following extremely important principles:

  1. In order to accumulate GraphQL data over time, the InMemoryCache must be able to merge the fields of different response objects together, as those objects are written to the cache.

  2. However, merging the fields of two objects is allowable only when those objects are known to have the same identity.

    If we cannot determine the identities of both objects (using __typename and id, or keyFields), or their identities are different, we should never merge their fields together, because doing so risks combining unrelated fields into the same logical entity object, resulting in an object that could never have been returned by the GraphQL server.

  3. Even if two objects have the same identity, which allows us to merge their top-level fields, we should not assume we can recursively merge the values of fields that have the same name, unless those values are also identifiable objects with matching identities. Otherwise, we must replace the existing value with the incoming value, without attempting to combine them.

    Exception A: if the existing value and the incoming value are deeply equal to each other, then we can safely reuse the existing value to avoid needlessly altering references in the cache.

    Exception B: The application developer can provide custom field merge functions, e.g. for paginating array-valued fields. Since this level of customization was not possible in Apollo Client 2.x, some amount of automatic merging was necessary, but we can do the right thing now that the developer has more control.

I am happy that I did not have to update very many tests as a result of these changes, but the principles above are so important that I absolutely would have corrected as many tests as necessary to accommodate these changes. 🍳:feelsgood: ✨

These changes realize the consequences of the following extremely
important principles:

1. In order to accumulate GraphQL data over time, the InMemoryCache must
   be able to merge the fields of different response objects together,
   as those objects are written to the cache.

2. However, merging the fields of two objects is allowable only when those
   objects have the same identity.

   If we cannot determine the identities of both objects (using __typename
   and id, or keyFields), or their identities are different, we should
   never merge their fields together, because doing so risks combining
   unrelated fields into the same logical entity object, resulting in an
   object that could never have been returned by the GraphQL server.

3. Even if two objects have the same identity, which allows us to merge
   their top-level fields, we should not assume we can recursively merge
   the values of fields that have the same name, unless those values are
   also identifiable objects with matching identities. Otherwise, we must
   replace the existing value with the incoming value, without attempting
   to combine them.

   Exception A: if the existing value and the incoming value are deeply
   equal to each other, then we can safely reuse the existing value to
   avoid needlessly altering references in the cache.

   Exception B: The application developer can provide custom field merge
   functions, e.g. for paginating array-valued fields. Since this level of
   customization was not possible in Apollo Client 2.x, some amount of
   automatic merging was necessary, but we can do the right thing now that
   the developer has more control.

I am happy that I did not have to update very many tests as a result of
these changes, but the principles above are so important that I absolutely
would have thrown away the entire test suite if these changes had demanded
more drastic action.
// If this property was overridden by a custom merge function, then
// its merged value has already been determined, so we should return
// incoming without recursively merging it into existing.
return incoming;
Copy link
Member Author

@benjamn benjamn Nov 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic was previously necessary to avoid re-merging data that had already been handled by custom merge functions. Now that we are no longer recursively merging data in the storeObjectReconciler function, there's no risk of double merging.

Comment on lines -433 to -439
// Incoming references can be merged with existing non-reference data
// if the existing data appears to be of a compatible type.
store.set(
incoming.__ref,
this.merge(
existing,
store.get(incoming.__ref),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 to this terrible hack

Comment on lines -451 to -456
if (Array.isArray(incoming)) {
if (!Array.isArray(existing)) return incoming;
if (existing.length > incoming.length) {
// Allow the incoming array to truncate the existing array, if the
// incoming array is shorter.
return this.merge(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most arrays in GraphQL results contain a sequence of entity references, which are already merged according to their identities, so it was always a little questionable to be recursively merging arrays of non-reference data.

}
invariant(
!isReference(existing) || isReference(incoming),
`Store error: the application attempted to write an object with no provided id but the store already contains an id of ${existing.__ref} for this object.`,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this error, and it definitely confuses people (see #2510), but fixing it by including the necessary ID fields in the query seems to be worthwhile in virtually every situation (as opposed to disabling the error). Open to discussion!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - I think having it is better than not as well, so I'm all for keeping it.

Copy link
Member

@hwillson hwillson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @benjamn! We might want to add a quick note about this in the changelog breaking changes section, but hopefully this won't impact many people (and the fact that you had to change very few tests for this has me very optimistic).

@benjamn benjamn merged commit 7dad1cf into release-3.0 Nov 21, 2019
@benjamn benjamn deleted the merge-fields-only-when-same-entity-identity branch January 7, 2020 17:45
benjamn added a commit that referenced this pull request Jan 23, 2020
One consequence of #5603 is that replacing non-normalized data in the
cache can result in loss of useful data. In almost every case, the right
solution is to make sure the data can be normalized, or (if that isn't
possible) to define a custom merge function for the replaced field, within
the parent TypePolicy.

It turns out we can give a very detailed warning in all such situations,
so that's what this commit does. Looking at the output for our test suite,
every warning is legitimate and worth fixing. I will resolve the warnings
and test failures that our test suite generates in subsequent commits.

TODO Update the documentation URLs after #5677 is merged.
benjamn added a commit that referenced this pull request Jan 24, 2020
One consequence of #5603 is that replacing non-normalized data in the
cache can result in loss of useful data. In almost every case, the right
solution is to make sure the data can be normalized, or (if that isn't
possible) to define a custom merge function for the replaced field, within
the parent TypePolicy.

It turns out we can give a very detailed warning in all such situations,
so that's what this commit does. Looking at the output for our test suite,
every warning is legitimate and worth fixing. I will resolve the warnings
and test failures that our test suite generates in subsequent commits.
benjamn added a commit that referenced this pull request Jan 24, 2020
One consequence of #5603 is that replacing non-normalized data in the
cache can result in loss of useful data. In almost every case, the right
solution is to make sure the data can be normalized, or (if that isn't
possible) to define a custom merge function for the replaced field, within
the parent TypePolicy.

It turns out we can give a very detailed warning in all such situations,
so that's what this commit does. Looking at the output for our test suite,
every warning is legitimate and worth fixing. I will resolve the warnings
and test failures that our test suite generates in subsequent commits.
benjamn added a commit that referenced this pull request Jun 1, 2020
> This is a revival of PR #5833, which has accumulated too many merge
conflicts over the last few months to be worth rebasing.

One consequence of #5603 is that replacing non-normalized data in the
cache can result in loss of useful data, which is preferable to mistakenly
merging unrelated objects, but not ideal.

In almost every case, the right solution is to make sure the data can be
normalized, or (if that isn't possible) to define a custom `merge`
function for the field in question, within the parent type policy.

It turns out we can give a very detailed warning about such situations in
development, and that's what this commit does. For example:

  Cache data may be lost when replacing the d field of a Query object.

  To address this problem (which is not a bug in Apollo Client),
  either ensure all objects of type D have IDs, or define a custom
  merge function for the Query.d field, so InMemoryCache can safely
  merge these objects:

    existing: {"__typename":"D","e":4}
    incoming: {"__typename":"D","h":{"__typename":"H","i":7}}

  For more information about these options, please refer to the
  documentation:

    * Ensuring entity objects have IDs: https://go.apollo.dev/c/generating-unique-identifiers
    * Defining custom merge functions: https://go.apollo.dev/c/merging-non-normalized-objects

Looking at the output for our test suite, every warning seems legitimate
and worth fixing. I will resolve the warnings in subsequent commits.
thomassuckow pushed a commit to thomassuckow/apollo-client that referenced this pull request Jun 13, 2020
Solves infinite loops when more than one query with different fields on an unidentified object

Fixes apollographql#6370
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants