Bring back heuristic fragment matching, with a twist. #6901

Merged
merged 7 commits into from
Sep 10, 2020

@benjamn benjamn commented Aug 25, 2020

Easily the biggest breaking change for fragment matching in Apollo Client 3.0 was the removal of the FragmentMatcher abstraction (#5073), along with its subclasses HeuristicFragmentMatcher and IntrospectionFragmentMatcher (#5684), which were replaced by the declarative possibleTypes configuration.

The HeuristicFragmentMatcher was so named because it attempted to perform fragment matching without any actual knowledge of the supertype/subtype relationships in your schema (for example, abstract interfaces implemented by concrete object subtypes, or unions with multiple alternative member types). Instead, it checked whether all the fields of a given fragment were present in the result object, which is often a relatively strong signal that the fragment matched.

The HeuristicFragmentMatcher could be fooled by field aliases and accidental sharing of field names between different fragments, but it was also relatively resilient to adding new subtypes on the server, because heuristic matching doesn't care what the true subtypes of a supertype are, so what's one more?
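To make the heuristic concrete, here is a minimal sketch of the field-presence check described above. It is not the Apollo Client 2 implementation: the fragment is reduced to a hypothetical list of field names, and `heuristicallyMatches` is a name invented for this illustration.

```typescript
// Hypothetical, simplified model: a fragment is reduced to the list of
// field names it selects, and a result is a plain object.
type ResultObject = Record<string, unknown>;

// Returns true when every field the fragment selects is present in the result.
// No schema knowledge (and no __typename comparison) is involved.
function heuristicallyMatches(
  fragmentFields: readonly string[],
  resultObject: ResultObject,
): boolean {
  return fragmentFields.every(field =>
    Object.prototype.hasOwnProperty.call(resultObject, field),
  );
}

const droidFragmentFields = ["name", "primaryFunction"];
const jediFragmentFields = ["name", "lightsaberColor"];
const nameOnlyFragmentFields = ["name"];

const droidResult: ResultObject = {
  __typename: "Droid",
  name: "R2-D2",
  primaryFunction: "Astromech",
};

// All Droid fragment fields are present, so the heuristic says "match".
const matchesDroid = heuristicallyMatches(droidFragmentFields, droidResult);
// lightsaberColor is missing, so the Jedi fragment is rejected.
const matchesJedi = heuristicallyMatches(jediFragmentFields, droidResult);
// A fragment that only selects a shared field name matches any object
// that happens to have that field, which is how the heuristic gets fooled.
const matchesNameOnly = heuristicallyMatches(nameOnlyFragmentFields, droidResult);
```

In this sketch a fragment that selects only `name` would match both Droid and Jedi objects, which illustrates the accidental sharing of field names mentioned above.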

Another big drawback of the HeuristicFragmentMatcher was that it applied the same fuzzy logic to reading from the cache, where the heuristic makes a lot less sense. If an individual query result that you're writing into the cache has all the keys you'd expect if a certain fragment matched, that's a pretty good sign that the fragment matched. But when you have a lot of data in your cache, from lots of different queries, it's not as meaningful to observe that all the fields required by some fragment you're reading are present in an object, since they might be there just because they've been written into the cache by other queries over time, not because the fragment actually matches the __typename of the object.

In moving to the more exact possibleTypes system, we gave up both the benefits and the drawbacks of the HeuristicFragmentMatcher, replacing it with something functionally similar to the IntrospectionFragmentMatcher, but with a much simpler configuration API.

As #5750 demonstrates, the exactness of the possibleTypes system makes it challenging for existing clients to adapt when a new subtype is added on the server. Is there any way to make this system more flexible, like the HeuristicFragmentMatcher, but without the drawbacks? I believe so!

This PR improves the possibleTypes API by allowing "fuzzy" subtype strings, which are interpreted as regular expressions for type names, rather than as actual __typename strings. If a fuzzy subtype matches the __typename of an object, fragments on supertypes of that fuzzy subtype are allowed to match the object, provided the result also has all the keys required by the fragment.

In other words, if you previously configured the following possibleTypes:

import { InMemoryCache } from "@apollo/client";

new InMemoryCache({
  possibleTypes: {
    Character: ["Jedi", "Droid"],
    Test: ["PassingTest", "FailingTest", "SkippedTest"],
    Snake: ["Viper", "Python"],
  },
})

and you wanted to relax the Test supertype slightly, you could opt into fuzzy/heuristic matching by using a regular expression, while still enforcing a certain suffix:

new InMemoryCache({
  possibleTypes: {
    Character: ["Jedi", "Droid"],
    Test: ["PassingTest", "FailingTest", ".*Test"],
    Snake: ["Viper", "Python"],
  },
})

Since the ".*Test" string is not a valid type name, it is interpreted as the regular expression new RegExp(".*Test"). In order to count as a match, the regular expression must match the entire __typename string, and the match will only be honored if all the fragment's fields are actually present in the result object. Because PassingTest and FailingTest are still specified explicitly, they will continue to match immediately, without any fuzzy logic, but an object with __typename equal to SkippedTest or WishfulTest would now have a chance of qualifying as a Test for the purposes of fragment matching.
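The rules in the previous paragraph can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the helper names (`isFuzzySubtype`, `matchesEntireTypename`, `mayMatchFuzzily`) are hypothetical, and the valid-type-name pattern is an assumption based on the GraphQL rule that names consist of letters, digits, and underscores and do not start with a digit.

```typescript
// Assumption: a valid GraphQL type name is letters, digits, and underscores,
// and does not start with a digit.
const VALID_TYPE_NAME = /^[_A-Za-z][_0-9A-Za-z]*$/;

// A subtype string that is not a valid type name is treated as a pattern.
function isFuzzySubtype(subtype: string): boolean {
  return !VALID_TYPE_NAME.test(subtype);
}

// The regular expression must match the entire __typename, not a substring.
function matchesEntireTypename(pattern: string, typename: string): boolean {
  const match = typename.match(new RegExp(pattern));
  return match !== null && match[0] === typename;
}

// A fuzzy match is only honored if every fragment field is present.
function mayMatchFuzzily(
  pattern: string,
  fragmentFields: readonly string[],
  resultObject: Record<string, unknown>,
): boolean {
  const typename = resultObject.__typename;
  if (typeof typename !== "string") return false;
  return (
    matchesEntireTypename(pattern, typename) &&
    fragmentFields.every(field =>
      Object.prototype.hasOwnProperty.call(resultObject, field),
    )
  );
}

const skippedTest = { __typename: "SkippedTest", outcome: "skipped" };
const testRunner = { __typename: "TestRunner", outcome: "n/a" };

// SkippedTest matches ".*Test" in full and has the required field.
const skippedQualifies = mayMatchFuzzily(".*Test", ["outcome"], skippedTest);
// ".*Test" only matches the "Test" prefix of "TestRunner", so it is rejected.
const runnerQualifies = mayMatchFuzzily(".*Test", ["outcome"], testRunner);
// The required field "duration" is absent, so the match is not honored.
const missingFieldQualifies = mayMatchFuzzily(".*Test", ["duration"], skippedTest);
```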

Likewise, if you added Python: ["^[A-Z].*Python"] to the possibleTypes list above, a fragment SnakeFragment on Snake could match an object with a __typename of ReticulatedPython, provided that object has all the necessary fragment fields.

The key advantages of this new system compared to the HeuristicFragmentMatcher are that (1) you can specify fuzzy subtypes for specific supertypes (rather than applying heuristic matching for all types by default), and (2) that it "learns" about fuzzy subtypes while writing (where the heuristic tends to work well), and then merely uses those inferred supertype/subtype relationships when reading, so there is no need to perform heuristic matching while reading. This is the "twist" mentioned in the title of this PR. When one of these inferences happens, you'll see a warning in the console (in development).
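The write-time learning described above might look something like the following sketch. The class `FuzzyPossibleTypes` and its methods are hypothetical names made up for this illustration, and the injectable `warn` callback stands in for the development-only console warning; the real logic lives inside the cache's policies and differs in detail.

```typescript
type ResultObject = Record<string, unknown>;

class FuzzyPossibleTypes {
  private explicitSubtypes = new Map<string, Set<string>>();
  private fuzzyPatterns = new Map<string, RegExp[]>();
  private learnedSubtypes = new Map<string, Set<string>>();

  constructor(
    possibleTypes: Record<string, string[]>,
    private warn: (message: string) => void = message => console.warn(message),
  ) {
    // Assumption: strings that are not valid type names are fuzzy patterns.
    const validTypeName = /^[_A-Za-z][_0-9A-Za-z]*$/;
    Object.keys(possibleTypes).forEach(supertype => {
      const explicit = new Set<string>();
      const patterns: RegExp[] = [];
      possibleTypes[supertype].forEach(subtype => {
        if (validTypeName.test(subtype)) explicit.add(subtype);
        else patterns.push(new RegExp(subtype));
      });
      this.explicitSubtypes.set(supertype, explicit);
      this.fuzzyPatterns.set(supertype, patterns);
      this.learnedSubtypes.set(supertype, new Set<string>());
    });
  }

  private isKnownSubtype(supertype: string, typename: string): boolean {
    const explicit = this.explicitSubtypes.get(supertype);
    const learned = this.learnedSubtypes.get(supertype);
    return Boolean(
      (explicit && explicit.has(typename)) || (learned && learned.has(typename)),
    );
  }

  // Writing: the heuristic may run, and a successful inference is remembered.
  matchWhileWriting(
    supertype: string,
    fragmentFields: readonly string[],
    resultObject: ResultObject,
  ): boolean {
    const rawTypename = resultObject.__typename;
    if (typeof rawTypename !== "string") return false;
    const typename: string = rawTypename;
    if (typename === supertype || this.isKnownSubtype(supertype, typename)) return true;

    const patterns = this.fuzzyPatterns.get(supertype) || [];
    const fuzzyHit = patterns.some(pattern => {
      const match = typename.match(pattern);
      return match !== null && match[0] === typename;
    });
    const hasAllFields = fragmentFields.every(field =>
      Object.prototype.hasOwnProperty.call(resultObject, field),
    );
    if (fuzzyHit && hasAllFields) {
      this.learnedSubtypes.get(supertype)!.add(typename);
      this.warn(`Inferring that ${typename} is a subtype of ${supertype}`);
      return true;
    }
    return false;
  }

  // Reading: only explicit or previously learned relationships are consulted.
  matchWhileReading(supertype: string, typename: string): boolean {
    return typename === supertype || this.isKnownSubtype(supertype, typename);
  }
}

const warnings: string[] = [];
const fuzzyTypes = new FuzzyPossibleTypes(
  { Test: ["PassingTest", "FailingTest", ".*Test"] },
  message => warnings.push(message),
);

const readBeforeWrite = fuzzyTypes.matchWhileReading("Test", "SkippedTest");
const learnedOnWrite = fuzzyTypes.matchWhileWriting("Test", ["outcome"], {
  __typename: "SkippedTest",
  outcome: "skipped",
});
const readAfterWrite = fuzzyTypes.matchWhileReading("Test", "SkippedTest");
```

The design point is that reads never evaluate a regular expression or inspect fields; they only look up relationships that were either configured explicitly or inferred during an earlier write.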

Unlike the old HeuristicFragmentMatcher, which provided the default behavior for all fragment matching, you do have to opt into fuzzy matching with possibleTypes, but it can be a useful strategy for preparing your clients for upcoming server changes, or for relaxing the rules for a particular supertype, in cases when you anticipate the subtypes may soon change on the server.

It's more convenient to configure possibleTypes as a map from supertypes
to arrays of subtypes, since that's how a schema introspection query
reports them.

However, we can perform policies.fragmentMatches checks much more
efficiently if we invert that structure internally, using a map from
subtypes to sets of possible supertypes.

When a fragment with type condition S is tested against an object with
__typename T, we now search upwards from T through its supertypes until we
find S (fragment matches), or the search terminates (matching fails).

We can (and did) achieve the same results by starting from S and searching
downward for T, but the branching factors tend to be larger in that
direction, so the search tends to take longer.
A full explanation of these changes can be found in PR #6901.
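As a rough illustration of the inverted structure and the upward search described in this commit message, consider the sketch below. The function names are hypothetical and the code is not taken from policies.fragmentMatches; it also assumes that a supertype may itself be listed as a subtype of another supertype.

```typescript
type PossibleTypesMap = Record<string, string[]>;

// Invert { supertype: [subtypes] } into a map from subtype to its supertypes.
function invertPossibleTypes(
  possibleTypes: PossibleTypesMap,
): Map<string, Set<string>> {
  const supertypesBySubtype = new Map<string, Set<string>>();
  Object.keys(possibleTypes).forEach(supertype => {
    possibleTypes[supertype].forEach(subtype => {
      let supertypes = supertypesBySubtype.get(subtype);
      if (!supertypes) {
        supertypes = new Set<string>();
        supertypesBySubtype.set(subtype, supertypes);
      }
      supertypes.add(supertype);
    });
  });
  return supertypesBySubtype;
}

// Search upward from typename T through its supertypes until the fragment's
// type condition S is found, or until there is nothing left to visit.
function matchesByUpwardSearch(
  supertypesBySubtype: Map<string, Set<string>>,
  typeCondition: string,
  typename: string,
): boolean {
  if (typename === typeCondition) return true;
  const visited = new Set<string>([typename]);
  const queue: string[] = [typename];
  while (queue.length > 0) {
    const current = queue.shift()!;
    const supertypes = supertypesBySubtype.get(current);
    if (!supertypes) continue;
    for (const supertype of Array.from(supertypes)) {
      if (supertype === typeCondition) return true;
      if (!visited.has(supertype)) {
        visited.add(supertype);
        queue.push(supertype);
      }
    }
  }
  return false;
}

const inverted = invertPossibleTypes({
  Entity: ["Character"],
  Character: ["Jedi", "Droid"],
  Snake: ["Viper", "Python"],
});

// Jedi -> Character -> Entity, so a fragment on Entity matches a Jedi.
const jediIsEntity = matchesByUpwardSearch(inverted, "Entity", "Jedi");
// Jedi has no path up to Snake, so the search terminates without a match.
const jediIsSnake = matchesByUpwardSearch(inverted, "Snake", "Jedi");
```

Each concrete type typically has only a handful of supertypes, whereas a supertype can have many subtypes, which is why the upward direction tends to visit fewer nodes.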
@benjamn benjamn force-pushed the heuristic-fragment-matching-again branch from a73517a to ae4a6f5 Compare September 10, 2020 18:01
@benjamn benjamn marked this pull request as ready for review September 10, 2020 18:01

@hwillson hwillson left a comment

Incredible @benjamn - this looks awesome!

@benjamn benjamn merged commit 4a22dcb into release-3.2 Sep 10, 2020
@benjamn benjamn deleted the heuristic-fragment-matching-again branch September 10, 2020 18:31
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 16, 2023