Implement internal "GraphMode" response writer #967

Closed
6 tasks
josephsavona opened this issue Mar 18, 2016 · 14 comments

@josephsavona
Contributor

Relay currently applies the results of queries, mutations, and subscriptions by traversing the query and payload in parallel. The payload cannot be processed in isolation because it lacks sufficient context - for example, consider the following payload:

{
  ... {
    friends: {
      edges: [...]
    }
  }
}

What does friends.edges signify? It could be a plain List[Friend], it could be the first: 10 friends in a connection, or it could be the first: 10, after: foo - a pagination result that should be merged with any existing edges. Currently, a payload can only be interpreted correctly in the context of a query. This process isn't optimal: a given field such as id may be queried multiple times by sibling fragments, and therefore has to be repeatedly processed. Further, the same object may appear multiple times in the response payload (e.g. the same person authored multiple comments), again causing duplicate processing.

Goals

The primary goal of this proposal is to define a data format that can be efficiently applied to a normalized client-side cache. The format should be capable of describing any change that could be applied to a normalized object graph: i.e. both the results of queries as well as mutations and subscriptions.

Specifically, we have found the following characteristics to be important to ensure efficient processing of query/mutation results:

  • Normalized data: avoiding duplication of data in the response reduces the time spent processing it.
  • Data-driven: queries themselves may have duplication (i.e. the same fields may be queried numerous times by sibling or nested fragments). The payload should be self-describing in order to reduce duplicate effort in processing.
  • First-class support for describing partial results, e.g. to allow pagination without loading all items of a list up-front.

Non-goals include:

  • Reducing byte-size over the wire in server -> client communication.
  • Defining a fully generic data response format. This proposal is specifically targeted at describing changes to a normalized object graph with the capabilities necessary for typical client-side applications.

Specification Strawman

We're still figuring this out, but we'd prefer to develop this specification in the open and with input from the community. We'll continue to update this as we iterate, but here's a commented example with Flow annotations:

Example Query:

Relay.QL`
  query {
    node(id: 123) {
      ... on User {
        id
        friends(first: 2) {
          count
          edges {
            cursor
            node {
              id
              name
            }
          }
          pageInfo {
            hasNextPage
          }
        }
      }
    }
  }
`

Standard "Tree" Response:

{
  node: {
    id: '123',
    friends: {
      count: 5000,
      edges: [
        {
          cursor: '...',
          node: {...},
        },
        ...
      ],
      pageInfo: {
        hasNextPage: true,
        ...
      },
    },
  },
}

GraphMode Response:

[
  {
    op: 'root',
    field: 'node',
    identifier: '123',
    root: {__ref: '123'},
  },
  {
    op: 'nodes',
    nodes: {
      123: {
        id: '123',
        friends: {
          __key: '0', // <- can refer back to this in range operations
          count: 5000,
        }
      },
      node1: {
        ...
      }
    }
  },
  {
    op: 'edges',
    args: [{name: 'first', value: 2}],
    edges: [
      {
        cursor: '...',
        node: {
          __ref: 'node1',
        },
      },
      ...
    ],
    pageInfo: {
      hasNextPage: true,
    },
    range: '0', // <- refers to the point in `nodes` with `__key = '0'`
  },
]

Where the shape of the response is:

type GraphModePayload = Array<GraphOperation>;
type CacheKey = string;
type GraphOperation =
  RootOperation |
  NodesOperation |
  EdgesOperation;
type RootOperation = {
  op: 'root';
  field: string;
  identifier: mixed;
  root: GraphRecord | GraphReference;
};
type NodesOperation = {
  op: 'nodes';
  nodes: {[dataID: DataID]: GraphRecord};
};
type EdgesOperation = {
  op: 'edges';
  args: Array<Call>;
  edges: Array<?GraphRecord>;
  pageInfo: PageInfo;
  range: CacheKey;
};
type GraphRecord = {[storageKey: string]: GraphValue};
type GraphReference = {
  __ref: DataID;
};
type GraphScalar = ?(
  boolean |
  number |
  string |
  GraphRecord |
  GraphReference
);
type GraphValue = ?(
  GraphScalar |
  Array<GraphScalar>
);
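As a rough illustration of how a handler might apply these operations, here is a minimal sketch. The store shape (`roots`, `records`, `ranges`) and the helper names `createStore`, `mergeRecord`, and `applyGraphMode` are hypothetical and not part of Relay; the `edges` handling simply appends and ignores `args`, which a real implementation would need to honor.

```javascript
// Hypothetical in-memory store (an assumption for this sketch, not Relay's store).
function createStore() {
  return {roots: {}, records: {}, ranges: {}};
}

// Merge one GraphRecord into the store. Fields carrying a `__key` are
// remembered so that a later `edges` operation can locate them.
function mergeRecord(store, dataID, record) {
  const target = store.records[dataID] || (store.records[dataID] = {});
  for (const [field, value] of Object.entries(record)) {
    const isPlainObject =
      value !== null && typeof value === 'object' && !Array.isArray(value);
    if (isPlainObject && '__key' in value) {
      const {__key, ...rest} = value;
      target[field] = {...(target[field] || {}), ...rest};
      store.ranges[__key] = {dataID, field};
    } else {
      target[field] = value;
    }
  }
}

function applyGraphMode(store, payload) {
  for (const operation of payload) {
    switch (operation.op) {
      case 'root': {
        // Map the root field call to whatever the operation says it resolves to.
        const key = `${operation.field}(${String(operation.identifier)})`;
        store.roots[key] = operation.root;
        break;
      }
      case 'nodes':
        for (const [dataID, record] of Object.entries(operation.nodes)) {
          mergeRecord(store, dataID, record);
        }
        break;
      case 'edges': {
        const range = store.ranges[operation.range];
        if (!range) {
          throw new Error(`Unknown range key: ${operation.range}`);
        }
        const connection = store.records[range.dataID][range.field];
        // Naive merge: append the incoming edges to whatever is already there.
        connection.edges = (connection.edges || []).concat(operation.edges);
        connection.pageInfo = operation.pageInfo;
        break;
      }
      default:
        throw new Error(`Unsupported operation: ${operation.op}`);
    }
  }
  return store;
}

// Usage with a concrete version of the example payload (values are made up).
const store = applyGraphMode(createStore(), [
  {op: 'root', field: 'node', identifier: '123', root: {__ref: '123'}},
  {
    op: 'nodes',
    nodes: {
      '123': {id: '123', friends: {__key: '0', count: 5000}},
      node1: {id: 'node1', name: 'Example friend'},
    },
  },
  {
    op: 'edges',
    args: [{name: 'first', value: 2}],
    edges: [{cursor: 'cursor1', node: {__ref: 'node1'}}],
    pageInfo: {hasNextPage: true},
    range: '0',
  },
]);
```

Note that each record and each edge is visited exactly once, regardless of how many fragments in the original query selected it.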

Next Steps

  • Implement proof-of-concept GraphMode response handler and use it in some real applications.
  • Refine the specification.
  • Use GraphMode for handling existing operations:
    • Transform and apply query payloads via GraphMode.
    • Transform and apply mutation & subscription responses via GraphMode.
  • Expose a public method on RelayEnvironment for applying GraphMode payloads to the store (as part of [meta] Relay Core API #559).
@wincent
Contributor

wincent commented Mar 18, 2016

In your example I think you have employees and friends intermixed. I think you just mean friends everywhere, right?

@craffert0

An important aspect of this is that we don't have redundant data. Imagine this query:

Relay.QL`
  query {
    node(id: "123") {
      id
      name
      ... on User {
        cousins {
          edges {
            node {
              id
              name
              ... on User {
                cousins {
                  edges {
                    node {
                      id
                      name }}}}}}}}}}
`;

The result tree will have lots of duplicates, since my cousins have me as a cousin, and most have each other as cousins. In Graph Mode, we'll only have a single instance of each User.
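To make the deduplication concrete, here is a small sketch that flattens a tree response into one record per `id`. The `normalize` function and the sample data are hypothetical; the sketch assumes any object with a string `id` is a record, keeps id-less objects inline, and merges repeated records shallowly, which is a simplification.

```javascript
// Flatten a "tree" response into a map of records keyed by `id`.
// Objects without an `id` stay inline; objects with one become `{__ref: id}`.
function normalize(tree) {
  const records = {};
  function visit(value) {
    if (Array.isArray(value)) {
      return value.map(item => visit(item));
    }
    if (value === null || typeof value !== 'object') {
      return value;
    }
    const fields = {};
    for (const [key, child] of Object.entries(value)) {
      fields[key] = visit(child);
    }
    if (typeof value.id !== 'string') {
      return fields;
    }
    // Shallow merge: later occurrences add fields to the same record.
    records[value.id] = {...(records[value.id] || {}), ...fields};
    return {__ref: value.id};
  }
  return {root: visit(tree), records};
}

// Made-up data: three users who all list each other as cousins.
const me = {id: '123', name: 'Me'};
const tree = {
  id: '123',
  name: 'Me',
  cousins: {
    edges: [
      {
        node: {
          id: '456',
          name: 'Cousin A',
          cousins: {edges: [{node: me}, {node: {id: '789', name: 'Cousin B'}}]},
        },
      },
      {
        node: {
          id: '789',
          name: 'Cousin B',
          cousins: {edges: [{node: me}, {node: {id: '456', name: 'Cousin A'}}]},
        },
      },
    ],
  },
};

const {root, records} = normalize(tree);
```

The tree mentions a user seven times, but `records` ends up with only three entries, one per distinct `id`.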

@wincent
Contributor

wincent commented Mar 18, 2016

An important aspect of this is that we don't have redundant data.

Surprisingly perhaps not quite as important as you may think, because gzip ends up eating up the redundancy for breakfast.

@eyston

eyston commented Mar 18, 2016

Sorry if I'm missing this part, but would GraphMode require MutationConfig for adding edges, or could that information be captured in GraphMode itself?

@josephsavona
Contributor Author

would GraphMode require MutationConfig for adding edges, or could that information be captured in GraphMode itself?

@eyston Great question - the idea is GraphMode could describe mutations w/o any additional config. For example a range add might be described with:

{
  ...
  nodes: {
    123: {
      friends: {
        $type: 'connection',
        $data: [
          {calls: 'append', value: {$ref: 'addedID1'}},
          {calls: 'append', value: {$ref: 'addedID2'}}, // <-- append multiple edges at once
        ]
      }
    },
    addedID1: {
      ...
    },
    addedID2: {
      ...
    },
  },
}
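For illustration, a client could interpret such a field roughly as follows. This is only a sketch: `applyConnectionCalls` is a hypothetical helper, and the `'prepend'` call is an assumption added by analogy with `'append'`; only `'append'` appears in the example above.

```javascript
// Apply the `$data` calls of a `$type: 'connection'` field to a list of
// existing edges, returning a new list and leaving the input untouched.
function applyConnectionCalls(existingEdges, field) {
  if (field.$type !== 'connection') {
    throw new Error(`Expected a connection field, got: ${field.$type}`);
  }
  const edges = existingEdges.slice();
  for (const {calls, value} of field.$data) {
    if (calls === 'append') {
      edges.push(value);
    } else if (calls === 'prepend') {
      // Assumed counterpart to `append`; not part of the example above.
      edges.unshift(value);
    } else {
      throw new Error(`Unsupported call: ${calls}`);
    }
  }
  return edges;
}

// Usage: two edges appended at once to a list that already has one entry.
const before = [{$ref: 'existingID'}];
const after = applyConnectionCalls(before, {
  $type: 'connection',
  $data: [
    {calls: 'append', value: {$ref: 'addedID1'}},
    {calls: 'append', value: {$ref: 'addedID2'}},
  ],
});
```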

@skevy
Contributor

skevy commented Mar 18, 2016

What is this...Falcor?!?!?

:trollface:

@eyston

eyston commented Mar 18, 2016

Another question -- would non-node objects be embedded? e.g.

query {
  viewer {
    birthday {
      month day year
    }
  }
}
nodes: {
  123: {
    birthday: {
      month: 1,
      day: 1,
      year: 2000
    }
  }
}

kind of like no $type means interpret literally?

@josephsavona
Contributor Author

@eyston yes, id-less records are inline

@josephsavona
Contributor Author

After discussion with @leebyron I started looking for ways to avoid the special $type/$data keys. The main challenge is that connections simply can't be handled as-is: edges almost never just replaces the local edges and always requires some custom processing. A similar constraint holds for root fields, which are currently handled specially.

Here's an example query that demonstrates these challenges and an updated proposal for the data format:

query {
  me {
    id
    name
    address {
      city
    }
    bestFriend {
      name
    }
    friends(first: 2, after: $foo) {
      edges {
        node {
          id
          name }}}}}

The results could be described using operations similar to those in JavaScript Object Notation (JSON) Patch but with semantics specifically tailored to describing GraphQL/Relay object graphs:

[
  {
    // The `root` operation describes how a root field maps to an id.
    // This concept may not be necessary once Relay supports arbitrary
    // root fields.
    // In this case, `me` corresponds to id `123`:
    op: 'root',
    data: {
      field: 'me',
      arguments: null,
      id: '123',
    },
   },
   {
     // The `add` operation denotes new data that should be added/merged into the object graph.
     // This describes scalar fields, plain lists, references (linked records), and lists of references.
     // Other field types such as pagination cannot be represented inline.
     op: 'add',
     data: {
       123: {
         name: '...',
         address: {
           city: '...',  // no cache identifier (`id`), so value is inline
         },
         bestFriend: {$ref: '456'}, // single key `$ref` indicates a reference to another record
       },
       456: {
         name: '...',
       },
       friend1: {
         ... // first friend in the connection - note that it isn't linked to within this operation, that's ok
       },
       friend2: {
         ... // second friend in the connection - note that it isn't linked to within this operation, that's ok
       },
     },
   },
   {
     // The `connection` operation describes portions of a list that should be merged into
     // the existing list. It may be necessary to change the `id` key to a "path" in order to
     // allow updating connections on records without an `id`.
     op: 'connection',
     data: {
       id: '123',
       field: 'friends',
       arguments: {first: 2, after: 'fooCursor'},
       edges: [...], // includes `$ref`s to friends 1 and 2
       pageInfo: {...},
     },
   },
]

Note that the add operation does not include the friends field on record 123, because no scalar fields are fetched. The data for the friends field is supplied in a subsequent connection operation.
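One way a client might merge a `connection` operation into existing edges is sketched below. `applyConnection` is a hypothetical helper, and the merge rule is an assumption: with an `after` argument the new edges are spliced in after the edge whose cursor matches, and anything previously stored beyond that point is dropped. The sketch also assumes each edge carries a `cursor`, which the example query does not select.

```javascript
// Merge the edges of a `connection` operation into a record's field.
// `records` is a plain map of id -> record (an assumption for this sketch).
function applyConnection(records, {id, field, arguments: args, edges, pageInfo}) {
  const record = records[id] || (records[id] = {});
  const existing = record[field] || {edges: [], pageInfo: {}};
  let merged;
  if (args && args.after != null) {
    const index = existing.edges.findIndex(edge => edge.cursor === args.after);
    if (index === -1) {
      throw new Error(`Cursor ${args.after} not found in ${id}.${field}`);
    }
    // Keep everything up to and including the `after` edge, then the new page.
    merged = existing.edges.slice(0, index + 1).concat(edges);
  } else {
    // No `after`: treat the incoming edges as the start of the list.
    merged = edges.slice();
  }
  record[field] = {edges: merged, pageInfo: {...existing.pageInfo, ...pageInfo}};
  return records;
}

// Usage: record 123 already has two friends; a page after `fooCursor` arrives.
const connectionRecords = {
  '123': {
    friends: {
      edges: [
        {cursor: 'earlierCursor', node: {$ref: 'friendA'}},
        {cursor: 'fooCursor', node: {$ref: 'friendB'}},
      ],
      pageInfo: {hasNextPage: true},
    },
  },
};

applyConnection(connectionRecords, {
  id: '123',
  field: 'friends',
  arguments: {first: 2, after: 'fooCursor'},
  edges: [
    {cursor: 'cursor1', node: {$ref: 'friend1'}},
    {cursor: 'cursor2', node: {$ref: 'friend2'}},
  ],
  pageInfo: {hasNextPage: false},
});
```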

EDIT: I updated the issue description with a modified version of this proposal.

@eyston

eyston commented Mar 21, 2016

sorry, more questions...

  • Where do you envision the translation from GraphQL query + payload into GraphMode happening?
  • Any thoughts on how this affects tracking queries -- or is that not related at all? When I say tracking I'm thinking about two scenarios which may not really be tracking (I am hazy on this part of Relay): diff'ing a query and intersecting the fat query. For diff'ing a query isn't type information necessary due to polymorphic fields? For instance just because field age is in the store doesn't mean ... on User { age } is satisfied by the store? or maybe it does? And the second thing is the fat query -- if I insert data directly into the store without the corresponding query wouldn't it be at risk of not being considered intersecting the fat query which could lead to stale data?

thanks!

@josephsavona
Contributor Author

Where do you envision the translation from GraphQL query + payload into GraphMode happening?

For the foreseeable future this transform would happen on the client, possibly on another thread.

Any thoughts on how this affects tracking queries -- or is that not related at all? ... if I insert data directly into the store without the corresponding query wouldn't it be at risk of not being considered intersecting the fat query which could lead to stale data?

Yes, inserting data w/o a query could lead to stale data with the current approach to diffing and mutations. To prevent this, initially only Relay internals will use GraphMode, and we will use a pre/post traversal to update tracked queries along with every payload. We're also exploring an alternate approach to constructing mutation queries that avoids the need to store tracked queries.

@josephsavona changed the title from 'Implement "GraphMode" response writer' to 'Implement internal "GraphMode" response writer' on Mar 21, 2016
@wincent
Contributor

wincent commented Mar 21, 2016

operations similar to those in JavaScript Object Notation (JSON) Patch but with semantics specifically tailored to describing GraphQL/Relay object graphs

I'm a bit worried about the potential confusion caused by making something that is similar-but-still-different. What's the value of getting rid of $data/$type special keys (but still keeping $ref) if it's only to move to something that isn't actually JSON Patch? We've gotten rid of two special keys, but only at the cost of adding two custom op values.

@eyston

eyston commented Mar 21, 2016

We're also exploring an alternate approach to constructing mutation queries that avoids the need to store tracked queries.

I'm interested in this if you get to the point of something to share.

@wincent
Contributor

wincent commented Mar 21, 2016

I'm interested in this if you get to the point of something to share.

@eyston So as not to crowd this issue, I've written something up in #973.

josephsavona referenced this issue Mar 29, 2016
Reviewed By: wincent

Differential Revision: D3068427

fb-gh-sync-id: 0cd95a14fc0c180f0808607faae4a47c8588935c
fbshipit-source-id: 0cd95a14fc0c180f0808607faae4a47c8588935c
ghost pushed a commit that referenced this issue Apr 1, 2016
Summary: Builds on D3068427 to implement a transform function that accepts as input a query and "tree" payload and outputs a GraphMode payload. Part of #967.

Reviewed By: wincent

Differential Revision: D2987074

fb-gh-sync-id: 4607921d997af5ec0cbb66b41336fd2ab27d9d52
fbshipit-source-id: 4607921d997af5ec0cbb66b41336fd2ab27d9d52
josephsavona referenced this issue Apr 6, 2016
Summary: Preparation diff for GraphMode. The current writer traverses over query results with `RelayNodeInterface.getResultsFromPayload`; this method has the side effect of generating a new `dataID` for id-less root records. In the new GraphMode writer root ids aren't generated until a `putRoot` record is encountered and the dataID does not need to be generated in `getResultsFromPayload`. This diff will reduce the changes when we actually switch on the GraphMode writer.

Reviewed By: wincent

Differential Revision: D3139655

fb-gh-sync-id: f4d579b24a23fda732e90063a965d09487b0fb20
fbshipit-source-id: f4d579b24a23fda732e90063a965d09487b0fb20
josephsavona referenced this issue Apr 6, 2016
Summary: Preparation diff for GraphMode. This simplifies a few aspects of the current writer implementation and tests in order to reduce the number of changes when switching to GraphMode:

- Change `RelayRecordStore#hasDeferredFragmentData` (and associated setters) to `hasFragmentData`, and use this for both deferred fragment tracking *and* edge fragment tracking. This means `RelayFragmentTracker` is no longer needed. This simplifies the handling of fragment/data tracking in GraphMode, allowing GraphMode to deal with one interface for tracking instead of two.
- Update various callers of `hasDeferredFragmentData`
- Update writer call sites to not pass a fragment tracker
- Update the writer to not record tracked queries for non-root client records: RelayQueryTracker ignores these calls anyway, so this doesn't change the outward behavior (however the tests were expecting the *calls* to the tracker, not what was actually tracked).

Reviewed By: wincent

Differential Revision: D3140133

fb-gh-sync-id: 13bb235633d30f2dbb055c045724e4e88a0a0a32
fbshipit-source-id: 13bb235633d30f2dbb055c045724e4e88a0a0a32