Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursively nested objects #91

Closed
petrbela opened this issue Sep 10, 2015 · 44 comments
Closed

Recursively nested objects #91

petrbela opened this issue Sep 10, 2015 · 44 comments

Comments

@petrbela
Copy link

Suppose we have a forum where users can reply to comments recursively:

type RootQuery {
  message(id: String): Message
  messages: [Message]
}

type Message {
  id: String!
  content: String
  comments: [Message]
}

How could the client query for the nested comments?

This only goes three levels deep:

{
  messages {
    comments {
      id
      content
      comments {
        id
        content
        comments {
          id
          content       
        }
      }
    }   
  }
}

Would a consecutive query starting at the last comment be the right solution?

{
  message as comment($commentId) {
    comments {
      id
      content
      comments {
        ...
      }
    }
  }
}

Original SO post

@vladar
Copy link

vladar commented Sep 10, 2015

Saw similar question in Relay repo: facebook/relay#246

@petrbela
Copy link
Author

Yeah that's what I'd thought.

@nmn
Copy link

nmn commented Jan 21, 2016

@petrbela Can we re-open this issue, as that issue only talks about the relay side of things. I would like to know how to accomplish recursive object types while creating a graphQL server.

@ghost
Copy link

ghost commented Mar 13, 2016

@nmn +1

@petrbela petrbela reopened this Mar 13, 2016
@fson
Copy link
Contributor

fson commented Mar 17, 2016

@nmn You can define recursive object types by passing a thunk (a function without arguments) that returns the fields:

const Message = new GraphQLObjectType({
  name: 'Message',
  fields: () => ({
    comments: {
      type: new GraphQLList(Message),
      resolve(message) {
        return getComments(message.id);
      },
    },
    content: {
      type: GraphQLString,
    },
  }),
});

You can even define mutually recursive objects and GraphQL schema is smart enough to lazily call the fields functions to resolve them when needed.

@petrbela
Copy link
Author

@fson In other words, in the query, you only ask for "one level" of message's comments

{
  message(1) {
    id
    comments {
      content
    }
  }
}

And the comments field could return all nested comments

{
  "message": {
    "id": 1,
    "comments": [
      {
        "content": "First comment",
        "comments": [
          {
            "content": "Nested comment"
          }
        ]
      }
    ]
  }
}

Is that so? It seems to be against logic since the reply structure doesn't exactly mirror the query. Does the getComments method return the whole nested tree? How would the GraphQL executor know to expand the comments field once it steps into resolving the first comment? The query only tells the executor to ask for the content field.

@fson
Copy link
Contributor

fson commented Mar 17, 2016

@petrbela Only asking for one level of nested comments would return one level of nested comments. So you would explicitly make a query that fetches comments as deep as you need to go. The GraphQL executor resolves the fields recursively from top to bottom, so getComments(message.id) would simply return the list of comment messages for the given message. In other words, there isn't anything specific needed on the server to support these nested queries. The only trick is to define the fields with a thunk, to avoid problems with the execution order of the type definitions.

I made a gist with a working example of the recursive messages/comments schema, if you want to play with it: https://gist.github.com/fson/a14f5edf2ae3fb5294dd

@petrbela
Copy link
Author

Thanks for the comments.

The main question was if you could return a recursively nested object (without knowing the depth) in a single query.

Seems the only way would be to send a query with a lot of nesting (at least as much as the deepest reply tree), and GraphQL would expand each branch until it reaches a null object (leaf) in each of the branches. This should theoretically work with Relay, too, if I'm not mistaken.

Or do you have any other idea?

@fson
Copy link
Contributor

fson commented Mar 17, 2016

Yeah, as far as I know, it's not possible to write a recursive query of unlimited depth. Trying to do so using fragments will result in an error "Cannot spread fragment "MessageFragment" within itself.":

fragment MessageFragment on Message {
  content
  comments {
    ...MessageFragment
  }
}

This is because fragments are fully expanded before execution, and this fragment would result in infinite expansion.

I think a query with deep (but finite) nesting would be the only way to query a hierarchy like that. You can make that query using conditional fragments like @josephsavona suggested in the related Relay issue and Relay will expand it to a such deeply nested query.

I also think this is rarely a problem for real life UIs, because you wouldn't want to fetch infinite layers of comments anyway, but instead fetch some number of levels first and then have a way for the user to load more comments from a nested tree, like Reddit does.

@rmosolgo
Copy link

it's not possible to write a recursive query of unlimited depth

It's even explicitly forbidden by the spec!

The graph of fragment spreads must not form any cycles including spreading itself. Otherwise an operation could infinitely spread or infinitely execute on cycles in the underlying data.

http://facebook.github.io/graphql/#sec-Fragment-spreads-must-not-form-cycles

@leebyron
Copy link
Collaborator

leebyron commented Apr 7, 2016

Closing this issue since I believe it is answered. We explicitly do not allow fragment recursion or the querying of recursive structured since it can result in un-bound results.

Typically you also never really want to require something of infinite depth anyhow, so it's good to put a bound on these things.

A pattern I often see using fragments:

{
  messages {
    ...CommentsRecursive
  }
}

fragment CommentsRecursive on Message {
  comments {
    ...CommentFields
    comments {
      ...CommentFields
      comments {
        ...CommentFields
      }
    }
  }
}

fragment CommentFields on Comment {
  id
  content
}

this lets you define the fields you want at each depth level in one place, and then also explicitly state the "recursive" structure you want - in the above example it's 3 levels deep, but you can see how you could easily add more.

@jquense
Copy link

jquense commented Oct 19, 2016

is really there no other option here for fetching nested data, where the depth is unknown? There are plenty of normal situations where data is nested but definitely not infinitely recursive, such as the the comment case. Are we just supposed to tell users that they can't comment past a depth of 3?

@leebyron
Copy link
Collaborator

leebyron commented Oct 19, 2016

The issue is that GraphQL has no way of actually confirming that the data it encounters is not infinitely recursive. Without an ability to make this guarantee, allowing for cyclic queries could in fact expose infinite execution queries which could be used as an attack vector.

Also - infinite isn't really the only issue here. The other issue is reasonable limits per your application. For the same reason that we suggest using pagination to limit the amount of data fetched when accessing a list of unknown length, we also should be cautious not to fetch at an unknown depth and result in accessing way more information than desired.

Take Reddit as an example since it's close to this hypothetical nested comments example. They don't actually query to an unknown depth when fetching nested comments. Instead, they eventually bottom out with a "show more comments" link which can trigger a new query fetch. The technique I illustrated in a prior comment allows for this maximum depth control.

There's no magic number of 3, the technique can be expanded or shrunken for whatever maximum depth makes sense. For example, here's the same example expanded to fetch up to 8 levels deep:

{
  messages {
    ...CommentsRecursive
  }
}

fragment CommentsRecursive on Message {
  comments {
    ...CommentFields
    comments {
      ...CommentFields
      comments {
        ...CommentFields
        comments {
          ...CommentFields
          comments {
            ...CommentFields
            comments {
              ...CommentFields
              comments {
                ...CommentFields
                comments {
                  ...CommentFields
                }
              }
            }
          }
        }
      }
    }
  }
}

fragment CommentFields on Comment {
  id
  content
}

@jquense
Copy link

jquense commented Oct 19, 2016

Without an ability to make this guarantee, allowing for cyclic queries could in fact expose infinite execution queries which could be used as an attack vector.

sure, I understand why it would be a problem, I'm suggesting that being able to tell GraphQL that it has that guarantee would be a great feature. The alternative of copy/pasting a fragment nth times is not the best or even feasible in some cases.

I also appreciate the concern handled by "show more comments" but I don't know that that its a good idea for GraphQL to make that choice for our app, in the same way it doesn't require I paginate all lists above a certain length.

@leebyron
Copy link
Collaborator

I agree with you. For now this limitation continues to exist for this balance of simplicity for the common case and protection from inadvertent infinite recursion. Perhaps we can revisit it in the future if a solid proposal to do what you suggest is compelling.

@joelgriffith
Copy link

joelgriffith commented Sep 20, 2017

Commenting for any further Bingers/Jeevers:

const recursiveType = new GraphQLScalarType({
  name: `MyFunAntiGraphQLRecursiveType`,

  description: `Bails out of GraphQL conventions and just returns the straight JSON Object.`,

  serialize(_) {
    return _;
  },
});

Of course I'd highly, highly, highly (very highly) recommend not doing as such. But if you need an escape hatch, welcome to LOST

@mattyclarkson
Copy link

strawman: would be useful to have the recursion expanded and limited by the query language:

{
  messages {
    ...Comments
  }
}

fragment Comments on Message {
  id
  content
  comments {
    ...Comments * 3
  }
}

The * 3 expands it three levels deep, or some other bikeshedded syntax.

Probably hard to implemen. If there are multiple recursions or anything else in the spec, who knows. The parser could read the fragment, and the count. Once it has built out the fragment, ignoring the recursion, it could copy and stamp the nodes down the tree. Haven't really looked much into the GraphQL parser to figure out if how it'd be implemented 🤔

@idibidiart
Copy link

This may be a silly counter argument but such a syntax would encourage a ridiculous level of recursion :) Currently, I’d have to copy paste a selection 150 times (so I’m bound to think twice about doing it) whereas typing * 150 would be far easier. The current syntax (or lack of one) almost serves as a natural deterrent. If one has need for that level of recursion then batched resolution becomes really important (to work around the N+1 query proliferation problem: see graphql-resolve-batch on github/npm.

@zpeterg
Copy link

zpeterg commented Feb 3, 2018

As a note to others that may be doing nested recursions, like using the example per @leebyron: Make sure that each level has a unique ID, including unique against any copy objects that use the same structure. Eg., if you copy an object that is constructed using this, and you don't change each ID, Apollo Client cache can get the sub-elements mixed up (because the typename on all is the same). This is not a bug, of course, but it can be quite perplexing to unravel.

@ghost
Copy link

ghost commented Oct 12, 2018

thanks for the suggestion. I ended up doing something similar, with a field for path of a node in a tree and this path will be calculated by client based on its previous query result.

I don't understand from your suggestion, how client can know 'level' in advance to specify in the field of the query?

@aptfx
Copy link

aptfx commented Oct 13, 2018

@pgvidhya
The level/indention field is not meant for input! If you get a sequence of nodes in topological order you only need the nesting level to reconstruct the hierarchy faithfully. You can also see this when rendering - you don’t need to nest comments - you just need to indent them properly.

@mhuebert
Copy link

A practical example for this can be seen in GitHub's endpoint for listing all the files in a repo, which supports limited recursion: https://developer.github.com/v3/git/trees/#get-a-tree-recursively

This query isn't possible using their GraphQL api, which is a pity.

(the following graphql fetches results up to 4 levels deep)

fragment entryFields on TreeEntry {
  name
  file:object {
    ...on Blob {
      byteSize
    }
  }
}

fragment treeFields on Tree {
      entries {
        ...entryFields 
        object {
          ...on Tree {
            entries {
              ...entryFields 
              object {
                ...on Tree {
                  entries {
                    ...entryFields  
                    object {
                      ...on Tree {
                        entries {
                          ...entryFields
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }



query { 
  viewer {
    repository(name: "chia") {
      object(expression:"master:") {
        ...treeFields
      }
  }
}
}

@Zemnmez
Copy link

Zemnmez commented Jan 26, 2019

not going to re-open this one but I'd really like to be able to repeat some fragment to a specified depth. I'm doing some queries on the type system itself using introspection and it gets super gnarly

@coladarci
Copy link

We are re-writing an old JSONAPI in GQL where you can for ask questions who have followup questions configured by our customers. Our old rails api with JSONApi serializers allowed recursion and we took out boxes when someone managed to get into an infinite recursion. Personally, we applaud the strong stance of not considering this as a valid api design. and we are happy to have this pattern:

fragment QuestionsRecursive on Question {
  ...QuestionFields
  ... on ChoiceQuestion {
    options {
      followupQuestions {
        ...QuestionFields
        ... on ChoiceQuestion {
          options {
            followupQuestions {
              ...QuestionFields
              ... on ChoiceQuestion {
                options {
                  followupQuestions {
                    ...QuestionFields
                    ... on ChoiceQuestion {
                      options {
                        followupQuestions {
                          ...QuestionFields
                          ... on ChoiceQuestion {
                            options {
                              followupQuestions {
                                ...QuestionFields
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Was a touch hokey at the beginning, but protects us in a way we appreciate, given our past.

@stalniy
Copy link

stalniy commented Apr 4, 2019

Another way to get tree structures is to use id and parentId fields. Later compute tree on app level (anyway you need recursive loop to iterate over deeply nested structure).

{
  messages {
    comments {
      id
      parentId: replyTo
      content
    }   
  }
}

@joepie91
Copy link

joepie91 commented Apr 14, 2019

Another practical example of where you really, really want recursion of unlimited depth:

I'm currently trying to use GraphQL for a data access layer involving system metrics and statistics, tying together a bunch of disparate data sources, and one of the things I need to represent is a block device tree, like you might get from running lsblk. No browser or network service is involved here at any point.

The depth of child nodes is unknown upfront (there may be various layers of RAID, LVM, partitions, and so on), but pretty much guaranteed to be within reasonable limits, given the usecase. There does not currently seem to be a reasonable way to express "get me all the block devices and all of their children recursively".

I agree that unlimited recursion is almost never what you want, and any such feature should probably involve an explicit unsafe marker to communicate that to people, but there is a non-zero amount of valid usecases for it.

@aptfx
Copy link

aptfx commented Apr 15, 2019

@joepie91
In which order do you consume them?

There is no reason the block device tree could not just be a sequence of devices with hierarchical path information. Even a hierarchical JSON object is just a sequence of bytes interpreted as tree - so there is nothing fundamentally different.

What do you win by having a “recursive flag”? Which queries get possible then which are not now?

@joepie91
Copy link

joepie91 commented Apr 15, 2019

@aptfx I consume them recursively. Representing the tree as a linear sequence would require a) linearizing it on the data source side (since it's provided as recursively nested data) and b) special-casing the query logic on the receiving side to reassemble it back into a recursive tree. That's a lot of unnecessary extra complexity.

The query that becomes possible is "give me all block devices, with all of their children, with all of their children, etc." at unlimited depth, without unnecessary linearization/delinearization steps.

@aptfx
Copy link

aptfx commented Apr 15, 2019

@joepie91
You consume them "recursively" most of the time means depth first order sometimes breadth first order. If you really look at it - somewhere in your code is an order to it. Sometimes the answer is there and not in extending a spec like GraphQL for unsafe edge cases.

It's really simple to implement those in a generic way. If you really want to get around this step - you can just pump out and serialize/parse JSON.

@joepie91
Copy link

joepie91 commented Apr 15, 2019

Outputting JSON sidesteps the validation features in GraphQL, so that is not a realistic option.

And again: I am well aware that you can linearize the data. The point here is that that adds needless data-wrangling complexity, given that at no point in the chain is a linear representation the sensible/native one. I would have to do two data transformation steps purely to appease GraphQL's limitations.

And even if unlimited recursion is an edge case, recursion itself certainly isn't; right now you need unergonomic fragment hacks for that, which makes no sense considering that tree-like data is quite common. It's not a big stretch to add "unlimited, if you're okay with the risks" to that design goal, hence why I'm mentioning my usecase.

I'm honestly not really interested in further arguing about how I could hypothetically represent my data - I understand that quite well. I'm merely here to contribute a data point to consider for a more ergonomic API design.

@aptfx
Copy link

aptfx commented Apr 15, 2019

@joepie91
No problem - you provided a data point that is in no way different to the examples before. It just is your own one. As before I provided help to practically solve that problems for yourself.

Ftw. You will have to live with the fact that others might have and make public different opinions about proposals - like allowing unlimited recursion in the spec. I think it is a really bad idea. Sorry.

@joepie91
Copy link

I am well aware that you think it's a bad idea. That's not really relevant unless you can present a well-reasoned costs/benefits analysis, which you haven't - you've just been going on about variations of "surely you can linearize it somehow", which I've repeatedly explained is not a reasonable option in my case.

Your comments are in no way constructive nor helpful; they're just pushing your own view on things without bothering to understand or acknowledge somebody elses requirements. You're arguing from the assumption that there can't possibly be a valid usecase for tree-shaped data representations, and if somebody thinks they have one, they must just be missing something.

All that sort of attitude does is piss people off, so please stop doing that.

I'd be happy to discuss the details of the usecase and how to support it safely with somebody who is genuinely interested in understanding it and figuring out an ergonomic solution (ie. probably the maintainers), to be clear. I just don't care one bit for this "you must not realize that your approach is wrong" grandstanding.

@aptfx
Copy link

aptfx commented Apr 15, 2019

@joepie91
I hope you realize that you're getting personal and disrespectful here and I give you the chance to get back to a professional attitude again. Just because you did not like my suggestions does NOT mean that they were not offered as a genuine try to help you in your case!

This issue got discussed from different views. Your example does not provide anything new to the table - that's not to make it bad it's just not really different to the examples before. You didn't provide anything new and convincing above that. This is not to denigrate you but to motivate you to get a more convincing use case than the one here. Or even better: a solution! This is still open source...

  • There is a (syntactically cumbersome) solution for limited recursion (as outlined above). Ideas about sugaring that up we're mentioned and certainly are an interesting idea.
  • Rethinking the model is another option - this works for many people and often solves different problems too
  • then there is the JSON scalar workaround - which is indeed technically possible - though obviously with its own negatives

Supporting arbitrary deep recursion has several problems:

  • Danger of cycles (DOS attack possible)
  • Even without cycles there easily can be dangers of DOS
  • Even sequences should be limited/paged

How do you want to solve this problems? As far as I tried it, it always comes back to setting appropriate limits. If I understood you right, you propose something like an "unsafe" flag one can use to buy in. The problem here is, that such flags tend to get misused and the effect will be thousands of vulnerable and fragile GraphQL services. I hope you can understand my (and others) guardedness against this path.

@joepie91
Copy link

Pointing out disrespectful behaviour from your side and indicating that I am not going to further engage in it, is not 'disrespectful'. It's calling out unacceptable behaviour.

Even sequences should be limited/paged

No, they shouldn't, necessarily. Not everybody is building a startuppy webapp. In fact, in my very first post in this thread, I explicitly pointed out that this concerns a data access layer, and that no network service or browser is involved at any point. This makes pagination absolutely irrelevant to my usecase.

If I understood you right, you propose something like an "unsafe" flag one can use to buy in. The problem here is, that such flags tend to get misused and the effect will be thousands of vulnerable and fragile GraphQL services.

There's quite a bit of precedent for unsafe flags by now; Rust uses them (unsafe blocks), React uses them (dangerouslySetInnerHTML), Node uses them (Buffer.allocUnsafe). In all of these cases, the flag is practically successful at conveying to the broader userbase that the operation they are doing carries safety risks.

Naturally this is not a 100% safeguard, but that is true for anything; including for the approaches that can already be implemented today, such as the JSON workaround (where the DoS vector would simply exist in custom code instead of in the GraphQL schema).

In other words: allowing this but explicitly marking it as unsafe would not make things worse; it could even be argued that it makes things better, by adding the explicit unsafe marker instead of expecting people to infer that from the logic.

Therefore:

and the effect will be thousands of vulnerable and fragile GraphQL services.

I do not find this to be a credible concern.

@IvanGoncharov
Copy link
Member

IvanGoncharov commented Apr 15, 2019

@petrbela @joepie91 Let's all be respectful, according to our new Code of Conduct
https://github.com/graphql/foundation/blob/master/CODE-OF-CONDUCT.md

You both have valid points about this feature request, so the decision should be made based on our guiding principles:
https://github.com/graphql/graphql-spec/blob/master/CONTRIBUTING.md#guiding-principles

I think "Favor no change", "Simplicity and consistency over expressiveness and terseness" are the most relevant for this case.

Also, it partially covered by "Enable new capabilities motivated by real use cases":

common scenarios are more interesting than rare ones

At the same time if someone wants to be a new champion for this issue he/she should open a new issue and present a more formal proposal explaining how it fits our guiding principles.

In general, commenting on closed issues doesn't change their status and is not tracked by maintainers.

@wmertens
Copy link

wmertens commented Apr 25, 2019

@joepie91 I don't think your data access layer argument is very strong:

  • You're building a GraphQL client anyway, which is doing much more processing than just some object serialization
  • The array => object processing is a bit unwieldy, granted, but not that different from what would need to happen anyway, just with an extra array thrown in there

I don't think that's enough to counterbalance the implications of full recursion.

I wonder if a directive could not provide the (de)serialization? The client would have to support such a directive too. It would allow skipping the intermediate array representation in the app code. E.g. adding @recursive would change the schema to be a list with extra fields to indicate where a child goes (e.g. always stored under key children, depth: 1 for children, depth: -2 to exit two nested objects, depth: 0 or missing for siblings)

PS since you're here: I really liked your points of view on JWT, I agree 100% and have often linked this to people :)

@MardanovTimur
Copy link

Hello to all! I worked on a problem with recursive queries in GraphQL. And this prompted me to make an @recursive directive with an indication of the depth parameter (depth), as @wmertens wrote about this earlier. But I did it before the topic with the recursive query began to evolve. I implemented the directive in the library graphene (graphql-core, python), since I am mostly a python developer. I am very interested in this topic, and I can share my realization with you. I would like to join the project graphql-spec and graphene as an employee. The commit with the implementation was made about a month ago. Sorry for my bad english.
A couple of lines with the implementation

def insert_recursive_selection(selection, depth, frame=[]):
    def insert_in_frame(selection, paste_selection, frame=frame):
        if frame:
            relay_frame = frame.pop(0)
        else:
            # remove directive
            selection.directives = []
            paste_selection.directives = []
            # return inner selection
            returnable_selection_set = selection.selection_set
            # insert in depth
            returnable_selection_set.selections.append(paste_selection)
            return returnable_selection_set
        for selection in selection.selection_set.selections:
            if selection.name.value == relay_frame:
                return insert_in_frame(selection, paste_selection, frame)
    #  remove_directive(selection)
    for counter in range(int(depth)):
        copy_selection = deepcopy(selection)
        copy_frame = deepcopy(frame)
        selection = insert_in_frame(selection, copy_selection, frame)


def build_recursive_selection_set(ctx, directive, selection):
    depth_size = directive.arguments[0].value.value
    is_relay = relay_node_check(selection)
    if is_relay:
        insert_recursive_selection(selection, depth_size, ['edges', 'node'])
    else:
        insert_recursive_selection(selection, depth_size)

It works with relay schema too

@MardanovTimur
Copy link

Directive recursive looks like this:

query recursiveTestQuery {
     model (first: 30) @recursive(depth: 5) {
           id
           model_name
     }
}

@mluis
Copy link

mluis commented Jul 24, 2019

Would it be possible with arbitrary parameters? Something like:

query recursiveTestQuery {
     model (first: 30) @recursive(model_name: 'leaf') {
           id
           model_name
     }
}

@leebyron
Copy link
Collaborator

I’m going to lock this issue since it has been closed for years and still generating comments that are not being tracked (since closed)

As pointed out in the original discussion, there is room to propose and champion an RFC to add this functionality to the spec, however currently there are work-arounds documented above. Next steps for someone who feels strongly about this feature is to open a new PR or Issue with an RFC to discuss.

@graphql graphql locked as too heated and limited conversation to collaborators Jul 24, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests