Explain how "in-between" resources for the "include" parameter should be handled #497

Closed
gmta opened this issue Mar 21, 2015 · 24 comments · Fixed by #637

@gmta

gmta commented Mar 21, 2015

The format documentation states the following:

In order to request resources related to other resources, a dot-separated path for each relationship name can be specified:

GET /articles/1?include=comments.author

Note: A request for comments.author should not automatically also include comments in the response.

Now if the relation comments.author references resources of type people, how do I know that the resources under the included key belong to the comments.author relationship? In other words, what would the response look like if I were to execute the following request:

GET /articles/1?include=comments.author,author

Suppose there is also a link called author that references a resource of type people. How would I differentiate between the two relationships when I have no information about the comment resources that sit between the article and people resources?

I could not find anything explaining how I should handle such a request and I think that this paragraph could include an example.

@bintoro
Contributor

bintoro commented Mar 21, 2015

I believe the articles resource would look like this:

{
  "type": "articles",
  "id": "1",
  "title": "Rails is Omakase",
  "links": {
    "author": {
      "self": "/articles/1/links/author",
      "related": "/articles/1/author",
      "linkage": { "type": "people", "id": "9" }
    },
    "comments.author": {
      "self": "/articles/1/comments/links/author",
      "related": "/articles/1/comments/author",
      "linkage": [{
        "type": "people",
        "id": "11"
      }, {
        "type": "people",
        "id": "12"
      }]
    }
  }
}

We do need an example of this in the spec, for sure. I'm trying to set up a separate section for advanced examples. This one would fit right in.
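
For illustration, here is a minimal Python sketch of how a client might tell the two groups of people apart under this proposal. It assumes the dotted "comments.author" key and the "linkage" member shown in the example above; the included array and the helper names (linkage_list, group_by_relationship) are hypothetical and not part of the spec.

```python
# Illustrative data only: the article resource from the example above,
# plus a hypothetical "included" array holding the three people.
article = {
    "type": "articles",
    "id": "1",
    "links": {
        "author": {"linkage": {"type": "people", "id": "9"}},
        "comments.author": {
            "linkage": [
                {"type": "people", "id": "11"},
                {"type": "people", "id": "12"},
            ]
        },
    },
}
included = [
    {"type": "people", "id": "9"},
    {"type": "people", "id": "11"},
    {"type": "people", "id": "12"},
]


def linkage_list(link_object):
    """Normalize to-one (object) and to-many (array) linkage to a list."""
    if not isinstance(link_object, dict):
        return []  # e.g. a plain URL string
    linkage = link_object.get("linkage")
    if linkage is None:
        return []
    return linkage if isinstance(linkage, list) else [linkage]


def group_by_relationship(resource, included):
    """Map each links key to the included resources it points at."""
    by_identity = {(r["type"], r["id"]): r for r in included}
    groups = {}
    for name, link_object in resource.get("links", {}).items():
        refs = linkage_list(link_object)
        groups[name] = [
            by_identity[(ref["type"], ref["id"])]
            for ref in refs
            if (ref["type"], ref["id"]) in by_identity
        ]
    return groups


groups = group_by_relationship(article, included)
print([p["id"] for p in groups["author"]])           # ['9']
print([p["id"] for p in groups["comments.author"]])  # ['11', '12']
```

The point is only that the links key, not the resource type, tells the client which relationship an included person belongs to.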

@gmta
Author

gmta commented Mar 21, 2015

Right, but wouldn't that require more changes to the format? For example, if I look at the definition for links I get:

The value of the "links" key is a JSON object (a "links object") that represents related resources, keyed by the name of each association.

The word "association" appears one other time in the entire format description - in an unrelated note.

I'm new to JSON API and although I like the format, I find it hard to pinpoint definitions such as these.

@bintoro
Contributor

bintoro commented Mar 21, 2015

@gmta, you're right. Definitions and terminology have not been strong points of JSON API, but lately there's been huge progress towards fixing all that. Let's keep the issue open until this case has been addressed, too.

As for pinpointing definitions, I'm planning a PR that overhauls the formatting of all primary definitions, but I can't submit it right now with so many PRs pending.

@bintoro
Contributor

bintoro commented Mar 22, 2015

So, I started working on this, and it turns out it's not so simple. I think this question should be milestoned for 1.0 (unless there's already a definitive solution I'm not aware of).

What if we have GET /articles/1?include=comments,comments.author?

Should the comments.author relationship appear as comments.author within the primary articles, as author within included.comments, or both?

It seems a bit complicated if the presence of data[...]/links/comments.author is contingent on the (non-)inclusion of comments. On the other hand, duplicating the information by default is not desirable either.

One could also expect the outcome to be governed by the fields param (related issue #476).

To solve this without considering fields, we could require that the non-included "in-between" resource must nevertheless appear in included, but those objects would only contain the necessary relationship information. This way, the payload could be structured in a uniform manner regardless of what was requested for inclusion.

Example:

GET /articles/1?include=author,comments.author (no comments requested)

{
  "data": [{
    "type": "articles",
    "id": "1",
    "title": "JSON API paints my bikeshed!",
    "links": {
      "author": {
        "linkage": { "type": "people", "id": "9" }
      },
      "comments": {
        "linkage": [
          { "type": "comments", "id": "5" },
          { "type": "comments", "id": "12" }
        ]
      }
    }
  }],
  "included": [{
    "type": "people",
    "id": "9",
    "first-name": "Dan",
    "last-name": "Gebhardt"
  }, {
    "type": "people",
    "id": "11",
    "first-name": "Bin",
    "last-name": "Toro"
  }, {
    "type": "people",
    "id": "12",
    "first-name": "Leeroy",
    "last-name": "Jenkins"
  }, {
    "type": "comments",
    "id": "5",
    "links": {
      "author": {
        "linkage": { "type": "people", "id": "11" }
      }
    }
  }, {
    "type": "comments",
    "id": "12",
    "links": {
      "author": {
        "linkage": { "type": "people", "id": "12" }
      }
    }
  }]
}

The important thing here is that the included comments above do not contain the comment body or any other info apart from the author relationship.

GET /articles/1?include=author,comments,comments.author would return a similar document but with full comment representations.

While this system would get rid of the dotted-notation "deep relationships" within links objects, it is somewhat incompatible with default includes. Since data[...]/links now contains a comments member instead of comments.author, how is the client to know if the comments have been automatically included in full?
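
As a rough sketch of what the client side of this proposal could look like, the Python below walks a dot-separated relationship path through the stub "in-between" resources. The document is a trimmed copy of the example above; follow_path and linkage_list are hypothetical helper names, and the "linkage" member is assumed to have the shape used in this thread.

```python
def linkage_list(link_object):
    """Normalize to-one (object) and to-many (array) linkage to a list."""
    if not isinstance(link_object, dict):
        return []
    linkage = link_object.get("linkage")
    if linkage is None:
        return []
    return linkage if isinstance(linkage, list) else [linkage]


def follow_path(start, path, included):
    """Resolve a path such as "comments.author" one relationship at a time."""
    index = {(r["type"], r["id"]): r for r in included}
    current = [start]
    for name in path.split("."):
        next_level, seen = [], set()
        for resource in current:
            link_object = resource.get("links", {}).get(name)
            for ref in linkage_list(link_object):
                key = (ref["type"], ref["id"])
                if key in index and key not in seen:
                    seen.add(key)
                    next_level.append(index[key])
        current = next_level
    return current


# Trimmed version of the example document above.
article = {
    "type": "articles", "id": "1",
    "links": {
        "author": {"linkage": {"type": "people", "id": "9"}},
        "comments": {"linkage": [
            {"type": "comments", "id": "5"},
            {"type": "comments", "id": "12"},
        ]},
    },
}
included = [
    {"type": "people", "id": "9", "first-name": "Dan"},
    {"type": "people", "id": "11", "first-name": "Bin"},
    {"type": "people", "id": "12", "first-name": "Leeroy"},
    # Stub comments: relationship information only, no body.
    {"type": "comments", "id": "5",
     "links": {"author": {"linkage": {"type": "people", "id": "11"}}}},
    {"type": "comments", "id": "12",
     "links": {"author": {"linkage": {"type": "people", "id": "12"}}}},
]

commenters = follow_path(article, "comments.author", included)
print([p["first-name"] for p in commenters])  # ['Bin', 'Leeroy']
print([p["first-name"] for p in follow_path(article, "author", included)])  # ['Dan']
```

The same traversal works whether the comments are stubs or full representations, which is the uniformity this approach is after.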

@tkellen
Member

tkellen commented Mar 22, 2015

See http://github.com/endpoints/example for an actual working implementation of this.

The following is not fully compatible with RC3, but it shows how I think this should work:

GET /authors/1

{
  "data": {
    "id": "1",
    "name": "J. R. R. Tolkien",
    "date_of_birth": "1892-01-03",
    "date_of_death": "1973-09-02",
    "type": "authors",
    "links": {
      "books": "/authors/1/books",
      "books.chapters": "/authors/1/books.chapters",
      "self": "/authors/1"
    }
  }
}

GET /authors/1?include=books

{
  "data": {
    "id": "1",
    "name": "J. R. R. Tolkien",
    "date_of_birth": "1892-01-03",
    "date_of_death": "1973-09-02",
    "type": "authors",
    "links": {
      "books.chapters": "/authors/1/books.chapters",
      "books": {
        "type": "books",
        "id": [
          "1",
          "2",
          "3",
          "11"
        ]
      },
      "self": "/authors/1"
    }
  },
  "included": [
    {
      "id": "1",
      "date_published": "1954-07-29",
      "title": "The Fellowship of the Ring",
      "type": "books",
      "links": {
        "series": {
          "type": "series",
          "id": "1",
          "resource": "/series/1"
        },
        "author": {
          "type": "authors",
          "id": "1",
          "resource": "/authors/1"
        },
        "chapters": "/books/1/chapters",
        "firstChapter": "/books/1/firstChapter",
        "stores": "/books/1/stores",
        "self": "/books/1"
      }
    },
    //  ...
    {
      "id": "11",
      "date_published": "1937-09-21",
      "title": "The Hobbit",
      "type": "books",
      "links": {
        "series": {
          "type": "series",
          "id": "null"
        },
        "author": {
          "type": "authors",
          "id": "1",
          "resource": "/authors/1"
        },
        "chapters": "/books/11/chapters",
        "firstChapter": "/books/11/firstChapter",
        "stores": "/books/11/stores",
        "self": "/books/11"
      }
    }
  ]
}

GET /authors/1?include=books.chapters

{
  "data": {
    "id": "1",
    "name": "J. R. R. Tolkien",
    "date_of_birth": "1892-01-03",
    "date_of_death": "1973-09-02",
    "type": "authors",
    "links": {
      "books": "/authors/1/books",
      "books.chapters": {
        "type": "chapters",
        "id": [
          "1",
          // ...
          "289"
        ]
      },
      "self": "/authors/1"
    }
  },
  "included": [
    {
      "id": "1",
      "title": "A Long-expected Party",
      "ordering": 1,
      "type": "chapters",
      "links": {
        "book": {
          "type": "books",
          "id": "1",
          "resource": "/books/1"
        },
        "self": "/chapters/1"
      }
    },
    // ...
    {
      "id": "289",
      "title": "The Last Stage",
      "ordering": 19,
      "type": "chapters",
      "links": {
        "book": {
          "type": "books",
          "id": "11",
          "resource": "/books/11"
        },
        "self": "/chapters/289"
      }
    }
  ]
}

GET /authors/1?include=books,books.chapters

{
  "data": {
    "id": "1",
    "name": "J. R. R. Tolkien",
    "date_of_birth": "1892-01-03",
    "date_of_death": "1973-09-02",
    "type": "authors",
    "links": {
      "books": {
        "type": "books",
        "id": [
          "1",
          "2",
          "3",
          "11"
        ]
      },
      "books.chapters": {
        "type": "chapters",
        "id": [
          "1",
          // ...
          "289"
        ]
      },
      "self": "/authors/1"
    }
  },
  "included": [
    {
      "id": "1",
      "date_published": "1954-07-29",
      "title": "The Fellowship of the Ring",
      "type": "books",
      "links": {
        "series": {
          "type": "series",
          "id": "1",
          "resource": "/series/1"
        },
        "author": {
          "type": "authors",
          "id": "1",
          "resource": "/authors/1"
        },
        "chapters": "/books/1/chapters",
        "firstChapter": "/books/1/firstChapter",
        "stores": "/books/1/stores",
        "self": "/books/1"
      }
    },
    {
      "id": "2",
      "date_published": "1954-11-11",
      "title": "The Two Towers",
      "type": "books",
      "links": {
        "series": {
          "type": "series",
          "id": "1",
          "resource": "/series/1"
        },
        "author": {
          "type": "authors",
          "id": "1",
          "resource": "/authors/1"
        },
        "chapters": "/books/2/chapters",
        "firstChapter": "/books/2/firstChapter",
        "stores": "/books/2/stores",
        "self": "/books/2"
      }
    },
    {
      "id": "3",
      "date_published": "1955-10-20",
      "title": "Return of the King",
      "type": "books",
      "links": {
        "series": {
          "type": "series",
          "id": "1",
          "resource": "/series/1"
        },
        "author": {
          "type": "authors",
          "id": "1",
          "resource": "/authors/1"
        },
        "chapters": "/books/3/chapters",
        "firstChapter": "/books/3/firstChapter",
        "stores": "/books/3/stores",
        "self": "/books/3"
      }
    },
    {
      "id": "11",
      "date_published": "1937-09-21",
      "title": "The Hobbit",
      "type": "books",
      "links": {
        "series": {
          "type": "series",
          "id": "null"
        },
        "author": {
          "type": "authors",
          "id": "1",
          "resource": "/authors/1"
        },
        "chapters": "/books/11/chapters",
        "firstChapter": "/books/11/firstChapter",
        "stores": "/books/11/stores",
        "self": "/books/11"
      }
    },
    {
      "id": "1",
      "title": "A Long-expected Party",
      "ordering": 1,
      "type": "chapters",
      "links": {
        "book": {
          "type": "books",
          "id": "1",
          "resource": "/books/1"
        },
        "self": "/chapters/1"
      }
    },
    // ...
    {
      "id": "289",
      "title": "The Last Stage",
      "ordering": 19,
      "type": "chapters",
      "links": {
        "book": {
          "type": "books",
          "id": "11",
          "resource": "/books/11"
        },
        "self": "/chapters/289"
      }
    }
  ]
}

@tkellen
Member

tkellen commented Mar 22, 2015

Whoops, our implementation doesn't include inter-linking within the included records, but it will soon (e.g. each book in included should list all chapter ids under the linkage when both are requested).

Should the comments.author relationship appear as comments.author within the primary articles, as author within included.comments, or both?

It doesn't matter if the associations appear separately or in both places--a client can resolve the relationships in either case. I suppose it would be good to provide a SHOULD or MUST picking one, though.

@bintoro
Contributor

bintoro commented Mar 22, 2015

It doesn't matter if the associations appear separately or in both places--a client can resolve the relationships in either case.

Sure. I was trying to see if there's a better way to accomplish uniform representations in the various cases.

If we're sticking with dot-separated association names in links, then we need a decision on what to do when the client requests /foo/1?include=bar,bar.baz:

  • Require the inclusion of a bar.baz member under links in (each) foo.
    • The Good: Whatever is requested in include is guaranteed to appear in links.
    • The Bad: All bar-to-baz linkage items appear in the payload twice.
  • Allow (require?) the server to deliver the baz relationships only under the included bar objects.
    • The Good: Unique representation for each relationship; reduced payload size.
    • The Bad: Included relations are not guaranteed to be found in primary links.

Once there's an answer to this, I can put together a PR that explains it. Also, we need a place for an example, which is going to be too marginal to include in the base spec. @tkellen, what do you think about repurposing the Examples page to hold some advanced usage examples?

@tkellen
Member

tkellen commented Mar 22, 2015

I think the first option is the one we want. I don't think:

The Bad: All bar-to-baz linkage items appear in the payload twice.

...is actually bad.

I also doubt that the payload size would appreciably decrease after gzipping in the second option.

I'm 👍 on having expanded examples.

I've yet to discuss this with @dgeb, @steveklabnik or @wycats, but I want to modify all the examples to use http://github.com/endpoints/fantasy-database soon (and probably move that repo under the json-api org), and to host a reference implementation of it (probably using endpoints). I also plan to provide a test suite that anyone can implement against using that reference database. I think that will go a long way to alleviate concerns about lack of examples.

@ethanresnick
Member

So, my understanding of this has been different than @bintoro's and @tkellen's. I assumed it would work as follows:

  1. There would never be dot-separated paths under the links key, as having these gets messy. It raises the questions that @bintoro brought up and I can imagine those keys being misinterpreted as complex attribute links by newcomers to the spec. (Or constraining our ability to support complex attribute linking later.)
  2. Leave it up to the client to make a request that allows them to link the resources if necessary, as they may not need to.

So, for example, a request to /articles/?filter[newer-than]=....&include=comments.author would return:

  1. A "comments" key in each article's link object, but no "comments.author" key nor an "author" key under the "comments" key.
  2. People resources in the "included" section.

This means that the client can't link the authors to their comments, but this may not matter. E.g. the client may just want to populate a "Recent Commenters" widget and not care who commented on what.

Then, if the client does want to be able to join the authors to their comments, it simply must do include=comments,comments.author. Then the article resources in the primary data would again only include a "comments" key in their link objects, but those comments can be found in the "included" section. Once the client finds the comments, each comment has an "author" key in its link object that the client can use to find the authors in the "included" section as well.

@tkellen
Member

tkellen commented Mar 23, 2015

@ethanresnick I have deployed applications using nested relations. It's very useful to reach into them without including the intermediary records. Also, the spec already mandates dot-notated links keys:

The value of the include parameter MUST be a comma-separated (U+002C COMMA, ",") list of relationship paths. A relationship path is a dot-separated (U+002E FULL-STOP, ".") list of relationship names. Each relationship name MUST be identical to the key in the links section of its parent resource object.
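
To make the grammar in that passage concrete, here is a small Python sketch that splits an include value into relationship paths and reports which "in-between" paths were not themselves requested. The function names (parse_include, intermediate_paths) are made up for illustration, and rejecting empty names is an assumption rather than something the quoted text states.

```python
def parse_include(value):
    """Split an include value into relationship paths (lists of names)."""
    paths = []
    for raw_path in value.split(","):
        names = raw_path.split(".")
        # Assumption: an empty relationship name is treated as an error.
        if any(name == "" for name in names):
            raise ValueError(f"empty relationship name in {raw_path!r}")
        paths.append(names)
    return paths


def intermediate_paths(paths):
    """Return the in-between paths that were not explicitly requested."""
    requested = {".".join(path) for path in paths}
    missing = []
    for path in paths:
        for depth in range(1, len(path)):
            prefix = ".".join(path[:depth])
            if prefix not in requested and prefix not in missing:
                missing.append(prefix)
    return missing


paths = parse_include("comments.author,author")
print(paths)                      # [['comments', 'author'], ['author']]
print(intermediate_paths(paths))  # ['comments']
print(intermediate_paths(parse_include("comments,comments.author")))  # []
```

For include=comments.author,author the only in-between path is comments, which is exactly the resource this issue is about.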

@ethanresnick
Member

@tkellen I actually read that part of the spec as suggesting that dot-separated keys should not be used in links. It compares the links keys to relationship names not relationship paths, which are the dot-separated ones.

Re it being useful to reach into nested relations without including the intermediate records: I can see that. It seems like the main advantage would be simpler processing on the client side, and possibly a smaller payload (though the payload size can usually be mitigated with the fields parameter). Still, I'm not sure that the benefits of this easier processing and/or smaller payloads are worth introducing dot-separated keys for (assuming my interpretation above is correct and that dot-separated paths aren't already in the spec).

@hhware
Contributor

hhware commented Mar 28, 2015

I would like to vote for a combination of points suggested above, and also discuss how the conflict/ambiguity of include/fields can be addressed (related: #476).

TL;DR: I think it is best not to impose any implicit requirements on the inclusion of related resources and instead to rely on explicit include & fields from the query string or endpoint defaults. This would allow a generic server-side library implementing JSON API to serve use cases with custom requirements with respect to relationships, after some preprocessing of the input.

If we're sticking with dot-separated association names in links, then we need a decision on what to do when the client requests /foo/1?include=bar,bar.baz:

  • Require the inclusion of a bar.baz member under links in (each) foo.
    • The Good: Whatever is requested in include is guaranteed to appear in links.
    • The Bad: All bar-to-baz linkage items appear in the payload twice.
  • Allow (require?) the server to deliver the baz relationships only under the included bar objects.
    • The Good: Unique representation for each relationship; reduced payload size.
    • The Bad: Included relations are not guaranteed to be found in primary links.

I would like to propose the following option: the server should only return what has been explicitly requested, either via the query parameters include & fields or, in their absence, via their default values specified by the given API for each endpoint.

An extended description:

  • The client should explicitly specify what it wants and be fully responsible for making sense of the returned result.
  • The server should not be concerned with "traceability" of resources in included from resources in primary data.
  • Both relationship and attribute names can be listed in fields (related: Namespace of attributes and relationships #471).
  • The client should be allowed to request nested relationships without intermediaries -- I believe it should be a decision of a designer of a given API whether they are supported, this spec should not dictate this.
  • If include and/or fields are not explicitly specified in the request, the particular API may provide default values for them separately for each endpoint. E.g., the defaults for the endpoint authors from this comment are: include="", fields[authors]=name,date_of_birth,date_of_death,books,books.chapters.
  • The server should return what is requested (either via query string or via defaults), without adding in-betweens in full or truncated form, and without enforcing other conventions on inclusion of related resources.

Motivation:

  • IMHO, this approach is the most straightforward and easy to understand.
  • It would make it possible to return both lean and rich responses, depending on the way the client formulated the request.
  • It would allow specific API implementations to introduce their own conventions, while still using generic JSON API libraries.

I would like to illustrate the 3rd point with an example. Imagine a generic server-side JSON API library to fetch data and build responses, which follows the approach described above. Let the default for include be "", and for fields let it be all attributes and all relationships of a resource (no nested relationships by default). Suppose this library also validates

  • values of elements of include & fields for the given endpoint,
  • that there is no conflict between include & fields (specifically, requesting fields[TYPE] returns an error if TYPE is neither a primary resource type nor mentioned anywhere in include).

Imagine an API implementation which wishes to use this library. That API has a custom convention: for every relationship, either both the link object and the corresponding related resource (in included) are returned, or neither. Some outer layer of that API's implementation could then parse the values of include and fields from the query string (or their defaults for that endpoint) and alter them so that the custom convention is satisfied (if a relationship is not listed in include, exclude it from the link object by specifying a custom fields[TYPE], etc.). The resulting values of include & fields can then be passed to the library, which would do the heavy lifting.
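
A rough Python sketch of the conflict check such a library might perform is given below. Everything in it is hypothetical: the SCHEMA mapping of relationship names to types, the function names, and the reading that every type along an include path counts as "mentioned in include".

```python
# Hypothetical schema: resource type -> {relationship name: related type}.
SCHEMA = {
    "authors": {"books": "books"},
    "books": {"chapters": "chapters", "author": "authors"},
    "chapters": {"book": "books"},
}


def types_mentioned(primary_type, include_paths, schema):
    """Collect the primary type plus every type along each include path."""
    mentioned = {primary_type}
    for path in include_paths:
        current = primary_type
        for name in path.split("."):
            if name not in schema.get(current, {}):
                raise ValueError(f"unknown relationship {name!r} on {current!r}")
            current = schema[current][name]
            mentioned.add(current)
    return mentioned


def check_fields(primary_type, include_paths, fields, schema):
    """Raise if fields[TYPE] names a type that cannot appear in the response."""
    allowed = types_mentioned(primary_type, include_paths, schema)
    conflicts = sorted(t for t in fields if t not in allowed)
    if conflicts:
        raise ValueError(f"fields given for types not in the response: {conflicts}")


# Passes: chapters is reachable through include=books.chapters.
check_fields("authors", ["books.chapters"], {"chapters": ["title"]}, SCHEMA)

# Fails: chapters is neither the primary type nor mentioned in include=books.
try:
    check_fields("authors", ["books"], {"chapters": ["title"]}, SCHEMA)
except ValueError as error:
    print(error)
```

An outer layer enforcing a custom convention would rewrite include and fields first and then hand the adjusted values to a check like this one.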

IMHO, adherence to explicit values of include & fields should allow generic libraries to work with various custom requirements on relationships, including those suggested by @bintoro here and by @ethanresnick here.

Or constraining our ability to support complex attribute linking later

@ethanresnick, not sure I understand. Since attributes and relationships share namespace within a resource, could such a conflict even exist? For the example of authors.books.chapters, books is a relationship for authors, and chapters is a relationship for books. How could, say, complex attributes of books conflict with anything?

I would like to join others in requesting to milestone this issue for 1.0.

@ethanresnick
Member

The client should explicitly specify what it wants and be fully responsible for making sense of the returned result.
The server should not be concerned with "traceability" of resources in included from resources in primary data.
The client should be allowed to request nested relationships without intermediaries -- I believe it should be a decision of a designer of a given API whether they are supported, this spec should not dictate this.

@hhware Sounds like we're on the same page about how this should work overall.

To your particular question about my remark on complex attributes: I didn't mean to suggest that using dot-separated links keys in include responses would cause a naming conflict, but just that it would give the use of dot-separation a particular meaning (i.e. a multi-level include). So, then, if we were to link complex attributes with dot-separated keys, the interpretation of those keys would get murkier (e.g. the client might have to do some parsing to figure out whether they point to a complex attribute or to a nested include that was included by the server by default). This whole conversation might be moot, though, since the final design that @bintoro and I proposed for complex attribute linking didn't use dot-separated keys anyway.

@ethanresnick
Member

As a general note: if we wanted to make it easy for clients to use the raw payload to randomly access included resources by following linkage in the primary resources, we never would've represented included as an array. We would've represented it as something like this:

{
  //..
  "included": {
    "people": {
      "27": {
        // resource object for person 27
      },       
      "28": {
        // resource object for person 28
      }
    }, 
    "comments": {
       //id keyed hash of comment resources
    }
  }
}

Then "linkage": {"type": "people", "id": "27"} could've been very easily found.

The fact that we didn't use this representation implies to me that we expect that clients will loop over the included array and read it into a more useful data structure themselves. (And, of course, this is what Ember Data does.)
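
A minimal Python sketch of that client-side step is shown here: it reads the included array into the type-then-id keyed structure sketched above. The function name index_included and the sample data are illustrative only.

```python
def index_included(included):
    """Turn the included array into {type: {id: resource}} for direct lookup."""
    index = {}
    for resource in included:
        index.setdefault(resource["type"], {})[resource["id"]] = resource
    return index


# Illustrative data only.
included = [
    {"type": "people", "id": "27", "first-name": "Dan"},
    {"type": "people", "id": "28", "first-name": "Bin"},
    {"type": "comments", "id": "5"},
]

index = index_included(included)
linkage = {"type": "people", "id": "27"}
print(index[linkage["type"]][linkage["id"]]["first-name"])  # Dan
```

After one pass over the array, each linkage lookup is a pair of dictionary accesses.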

Therefore, when it comes to allowing dot-separated links keys or not, I don't think we should be particularly sensitive to the fact that including a dot-separated key in the links object (like "books.chapters" within each author's links) makes it slightly easier to connect the chapters directly to an author than does including only the direct relationship name in each resource's links object. The client is already responsible for finding a simple way to do these lookups.

That still leaves the question of payload size—i.e., not allowing dot-separated keys means the client has to include each intermediate resource if it wants to be able to connect the primary resources to the lowest-level included ones—but again:

  1. The payload size can usually be mitigated with fields. E.g. /authors/1?include=books,books.chapters&fields[books]=chapters.
  2. There are many cases where the client really doesn't need to connect the included and primary resources. For example, I was using JSON API to make a Sponsors page for a conference, and I wanted to list all the companies that provided space for the conference's events. So I did (essentially) /events?filter[isConferenceEvent]=true&include=venue.organization, to include the organization that owns the venue at which each event was held. That way I could put those organizations on the Sponsors page. But I didn't care which event or which venue each sponsor was associated with.
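
For reference, a small Python sketch of building such a request URL with both include and fields follows. The helper name build_url is hypothetical, and leaving commas and square brackets unencoded is a readability choice in this sketch, not a requirement taken from the spec.

```python
from urllib.parse import urlencode


def build_url(path, include=(), fields=None):
    """Assemble a request URL from include paths and per-type field lists."""
    params = {}
    if include:
        params["include"] = ",".join(include)
    for type_name, names in (fields or {}).items():
        params[f"fields[{type_name}]"] = ",".join(names)
    # Keep "," and "[]" readable instead of percent-encoding them.
    query = urlencode(params, safe=",[]")
    return f"{path}?{query}" if query else path


print(build_url("/authors/1", ["books", "books.chapters"], {"books": ["chapters"]}))
# /authors/1?include=books,books.chapters&fields[books]=chapters
```

With fields[books]=chapters the intermediate books stay in the payload but carry only the relationship needed to reach the chapters.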

Meanwhile, the advantage of not allowing dot-separated keys for now is that it makes the meaning of the keys in links more straightforward: those keys represent relationships directly on the primary resource, and they're not affected by include. Also, if we have dot-separated links keys in GET responses, what does that imply for PATCH and POST?

I definitely haven't thought about this enough/considered all the consequences of either route, but my strong gut reaction is that dot-separated keys are really not a road we want to go down for the base spec.

@nevson

nevson commented May 12, 2015

This might be a bit off topic, but new technologies like Relay (incl. GraphQL), Falcor (Netflix), Datomic, ... have recently been popping up, all of which allow you to define queries that retrieve a complex data tree within a single response containing all the data required, e.g. by a UI component. To me this sounds very similar to what can be achieved via the include + fields request parameters (am I wrong with this assumption?).

Something that's not really clear to me is how to handle pagination / limiting of included resources in JSON API. In GraphQL you'd be able to fetch only a range of items from any "in-between" resource (see the example "find a user by handle and see some posts" on https://graphql-ruby-demo.herokuapp.com/ ) via the find(number) expression.
For a request like GET /authors/1?include=books,books.chapters, how would I reduce the number of included chapters to only one or two entries (as not all results are required)? Using page[limit] as currently defined won't work in this case, I guess.

Or would you say such complex data queries are not necessary or are out of scope for JSON API, and that you'd be better off issuing several GET requests to fetch the resources separately and then building the result in code?

@gr0uch
Contributor

gr0uch commented May 12, 2015

@nevson I think that is outside of the scope of this specification. That is what the filter query is for; you can define your own semantics for it.

@hhware
Contributor

hhware commented May 15, 2015

@nevson, IMHO, necessity of several GETs is not implied. I think it is within the scope of the spec:

To paginate an included collection returned in a compound document, supply pagination links in the corresponding link object.

The spec just does not define the exact way of requesting it.

@tkellen
Member

tkellen commented May 18, 2015

I'm not sure what to do about what I'm about to write, but I'm leaving it here for posterity.

I would argue that in any case where you cannot link the data in included back to the primary resource object(s) in data, you don't actually care about what is in data at all. In fact, what is "included" should be the primary data of the request.

From @ethanresnick's example:

There are many cases where the client really doesn't need to connect the included and primary resources. For example, I was using JSON API to make a Sponsors page for a conference, and I wanted to list all the companies that provided space for the conference's events. So I did (essentially) /events?filter[isConferenceEvent]=true&include=venue.organization, to include the organization that owns the venue at which each event was held. That way I could put those organizations on the Sponsors page. But I didn't care which event or which venue each sponsor was associated with.

...you don't actually care about the events, you're just "hacking" JSON API to get the orgs, yeah?

@ethanresnick
Member

@tkellen I want to re-read this issue and think about it more closely but, just as a tentative response to your comment: yeah, I was "hacking" for the orgs in my example. And you might be right that, in general, the use cases in which the client can't link included back to data are ones in which it doesn't care about the primary data. But that doesn't seem like it will always be the case. For example, imagine the same request for /events?filter[isConferenceEvent]=true&include=venue.organization. Maybe the client wants to use that to build the conference site's homepage, and the sponsors need to go on the homepage along with a schedule (with event titles, times, and descriptions), but the event venue can be listed on a "details" page for each event. In that case, both the primary and included data are being used, and it really is the case that the intermediate entity (the venue) isn't necessary.

I'm not sure what the implications of this are though...

@hhware
Contributor

hhware commented May 18, 2015

you don't actually care about what is in data at all.

@tkellen, I do not think that is always the case. Another consideration along the lines of what @ethanresnick is talking about: suppose one wants to display a set of resources along with some statistical information about them, which is based on the attributes/number/types/etc. of their related resources. So both the primary data and some resources related to it are needed. One way to solve this is to compute that statistical information on the server, but what if the nature of the data is highly variable and multi-dimensional, so that it is impractical to support this multitude of options server-side? Or what if the architecture of the system is such that this kind of work is supposed to be done on the client? Another way to solve it is to request the two pieces separately, but that suffers from the same problems as the first one, just in the other direction along the relationships.

IMHO, there is no reason for the spec not to be flexible in this case and require (potentially tons of) in-betweens to be shipped to the client along with useful information...

@dgeb
Member

dgeb commented May 19, 2015

As I see it, the core tradeoff here is:

A) A clear normative requirement that linkage data be included to connect primary and included resources.

vs.

B) The flexibility to allow clever ad hoc queries in which clients can intuit connections between primary and deeply nested resources, without the need to include intermediate resources.


The value of A is undeniable. It ensures that servers provide the linkage necessary to connect primary and included resources. It ensures that the included resources are in fact related to primary resources. It means that compound documents will always be fully "connected".

The value of B is more tenuous. It provides some potential for minimizing request count and payload size. However, it is truly not a generalizable benefit. It only "works" in certain cases for certain requests in particular applications.

I had a long discussion with @tkellen and @lgebhardt today about this tradeoff. We were unable to arrive at a best-of-all-worlds solution that provided both A and B. We discussed options such as complex relationship paths (e.g. comments.author) in primary resources to include linkage data therein. However, the presence of these complex paths would be dictated by the include parameter, and that seemed like a vague serialization rule and not very normative. We also discussed possible rules for a normative requirement in which linkage data MUST be included when every chain is included, but that also seemed arbitrary and brittle.

Therefore, we are leaning heavily toward providing benefit A for everyone with a simple and clear requirement. The alternative - to remove that normative requirement for linkage data - would mean that compound documents could be "broken". Some implementations would choose to not provide linkage data, even for parent-child relationships, and that seems like a much worse consequence than losing the potential benefit B for certain select cases with certain select implementations.

This issue, which seems rather simple on the surface, has turned out to be surprisingly tricky.
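
As a rough illustration of what "fully connected" in (A) could mean in practice, the sketch below treats a compound document as a graph and reports any included resource that cannot be reached from the primary data through relationship linkage. It assumes resources carry their linkage under `relationships.<name>.data`; the function names `unlinked_resources` and `_identifiers` and the sample documents are hypothetical, not part of the spec or of any library.

```python
from collections import deque


def _identifiers(resource):
    """Yield (type, id) for every resource identifier in a resource's relationships."""
    for rel in resource.get("relationships", {}).values():
        data = rel.get("data")
        if data is None:
            continue
        items = data if isinstance(data, list) else [data]
        for item in items:
            yield (item["type"], item["id"])


def unlinked_resources(document):
    """Return included (type, id) pairs not reachable from the primary data."""
    primary = document.get("data") or []
    if isinstance(primary, dict):
        primary = [primary]
    included = {(r["type"], r["id"]): r for r in document.get("included", [])}

    seen = set()
    queue = deque(primary)
    while queue:
        resource = queue.popleft()
        for key in _identifiers(resource):
            # Only follow linkage that points at a resource actually shipped
            # in "included"; anything else is a dead end for this check.
            if key in included and key not in seen:
                seen.add(key)
                queue.append(included[key])
    return sorted(set(included) - seen)


# Hypothetical sample data for GET /articles/1?include=comments.author
article = {
    "type": "articles", "id": "1",
    "relationships": {"comments": {"data": [{"type": "comments", "id": "5"}]}},
}
comment = {
    "type": "comments", "id": "5",
    "relationships": {"author": {"data": {"type": "people", "id": "9"}}},
}
person = {"type": "people", "id": "9"}

connected = {"data": article, "included": [comment, person]}
skipped = {"data": article, "included": [person]}  # intermediate comment omitted

print(unlinked_resources(connected))  # []
print(unlinked_resources(skipped))    # [('people', '9')]
```

The second document is the "broken" case discussed above: the person is present, but nothing in the response says how it relates to the article.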

dgeb added a commit that referenced this issue May 20, 2015
Define full linkage in the "compound documents" section and expand upon
its implications in the "inclusion of related resources" section.

[Closes #497 and #624]
@nevson

nevson commented May 20, 2015

@daliwali, @hhware Thanks for your feedback :)

To paginate an included collection returned in a compound document, supply pagination links in the corresponding link object.

What would an example implementation look like for the example I mentioned in #497 (comment)? Honestly, this is a bit unclear to me.

A) A clear normative requirement that linkage data be included to connect primary and included resources.

@dgeb Sorry for asking, but would that mean that a request for comments.author MUST automatically also include comments in the response so that it is fully connected, or does it mean that something like comments.author won't be possible at all (only include=comments would be allowed)?

@tkellen
Member

tkellen commented May 20, 2015

@nevson As an API implementor, you'd internally create an alias, like participants or comments-authors. This would appear in the relationships of the primary resource under a key of the same name, allowing you to skip intermediate records while still linking them to the primary resource.
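
A minimal sketch of the alias idea, assuming linkage lives under `relationships.<name>.data`: the server walks the intermediate records, collects what they point to, and exposes the result on the primary resource under a single alias key. The helper name `add_alias_relationship`, its parameters, and the sample records are made up for illustration; only the `comments-authors` alias name comes from the comment above.

```python
def add_alias_relationship(primary, intermediates, via, alias):
    """Return a copy of `primary` with an `alias` relationship that links
    directly to whatever each intermediate resource links to through `via`."""
    seen = set()
    linkage = []
    for resource in intermediates:
        data = resource.get("relationships", {}).get(via, {}).get("data")
        if data is None:
            continue
        for item in (data if isinstance(data, list) else [data]):
            key = (item["type"], item["id"])
            if key not in seen:  # several comments may share one author
                seen.add(key)
                linkage.append({"type": item["type"], "id": item["id"]})

    relationships = dict(primary.get("relationships", {}))
    relationships[alias] = {"data": linkage}
    return {**primary, "relationships": relationships}


# Hypothetical records: three comments on article 1, written by two people.
comments = [
    {"type": "comments", "id": "5",
     "relationships": {"author": {"data": {"type": "people", "id": "9"}}}},
    {"type": "comments", "id": "12",
     "relationships": {"author": {"data": {"type": "people", "id": "2"}}}},
    {"type": "comments", "id": "13",
     "relationships": {"author": {"data": {"type": "people", "id": "9"}}}},
]
article = {
    "type": "articles", "id": "1",
    "relationships": {"author": {"data": {"type": "people", "id": "9"}}},
}

linked = add_alias_relationship(
    article, comments, via="author", alias="comments-authors"
)
```

A client would then request the alias (for example `include=comments-authors`, assuming the server exposes it under that name) and receive the people without the comments, while the response stays connected to the primary resource.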

@dgeb
Member

dgeb commented May 20, 2015

@nevson To add to what @tkellen said, comments.author would still be allowed but full linkage would be required in the response. Please see #637 for complete details.
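
One way a server might honor this, sketched under the assumption that full linkage is achieved by returning every intermediate resource along a multi-part path: expand each requested path into all of its prefixes before loading related resources. The function name `expand_include` is hypothetical.

```python
def expand_include(include):
    """Expand an `include` parameter value so that every multi-part path
    also lists its intermediate paths, preserving first-seen order."""
    paths = []
    for raw in include.split(","):
        parts = raw.strip().split(".")
        if parts == [""]:
            continue  # ignore empty entries such as a trailing comma
        for i in range(1, len(parts) + 1):
            prefix = ".".join(parts[:i])
            if prefix not in paths:
                paths.append(prefix)
    return paths


print(expand_include("comments.author,author"))
# ['comments', 'comments.author', 'author']
```

With this expansion, a request for `comments.author` also loads `comments`, so each included person can be traced back to the article through a comment.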

@dgeb dgeb closed this as completed in #637 May 20, 2015