Linking from complex attributes #383

Closed
ethanresnick opened this issue Mar 1, 2015 · 20 comments

Comments

@ethanresnick
Member

The use case

Consider the following cases:

  1. A resource representing a product has a "price" attribute, which is an object of the form: {"amount": ..., "currency": ...}, where the "currency" property refers to a particular currencies resource.
  2. A resource representing a project has a "team-members" attribute, which is an array of objects of the form: {"person": ...., "role-played": ...}, where the "person" and "role-played" properties refer to particular people and roles resources.

In both of the cases above, the attribute mentioned is an object or an array of objects that includes a link to another resource. But, crucially, the values for these attributes are only likely to occur in, or are only relevant in the context of, one resource. That is, the (amount, currency) pair that represents a product's price doesn't make sense outside the context of that product, and the (person, role) pair that describes the role of a person on a particular project likely does not need to be considered outside of that project.

Accordingly, there's little semantic or functional justification for making separate, addressable resources from the values of these "complex attributes". Meanwhile, turning them into separate resources can have a performance cost for backends implemented with document databases, which allow the complex attributes' values to be stored directly in the main resource's document. Therefore, the JSON-API spec shouldn't mandate that these complex attributes be turned into separate resources. But, if it's not going to mandate that, it has to somehow allow these complex attributes' links to other resources to be expressed in the response.

Prior Discussions

This issue has been discussed many times before, including in #238, #276, and #311. But many of the proposed solutions no longer apply post rc2. Post rc2, though, @bintoro made two proposals that offer good starting points for discussion.

My initial preferred approach

As I said previously, I think @bintoro's first proposal makes a lot of sense. It's syntactically consistent with the current approach, and it seems easier to implement server-side than splitting up the complex attribute would be. Its downside, of course, is that it makes a parser's search for "links" somewhat more complicated.

An in-between option that keeps it reasonably simple to find links, but still keeps a complex attribute's value together, could be the following:

{
  "data": {
    "id": "1",
    "type": "people",
    "name": "Victor Mayonnaise",
    "links": ["employer", "address.city"],
    "employer": {
      "relationship": "http://example.com/people/1/links/employer",
      "resource": "http://example.com/companies/2",
      "type": "companies",
      "id": "2"
    },
    "address": {
      "street": "12 Mean Street",
      "city": {
        "type": "cities",
        "id": "123"
      }
    }
  }
}

Here, the link objects have been treated just like any other attribute value, so links out from simple attributes are now at the top level of the resource. The "links" key just lists an array of paths that hold link objects.
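To make the lookup concrete, here is a minimal Python sketch of how a client might find the link objects under this in-between format. The helper names `resolve_path` and `link_objects` are hypothetical, and the sketch assumes that member names never contain a dot, which the proposal above does not actually guarantee.

```python
from typing import Any


def resolve_path(resource: dict[str, Any], dotted_path: str) -> Any:
    """Follow a dot-separated path such as "address.city" into a resource."""
    value: Any = resource
    for key in dotted_path.split("."):
        value = value[key]
    return value


def link_objects(resource: dict[str, Any]) -> dict[str, Any]:
    """Map each path listed under "links" to the link object it points at."""
    return {path: resolve_path(resource, path) for path in resource.get("links", [])}


# The resource object from the payload above, with shortened link objects.
person = {
    "id": "1",
    "type": "people",
    "name": "Victor Mayonnaise",
    "links": ["employer", "address.city"],
    "employer": {"type": "companies", "id": "2"},
    "address": {
        "street": "12 Mean Street",
        "city": {"type": "cities", "id": "123"},
    },
}

found = link_objects(person)
```

The client only ever reads the single top-level "links" array, so the search for links stays in one place even though the link objects themselves live at arbitrary depths.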

I have a feeling this might get strong pushback, though, as people seem to like the idea of the "links" key holding an object. I like that too, in theory, but I wonder whether, if the alternative is having multiple "links" keys throughout the resource, the above isn't a better compromise. I don't know.

Thoughts?

@bintoro
Contributor

bintoro commented Mar 1, 2015

Great analysis @ethanresnick. I like your "in-between" approach a lot!

One aspect I've been trying to push for is uniformity of representations. It would be very elegant if a client didn't have to parse the same thing in different ways depending on the HTTP endpoint it happens to use. (This principle is not limited to complex attributes. I've brought this up elsewhere, too.)

Complex attributes that have their own links bunch would obviously fit this bill, since they are completely self-contained.

In the in-between proposal, the representation per se isn't uniform (because links switches context), but it doesn't matter. The crucial thing is whether the parsing logic can remain uniform, and your proposal accomplishes this perfectly. The links array will always be at the top level of the resource object, so /person/1 and /person/1/address can be parsed in identical ways.

GET /person/1/address

HTTP/1.1 200 OK

{
  "data": {
    "links": ["city"],
    "street": "12 Mean Street",
    "city": {
      "type": "cities",
      "id": "123"
    }
  }
}

Furthermore, a client shouldn't have to care whether address is internally a complex attribute or a linked resource. Requesting /person/1/address should return the same kind of representation regardless. This is also true in your scheme.

Nice job!

@ethanresnick
Member Author

@bintoro Thanks!

And I'm glad you're thinking about parsing uniformity. That's been an ongoing concern for many of us—see, e.g., #276—and I think the spec is much better now in this regard. I still haven't taken the time to wrap my head around all the new cases introduced by rc2, but I'm happy to hear that the in between solution appears to work with them.

One thing, though: I'm not sure that GET /people/1/address is actually a response we need to be concerned with. The point with the complex attributes case, as I understand it, is that the data in question really is just an attribute, and not a distinct resource that it would make sense to address separately. So, in the same way we wouldn't have GET /people/1/first-name, I don't think we should have it for address.

(EDIT: remove discussion of GET /people/1/links, since that's an invalid request).

The other request that I'm most worried about is GET /people/1/links/address.city. I think this would look exactly the same as requesting a link from a simple attribute would. So GET /people/1/links/address.city is no different in structure than GET /people/1/links/employer. What that structure is, though, needn't be discussed here, since that's already happening in #357.

@bintoro
Contributor

bintoro commented Mar 1, 2015

One thing, though: I'm not sure that GET /people/1/address is actually a response we need to be concerned with. The point with the complex attributes case, as I understand it, is that the data in question really is just an attribute, and not a distinct resource that it would make sense to address separately.

In this particular case, sure, it may never be necessary to target address directly. But more generally, it would be highly desirable to be able to switch from normalized resources to complex attributes (or vice versa) without affecting what the response looks like.

When you think of it this way, a term like "embedded resource" or "subresource" might be more apt than "complex attribute". (EDIT: But for the purposes of the spec, I wouldn't favor such a term unless the structure was identical with "proper" resources, which it isn't in this proposal.)

The other requests that I'm most worried about are GET /people/1/links and GET /people/1/links/address.city.

I would expect GET /people/1/links to respond with this:

{
  "data": {
    "links": ["employer", "address.city"],
    "employer": {
      "relationship": "http://example.com/people/1/employer",
      "href":  "http://example.com/companies/2",
      "type": "companies",
      "id": "2"
    },
    "address.city": {
      "type": "cities",
      "id": "123"
    }
  }
}

In the "nested links bags" scheme, I would expect any links in the substructures to be completely excluded, simply because they are not, in fact, part of the links object you're targeting. If it's desirable to target all of the links in the entire tree in a single request, then let's scrap the nested scheme right now. My first impression is that I like your proposal better anyway.

By the way, I like that you've included both the relationship-specific resource URL as well as the canonical URL (href) in the link object. I'm trying to accomplish the same thing in #379.

EDIT: I should add that, purely from a parsing point of view, I find it simpler to deal with the "nested links" format because the links are available at the relevant context. Is there some obvious use case for dealing with them all at once? I mean, if address were a normalized resource instead, you wouldn't know about the address.city link anyway. But what I really like about the new proposal is that it cleans up the syntax by getting rid of the resource-level links objects.

@ethanresnick
Member Author

@bintoro Whoops, I totally didn't mean to put relationship and href keys in my payload earlier. Not sure how that happened! I think the discussion in #379 is a good one to have, but for simplicity here I meant to just assume that the structure of a link object stays constant. I've edited my earlier post accordingly.

Re...

But more generally, it would be highly desirable to be able to switch from normalized resources to complex attributes (or vice versa) without affecting what the response looks like.

Can you explain this a bit? As I see it, there are cases where a separate resource makes sense semantically and other cases where what you want to specify is really a complex attribute. But I can't imagine one case changing into another. If we really do start thinking about embedded resources, a la HAL, I think stuff's gonna get messy really fast.

@bintoro
Contributor

bintoro commented Mar 2, 2015

I was thinking of something you said in #341:

by far, the most annoying aspect of the current API I've built on this spec is the extra, normalized resources I've had to create

It feels like an appealing property of a system if the representation of a resource can remain the same when it appears as a part of a larger resource. For example, you might take a snapshot of an independent, linked resource and permanently store it as part of some context. In this respect, the "nested links" route is more complete.

I suppose you're right that actually requesting a wholly dependent subresource directly is mostly just a theoretical concern.

Looking at this again, one thing that occurred to me is that the "in-between" approach has a property that might be dissatisfying to some:

Now:

  • /people/1/employer doesn't conflict with any actual JSON member called employer.
  • /people/1/links matches the links object perfectly.

The in-between model:

  • /people/1/employer coincides with an employer member yet doesn't match its contents.
  • /people/1/links no longer reflects the structure of the links object.

So, it kind of breaks the correlation between URLs and the corresponding objects.

I guess the choice between the alternatives depends at least partly on how useful it really is to enumerate all the relationships at the top level of the main resource (see the addendum I edited on my previous comment).

@ethanresnick
Member Author

@bintoro Let me back up a step. The reason I want to keep the concept of a "complex attribute", rather than moving to the notion of a "subresource", is that I think having both the notion of a "subresource" and an "included resource" will get confusing. For example, I think it'll make people ask why a request for GET /people/1?include=employer can't include the "employer" as a subresource within the person's resource object, rather than in a separate "include" key. And I'm not inclined to (re)start all those discussions when we're so close to 1.0.

That said, for consistency and easy parsing, I agree with you that it makes sense to structure "complex attributes" in the same way we structure resource objects. And, I agree with your points about the in-between model breaking a lot of nice correspondences between the URLs and the payload.

So, now, I'm thinking that one option would be to stick with the nested-links approach—which has all the advantages for parsing and url–object correspondence—and simply prevent the concept of a complex attribute from collapsing into the concept of a subresource by saying explicitly that a complex attribute MUST NOT be accessible at its own url, in the same way that simple attributes (like name) aren't.

The only problem I see with this is that, if the complex attribute doesn't have a url, how do you request the link object inside of it? The nested-links payload's structure would suggest /people/1/address/links/city, but that becomes really awkward if, per the above, we've stipulated that GET /people/1/address shouldn't work. (EDIT: remove references to requesting /people/1/links.)

So what are our other options? We could go back to your second proposal, but I agree that splitting up the attribute is suboptimal.

A radical option would be to simply require the server to serialize the "address" attribute as a series of key-value pairs that live at the same level as the simple attributes, so we'd get "address.street", "address.city", etc. top-level members, rather than having an object live at the "address" key. But I think this creates more problems than it solves.
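For illustration only, a Python sketch of what that flattening could look like; the `flatten` helper is hypothetical and not part of any proposal in this thread. One possible complication it makes visible: a naive flattener cannot tell a link object from any other nested object, so the city link is split into separate members as well.

```python
from typing import Any


def flatten(attributes: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn nested objects into dotted top-level keys, e.g. "address.street"."""
    flat: dict[str, Any] = {}
    for key, value in attributes.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


person = {
    "name": "Victor Mayonnaise",
    "address": {
        "street": "12 Mean Street",
        "city": {"type": "cities", "id": "123"},
    },
}

flat = flatten(person)
```

A real serializer would need some extra rule, such as a list of paths to leave intact, to keep the link object together.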

Alternatively, we could stick with the in-between approach, but change the url of /people/1/employer to /people/1/linked/employer. This could be interpreted as requesting the resource linked to from the link object at the employer member. With this change, none of the urls "under" /people/1 directly map to sub-parts of the /people/1 payload; instead both /type/id/linked and /type/id/links become special namespaces with special behavior. Of course, this throws away the direct url/payload path correspondence, but at least it does so uniformly. The bigger concern, I think, is that this rename would imply that it should be possible to do GET /people/1/linked and have that return all the linked resources. Whether being able to request that is a good thing or not, I'm not sure. But I don't think it'd be that hard to implement on the server, since the requisite query is the same as that for GET /people/1?include=all,the,linked,properties and the server should have a schema to know what those property names are.

Thoughts? Other options?

@bintoro
Contributor

bintoro commented Mar 2, 2015

I think having both the notion of a "subresource" and an "included resource" will get confusing

Sorry, I was being unclear... I'm not proposing changing the term unless there's some particular reason. You said:

The point with the complex attributes case, as I understand it, is that the data in question really is just an attribute

...so I used the term "subresource" to explain my thinking.

saying explicitly that a complex attribute MUST NOT be accessible at its own url, in the same way that simple attributes (like name) aren't

But what's the value in prohibiting it? The spec could just stay silent on the issue just like in the case of simple attributes.

if the complex attribute doesn't have a url, how do you request the link inside of it? [...] how is a user supposed to get the city's link object?

By requesting /people/1, and there you have it in data.address.links.city.

If there's no need to request /people/1/address directly (because it's wholly dependent), then certainly there should never be a need to access something inside it directly, right?

EDIT: Oh now I get it, you mean how to manipulate the relationships directly, like POST to a to-many. Well, you don't, you have to set them on the object. This is why I like the ability to address these things directly; then it happens in the usual way, e.g., /people/1/address/links/city.

EDIT 2: Hmm. I wouldn't actually find it atrocious at all if a server didn't serve /people/1/address yet exposed the relationships at /people/1/address/links/. Just return a sensible status code for the address if you don't wish to deliver it in isolation from the people resource. Why not?

@ahacking

ahacking commented Mar 2, 2015

So, now, I'm thinking that one option would be to stick with the nested-links approach—which has all the advantages for parsing and url–object correspondence—and simply prevent the concept of a complex attribute from collapsing into the concept of a subresource by saying explicitly that a complex attribute MUST NOT be accessible at its own url, in the same way that simple attributes (like name) aren't.

Not being addressable is precisely the use case I want to support. Anyone who says 'you can just normalize' should go and do RDF triples, because that's where you end up. I don't want to implement a triple store, so I want attributes which have no identity in their own right, but which may be complex, to be able to link out. There are countless basic data modeling use case drivers for this.

As I've said previously, the real problem stems from the segregation of fields and links, which has dubious justification, if any at all. It's a handrail, just like typed collection keys gave people comfort until they realized the handrail actually gets in the way.

@bintoro
Contributor

bintoro commented Mar 2, 2015

Not being addressable is precisely the use case I want to support.

No one's saying complex attributes must be addressable. I just don't see the point of prohibiting it either. If someone wants to retrieve "1" by issuing GET /people/1/id, I don't mind.

And links bunches need not be addressable ever. The spec says:

If a link object contains a relationship URL, then the server MUST respond to requests to that URL to update the relationship.

If.

But presumably people will want to do this, and there's a dearth of sensible options apart from making the nested links objects addressable. And what's the harm in that? An application is free to not implement this and go through the main object or the other side of the relation instead.

@ahacking

ahacking commented Mar 3, 2015

@bintoro I just wanted to spell out that the major driver for rich nested attributes is that they are not addressable, as they don't have an id, and the fact that the server does not support that level of granularity on fields.

I do see a problem with interoperability with addressing "sub resources" / fields like you describe. Having clients support both styles where on some servers they may be able to address with fine granularity and others only at the owning resource is quite a different api.

If, however, we limit 'json-api' defined addressability to a resource, we still have the fields query param to control what we want returned. If you want to have finer grained addressability, then I would say that would be outside of json-api defined semantics and more a local/proprietary behavior.

@bintoro
Contributor

bintoro commented Mar 3, 2015

I understand the concern. Let's not veer off the main point because for such cases the spec already has it laid out:

A valid JSON API document must consist of data objects. A wholly dependent complex attribute that doesn't have an ID cannot, on its own, constitute a valid data object. There is absolutely no way the base spec permits the kind of granular access you're worried about.

However, what must be resolved is how to access the outbound links in such an attribute. For purposes of addressing, I don't see an issue with a complex attribute having a URL structure.

The way I would go about it is to return a 204 or 403 at /resource/:id/complexattr but make the links available normally at /resource/:id/complexattr/links/*.

@ahacking, are you also opposed to this? (EDIT: as a recommendation — see next comment)

@ethanresnick, the same principle applies to your modified proposal:

The bigger concern, I think, is that this rename would imply that it should be possible to do GET /people/1/linked and have that return all the linked resources.

I don't think it implies anything. There's absolutely no rule saying all segments along a URL path must be directly accessible.

We already have an example of this today. Fetching /people/1/links would have to produce a links object or perhaps an array of link objects, but neither is valid for primary data. The client is supposed to always target an individual relationship.

The bigger problem with the rename is that it reintroduces the "links"/"linked" confusion and leads to ugly URLs because you'd have to repeat "linked" at each level; consider /posts/1/linked/comments/23/linked/user vs. /posts/1/comments/23/user today.

@bintoro
Contributor

bintoro commented Mar 3, 2015

One more thing: it's important to keep in mind that the resultant URL structure in the "nested links" scheme is nothing more than a logical extension of the recommended URL structure. The spec doesn't mandate any specific format, so an application can freely devise its own URL plan.

For example, to expose all links directly under /people/1/links:

{
  "data": {
    "type": "people",
    "name": "Victor Mayonnaise",
    "address": {
      "street": "12 Mean Street",
      "links": {
        "city": {
          "self": "http://example.com/people/1/links/address-city",
          "type": "cities",
          "id": "123"
        },
        "state": {
          "self": "http://example.com/people/1/links/address-state",
          "type": "states",
          "id": "12"
        },
        "country": {
          "self": "http://example.com/people/1/links/address-country",
          "type": "countries",
          "id": "1"
        }
      }
    }
  }
}

@ethanresnick
Member Author

@bintoro Your comments clear a bunch of things up. Between all the changes in RC2 and the fact that I've been out of this spec for a while until recently, I'm still reorienting myself to a bunch of the details. E.g. in the above discussion, I was operating under the assumption that GET /people/1/links was a valid request whose response was already defined in the spec. (I've updated my earlier comments accordingly, so this thread is easier to read.)

Given, then, as you point out, that we already have some recommended urls for which dropping segments from the end produces an invalid url, I'm fine with this same principle applying to links from complex attributes. So we'd be recommending that GET /<type>/<id>/links/<attr> and GET /<type>/<id>/<complexattr>/links/<attr> be valid requests, even though GET /<type>/<id>/links, GET /<type>/<id>/<complexattr>/links, and GET /<type>/<id>/<complexattr> would 404. I'm fine with that!
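As a rough illustration of that URL rule, here is a Python sketch that derives the recommended relationship paths from a nested-links resource and answers 404 for everything else. The helpers `relationship_paths` and `status_for` are hypothetical, the sketch only handles object nesting (no arrays), and a real server is free to use a different URL plan.

```python
from typing import Any


def relationship_paths(value: dict[str, Any], base: str) -> set[str]:
    """Collect <base>/links/<name> for every "links" object in the tree."""
    paths: set[str] = set()
    for key, member in value.items():
        if key == "links" and isinstance(member, dict):
            # Each member of a links object is one addressable relationship.
            paths.update(f"{base}/links/{name}" for name in member)
        elif isinstance(member, dict):
            # A complex attribute: descend, extending the URL path.
            paths.update(relationship_paths(member, f"{base}/{key}"))
    return paths


def status_for(path: str, valid: set[str]) -> int:
    """Only individual relationships are addressable; prefixes are not."""
    return 200 if path in valid else 404


person = {
    "id": "1",
    "type": "people",
    "links": {"employer": {"type": "companies", "id": "2"}},
    "address": {
        "street": "12 Mean Street",
        "links": {"city": {"type": "cities", "id": "123"}},
    },
}

valid = relationship_paths(person, "/people/1")
```

With this resource, `/people/1/links/employer` and `/people/1/address/links/city` are valid, while `/people/1/links`, `/people/1/address/links`, and `/people/1/address` all fall through to 404.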

So I think we just go with the nested links option.

There's one more case I want to consider, though: the case where the complex attribute holds an array of objects, each of which links out. For example, take a slightly-simplified version of the team-members case from my original post, which, with the nested links strategy, might look like this:

{
  "id": "1",
  "type": "projects",
  "name": "My awesome github project",
  "team-members": [{
    "canPush": true,
    "links": {
      "person": {
        // link object
      }
    }
  }, {
    "canPush": false,
    "links": {
      "person": {
        // link object
      }
    }
  }]
}

In this case, what are the recommended urls for those link objects?

EDIT: We could build a url that uses the team member's index and, in REST terminology, that would be a sensible resource. (E.g. "/projects/1/team-members/0/links/person" would be the resource for "the link object for the person property of the first team member of project 1"). But, I'm not sure that such a url would be useful, given that the entity backing that resource is subject to change in unintuitive ways. So maybe we could recommend providing no relationship url at all? Or maybe we can find a scheme to recommend that better picks out the desired link object entity?

@bintoro
Contributor

bintoro commented Mar 3, 2015

Interesting example, @ethanresnick. My first instinct is the same, i.e. to use the array index.

As for recommendations: since complex attributes may contain arbitrary structures, I think we should either refer to the JSON Pointer spec on how to construct the URL path or say nothing. Applying the JSON Pointer transformation would yield the same /team-members/<index>/person/links/... result and cover everything else too.

In any case, the discussion about addressing stuff inside complex attributes should definitely come with a clear warning about mutability. An index or a key is not a forever-unique ID. Good point there.
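A minimal Python sketch of both points, using the escaping rules from the JSON Pointer spec (RFC 6901: "~" becomes "~0" and "/" becomes "~1"). The helpers `to_pointer` and `resolve` are hypothetical, and the ids "a" and "b" are placeholders added so the two team members can be told apart.

```python
from typing import Any, Union


def to_pointer(tokens: list[Union[str, int]]) -> str:
    """Build an RFC 6901 JSON Pointer from a list of keys and array indexes."""
    escaped = (str(t).replace("~", "~0").replace("/", "~1") for t in tokens)
    return "".join("/" + token for token in escaped)


def resolve(document: Any, pointer: str) -> Any:
    """Resolve a JSON Pointer against parsed JSON (dicts and lists)."""
    value = document
    for raw in pointer.split("/")[1:]:
        # Unescape in the order the RFC requires: "~1" first, then "~0".
        token = raw.replace("~1", "/").replace("~0", "~")
        value = value[int(token)] if isinstance(value, list) else value[token]
    return value


project = {
    "id": "1",
    "type": "projects",
    "team-members": [
        {"canPush": True, "links": {"person": {"type": "people", "id": "a"}}},
        {"canPush": False, "links": {"person": {"type": "people", "id": "b"}}},
    ],
}

pointer = to_pointer(["team-members", 0, "links", "person"])
before = resolve(project, pointer)

# Removing the first team member shifts every later index down by one,
# so the same pointer now picks out a different link object.
del project["team-members"][0]
after = resolve(project, pointer)
```

The pointer string never changes, yet `before` and `after` refer to different people, which is the mutability hazard any recommendation here would have to warn about.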

Another thing I'd like the final spec to spell out more clearly is the principle that a client can't make any assumptions about URLs.

If the team members' links structures didn't have self members, then either

  • the links are not addressable, or
  • the client and server follow some application-specific convention.

Either way, no problem.

Or maybe we can find a scheme to recommend that better picks out the desired link object entity?

In the absence of IDs it's quite simply impossible to guarantee you're targeting the right entity. A client can fetch a link URL, but by the time it issues a request to it, the parent resource may have moved. Unfortunately there's nothing JSON API can do about it, so it's up to the application to prevent such race conditions somehow.

@ethanresnick
Member Author

@bintoro I totally agree with all that! JSON Pointer, plus a warning about mutability, sounds good.

@dgeb I think @bintoro and I are in agreement here on how links from complex attributes should work. The final payload we're proposing looks like the below. Does this look good for submitting a PR?

(I've used non-numeric ids below just to make it clearer what's linking to what without having to add comments.)

GET /projects/ember

{
  "data": {
    "id": "ember",
    "type": "projects",
    "name": "Ember",
    "links": {
      "parent-organization": {
         "self": "http://example.com/projects/ember/links/parent-organization",
         "resource": "http://example.com/projects/ember/parent-organization",
         "type": "organizations",
         "id": "tilde"
      }
    },
    "team-members": [{
       "startDate": "2014-09-01T00:00:00.000Z",
       "isLead": true,
       "links": {
          "member": {
            "self": "http://example.com/projects/ember/team-memberships/0/links/member",
            "type": "people",
            "id": "ykatz"
          }
       }
    }, {
       //more team members, including ones who aren't leads and who may have an `endDate`
    }]
  }
}

@steveklabnik steveklabnik added this to the 1.0 milestone Mar 5, 2015
@dgeb dgeb removed this from the 1.0 milestone Mar 6, 2015
@dgeb
Member

dgeb commented Mar 6, 2015

This could be an additive change post-1.0.

@bintoro
Contributor

bintoro commented Mar 6, 2015

This could be an additive change post-1.0.

But not very easily unless reserved keys are prefixed as suggested in #313. Linking from within complex attributes in the way proposed here means that links becomes a reserved key in any JSON object nested within a complex attribute. It's a backward-incompatible change unless we go with @links etc.
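To show where the incompatibility would bite, here is a Python sketch that scans a complex attribute for nested members that already use the reserved name. The `reserved_key_conflicts` helper and the sample `profile` attribute are hypothetical and are not taken from the spec or from #313.

```python
from typing import Any


def reserved_key_conflicts(value: Any, path: str = "", reserved: str = "links") -> list[str]:
    """List the paths of nested members whose name equals `reserved`."""
    conflicts: list[str] = []
    if isinstance(value, dict):
        for key, member in value.items():
            here = f"{path}/{key}"
            if key == reserved:
                conflicts.append(here)
            conflicts.extend(reserved_key_conflicts(member, here, reserved))
    elif isinstance(value, list):
        for index, member in enumerate(value):
            conflicts.extend(reserved_key_conflicts(member, f"{path}/{index}", reserved))
    return conflicts


# Hypothetical pre-existing attribute whose author used "links" for plain data.
profile = {
    "bio": "Writes about APIs",
    "links": ["http://example.com/blog"],
    "contact": {"email": "victor@example.com"},
}

conflicts = reserved_key_conflicts(profile)
prefixed = reserved_key_conflicts(profile, reserved="@links")
```

Under the unprefixed scheme the existing "links" member would suddenly be read as a links object, whereas a prefixed name such as "@links" leaves this data untouched.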

@tkellen
Member

tkellen commented Jul 7, 2015

Per our contribution guide, I am closing this discussion here because it is not on our roadmap for 1.1.

I encourage you to continue this dialog @ http://discuss.jsonapi.org.

We appreciate the suggestions greatly, but the core team only has so much bandwidth to deal with new feature requests. Because of this, we are focusing our efforts on empowering the community to create extensions without our direct involvement (until a critical mass of adoption makes it clear that something like this belongs in the base specification).

@ethanresnick
Member Author

/ref #794

@ethanresnick ethanresnick added this to the JSON-API 1.2-beta milestone Apr 27, 2016
@seanrucker

Sounds like this is to be supported through an extension? Has anyone started on that?
