Introduce Local IDs to v1.1 #1244

Merged
merged 1 commit into from
Apr 23, 2020

@dgeb
Member

dgeb commented Dec 9, 2017

A client may include a "local ID" as a lid member in a resource object to
uniquely identify the resource within the request document. Every representation
of that resource, whether as a resource object or resource identifier object,
must then include the matching lid member and value.

When a server receives a request document that contains resources with local
IDs, the server must include the matching lid member and value in every
representation of that resource or resource identifier in the response document.

This addition paves the way for requests of all kinds that may need to establish
linkage between resources that have not yet been assigned a server-generated ID.
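
As an illustration of the rule above (not part of the PR itself), here is a minimal Python sketch that walks a request document and gathers every `(type, lid)` pair, then reports linkage that points at a `lid` no resource object in the document defines. The helper names `collect_lids` and `dangling_lids` are hypothetical, the traversal only covers `data`, `included`, and relationship linkage, and treating a dangling `lid` as an error is an assumption rather than something the proposal states.

```python
def _as_list(value):
    """Normalize a JSON:API `data` value (object, array, or null) to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def collect_lids(document):
    """Return (defined, referenced) sets of (type, lid) pairs.

    `defined` comes from resource objects in `data` and `included`;
    `referenced` comes from resource identifier objects in relationships.
    """
    defined, referenced = set(), set()
    resources = _as_list(document.get("data")) + _as_list(document.get("included"))
    for resource in resources:
        if "lid" in resource:
            defined.add((resource["type"], resource["lid"]))
        for relationship in resource.get("relationships", {}).values():
            for identifier in _as_list(relationship.get("data")):
                if "lid" in identifier:
                    referenced.add((identifier["type"], identifier["lid"]))
    return defined, referenced


def dangling_lids(document):
    """Local IDs used in linkage that no resource object in the document defines."""
    defined, referenced = collect_lids(document)
    return referenced - defined
```

A server or client library could run a check like this before processing, since a `lid` is only meaningful within the document that carries it.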


Note: This proposal obviously overlaps with a portion of #1197, but I'd like to more fully integrate the concept of local ids in the spec without restricting them to a section on "side posting".

Note: I'm proposing the name lid after considering negative feedback to my key proposal in #1197. I favor a name that references either "local" or "document" instead of "temporary", because this field does not really seem to have a temporal component to it. I also don't like squishing multiple words into one (like tempid) but an abbreviation like lid is different and consistent with id.

@ethanresnick
Member

👍

Thanks for writing this up. lid as a name seems fine.

I just want to note the connection between the approach here and #1245: adding lid to existing resource identifier objects only works if we have a feature negotiation strategy in place; otherwise, the server will ignore the lid while continuing to process the request as valid, which (I think...) could produce the wrong result in an unacceptable way. That's why the proposal in #1197 defined an entirely new resource identifier format, i.e., something like:

{ "lid": "...." /* no id key here; maybe no type either, but that doesn't matter */ }

This makes lids basically an optional feature that can hard fail if not supported, analogous to how the base spec handles unsupported optional features like ?sort and ?include.

Again, a new format like that may not be necessary; I'm just highlighting the dependency on #1245 in its absence.
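
The concern above can be sketched in a few lines of Python. This is an illustration under assumptions, not code from any JSON API library: `parse_identifier` is a hypothetical strict parser for a server that knows nothing about `lid`. An identifier that carries only a `lid` fails loudly, while one that carries both `id` and `lid` is accepted with the `lid` silently dropped, which is the silent-ignore case described above.

```python
def parse_identifier(identifier):
    """Strict parser for a lid-unaware server: requires `type` and `id`.

    Hypothetical helper; any `lid` member is simply not looked at.
    """
    missing = [key for key in ("type", "id") if key not in identifier]
    if missing:
        raise ValueError(f"resource identifier is missing {missing}")
    return identifier["type"], identifier["id"]


# A lid-only identifier cannot be mistaken for a valid one.
try:
    parse_identifier({"type": "person", "lid": "a"})
    rejected = False
except ValueError:
    rejected = True

# With `id` present, the unknown `lid` member is silently ignored.
accepted = parse_identifier({"type": "person", "id": "1", "lid": "a"})
```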

@sandstrom
Contributor

sandstrom commented Dec 11, 2017

@dgeb Very much in favour! 🏅

This should enable implementors such as Ember Data to solve some issues around nested resources. May also be useful as an alternative to JSON pointers.

@dgeb
Member Author

dgeb commented Feb 2, 2018

I've updated this PR to introduce the concept of a top-level identities array, which allows correlation between lids in a request with ids in a response.

By including an identifier in this array, a client can guarantee that the server will return it, regardless of the shape of the data in the response.

@sandstrom
Contributor

@dgeb Awesome improvement! 😄

Easy to parse for clients (that need to pair up lids with ids), and conceptually simple (i.e. easy for people to understand).

@dgeb
Member Author

dgeb commented Feb 10, 2018

While the identities array is necessary to guarantee that clients can always receive server-generated IDs in #1197 (side-posting), it is not strictly necessary in #1254 (operations). Thus, I've removed this commit for now to keep this PR as lean as possible and will rebase #1254 to use this.

I've also reworded some of the language in this PR as per suggestions from @courajs in #1254.

I think I will let the optional identities array stand on its own as a separate PR if there is strong support for it, but I think it may cause more confusion than good until we resolve more fundamental issues.

@sandstrom sandstrom mentioned this pull request Feb 12, 2018
together with `type`, uniquely identifies the resource _locally_ within the
document. If a [resource object][resource objects] contains a `lid` member, then
every representation of that resource, including other [resource objects] and
[resource identifier objects], **MUST** also contain the `lid` member with the
Contributor

This [resource identifier objects] fails to link to the correct section. Should this instead be

[resource identifier objects][resource identifier object]

A [resource object][resource objects] **MAY** contain a `lid` member that,
together with `type`, uniquely identifies the resource _locally_ within the
document. If a [resource object][resource objects] contains a `lid` member, then
every representation of that resource, including other [resource objects] and

missing word: "...of that resource, including within other..."

Contributor

I don't think that word is missing? Because:

including other ROs and RIOs must also contain […]

("including within other ROs and RIOs must also contain" does not sound right to me)


The full sentence:

If a resource object contains a lid member, then every representation of that resource, including other resource objects and [resource identifier objects], MUST also contain the lid member with the same value.

I am suggesting clarifying that the other resource objects are referring to the subject of the sentence, not that they themselves are other versions of it (which is prohibited by JSON API).

Here's another go, even more specific:

If a resource object contains a lid member, then every representation of that resource, including referential representations contained within other resource objects and [resource identifier objects], MUST also contain the lid member with the same value.

Contributor

@wimleers wimleers left a comment

Just pointing out a few potential clarifications.

P.S.: after reading "lid member" many times now, I can't help but point out something funny: "member" in Dutch is … "lid" 😄

_format/1.1/index.md

-A "resource identifier object" **MUST** contain `type` and `id` members.
+A "resource identifier object" **MUST** contain a `type` member. A resource
+identifier object **MUST** also contain an `id` member, except when it contains
+a `lid` that identifies it as matching a new resource to be created on the
Contributor

Should this then state explicitly that lids can only exist in request documents, not in response documents?

Contributor

@sandstrom sandstrom Feb 14, 2018

@wimleers lid can exist both in the request and the response object (that's sort of their purpose)

Contributor

Right, to communicate to the client what the server-generated id for a client-specified lid is. Of course. Silly me!
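
For readers who prefer code, the rule in the hunk under review in this thread (a `type` is always required; an `id` is required unless a `lid` stands in for it) can be sketched as a small validator. This is a minimal Python illustration with a hypothetical function name, `identifier_errors`; it does not check whether the `lid` actually matches a resource to be created, which the full rule also implies.

```python
def identifier_errors(identifier):
    """Return a list of problems with a resource identifier object.

    Hypothetical helper: `type` is mandatory, and `id` is mandatory
    unless the identifier carries a `lid` instead.
    """
    errors = []
    if "type" not in identifier:
        errors.append("missing `type`")
    if "id" not in identifier and "lid" not in identifier:
        errors.append("missing `id` (or `lid` for a resource not yet created)")
    return errors
```

Returning a list rather than raising on the first problem is a design choice made here so that a caller can report every issue at once.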


-A "resource identifier object" **MAY** also include a `meta` member, whose value is a [meta] object that
-contains non-standard meta-information.
+A "resource identifier object" **MAY** also include a `meta` member, whose value
Contributor

There is no actual change here, just reflowing. This change can/should hence be reverted AFAICT.

@@ -510,7 +538,7 @@ Each member of a links object is a "link". A link **MUST** be represented as
either:

* a string containing the link's URL.
-* <a id="document-links-link-object"></a>an object ("link object") which can
+* <a id="document-links-link-object"></a>an object ("link object") which can
Contributor

There is no actual change here, just reflowing. This change can/should hence be reverted AFAICT.

@wimleers
Contributor

After reading @EndangeredMassa's review at #1254 (review), I started wondering why we even need lids? Why can't we just ask that clients generate UUIDs and specify those as the id for each new resource to be created?

I'm assuming it's for two reasons:

  1. we don't trust that every client generates UUIDs correctly (most importantly: sufficiently random)
  2. we don't want to burden every client with generating UUIDs correctly
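
A minimal sketch of the alternative being discussed, assuming the server accepts client-generated IDs: the client uses Python's standard `uuid.uuid4()` to mint the `id` up front, so a resource can link to itself without any local ID. The function name `new_person` and the attribute values are illustrative only.

```python
import uuid


def new_person(first_name, last_name):
    """Build a resource object with a client-generated UUID as its `id`.

    Hypothetical helper; the self-referencing `bestFriend` relationship
    mirrors the kind of linkage that otherwise needs a `lid`.
    """
    resource_id = str(uuid.uuid4())  # random (version 4) UUID
    return {
        "type": "person",
        "id": resource_id,
        "attributes": {"firstName": first_name, "lastName": last_name},
        "relationships": {
            "bestFriend": {"data": {"type": "person", "id": resource_id}}
        },
    }
```

This only works when the server permits client-generated IDs, which is the limitation the rest of the thread goes on to discuss.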

@dgeb
Member Author

dgeb commented Feb 14, 2018

@wimleers Thanks for reviewing! And yes, you are correct that the entire need for lids goes away if you can depend on client-generated UUIDs. I think we should encourage their use more, although we can't force the issue too much because of the two reasons you gave.

@DLiblik

DLiblik commented Feb 14, 2018

@dgeb @wimleers Hang on - this statement is not true:

...the entire need for lid goes away if you can depend on client-generated UUIDS.

The broader domain problem being solved by lid is linkage on two levels:

  1. Linkage between resources in a request where some resources do not yet have a persisted ID assigned.
  2. Linkage between resources in a request that do not yet have a persisted ID assigned and those same resources in the response where they do have a persisted ID assigned.

By far the most common driver of this is that the server does not allow for client ID generation, not
that the client can't do a good job of random ID generation.
This happens for lots of server-side reasons, some of the more common ones being:

  • incrementing ID (in enterprise systems, this is still the dominant case and is REQUIRED in many systems
    for real-world concerns like financial audits, where, for example, invoice numbers that are sequential
    allow for detection of fraud via missing invoice numbers, etc.)
  • dependent/derived IDs where parts of the information used in ID generation do not exist on the client
  • ID-space distribution (a.k.a. hashing) where the ID is transformed to evenly distribute over a target space
  • ... and so on

You might argue the validity of any of those points from a design perspective, but for now I just
want to be able to wrap those systems in a JSON API interface that is efficient for the client!

@dgeb
Member Author

dgeb commented Feb 14, 2018

@DLiblik I wholeheartedly agree that the spec should continue to be compatible with systems which do not support client-generated UUIDs. I believe the conditional if in my statement keeps it correct, but sorry if it was confusing or misleading.

@DLiblik

DLiblik commented Feb 14, 2018

10-4 @dgeb - and thanks for putting this all together. Btw, we've been doing this "our own way" since 2015 (we tucked a { meta: { _localId: "123" } } on our objects and otherwise followed all the same rules) and I can tell you that this approach does solve the problems it sets out to address! I'm very happy to see it get formalized into the spec.

@krainboltgreene
Contributor

Bit late in, but as I pointed out in another PR this is the first specification name that is truncated. Any reason we shouldn't just call this a local or some other real word?

@sandstrom
Contributor

@dgeb I know you're busy with many other things, but just curious if there are any outstanding issues with this PR or #1268? 😄

@dgeb
Member Author

dgeb commented Jun 24, 2018

@sandstrom there are a few minor details to iron out in #1268. I hope we can land that very soon, and then follow up with local IDs and operations.

@sandstrom
Contributor

@dgeb Just curious, now that profiles (#1268) is merged, are there any outstanding issues here?

Contributor

@sandstrom sandstrom left a comment

Looks good to me!

@jesseclark

I'm concerned about the parameter name 'lid'. It doesn't convey any information about what the param is actually used for. Could we consider changing it to something more meaningful? I don't see any params that are two combined words in the spec anywhere so not sure if that is allowed or handled with hyphens or underscores but what about 'linkage_id', 'local_id', or something else that ties in with the description of what the purpose of the parameter is?

@sandstrom
Contributor

@jesseclark I understand your point, but I actually prefer a succinct name in this case. It's clearly explained in the spec, so it's easy to look up (id itself is an abbreviation of identity/identification).

Somewhat similarly, the spec also contains meta (short for metadata) and prev (for previous).

@jesseclark

jesseclark commented Sep 21, 2018

@sandstrom id has been so long in usage in computing that it is fairly universally understood at this point, lid not so much. If someone is looking at a payload that contains lid they almost have to go look it up, while a name like linkage_id or link_id would at least give some clue to the purpose and seeing that param with the same values in different sections of a json document one could infer that the field linked those sections together. lid increases cognitive dissonance in the spec.

Also, meta and prev are also in somewhat common usage in programming. Both are also abbreviations for one word while lid combines two abbreviations. Would mda or pre have been acceptable alternatives to meta or prev?

@krainboltgreene
Contributor

krainboltgreene commented Sep 21, 2018 via email

@jbescoyez

jbescoyez commented Sep 21, 2018

@jesseclark @krainboltgreene Stop bike shedding please. We really need local ids to land in JSONAPI. IMHO enough time has been spent discussing that. It is just fine regarding the constraints the jsonapi team has set. So, let it go and keep going forward.

@krainboltgreene
Contributor

krainboltgreene commented Sep 22, 2018 via email

@jbescoyez

@krainboltgreene You are right. Sorry for the negative comment above. It does not bring anything good to the conversation. I'm just a bit concerned to re-discuss something that has been already discussed a lot in #1197 and in this PR.

@dgeb dgeb changed the title Introduce Local IDs to v1.1 Introduce Local IDs to v1.2 Dec 3, 2018
@turtleman

The discussion seems to have stalled on the naming of the local id attribute last time. Multiple commenters are saying that lid is not a good name. I'll throw one suggestion into the mix: what if the name is changed from lid to local-id and that's it? Would this proposal be good to go then?

@sandstrom
Contributor

Is there any news on this?

Local IDs will help solve some concrete problems around duplicate data with nested records. For example, this is solved in Ember Data under JSON API ([see thread](https://github.com/emberjs/data/pull/4441#issuecomment-524501809)) using a lid, even though support isn't officially added in JSON API yet.

If the spec could introduce this much-discussed feature I think many would find it useful.

@dgeb
Member Author

dgeb commented Oct 8, 2019

@sandstrom please see #1435 for a discussion of extensions as well as the newly proposed local identities extension in #1436.

I'm going to keep this PR open until we make a final decision about including local identities as an extension vs. directly in the base spec. Either way, we know they're an important addition.

A client may include a "local ID" as a `lid` member in a resource object to
uniquely identify the resource within the request document. Every representation
of that resource, whether as a resource object or resource identifier object,
must then include the matching `lid` member and value.

This addition paves the way for requests of all kinds that may need to establish
linkage between resources that have not yet been assigned a server-generated ID.
@dgeb dgeb changed the title Introduce Local IDs to v1.2 Introduce Local IDs to v1.1 Apr 16, 2020
@dgeb
Member Author

dgeb commented Apr 16, 2020

After exploring the concept of a local identities extension (#1436), @gabesullice and I have come back around to the original approach taken in this PR. We are planning to introduce a lid field that can be used to identify resources uniquely by type, locally within a document, when an id field is not present.

This fills a gap in the base spec in which resources cannot reference themselves in relationships when ids are server-generated. For example:

 POST /people HTTP/1.1
 Content-Type: application/vnd.api+json
 Accept: application/vnd.api+json

 {
   "data": {
     "lid": "a",
     "type": "person",
     "attributes": {
       "firstName": "John",
       "lastName": "Doe"
     },
     "relationships": {
       "bestFriend": {
         "data": {
           "lid": "a",
           "type": "person"
         }
       }
     }
   }
 }
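
To show how a server might consume a request like the one above, here is a minimal Python sketch under stated assumptions: `create_resource` is a hypothetical handler, IDs come from a simple counter standing in for a server-side sequence, and only linkage back to the resource being created is resolved. It replaces the `lid` with the newly assigned `id` everywhere it appears, so the returned document carries no `lid` members.

```python
import copy
from itertools import count


def create_resource(request_doc, id_source):
    """Assign a server-generated id and resolve self-references by lid.

    Hypothetical handler: `id_source` is any iterator of fresh IDs.
    The request document is left untouched.
    """
    resource = copy.deepcopy(request_doc["data"])
    lid = resource.pop("lid", None)
    resource["id"] = str(next(id_source))
    for relationship in resource.get("relationships", {}).values():
        linkage = relationship.get("data")
        identifiers = linkage if isinstance(linkage, list) else [linkage]
        for identifier in identifiers:
            if (
                lid is not None
                and identifier
                and identifier.get("type") == resource["type"]
                and identifier.get("lid") == lid
            ):
                # Swap the local ID for the persisted one.
                del identifier["lid"]
                identifier["id"] = resource["id"]
    return {"data": resource}


request_doc = {
    "data": {
        "lid": "a",
        "type": "person",
        "attributes": {"firstName": "John", "lastName": "Doe"},
        "relationships": {
            "bestFriend": {"data": {"lid": "a", "type": "person"}}
        },
    }
}
response_doc = create_resource(request_doc, count(1))
```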

This will also prove useful for extensions, such as atomic operations (#1437), and undoubtedly many others.

I should also note that, although the updated spec makes no allowance for lid members to be returned in responses (in which every resource should have an id), it is quite possible for extensions to be written that place a requirement on clients and servers to include a map between lids and ids if that's useful to the extension.

Please review the updated language in this PR. We'd like to merge this into the 1.1 spec and get a new draft out soon.

@DLiblik

DLiblik commented Apr 16, 2020

@dgeb I think this is a great outcome. We've been using this approach in practice for several years in a proprietary MERGE extension that we put in place in a number of systems to handle batch (atomic) operations (slightly different syntax, but same idea: a local ID where a global one is not available).

It's worked very well and we have yet to hit a transactional situation that broke down (to be clear, I'm not suggesting any changes to use MERGE as the verb - just indicating our particular flavor of rubber bands and popsicle sticks).

For any transactions that assign a global ID to the object during the server transaction we also return both local and (assigned) global ID fields so that the client can tell which witch is which when reintegrating into client-side state.
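
A rough sketch of the reintegration step described above, written in Python for illustration. It assumes the response resources carry both a `lid` and the assigned `id`, which is this commenter's proprietary convention rather than anything the PR requires, and it uses a hypothetical `reintegrate` helper with client state keyed by `(type, lid)`.

```python
def reintegrate(local_state, response_resources):
    """Re-key pending client records from (type, lid) to (type, id).

    Hypothetical helper; assumes every response resource echoes the
    `lid` it was created from alongside its server-assigned `id`.
    """
    persisted = {}
    for resource in response_resources:
        record = local_state.pop((resource["type"], resource["lid"]))
        record["id"] = resource["id"]
        persisted[(resource["type"], resource["id"])] = record
    return persisted


pending = {("invoice", "tmp-1"): {"total": 100}, ("invoice", "tmp-2"): {"total": 250}}
saved = reintegrate(
    pending,
    [
        {"type": "invoice", "lid": "tmp-1", "id": "1001"},
        {"type": "invoice", "lid": "tmp-2", "id": "1002"},
    ],
)
```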

@sandstrom
Contributor

@dgeb Looks great! 💯

@jbescoyez

We are also using this approach for atomic operations. We even have extracted it so that it is reusable throughout our Ember app. I'm excited to see this landing in the 1.1 specs so that it can be implemented in "official" JSONAPI libs. 🎉

@gabesullice
Contributor

I think we've settled on a limited-scope, well-defined solution. This fills a gap in the base spec, while providing the necessary foundation for new extensions (like atomic operations in #1437). Thanks for the help, everyone!
