Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial updates using PUT? #291

Closed
hkroger opened this issue Oct 15, 2014 · 23 comments · Fixed by #408
Closed

Partial updates using PUT? #291

hkroger opened this issue Oct 15, 2014 · 23 comments · Fixed by #408
Milestone

Comments

@hkroger
Copy link

hkroger commented Oct 15, 2014

See this:
http://jsonapi.org/format/#crud-updating-attributes

It allows partial update of resources using PUT but for example here it is mentioned that PUT is for complete replacement of the document: https://tools.ietf.org/html/rfc5789

I would love to have PUT for also partial updates but is that against the standards?

@steveklabnik
Copy link
Contributor

Yes, I agree with you, but we made the decision to allow it because nobody else follows this rule :/

It's the one thing in the spec that I'm not happy about, but I care about people using things more than anything else.

PATCH works just fine if you want to truly conform to RFC 5789.

@hkroger
Copy link
Author

hkroger commented Oct 15, 2014

I understand, thanks!

@bintoro
Copy link
Contributor

bintoro commented Feb 21, 2015

Btw, this situation just got worse with RC2. Due to all the MUSTs, building an implementation that conforms to the relevant RFCs is now actively forbidden.

I still haven't seen a compelling explanation for why, in 2015, PUT should continue to be confounded with PATCH. In implementing an API, associating HTTP methods with CRUD actions is a rather trivial task compared to the total workload. JSON API is a new specification and thus shouldn't be burdened with outdated practices if they are easy to change.

The actual definition of PUT is unlikely to change, so this would be a great opportunity to nudge the community in the right direction by encouraging conformant usage. It's funny how everyone says they would like to do PUT right but can't because "everyone else" is doing it wrong.

@hhware
Copy link
Contributor

hhware commented Feb 21, 2015

I would also vote for separation of PUT and PATCH as full replacement vs partial update.

One of the reasons is that I would like to be able to require the API client to supply If-Match request header and to verify that it matches the current version of the resource on full replacements, but make If-Match optional on partial updates (i.e., if the client supplies If-Match, it is honored). Then the client could have the choice of safe way of PUT (if the client does not supply If-Match or supplies an outdated version of the resource in it, error is returned) and less safe way of PATCH (the client can decide what the policy is). But how can I do it if PUT is both full replacement and partial update? Making this decision based on the fact whether all necessary resource attributes are supplied in PUT request will just confuse everyone.

However, given that "nobody else follows this rule", perhaps JSON API could at least provide an option for the API designers to follow the rule? For example, provide an extension which, if used, only allows full replacements via PUT. The server can then require the client to use this extension by responding to requests w/o extension with error. Those who do not want to follow the rules can just ignore the extension.

IMHO, any such extension would only make sense if it is official and supported by major implementations.

@ahacking
Copy link

The rfc doesn't say what a resource is.

The basic principle is you should be able to PUT what you GET. So if you can fetch a resource or collection they are both equally a resource as far as the RFC is concerned. The RFC says resources may overlap with other resources at other endpoints so it is anticipated in the RFC that you can perform both bulk and partial updates of resources that an endpoint overlaps with and even have effects on other resources (specifically it mentions versioned resource side effects)

I would prefer json api NOT have the current default behavior that is "all unspecified attributes are assumed to be part of the PUT". That breaks the GET/PUT dogfood rule.

If you want to partially update then use the attribute and related resource inclusion params that json api provides to scope the resources to just the attributes (and/or related resources) you want to effect. That makes GET/PUT round trip symmetrical and follows the RFC.

When you pass query params to scope the attributes affected you are referring to a different resource as far as the RFC is concerned. That is why GET returns a different representation (ie without certain attributes or with certain additional linked resources). You should be able to PUT what you GET.

So given its a different resource we can preserve PUT semantics AND achieve both partial and bulk update of the resource(s) that the scoped resource overlaps with. We remain 100% in line with http semantics without having to use PATCH.

@bintoro
Copy link
Contributor

bintoro commented Feb 22, 2015

I would prefer json api NOT have the current default behavior that is "all unspecified attributes are assumed to be part of the PUT". That breaks the GET/PUT dogfood rule.

This is my gripe exactly, particularly since that behavior is now a MUST. The main spec gives the following rationale:

[...] the server must interpret the missing attributes in some way. Choosing to interpret them as null values would be just as arbitrary as choosing to interpret them as containing their current values [...]

But this argument doesn't hold water. It's like saying MySQL REPLACE is being arbitrary because it's not an alias of UPDATE.

Nobody's proposing that the missing attributes should be assigned a particular value like null. Missing attributes are just that — missing. The server is supposed to interpret them according to its normal rules regarding resource creation (as with POST). An unspecified attribute often ends up being stored as some sort of null internally, but it may just as well receive some field-specific default value.

When you pass query params to scope the attributes affected you are referring to a different resource as far as the RFC is concerned. [...] We remain 100% in line with http semantics without having to use PATCH.

This is actually very insightful, but from an implementation point of view wouldn't it be considerably easier just to switch the method from PUT to PATCH? Why must PATCH be avoided? I realize there are old clients that do not support it, but that can always be circumvented.

@ahacking
Copy link

I have nothing against PATCH per se, just that its often not needed when you have query params that give you scoping. My view is pretty simple, whatever a GET can return should be acceptable as a PUT for modifying things and therefore can target collections of resources and/or partial views of collections or just a single object (whatever you conceive it to be). I am also in the camp that supports PUT being able to create resources when sufficient information is provided ( ie. when using client assigned identifiers such as v4 uuids). I tend to not use POST in my apis as I prefer resource creation to be idempotent. A resource gets created if it doesn't already exist and updated if it does. This works well for my use cases where I may have a graph of objects, some new, and some already existing which may have changes. My client doesn't have to worry about what is new vs existing or what may or may not have been applied before the connection dropped. It just tries again. PATCH requires more intimate knowledge of the current state of the resource in order to formulate a delta. Yes a PUT can fail if it has been modified or the etag has changed and you are using those conditional headers but I prefer the optimistic approach of PUT and then a fallback refresh if required.

I think PATCH works better for a fire hose type endpoint where you can both consume and target arbitrary resources, but again it requires knowledge of current state on the server to formulate an accurate delta.

@dgeb
Copy link
Member

dgeb commented Feb 22, 2015

@ahacking I like the symmetry of the GET / PUT approach you've suggested that employs the fields parameter to scope the fields. I'm going to reopen this issue for further discussion.

@dgeb dgeb reopened this Feb 22, 2015
@dgeb dgeb added this to the 1.0 milestone Feb 22, 2015
@dgeb
Copy link
Member

dgeb commented Feb 22, 2015

Let me repeat my comment from another issue on this topic:

I'd like to clarify the rationale for PUT-as-update instead of PUT-as-replace because I'm aware that it is somewhat controversial.

Basically, it comes down to the rather nebulous definitions of "whole resources" that should be replaced. This is especially an issue for relationships. Should every to-one and to-many relationship also be replaced along with attributes? I think this would be surprising to many clients, who might expect only attributes, or perhaps only attributes and to-one relationships, to be replaced. How about relationships that have inverses?

We've avoided much of this murky territory in JSON API by spec'ing PUT-as-update. This also simplifies implementations that don't also need to support both PATCH and PUT to be compliant with the base spec.

If we were to require fields to scope the fields of an update, then we'd of course have to consider what is appropriate for PUT requests that do not include the fields parameter. I still think this gets murky, especially regarding resetting of relationships.

@ahacking
Copy link

Thanks Dan, its also not just fields but includes of linked resources as well. I've found it an incredibly useful approach to get transactional idem-potency. You PUT a graph of things that need to exist together, which happens a lot when you care about what goes in and your DB has both referential integrity and other check constraints on the sanity of the data.

I intend to open another issue, but ideally I would like fields and linked resources merged in the url params, similar to how odata does it. I believe I provided links in another issue back a few months ago that showed the approach odata uses for controlling what information gets returned. From memory it was just a list of fields and relations and parentheses for controlling what is returned in the related thing. For example requesting a Post with related Author information: ``select=title,createdAt,body,author(currentName(personalTitle,firstName,lastName),avatarUrl)`.

Sorting/ordering can use a similar approach with a leading +/- to allow ordering on specific fields and/or related resource fields.

@ahacking
Copy link

If we were to require fields to scope the fields of an update, then we'd of course have to consider what is appropriate for PUT requests that do not include the fields parameter. I still think this gets murky, especially regarding resetting of relationships.

Whatever a GET returns is fair game for the PUT ... if you want GET to include ID's of related things by default then ID's of related things are in scope by default for the PUT. I would suggest perhaps anything to do with related resources should be either asked for explicitly or excluded explicitly (I don't really have a preference either way if there is a wildcard inclusion/exclusion for either case).

@bintoro
Copy link
Contributor

bintoro commented Feb 22, 2015

If we were to require fields to scope the fields of an update, then we'd of course have to consider what is appropriate for PUT requests that do not include the fields parameter. I still think this gets murky, especially regarding resetting of relationships.

Again, it's only murky until it's defined.

If PUT were to target whole resources only, then with just a little bit of hand-waving it might be possible to define relationships as falling outside of the state to be replaced (unless explicitly included in the data). After all, relationships are things that do not necessarily correspond to physical attributes on the actual entities.

If scoped PUTs are supported, we have to look at things more rigorously. The RFC places great emphasis on the representation as the interface between the client and the server. This seems to necessitate that any relationship that is part of the full representation must be considered "in scope" when no fields is present. In other words, if an association is exposed as a field on the resource, then it is targeted by a full PUT, but if it's only expressed on the other side of the relation, then it should be left intact.

I would be happy to be wrong here because this makes things a bit awkward. Consider a client that is only equipped to deal with an A-to-B relation on the A side. If the server happens to expose the relationship on the B side as well, then the client cannot PUT to B without inadvertently removing the association.

Also, what's included in a representation might change over time, making clients that deal with a fixed set of fields unsafe to use. Clearly there needs to be some mechanism to perform a PUT that excludes relationships. I would strongly prefer this as the default, but unfortunately it seems nonstandard.

@ahacking
Copy link

@bintoro honestly I don't think it matters what the default is but whatever it is it would be nice to have a way to scope either inclusion or exclusion of all relations or fields if you need to.

Whatever a GET returns can be PUT without loss, if GET returns relationships by default then PUT will consider both fields and relationships by default. If GET just returns fields then PUT by default will only consider fields. The real difference is what is implied when a relation is not present in the PUT. The default GET behavior implies what an unscoped PUT can affect. It may be prudent to return an error on any attempt to PUT a relation or field if the scope would not permit it being returned in a GET. This seems necessary to preserve the symmetry and the fact that default scope vs some other scope is actually referring to a different resource with a different representation as far as http is concerned.

I lean towards GET saying nothing about related to-many linked resources because I prefer to GET related resources on the many side using a foreign key, i.e GET /comments?post_id=xx since it scales better as cardinalities grow over time. On the to-one/belongs-to side of a relation I prefer to pass the relation by default because thats the side I use to manage relations. This probably differs from what many people do but its closer to how relational databases work and achieve scalability.

But it really doesn't matter what the default is, the essential thing is that you can PUT what you GET whatever it returns. We do need some rigor around setting fields or relations that fall outside the default or specified scope, and as I suggested, I think this should return an error.

@bintoro
Copy link
Contributor

bintoro commented Feb 22, 2015

@ahacking I think we're both trying to make essentially the same point. If GET /foo/1 returns a set of relationships, then the client must take care to include all of those relationships when issuing a PUT, or else they will be deleted. This is something the RFC is quite clear about.

PUT is very robust, but for the aforementioned reason it can also be dangerous if the client isn't sophisticated enough to always keep track of what constitutes a full representation of a resource.

This is why I'm advocating PATCH alongside PUT. It's the only way to replicate what JSON API PUT does today (i.e., update only a subset of fields based on what is present in the data).

@ahacking
Copy link

@bintoro I have no issue with PATCH being supported alongside PUT and I certainly wasn't proposing to prevent PATCH.

I can see many cases where clients may prefer PATCH as their method of update because of the way they track changes internally. I believe this aligns with Dan's own orbit.js library as it uses patches as its method of syncing between logical stores although what it sends to a server does not have to be a PATCH it does align with using PATCH directly.

As for clients not remembering what they GET, I feel that is a fundamental problem we shouldn't be trying to solve directly since it is difficult for such a client to do anything sensible in accordance with HTTP semantics. However with the ability to explicitly scope in or scope out all relations via a wildcard mechanism I think even this is a non issue, and given that PATCH is still there I don't think there's a problem.

@ahacking
Copy link

@bintoro you can also scope your PUT to just the fields you are updating and still not have to use PATCH. Eg. /comments/1?fields=a,b,c and only those fields will be considered in scope for the PUT.

@dgeb
Copy link
Member

dgeb commented Feb 22, 2015

So JSON API's Patch extension does support the PATCH method with the JSON Patch format. As @ahacking mentioned, this does align quite well with the approach I'm using internally (and optionally externally) in orbit.js.

But supporting PATCH with two different formats in the same spec (or rather base spec + extension) has great potential for confusion.

Avoiding an alternative PATCH format in the base spec is one reason why scoped PUT is so appealing.

@bintoro
Copy link
Contributor

bintoro commented Feb 22, 2015

/comments/1?fields=a,b,c and only those fields will be considered in scope for the PUT.

I realize how it works and, just to be clear, I fully support this scheme. Not only is it extremely RESTful, it's explicitly suggested by the HTTP spec:

Partial content updates are possible by targeting a separately identified resource with state that overlaps a portion of the larger resource [...]

My attachment to PATCH stems only from its extreme simplicity, and it should go without saying that simplicity promotes adoption.

For instance, a client might not have any use for fields when GETting but would still like to do partial updates. Also, when adapting an existing system to JSON API, it's one thing to flip a switch and go from PUT to PATCH, but it's quite another to begin delimiting resource boundaries based on client-supplied fieldsets.

But supporting PATCH with two different formats in the same spec (or rather base spec + extension) has great potential for confusion.

Does it, really? I might agree if JSON Patch was still a part of the base spec, but in the current organization I just don't see it.

However, this is something of a non-issue now that RC2 no longer mandates that PATCH requests must contain a JSON Patch payload. As far as I can see, a conformant server is free to provide key–value PATCH if it pleases (precisely what I had in mind when filing #327).

@ahacking
Copy link

I support #327 and think that it is a reasonable approach to migrate current PUT semantics into PATCH. It certainly seems to be the right http verb to use. My only remark is that if scope is provided or defaulted via fields/includes it should be respected and generate an error on any attempt to modify an attribute or relationship outside of that scope. By definition changing such fields or relations can never be part of the resource representation being PATCHed.

No one is forcing use of PATCH or indeed the json-api media type for use with PATCH. I don't find it confusing and I feel the spec is better aligned with http semantics as a result of scoping PUT instead of the patch like semantics. The use of #327 rounds out the spec and as a whole the spec is better anchored to the http RFC. I also believe if the spec doesn't define PATCH semantics for the json-api media type people will be creative anyway...

@steveklabnik
Copy link
Contributor

My position is basically the same as in the 'resource' thread. Yes, PUT shouldn't be used for partial updates. Yes, thousands upon thousands of people ignore the idempotency aspect and use PUT for partial updates. Using PATCH would be wonderful, but that also introduces another media type, of which there is only one or two, which has its own set of problems.

@bintoro
Copy link
Contributor

bintoro commented Feb 26, 2015

@steveklabnik As for the 'resource' thread, I agree that JSON API's concept of resource is perfectly fine. It's only the choice of the term that may cause confusion as evidenced there, and that's just a matter of documenting the difference vis-à-vis HTTP.

But that position doesn't force our hand in the PUT matter one way or another. No one's existing system is going to become magically JSON API -compatible anyway, so why is switching from PUT to PATCH supposedly such a big pain?

PUT will still be useful for the same thing, as it allows more robust partial updates via delineation of the target subresource using the fields param.

but that also introduces another media type, of which there is only one or two, which has its own set of problems

I don't understand what this means.

@dgeb
Copy link
Member

dgeb commented Mar 6, 2015

We understand the uproar and are making an 11th hour switch: PATCH will be used instead of PUT in the base spec.

The "Patch Extension" will be renamed the "JSON Patch Extension" to disambiguate the PATCH method from the JSON Patch format.

@ethanresnick
Copy link
Member

Yay! Happy dance.

@tkellen tkellen mentioned this issue Mar 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants