What data should go in headers, and what data in the response body #45
Comments
I've never used the range headers for pagination, nor seen them used. Would they cover all the metadata you may need (total count, items per page, current offset)? I get that there's a benefit to being able to get metadata about the response without actually parsing the response body. It just doesn't seem like a huge benefit. |
Is there room for some language merely flagging this for the API producer to consider? |
I'm not personally comfortable recommending something none of us here at 18F have ever used ourselves. If that's not true, and an 18Fer has used it and thinks it's a good idea to include, then let's find a way to at least reference it. |
There are two pieces here:
Don't a lot of us already do the second? Currently, we address basic numbering for errors, but don't explicitly discuss where to put the status code. The code sample implies that it goes in the response body. Why not say that, absent a compelling reason otherwise, there needs to be an HTTP status code in the response headers (per http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html)? |
The suggestion here is to use HTTP status code There's no way to return an HTTP response without a status code. It doesn't make sense to put it "in the response headers" -- it's a core part of an HTTP response. We could potentially add language suggesting the use of response codes commonly used in APIs, like |
Totally agree re abstaining from any recommendation or language on pagination and codes. Can we just make clear that if you're sending a status code in the JSON body, it should match the code in the response header? E.g. no 200s in the header when a 500 is in the JSON body? Or is this too obvious? |
This can be handled in the HTTP status code alone. Having a status code in the JSON body was really only needed over JSONP, where only 200 HTTP status codes would allow clients to see the response. Since we're not using JSONP, this isn't necessary anymore. Related to #45.
Yeah, okay, this inspired me to update the error message body in 41ab481 to remove the status code from the JSON body altogether. That makes the subsequent references to status codes unambiguously about HTTP and not the JSON body. The only reason to have separate JSON body status codes from HTTP status codes is if you are supporting JSONP. All JSONP responses, including errors, need to use 2XX response codes, or else browsers silently drop the callback and the request never completes. So, it's common practice for JSONP error messages to use a 200 response code for errors. We're not using JSONP, so this isn't necessary for us. |
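To make the point above concrete, here is a minimal sketch of an error response whose status lives only in the HTTP status line, with no duplicate code in the JSON body. The function name `error_response` and the body shape `{"error": {"message": ...}}` are assumptions for illustration, not the format used in the standards document.

```python
import json
from http import HTTPStatus


def error_response(status: HTTPStatus, message: str):
    """Build (status_line, headers, body) for an error.

    The status code appears only in the HTTP status line. The JSON body
    (a hypothetical shape) carries a human-readable message and nothing else.
    """
    body = json.dumps({"error": {"message": message}})
    status_line = f"HTTP/1.1 {status.value} {status.phrase}"
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return status_line, headers, body


status_line, headers, body = error_response(
    HTTPStatus.NOT_FOUND, "No dataset with that id."
)
print(status_line)
print(body)
```

With a single source of truth for the status, a 200 status line can never wrap a body that claims a 500.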
👍 Should we link to the HTTP spec/list of status codes in the section where we mention them. E.g.:
|
Hi, I wrote the original comment on HN. I'm not entirely sure of the scope of these API recommendations. My experience comes from creating APIs to support dataset discovery and large-scale batch data processing (i.e. supporting thousands of parallel jobs), so my users aren't the average user. Nonetheless, if an API might be used for large-scale data processing, I do have a lot of experience with that.
Headers are typically meant to be metadata about the request and/or the underlying resource (endpoint) itself. I don't believe everything should be packed into the headers by any means, but I believe a lot of value can be added by including headers that are actionable in nature in a response (e.g. caching).
For the Range header/partial content thing, I think it's probably something to think about, but I wouldn't take it to be strict at all. My experience has been providing APIs specifically for the return of (physics) datasets from a system akin to a virtual file system, where the quantity of datasets can easily surpass 10k items and gzipped responses can easily still be several megabytes in size. Furthermore, due to the nature of the metadata attached to my datasets, object size isn't entirely predictable. One of those 10k objects might be 100 bytes long, another might be 1 kB. In my experience, Chrome can occasionally choke on processing JSON that large, and naive JSON parsers aren't always up to the task either. In these cases, it is useful to understand the nature of the response before processing it. For example, another thread could fire up an additional request before the first starts processing, easily saving half a second or more in latency. So maybe it makes sense when returning 10k objects, but it likely doesn't make sense when returning 10 objects (unless they are really, really big objects).
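As a rough sketch of acting on response metadata before parsing a large body, the helper below reads a `Content-Range` header and works out the next range to request. It assumes the server reports pagination with a custom `items` range unit, for example `items 0-99/10000`. That unit and the function name `next_range` are assumptions for illustration, not something this thread specifies.

```python
import re
from typing import Optional


def next_range(content_range: str) -> Optional[str]:
    """Given a Content-Range value like 'items 0-99/10000', return the
    Range value for the next page ('items=100-199'), or None when the
    current page is the last one or the header cannot be parsed."""
    match = re.fullmatch(r"(\w+) (\d+)-(\d+)/(\d+)", content_range.strip())
    if match is None:
        return None
    unit = match.group(1)
    first, last, total = (int(match.group(i)) for i in (2, 3, 4))
    if last + 1 >= total:
        return None  # nothing left to fetch
    page_size = last - first + 1
    next_last = min(last + page_size, total - 1)
    return f"{unit}={last + 1}-{next_last}"


print(next_range("items 0-99/10000"))
```

A client could call this as soon as the headers arrive and start the follow-up request on another thread while the first body is still being parsed.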
For the HTTP status code, I do think that using the standard HTTP status codes provides lots of benefits. However, it is clear some of these status codes were written for a different age, and as such, they can occasionally be awkward. A good example is the This is why you can easily end up with responses where someone returns a
There is some value in this, however. An example being when web applications are behind proxies/load balancers which prefer to return a 4xx or 5xx when the proxies themselves can't find the web application or have issues. This can be semantically confusing to a user if the user only checks the status code, but it could easily be worked around if, for example, the web application typically responds with a header about the application itself which hopefully isn't sanitized by the proxy server (a Server header will often be rewritten with a proxy's Server header, for example). For example: ...Which is a good use case, I believe, for adding metadata about the response in a header. I'd concede many of these recommendations don't always make sense for many APIs, especially low volume, small response, and relatively static APIs. Hopefully this discussion is useful for you guys :) |
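The proxy scenario above might be handled on the client side roughly as follows. This is only a sketch: the header name `X-Application-Name` is hypothetical, standing in for whatever header the web application sets and the proxy leaves untouched.

```python
def error_origin(status: int, headers: dict) -> str:
    """Guess whether an error came from the application or from something
    in front of it, based on a header only the application sets.

    'x-application-name' is a hypothetical header used for illustration.
    """
    app_header = "x-application-name"
    if status < 400:
        return "ok"
    # HTTP header names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in headers}
    return "application" if app_header in names else "proxy-or-unknown"


print(error_origin(502, {"Server": "nginx"}))
print(error_origin(404, {"X-Application-Name": "datasets-api"}))
```

The check is a heuristic: it only helps if the proxy passes the application's header through unchanged.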
I gave this a first read, and will re-read a few more times, parse, and respond. Until then, I just want to thank you for taking the time to share your knowledge and API domain experience with us and the general public, on HN and right here. |
Yeah it was a little long, sorry about that. Maybe a good recommendation on what should go in a header is this: Whenever data can be immediately actionable by a client, server, or a proxy[1] between them, it may be useful to add this data to a header. Examples include:
[1] A proxy is anything which may read and act based on data in a request or response, such as a load balancer, gateway, or filter. |
👍 to this change. I've definitely encountered my fair share of API errors being returned with a 200 response status, so I think this nicely clears up the guidance (since it did seem a bit ambiguous what the intention of the
@adelevie I also like your suggestion of linking off to details about the various HTTP error codes to pick from (since I think developers can often be drawn towards just using something generic like 500 all the time, even if there might be more appropriate status codes to use). However, what about linking off to the Wikipedia page on the topic instead? The RFC 2616 page does go into nice detail, but the Wikipedia page aggregates together some of the other common status codes, like the additions from RFC 6585 (plus, the all-important And thank you @brianv0 for the detailed feedback! There's certainly a fair bit to chew over here. Regarding the specific topic of using |
The folks who have worked on (and are working on) the data platform there may have useful thoughts with regards to paginating large result sets. Pinging @cndreisbach, @marcesher. |
@GUI, shall we link to both? |
From an HN comment:
This seems like it's in the grey area between standard and recommendation, but what does everyone think? cc @konklone @GUI