Allow custom content type for JSON body #190
I thought it was worth properly refactoring this into some helpers for the content type.

Can you take a look at the CI failures please?

Ah, it seems the names of capture groups are only properly supported for recent R versions. Fixed now 😄
R/resp-body.R
Outdated
check_response(resp)
check_installed("jsonlite")
check_content_type(resp,
check_resp_content_type(resp,
There was a problem hiding this comment.
Why can the suffix arguments go away now?
There was a problem hiding this comment.
It isn't needed as the suffix can be determined from the types argument.
There was a problem hiding this comment.
I think that only coincidentally works for json and xml, but isn't generally true: https://www.iana.org/assignments/media-type-structured-suffix/media-type-structured-suffix.xhtml.
There was a problem hiding this comment.
I don't see a counterexample there. Maybe I misunderstood.
There is also an extensive list of media types.
Do you think any of these cases should be treated differently?
devtools::load_all("~/GitHub/httr2/")
#> ℹ Loading httr2
# anything with suffix `+cbor` should be valid
check_content_type("application/cbor", "application/cbor")
#> NULL
check_content_type("application/ace+cbor", "application/cbor")
#> NULL
check_content_type("application/ace", "application/cbor")
#> Error:
#> ! Unexpected content type 'application/ace'
#> ℹ Expecting 'application/cbor' or 'application/<subtype>+cbor'
#> Backtrace:
#> ▆
#> 1. └─httr2:::check_content_type("application/ace", "application/cbor")
#> 2. └─rlang::abort(...) at httr2/R/content-type.R:68:2
# if `valid_type` has <subtype>+<suffix> the content type must have the same structure
check_content_type("application/ace+cbor", "application/ace+cbor")
#> NULL
check_content_type("application/ace", "application/ace+cbor")
#> Error:
#> ! Unexpected content type 'application/ace'
#> ℹ Expecting 'application/ace+cbor'
#> Backtrace:
#> ▆
#> 1. └─httr2:::check_content_type("application/ace", "application/ace+cbor")
#> 2. └─rlang::abort(...) at httr2/R/content-type.R:68:2
check_content_type("application/cbor", "application/ace+cbor")
#> Error:
#> ! Unexpected content type 'application/cbor'
#> ℹ Expecting 'application/ace+cbor'
#> Backtrace:
#> ▆
#> 1. └─httr2:::check_content_type("application/cbor", "application/ace+cbor")
#> 2. └─rlang::abort(...) at httr2/R/content-type.R:68:2
Created on 2023-01-18 with reprex v2.0.2
There was a problem hiding this comment.
As far as I can see from reading the spec, there's no relationship between the subtype and the suffix, except that there happens to be both subtypes and suffixes for xml and json.
IOW, this is incorrect because there's no +plain suffix:
check_content_type("application/cbor", "text/plain")
#> Error:
#> ! Unexpected content type 'application/cbor'
#> ℹ Expecting 'text/plain' or 'text/<subtype>+plain'
There was a problem hiding this comment.
I'm not quite sure I can follow. As far as I understand there are two cases:
- content type has no suffix: e.g. application/json or application/xml. The subtype json or xml tells you that the body is in JSON or XML format, respectively, and can therefore be parsed accordingly.
- content type has a suffix: e.g. application/vnd.api+json has the suffix json. This tells you that the content is in JSON format. The subtype vnd.api tells you that the JSON has a specific structure.
This is also what Wikipedia says about this:
Suffix is an augmentation to the media type definition to additionally specify the underlying structure of that media type, allowing for generic processing based on that structure and independent of the exact type's particular semantics. Media types that make use of a named structured syntax should use the appropriate IANA registered "+"suffix for that structured syntax when they are registered.
and this is how I understood sections 4.2.8 (Structured Syntax Name Suffixes) and 4.11 (Fragment Identifier Requirements) of RFC 6838 (though I find that a bit difficult to understand).
Therefore I would argue that check_content_type("application/vnd.api+json", "application/json") should work (the actual content type is more specific than the demanded one) but check_content_type("application/json", "application/vnd.api+json") should not work (the actual content type is less specific than the demanded one).
Does this make sense to you or do you see other issues here?
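The rule proposed in this comment can be sketched in base R. This is only an illustration of the argument, not httr2's implementation: `split_media_type()` and `accepts()` are hypothetical helper names, and any parameters (such as `; charset=utf-8`) are simply dropped.

```r
# Hypothetical helper: split "type/subtype+suffix" into its parts.
split_media_type <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  x <- tolower(trimws(sub(";.*$", "", x)))  # drop any parameters
  m <- regmatches(x, regexec("^([^/]+)/([^/+]+)(\\+(.+))?$", x))[[1]]
  if (length(m) == 0) {
    stop("Not a valid media type: '", x, "'", call. = FALSE)
  }
  # suffix is "" when the media type has no "+suffix" part
  list(type = m[[2]], subtype = m[[3]], suffix = m[[5]])
}

# Hypothetical rule: `actual` is accepted when it equals `wanted`, or when
# `wanted` has no suffix and `actual` carries the wanted subtype as its suffix.
accepts <- function(actual, wanted) {
  a <- split_media_type(actual)
  w <- split_media_type(wanted)
  if (identical(a, w)) {
    return(TRUE)
  }
  w$suffix == "" && a$type == w$type && a$suffix == w$subtype
}

accepts("application/vnd.api+json", "application/json")  # actual is more specific
accepts("application/json", "application/vnd.api+json")  # actual is less specific
```

Note that this sketch derives the accepted suffix from the subtype of `wanted`, which is exactly the assumption under discussion in this thread.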
There was a problem hiding this comment.
I think you're pre-supposing a relationship between type and suffix that doesn't exist, and it would be better to keep suffix as a separate argument. In practice, I don't think this will affect how the code works, but I don't like writing a function that implies a structure that I don't believe to be true.
But I don't think I'm doing a good job of explaining exactly what I mean here, so would you mind just trusting me and switching back to an explicit suffix argument? I think it's a pretty simple change.
There was a problem hiding this comment.
I'm happy to go back to suffix. But I'm not quite sure what the interface should look like in the end. I think there are three types of questions on the content type:
- Is the content in JSON format?
- Is the content type application/json?
- Is the content type application/vnd.api+json?
| Content type | Is the content in JSON format? | Is the content type application/json? | Is the content type application/vnd.api+json? |
|---|---|---|---|
| application/cbor | no | no | no |
| example/json | yes | no | no |
| application/json | yes | yes | no |
| application/test+json | yes | yes | no |
| application/vnd.api+json | yes | yes | yes |
Do you agree that these three questions all make sense? And what do you think the interface should look like then?
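To make the three questions concrete, here is a rough base R sketch that reproduces the table. The function names (`media_parts()`, `is_json_format()`, `is_application_json()`, `is_vnd_api_json()`) are hypothetical and only illustrate one possible reading of the questions; they are not part of httr2.

```r
# Split a media type into type, subtype and suffix (suffix is "" when absent).
media_parts <- function(x) {
  m <- regmatches(x, regexec("^([^/]+)/([^/+]+)(\\+(.+))?$", x))[[1]]
  stopifnot(length(m) == 5)
  list(type = m[[2]], subtype = m[[3]], suffix = m[[5]])
}

# Question 1: is the body in JSON format at all?
is_json_format <- function(x) {
  p <- media_parts(x)
  p$subtype == "json" || p$suffix == "json"
}

# Question 2: is it application/json, or a more specific application/<subtype>+json?
is_application_json <- function(x) {
  p <- media_parts(x)
  p$type == "application" && (p$subtype == "json" || p$suffix == "json")
}

# Question 3: is it exactly application/vnd.api+json?
is_vnd_api_json <- function(x) {
  identical(x, "application/vnd.api+json")
}

types <- c(
  "application/cbor", "example/json", "application/json",
  "application/test+json", "application/vnd.api+json"
)
answers <- data.frame(
  content_type = types,
  json_format = vapply(types, is_json_format, logical(1), USE.NAMES = FALSE),
  application_json = vapply(types, is_application_json, logical(1), USE.NAMES = FALSE),
  vnd_api_json = vapply(types, is_vnd_api_json, logical(1), USE.NAMES = FALSE)
)
print(answers)
```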
There was a problem hiding this comment.
Let me take a stab at this PR and give you something to look at.
…into req_body_json-type
#Conflicts: # NEWS.md # tests/testthat/_snaps/req-error.md
It looks like the only remaining issue in this is a slight conflict in NEWS.md.
# Must set header afterwards
# Respect existing Content-Type if set
type_idx <- match("content-type", tolower(names(req$headers)))
There was a problem hiding this comment.
I've changed the logic a bit here — now if you manually override with the content-type header, we don't check it. I think this is ok, and gives the user an escape hatch if they're having problems. But generally we expect people to set the content type via the appropriate req_body_* function, which will verify that the type is as expected.
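A minimal sketch of that escape hatch, assuming a request is just a list with a `headers` element. `set_default_content_type()` is a hypothetical name used for illustration and is not the function in the PR.

```r
# Hypothetical stand-in for a request: a plain list with a `headers` element.
set_default_content_type <- function(req, default_type) {
  # Header names are case-insensitive, so compare in lower case
  type_idx <- match("content-type", tolower(names(req$headers)))
  if (is.na(type_idx)) {
    # No Content-Type set by the user: use the one implied by the body
    req$headers[["Content-Type"]] <- default_type
  }
  # Otherwise leave the user's header untouched and do not validate it
  req
}

req_plain <- list(headers = list(Accept = "*/*"))
req_custom <- list(headers = list(`content-type` = "application/vnd.api+json"))

set_default_content_type(req_plain, "application/json")$headers
set_default_content_type(req_custom, "application/json")$headers
```

In the second call the user's header wins, which is the behaviour described above: a manual override is respected and not checked.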
#'
#' @param resp A response object.
#' @param valid_types A character vector of valid content types.
#' @param valid_suffix A string given an "structured media type" suffix.
There was a problem hiding this comment.
This keeps the type and suffixes orthogonal in a way I think matches the spec.
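One way to picture the orthogonal arguments is the sketch below. It is an assumption about the shape of the check, not the code in the PR: `content_type_ok()` is a hypothetical name, it takes a content type string rather than a response, and it returns `TRUE`/`FALSE` instead of raising an error.

```r
# Hypothetical sketch: `valid_types` and `valid_suffix` are checked independently.
content_type_ok <- function(content_type, valid_types, valid_suffix = NULL) {
  stopifnot(is.character(content_type), length(content_type) == 1)
  # Ignore parameters such as "; charset=utf-8" and compare case-insensitively
  content_type <- tolower(trimws(sub(";.*$", "", content_type)))
  if (content_type %in% tolower(valid_types)) {
    return(TRUE)
  }
  if (is.null(valid_suffix)) {
    return(FALSE)
  }
  # Accept any media type whose structured suffix matches, whatever its type
  endsWith(content_type, paste0("+", tolower(valid_suffix)))
}

content_type_ok("application/vnd.api+json", "application/json", "json")
content_type_ok("text/plain", "text/plain")
content_type_ok("application/cbor", "text/plain")  # no "+plain" suffix is implied
content_type_ok("audio/abc+json", "application/json", "json")
```

Because the suffix is never derived from `valid_types`, a caller who wants only exact matches can simply omit `valid_suffix`.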
@mgirlich let me know what you think of this approach.
#Conflicts: # NEWS.md # R/resp-body.R # tests/testthat/_snaps/req-body.md # tests/testthat/_snaps/req-error.md # tests/testthat/_snaps/resp-body.md
I find it a bit confusing that the following works:
check_content_type(
  "audio/abc+json",
  valid_types = "application/json",
  valid_suffix = "json"
)
If we want to allow this then
Another really confusing example is
check_content_type("application/test+json", "application/test+json")
This would suggest that we should validate the
# ```
stopifnot(length(x) == 1)
regex <- "^(?<type>application|audio|font|example|image|message|model|multipart|text|video)/(?<subtype>(?:(?:vnd|prs|x)\\.)?(?:[^+;])+)(?:\\+(?<suffix>(?:[^;])+))?(?:;(?<parameters>(?:.)+))?$"
if (!grepl(regex, x, perl = TRUE)) {
There was a problem hiding this comment.
Maybe this should be an error instead? And we might just add special handling for when the content type is NULL or the empty string?
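A rough sketch of that suggestion, under stated assumptions: `parse_content_type()` here is a hypothetical stand-in, it uses a simplified regex with positional groups rather than the named-group regex from the diff, and it does not restrict the top-level type to the registered names.

```r
# Hypothetical sketch: NULL or "" means "no content type"; anything else that
# does not look like type/subtype[+suffix][;parameters] is an error.
parse_content_type <- function(x) {
  if (is.null(x) || identical(x, "")) {
    return(list(type = "", subtype = "", suffix = ""))
  }
  stopifnot(is.character(x), length(x) == 1)
  # Simplified pattern (an assumption, not the regex from the diff)
  regex <- "^([^/;+]+)/([^/;+]+)(\\+([^;]+))?(;.*)?$"
  m <- regmatches(x, regexec(regex, x))[[1]]
  if (length(m) == 0) {
    stop("Invalid content type '", x, "'", call. = FALSE)
  }
  list(type = m[[2]], subtype = m[[3]], suffix = m[[5]])
}

parse_content_type("application/vnd.api+json")
parse_content_type("application/json; charset=utf-8")
parse_content_type(NULL)
try(parse_content_type("not a content type"))
```

Positional groups are used here only to keep the sketch independent of how a given R version handles named capture groups.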
Oh yeah, we can definitely make
Fixes #189. Fixes #284.
This PR doesn't support parameters in the media type. They could be added but I never encountered them and I'm not sure the extra complexity is worth it.
For req_body_form() it only allows application/x-www-form-urlencoded. I'm happy to add support for the more general form similar to JSON but I'm not sure it's worth it.