Adopt RFC 9457 error response format #133

Closed
eric-murray opened this issue Feb 5, 2024 · 20 comments
Labels
enhancement New feature or request

Comments

@eric-murray
Collaborator

eric-murray commented Feb 5, 2024

Problem description
The current CAMARA error response has some drawbacks:

  • It is proprietary (in the sense that nobody else uses it) requiring the API consumer to use proprietary error handlers
  • It cannot provide links to more specific (possibly API provider specific) documentation on the error
  • The allowed code values are too narrow to provide much more detail than the HTTP status value itself

Possible evolution
Adopt the RFC 9457 error format as follows:

  • Rename the code field to title and message field to detail
  • Allow an optional type field which can be used to link to better documentation, for example:
    • Specific lines of documentation within the API specification
    • Specific sections of the CAMARA documentation
    • Alternative API provider specific documentation or support pages
    • IANA standardised problem types

Additional context
This issue was already raised in Issue #31, which proposed adopting the now obsolete RFC 7807. There was a long conversation in that issue, which is summarised below.

(EDIT - To be clear, the "advantages" and "disadvantages" listed below are summarised from the conversation of Issue #31. They are not my personal nor Vodafone's opinion. This can be found in my comment below.)

Advantages of adopting the RFC error format

  • Adopts a standardised error response format
  • Provides the possibility to link to additional troubleshooting information for application developers and other API consumers, who may be trying out APIs manually rather than programmatically
  • Allows developers to use RFC compliant error handlers
  • Allows extension members, such as cause, to be adopted whilst remaining standard compliant

Disadvantages of adopting the RFC error format

  • Breaking change, requiring API providers and consumers to update their code base
  • Populating additional fields such as type cannot be automated using code generators, which complicates the construction of error responses (for those that want to include the type field) and hence makes more work for API implementors
  • Such additional information is not useful for production APIs being called by applications with automated error handling
  • Additional error detail could be provided by adding more possible code values than are currently defined, and standardising their use across APIs
  • Some hyperscalers use error response formats that include code and message, though the CAMARA format follows none of them exactly

Further edit

It was commented at the recent TSC meeting that the proposal was not clear, so I will clarify that now:

  • Rename the code field to title, with no other change to how that field is currently used
  • Rename the message field to detail, with no other change to how that field is currently used
  • No change to the status field
  • Introduce an optional type field as follows:
    • If not included, the client will interpret this as about:blank
    • If included, the API provider can include any URI that they choose. No standardisation of the contents of this field is proposed by this issue.
  • The Content-Type should be changed to application/problem+json, but that is not widely used and I always take a pragmatic approach to these things.

And that is it. None of this rules out further evolution of the error response in future, possibly towards a "purer" implementation of RFC 9457 if that would be useful. But nor does it require it.
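The renaming described above can be sketched in a few lines of Python. This is only an illustration of the proposal, not CAMARA code: the function names, the sample error values and the example.com URL are all assumptions made for the example.

```python
from typing import Any, Dict, Optional


def to_problem_details(camara_error: Dict[str, Any],
                       type_uri: Optional[str] = None) -> Dict[str, Any]:
    """Rename code -> title and message -> detail; leave status untouched."""
    problem = {
        "status": camara_error["status"],
        "title": camara_error["code"],
        "detail": camara_error["message"],
    }
    # type is optional: it is only added when the provider supplies a URI.
    if type_uri is not None:
        problem["type"] = type_uri
    return problem


def effective_type(problem: Dict[str, Any]) -> str:
    """Client-side view: an absent type is interpreted as about:blank."""
    return problem.get("type", "about:blank")


# Hypothetical sample values, chosen only for illustration.
old_error = {"status": 404, "code": "NOT_FOUND", "message": "Device not found"}
converted = to_problem_details(old_error)
documented = to_problem_details(old_error, "https://example.com/errors/not-found")
```

The field values pass through unchanged, which is the point of the proposal: only the key names differ, and a response without type still has a well-defined interpretation on the client side.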

@uwerauschenbach
Collaborator

My preference is to go for the RFC error format. It might be a painful change now but we have a solid foundation for the future that also allows standard compliant extension by machine-readable fields.

@pjhac
Collaborator

pjhac commented Feb 5, 2024

Hello,

I would have been tempted to support the RFC format, but the disadvantages listed by @eric-murray above somewhat discouraged me. In particular, the "status", as defined in the RFC, has to duplicate the HTTP status code, but I know from experience that poorly designed/implemented clients and servers are going to misuse it (a recurring situation was providing a sub-API response in such "status" parameters). If the server violates RFC 9457, the client cannot really know it, and so should stick to the HTTP status code, which makes this additional "status" meaningless.
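The concern about a redundant "status" can be illustrated with a client-side sketch that always acts on the HTTP status code and merely records whether the body agreed with it. The function name and the sample payloads are hypothetical.

```python
from typing import Any, Dict, Tuple


def resolve_status(http_status: int, body: Dict[str, Any]) -> Tuple[int, bool]:
    """Return the status to act on and whether the body agreed with it.

    The HTTP status code always wins; the body's "status" member is only
    compared against it, never used in its place.
    """
    body_status = body.get("status")
    consistent = body_status is None or body_status == http_status
    return http_status, consistent


ok = resolve_status(404, {"status": 404, "title": "Not Found"})
# Hypothetical misuse: a server reporting a sub-API result in "status".
misused = resolve_status(502, {"status": 404, "title": "Not Found"})
```

Under this approach a mismatching body can be logged for debugging, but it never changes how the client reacts.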

I'd personally find it more useful to have an error "code" (like DEVICE_IDENTIFIER_TYPE_NOT_MANAGED as proposed in #127, which can be enumerated/listed in the API) and a "reason" (RFC 9457 "title" equivalent) describing the original issue. Any other parameter should be optional, in the sense that "code" and "reason" should be enough to identify the issue. The parameter names can be changed to whatever makes sense, but they should be minimal and required, while providing sufficient information.

As for extensions, they are implementation-dependent and/or API-dependent and/or problem-dependent, so defining them might prove to be a daunting task, especially if we want to have a well-defined listing of potential problems.

@shilpa-padgaonkar
Collaborator

Thanks @eric-murray for creating the consolidated issue out of the older discussions in #31.

As requested by @bigludo7 in this discussion comment, #125 (comment) , we kindly request all commonalities participants and other interested Camara members to provide a "formal" position on this issue.

@uwerauschenbach
Collaborator

uwerauschenbach commented Feb 6, 2024 via email

@eric-murray
Collaborator Author

Vodafone's preference is to adopt an RFC 9457 compliant error response.

The only mandatory requirements on API providers to support this change would be:

  • Rename the code field to title
  • Rename the message field to detail

Nothing else is required. Including a type field would be optional.

Discussions on future use of the title (currently code) and status fields are a separate issue to the one being discussed here. This issue is discussing the names of the fields, not the range of supported values.

@pjhac
Collaborator

pjhac commented Feb 6, 2024

Hi @eric-murray

Discussions on future use of the title (currently code) and status fields are a separate issue to the one being discussed here. This issue is discussing the names of the fields, not the range of supported values.

I understand your point, but I was feeling that having an idea of the possible values (or their format, at a minimum) could have helped map the actual parameters to the RFC 9457 ones. In particular, "title" rather seems like a short sentence (and not a "code" type of value), which was why I was mentioning it as a "reason".

Now, with that said, I do not see another parameter in the RFC that could fit such "code" as defined at present time, so if only the two parameters you indicated would be mandatory requirements, we can converge to something common. I am then fine to go for a RFC 9457 compliant error response, and possible format and values to use for "title" and "detail" can be discussed separately.

Best regards,
Pierre

@gmuratk
Collaborator

gmuratk commented Feb 6, 2024

T-Mobile US preference is to adopt RFC 9457 compliant error response.
• Rename the code field to title and maintain ‘HTTP Status code’ alignment
• Rename the message field to detail
• Add a cause value to describe API-specific errors, as a new structure (as shown in the example here)

@bigludo7
Collaborator

bigludo7 commented Feb 8, 2024

We understand the value of moving to a standard, but from our perspective, as RFC 9457 is not widely used in the industry, we do not see enough value to change all CAMARA assets. As such, Orange's preference is not to move to RFC 9457.

@jlurien
Contributor

jlurien commented Feb 9, 2024

The position from Telefónica is also to keep the current agreement and not move to RFC 9457. The current format is in line with the format used by big players and familiar to developers, while the RFC is not widely adopted. The impact of this change at this moment is huge, as there are many integrations going on.

Moreover, the advantage of the proposal is to adhere to a standard but at the same time it changes key aspects of the standard.

@hdamker
Collaborator

hdamker commented Feb 11, 2024

Nothing else is required. Including a type field would be optional.

@eric-murray Could you explain why you think that the type field is optional when using RFC 9457?

RFC9457 says

  • in 3.1.: "Consumers MUST use the "type" URI (after resolution, if necessary) as the problem type's primary identifier."
    • BTW: that does mean also that the URI can't be changed, as that would "create a new identity for the problem type and thus introducing a breaking change"
  • also in 3.1.: "When this member is not present, its value is assumed to be about:blank"
  • in 4.2.1: "The "about:blank" URI [ABOUT], when used as a problem type, indicates that the problem has no additional semantics beyond that of the HTTP status code"
    • When "about:blank" is used, the title SHOULD be the same as the recommended HTTP status phrase for that code (e.g., "Not Found" for 404, and so on), although it MAY be localized to suit client preferences (expressed with the Accept-Language request header).

My reading of the above is that RFC 9457 can't be used without using the type member, at least if we would like to introduce specific problem types going beyond the HTTP status code. Also, adding additional members (like the cause in @gmuratk's proposal) requires the definition of a problem type.
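The SHOULD quoted above for about:blank can be expressed as a small check. This is a sketch under stated assumptions: the function name and sample payloads are hypothetical, the permitted localisation of the title is ignored, and Python's http.HTTPStatus is used as the source of the recommended status phrases.

```python
from http import HTTPStatus
from typing import Any, Dict


def title_matches_status_phrase(problem: Dict[str, Any]) -> bool:
    """Check the SHOULD that applies when type is absent or about:blank."""
    if problem.get("type", "about:blank") != "about:blank":
        return True  # a specific problem type defines its own title
    # HTTPStatus(...) raises ValueError for codes it does not know.
    return problem.get("title") == HTTPStatus(problem["status"]).phrase


standard = {"status": 404, "title": "Not Found"}
# A current CAMARA-style code value reused as the title.
camara_style = {"status": 404, "title": "NOT_FOUND"}
```

With no type member, the first payload satisfies the recommendation and the second does not, which is why the title values and the use of type cannot be discussed entirely separately.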

BTW: on the mailing list of RFC 9457 there are several concerns about using the type member at the same time as an identifier AND as a resolvable URI pointing to a description. The main reason given by the editors for not splitting this into two members (e.g. by introducing a descriptionURL) was backward compatibility with RFC 7807.

Note: this isn't yet a position of DT, just a personal comment/question from my side.

@lbertz02

Speaking as myself (not my company)

wrt the comments of not being widely adopted by industry - the RFC is an update to one that was not widely adopted (likely a driver for the update). Given the update was published 7 months ago, I would not expect it to be widely adopted. I would also not enjoy a precedent that industry adoption has to occur within 7 months of publication for any specification as most would not pass that test.

wrt the proposal, I feel that is clear. Other comments appear to be:

  1. code, although an optional attribute, is insufficient. There would be a desire for more values and, to remain spec compliant, another attribute such as cause, but that should be a separate issue.

  2. At the schema level, the ‘type’ property is optional. However, some argue that it’s unrealistic for it to be optional because it serves as the primary identifier for the type. The RFC suggests that the document should be human-readable and cites HTML 5 as an example with a focus on documentation. Alternatively, a JSON-LD document could be used at the URI with related links to the HTML 5 documentation as a more traditional 'type' specification.

Regardless, the RFC only recommends link de-referencing when debugging, effectively making it no more than a unique identifier / tag.

In either case, the IANA registration is optional as it is "for common, widely used problem type URIs, to promote reuse." <- that implies to the Internet as a whole. I would argue that CAMARA types would not immediately qualify at that level.

Initially I was in agreement with the posts above, but now I feel it can be done as the RFC says and work for both scenarios. I am just uncomfortable with it, but I note we often use types like this with similar information elements of a URI. The additional obligation on our part is to provide some pages that it can resolve to by default if CAMARA provides the URI.

Finally on this issue, I agree with the observation that 'type' does not have to be sent on production servers. They usually represent certified client and server code so there is no point in sending it unless programmatic disambiguation is required.

  3. It was mentioned here that other big players do something else. Could you please provide links as I would like to learn more? Thanks!

@hdamker
Collaborator

hdamker commented Feb 13, 2024

3. It was mentioned here that other big players do something else. Could you please provide links as I would like to learn more? Thanks!

@lbertz02 You can find some hints within the original issue #31.

@eric-murray
Collaborator Author

Hi @hdamker

Could you explain why you think that the type field is optional when using RFC 9457?

As noted, the type field is optional because it has a default value of about:blank (effectively the API provider saying "I can't be bothered to tell you any more about this problem"). We could make it mandatory to explicitly include this value when no better value is available if we wanted type to be required. Even the most obstinate code generators could cope with that.

I also agree there is generally no value to the type field in a production system, other than maybe for IANA registered errors. That was never the intention for introducing it. That arose from the observation from test users of our CAMARA APIs, who frequently do not understand why they are getting a particular error message, and thus ask for support. Of course, we could look to expand the current message field, but there is a limit to how much information can be conveyed in one line of text.

And I also agree that using type in this way is maybe not following the spirit of RFC 9457. The proposal is a workaround to solve a perceived problem in a way that developers would (hopefully) recognise, rather than having to learn a new proprietary error schema. If the view is that RFC 9457 can only be adopted if we mandate a "purist" approach to its implementation, then I'd agree that an extension to the existing proprietary error schema would be a better solution to the problem that I am trying to solve.

@jlurien
Contributor

jlurien commented Feb 15, 2024

I think the key to enhancing the current errors is to analyse why test users do not understand why they are getting a particular error message. I don't think that renaming the keys from code/message to title/detail would solve the problem, as long as we keep the same values. We probably have to define more explicit codes for expected problems and document better examples in the specs.

@eric-murray
Collaborator Author

@jlurien
The renaming of the existing code and message fields is just a "side-effect" of the proposed solution, which is to adopt a standardised error response format that includes a suitable field for an external documentation link (for RFC 7807, and now RFC 9457, the type field). Apologies if that was not clear.

Of course, RFC 9457 has somewhat changed the intended use of the type field, but such is life. RFC 7807 was a better fit. I've not found another error response format standard that supports dereferenceable URIs in the way this issue intends.

Better documentation is always preferable, but it is a universal truth that developers are not great fans of reading documentation. Requests for support generally come from developers who did not read through the documentation.

@hdamker
Collaborator

hdamker commented Feb 28, 2024

Hi @eric-murray

Thanks for the clarification that your main motivation for your original issue #31 and your support here for RFC 7807 and now RFC 9457 was to introduce an external documentation link.

My view is that this can't be achieved with the type parameter of RFC 7807/9457. This parameter is the identifier of a problem definition type, which means that every time you use a different URI here you are defining a new problem definition type and breaking the API contract. That was already the intention in RFC 7807. RFC 9457 just clarified that and explained how non-resolvable URNs can be used as these identifiers.

There is a requirement within the RFCs that the URI value in type SHOULD be resolvable and point to some documentation, but this seems difficult to reconcile with the requirement that the identifier for a problem type must not change.

Here are two examples of organisations which have adopted RFC 7807/9457, but explicitly decided against resolvable URI in the type parameter (thus ignoring the SHOULD requirement within the RFCs):

  • https://opensource.zalando.com/restful-api-guidelines/#176
    "Note: Problem type and instance identifiers in our APIs are not meant to be resolved. RFC 9457 encourages that problem types are URI references that point to human-readable documentation, but we deliberately decided against that, as all important parts of the API must be documented using OpenAPI anyway. In addition, URLs tend to be fragile and not very stable over longer periods because of organizational and documentation changes and descriptions might easily get out of sync."
  • https://www.belgif.be/specification/rest/api-guide/#error-handling
    "Note that using href instead of type for documentation intentionally deviates from the recommendation in the RFC. href allows use of a URL for documentation purposes that may change over time, while type can be specified as a URN that must remain stable. This is especially useful for API-specific problem types for which the documentation URL may depend on technical aspects, like deployment environment."

Here is one example which has adopted RFC 9457 and at least defined how to build resolvable URLs for new problem types:

My main point is that the following parts of your proposal (from "Further edit") would not comply with RFC 9457:

  • Introduce an optional type field as follows:
    • If not included, the client will interpret this as about:blank

The RFC says for this predefined problem type: "When "about:blank" is used, the title SHOULD be the same as the recommended HTTP status phrase for that code", it MAY be localized. But that's in contradiction to your proposal "Rename the code field to title, with no other change to how that field is currently used". We are currently using the field also with API specific strings.

  • If included, the API provider can include any URI that they choose. No standardisation of the contents of this field is proposed by this issue.

That would be completely in contradiction to the definition in RFC 7807 and RFC 9457, as explained above.

So my personal position is that adopting RFC 9457 partially, without following the intention of the RFC and using "problem types" as defined, is worse than staying with our current proprietary structure. The only advantage would be that we would then be in the same club as 3GPP, but still without a usable documentation reference.

To address your initial idea of having a link to documentation, we could add a proprietary optional parameter to our proprietary CAMARA error format, like the href in the Belgian guidelines.

@lbertz02

lbertz02 commented Feb 28, 2024

I think it is important to note from RFC 9457
“If the type URI is a locator … dereferencing it SHOULD provide human-readable documentation for the problem type” – A locator resolves to something and they have a recommendation of a problem type.

also “using relative URIs can cause confusion, and they might not be handled correctly by all implementations.” – the RFC discourages use of relative URIs

and “The type URI is allowed to be a non-resolvable URI … However, resolvable type URIs are encouraged by this specification because it might become desirable to resolve the URI in the future.”

Other solutions noted above discuss the challenges of this dual use and solved them, e.g., use of an accompanying href for developer documentation.
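The distinctions drawn in the quoted RFC text (relative, locator, non-resolvable) can be approximated by looking at the URI scheme. This is a rough sketch, assuming that only http and https URIs count as locators, which is a simplification; the function name and sample URIs are hypothetical.

```python
from urllib.parse import urlparse


def describe_type_uri(type_uri: str) -> str:
    """Roughly classify a problem type URI by its scheme."""
    scheme = urlparse(type_uri).scheme
    if not scheme:
        return "relative"        # discouraged: may not be handled consistently
    if scheme in ("http", "https"):
        return "locator"         # dereferencing SHOULD yield documentation
    return "non-resolvable"      # e.g. urn: or about: identifiers


samples = [
    "https://example.com/problems/quota",  # hypothetical locator
    "/problems/quota",                     # hypothetical relative reference
    "urn:example:quota",                   # hypothetical URN
]
kinds = [describe_type_uri(uri) for uri in samples]
```

A check like this only inspects the form of the URI; it says nothing about whether a locator actually resolves or whether the identifier stays stable over time.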

Based upon the RFC guidance and other solutions, I would propose adopting RFC 9457 with the following constraints:

A. The type property is restricted to URNs as defined in RFC 8141, taken from a URN namespace managed by CAMARA. These URNs are not relative, nor are they intended to resolve to locators.

B. Developer-targeted information generally describing the error, or related documentation, will be a URL contained in an href property in the error.

Per RFC 8141, URNs are assigned under a URN namespace with “intent that the URN will be a persistent, location-independent resource identifier. A URN namespace is a collection of such URNs, each of which is (1) unique, (2) assigned in a consistent and managed way, and (3) assigned according to a common definition.”

Item A avoids the pitfalls we have identified, while B covers the locator role that RFC 9457 describes for the type property.

As for the URN namespace, we would need to set up the namespace (NSS). The format would be

urn:camara:

The type value then is interpreted as follows:

  • No type value present, aka “about:blank”
  • CAMARA wide type, e.g., urn:camara:common…
  • CAMARA API specific type, e.g., urn:camara:…
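The three cases listed above could be distinguished on the client side roughly as follows. Since the full URN format is not spelled out in this comment, the prefix matching is a crude stand-in, and the function name and the complete URNs used in the samples are hypothetical.

```python
from typing import Optional


def interpret_type(type_value: Optional[str]) -> str:
    """Map a type value to one of the three cases listed above."""
    if type_value is None or type_value == "about:blank":
        return "no specific problem type"
    # Prefix checks only: the real URN structure is not defined here.
    if type_value.startswith("urn:camara:common"):
        return "CAMARA-wide type"
    if type_value.startswith("urn:camara:"):
        return "CAMARA API-specific type"
    return "outside the CAMARA namespace"


# Hypothetical URNs, invented purely for the example.
wide = interpret_type("urn:camara:common:example-problem")
specific = interpret_type("urn:camara:example-api:example-problem")
absent = interpret_type(None)
```

Because the URN is only an identifier, any developer-facing documentation link would travel separately in the href property described in item B.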

@gmuratk
Collaborator

gmuratk commented Mar 4, 2024

The following slides were added to the Meeting Minutes in the Wiki to be reviewed during the March 4, 2024 call.
20240304-CAMARA-Issue 133 - RFC 9457 URN Option.pdf

@eric-murray
Collaborator Author

@hdamker

But that's in contradiction to your proposal "Rename the code field to title, with no other change to how that field is currently used". We are currently using the field also with API specific strings.

Now this is an interesting but separate issue, and one I did want to raise at some point. The current valid values for the code field are listed here, and it can be seen there is (more or less) a one-to-one correspondence between the IANA registry and the valid code values. The reason why, for example, "Not Found" had to become "NOT_FOUND" is maybe lost to history (developers have an unfounded fear of spaces in names for some reason), but I think the correspondence is anyway close enough to satisfy a "SHOULD".
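The "Not Found" to "NOT_FOUND" correspondence mentioned above amounts to a simple string transformation. The sketch below derives a code-style value from the status phrases in Python's http.HTTPStatus; the function names are hypothetical, and the actual CAMARA code list is not reproduced here, so it may not match this derivation for every status.

```python
from http import HTTPStatus


def phrase_to_code(phrase: str) -> str:
    """Turn a status phrase into an upper-case, underscore-separated code."""
    return phrase.upper().replace(" ", "_")


def derived_code(status: int) -> str:
    """Derive a code-style value from the standard phrase for a status."""
    return phrase_to_code(HTTPStatus(status).phrase)


not_found = derived_code(404)
unavailable = derived_code(503)
```

Where a code value can be derived this way, the only difference from the recommended status phrase is capitalisation and the separator.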

But, as you point out, some APIs are introducing their own code values without any attempt to update the API design guidelines. Maybe we should extend this largesse to allow APIs to introduce additional API-specific error response fields according to their own requirements.

That would be completely in contradiction to the definition in RFC 7807 and RFC 9457, as explained above.

Can you clarify why you think the type value will be constantly changing?

  • If it is because, for example, a Vodafone API responds with a Vodafone type URL and DT respond with a DT URL, then I would argue that a Vodafone problem is different to a DT problem, and justifies a different problem type, even if the underlying client error is the same. Aggregators would need to provide their own type URLs (or remove the type field).
  • If it is because the API provider can choose to vary it, well, equally they can choose not to. It will always be true that an API provider can choose not to follow an agreed standard. Those who want to strictly follow RFC 9457 can choose to keep their type URLs constant. Those who cannot keep their type URLs constant should choose not to include the type field at all. And for those who want to include a type URL but need to constantly change the URL, well, they should be encouraged (by our documentation) not to do that, but they can do it anyway and maybe they have a good reason, hence my proposal to acknowledge this possibility.
  • If it is because my original proposal was to link to specific locations in the API YAML for each error, I did not bring that proposal over to this issue. The proposal is now that type would be optional, and that those who use it can use it as their conscience allows.

As my intended use case is non-prod API testing, I'd be happy if we mandate strict RFC 9457 compliance for the type field in production, because Vodafone wouldn't use the type field in production anyway unless specific customers had a requirement for that. In non-prod, mandating that the type URL is standardised across all API providers and unchanging is unnecessary, as I seriously doubt many non-prod API clients would be broken by a change in the type URL.

I also agree that my intended use case can be supported by a proprietary error response. Proprietary formats can support almost anything.

@eric-murray
Collaborator Author

Now closed and replaced by #156.

@lbertz02 or @gmuratk will open a separate issue to discuss adopting RFC 9457 in place of the current proprietary error response format.
