Discussion issue for #830: balloting of error handling #831

Closed
aphillips opened this issue Jul 17, 2024 · 19 comments
Labels
errors Issues related to the errors section of the spec resolve-candidate This issue appears to have been answered or resolved, and may be closed soon.

Comments

@aphillips
Member

Discussion thread for error handling

This issue provides a discussion space for questions or comments on the balloting of 'error handling' currently (2024-07-16 through 2024-07-21) taking place in issue #830.

Useful references:

Some terminology:

Formatting attempt means a call to a message format implementation for a given message with a set of arguments intended for formatting.

Signal an error is a deliberately vague, generic, neutral way of referring to how an implementation registers that an error has occurred during a formatting attempt with the caller. Common signaling mechanisms include throwing exceptions, returning a value that indicates an error, setting an error flag on the formatter object, and many more.

Provide a fallback representation means that there is some way for the caller to obtain a version of the message that is partially formatted according to the rules already provided in Formatting and notably, but not exclusively, here and here

MUST and SHOULD have their normal RFC2119 / BCP14 meaning.

@aphillips
Member Author

aphillips commented Jul 17, 2024

@macchiati commented:


I can't really answer unless the question is a bit more clear.

  1. "for signaling errors" - If this were "Must provide a mechanism for
    detecting errors" I would pick a higher number. That is, it could be
    satisfied by throwing an exception, or by having an additional return
    parameter, or by providing a separate function to query whether there was
    an error.

  2. I think the question might depend on the type of errors (This division
    doesn't align with the typology in the spec, because it is
    "behavior-based".)

    1. no matter what the input parameters are, e.g. syntax errors like
      {$abc $def}
    2. call-site mismatch errors, e.g. format(myDateMessage,
      date="Einstein"), or a missing input parameter
    3. others

Definitely for (1) I don't think there has to be a fallback message result

@aphillips
Member Author

aphillips commented Jul 17, 2024

@macchiati

"for signaling errors" - If this were "Must provide a mechanism for detecting errors" I would pick a higher number. That is, it could be satisfied by throwing an exception, or by having an additional return parameter, or by providing a separate function to query whether there was an error.

This is precisely the meaning of "signal an error". See above. That is, we cannot (because of diversity in languages and frameworks) say exactly how errors are signaled to users.

Definitely for (1) I don't think there has to be a fallback message result

There can be a fallback message result for syntax and data model errors, but it will not be a very useful message, since the user's intention generally cannot be intuited from a broken message. The fallback string (unless overridden by an implementation-specific fallback, which is permitted by the spec) for syntax and data model errors is what we euphemistically call "the logo", i.e. the string "{�}".

One of the key questions in this balloting is whether we require that implementations provide access to a fallback representation in all cases or whether it is optional.

@macchiati
Member

macchiati commented Jul 17, 2024 via email

@macchiati
Member

macchiati commented Jul 17, 2024 via email

@macchiati
Member

macchiati commented Jul 17, 2024

What I really think is that the policy should be:

  1. Must provide a mechanism for detecting any errors specifically listed in the spec.
  2. May also support detecting errors not listed in the spec.
  3. In cases of an error:
    1. Should provide fallback string result, where it can convey useful fallback for the end-user
    2. May provide fallback string result, where it cannot.

Examples:
"{�}" conveys no useful information to users
"3 cm" conveys useful information to users, even though it should have been expressed as "3 Zentimeter"

@sffc
Member

sffc commented Jul 17, 2024

I don't understand the difference between "(1) MUST signal errors and MUST provide fallback" and "(2) MUST signal errors or MUST provide fallback".

I think implementations must do both, but not necessarily at the same time. For example, an implementation with two functions formatWithFallback and formatOrThrowException should be conformant, but an implementation with one but not the other perhaps should not be conformant.

I do not think an implementation should be required to have a function that does both at the same time in order to be conformant. For example, a function that throws an exception that has a fallbackMessage field would be doing both at the same time. This is fine, but not required for conformance.

Does that mean I should vote for option 1 or option 2?
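
To make the distinction concrete, here is a minimal sketch of the shapes discussed above. Everything in it is hypothetical: the toy pattern representation, the `MessageFormatError` class, and the snake_case function names are illustrative assumptions, not part of any MF2 implementation or of the spec.

```python
class MessageFormatError(Exception):
    """Hypothetical error type that also carries the fallback string."""

    def __init__(self, reason, fallback_message):
        super().__init__(reason)
        self.fallback_message = fallback_message


def _resolve(pattern, args):
    """Format a toy pattern.

    Items of `pattern` are either literal strings or ("var", name) tuples.
    Returns (text, errors); unresolved variables fall back to {$name}.
    """
    out, errors = [], []
    for part in pattern:
        if isinstance(part, str):
            out.append(part)
            continue
        name = part[1]
        if name in args:
            out.append(str(args[name]))
        else:
            out.append("{$" + name + "}")
            errors.append("unresolved variable $" + name)
    return "".join(out), errors


def format_with_fallback(pattern, args):
    """Never raises; returns the (possibly partially formatted) string."""
    text, _ = _resolve(pattern, args)
    return text


def format_or_throw(pattern, args):
    """Raises on any error; the exception also exposes the fallback."""
    text, errors = _resolve(pattern, args)
    if errors:
        raise MessageFormatError("; ".join(errors), fallback_message=text)
    return text
```

An implementation exposing both functions signals errors and provides a fallback, just not in the same call. The `fallback_message` attribute shows the "both at the same time" variant.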

@eemeli
Collaborator

eemeli commented Jul 17, 2024

@sffc Option 2 would mean that an implementation would need to provide at least one of the methods formatWithFallback and formatOrThrowException, but would not be required to provide both.

Option 1 allows for a compliant implementation to provide formatWithFallback and formatOrThrowException separately, or as a single method as in the current Intl.MessageFormat JS proposal.

@sffc
Member

sffc commented Jul 17, 2024

Examples: "{�}" conveys no useful information to users

To jam off this a bit: in ICU4X 2.0 DateTimeFormatter, we have the following error handling:

Pattern: 'It is:' E MMM d y G 'at' h:mm:ssSSS a zzzz

| Error Type | Output String |
| --- | --- |
| (success) | It is: Monday November 20 2023 CE at 11:35:03.000 a.m. Greenwich Mean Time |
| Missing Data | It is: mon M11 20 2023 ce at 11:35:03.000 AM +0000 |
| Missing Input Fields | It is: {E} {M} {d} {y} {G} at {h}:{m}:{s}{S} {a} {GMT+?} |

"Missing Input Fields" attempts to say what type of thing was expected in a particular placeholder position, which is more useful than completely omitting it.

@eemeli
Collaborator

eemeli commented Jul 17, 2024

According to our current formatting spec, the only way to get to "{�}" is if the source message contains a syntax or data model error. In all other cases, we end up with at least some pattern to format. It is possible that we drop the * pattern requirement, as proposed in #603, which would open up another path to "{�}".

@macchiati
Member

macchiati commented Jul 17, 2024

I think we need to be careful in talking about who it is useful for.

Looking at Shane's,

{E} {M} {d} {y} {G} at {h}:{m}:{s}{S} {a} {GMT+?}

  1. It might be useful for the developer (in debugging). Although for that I would argue that an even more useful message would be something like "Missing $datetime parameter". But for debugging, a good error message is even more valuable, with internal details.
  2. It is not at all useful for the end user; you really wouldn't want that message to show up in production software — or show error messages with internal details.

@aphillips
Member Author

@eemeli suggested:

@sffc Option 2 would mean that an implementation would need to provide at least one of the methods formatWithFallback and formatOrThrowException, but would not be required to provide both.

This is correct as far as it goes. It would also be compliant to have a format method that threw for some errors and returned a fallback string for others (this is the example @mihnita gave in the call, the word "or" in the name is perhaps a logical OR). Whether that's a good idea or not is a separate question. It would also be valid with option 2 to do both at the same time:

int result = format(message, argMap, target);
if (result == NO_ERROR) {
   // happy path
   print(target);
} else {
   // target contains the fallback string
}

@sffc
Member

sffc commented Jul 17, 2024

I think we need to be careful in talking about who it is useful for.

That's a good point that clarifies things. There are cases where an error is more helpful (usually a programmer error), and there might be cases where a fallback string is more helpful, but I don't feel super confident in specifically enumerating those cases.

@aphillips
Member Author

aphillips commented Jul 17, 2024

I think this thread might be focused on the wrong thing.

The question here is really "what does MF2 normatively require for an implementation to call itself 'conformant'?" or maybe "what can we normatively require?"

Our spec carefully enumerates the error conditions and provides tests for them (in a way that implementers can "hook up" to their own implementation, indeed are required to "hook up" more-or-less by hand). But we don't, necessarily, require that you create a specific special error state/value/class for each one. It might be perfectly valid for a Java implementation to just throw RuntimeException with the message "Stuff happened" for every error type and still be conformant with "MUST" for signaling errors. Would that suck? Yes. But that's on the implementer.

Trying to require specific error behavior (including fallbacking) is tricky because we want to allow the signal to be shaped however the implementer feels is most natural for their users, including in environments where MF2 is wrapped by resource or string management APIs and including existing APIs, which are already called by existing code that cannot be changed.

@eemeli
Collaborator

eemeli commented Jul 18, 2024

My take on the overall scope of what we're considering here is that the spec should define what happens when messages are formatted, and that this should include error cases. In fact, this is one of our deliverables:

A specification for resolving messages at runtime, including runtime errors.

If we leave error handling completely out of the spec, then I think we'd need to revisit this deliverable. And I at least would much rather not need to do so.

I'm also glad that we're taking error handling and fallback behaviour seriously, as my experience indicates that message formatting/localization fails in general more often than many other parts of UX code, as it includes additional steps due to the localization of said messages. Failures in message formatting are far more often not discovered until production, as automated tests very rarely test all localizations. Therefore, it's important for the MF2 spec to ensure that users can always get at least some representation of a message via fallbacking, so that a message formatting failure can be considered only a partial failure, rather than a complete failure, of the UI.

Well-defined fallbacking (which we currently have) also ensures that any two MF2 implementations will produce the same output for the same inputs, effectively a requirement for hydration and other techniques allowing a server and client to cooperate in building a UI.

@aphillips
Member Author

Well-defined fallbacking (which we currently have) also ensures that any two MF2 implementations will produce the same
output for the same inputs, effectively a requirement for hydration and other techniques allowing a server and client to
cooperate in building a UI.

This might be true for fallbacking, but it is emphatically not the case for non-erroring messages. Differences in runtime environment, formatting function implementation, and locale data mean that the same source message with the same inputs can produce different (but recognizably correct) outputs.

A specification for resolving messages at runtime, including runtime errors.

If we leave error handling completely out of the spec, then I think we'd need to revisit this deliverable.
And I at least would much rather not need to do so.

I think you might be reading too much into the deliverable goal? We absolutely do identify the error conditions that arise in the resolution of a message at runtime. A "bad operand" error is a bad operand error. In some cases these are implementation-defined (such as type mismatches), but in most cases they are defined by our spec. So it is fair to say that any two MF2 implementations will produce the same error state. What has been suggested in this discussion over several weeks is that we don't say how that state is communicated.

Revisiting this:

effectively a requirement for hydration and other techniques allowing a server and client to cooperate in building a UI.

I agree that this is somewhat desirable, although, as noted by @sffc and @macchiati and others, the fallback message has limited utility. There's not that much utility variation between these fallbacks in a hydrated message:

You have {$count} attempts remaining on {$date}
You have {�} attempts remaining on {�}
{�}

The end user is still shaking their head because there is an error preventing usability.
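
The three fallback shapes listed above can be sketched as follows. This is only an illustration: the `render_fallback` function, its `level` parameter, and the toy pattern representation are assumptions made for this example. In a real implementation the spec's error rules decide which fallback applies, not a caller-selected switch.

```python
LOGO = "{\uFFFD}"  # U+FFFD REPLACEMENT CHARACTER in braces, "the logo"


def render_fallback(parts, level):
    """Render a toy pattern as if every placeholder had failed.

    parts: literal strings or ("var", name) tuples.
    level: "variable"    -> each placeholder becomes {$name}
           "placeholder" -> each placeholder becomes the logo
           "message"     -> the whole message becomes the logo
    """
    if level == "message":
        return LOGO
    out = []
    for part in parts:
        if isinstance(part, str):
            out.append(part)
        elif level == "variable":
            out.append("{$" + part[1] + "}")
        else:
            out.append(LOGO)
    return "".join(out)


parts = ["You have ", ("var", "count"), " attempts remaining on ", ("var", "date")]
for level in ("variable", "placeholder", "message"):
    print(render_fallback(parts, level))
```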

@macchiati
Member

I think basically, we are identifying a set of error conditions (eg syntax), giving them IDs, and saying (with the first 3 cases) that the implementation has to recognize those conditions and be able to communicate them to the caller in some way (exception, different function call, etc).

Now, we are not specifying (and should not specify) the nature of that communication, nor the format of the error message that results, nor that they have to communicate those precise IDs. That really depends heavily on the capabilities and idioms of the programming language and library.

As to fallback message results, I'm rethinking my vote after considering Eemeli's thoughts. It is pretty low effort to return "{�}", so that is not much of an imposition on implementations. It does make it slightly less natural for some environments where the natural idiom would be (in pseudocode) to return null if there is any error:

if (result == null) {
   errorInfo = formatGetError(myMess, parameters);
}
// not that I'm recommending that idiom

But it isn't huge, because it could easily change to

if (result.equals("{�}")) {
   errorInfo = formatGetError(myMess, parameters);
}

So given that, I'm ok with Must(error) and Must(fallback).

@stasm
Collaborator

stasm commented Jul 19, 2024

According to our current formatting spec, the only way to get to "{�}" is if the source message contains a syntax or data model error. In all other cases, we end up with at least some pattern to format.

Are implementations expected to allow users to format messages that contain syntax or data model errors? Or should there be 2 separate steps in the API: parsing and formatting? In which case syntax and data model errors can be detected early, before the user attempts to format the broken message.

I’d like to challenge the current spec where it reads:

For example, a message with a Syntax Error and no fallback string defined in the formatting context would format to a string as {�}.

If we drop the above and require the API to be two-step, we could then map the two logical alternatives in (2) MUST signal errors -or- MUST provide fallback to these two steps: parsing and formatting.

That said, my preference would be similar to @macchiati’s #831 (comment):

  • During parsing, MUST signal errors.
  • During formatting, MUST signal errors and MUST provide fallback.
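
A rough sketch of what such a two-step API could look like, under heavy assumptions: the toy syntax below only understands `{$name}` placeholders and is not MF2 syntax, and `parse`, `ParsedMessage`, and `MessageSyntaxError` are hypothetical names invented for the example.

```python
import re


class MessageSyntaxError(Exception):
    """Hypothetical error raised by the parsing step."""


_PLACEHOLDER = re.compile(r"\{\$([A-Za-z_][A-Za-z0-9_]*)\}")


class ParsedMessage:
    def __init__(self, parts):
        self.parts = parts  # literal strings or ("var", name) tuples

    def format(self, args):
        """Step 2: always returns text, plus a list of signaled errors."""
        out, errors = [], []
        for part in self.parts:
            if isinstance(part, str):
                out.append(part)
            elif part[1] in args:
                out.append(str(args[part[1]]))
            else:
                out.append("{$" + part[1] + "}")  # fallback representation
                errors.append("unresolved variable $" + part[1])
        return "".join(out), errors


def parse(source):
    """Step 1: signal syntax errors early; no fallback string is produced."""
    parts, pos = [], 0
    for match in _PLACEHOLDER.finditer(source):
        parts.append(source[pos:match.start()])
        parts.append(("var", match.group(1)))
        pos = match.end()
    parts.append(source[pos:])
    for part in parts:
        # In this toy syntax, a leftover brace means a malformed placeholder.
        if isinstance(part, str) and ("{" in part or "}" in part):
            raise MessageSyntaxError("malformed placeholder in: " + source)
    return ParsedMessage([p for p in parts if p != ""])
```

With this shape, `parse("{$abc $def}")` fails before any formatting attempt. A successfully parsed message always yields a string from `format`, with any errors reported alongside it.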

@eemeli
Collaborator

eemeli commented Jul 19, 2024

Are implementations expected to allow users to format messages that contain syntax or data model errors? Or should there be 2 separate steps in the API: parsing and formatting? In which case syntax and data model errors can be detected early, before the user attempts to format the broken message.

#816 is a draft PR based on the consensus we'd reached during the 24 June call adopting what's presented in the ballot as Option (1), and so could be taken as a representation of that option, should we reaffirm here our earlier choice of it.

With that approach, fallback formatting would not be required for messages with syntax or data model errors. The intent is to allow for (but not require) a two-step approach as you describe, so that the earlier parsing step could emit an error, rather than requiring it to produce a formatted result for a broken message.

This would still allow a single-step implementation that always returned a formatted result.

That said, my preference would be similar to @macchiati’s #831 (comment):

  • During parsing, MUST signal errors.
  • During formatting, MUST signal errors and MUST provide fallback.

This would be fulfilled by Option (1). Note that it requires fallback only for "a message that produces a formatting or selection error".

@aphillips
Member Author

@stasm

I agree with you, except that there can be static APIs that "do it all in one go". Such an API, if it provided a fallback, would need to use the logo (or some other string). Parsing can be separated from formatting and, indeed, the specification separates these operations. But it doesn't have to be separate.

Again, I think my concern is that, as an implementer I'm going to make responsible decisions for my users. Elsewhere (in calls and in the various issues linked above) I pushed hard on "MUST signal errors". But I got to thinking: what is our concern in creating this requirement? What specific benefits are we trying to ensure as a standard with "MUST"? We need to clearly define what the bar is for "conformance" and not make it too onerous.

One benefit I see is that, for various conditions specified in our prose, implementations need to be consistent about "being in an error condition"--not succeeding where other implementations were told to fail. So, we can require that, for example, if you pass the operand |horse| to the built-in :number function, you should be in the bad-operand error state--however you define and signal that state. This is always an error. There are some SHOULD or MAY errors in the spec as well.

So, my tendency would be to go to parts of the text that describe error conditions and not say "signal an X error" but rather say "this is an X error and the fallback is Z". In the section on errors we then say "Hey, implementer, signal errors however seems best to you. You are not conformant if parsing/formatting succeeds for any defined error condition. If you implement fallback formatting, you need to emit whatever is defined as the fallback."
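
As a sketch of "this is an X error and the fallback is Z", the example below separates the error state from how it is signaled. The `Resolved` type and `number_function` are hypothetical, the number handling is deliberately naive and ignores locales, and the `{|horse|}` fallback shape for a literal operand is my reading of the spec's fallback rules, so treat it as an assumption.

```python
from typing import NamedTuple, Optional


class Resolved(NamedTuple):
    text: str             # formatted value, or the fallback representation
    error: Optional[str]  # error state name, or None on success


def number_function(literal):
    """Toy stand-in for :number applied to a literal operand."""
    try:
        value = float(literal)
    except ValueError:
        # The error state is fixed; how it gets signaled is up to the caller.
        return Resolved("{|" + literal + "|}", "bad-operand")
    return Resolved(format(value, "g"), None)


def format_number_strict(literal):
    """One possible signaling choice: raise instead of returning a fallback."""
    result = number_function(literal)
    if result.error is not None:
        raise ValueError(result.error)
    return result.text
```

Both entry points agree that `|horse|` is a bad operand. They differ only in whether the caller sees an exception or a fallback string.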

Others have said that they want to provide non-erroring formatting functions (to serve up the fallback). But a non-erroring fallback function need not be used only in an erroring context. If it's a public API you can just do all your formatting through it, right? If that's true, is that "MUST || MUST" (you have to do one or the other, and MAY do both but never neither)?

@aphillips aphillips added resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. errors Issues related to the errors section of the spec labels Jul 22, 2024