58 changes: 48 additions & 10 deletions spec/data-model/README.md
@@ -117,31 +117,40 @@ String values include all processing of the underlying _text_ values,
including escape sequence processing.
`Expression` values wrap each of the _expression_ shapes.

Implementations MUST NOT rely on the set of `Expression` interfaces being exhaustive,
as future versions of this specification MAY define additional expressions.
Implementations MUST NOT rely on the set of `Expression` and
`Markup` interfaces defined in this document being exhaustive.
Future versions of this specification might define additional
expressions or markup.

```ts
type Pattern = Array<string | Expression>;

type Expression = LiteralExpression | VariableExpression | FunctionExpression |
UnsupportedExpression;
type Expression =
| LiteralExpression
| VariableExpression
| FunctionExpression
| UnsupportedExpression;

interface LiteralExpression {
type: "expression";
arg: Literal;
annotation?: FunctionAnnotation | UnsupportedAnnotation;
}

interface VariableExpression {
type: "expression";
arg: VariableRef;
annotation?: FunctionAnnotation | UnsupportedAnnotation;
}

interface FunctionExpression {
type: "expression";
arg?: never;
annotation: FunctionAnnotation;
}

interface UnsupportedExpression {
type: "expression";
arg?: never;
annotation: UnsupportedAnnotation;
}
@@ -172,17 +181,13 @@ interface VariableRef {
```
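
The non-exhaustiveness requirement above can be illustrated with a small sketch. It assumes a hypothetical `PatternPart` type and `describePart` helper (neither is part of the data model) and shows one way to tolerate a part whose `type` was introduced by a later version of the specification, rather than treating it as impossible.

```ts
// Hypothetical, simplified shape; the real expression interfaces are
// the ones defined in the data model above.
interface KnownExpression {
  type: "expression";
  arg?: { type: "literal"; value: string } | { type: "variable"; name: string };
  annotation?: { type: string };
}

// A part may come from a newer version of the data model, so accept any
// object carrying a string `type` instead of a closed union.
type PatternPart = string | KnownExpression | { type: string };

function describePart(part: PatternPart): string {
  if (typeof part === "string") return "text";
  switch (part.type) {
    case "expression":
      return "expression";
    case "markup":
      return "markup";
    default:
      // Unknown shape: do not throw; report it so the caller can fall back.
      return `unrecognized:${part.type}`;
  }
}
```

The `default` branch is the point of the sketch: code that switches over `type` keeps working when a new expression or markup shape appears.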

A `FunctionAnnotation` represents a _function_ _annotation_.
In a `FunctionAnnotation`,
the `kind` corresponds to the starting sigil of a _function_:
`'open'` for `+`, `'close'` for `-`, and `'value'` for `:`.
The `name` does not include this starting sigil.
The `name` does not include the `:` starting sigil.

Each _option_ is represented by an `Option`.

```ts
interface FunctionAnnotation {
type: "function";
kind: "open" | "close" | "value";
name: string;
options?: Option[];
}
@@ -214,11 +219,44 @@ that the implementation attaches to that _annotation_.
```ts
interface UnsupportedAnnotation {
type: "unsupported-annotation";
sigil: "!" | "@" | "#" | "%" | "^" | "&" | "*" | "<" | ">" | "/" | "?" | "~";
sigil: "!" | "@" | "%" | "^" | "&" | "*" | "+" | "<" | ">" | "?" | "~";
Member

what order is this list in? Probably we should sort the reserved sigils into code point order?

Collaborator Author

I think it's currently something close to the layout order on a US keyboard? I've no opinion on the order here.

source: string;
}
```

## Markup

A `Markup` object is either `MarkupOpen`, `MarkupStandalone`, or `MarkupClose`,
which are differentiated by `kind`.
The `name` in these does not include the starting sigils `#` and `/`
or the ending sigil `/`.
The optional `options` for open and standalone markup use the same `Option`
as `FunctionAnnotation`.

```ts
type Markup = MarkupOpen | MarkupStandalone | MarkupClose;

interface MarkupOpen {
type: "markup";
kind: "open";
name: string;
options?: Option[];
}

interface MarkupStandalone {
type: "markup";
kind: "standalone";
name: string;
options?: Option[];
}

interface MarkupClose {
type: "markup";
kind: "close";
name: string;
}
```
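
As a rough illustration of how `kind` differentiates the three shapes, the sketch below repeats the interfaces and adds a hypothetical `optionsOf` helper. The `Option` shape used here is a simplified assumption (a name and a string value), not the definition from this document.

```ts
// Simplified, assumed Option shape for illustration only.
interface Option {
  name: string;
  value: string;
}

interface MarkupOpen {
  type: "markup";
  kind: "open";
  name: string;
  options?: Option[];
}

interface MarkupStandalone {
  type: "markup";
  kind: "standalone";
  name: string;
  options?: Option[];
}

interface MarkupClose {
  type: "markup";
  kind: "close";
  name: string;
}

type Markup = MarkupOpen | MarkupStandalone | MarkupClose;

// Hypothetical helper: checking `kind` narrows the union, so `options`
// is only read on the shapes that can carry it.
function optionsOf(markup: Markup): Option[] {
  return markup.kind === "close" ? [] : (markup.options ?? []);
}
```

Under these assumptions, `optionsOf({ type: "markup", kind: "close", name: "b" })` yields an empty array, and the compiler rejects any attempt to read `options` from a `MarkupClose`.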

## Extensions

Implementations MAY extend this data model with additional interfaces,
15 changes: 10 additions & 5 deletions spec/data-model/message.dtd
@@ -21,7 +21,7 @@
<!ELEMENT key (#PCDATA)>
<!ATTLIST key default (true | false) "false">

<!ELEMENT pattern (#PCDATA | expression)*>
<!ELEMENT pattern (#PCDATA | expression | markup)*>

<!ELEMENT expression (
((literal | variable), (functionAnnotation | unsupportedAnnotation)?) |
@@ -35,12 +35,17 @@
<!ATTLIST variable name NMTOKEN #REQUIRED>

<!ELEMENT functionAnnotation (option)*>
<!ATTLIST functionAnnotation
kind (open | close | value) #REQUIRED
name NMTOKEN #REQUIRED
>
<!ATTLIST functionAnnotation name NMTOKEN #REQUIRED>

<!ELEMENT option (literal | variable)>
<!ATTLIST option name NMTOKEN #REQUIRED>

<!ELEMENT unsupportedAnnotation (#PCDATA)>
<!ATTLIST unsupportedAnnotation sigil CDATA #REQUIRED>

<!-- A <markup kind="close"> MUST NOT contain any <option> elements -->
<!ELEMENT markup (option)*>
<!ATTLIST markup
kind (open | standalone | close) #REQUIRED
name NMTOKEN #REQUIRED
>
55 changes: 46 additions & 9 deletions spec/data-model/message.json
@@ -40,12 +40,8 @@
"type": "object",
"properties": {
"type": { "const": "function" },
"kind": { "enum": ["open", "close", "value"] },
"name": { "type": "string" },
"options": {
"type": "array",
"items": { "$ref": "#/$defs/option" }
}
"options": { "$ref": "#/$defs/options" }
},
"required": ["type", "kind", "name"]
},
@@ -70,32 +66,36 @@
"literal-expression": {
"type": "object",
"properties": {
"type": { "const": "expression" },
"arg": { "$ref": "#/$defs/literal" },
"annotation": { "$ref": "#/$defs/annotation" }
},
"required": ["arg"]
"required": ["type", "arg"]
},
"variable-expression": {
"type": "object",
"properties": {
"type": { "const": "expression" },
"arg": { "$ref": "#/$defs/variable" },
"annotation": { "$ref": "#/$defs/annotation" }
},
"required": ["arg"]
"required": ["type", "arg"]
},
"function-expression": {
"type": "object",
"properties": {
"type": { "const": "expression" },
"annotation": { "$ref": "#/$defs/function-annotation" }
},
"required": ["annotation"]
"required": ["type", "annotation"]
},
"unsupported-expression": {
"type": "object",
"properties": {
"type": { "const": "expression" },
"annotation": { "$ref": "#/$defs/unsupported-annotation" }
},
"required": ["annotation"]
"required": ["type", "annotation"]
},
"expression": {
"oneOf": [
@@ -106,6 +106,43 @@
]
},

"markup-open": {
"type": "object",
"properties": {
"type": { "const": "markup" },
"kind": { "const": "open" },
"name": { "type": "string" },
"options": { "$ref": "#/$defs/options" }
},
"required": ["type", "kind", "name"]
},
"markup-standalone": {
"type": "object",
"properties": {
"type": { "const": "markup" },
"kind": { "const": "standalone" },
"name": { "type": "string" },
"options": { "$ref": "#/$defs/options" }
},
"required": ["type", "kind", "name"]
},
"markup-close": {
"type": "object",
"properties": {
"type": { "const": "markup" },
"kind": { "const": "close" },
"name": { "type": "string" }
},
"required": ["type", "kind", "name"]
},
"markup": {
"oneOf": [
{ "$ref": "#/$defs/markup-open" },
{ "$ref": "#/$defs/markup-standalone" },
{ "$ref": "#/$defs/markup-close" }
]
},

"pattern": {
"type": "array",
"items": {
61 changes: 44 additions & 17 deletions spec/formatting.md
@@ -12,7 +12,7 @@ their handling during formatting is specified here as well.

Formatting of a _message_ is defined by the following operations:

- **_Expression Resolution_** determines the value of an _expression_,
- **_Expression and Markup Resolution_** determines the value of an _expression_ or _markup_,
Collaborator

I think we should make markup resolution a separate bullet in this list. Resolving markup is a very different operation from resolving expressions due to markup's not calling any functions, due to the default ignorable property in formatToString, and due to the fact that it can only happen inside patterns.

In fact, I'm not even sure if we should call it resolution. I guess we still resolve options, but crucially, in its current design, markup is encoded in the data model rather directly.

Collaborator Author

> I think we should make markup resolution a separate bullet in this list. Resolving markup is a very different operation from resolving expressions due to markup's not calling any functions, due to the default ignorable property in formatToString, and due to the fact that it can only happen inside patterns.

I would prefer to keep these together, because the similarities are much greater than the differences.

  • Markup resolution does not call external functions, but its results may be defined in the same way as for expressions. We also have expressions such as literals and variables which do not necessarily call any functions.
  • Markup resolution uses the same option resolution as functions.
  • The behaviour of markup during string formatting is a concern during formatting, but not resolution.
  • Markup resolution only being used for pattern placeholders does not change how it is done.

> In fact, I'm not even sure if we should call it resolution. I guess we still resolve options, but crucially, in its current design, markup is encoded in the data model rather directly.

Could you clarify what you mean by "data model" here? The only data model we define in detail is for the message before formatting, from which we need to select, resolve, and format a pattern during formatting. We left out formatted parts, so the output structure is not well defined.

Member

My first reaction to the draft was like @stasm's: it's okay to be somewhat repetitious in a spec. Keeping a wall between the two similar-but-unlike concepts might be useful.

I didn't make that comment in the end, because there is a lot of commonality and the resolution into different buckets does happen later.

> We also have expressions such as literals and variables which do not necessarily call any functions.

Are you sure? If there is a placeholder {|zebra|}, I'm pretty sure that invokes the :string formatting function. If there is a placeholder {$foo}, I'd allow the implementation to introspect foo's type, but I would expect some function to be called in the end. We don't want it to be an error to write messages like Hello {$user}...

Collaborator Author

> If there is a placeholder {|zebra|}, I'm pretty sure that invokes the :string formatting function. If there is a placeholder {$foo}, I'd allow the implementation to introspect foo's type, but I would expect some function to be called in the end.

I agree that this is a reasonable thing for an implementation to do, but I don't think we should expect it to always happen. There are also reasonable further questions, such as: What formatter is called for {|zebra|} if an implementation allows for overriding default formatters, and :string is so overridden? Is it the core :string formatter, or the user-provided :string formatter that handles the zebra?

I think that's a question we ought to not resolve in this spec, leaving it instead to implementations.

Member

> I think that's a question we ought to not resolve in this spec, leaving it instead to implementations.

I agree. Hmm... what concrete change are we proposing with this thread?

with reference to the current _formatting context_.
This can include multiple steps,
such as looking up the value of a variable and calling formatting functions.
@@ -83,9 +83,10 @@ At a minimum, it includes:

Implementations MAY include additional fields in their _formatting context_.

## Expression Resolution
## Expression and Markup Resolution

_Expressions_ are used in _declarations_, _selectors_, and _patterns_.
_Markup_ is only used in _patterns_.

In a _declaration_, the resolved value of the _expression_ is bound to a _variable_,
which is available for use by later _expressions_.
@@ -98,7 +99,7 @@ but also the _variable_ to which the resolved value of the _variable-expression_

In _selectors_, the resolved value of an _expression_ is used for _pattern selection_.

In a _pattern_, the resolved value of an _expression_ is used in its _formatting_.
In a _pattern_, the resolved value of an _expression_ or _markup_ is used in its _formatting_.

The shapes of resolved values are implementation-dependent,
and different implementations MAY choose to perform different levels of resolution.
@@ -210,17 +211,7 @@ the following steps are taken:
Implementations are not required to implement _namespaces_ or installable
_function registries_.

3. Resolve the _options_ to a mapping of string identifiers to values.
If _options_ is missing, the mapping will be empty.
For each _option_:
- Resolve the _identifier_ of the _option_.
- If the _option_'s _identifier_ already exists in the resolved mapping of _options_,
emit a Duplicate Option Name error.
- If the _option_'s right-hand side successfully resolves to a value,
bind the _identifier_ of the _option_ to the resolved value in the mapping.
- Otherwise, bind the _identifier_ of the _option_ to an unresolved value in the mapping.
Implementations MAY later remove this value before calling the _function_.
(Note that an Unresolved Variable error will have been emitted.)
3. Perform _option resolution_.

4. Call the function implementation with the following arguments:

@@ -247,6 +238,37 @@ the following steps are taken:
If the call fails or does not return a valid value,
emit a Resolution error and use a _fallback value_ for the _expression_.

#### Option Resolution

The result of resolving _option_ values is a mapping of string identifiers to values.

For each _option_:

- Resolve the _identifier_ of the _option_.
- If the _option_'s _identifier_ already exists in the resolved mapping of _options_,
emit a Duplicate Option Name error.
- If the _option_'s right-hand side successfully resolves to a value,
bind the _identifier_ of the _option_ to the resolved value in the mapping.
- Otherwise, bind the _identifier_ of the _option_ to an unresolved value in the mapping.
Implementations MAY later remove this value before calling the _function_.
(Note that an Unresolved Variable error will have been emitted.)

Errors MAY be emitted during _option resolution_,
but it always resolves to some mapping of string identifiers to values.
This mapping can be empty.
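
The _option resolution_ steps above could be sketched roughly as follows. The `Option` shape, the `UNRESOLVED` sentinel, the `variables` map, and the `resolveOptions` name are all assumptions made for this illustration. The text requires an error on a duplicate identifier but does not say which binding wins; the sketch keeps the first.

```ts
// Hypothetical, simplified shapes; the real Option, Literal and VariableRef
// interfaces live in the data model and may carry more fields.
type OptionValue =
  | { type: "literal"; value: string }
  | { type: "variable"; name: string };

interface Option {
  name: string;
  value: OptionValue;
}

// Sentinel standing in for "an unresolved value" (an assumption of this sketch).
const UNRESOLVED = Object.freeze({ unresolved: true });

interface ResolvedOptions {
  options: Map<string, unknown>;
  errors: string[];
}

function resolveOptions(
  options: Option[] | undefined,
  variables: Map<string, unknown>,
): ResolvedOptions {
  const resolved = new Map<string, unknown>();
  const errors: string[] = [];

  for (const option of options ?? []) {
    if (resolved.has(option.name)) {
      // The text only requires an error here; keeping the first binding
      // is a choice made for this sketch.
      errors.push(`Duplicate Option Name: ${option.name}`);
      continue;
    }
    const rhs = option.value;
    if (rhs.type === "literal") {
      resolved.set(option.name, rhs.value);
    } else if (variables.has(rhs.name)) {
      resolved.set(option.name, variables.get(rhs.name));
    } else {
      errors.push(`Unresolved Variable: ${rhs.name}`);
      resolved.set(option.name, UNRESOLVED);
    }
  }

  // Always returns a mapping, possibly empty, even when errors were emitted.
  return { options: resolved, errors };
}
```

Errors are collected rather than thrown, so a caller always receives a mapping and can decide separately how to report the errors.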

### Markup Resolution

Unlike _functions_, the resolution of _markup_ is not customizable.

Comment on lines +262 to +263
Member

This might be too opinionated? If I'm understanding this correctly, this doesn't permit implementations to support plug-in markup handlers that work similar to functions. While that might be the right solution for some implementations, it might not be what everyone expects.

We do want, I think, to say that markup's fallback is to nothing?

Collaborator Author

Note that this statement is specifically about resolution, and does not limit what may be done afterwards. A "plug-in markup handler" could post-process resolved values into different shapes, and it could even have access to the whole span of parts between open & close elements.

Member

What I'm saying is: it's not "afterwards" at all, it might be now as part of the resolution, if the implementation chooses to make it work that way.

Instead of saying this (which doesn't really help implementations either way), perhaps state what our limits are:

Suggested change
Unlike _functions_, the resolution of _markup_ is not customizable.
This specification does not fully define how _markup_ is resolved.
Implementations can handle _markup_ according to their own needs, so long as the following
requirements are met.

The resolved value of _markup_ includes the following fields:

- The type of the markup: open, standalone, or close
- The _identifier_ of the _markup_
- For _markup-open_ and _markup-standalone_,
the resolved _options_ values after _option resolution_.

The resolution of _markup_ MUST always succeed.
Collaborator

This doesn't look quite right for a use of RFC 2119 "MUST".

Collaborator Author

The usage here is intentional, at least on my part: An implementation must not allow for the resolution of any {#foo} part to ever fail.
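
One possible reading of the "resolution MUST always succeed" requirement is sketched below, assuming a hypothetical `ResolvedMarkup` shape and `resolveMarkup` function, with option values simplified to plain strings. The resolved value carries the three fields listed in the diff. Nothing in the function can throw, and a real implementation would additionally report errors such as duplicate option names.

```ts
// Assumed, simplified input shapes (option values reduced to strings).
interface Option {
  name: string;
  value: string;
}

type Markup =
  | { type: "markup"; kind: "open" | "standalone"; name: string; options?: Option[] }
  | { type: "markup"; kind: "close"; name: string };

// Hypothetical resolved shape: kind, identifier, and resolved options.
interface ResolvedMarkup {
  kind: "open" | "standalone" | "close";
  name: string;
  options: Map<string, string>;
}

function resolveMarkup(markup: Markup): ResolvedMarkup {
  const options = new Map<string, string>();
  if (markup.kind !== "close") {
    for (const option of markup.options ?? []) {
      // A duplicate would be reported as an error elsewhere; the first
      // binding is kept so that resolution still yields a value.
      if (!options.has(option.name)) options.set(option.name, option.value);
    }
  }
  // For close markup the mapping is simply left empty in this sketch.
  return { kind: markup.kind, name: markup.name, options };
}
```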


### Fallback Resolution

@@ -304,7 +326,7 @@ The _fallback value_ depends on the contents of the _expression_:
> the message formats to `{$arg}`.

- _function_ _expression_ with no _operand_:
the _function_ starting sigil followed by its _identifier_
U+003A COLON `:` followed by the _function_ _identifier_

> Examples:
> In a context where `:func` fails to resolve, `{:func}` resolves to the _fallback value_ `:func`.
@@ -599,7 +621,7 @@ one {{Category match}}
## Formatting

After _pattern selection_,
each _text_ and _expression_ part of the selected _pattern_ is resolved and formatted.
each _text_ and _placeholder_ part of the selected _pattern_ is resolved and formatted.

_Formatting_ is a mostly implementation-defined process,
as it depends on the implementation's shape for resolved values
@@ -618,13 +640,18 @@ appropriate data type or structure. Some examples of these include:
- A string with associated attributes for portions of its text.
- A flat sequence of objects corresponding to each resolved value.
- A hierarchical structure of objects that group spans of resolved values,
such as sequences delimited by "open" and "close" _function_ _annotations_.
such as sequences delimited by _markup-open_ and _markup-close_ _placeholders_.

Implementations SHOULD provide _formatting_ result types that match user needs,
including situations that require further processing of formatted messages.
Implementations SHOULD encourage users to consider a formatted localised string
as an opaque data structure, suitable only for presentation.

When formatting to a string, the default representation of all _markup_
MUST be an empty string.
Implementations MAY offer functionality for customizing this,
such as by emitting XML-ish tags for each _markup_.
Comment on lines +650 to +653
Member

I would tighten this up a bit. The normative text is appropriate, but I would avoid the MUST. I would also make the example a bit more fleshed out.

Suggested change
When formatting to a string, the default representation of all _markup_
MUST be an empty string.
Implementations MAY offer functionality for customizing this,
such as by emitting XML-ish tags for each _markup_.
The default representation of all types of _markup_ placeholder
is an empty string.
That is, unless otherwise customized, open, close and standalone
markup placeholders are omitted from the output.
Implementations MAY offer functionality to customize this behavior.
> For example, an implementation might replace markup with
> styling attributes in an attributed string API
> or it might replace the placeholders with HTML markup.

Collaborator Author

Your suggestion leaves out the "When formatting to a string" qualifier. Is that intentional?

Member

Yes.
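
To make the behaviour under discussion concrete, here is a minimal sketch of string formatting in which markup defaults to an empty string and a caller may opt in to a custom rendering. The `formatToString` signature, the `MarkupFormatter` type, and the `asTags` helper are hypothetical; the sketch takes already-formatted text parts and does not cover expressions.

```ts
interface MarkupPart {
  type: "markup";
  kind: "open" | "standalone" | "close";
  name: string;
}

type Part = string | MarkupPart;
type MarkupFormatter = (markup: MarkupPart) => string;

// By default every markup part contributes an empty string.
function formatToString(
  parts: Part[],
  formatMarkup: MarkupFormatter = () => "",
): string {
  return parts
    .map((part) => (typeof part === "string" ? part : formatMarkup(part)))
    .join("");
}

// Optional customization: render markup as XML-ish tags.
const asTags: MarkupFormatter = (markup) => {
  if (markup.kind === "close") return `</${markup.name}>`;
  if (markup.kind === "standalone") return `<${markup.name}/>`;
  return `<${markup.name}>`;
};

const parts: Part[] = [
  "Click ",
  { type: "markup", kind: "open", name: "link" },
  "here",
  { type: "markup", kind: "close", name: "link" },
];

const plain = formatToString(parts); // "Click here"
const tagged = formatToString(parts, asTags); // "Click <link>here</link>"
```

Whether the default is stated with MUST or as plain descriptive text, as debated in this thread, the observable result of the default path is the same: markup placeholders are omitted from the output.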


### Examples

_This section is non-normative._