Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store messages as FluentMessage in the runtime #365

Closed
stasm opened this issue Apr 29, 2019 · 12 comments
Closed

Store messages as FluentMessage in the runtime #365

stasm opened this issue Apr 29, 2019 · 12 comments

Comments

@stasm
Copy link
Contributor

stasm commented Apr 29, 2019

This is a proposal for a new API for formatting messages from bundles. It's an alternative to #208. It solves the same problems that #208 does, but in a way which doesn't regress the ability to retrieve a message representation without formatting it and cache it for subsequent formatting operations. See also #358.

The current API (fluent 0.12)

bundle.addMessages(`
foo = Foo
    .bar = Bar
`);
let message = bundle.getMessage("foo");
bundle.format(message, args, errs);
if (message.attrs && message.attrs["bar"]) {
    bundle.format(message.attrs["bar"], args, errs]);
}

This API exposes the internal representation of th e message to the consumer. The message might be a primitive string, an array, or an object of {value,attrs}. In the last case, value maybe undefined, a string, or an array; attrs may be undefined or an object of attributes whose values may again be strings or arrays.

This complexity comes from the days of Firefox OS where it was imperative to minimize the size of the JSON representing translations because the storage on low-end devices was limited.

The compound API proposed in #208

bundle.addMessages(`
foo = Foo
    .bar = Bar
`);

// The "pull" API.
bundle.format("foo", args, errs);
bundle.format("foo.bar", args, errs);

// The "push" API; returns a `{value, attributes}` shape.
bundle.compound("foo", args, errs);

In this API design, there are two entry points. The format method allows to specify the attribute to format after a dot. Each attribute formatted in this manner would cause the message to be re-retrieved by the bundle internally, which isn't optimal performance-wise.

The compound entry-point returns an object of formatted values and attributes. I'm calling it "push" because all attributes would be formatted at all times (which is probably what we want anyways for the most part). However, if a need to implement performant "pull" API arises, this method could evolve to return a lazy attributes field which would format attributes only when queried. This idea was the gist of my comment in #208 (comment).

This design hides the details of how messages are stored internally by the bundle from consumers, but it doesn't allow to retrieve a message once for later re-use when formatting the same UI widget multiple times.

The FluentMessage API (this proposal)

In this proposal, I'd like to formalize the interface of FluentMessage, returned by getMessage. FluentMessages would provide methods for resolving their value and attributes, and a iterable over their attributes' names. Obviously, in current JS we can't really hide the fields used for storing the parsed value and attributes, but we can hint at their being private with the _ prefix, and later on switch to proper private fields when they're standardized.

bundle.addMessages(`
foo = Foo
    .bar = Bar
`);

// An atomic retrieve-and-format API for when attributes are not defined on the message.
// Good for quickly getting started with the bundle and for imperative callsites.
bundle.format("foo", args, errs);

// `message` can be cached for re-use. Attributes can 
let message = bundle.getMessage("foo");
bundle.formatValue(message, args, errs);

// The "pull" API. `name` is known up front.
bundle.formatAttribute(message, name, args, errs);

// The "push" API. `name` comes from the message itself.
for (let name of message.attributeNames) {
    bundle.formatAttribute(message, name, args, errs);
}

This API provides better encapsulation of the runtime AST inside of the FluentMessage class. It supports both the "pull" and the "push" API, and allows caching of the message for multiple formatting operations.

@eemeli
Copy link
Member

eemeli commented Apr 29, 2019

With the proposed API, using a FluentMessage requires access to the bundle, which seems a bit clumsy. Also, are the "push" and "pull" APIs intentionally the same in the proposal?

Alternatively, getMessage could just a return a function with attributes as properties. Here's what I mean:

bundle.addMessages(`
foo = Foo
    .bar = Bar
`);

bundle.format("foo", args, errs); // 'Foo'

const message = bundle.getMessage("foo");
message(args, errs); // 'Foo'
message.bar(args, errs); // 'Bar'

for (let name of Object.keys(message)) {
    message[name](args, errs); // 'Bar'
}

@stasm
Copy link
Contributor Author

stasm commented Apr 29, 2019

Thanks for the feedback, I was going to CC you here once I submit a draft PR in a few minutes :)

With the proposed API, using a FluentMessage requires access to the bundle, which seems a bit clumsy.

This is intentional, but I'm open to revisiting the rationale. The bundle is required for formatting anyways, because it stores other messages and terms, as well as Intl objects for the current locale. I like the fact that the API makes it explicit by putting the format methods in the bundle rather than on the message. Otherwise, it would be tempting to pass the message around without realizing that it holds the reference to the entire bundle. This in turn could lead to memory leaks.

Also, are the "push" and "pull" APIs intentionally the same in the proposal?

Yes. The default is "pull", but thanks to the message.attributeNames iterable, the same API can be used in the "push" model, when the consumer doesn't know which attributes it should ask for.

Alternatively, getMessage could just a return a function with attributes as properties.

Interesting! I haven't seen this style of API often, but I like its terseness. Would it work in typed dialects of JavaScript without specifying interfaces for each message, though? Also, using a single class with well-known methods on it might benefit performance due to how JS engines optimize classes with the constant set of fields.

@Pike
Copy link
Contributor

Pike commented Apr 29, 2019

I'm with Eemeli on the question of passing a message of one bundle to evaluate in another, or, if the format() APIs on should be on Bundle or Message. Let's not cross the streams of js object namespaces and attribute names, though.

Looking at this from the perspective of DOMLocalization, this would re-introduce the problem of having to cross the C++ - JavaScript language barrier 2N+2 times?

@stasm
Copy link
Contributor Author

stasm commented Apr 29, 2019

This is intentional, but I'm open to revisiting the rationale. The bundle is required for formatting anyways, because it stores other messages and terms, as well as Intl objects for the current locale. I like the fact that the API makes it explicit by putting the format methods in the bundle rather than on the message.

I also filed #364 with the intent to remove the errors parameter of format methods. I'd like to emit errors on the bundle instead. I feel like having the format methods on the bundle then makes more sense, too.

@stasm
Copy link
Contributor Author

stasm commented Apr 29, 2019

Looking at this from the perspective of DOMLocalization, this would re-introduce the problem of having to cross the C++ - JavaScript language barrier 2N+2 times?

Is N the number of attributes?

The answer to this question depends, I think, on where exactly we set the boundary. The base Localization class can (and probably should) wrap formatValue and formatAttribute calls in a single method, reducing the amount of back and forth.

@Pike
Copy link
Contributor

Pike commented Apr 29, 2019

Yes, N being the number of attributes.

The way I see it, you have Value, and .attributes, which is 2. And then you need to call getNext from C++ into js, and then formatAttribute, which I count as 2N.

@stasm
Copy link
Contributor Author

stasm commented Apr 29, 2019

I filed #366 to discuss the implementation of this proposal.

@stasm
Copy link
Contributor Author

stasm commented Apr 29, 2019

Thanks, @eemeli and @Pike, for the quick feedback. I'd also like to ask @spookylukey and @zbraniecki for their thoughts on this. I know that the Python and Rust implementations use the dotted approach: format("foo.bar") accesses and formats attributes. I had considered adding the same functionality to fluent.js, but I now think that this proposal offers a more robust and flexible API without it.

@eemeli
Copy link
Member

eemeli commented Apr 29, 2019

Interesting! I haven't seen this style of API often, but I like its terseness. Would it work in typed dialects of JavaScript without specifying interfaces for each message, though?

At least TypeScript should be able to represent such a FluentMessage a bit like this:

interface FluentMessage {
    (args: Object): string
    [key: string]: (args: Object) => string
}

I know that the Python and Rust implementations use the dotted approach: format("foo.bar") accesses and formats attributes. I had considered adding the same functionality to fluent.js, but I now think that this proposal offers a more robust and flexible API without it.

Besides what might happen with getMessage() or compound(), I at least would also see value in having format("foo.bar") "just work". For a usage example, it might make sense to collect a set of error messages as attributes of a single Fluent message. When that needs to be accessed, there's unlikely to be a need for anything more than the formatted value of a single attribute. In that case, being able to call format("errors.foo") would seem like an obvious API.

@stasm
Copy link
Contributor Author

stasm commented Apr 29, 2019

Thanks for the TypeScript example!

Re. format("errors.foo"), I'd like to balance the usefulness of the API with the "Fluent" mindset. It's possible to use attributes like you described, but doing so might be orthogonal to a number of assumptions that we make about what is a message, and what is an attribute. I know from my own experience that this is sometimes hard and that the line is sometimes blurry.

The reason why I'm pushing against the format("errors.foo") API is that in my mind, attributes shouldn't really be used as dictionaries, or keyed collections. For that, separate messages with a group comment above are a better choice. They make each error message an independent unit of translation, they make it possible to add attributes to each error if needed, and they play well with the practice of changing the message identifier when the meaning of the error message in English changes. CAT tools can benefit from this assumption by building experiences which notify translators about pending work when ids change, and they can estimate how much work it is to review and edit the new translation. Translation memory systems work best when the message is the smallest unit of translation possible, too.

That's why in this design I'm actually trying to make it harder to query attributes on their own. Attributes are best used as translations relating to the same unit of the UI. I think of them as props passed to a React component. A tuple of question, yes answer, and no answer could be a single compound message with attributes. A list of error message, city names, or status codes should—in my opinion—be represented by multiple separate messages.

@stasm
Copy link
Contributor Author

stasm commented Jun 7, 2019

One of goals of the new design is to expose only the minimal API surface to the consumers. In the discussion on Discourse, @eemeli has pointed out that the format and addMessages could be removed, to further align the API with this goal.

@stasm
Copy link
Contributor Author

stasm commented Jul 11, 2019

We're going to the go with the formatPattern proposal in #380.

@stasm stasm closed this as completed Jul 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants