adam-26/tag-messageformat-parser

Tag MessageFormat Parser

Parses ICU Message strings into an AST via JavaScript, with added support for tags.

This is a fork of intl-messageformat-parser.

Differences from the original package:

  • Tags are supported in messages; they are not part of the ICU message "spec".
  • The other option is required for plural, select, and selectordinal formats, as it is in other ICU parsers.
  • Whitespace in plural messages is preserved.
  • A dot (.) is permitted in argument and tag names.

What is a tag?

A tag marks a span of a translation message as a style placeholder, without putting any of the style information in the message itself.

This provides 3 benefits:

  1. It decouples the styling of the text from the translations, allowing the styling to change independently of translations.
  2. It allows translation messages to retain context for text that will be styled.
  3. Tags can be named to provide hints to translators.

A tag must adhere to the following conventions:

  • It must begin with <x:
  • The tag name can include only numbers, ASCII letters, underscores, and dots (.).
  • It must be closed. Self-closing tags are supported but should be used sparingly, as they can confuse translators.
  • Valid tag examples:
    • <x:0>hello</x:0>
    • <x:link>click me</x:link>
    • <x:emoji />
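
The naming rule above can be checked before a message ever reaches the parser. The following is a minimal sketch, not part of this package: isValidTagName and the TAG_NAME pattern are hypothetical names, and the pattern is only an approximation of the stated rule (numbers, ASCII letters, underscore, and dot), not the parser's actual grammar.

```javascript
// Hypothetical helper, not part of tag-messageformat-parser.
// Assumed rule: a tag name is one or more ASCII letters, digits,
// underscores, or dots.
var TAG_NAME = /^[A-Za-z0-9_.]+$/;

function isValidTagName(name) {
    return typeof name === 'string' && TAG_NAME.test(name);
}

console.log(isValidTagName('link'));     // true
console.log(isValidTagName('0'));        // true
console.log(isValidTagName('nav.link')); // true
console.log(isValidTagName('my-link'));  // false: hyphens are not allowed
```

A check like this is only a convenience for catching bad names early; the parser remains the authority on what it accepts.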

Here's a simple example:

var parser = require('tag-messageformat-parser');

parser.parse('By signing up you agree to our <x:link>terms and conditions</x:link>');

Descriptive tag names can give translators hints about the purpose of each tag. In the above example, the text "terms and conditions" will be displayed as a link the user can click on.

Tags and arguments can be used in combination in ICU message formats.

This example uses a {name} argument inside a tag.

parser.parse('Welcome back <x:bold>{name}</x:bold>');

Overview

This package implements a JavaScript parser that turns industry-standard ICU Message strings, used for internationalization, into an AST. A compiler such as tag-messageformat can then use the AST to produce localized, formatted strings for display to users.

This parser is written in PEG.js, a parser generator for JavaScript. Its implementation was inspired by and derived from Alex Sexton's messageformat.js project. The differences from Alex's implementation are:

  • This project is standalone.
  • It's authored as ES6 modules and compiled to CommonJS and to a bundle for the browser.
  • The produced AST is more descriptive and uses recursive structures.
  • The keywords used in the AST match the ICU Message "spec".

Usage

Loading in the Browser

The dist/ folder contains the version of this package for use in the browser, and it can be loaded and used like this:

<script src="tag-messageformat-parser/dist/tag-messageformat-parser.min.js"></script>
<script>
    TagMessageFormatParser.parse('...');
</script>

Loading in Node.js

This package can also be require()-ed in Node.js:

var parser = require('tag-messageformat-parser');
parser.parse('...');
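
Parsers generated by PEG.js throw an error when the input does not match the grammar, so callers may want to wrap parse() rather than let the exception propagate. The sketch below assumes this package behaves the same way. tryParse and stubParser are hypothetical names used for illustration: the stub stands in for the real parser so the snippet runs on its own, and its error message is invented.

```javascript
// Sketch: turn a thrown parse error into a plain result object.
// `parser` is any object with a parse(message) method.
function tryParse(parser, message) {
    try {
        return { ok: true, ast: parser.parse(message) };
    } catch (err) {
        return { ok: false, error: err.message };
    }
}

// Stand-in parser for demonstration only. In real use, pass
// require('tag-messageformat-parser') instead.
var stubParser = {
    parse: function (message) {
        if (message.indexOf('{') !== -1 && message.indexOf('}') === -1) {
            throw new Error('Unclosed argument');
        }
        return { type: 'messageFormatPattern', elements: [] };
    }
};

console.log(tryParse(stubParser, 'Hello').ok);       // true
console.log(tryParse(stubParser, 'Hello {name').ok); // false
```

Passing the parser in as an argument keeps the wrapper easy to test and makes no assumption about how the package was loaded.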

Example

Given an ICU Message string like this:

On {takenDate, date, short} {name} took {numPhotos, plural,
    =0 {no photos.}
    =1 {one photo.}
    other {# photos.}
}

// Assume `msg` is the string above.
parser.parse(msg);

This parser will produce this AST:

{
    "type": "messageFormatPattern",
    "elements": [
        {
            "type": "messageTextElement",
            "value": "On "
        },
        {
            "type": "argumentElement",
            "id": "takenDate",
            "format": {
                "type": "dateFormat",
                "style": "short"
            }
        },
        {
            "type": "messageTextElement",
            "value": " "
        },
        {
            "type": "argumentElement",
            "id": "name",
            "format": null
        },
        {
            "type": "messageTextElement",
            "value": " took "
        },
        {
            "type": "argumentElement",
            "id": "numPhotos",
            "format": {
                "type": "pluralFormat",
                "offset": 0,
                "options": [
                    {
                        "type": "optionalFormatPattern",
                        "selector": "=0",
                        "value": {
                            "type": "messageFormatPattern",
                            "elements": [
                                {
                                    "type": "messageTextElement",
                                    "value": "no photos."
                                }
                            ]
                        }
                    },
                    {
                        "type": "optionalFormatPattern",
                        "selector": "=1",
                        "value": {
                            "type": "messageFormatPattern",
                            "elements": [
                                {
                                    "type": "messageTextElement",
                                    "value": "one photo."
                                }
                            ]
                        }
                    },
                    {
                        "type": "optionalFormatPattern",
                        "selector": "other",
                        "value": {
                            "type": "messageFormatPattern",
                            "elements": [
                                {
                                    "type": "messageTextElement",
                                    "value": "# photos."
                                }
                            ]
                        }
                    }
                ]
            }
        }
    ]
}
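
Because the AST is recursive, a consumer can walk it with a short recursive function. The sketch below collects the ids of every argument in a message, which is useful for checking that all required values are supplied before formatting. collectArgumentIds is a hypothetical helper, and the ast literal is a trimmed copy of the output above. It relies only on the node shapes shown in this example and does not handle tag elements, whose AST shape is not shown here.

```javascript
// Hypothetical helper: gather unique argument ids from a parsed message.
function collectArgumentIds(node, ids) {
    ids = ids || [];
    (node.elements || []).forEach(function (element) {
        if (element.type !== 'argumentElement') {
            return;
        }
        if (ids.indexOf(element.id) === -1) {
            ids.push(element.id);
        }
        // Plural-style formats carry nested patterns in their options.
        var options = (element.format && element.format.options) || [];
        options.forEach(function (option) {
            collectArgumentIds(option.value, ids);
        });
    });
    return ids;
}

// Trimmed copy of the AST shown above (only the "other" option is kept).
var ast = {
    type: 'messageFormatPattern',
    elements: [
        { type: 'messageTextElement', value: 'On ' },
        { type: 'argumentElement', id: 'takenDate',
          format: { type: 'dateFormat', style: 'short' } },
        { type: 'messageTextElement', value: ' ' },
        { type: 'argumentElement', id: 'name', format: null },
        { type: 'messageTextElement', value: ' took ' },
        { type: 'argumentElement', id: 'numPhotos',
          format: {
              type: 'pluralFormat',
              offset: 0,
              options: [
                  { type: 'optionalFormatPattern', selector: 'other',
                    value: {
                        type: 'messageFormatPattern',
                        elements: [
                            { type: 'messageTextElement', value: '# photos.' }
                        ]
                    } }
              ]
          } }
    ]
};

console.log(collectArgumentIds(ast)); // [ 'takenDate', 'name', 'numPhotos' ]
```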

License

This software is free to use under the Yahoo! Inc. BSD license. See the LICENSE file for license text and copyright information.
