Skip to content

Commit

Permalink
discussion about design choices added,
Browse files Browse the repository at this point in the history
some cleanup, SpecificationText.md added
  • Loading branch information
gkurzbach committed Feb 8, 2022
1 parent 31a4ac7 commit e78573b
Show file tree
Hide file tree
Showing 2 changed files with 194 additions and 123 deletions.
252 changes: 129 additions & 123 deletions RationaleMCP/0035/Readme.md
Original file line number Diff line number Diff line change
@@ -1,123 +1,129 @@
# Modelica Change Proposal MCP-0035
# Multilingual support of Modelica
## Authors: Anett Kloß, Gerd Kurzbach, Olaf Oelsner, (Thomas Beutlich)
(In development)
--

# Summary
Modelica currently supports only one language for description texts. This is usually English. In order to better support users from other language areas and to make Modelica libraries more attractive and easier to understand for them, it should be possible to provide translations of the texts in any language in a standardized form, so that tools can find, read and display them to the user in his or her preferred language.
The definition of the interface to the translated texts is the subject of the MCP.

In the discussion in [#302](https://github.com/modelica/ModelicaSpecification/issues/302) it was decided to provide the translation externally of the Modelica files using the [GNU gettext](https://www.gnu.org/software/gettext/) format. The exact use of this format in the Modelica context is described in the MCP.

# Revisions
| Date | Description |
| --- | --- |
| 2021-06-14 | Anett Kloß: Prepared document based on ticket #302 |

# Contributor License Agreement
All authors of this MCP or their organizations have signed the "Modelica Contributor License Agreement".

# Rationale
Why should this feature be included?
* A better understanding of the libraries by users with different language background will make the libraries more attractive.
* Support cases might occur less often, due to better understanding of the description.
* A new translation is additionally a review of the current state and can result in a better quality of both language versions.
* Users can even write their own translation, when having a template *.pot-File for the library.

Use case for your proposal that show how it is applied.
* POT-File of current MSL 4.0 library --> this translation template is provided by ESI and filled for the Hydraulics Pump ...
(*add pictures*)

*For each use case state how it could be implemented by current Modelica at best and why the current Modelica does not suffice for this application.*

Proposed changes in Specification:
* Add section 13.6 into "external ressources" which describes following:
* Where shall the translation file be stored (Subfolder Language in the Resources Folder; proposal see below in tool implementation [step 3](#3.-file-storage))
* How does a GNU gettext translation template for a library look like and which kind of descriptive texts shall be included (proposal in tool implementation [step 1](#1.-read-out-a-translation-template-file))?
* How does a GNU gettext translation of a library look like (proposal in tool implementation [step 2](#2.-translation-of-the-template-file-into-a-language-file))?
* Which strings should be translated:

The following Modelica constructs shall be translated:
* Description / comment
* choices annotation
* missingInnerMessage, obsolete, unassignedMessage annotations
* Dialog.[group|tab] annotations
* Dialog.[load|save]Selector.[caption|filter] annotations
* Documentation.[info|revisions] annotations
* Text in diagram

The precise updated text of the specification is part of this branch/pull-request: No.

# Backwards Compatibility
The proposal is backwards compatible. All multilingual text changes can be applied without effect at existing Modelica code.
After the introduction of translation files for libraries, this files need to be adjusted with every library change which adds new commented parts or changes existing comments.

# Tool Implementation

## Experience with Prototype
### Implementation steps
#### 1. Read out a translation template file
Create at first a translation template file \*.pot for a Modelica library. Means a (tool dependent) readout of all needed text strings into a *LibraryName*.pot-File. It contains:
- a header which concludes all necessary information
![Header of a *.pot file](POT_Header.png)

- for all original strings entries with following structure:
```
msgctxt "Hydraulics.Basics.TankOverflow"
msgid "Volume Flow Port A"
msgstr ""
```

Where *msgctxt* contains the name of the Modelica Class and *msgid* contains the textstring which is situated in this name space and shall be translated. Please regard that if a *msgid* string (in this example "Volume Flow Port A") is given more than once in a namespace, all occurrences are translated with the same translation!

It shall be mandatory to use for *msgctxt* the full name of a class:
- so a specific translation depending on the content in this namespace is possible, as every textstring may differ depending on the context,
- using the class makes it easier for the Modelica tool to copy and re-arrange models and packages without loosing the already existing language information.

The following Modelica constructs shall be read out / translated:
* Description / comment
* choices annotation
* missingInnerMessage, obsolete, unassignedMessage annotations
* Dialog.[group|tab] annotations
* Dialog.[load|save]Selector.[caption|filter] annotations
* Documentation.[info|revisions] annotations
* Text in diagram

Having read out the file it just needs to be changed, if the library is changed (e.g. commented parts are added or changed or when having changed the Modelica name of the text strings containing element).

#### 2. Translation of the template file into a language file
Starting from the template file create a \*.po file for each needed language (e.g. german: *LibraryName*.de.po-File).

Edit the *msgstr* with the translation in the wanted language (here for german: `` Hydraulics.de.po``) :
```
msgctxt "Hydraulics.Basics.TankOverflow"
msgid "Volume Flow Port A"
msgstr "Volumenstrom an Anschluss A"
```

Hereby no error shall occur if there is one comment not having an translation. In this case the not translated text (the content of *msgid*) shall be used.
Again the file needs to be adjusted if there are changes at the library (e.g. commented parts are added or changed or when having changed the Modelica name of the text strings containing element).

---
**Note:** One may use the tool POedit for the translation, as it can be coupled with commercial translation tools for a faster translation.

---
#### 3. File storage
Both files need to be situated in the language directory ``Resources\Language`` of the top-level library e.g.:
``C:\...\<ModelicaPath>\<LibraryName>\Resources\Language``
#### 4. Read in the translation
The tool dependent language setting enables the reading of the current needed *LibraryName*.*Language*.po-File.

### Tool implementation effort
1. Readout of any comment / annotation without loosing content and with handling of exceptions, e.g. ``/"`` in comments. This can e.g. be created by the command:
``xgettext --language=EmacsLisp --sort-output --extract-all --from-code=UTF-8 --output=TranslationTest.pot TranslationTest.mo``
Certainly, the string extraction based on the EmacsLisp language can only be considered as proof of concept, since there is no Modelica language available in xgettext. It is recommended to create the POT file directly from the Modelica tool, like e.g., demonstrated by Wolfram SystemModeler or ESI ITI.
2. Import the comments with handling of exceptions ("").

# Required Patents
None.
# References
[GNU gettext](https://www.gnu.org/software/gettext/)

[Poedit](https://poedit.net/) - useful editor
# Modelica Change Proposal MCP-0035
# Multilingual support of Modelica
## Authors: Anett Kloß, Gerd Kurzbach, Olaf Oelsner, Thomas Beutlich

--

# Summary
Modelica currently supports only one language for description texts. This is usually English. In order to better support users from other language areas and to make Modelica libraries more attractive and easier to understand for them, it should be possible to provide translations of the texts in any language in a standardized form, so that tools can find, read and display them to the user in his or her preferred language.
The definition of the interface to the translated texts is the subject of the MCP.

In the discussion in [#302](https://github.com/modelica/ModelicaSpecification/issues/302) it was decided to provide the translation externally of the Modelica files using the [GNU gettext](https://www.gnu.org/software/gettext/) format. The use of this format in the Modelica context is described in the MCP.

# Revisions
| Date | Description |
| ------------ | --- |
| 2021-06-14 | Anett Kloß: Prepared document based on ticket #302 |
| 2022-01-07 | Gerd Kurzbach: add SpecificationText.md, some reformulations, discussion of design choices added |

# Contributor License Agreement
All authors of this MCP or their organizations have signed the "Modelica Contributor License Agreement".

# Rationale
Why should this feature be included?
* A better understanding of the libraries by users with different language background will make the libraries more attractive.
* Support cases might occur less often, due to better understanding of the description.
* A new translation is additionally a review of the current state and can result in a better quality of both language versions.
* Users can even write their own translation, when having a template *.pot-File for the library.

The use case for the proposal is the MSL 4.0.
* POT-File of current MSL 4.0 library --> this translation template is provided and filled for the Hydraulics Pump.
(*add pictures*)

Proposed changes to the specification are described in SpecificationText.md. It consists of a new section **13.6 Multilingual Descriptions** which contains:
* Where the translation files to be stored
* How does a GNU gettext translation template for a library look like and which kind of descriptive texts shall be included?
* Which strings should be translated:

The precise updated text of the specification is part of this branch/pull-request: Yes.

# Backwards Compatibility
The proposal is backwards compatible. All multilingual text changes can be applied without effect at existing Modelica code.
After the introduction of translation files for libraries, this files need to be adjusted with every library change which adds new commented parts or changes existing comments.

# Tool Implementation

## Experience with Prototype
### Implementation steps
#### 1. Read out a translation template file
Create at first a translation template file \*.pot for a Modelica library. Means a (tool dependent) readout of all needed text strings into a *LibraryName*.pot-File. It contains:
- a header which concludes all necessary information
![Header of a *.pot file](POT_Header.png)

- for all original strings entries with following structure:
```
msgctxt "Hydraulics.Basics.TankOverflow"
msgid "Volume Flow Port A"
msgstr ""
```

Where *msgctxt* contains the name of the Modelica Class and *msgid* contains the textstring which is situated in this name space and shall be translated. Please regard that if a *msgid* string (in this example "Volume Flow Port A") is given more than once in a namespace, all occurrences are translated with the same translation!

It shall be mandatory to use for *msgctxt* the full name of a class:
- so a specific translation depending on the content in this namespace is possible, as every textstring may differ depending on the context,
- using the class makes it easier for the Modelica tool to copy and re-arrange models and packages without loosing the already existing language information.

The following Modelica constructs shall be read out / translated:
* syntactical comments
* strings in following annotations:
* Text.string, Text.textString
* missingInnerMessage, obsolete, unassignedMessage
* Dialog.[group|tab]
* Dialog.[load|save]Selector.[caption|filter]
* Documentation.[info|revisions]
* Figure.title, Plot.title, Curve.legend

Having read out the file it just needs to be changed, if the library is changed (e.g. commented parts are added or changed or when having changed the Modelica name of the text strings containing element).

#### 2. Translation of the template file into a language file
Starting from the template file create a \*.po file for each needed language (e.g. german: *LibraryName*.de.po-File).

Edit the *msgstr* with the translation in the wanted language (here for german: `` Hydraulics.de.po``) :
```
msgctxt "Hydraulics.Basics.TankOverflow"
msgid "Volume Flow Port A"
msgstr "Volumenstrom an Anschluss A"
```

Hereby no error shall occur if there is one comment not having a translation. In this case the not translated text (the content of *msgid*) shall be used.
Again the file needs to be adjusted if there are changes at the library (e.g. commented parts are added or changed or when having changed the Modelica name of the text strings containing element).

---
**Note:** One may use the tool POedit for the translation, as it can be coupled with commercial translation tools for a faster translation.

---
#### 3. File storage
Both files need to be situated in the language directory ``Resources\Language`` of the top-level library e.g.:
``C:\...\<ModelicaPath>\<LibraryName>\Resources\Language``
#### 4. Read in the translation
The tool dependent language setting enables the reading of the current needed *LibraryName*.*Language*.po-file.

### Tool implementation effort
1. Readout of any comment / annotation without loosing content and with handling of exceptions, e.g. ``/"`` in comments. This can e.g. be created by the command:
``xgettext --language=EmacsLisp --sort-output --extract-all --from-code=UTF-8 --output=TranslationTest.pot TranslationTest.mo``
Certainly, the string extraction based on the EmacsLisp language can only be considered as proof of concept, since there is no Modelica language available in xgettext. It is recommended to create the POT file directly from the Modelica tool.

2. When reading in a library read corresponding the *LibraryName*.*Language*.po-file into an internal table and lookup the translated strings just before presenting them to the user.

## Discussion of the "One File per Library" design
There is a design choice how to structure the files for translation information either
* one file per library, or
* one file per class.

In this proposal the design "One file per library" is chosen, because it simplifies the handling of the translation data, the translation process and the generation of the template. The information of the header need to be stored only once.

If after finishing a library one wants to translate them into another language, he can use the preferred tool and feed the file as a whole into it. After translating the strings, he can then save the file them and is done.

One file per class, possibly stored in a directory structure that corresponds to the class hierarchy, result in a large number of files, with each file having to be translated separately. While this has the advantage of allowing parallel translations, it can also lead to undesirably different translations of the same texts, as different people translate things differently.

Also a Modelica tool only has to take care of one file per library. Normally, for performance reasons, the information from the files is stored internally in one big table anyway, so there is no need to split the information.

Multiple files have an advantage when renaming or moving classes, then the files only need to be renamed. The keyword msgctxt can then be omitted. In the case of only one file this can be realized by a search and replace of the msgctext information, which is not too complicated. The use case of moving classes in a finished library does not occur so often, because then also the models which use them must be adapted.


# Required Patents
None.
# References
[GNU gettext](https://www.gnu.org/software/gettext/)

[Poedit](https://poedit.net/) - useful editor
65 changes: 65 additions & 0 deletions RationaleMCP/0035/SpecificationText.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# 13.6 Multilingual Descriptions

[*Descriptive texts in a model or library are usually formulated in English. To support users who are not so familiar with this, a Modelica tool can support the translation of these texts into any language. This should be done using external files, so that no modification of the actual Modelica text is required.*]

The files to support translation must be provided by the developer of the library. They must be stored in a subdirectory of the Resources directory of the library with the name `Language`.

Two kind of files have to be provided:
* a template file `<LibraryName>.pot`.
It contains all necessary information to translate all descriptions, but no translations. The pattern `<LibraryName>` denotes the toplevel class name of the library.
* one file for each supported language with the name `<LibraryName>.<language>.po`. This file is a copy of the associated template file, but extended with the translations in the specified language. The pattern `<language>` stands for the ISO 639 language code, e.g. de or en.

The files consist of a header and a body. All text strings are in double quotes and encoded with UTF-8 characters. Comments start with an `#` and are continued until the end of line. Spaces outside strings are ignored and used as separators. The detailed format of these files is described in [GNU gettext](https://www.gnu.org/software/gettext/manual/gettext.pdf).

The header looks like this:
```
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2019-03-15 10:52+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"Language: \n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
```
All general terms in the header should be replaced by specific information.

In the body, directly following the header, for each descriptive text there is an entry of the form
```
#: <file name>:<line number>
msgctxt "<context identifier>"
msgid "<descriptive text>"
msgstr "<translation>"
```
Only the keywords `msgctxt`, `msgid` and `msgstr` are used.

At first there can be an optional comment describing the location (file name and line number) of the text to translate. Multiple occurences of the same string can be listed here, separated by space.

Then, the `<context identifier>` behind the keyword `msgctxt` is the full name of the Modelica class (e.g. `Modelica.Blocks.Math.Sin` ) where the text appears in. Short class definitions do not appear here. Texts in such classes belong to the enclosing full class definition.

After the `msgid` keyword the text string which shall be translated follows. [*Please regard that if a `msgid` string is given more than once in the same context, all occurrences are translated with the same (last) translation!*]

The keyword `msgstr` is followed by the translation of `msgid`. In the template file this string is empty by definition. If this is empty in a language specific file the contents of `msgid` may be used instead.

The texts in following Modelica constructs should be translated:
* syntactical comments
* strings in following annotations:
* Text.string, Text.textString
* missingInnerMessage, obsolete, unassignedMessage
* Dialog.[group|tab]
* Dialog.[load|save]Selector.[caption|filter]
* Documentation.[info|revisions]
* Figure.title, Plot.title, Curve.legend

[*To support the translation of that strings a number of free and commercial tools exist in context of GNU gettext.
A Modelica tool should provide a function to create the initial template file from an existing Modelica library.*]

0 comments on commit e78573b

Please sign in to comment.