Implement support for translating all messages #608

Open
certik opened this issue Aug 17, 2022 · 14 comments

Comments

@certik
Contributor

certik commented Aug 17, 2022

All messages that LFortran prints should be translated (errors, warnings, hints, etc.; later also documentation).

Options (not necessarily mutually exclusive):

CC @awvwgk.

@awvwgk
Member

awvwgk commented Aug 17, 2022

I can have a look into this.

@certik
Contributor Author

certik commented Aug 17, 2022

Thanks @awvwgk !

@awvwgk
Member

awvwgk commented Aug 17, 2022

For gettext the general reference found here is quite helpful: https://www.gnu.org/software/gettext/manual/gettext.html#Preparing-Strings. The main takeaway is that a program / library with translatable strings should not use any string concatenation, and any reported string should be self-contained enough for translators to make sense of (which is generally a good design decision for user-facing messages).
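To illustrate the concatenation point, here is a minimal Python sketch. The `CZECH` dict and the helper names are hypothetical stand-ins for a real message catalog, not LFortran code; the Czech text is the sample message quoted further down in this thread.

```python
# Hypothetical stand-in for a message catalog: msgid -> translated string.
CZECH = {
    "Variable '{name}' is not declared": "Proměnná '{name}' není deklarovaná",
}


def translate(msgid, catalog):
    """Return the translation, or the msgid itself if none exists."""
    return catalog.get(msgid, msgid)


def undeclared_concatenated(name, catalog):
    # Anti-pattern: each fragment is looked up on its own, so a translator
    # never sees the whole sentence and cannot reorder its parts.
    return (
        translate("Variable '", catalog)
        + name
        + translate("' is not declared", catalog)
    )


def undeclared_self_contained(name, catalog):
    # The whole sentence is one msgid with a named placeholder.
    return translate("Variable '{name}' is not declared", catalog).format(name=name)
```

With this catalog, `undeclared_self_contained("dp", CZECH)` yields the Czech sentence, while the concatenated variant finds no match for its fragments and falls back to English.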

For gettext there are two modes, using either gettext or dgettext: the former is used for standalone executables and the latter for libraries. The C version keeps global state for gettext / dgettext. Note that if this global state is not initialized, querying the message catalog will not yield a translation (GNU gettext then returns the original message string unchanged). Since the global state is handled outside of the application / library using the message catalog, we don't have to worry about passing an additional object around.
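The two lookup styles can be sketched with Python's standard `gettext` module, which mirrors the C function names. This is only an illustration of the global-state idea, not the C API itself; the domain name `lfortran_demo` and the helper names are assumptions. In this Python implementation, a lookup with no catalog available returns the msgid unchanged.

```python
import gettext

DOMAIN = "lfortran_demo"  # hypothetical text domain name


def init_i18n(localedir):
    """Set the module-level (global) gettext state once, at startup."""
    gettext.bindtextdomain(DOMAIN, localedir)  # where catalogs for DOMAIN live
    gettext.textdomain(DOMAIN)                 # default domain used by gettext()


def app_message(msgid):
    """Executable style: look up in the current global default domain."""
    return gettext.gettext(msgid)


def lib_message(msgid):
    """Library style: always name the domain explicitly."""
    return gettext.dgettext(DOMAIN, msgid)
```

Neither helper takes a catalog object as an argument, which is the point made above: the state lives in the gettext module rather than in the caller.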

In the C++ version we get a messages instance which holds all message strings; for the time being we don't really care about much of the other functionality available in the locale header. This message catalog needs to be available everywhere in LFortran where a string could be produced that will eventually be displayed to the user. It must also be initialized, or we will have similar issues as with the C version.

The location of the MO files which are loaded is usually a path determined at configure time. Note that this might interfere when the path is relative to the installation prefix and we run LFortran out of the build directory for quick testing.
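One possible way around the build-directory problem is to try several candidate directories in order. The sketch below is an assumption about how that could look, not how LFortran resolves paths: the layout `<prefix>/share/locale`, the `<build_dir>/locale` directory, and the function name are all hypothetical.

```python
from pathlib import Path


def find_locale_dir(executable, build_dir=None):
    """Return the first existing candidate locale directory, or None.

    Assumed layout (hypothetical): the installed binary lives in
    <prefix>/bin and catalogs in <prefix>/share/locale; a build tree
    keeps freshly generated catalogs in <build_dir>/locale.
    """
    exe = Path(executable).resolve()
    candidates = []
    if build_dir is not None:
        # Prefer the build tree when one is given (quick testing).
        candidates.append(Path(build_dir) / "locale")
    # Path relative to the installation prefix.
    candidates.append(exe.parent.parent / "share" / "locale")
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

Returning `None` leaves the caller free to fall back to the untranslated messages.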

@certik
Contributor Author

certik commented Aug 17, 2022

Is the usual approach to ship a binary (of lfortran) with external files (.mo files) and then decide at runtime which language to use (via either a compiler option or some environment variable, perhaps $LANG), look for the right translation MO file and use it, and, if it does not exist, use the default English messages?

@awvwgk
Member

awvwgk commented Aug 17, 2022

Indeed, we would install the MO files along with the binaries. The repository would only store the PO files, and we might run the msgfmt program in the build0.sh script or as part of the CMake build. The MO files will then be loaded according to the current locale settings. Internally in LFortran we have to resolve the path to the installed MO files using the locale information and open the message catalog if it exists, otherwise fall back to the default language (English).
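The load-or-fall-back step can be sketched with Python's standard `gettext` module. This is an illustration under assumptions, not LFortran's implementation: the domain name `lfortran_demo` and the function name are hypothetical. `gettext.translation` searches `<localedir>/<language>/LC_MESSAGES/<domain>.mo` for each requested language.

```python
import gettext

DOMAIN = "lfortran_demo"  # hypothetical text domain name


def load_catalog(localedir, languages=None):
    """Open the catalog for the first matching language, else fall back.

    With fallback=True a missing MO file yields a NullTranslations
    object, whose gettext() returns the original (English) msgid.
    When languages is None, the language is taken from the environment.
    """
    return gettext.translation(
        DOMAIN, localedir=localedir, languages=languages, fallback=True
    )
```

A call such as `load_catalog("/usr/share/locale", ["cs"]).gettext("semantic error")` would return the Czech text if a matching MO file is installed, and the English msgid otherwise.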

@certik
Contributor Author

certik commented Aug 17, 2022

I see. So we can ship it together with our runtime library, we already have a mechanism to load files at runtime.

One issue that we are facing is running in WASM, where we currently ship a single lfortran.wasm that does not include the runtime library; we are thinking of optionally embedding the runtime library in the binary. We could similarly optionally embed these generated MO files in the binary, so that translations work in WASM as well.

@awvwgk
Member

awvwgk commented Aug 17, 2022

All I could find on the machine object (MO) format is this reference: https://www.gnu.org/software/gettext/manual/html_node/MO-Files.html. The binary blob contains a couple of offsets from the start of the file to identify the location of the strings. Embedding should work okay as long as we can read an arbitrary chunk of our own binary as an MO file.
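To make the offsets concrete, the following Python sketch builds a minimal little-endian MO blob in memory and then loads it without touching the file system, which is roughly what reading an embedded chunk would amount to. It follows the layout in the reference above (a header of seven 32-bit words, then two tables of length/offset pairs, then the NUL-terminated strings) and omits the optional hash table. The function names are hypothetical, and a real build would use msgfmt rather than this helper.

```python
import gettext
import io
import struct

MO_MAGIC = 0x950412DE  # magic number of the MO format


def build_mo(messages):
    """Serialize a msgid -> msgstr dict into a minimal MO blob."""
    # The entry with the empty msgid carries metadata such as the charset.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n", **messages}
    keys = sorted(entries)  # original strings are stored in sorted order
    ids = b""
    strs = b""
    spans = []  # (id offset, id length, str offset, str length)
    for key in keys:
        kb = key.encode("utf-8")
        vb = entries[key].encode("utf-8")
        spans.append((len(ids), len(kb), len(strs), len(vb)))
        ids += kb + b"\0"
        strs += vb + b"\0"
    count = len(keys)
    orig_table_off = 7 * 4                    # right after the header
    trans_table_off = orig_table_off + 8 * count
    ids_start = trans_table_off + 8 * count   # string data follows the tables
    strs_start = ids_start + len(ids)
    orig_table = b""
    trans_table = b""
    for id_off, id_len, str_off, str_len in spans:
        # Each table entry is (length, offset from the start of the file).
        orig_table += struct.pack("<II", id_len, ids_start + id_off)
        trans_table += struct.pack("<II", str_len, strs_start + str_off)
    header = struct.pack(
        "<7I", MO_MAGIC, 0, count, orig_table_off, trans_table_off, 0, 0
    )
    return header + orig_table + trans_table + ids + strs


def load_embedded(blob):
    """Parse an MO blob held in memory, e.g. a chunk embedded in a binary."""
    return gettext.GNUTranslations(io.BytesIO(blob))
```

For example, `load_embedded(build_mo({"semantic error": "semantická chyba"}))` gives a catalog object whose `gettext("semantic error")` returns the Czech string.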

@meow464
Contributor

meow464 commented Aug 19, 2022

Please make it possible to get the English messages. Sometimes search engines give no meaningful results and translating back to English can be hard.

@certik
Contributor Author

certik commented Aug 19, 2022

You can always get the English messages.

@meow464
Contributor

meow464 commented Aug 19, 2022

Sometimes that's hard, especially for students or non-tech-savvy people. Having a --english-messages option could be very helpful. If that's not too hard, of course.

@awvwgk
Member

awvwgk commented Aug 19, 2022

> Having a --english-messages option could be very helpful. If that's not too hard of course.

How is this different from setting LANG=en_US.UTF-8 or LC_ALL=en_US.UTF-8?

@14NGiestas
Contributor

LANG=en (...) also works and it is shorter xD

@meow464
Contributor

meow464 commented Aug 19, 2022

All those years....

@certik
Contributor Author

certik commented Aug 20, 2022

@meow464 I see your request: say I get some Czech error messages:

$ lfortran a.f90
semantická chyba: Proměnná 'dp' není deklarovaná
  --> a.f90:19:11
   |
19 |     real (dp) :: x
   |           ^^ 'dp' není deklarovaná

but want to search online if somebody else hit the same error. The way to do that would be to switch to English temporarily:

$ LANG=en lfortran a.f90
semantic error: Variable 'dp' is not declared
  --> a.f90:19:11
   |
19 |     real (dp) :: x
   |           ^^ 'dp' is undeclared

So it seems the LANG solution would work. (We could of course add a compiler option too for selecting the language, I think that would be easy.)
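How a compiler option could sit on top of the environment variables can be sketched as follows. This is a hypothetical illustration, not an existing LFortran feature: the `cli_language` parameter stands for whatever option might be added, and the function name is made up. The variable order (LANGUAGE, LC_ALL, LC_MESSAGES, LANG) is the one Python's standard `gettext` module checks.

```python
import os


def pick_languages(cli_language=None, environ=None):
    """Return the preferred language list, most specific source first.

    cli_language models a hypothetical compiler option; when it is not
    given, the usual environment variables are consulted in order.
    """
    env = os.environ if environ is None else environ
    if cli_language:
        return [cli_language]
    for var in ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG"):
        value = env.get(var)
        if value:
            # LANGUAGE may hold a colon-separated priority list.
            return value.split(":")
    return ["C"]  # no preference set: untranslated messages
```

With `LANG=cs_CZ.UTF-8` in the environment this returns `["cs_CZ.UTF-8"]`, and passing `cli_language="en"` overrides it, matching the temporary switch to English shown above.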


4 participants