Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client-side translations: gettext and plural support #13687

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Ekdohibs
Copy link
Member

@Ekdohibs Ekdohibs commented Jul 24, 2023

This is a rebase of #12446.

This PR is complete, see instructions on the previous PR for how to test.

How to test

For now, take a mod using client-side translations, and replace the .tr file with a .po one, with the following entries for each original translated string:

msgctx "<textdomain>"
msgid "<original string>"
msgstr "<translated string>"

@Ekdohibs Ekdohibs added @ Translation Testing needed Feature ✨ PRs that add or enhance a feature labels Jul 24, 2023
@Ekdohibs Ekdohibs force-pushed the translations-gettext3 branch 2 times, most recently from db06c26 to 0819a09 Compare July 24, 2023 17:08
@rubenwardy rubenwardy added the Concept approved Approved by a core dev: PRs welcomed! label Sep 17, 2023
@Zughy Zughy added Supported by core dev Not on the roadmap, yet some core dev decided to take care of this PR and removed Concept approved Approved by a core dev: PRs welcomed! labels Jan 21, 2024
@rubenwardy
Copy link
Member

rubenwardy commented Mar 3, 2024

I will have time to review this soon, so please rebase. I've given it a quick look through, and have two comments:

  • The parser is in the Translation class, would be nice to have a separate class for this with unit tests. Although, I suppose same for .tr
  • I wonder if the script is outdated now that we have a util script for extracting .tr

@rubenwardy rubenwardy added the Rebase needed The PR needs to be rebased by its author. label Mar 3, 2024
@Ekdohibs
Copy link
Member Author

Ekdohibs commented Mar 7, 2024

I just rebased it, it should be good now. I'll try to write some unit tests and move the parser to its own class, it might be a bit harder to do for .mo files as they are a binary format.
Concerning the script, I can't seem to find the util script in the minetest/ repository, has it been merged there? The objective of including it was so that it is distributed with mintest itself; it is also updated to support plural translations and .po output. If the script is hosted somewhere else, I can make the PR there, however.

@Ekdohibs Ekdohibs removed the Rebase needed The PR needs to be rebased by its author. label Mar 7, 2024
Copy link
Contributor

@y5nw y5nw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking.

doc/lua_api.md Show resolved Hide resolved
src/translation.h Show resolved Hide resolved
line.resize(line.length() - 1);

if (line.empty() || line[0] == '#')
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since xgettext can generate fuzzy entries it would make sense to check for (and skip) those.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the documentation of gettext, it seems to say that fuzzy entries are considered as normal entries for most purposes, so I wouldn't drop them. In particular, the .po to .mo compilation would use them, so we should use them as well if we don't want to introduce a discrepancy.

Copy link
Contributor

@y5nw y5nw Mar 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the documentation of gettext, it seems to say that fuzzy entries are considered as normal entries for most purposes, so I wouldn't drop them.

While that is the case, the problem with fuzzy matching is that it is enabled by default in msgmerge (unless the -N flag is passed) and can often match entries with different (sometimes opposite, e.g. "enabled" and "disabled") meanings while assigning the same translation if one is missing, and there are also reasons to generate fuzzy entries, e.g. if the source string does not change much (adding a period to the end of the source string) or to give translators a base string to modify/work with.

In particular, the .po to .mo compilation would use them, so we should use them as well if we don't want to introduce a discrepancy.

msgfmt does not seem to include fuzzy entries by default (there is the -f flag for including them which GNU does not recommend using).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the documentation! I'll implement matching the fuzzy entries in that case.

doc/lua_api.md Outdated Show resolved Hide resolved
@rubenwardy rubenwardy self-requested a review March 8, 2024 20:30
@rubenwardy
Copy link
Member

Concerning the script, I can't seem to find the util script in the minetest/ repository, has it been merged there? The objective of including it was so that it is distributed with mintest itself; it is also updated to support plural translations and .po output. If the script is hosted somewhere else, I can make the PR there, however.

I was referring to https://github.com/minetest/minetest/blob/master/util/mod_translation_updater.py

Makes sense for someone to update that in another PR though

description = "Test translations",
privs = {},
func = function(name, param)
minetest.chat_send_player(name, "Please ensure your locale is set to \"fr\"")
Copy link
Member

@rubenwardy rubenwardy Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
minetest.chat_send_player(name, "Please ensure your locale is set to \"fr\"")
if minetest.get_player_information(name).lang_code ~= "fr" then
return false, "Please restart Minetest with language = fr"
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you resolved the comment without applying the suggestion? Please reply if you don't like a suggestion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied the suggestion, but I must have forgotten to pull before rebasing, deleting the commit. I'll commit it again, thanks for catching it!
Although, now that I'm thinking about it, we really want to test this both with fr and default locale (to see both untranslated and translated strings), I'll need to check what lang_code is in that case.

src/translation.cpp Outdated Show resolved Hide resolved
@rubenwardy rubenwardy added the Action / change needed Code still needs changes (PR) / more information requested (Issues) label Mar 9, 2024
@Ekdohibs Ekdohibs removed the Action / change needed Code still needs changes (PR) / more information requested (Issues) label Mar 17, 2024
@Ekdohibs
Copy link
Member Author

Hmmmm, I'm not why the CI fails, do you have an idea?

msgid "Testing .mo files: untranslated"
msgstr "Testing .mo files: translated"

msgctxt "testtranslations"
Copy link
Member

@rubenwardy rubenwardy Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the duplicate msgctxt?

Looks like this is required on every translation to set the text domain, are you sure gettext requires this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this msgctxt is required for every message, this should be checked by the engine and warned/errored about

Copy link
Contributor

@y5nw y5nw Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this file is used to generate locale/translation_mo.fr.mo? The translation file reader uses the base filename (here: translation_mo) as the text domain if one is not specified.

The GNU Gettext doc on PO files does not seem clear about about the scope (i.e. the entries that the msgctxt applies to) of msgctxt. However, considering that

  • The documentation states that "an empty context string and an absent msgctxt line do not mean the same thing";
  • Entries with msgctxt are considered as "entries with a context specifier"; and
  • The entire documentation does not mention "scopes" of other specifiers (that said, having msgid and msgtxt to multiple entries would make little sense anyway)

It would be safe to assume that entries without an explicit context specifier would be considered as entries without context information, which, combined with the fact that Minetest does not differentiate between "text domain" and "context" (and, in this PR, uses the former if the latter is absent), means that any entry without an explicit msgctxt specifier would be put under the default (here: translation_mo) domain, which would explain the duplicate msgctxt as later entries would otherwise be put into the translation_mo domain.

It is probably also worth noting that the generated .mo file includes context information on every entry as well.

That said, while entries without msgctxt are allowed (see #13687 (comment)), there does not seem to be any testcase/testfile related to this, so it would make sense to expand the test file to include entries without an explicit context specifier.

@@ -2551,8 +2551,8 @@ bool Server::addMediaFile(const std::string &filename,
".png", ".jpg", ".bmp", ".tga",
".ogg",
".x", ".b3d", ".obj",
// Custom translation file format
".tr",
// Translation file format
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Translation file format
// Translation file formats

@@ -0,0 +1,2 @@
# textdomain: testtranslations
Testing .tr files: untranslated=Testing .tr files: translated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing EOF new line

Suggested change
Testing .tr files: untranslated=Testing .tr files: translated
Testing .tr files: untranslated=Testing .tr files: translated

Comment on lines +4126 to +4129
```lua
local S, NS = minetest.get_translator()
NS("@1 file", "@1 files", n, tostring(n))
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code blocks need blank lines around them

Suggested change
```lua
local S, NS = minetest.get_translator()
NS("@1 file", "@1 files", n, tostring(n))
```
```lua
local S, NS = minetest.get_translator()
NS("@1 file", "@1 files", n, tostring(n))

Comment on lines +4148 to +4149
Old translation file format
---------------------------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Old translation file format
---------------------------
Old translation file format (.tr)
---------------------------------

Translation file format
-----------------------
Old translation file format
---------------------------

A translation file has the suffix `.[lang].tr`, where `[lang]` is the language
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A translation file has the suffix `.[lang].tr`, where `[lang]` is the language
Note: You should use the Gettext translation file format instead.
A translation file has the suffix `.[lang].tr`, where `[lang]` is the language

description = "Test translations",
privs = {},
func = function(name, param)
minetest.chat_send_player(name, "Please ensure your locale is set to \"fr\"")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why you resolved the comment without applying the suggestion? Please reply if you don't like a suggestion

@rubenwardy rubenwardy added Action / change needed Code still needs changes (PR) / more information requested (Issues) Rebase needed The PR needs to be rebased by its author. labels Mar 24, 2024
@rubenwardy
Copy link
Member

rubenwardy commented Mar 24, 2024

Here's a patch to fix content translation with .po files

commit cfb914572ebd41066587f59af1022ecee87529f3
Author: rubenwardy <rw@rubenwardy.com>
Date:   Sun Mar 24 18:24:30 2024 +0000

    Fix content translation and Gettext format

diff --git a/src/gui/guiEngine.cpp b/src/gui/guiEngine.cpp
index 4e22c3ae8..529079ea2 100644
--- a/src/gui/guiEngine.cpp
+++ b/src/gui/guiEngine.cpp
@@ -214,15 +214,28 @@ GUIEngine::GUIEngine(JoystickController *joystick,
 
 
 /******************************************************************************/
-std::string findLocaleFileInMods(const std::string &path, const std::string &filename)
+std::string findLocaleFileWithExtension(const std::string &path)
+{
+	if (fs::PathExists(path + ".mo"))
+		return path + ".mo";
+	if (fs::PathExists(path + ".po"))
+		return path + ".po";
+	if (fs::PathExists(path + ".tr"))
+		return path + ".tr";
+	return "";
+}
+
+
+/******************************************************************************/
+std::string findLocaleFileInMods(const std::string &path, const std::string &filename_no_ext)
 {
 	std::vector<ModSpec> mods = flattenMods(getModsInPath(path, "root", true));
 
 	for (const auto &mod : mods) {
-		std::string ret = mod.path + DIR_DELIM "locale" DIR_DELIM + filename;
-		if (fs::PathExists(ret)) {
+		std::string ret = findLocaleFileWithExtension(
+				mod.path + DIR_DELIM "locale" DIR_DELIM + filename_no_ext);
+		if (!ret.empty())
 			return ret;
-		}
 	}
 
 	return "";
@@ -235,19 +248,26 @@ Translations *GUIEngine::getContentTranslations(const std::string &path,
 	if (domain.empty() || lang_code.empty())
 		return nullptr;
 
-	std::string filename = domain + "." + lang_code + ".tr";
-	std::string key = path + DIR_DELIM "locale" DIR_DELIM + filename;
+	std::string filename_no_ext = domain + "." + lang_code;
+	std::string key = path + DIR_DELIM "locale" DIR_DELIM + filename_no_ext;
 
 	if (key == m_last_translations_key)
 		return &m_last_translations;
 
 	std::string trans_path = key;
-	ContentType type = getContentType(path);
-	if (type == ContentType::GAME)
-		trans_path = findLocaleFileInMods(path + DIR_DELIM "mods" DIR_DELIM, filename);
-	else if (type == ContentType::MODPACK)
-		trans_path = findLocaleFileInMods(path, filename);
-	// We don't need to search for locale files in a mod, as there's only one `locale` folder.
+
+	switch (getContentType(path)) {
+	case ContentType::GAME:
+		trans_path = findLocaleFileInMods(path + DIR_DELIM "mods" DIR_DELIM,
+				filename_no_ext);
+		break;
+	case ContentType::MODPACK:
+		trans_path = findLocaleFileInMods(path, filename_no_ext);
+		break;
+	default:
+		trans_path = findLocaleFileWithExtension(trans_path);
+		break;
+	}
 
 	if (trans_path.empty())
 		return nullptr;

@rubenwardy
Copy link
Member

rubenwardy commented Mar 24, 2024

Here's a zip file for testing gettext content translation

gettext_content_translation.zip

Not sure why this happens just for games, will check master as well. See list title vs title on the right

image

@rubenwardy
Copy link
Member

rubenwardy commented Mar 24, 2024

The parser is in the Translation class, would be nice to have a separate class for this with unit tests. Although, I suppose same for .tr

The reason I say this is because it's a good idea to have unit tests when writing a parser for a file format as it gives some confidence that the parser actually works. The tests should cover a lot of edge cases (missing ", missing msgctxt, comments, etc)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Action / change needed Code still needs changes (PR) / more information requested (Issues) Feature ✨ PRs that add or enhance a feature Rebase needed The PR needs to be rebased by its author. Supported by core dev Not on the roadmap, yet some core dev decided to take care of this PR Testing needed @ Translation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants