Add script to generate translation catalog for the class reference #37114

Merged

ThakeeNathees
Contributor

@ThakeeNathees ThakeeNathees commented Mar 17, 2020

Fix: #37109
The generated file (in *.txt format): translation_catalog.txt
Edit: newer generated file: translation_catalog.txt

C:\dev\godot\doc\translations>python extract.py -h
usage: extract.py [-h] [--path PATH] [--output OUTPUT]

optional arguments:
  -h, --help            show this help message and exit
  --path PATH, -p PATH  The directory containing XML files to collect.
  --output OUTPUT, -o OUTPUT
                        The path to the output file.
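
For reference, a minimal sketch of an argparse setup that would give help text along these lines. The flag names and help strings are taken from the output above; the default values and the `build_parser` name are assumptions for illustration and are not taken from the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # prog is set explicitly so the usage line reads "extract.py".
    parser = argparse.ArgumentParser(prog="extract.py")
    # The defaults below are assumptions, not the real script's defaults.
    parser.add_argument("--path", "-p", default="doc/classes",
                        help="The directory containing XML files to collect.")
    parser.add_argument("--output", "-o", default="docs.pot",
                        help="The path to the output file.")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the example self-contained.
args = build_parser().parse_args(["-p", "modules/regex/doc_classes", "-o", "docs.pot"])
print(args.path, args.output)
```

Either the long or the short form of each option can be used on the command line.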

@ThakeeNathees ThakeeNathees requested a review from a team as a code owner March 17, 2020 16:56
@kuruk-mm
Contributor

I was finishing this T.T

Good job anyway

@akien-mga akien-mga self-requested a review March 17, 2020 17:05
@akien-mga akien-mga added this to the 4.0 milestone Mar 17, 2020
@ThakeeNathees ThakeeNathees force-pushed the translation-catalog-maker branch 4 times, most recently from 0e4913a to 1f29e0a Compare March 18, 2020 12:09
@ThakeeNathees ThakeeNathees force-pushed the translation-catalog-maker branch 2 times, most recently from 8f17605 to 2fbe515 Compare March 18, 2020 14:26
@akien-mga
Member

akien-mga commented Mar 18, 2020

Looks pretty good already, good job! I left a few nitpicks on style.

In the catalog you attached there's a mix of CR, LF and CRLF line endings, but I guess that's from your system or my download; the script itself doesn't seem to generate mismatched line endings.

I'd place the file in doc/translations/extract.py, that's where we'll host the docs .pot and .po files.

Also a suggestion to make the commit log a bit more explicit:

Add script to generate translation catalog for the class reference

Fixes #37109.

@ThakeeNathees ThakeeNathees force-pushed the translation-catalog-maker branch 5 times, most recently from 71e87df to ec4d53c Compare March 18, 2020 16:54
@akien-mga akien-mga changed the title Added: translation_catalog_maker.py Add script to generate translation catalog for the class reference Mar 18, 2020
@ThakeeNathees ThakeeNathees force-pushed the translation-catalog-maker branch 2 times, most recently from 6df8b99 to bce10c1 Compare March 18, 2020 21:41
@akien-mga
Member

akien-mga commented Mar 19, 2020

I've been testing the latest state of this PR together with #37164, and it seems to work great overall.

There's one string however extracted from the classref which breaks the parsing of the resulting .po file (after copying the docs.pot to e.g. fr.po for translation). It's not surprising given the number of escape and special characters it has:


#: modules/regex/doc_classes/RegEx.xml:7
msgid ""
"A regular expression (or regex) is a compact language that can be used to "
"recognise strings that follow a specific pattern, such as URLs, email "
"addresses, complete sentences, etc. For instance, a regex of [code]ab[0-9][/"
"code] would find any string that is [code]ab[/code] followed by any number "
"from [code]0[/code] to [code]9[/code]. For a more in-depth look, you can "
"easily find various tutorials and detailed explanations on the Internet.\n"
"To begin, the RegEx object needs to be compiled with the search pattern "
"using [method compile] before it can be used.\n"
"[codeblock]\n"
"var regex = RegEx.new()\n"
"regex.compile(\"\\\\w-(\\\\d+)\")\n"
"[/codeblock]\n"
"The search pattern must be escaped first for GDScript before it is escaped "
"for the expression. For example, [code]compile(\"\\\\d+\")[/code] would be "
"read by RegEx as [code]\\d+[/code]. Similarly, [code]compile(\"\\\"(?:\\\\\\"
"\\.|[^\\\"])*\\\"\")[/code] would be read as [code]\"(?:\\\\.|[^\"])*\"[/"
"code].\n"
"Using [method search] you can find the pattern within the given text. If a "
"pattern is found, [RegExMatch] is returned and you can retrieve details of "
"the results using functions such as [method RegExMatch.get_string] and "
"[method RegExMatch.get_start].\n"
"[codeblock]\n"
"var regex = RegEx.new()\n"
"regex.compile(\"\\\\w-(\\\\d+)\")\n"
"var result = regex.search(\"abc n-0123\")\n"
"if result:\n"
"    print(result.get_string()) # Would print n-0123\n"
"[/codeblock]\n"
"The results of capturing groups [code]()[/code] can be retrieved by passing "
"the group number to the various functions in [RegExMatch]. Group 0 is the "
"default and will always refer to the entire pattern. In the above example, "
"calling [code]result.get_string(1)[/code] would give you [code]0123[/code].\n"
"This version of RegEx also supports named capturing groups, and the names "
"can be used to retrieve the results. If two or more groups have the same "
"name, the name would only refer to the first one with a match.\n"
"[codeblock]\n"
"var regex = RegEx.new()\n"
"regex.compile(\"d(?<digit>[0-9]+)|x(?<digit>[0-9a-f]+)\")\n"
"var result = regex.search(\"the number is x2f\")\n"
"if result:\n"
"    print(result.get_string(\"digit\")) # Would print 2f\n"
"[/codeblock]\n"
"If you need to process multiple results, [method search_all] generates a "
"list of all non-overlapping results. This can be combined with a [code]for[/"
"code] loop for convenience.\n"
"[codeblock]\n"
"for result in regex.search_all(\"d01, d03, d0c, x3f and x42\"):\n"
"    print(result.get_string(\"digit\"))\n"
"# Would print 01 03 3f 42\n"
"# Note that d0c would not match\n"
"[/codeblock]\n"
"[b]Note:[/b] Godot's regex implementation is based on the [url=https://www."
"pcre.org/]PCRE2[/url] library. You can view the full pattern reference "
"[url=https://www.pcre.org/current/doc/html/pcre2pattern.html]here[/url].\n"
"[b]Tip:[/b] You can use [url=https://regexr.com/]Regexr[/url] to test "
"regular expressions online."
msgstr ""

That causes:

ERROR: :40661 Expected '"' at end of message while parsing file: 
   at: load_translation (core/io/translation_loader_po.cpp:131)

Here's the original unescaped string: https://github.com/godotengine/godot/blob/master/modules/regex/doc_classes/RegEx.xml#L6-L40
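
To illustrate where all those backslashes come from, here is a minimal sketch of PO string escaping. It assumes only the three standard escapes matter here (backslash, double quote, newline); `escape_po` is a hypothetical helper name and is not necessarily how extract.py does it.

```python
def escape_po(text: str) -> str:
    """Escape a string so it can sit inside a double-quoted PO literal."""
    return (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes are not doubled
        .replace('"', '\\"')        # then double quotes
        .replace("\n", "\\n")       # then newlines
    )


# The class reference contains compile("\\d+") with two literal backslashes.
source = r'[code]compile("\\d+")[/code]'
escaped = escape_po(source)
print(escaped)
```

Each backslash in the source becomes two in the catalog, which is why a run of backslashes can end up directly in front of the closing quote of a wrapped line.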

@ThakeeNathees
Contributor Author

ThakeeNathees commented Mar 19, 2020

It's because of this line:

"read by RegEx as [code]\\d+[/code]. Similarly, [code]compile(\"\\\"(?:\\\\\\"

load_translation() looks for the closing quote by checking for a " whose previous character is not \. If the previous character is \, it assumes the quote is escaped. Here, though, the backslashes themselves are escaped: \\\\\\" is three escaped backslashes followed by a quote, so the quote is not escaped.

if (l[i] == '"' && (i == 0 || l[i - 1] != '\\')) {

Possible fix (not tested!):

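// Tracks whether the current character is escaped by a preceding backslash.
bool escape_next = false;
for (int i = 0; i < l.length(); i++) {
	// An unescaped backslash escapes the next character.
	if (l[i] == '\\' && !escape_next) {
		escape_next = true;
		continue;
	}

	// The first unescaped quote ends the message.
	if (l[i] == '"' && !escape_next) {
		end_pos = i;
		break;
	}

	// Any other character (or an escaped one) resets the state.
	escape_next = false;
}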
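
To compare the two approaches outside the engine, here is a small Python sketch. Both function names are made up for illustration, and each function assumes the opening quote has already been stripped from the line. The first mirrors the existing one-character look-behind check, the second mirrors the loop above.

```python
def find_end_naive(line: str) -> int:
    """Return the index of the first quote not preceded by a backslash, or -1."""
    for i, ch in enumerate(line):
        if ch == '"' and (i == 0 or line[i - 1] != "\\"):
            return i
    return -1


def find_end_stateful(line: str) -> int:
    """Return the index of the first unescaped quote, or -1, tracking escape state."""
    escape_next = False
    for i, ch in enumerate(line):
        if ch == "\\" and not escape_next:
            escape_next = True  # this backslash escapes the next character
            continue
        if ch == '"' and not escape_next:
            return i
        escape_next = False
    return -1


# Tail of the problematic line, without its opening quote.
sample = r'compile(\"\\\"(?:\\\\\\"'
print(find_end_naive(sample), find_end_stateful(sample))
```

On this sample the look-behind check never finds a closing quote, because the final quote is preceded by a backslash that is itself escaped, while the stateful scan stops at the last character.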

@akien-mga

@akien-mga
Member

@ThakeeNathees Awesome, that works well!

And I confirm that with my PR a translation for the RegEx description works too (I just added a single sentence at the beginning, but all the escaped text is properly matched):
[Screenshot: Screenshot_20200320_084456]

akien-mga pushed a commit to akien-mga/godot that referenced this pull request Mar 20, 2020
@akien-mga akien-mga merged commit 63f77ef into godotengine:master Mar 20, 2020
@akien-mga
Member

Thanks!

@ThakeeNathees ThakeeNathees deleted the translation-catalog-maker branch March 20, 2020 10:13
akien-mga pushed a commit to akien-mga/godot that referenced this pull request Oct 7, 2021
sairam4123 pushed a commit to sairam4123/godot that referenced this pull request Nov 10, 2021
lekoder pushed a commit to KoderaSoftwareUnlimited/godot that referenced this pull request Dec 18, 2021
Successfully merging this pull request may close these issues.

Write Python script to generate translation catalog (.pot file) from the XML class reference
3 participants