Gettext comments in .po files downloaded from Weblate are broken #5695
Comments
I can't reproduce this.
This issue looks more like a support question than an issue. We strive to answer these reasonably fast, but purchasing the support subscription is not only more responsible and faster for your business but also makes Weblate stronger. In case your question is already answered, making a donation is the right way to say thank you!
Thanks for your reply. The uploaded file is UTF-8 without BOM and has CRLF line breaks.
In both cases all multi-line comments were broken and the downloaded file has mixed line endings, like in #5624 (see the right half of the screenshot below). After your hint I tried the converted download and selected the .po format. Here is a WinDiff view of the two downloads:
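For anyone wanting to check a download for mixed line endings without a diff tool, here is a minimal sketch. The function names and the sample bytes are hypothetical and only illustrate the idea; this is not part of Weblate or translate-toolkit.

```python
import re
from collections import Counter

# CRLF is listed first so "\r\n" is not counted as a lone CR plus a lone LF.
_NEWLINE_RE = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}


def count_newline_styles(data: bytes) -> Counter:
    """Count how often each line-ending style occurs in raw file content."""
    return Counter(_NAMES[m.group()] for m in _NEWLINE_RE.finditer(data))


def has_mixed_newlines(data: bytes) -> bool:
    """Return True if more than one line-ending style is present."""
    return len(count_newline_styles(data)) > 1


# Hypothetical snippet: a multi-line comment whose second line ends in a bare LF.
sample = b'#. first comment line\r\n#. second comment line\nmsgid "x"\r\nmsgstr ""\r\n'
print(count_newline_styles(sample))
print(has_mixed_newlines(sample))
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) so that Python does not translate the line endings before they are counted.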
The header comment is expected to be missing from the converted file. The original file seems to have mixed newlines (at least your diff shows different symbols on some lines). Did that happen inside Weblate, or was it present originally? (I guess it was present originally and is the cause of the problem we're seeing now.)
Both files in the diff above are downloaded from Weblate. So yes, it happened inside Weblate. Here's a diff of the original (left) vs. the "simple" download from Weblate (right), which is the same file on the right side as in the previous diff.
This morning I tried to investigate the issue further. It looks like we have a case of self-healing software. Today all downloaded translations, whether in .po or .islu format, have consistent line endings, and the comments in the .po files are OK.
As far as I can tell, the issue went away over the weekend without any changes on my part. I don't know if the Docker container was restarted, in case this matters.
It might depend on changes in the PO file content; the newline detection is not that simple here:
There is no code to deal with newlines. There are maintenance tasks to clean up the database, or to fetch updates from remote repositories, but that should have no effect on this. Do you run the server on Linux?
We are running Weblate in Docker. I don't know the details because my admin hasn't responded yet, but I guess the short answer is: "yes, it runs under Linux."
This is very interesting indeed. Years ago, we wrote a pre-processor for the .po(t) files which converted them to Unix newline style. In lines 31-33 on the right-hand side of the screenshots above you can even see three different newline styles in just three lines of the .po file. So the files saved by Virtaal contained mixed newlines when reading in DOS style and Unix newlines when reading in Unix style. Our solution was to give Virtaal what it needed and be done with it. Looking at the author(s), the code that you pointed me to looks very much like it could be the same that was sitting at the core of Virtaal years ago. :)
But still: we are uploading consistent newlines. Either DOS or Unix, Weblate's choice. So what made Weblate break the files that were downloaded? Are they merged from the original file (style A) and translations that were made by users (style B)?
Maybe Weblate needs a (per project/per component) newline style setting that it uses for downloads, no matter which style was used during upload. VCSs handle this quite well nowadays: they store files with a standard newline style on the server, and the client converts to the platform-specific style upon download. Since Weblate can't know the platform to which the file is downloaded, it has to be told beforehand. Alternatively, the newline style could be specified in the GET request like the
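A pre-processor of the kind described above can be quite small. The following is only a sketch of the idea, not the original tool; `normalize_newlines` is a hypothetical name.

```python
import re

# CRLF is listed first so it is replaced as one unit, not as CR followed by LF.
_ANY_NEWLINE = re.compile(r"\r\n|\r|\n")


def normalize_newlines(text: str, newline: str = "\n") -> str:
    """Rewrite every line ending in `text` to the single style `newline`."""
    if newline not in ("\n", "\r\n", "\r"):
        raise ValueError("newline must be LF, CRLF or CR")
    # A function is used as the replacement so the string is inserted literally.
    return _ANY_NEWLINE.sub(lambda _match: newline, text)


mixed = '#. first comment line\r\n#. second comment line\rmsgid "x"\n'
unix_style = normalize_newlines(mixed)
dos_style = normalize_newlines(mixed, "\r\n")
```

When applying this to files, open them with `newline=""` for both reading and writing; otherwise Python's universal newline handling translates line endings on its own and the result depends on the platform.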
The underlying library for handling the translation files is the same as in Virtaal (we both use translate-toolkit). It probably still has some issues with non-Unix newlines. AFAIK GNU gettext only parses Unix newlines, so this is not a well-tested area.
I've added tests exposing this in translate/translate#4301. I will look into fixing it later.
Keep the newlines during round-trip. This reduces amount of changes and avoids producing mixed newlines files. Fixes WeblateOrg/weblate#5695
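To illustrate what keeping the newlines during a round-trip means, here is a minimal sketch. It is not the translate-toolkit implementation: `detect_newline`, `parse` and `serialize` are hypothetical helpers, and picking the most frequent style is an assumption made for this example.

```python
import re
from collections import Counter

_ANY_NEWLINE = re.compile(r"\r\n|\r|\n")


def detect_newline(text: str, default: str = "\n") -> str:
    """Return the most frequent line ending in `text`, or `default` if none."""
    counts = Counter(_ANY_NEWLINE.findall(text))
    return counts.most_common(1)[0][0] if counts else default


def parse(text: str):
    """Split into lines without endings and remember the detected style."""
    return _ANY_NEWLINE.split(text), detect_newline(text)


def serialize(lines, newline: str) -> str:
    """Join lines using the remembered style instead of the system default."""
    return newline.join(lines)


original = '#. first comment line\r\n#. second comment line\r\nmsgid "x"\r\n'
lines, newline = parse(original)
round_tripped = serialize(lines, newline)
```

With consistent input the output is byte-for-byte identical to the input. With mixed input, every line is written with the detected style, so the output is at least consistent.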
The issue you have reported is resolved now. If you don’t feel it’s right, please follow its labels to get a clue and take further steps.
The issue you've reported needs to be addressed in the translate-toolkit. Please file the issue there, and include links to any relevant specifications about the formats (if applicable).
Fantastic!
Yes, it's likely - the comments are parsed as a block, so they keep the newlines, while the rest of the file is using system newlines on serializing (as it relies on ConfigParser).
Gettext comments in .po files downloaded from Weblate are broken
I already tried
Describe the steps you tried to solve the problem yourself.
If you didn’t try already, try to search there what you wrote above.
To Reproduce the issue
Steps to reproduce the behavior:
Expected behavior
That the uploaded and downloaded files are identical apart from minor formatting differences and that the downloaded file is syntactically correct.
Screenshots
A snippet of the uploaded file:
The same snippet in the downloaded file:
In the weblate UI the string is correctly shown as belonging to the two items in the original comments
Exception traceback
Server configuration and status
Weblate installation: weblate.org Docker
Weblate version 4.5.1
Weblate deploy checks
Additional context