Markdown files support #3106

maicol07 · 2019-10-12T15:22:22Z

Describe the solution you'd like
Markdown files support (maybe similar to Crowdin system)

nijel · 2019-10-13T19:22:48Z

We currently do not support translating any documents, only formats designed for localization.

Related to #2592

maicol07 · 2019-10-13T19:24:46Z

@nijel do you think this will be implemented?

nijel · 2019-10-14T11:32:28Z

I think it will be implemented at some point. It should not be hard to implement (we already do something similar for the appstore metadata). Right now it's just not a priority for me, but this can change if somebody comes with funding for this :-).

RMStoica-zivver · 2019-11-25T15:12:40Z

@nijel What sort of funding are we talking about here? Asking for a friend.

nijel · 2019-11-25T22:45:54Z

@RMStoica-zivver You can use https://www.bountysource.com/issues/81891384-markdown-files-support to put funds on this issue to motivate contributors.

wetneb · 2020-03-26T15:39:20Z

I have added a bounty on this issue - it's not exactly clear to me what the integration should look like but I trust @nijel to steer the idea in the right direction.

Our own requirement is to translate documentation, represented by a set of Markdown files. Since these Markdown files will be stored in the same git repository as our UI i18n files (and the rest of our code), it would be ideal if the Markdown files could be added as an extra component in the Weblate project.

nijel · 2020-03-30T10:09:48Z

Depends on translate/translate#3956 which depends on miyuchina/mistletoe#162

guoyunhe · 2020-04-28T10:08:02Z

I think it can be easily supported with some simple script, converting Markdown files to JSON files.
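A minimal sketch of what such a script could look like, assuming the file is simply split on blank lines and each block becomes one JSON entry. The function names and the `block-NNNN` key scheme are hypothetical, and blocks that contain blank lines (for example fenced code) are not handled here.

```python
import json
import re


def markdown_to_json(markdown_text: str) -> str:
    """Split Markdown on blank lines and return a JSON object of numbered blocks."""
    # Normalize line endings, then split on one or more blank lines.
    normalized = markdown_text.replace("\r\n", "\n").strip()
    blocks = [b.strip() for b in re.split(r"\n\s*\n", normalized) if b.strip()]
    # Zero-padded keys keep the blocks in document order when sorted.
    units = {f"block-{i:04d}": block for i, block in enumerate(blocks, start=1)}
    return json.dumps(units, ensure_ascii=False, indent=2)


def json_to_markdown(json_text: str) -> str:
    """Rebuild a Markdown document from the (translated) JSON object."""
    units = json.loads(json_text)
    return "\n\n".join(units[key] for key in sorted(units)) + "\n"
```

The JSON output could then be added to Weblate as an ordinary JSON component, and `json_to_markdown` run on each translated file to get Markdown back.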

eighthave · 2020-04-28T11:39:47Z

If you want to go the conversion route, I recommend using po4a, a 20-year-old project for doing just that. It recently got some key improvements in v0.58. It's used for things like f-droid.org and Fedora documentation.

guoyunhe · 2020-04-28T11:52:06Z

@eighthave good to know. thanks!

mquinson · 2020-04-29T10:44:13Z

About the idea of using po4a here, I think that it makes perfect sense (disclaimer: I'm one of the authors of po4a).

We already have an existing markdown parser, but it's ... not rock solid, and changing it may be more complex than writing a new parser. What would remain is the surrounding infrastructure of po4a, which makes the conversion between documentation formats and PO files easier, and the tests.

My plan to improve the support of Markdown in po4a is to simplify the existing parser (its code is convoluted), and then improve its robustness using for example the tests from https://github.com/bobtfish/text-markdown/tree/master/t

Markdown is not very complex compared to other formats that we handle pretty well in po4a (e.g., groff for man pages, or XML plus the DocBook and HTML variants). For both formats, we use internal parsers with no dependency on any external tool or library. This is because the kind of parsing that we are doing is specific, so we felt it easier this way.

The groff parser is interesting in the sense that it really normalizes the input. There are maybe 6 ways to specify the inline formatting (bold, italic), and po4a converts them all to a single form to make life easier for translators. I'm not sure that this will be required for the markdown parser, but that's something to consider. The XML parser is interesting because it is difficult to write a line-by-line parser for XML, just as it is for markdown. So the solution built into the XML parser could be useful to rework the markdown parser: instead of the line-by-line parser that we currently have, we could go for a block-by-block approach. That would help support the bits that are currently not supported.

Edit: we also have some format parsers in po4a that use external tools. The POD parser uses a dedicated Perl library, while the SGML parser uses the external onsgmls parser. On another front, I am considering whether asciidoctor could be used as an external parser for the AsciiDoc format. If someone knows a parser for markdown that works a bit like a SAX parser, that may be an interesting starting point.

I'm willing to help any volunteer, but my personal schedule does not allow me to address this issue alone anytime soon.

Oh. And po4a is written in Perl. Sorry about that...

eighthave · 2020-04-29T11:57:52Z

What I mean is that there are a lot of important ideas in po4a that can be used in a Python/AST implementation:

* the --keep option, e.g. a percentage translated that must be met, or the document reverts to the source language
* automatic metadata like "markdown-text"
* removing pure syntax strings from the translator's view
* custom YAML Front Matter handling

But in a broader sense, I think a po4a mode might make sense for Weblate to handle formats like asciidoc, groff/man, etc. I wasn't thinking of using po4a directly in Weblate to handle Markdown, though that might be a quick fix for this. I think that having access to the Markdown AST will enable so many really useful possibilities that it will be worth the work.
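To make the --keep idea concrete, here is a rough Python sketch of the threshold logic as described above. It is not po4a's code: the function names, the block-keyed `translations` dict and the default of 80 are assumptions for illustration.

```python
def translated_percent(source_blocks, translations):
    """Percentage of source blocks that have a non-empty translation."""
    if not source_blocks:
        return 100.0
    done = sum(1 for block in source_blocks if translations.get(block, "").strip())
    return 100.0 * done / len(source_blocks)


def render(source_blocks, translations, keep=80.0):
    """Return the translated document, or the source if below the threshold."""
    if translated_percent(source_blocks, translations) < keep:
        # Not enough is translated: fall back to the source language.
        return "\n\n".join(source_blocks) + "\n"
    # Above the threshold, untranslated blocks still use the source text.
    return "\n\n".join(
        translations.get(block, "").strip() or block for block in source_blocks
    ) + "\n"
```

With two blocks of which one is translated, `render(..., keep=80.0)` returns the source document unchanged, while `keep=0` always emits the partially translated one.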

RMStoica-zivver · 2020-04-29T12:00:33Z

If I may: from the translator's point of view, block-by-block makes a lot of sense, as one block will be one paragraph, or one list, etc.
Also I would suggest maybe taking a look at pandoc
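As an illustration of block-by-block splitting, here is a small Python sketch. It is a simplification and not how po4a or pandoc work: blocks are separated by blank lines, fenced code is kept together, and each block is classified by its first line only. All names are hypothetical.

```python
import re

FENCE = re.compile(r"^(```|~~~)")
LIST_ITEM = re.compile(r"^\s*([-*+]|\d+[.)])\s+")


def classify(block):
    """Guess the block type from its first line."""
    first = block.splitlines()[0]
    if FENCE.match(first):
        return "code"
    if first.startswith("#"):
        return "heading"
    if LIST_ITEM.match(first):
        return "list"
    return "paragraph"


def split_blocks(markdown_text):
    """Return a list of (kind, text) pairs, one per block."""
    blocks, current, open_fence = [], [], None
    for line in markdown_text.splitlines():
        marker = FENCE.match(line)
        if marker and open_fence is None:
            open_fence = marker.group(1)  # entering a fenced code block
        elif marker and marker.group(1) == open_fence:
            open_fence = None  # leaving the fenced code block
        elif open_fence is None and not line.strip():
            # A blank line outside a code fence ends the current block.
            if current:
                blocks.append("\n".join(current))
                current = []
            continue
        current.append(line)
    if current:
        blocks.append("\n".join(current))
    return [(classify(block), block) for block in blocks]
```

A translation tool could then offer "paragraph", "heading" and "list" blocks as units and skip "code" blocks.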

akumar-xyz · 2020-05-04T15:34:27Z

Considering how all "inline HTML" is valid markdown, I would suggest approaching markdown files by doing md -> html, then simply using translate-toolkit's HTML support.
Perhaps another small layer to handle front matter. That should make it a little more straightforward to implement complete markdown support.

I do apologize in case I've missed something about the problem. Only came across this project on bountysource a few hours back
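The small front matter layer mentioned above might look roughly like this in Python. It is only a sketch: it assumes front matter is delimited by `---` lines at the very top of the file, and it returns the front matter as raw text rather than parsing the YAML. The function name is hypothetical.

```python
def split_front_matter(text):
    """Return (front_matter, body); front_matter is '' when there is none."""
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            # Keep both delimiter lines with the front matter.
            return "".join(lines[: index + 1]), "".join(lines[index + 1:])
    # No closing delimiter: treat the whole file as body.
    return "", text
```

The body would go through the md -> html step, and the front matter could be re-attached unchanged (or have selected keys translated separately) afterwards.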

eighthave · 2020-05-05T13:11:15Z

The Markdown AST parser libraries will understand the HTML components, and then let us work directly with the AST.

erciccione · 2021-02-25T14:58:07Z

I'm adding 1 Monero (XMR value at the moment: $346 - updated value) to the bounty to be awarded to the person who will resolve this issue.

I would have used Bountysource, but the related bounty can only be funded through PayPal. To send the bounty I will need a Monero address (or, if preferred, the address of another cryptocurrency, like BTC or ETH).

erciccione · 2021-05-23T12:36:03Z

Reminder that there are two bounties on this issue: $240 + 1XMR. This feature is very much needed.

Svetlana-T · 2021-05-28T04:48:34Z

Hi, how can I do this now? With po4a? What are the steps? Say I have hello.md with "Hello world \n Bye all" in it, and would like it to be translatable, for example.

ilmari-lauhakangas · 2021-05-28T05:34:24Z

> Hi, how can I do this now? With po4a? What are the steps? Say I have hello.md with "Hello world \n Bye all" in it, and would like it to be translatable, for example.

Create a po4a.conf file (name doesn't matter) in a po/ subdirectory with the content

[po4a_langs] fr es it de
[po4a_paths] po/mysite.pot $lang:po/mysite.$lang.po

[options] opt:"--addendum-charset=UTF-8" opt:"--localized-charset=UTF-8" opt:"--master-charset=UTF-8" opt:"--master-language=en_US" opt:"--msgmerge-opt='--no-wrap'" opt:"--porefs=file" opt:"--wrap-po=newlines"

[po4a_alias:markdown] text opt:"--option markdown" opt:"--option yfm_keys=title" opt:"--addendum-charset=UTF-8" opt:"--localized-charset=UTF-8" opt:"--master-charset=UTF-8" opt:"--keep=0"

[type: markdown] content/hello.md $lang:content/$lang/hello.md
[type: markdown] content/goodbye.md $lang:content/$lang/goodbye.md

Then run

po4a po/po4a.conf

If using git, you can add these sorts of rules into .gitignore:

# no need to translate the source language, but po4a gens this file
po/mysite.en.po

# po4a auto-generated markdown files from translations
content/[a-z][a-z]/*.md
content/[a-z][a-z][a-z]/*.md
content/[a-z][a-z][a-z]_[A-Z]*/*.md
content/[a-z][a-z]_[A-Z]*/*.md

Now there is also this tool from KDE: https://invent.kde.org/websites/hugo-i18n

eighthave · 2021-05-28T06:32:15Z

Here are some examples of sites doing this with po4a: https://gitlab.com/fdroid/fdroid-website/ fsfe/reuse-docs#61

wetneb · 2021-05-28T09:15:02Z

I have not tried it but I suspect the main issue with this workflow is that translators get to translate parts of markdown files out of context, no?

In OpenRefine we are sadly going to go for Crowdin (for now), because it seems to be the only solution which offers a real markdown editor where you can see the entire file being translated while still working on individual parts.

If people are interested in adding a similar Markdown support in Weblate, I could imagine finding some funding for it (the existing bounties will not get us very far I am afraid). Maybe we could pool resources with other projects interested in the feature?

ilmari-lauhakangas · 2021-05-28T09:18:38Z

> I have not tried it but I suspect the main issue with this workflow is that translators get to translate parts of markdown files out of context, no?
>
> In OpenRefine we are sadly going to go for Crowdin (for now), because it seems to be the only solution which offers a real markdown editor where you can see the entire file being translated while still working on individual parts.

I wouldn't extend the scope of this issue to include such a nice-to-have feature.

yarons · 2021-11-28T16:54:18Z

Is this issue only about getting the strings out of Markdown and translating them? I would suggest supporting something similar to Crowdin's documentation localization offering.

eighthave · 2021-11-29T13:27:23Z

Last I looked, Crowdin's Markdown support was limited but better than nothing. The best way would be to actually use the AST (Abstract Syntax Tree). That means Markdown becomes structured data like JSON, YAML, XML, etc.

yarons · 2021-11-29T17:25:02Z

I'm talking about the fact that you can see the end result in a preview pane while translating. Mozilla's Pontoon also offers such a capability.

Mihonarium · 2022-05-27T14:57:43Z

We're translating markdown articles via Weblate. Initially, we wanted to translate by inserting plain text into Weblate, but Weblate was pretty bad at handling insertions and deletions of paragraphs in the original text.

I've looked into po4a and other ways to convert the text into formats that would allow us to easily translate and update the text, but haven't found anything that would be easy to use and wouldn't generate lots of overhead.

So I've written a simple Golang package that splits text into paragraphs, compares to the previous version of the text (if there is one), and produces JSON of a map from keys to paragraphs in a way that keeps the paragraphs in the right order, doesn't change the keys if the text wasn't significantly changed and handles insertions and deletions in a way that avoids key collisions. It should be pretty easy to write something similar and add to how Weblate handles plain text; but if anyone's interested, I can add the documentation, examples, etc. to the tool I've written.
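For readers who want to try the idea, here is a rough Python sketch of the key-assignment step described above. It is not the Go package itself: the `p<number>` key scheme, the `threshold` value and the use of `difflib` are assumptions made for illustration.

```python
import difflib
import re


def split_paragraphs(text):
    """Split text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]


def assign_keys(previous, new_text, threshold=0.6):
    """Map keys to the paragraphs of new_text, reusing keys from `previous`.

    `previous` is the ordered key -> paragraph dict built for the older version.
    """
    old_keys = list(previous)
    old_paras = [previous[key] for key in old_keys]
    new_paras = split_paragraphs(new_text)
    # Fresh keys start after the highest number in the previous version,
    # so they cannot collide with any key that is still in use.
    next_id = max((int(key[1:]) for key in old_keys), default=0) + 1

    keys = [None] * len(new_paras)
    matcher = difflib.SequenceMatcher(a=old_paras, b=new_paras, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged paragraphs keep their keys.
            keys[j1:j2] = old_keys[i1:i2]
        elif tag == "replace":
            # Pair edited paragraphs by position; keep the key only if the
            # text is still similar enough to count as the same paragraph.
            for i, j in zip(range(i1, i2), range(j1, j2)):
                ratio = difflib.SequenceMatcher(a=old_paras[i], b=new_paras[j]).ratio()
                if ratio >= threshold:
                    keys[j] = old_keys[i]

    result = {}
    for key, para in zip(keys, new_paras):
        if key is None:
            key = f"p{next_id}"
            next_id += 1
        result[key] = para
    return result
```

Because Python dicts preserve insertion order, dumping the result with `json.dumps` keeps the paragraphs in document order. A key freed by a deletion could be reused in a later version in this sketch; avoiding that would need the highest issued number to be stored somewhere.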

nijel · 2022-06-06T13:39:03Z

Handling of plain text files will work better since 4.13, see #7585

github-actions · 2023-08-01T18:30:36Z

Thank you for your report; the issue you have reported has just been fixed.

In case you see a problem with the fix, please comment on this issue.
In case you see a similar problem, please open a separate issue.
If you are happy with the outcome, don’t hesitate to support Weblate by making a donation.

nijel added the enhancement Adding or requesting a new feature. label Oct 13, 2019

nijel added this to TODO in File format support via automation Oct 13, 2019

nijel mentioned this issue Dec 3, 2019

Markdown syntax check #2495

Closed

nijel changed the title ~~Markdown files support~~ Markdown files support [$250] Mar 26, 2020

nijel added the bounty label Mar 26, 2020

wetneb mentioned this issue Mar 26, 2020

Migrate to a new documentation platform OpenRefine/OpenRefine#2273

Closed

2 tasks

wetneb mentioned this issue Apr 28, 2020

Markdown as a native source format translate/translate#3956

Closed

nijel mentioned this issue Jun 15, 2020

Translating issues WeblateOrg/docker#687

Closed

diegodlh mentioned this issue Sep 4, 2020

Allow internationalization jamulussoftware/jamulus#77

Closed

erciccione mentioned this issue Oct 24, 2020

Introduce i18n system for user guides monero-project/monero-site#1270

Closed

4 tasks

slorber mentioned this issue Jan 15, 2021

Is there a way to restrict sending text for translation to Crowdin? (Docusaurus) facebook/docusaurus#2278

Closed

MuntashirAkon mentioned this issue Mar 1, 2021

README reworked 2 MuntashirAkon/AppManager#302

Merged

askvortsov1 mentioned this issue Mar 13, 2021

Use Translation Collaboration Platform flarum/docs#177

Open

dingyifei mentioned this issue Apr 28, 2021

Feature request: Metadata headers mondeja/mdpo#85

Closed

SethFalco mentioned this issue May 21, 2021

Decide on a free web hosted translation service tldr-pages/tldr#3591

Open

nijel mentioned this issue May 28, 2021

using weblate to translate .MD files #6096

Closed

1 task

github-actions bot removed the bounty label Jan 7, 2022

askvortsov1 mentioned this issue Jan 14, 2022

About translation with Crowdin flarum/docs#401

Open

not-my-profile mentioned this issue Jan 16, 2022

Introduce searx/help/ searxng/searxng#758

Merged

mquinson mentioned this issue Mar 29, 2022

Why does neverwrap add newlines to plain text entries? mquinson/po4a#359

Open

nijel changed the title ~~Markdown files support [$250]~~ Markdown files support Jun 1, 2023

jorgesumle mentioned this issue Jun 3, 2023

Relicensing: CC BY-SA 4.0 privacyguides/privacyguides.org#2097

Merged

4 tasks

nijel added this to the 5.0 milestone Jul 19, 2023

nijel self-assigned this Jul 19, 2023

nijel linked a pull request Jul 27, 2023 that will close this issue

formats: Add Markdown support #9597

Merged

5 tasks

nijel closed this as completed in #9597 Aug 1, 2023

File format support automation moved this from TODO to Done Aug 1, 2023
