Support RTL in output formats (particularly HTML) #1601
The back-ends need to be told about the text direction. At the moment, HTML output is LTR by default. Adding something such as:
Should signal the back-end to apply the correct direction.
Please note that the direction (RTL or LTR) can change at the element level, so there should be an option to change it for each element.
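To make the element-level idea concrete, here is a minimal sketch of what a back-end could emit. It relies on the standard HTML `dir` attribute; the helper name `wrap_with_direction` is hypothetical and is not part of Asciidoctor.

```python
from html import escape


def wrap_with_direction(tag: str, text: str, direction: str = "ltr") -> str:
    """Wrap text in an HTML element carrying an explicit dir attribute.

    Hypothetical helper for illustration only; not an Asciidoctor API.
    """
    # "ltr", "rtl" and "auto" are the values defined for the HTML dir attribute.
    if direction not in ("ltr", "rtl", "auto"):
        raise ValueError(f"unsupported direction: {direction!r}")
    # Escape the text so that characters like < and & stay valid in HTML.
    return f'<{tag} dir="{direction}">{escape(text)}</{tag}>'


# A document-level default plus a per-element override:
print(wrap_with_direction("p", "سلام دنیا", "rtl"))
print(wrap_with_direction("p", "Hello world", "ltr"))
```

Because the attribute is set per element, an RTL document can still contain individual LTR paragraphs, which is the case described above.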
I knew that this issue would one day come :) I've often thought about it. Thanks for submitting it!
We really have two concerns here, the input and the output.
On the input side, we want AsciiDoc to be friendly for RTL languages. But that is going to be tough because it may require changes to the parser, and with it some design changes.
On the output side, we should definitely allow the RTL to be controlled because that is critical for the reader. Let's make this issue about RTL in the output so we get one part of it down. Agreed?
There are two aspects of RTL on the input side: 1. writing in an RTL language, and 2. having AsciiDoc tags and macros in RTL. I created an issue for number 2 here: #1600.
As for number 1, I think it is already supported. I should test a long and sufficiently sophisticated text (well, just in Persian) to see whether things work or not. I will post the result here.
OK, good news. I think Asciidoctor is very close to being able to claim that it supports RTL (bidi) languages, at least to some extent. I tested a moderately sophisticated article in Persian (Farsi): I first converted it to AsciiDoc and then to HTML with Asciidoctor. A custom CSS (Sass) stylesheet accompanied the output. The input and the output are attached, and they look quite good and readable. The only problem I noticed is that the automatic numbering (section headings, image numbers, ...) is in English.
Also, Asciidoctor was able to convert Persian roles into correct HTML classes in Persian. As an example, section 1.1 which is actually the references section was given role
I wish that macros and types (such as [appendix]) could also be written in the destination language, so that Asciidoctor RTL files would be more readable and homogeneous.
I guess the problem of English numbering should also be fixed very easily, by an attribute one can determine the language of the document (e.ge
Please note that the order of numbers in section numbering should be RTL. For example, for section 2, subsection 5 (2.5), it should be 2 then 5 from the right (not the left) when the numbers are converted.
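As a rough illustration of the digit conversion part, here is a minimal sketch that maps the ASCII digits of a generated section number to Persian digits (U+06F0 to U+06F9). The function name `localize_section_number` is hypothetical; this is not how Asciidoctor numbers sections.

```python
# Map ASCII digits 0-9 to Persian (Extended Arabic-Indic) digits U+06F0..U+06F9.
PERSIAN_DIGITS = {ord(str(i)): chr(0x06F0 + i) for i in range(10)}


def localize_section_number(number: str) -> str:
    """Replace ASCII digits with Persian digits, leaving separators untouched.

    Hypothetical helper for illustration only.
    """
    return number.translate(PERSIAN_DIGITS)


print(localize_section_number("2.5"))
```

The returned string stays in logical order (2, separator, 5). How the components are ordered on screen depends on the renderer's bidi handling, which this sketch does not address.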
One more output format to check is DocBook. The conversion seems OK, but I tried to check it more closely by converting to LaTeX and then to PDF (using XeLaTeX). However, dblatex was escaping the Persian characters, and I have not yet been able to fix that in order to check the final result.
Results from DocBook conversion (attached the result):
At the moment we have :
While this should be probably:
I don't know if
The attached test renders DocBook to PDF through (Xe)LaTeX. Because the RTL delimitations mentioned above are missing, English phrases are rendered with their words in reverse order. Also, because there is no RTL/LTR signaling, all numbers are rendered as Persian numbers, which is not desirable when Latin ones are intended (such as in URLs). At the same time, because auto-numbers are not normally generated in DocBook (? not sure that is always the case), the problem with auto-numbers (as in HTML) does not exist.
referenced this issue
Dec 30, 2015
I discovered this remarkably enlightening and timely article on the subject of bidirectional text on the web (posted on opensource.com). It was almost as though this article was written to help us resolve this issue :)
The Unicode bidirectional algorithm detects when bidi source text changes from left to right (or vice versa) and applies the correct (custom) markup, so there is no need to specify it manually.
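For a feel of how such detection can work, here is a minimal sketch of the "first strong character" heuristic for picking a paragraph's base direction, using Python's standard `unicodedata` module. It is a simplification of the full Unicode bidirectional algorithm (it ignores isolates and explicit embeddings), and the function name `base_direction` is hypothetical.

```python
import unicodedata


def base_direction(text: str, default: str = "ltr") -> str:
    """Guess the base direction from the first strongly directional character.

    Simplified sketch: the real algorithm also handles isolates and embeddings.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (e.g. Latin)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic script)
            return "rtl"
    # No strong character found (only digits, punctuation, spaces, ...).
    return default


print(base_direction("سلام world"))
print(base_direction("Hello سلام"))
```

A converter could use a check like this to choose a default direction for a paragraph when the author has not set one explicitly.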