Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UI mirroring for RTL languages (Arabic, Farsi, Hebrew...) #5667

Merged
merged 7 commits into from Dec 8, 2019

Conversation

@poire-z
Copy link
Contributor

poire-z commented Dec 3, 2019

Follow up to #5598 (TextWidget/TextBoxWidget: enhanced text shaping) that plugs the UI language and its text direction as defaults into the XText C module, which should be all that's needed to have RTL text right aligned and drawn correctly. (Should also close #2936 No support for font variants in UI based on language).

Also adds UI mirroring support in many widgets and apps (many files modified, but it's usually just a few lines in each of them).
Should finally close #5359 when the full arabic translation is in to have a nice mirrored UI in arabic (pinging @WaseemAlkurdi :)

@Frenzie @NiLuJe @robert00s
The 2nd commit defines some helpers in the bidi.lua module, that are used by stuff in the 3rd commit.
Can you have a quick look at the function names defined in the 2nd commit, if they are obvious/concise/short enough, and if the way they are used in the 3rd commit looks natural and non-obtrusive enough? And if it's worthwhile introducing all this little noise to get a RTL UI none of us will ever use :)

I may split the 3rd commit in multiple smallers ones (low-level widgets, symbols mirroring, gesture mirroring, geometry arithmetic mirroring, textboxwidget parameters forwarding...), but it will be easier to do that while all is still flat :)
So some quick OK or comments on the BD API are appreciated :)

This mirrors all the UI elements I met that felt they needed mirroring.
The simple stuff in the low level widgets (*Container, HorizontalGroup, OverlapGroup, FrameContainer) was enough to make 90% of the stuff mirrored naturally - and we can just continue developping in LTR - and things should work in RTL. There are just some special cases with symbols, swipe directions, and some direct geometry arithmetic or direct painting, that need additional care.

What's left to do is some additional BD.wrap(), BD.filename(), BD.path() where needed, where they are give as parameters to some T(L()), but that's probably for another PR to not make this one even larger.


This change is Reviewable


-- Setup UI mirroring and RTL text for UI language
function Bidi.setup(lang)
local is_rtl = Language:isLanguageRTL(lang)

This comment has been minimized.

Copy link
@Frenzie

Frenzie Dec 3, 2019

Member

I assume you didn't use isRtlLanguage because it looks a bit odd (as does `isRTLLanguage) but I'd still prefer it for easier comprehension.

This comment has been minimized.

Copy link
@poire-z

poire-z Dec 3, 2019

Author Contributor

Yes, flat camelCAse is a a bit strange.
But also, it puts the accent on what's checked for at the end: IsThisLanguageRTL., while your proposal sounds like IsItSomeRtlLanguage (among others) - less obvious.

This comment has been minimized.

Copy link
@Frenzie

Frenzie Dec 3, 2019

Member

That depends on if you think of it as more of a query or more of a boolean, I suppose. As a natural language question I'd agree with you, but I see it in line with isKobo and isTouchDevice.

if isKobo() then
if isRtlLanguage() then
if isLanguageRTL() then

This comment has been minimized.

Copy link
@poire-z

poire-z Dec 5, 2019

Author Contributor

But these Device: methods refer to the Device object they are methods of, so Device:isKobo() makes sense - a boolean/property.
Language here is a module, not an object, so Language:isLanguageRTL(lang) does not refer to a property/boolean of the Language - but it's a function that takes an argument and should tell if that argument is RTL or not - a query.
Anyway, it's only used at 2 places, so I don't mind a name that feel strange to me - if you still feel isRtlLanguage is preferable (isRTL(lang) could work, but it sound even more like a method than the 2 other longer solutions).

@poire-z poire-z force-pushed the poire-z:ui_mirroring branch from eee1adc to 7eb0b43 Dec 3, 2019
Copy link
Member

Frenzie left a comment

A few quick comments; may not have time to take a good look till tomorrow night/day after tomorrow.

-- Mirror UI if language is RTL
Bidi._mirrored_ui_layout = is_rtl
-- Unless requested not to (or requested mirroring with LTR language)
if G_reader_settings:isTrue("reverse_lang_ui_layout_mirroring") then

This comment has been minimized.

Copy link
@Frenzie

Frenzie Dec 3, 2019

Member

What about something like ui_direction_rtl?

This comment has been minimized.

Copy link
@poire-z

poire-z Dec 3, 2019

Author Contributor

To make things less ambiguous, I tried to always use "rtl" for text, and "mirrored" for UI layout.
These 2 things are quite orthogonal to each other (in the implementation, even if in the real world, they should be either both enabled or both disabled).
I'd like to keep that as separate as possible, so we can all distinquish RTL issues (text, text aligmnent) and Mirroring issues (widgets, gestures...), so also for the settings names.

southwest = "southeast",
}

function Bidi.mirrorDirectionIfMirroredUILayout(direction)

This comment has been minimized.

Copy link
@Frenzie

Frenzie Dec 3, 2019

Member

Um, what? :-)

This comment has been minimized.

Copy link
@poire-z

poire-z Dec 3, 2019

Author Contributor

You should see where they are used.
But ok, flipDirectionIfMirroredUILayout() to be less heavy ? :)

This comment has been minimized.

Copy link
@Frenzie

Frenzie Dec 3, 2019

Member

Since flip is used elsewhere already it's probably better that way.

text = "UI layout mirroring and text direction",
sub_item_table = {
{
text = _("Reverse language natural UI layout mirroring"),

This comment has been minimized.

Copy link
@Frenzie

Frenzie Dec 3, 2019

Member

Just Reverse UI layout mirroring?

end
},
{
text = _("Reverse language natural UI text direction"),

This comment has been minimized.

Copy link
@Frenzie

Frenzie Dec 3, 2019

Member

Same.

@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 4, 2019

A 3mn watch with 2 examples, that might show/explain why/when we'll need to BD.wrap() individual bits that will be concatenated (like in our footers): https://www.youtube.com/watch?v=xpumLsaAWGw&t=1692

@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 5, 2019

So, no veto on this?
I can go on with coloring all our code with stuff like that?

local BD = require("ui/bidi")
if BD.mirroredUILayout() then ... end
if BD.rtlUIText() then ... end -- very rare
rotation_angle = BD.mirroredUILayout() and 90 or 0,
local direction = BD.flipDirectionIfMirroredUILayout(ges.direction)
if BD.flipIfMirroredUILayout(ges_ev.pos.x < Screen:getWidth()/2) then ... end
BD.wrap(prefix)
@Frenzie

This comment has been minimized.

Copy link
Member

Frenzie commented Dec 5, 2019

I haven't really had time to think more about the names but the principle always looked good to me. :-)

@poire-z poire-z force-pushed the poire-z:ui_mirroring branch from 7eb0b43 to 097e92d Dec 6, 2019
@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 6, 2019

^ force-pushed just a rebase to master - can be ignored

poire-z added 7 commits Dec 6, 2019
Use some HorizontalSpan for padding between key and value,
instead of prepending a space to the value text (which
won't work as expected if value text is RTL, as it would
be put on the right side of screen).
Hardware first, then canvas & fonts, and then the UI widgets
as late as possible.
Set default language (for Harfbuzz to pick up localized glyphs
in a font), default text direction, and UI element mirroring
depending on the UI language.
Allow TextBoxWidget new text direction/lang parameters to be
set on upper widgets, and propagate them all the way to it
(ScrollTextWidget, InputText, InputDialog, TextViewer).

Use specific non-default ones in some specific cases:
- Force LTR text direction when showing HTML and CSS, and
  configuration files (in some plugins).
- Use Wikipedia server language and text direction when
  showing an article.
- Use auto with Dictionary results, as we don't know the
  dictionary language, and they may contain mixed content.
- Force LTR when showing some paths (still needs more of them)

TextEditor plugin: add 2 new options "Auto paragraph direction"
and "Force paragraph direction LTR".

Footnotes popup: grab HTML direction, and forward it
to MuPDF for proper display.
These updated low-level widgets will handle 90%
of the needed UI mirroring.
Small tweaks all around to handle UI mirroring:
- swap existing symbols like arrows, or use alternative ones
- rotate some images, like chevrons and dogear icons
- flip some left and right swipe handling
- flip some geometry arithmetic like tap on left or right
  side of page or dict window
- use new ProgressWidget:getPercentageFromPosition() instead
  of geometry arithmetic
- BD.wrap() some concatenated string bits, like in reader
  and menu footers
- flip inverse_reading_order when UI is mirrored

More specific tweaks:
- ReaderGesture: reset some specific gestures when UI direction
  has changed (tap on top/bottom left/right corners, for
  bookmarks and FileManager "Plus menu").
- ReaderRolling: show markers on the correct side of page,
  in single or dual page mode.
- KoptOptions: swap left and right icons in Alignment toggle
- CheckMark: proper rendering in all 4 mirroring/rtl combinations.
- VirtualKeyboard: forbid any mirroring
- Move util.getMenuText into Menu.lua
@poire-z poire-z force-pushed the poire-z:ui_mirroring branch from 097e92d to 00c60b7 Dec 6, 2019
@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 6, 2019

Prepended 2 commits that fix some stuff not necessarily related to UI mirroring - but that actually showed up more often when mirroring.
Splitted the initial 3rd commit in 3 commits. People working on fixing mirroring issues might only need to look at the last for inspiration.

@Frenzie: may be for you to later think about, a comment in bidi.lua:

-- See at having GetText_mt.__call() wrap untranslated strings in Bidi.ltr()
-- so they are fully displayed LTR.
@poire-z poire-z changed the title [WIP] UI mirroring for RTL languages UI mirroring for RTL languages Dec 6, 2019
@poire-z poire-z changed the title UI mirroring for RTL languages UI mirroring for RTL languages (Arabic, Farsi, Hebrew...) Dec 6, 2019
@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 6, 2019

Some screenshots (with the current incomplete Arabic translation):

image

image

image

image

image

image

image

image

@NiLuJe

This comment has been minimized.

Copy link
Member

NiLuJe commented Dec 7, 2019

Looks fun :).

@poire-z poire-z merged commit 7952fa2 into koreader:master Dec 8, 2019
1 check passed
1 check passed
ci/circleci: build Your tests passed on CircleCI!
Details
@poire-z poire-z deleted the poire-z:ui_mirroring branch Dec 8, 2019
@robert00s robert00s added this to the 2019.12 milestone Dec 8, 2019
@WaseemAlkurdi

This comment has been minimized.

Copy link
Contributor

WaseemAlkurdi commented Dec 9, 2019

From what I see, it's gorgeous. Absolutely gorgeous. Pure gold.
Lots of people, at least people around me, are going to love this!
The only imperfection is the file name w/ extension. See my comment on the issue. It's minor, though, and I initially didn't even notice it.

@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 11, 2019

@Frenzie: may be for you to later think about, a comment in bidi.lua:
See at having GetText_mt.__call() wrap untranslated strings in Bidi.ltr() so they are fully displayed LTR.

Would you be OK with something like that:

--- a/frontend/gettext.lua
+++ b/frontend/gettext.lua
@@ -30,2 +30,3 @@ local GetText = {
     plural_default = "n != 1",
+    -- wrap_untranslated_noop = function(text) return text end,
 }
@@ -36,2 +37,5 @@ local GetText_mt = {

+GetText.wrap_untranslated_noop = function(text) return text end
+GetText.wrap_untranslated = GetText.wrap_untranslated_noop
+
 --[[--
@@ -50,3 +54,3 @@ Returns a translation.
 function GetText_mt.__call(gettext, msgid)
-    return gettext.translation[msgid] or msgid
+    return gettext.translation[msgid] or GetText.wrap_untranslated(msgid)
 end
@@ -293,5 +297,5 @@ function GetText_mt.__index.ngettext(msgid, msgid_plural, n)
     if plural == 0 then
-        return GetText.translation[msgid] and GetText.translation[msgid][plural] or msgid
+        return GetText.translation[msgid] and GetText.translation[msgid][plural] or GetText.wrap_untranslated(msgid)
     else
-        return GetText.translation[msgid] and GetText.translation[msgid][plural] or msgid_plural
+        return GetText.translation[msgid] and GetText.translation[msgid][plural] or GetText.wrap_untranslated(msgid_plural)
     end
@@ -326,5 +330,5 @@ function GetText_mt.__index.npgettext(msgctxt, msgid, msgid_plural, n)
     if plural == 0 then
-        return GetText.context[msgctxt] and GetText.context[msgctxt][msgid] and GetText.context[msgctxt][msgid][plural] or msgid
+        return GetText.context[msgctxt] and GetText.context[msgctxt][msgid] and GetText.context[msgctxt][msgid][plural] or GetText.wrap_untranslated(msgid)
     else
-        return GetText.context[msgctxt] and GetText.context[msgctxt][msgid] and GetText.context[msgctxt][msgid][plural] or msgid_plural
+        return GetText.context[msgctxt] and GetText.context[msgctxt][msgid] and GetText.context[msgctxt][msgid][plural] or GetText.wrap_untranslated(msgid_plural)
     end
@@ -359,3 +363,3 @@ See [gettext contexts](https://www.gnu.org/software/gettext/manual/html_node/Con
 function GetText_mt.__index.pgettext(msgctxt, msgid)
-    return GetText.context[msgctxt] and GetText.context[msgctxt][msgid] or msgid
+    return GetText.context[msgctxt] and GetText.context[msgctxt][msgid] or GetText.wrap_untranslated(msgid)
 end
--- a/frontend/ui/bidi.lua
+++ b/frontend/ui/bidi.lua
@@ -88,2 +88,15 @@ function Bidi.setup(lang)
     end
+    -- If RTL UI text, have untranslater string (so english) still rendered LTR
+    if Bidi._rtl_ui_text then
+        _.wrap_untranslated = function(text)
+            -- if true then return text end
+            lines = {}
+            for s in text:gmatch("[^\r\n]+") do
+                table.insert(lines, Bidi.ltr(s))
+            end
+            return table.concat(lines, "\n")
+        end
+    else
+        _.wrap_untranslated = _.wrap_untranslated_noop
+    end
 end

It would turn this (that we have currently):
image
into something a bit more readable (I guess, even for Arabic readers) until the thing is translated:
image

@Frenzie

This comment has been minimized.

Copy link
Member

Frenzie commented Dec 11, 2019

Seems okay. It does add some minor complication in case of switching from our Lua GetText to GetText through FFI so it should also mention that somewhere in a comment.

@WaseemAlkurdi

This comment has been minimized.

Copy link
Contributor

WaseemAlkurdi commented Dec 12, 2019

@poire-z There's a slight problem here. In the first screenshot ("before"), things like bullet points started from right to left, as they should be. However, in the second screenshot ("after"), they seem to start from the left. Is that normal or a regression?

@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 12, 2019

No regression for something that does not yet exist :)

That was just some idea to display yet-not-translated english text a bit more correctly, I thought.
For this multi-lines english text fully non-translated, with bullets and punctuations, it looks to me quite bogus when displayed in the RTL UI right aligned with the RTL text rules, with a comma just after the bullet, and punctuations out of place.
I thought it would be more readable displayed LTR if it is plain english, even in the RTL UI - but you tell me.
Also, it may jump more at your eyes (if it needed help to be noticed...) that it needs to be translated that way :)
But once it's translated (it it exists in the translation, even if you let it be plain english), it would not suffer this trick, and would be fully RTL.
But you tell me if that's a plus - or really not. If the "before" is totally fine with you, no need to follow up on my idea :)

@WaseemAlkurdi

This comment has been minimized.

Copy link
Contributor

WaseemAlkurdi commented Dec 14, 2019

@poire-z

No regression for something that does not yet exist :)
That was just some idea to display yet-not-translated english text a bit more correctly, I thought.

Ah, I misunderstood the purpose then. I thought that this is going to be the case for both localized and English.

For this multi-lines english text fully non-translated, with bullets and punctuations, it looks to me quite bogus when displayed in the RTL UI right aligned with the RTL text rules, with a comma just after the bullet, and punctuations out of place.
I thought it would be more readable displayed LTR if it is plain english, even in the RTL UI - but you tell me.

Agreed.
But why not go all the way and LTR flow the whole untranslated dialog then? If there's a single RTL character in the dialog, then render it RTL. What do you think?

Also, it may jump more at your eyes (if it needed help to be noticed...) that it needs to be translated that way :)
But once it's translated (it it exists in the translation, even if you let it be plain english), it would not suffer this trick, and would be fully RTL.
But you tell me if that's a plus - or really not. If the "before" is totally fine with you, no need to follow up on my idea :)

You're right ... the second variation is better, now that I understand the reasoning.
However, it does need to be LTR in that case, doesn't it? If it can't be, then it's alright.

@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 14, 2019

If there's a single RTL character in the dialog, then render it RTL

There are 4 or 5 RTL chars in the above screenshots, at the bottom :)

But why not go all the way and LTR flow the whole untranslated dialog then? If there's a single RTL character in the dialog, then render it RTL. What do you think?

Because that would be quite too much overhead - and very complicated to do.

And we can have bits translated and bits untranslated concatenated in a same paragraph.
My idea was just to have the GetText module which, from the english string, returns the arabic string, and if none, the english string as is - just return in the 2nd case the english string wrapped in LTR-isolate.

The paragraph direction is now something quite global (RTL for arabic, hebrew.. LTR for most others), and changing a single TextBoxWidget (that would happen to display untranslated text) paragraph direction would need kludge in all the places that use it.

In the example above, there are 6 translatable strings (the title "Font hinting", the long description, "Current value", the translated value, "Default value", "not set"), , and some of them can be either translated or not translated. All of them displayed in a single TextBoxWidget, that doesn't know about the little bits that made out the full text.

@WaseemAlkurdi

This comment has been minimized.

Copy link
Contributor

WaseemAlkurdi commented Dec 14, 2019

There are 4 or 5 RTL chars in the above screenshots, at the bottom :)

Six! ;-P
ت ل ق ا ئ ي

and changing a single TextBoxWidget (that would happen to display untranslated text) paragraph direction would need kludge in all the places that use it.

Thought that this could be done by GetText (passing the text orientation along with the translated/untranslated text), but then again, I know nothing of Lua.

All of them displayed in a single TextBoxWidget

This is the dealbreaker here.
So if I understand correctly, TextBoxWidgets take only one value for paragraph direction and apply it throughout the text?
Not something like a word processor where you can specify paragraph direction for each paragraph separately?
If it's the first scenario, then what we have at present is good.

@poire-z

This comment has been minimized.

Copy link
Contributor Author

poire-z commented Dec 14, 2019

So if I understand correctly, TextBoxWidgets take only one value for paragraph direction and apply it throughout the text?

Correct.
There are unicode bidi control characters to specify some direction/isolation inside the text (that I use in this trick) - but they work mostly like HTML inline SPANs.
The HTML block direction is something that is specified/set outside of the text content, and it's what we set as parameters of each TextBoxWidget (see below).

So, for the HTML analogy: TextBoxWidget is a single outer <DIV>, that can have a dir= attribute (the TextBoxWidget parameters).
The text that it contains is inline, with the \n in it being like <br/>, that are just inline linefeeds.
And the unicode bidi control chars we can inject in the text are just <SPAN>s

Not something like a word processor where you can specify paragraph direction for each paragraph separately?

We can't specify paragraph direction for each line (as explained above), but we have the option to use "auto", like HTML <div dir="auto"> (I think):

-- Additional properties only used when using xtext
use_xtext = G_reader_settings:nilOrTrue("use_xtext"),
lang = nil, -- use this language (string) instead of the UI language
para_direction_rtl = nil, -- use true/false to override the default direction for the UI language
auto_para_direction = false, -- detect direction of each paragraph in text
-- (para_direction_rtl or UI language is then only
-- used as a weak hint about direction)
alignment_strict = false, -- true to force the alignemnt set by the alignment= attribute.
-- When false, specified alignment is inverted when para direction is RTL

But still, we use these defaults (unless we specify other values - e.g. when displaying CSS or HTML, we force para_direction=false to get LTR - and the only place we use auto_para_direction=true is for dictionary results, as there's no way for us to know the language we'll get).
So most textboxes use the UI language direction, and I think that's what you want: if you have a some real english text in a line among other lines, like a filename, a url, or some number/code, or some remote error message in english we are reporting to the user, you want them still right-aligned in the arabic UI - you don't want auto_para_direction=true to align them to the left and have a twisted text box :)

Anyway, we can do pretty nice things if needed.
But in this edge/temporary case, while we wait for you and others to translate strings to RTL languages, I guess we'll be fine with that simple & quick trick of wrapping the string in LTR-isolate, mostly just for the punctuations to not fly to the other side.
(So, for a full line or paragraph, the untranslated string becomes a "isolated LTR fragment" (or if multilines, each line becomes an isolated LTR fragment), but still in a RTL line/paragraph - so it's right aligned.)

Btw, for later: if you have room in the arabic keyboard, I think some of these unicode bidi control characters could be set to some key, they can be useful, like in the case you described elsewhere that if you have to start an arabic line with an english word, you rethink your line so to not start with that english word :) The alternative would be to just start the line with a, invisible strong RTL character:

Character Name Code point Equivalent markup Comment
RLM RIGHT-TO-LEFT MARK U+200F none strongly typed RTL character

(May be that does even work in your word processor :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.