Solve some problems with RTL Languages #6714

Open
AhmedElTabarani opened this issue Feb 10, 2022 · 12 comments
Labels
👥 discussion (This repo is guided by its community! Let's talk!), good first issue (a good starting point for newcomers), help wanted (needs help solving a blocked or stuck item)


@AhmedElTabarani
Contributor

AhmedElTabarani commented Feb 10, 2022

Here I will describe some problems with RTL languages and their solutions.
I will explain each point here so that we can discuss it.
Maybe we can then add a section about these problems and solutions to the Guidelines in CONTRIBUTING.

The discussion behind this issue started in PR #6706 and in #6715.

What is the issue?

Suppose we have an RTL text entry like this:

* [تعلم البرمجة](URL) - Author Name

Note: تعلم البرمجة means "Learn Programming".

It will appear on the website like this:
image

In this case, we can just wrap the list in a div with dir="rtl":

<div dir="rtl">

* [تعلم البرمجة](URL) - Author Name
</div>

Result:
image

Is that it? No! The monster shows up below 😢

The issue with mixing RTL and LTR languages!

The real problem appears when RTL and LTR text are mixed in the same entry.

Case 1

<div dir="rtl">

* [تعلم HTML](URL) - Author Name
</div>

Note: تعلم means "Learn".

Result:
image

Look, the words got thrown in the mixer!

Case 2

What if we need an LTR entry to go to the right (both the author name and the title are LTR)?

<div dir="rtl">

* [Learn HTML](URL) - Author Name
</div>

Result:
image

Both words have been swapped!!

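Solution?

We can solve both problems with a Unicode mark called RLM (right-to-left mark):
https://en.wikipedia.org/wiki/Right-to-left_mark

Add &rlm; right after the LTR word that should follow the RTL flow. The mark is an invisible character that behaves like an RTL letter, so the word before it is treated as part of the RTL text.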
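
To see why an invisible mark can change the layout, it helps to look at the Unicode bidirectional class of each character. The following is a minimal Python sketch (not part of the original proposal) that uses the standard `unicodedata` module; the helper name `bidi_classes` is made up for illustration.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK, the character that &rlm; stands for


def bidi_classes(text):
    """Return the Unicode bidirectional class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


print(bidi_classes("HTML"))  # Latin letters: strong left-to-right ("L")
print(bidi_classes("تعلم"))  # Arabic letters: strong right-to-left ("AL")
print(bidi_classes(" -"))    # space and hyphen: no strong direction
print(bidi_classes(RLM))     # the mark itself: strong right-to-left ("R")
```

Characters without a strong direction, such as the space and the hyphen between the title and the author name, take their direction from their neighbours, which is why placing a strong RTL mark next to them changes how the line is ordered.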

Solve case 1

<div dir="rtl">

* [تعلم HTML&rlm;](URL) - Author Name
</div>

Result:
image

We added &rlm; after HTML

Solve case 2

<div dir="rtl">

* [Learn HTML&rlm;](URL) - Author Name
</div>

Result:
image

You get the point!

Extra Cases!

Case 1

Try to make C# go to the right!

<div dir="rtl">

* C#
* [تعلم لغة C# الرائعة](URL) - إسم المؤلف
</div>

Note: * [تعلم لغة C# الرائعة](URL) - إسم المؤلف means * [Learn the Cool C# Language](URL) - Author Name

Result:
image

Symbols have the same problem when we try to render them RTL.
And they have the same kind of solution 😉: the LRM (left-to-right mark) Unicode mark:
https://en.wikipedia.org/wiki/Left-to-right_mark

<div dir="rtl">

* C#&lrm;
* [تعلم لغة C#&lrm; الرائعة](URL) - إسم المؤلف
</div>

Why do we use &lrm; and not &rlm;?
When C# sits in an RTL context so that it moves to the right, the symbol # has no direction of its own.
It is rendered as if it were RTL, so the symbol jumps to the other side of the C.

By adding &lrm; after C#, we mark the symbol as LTR, so the whole term renders as one LTR word.
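
As a rough illustration of this rule, the Python sketch below appends &lrm; to a term only when its last character has no strong direction (as # does). The function names `needs_lrm` and `mark_ltr` are hypothetical and the check is an assumption about how one might automate the advice above, not an existing tool.

```python
import unicodedata

STRONG = {"L", "R", "AL"}  # bidi classes with a direction of their own


def needs_lrm(term):
    """True when the term ends in a character without a strong direction."""
    return bool(term) and unicodedata.bidirectional(term[-1]) not in STRONG


def mark_ltr(term):
    """Append the &lrm; entity after a trailing symbol such as '#' or '+'."""
    return term + "&lrm;" if needs_lrm(term) else term


print(mark_ltr("C#"))    # symbol at the end: the entity is appended
print(mark_ltr("HTML"))  # ends in a letter: returned unchanged
```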

Case 1.1

Both the author name and the title are LTR, and the title ends with a symbol, as in C#

<div dir="rtl">

* [Learn C#](URL) - Author Name
</div>

Result:
image

The first step is simple: just put &rlm; at the end of the title

<div dir="rtl">


* [Learn C#&rlm;](URL) - Author Name
</div>

Result:
image

But note that the symbol # is rendered as if it were RTL, so it moves to the other side.
So we must also put &lrm; right after the symbol.

<div dir="rtl">

* [Learn C#&lrm;&rlm;](URL) - Author Name
</div>

Result:
image

Case 2

If the title is in English and the author name is in Arabic

* [Learn HTML](URL) - إسم المؤلف

Result:
image

It is enough to set the direction to RTL, without adding any Unicode mark

<div dir="rtl">

* [Learn HTML](URL) - إسم المؤلف
</div>

Result:
image
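
The title rules from Case 1, Case 2, Extra Case 1.1 and this case can be summarized in one small function. The Python sketch below is only one possible reading of those rules: `title_suffix` and `first_strong` are hypothetical names, and the logic assumes the entry is already wrapped in a div with dir="rtl". It does not handle a symbol in the middle of a title (Extra Case 1).

```python
import unicodedata

STRONG_RTL = {"R", "AL"}


def first_strong(text):
    """Bidi class of the first strongly directional character, or None."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L" or cls in STRONG_RTL:
            return cls
    return None


def title_suffix(title, author):
    """Entities to append to the title, following the cases above."""
    if first_strong(author) in STRONG_RTL:
        return ""  # Arabic author name: dir="rtl" alone is enough
    if first_strong(title[::-1]) != "L":
        return ""  # title ends in RTL text: nothing to add
    ends_with_symbol = unicodedata.bidirectional(title[-1]) != "L"
    return ("&lrm;" if ends_with_symbol else "") + "&rlm;"


print(title_suffix("Learn C#", "Author Name"))
print(title_suffix("Learn HTML", "إسم المؤلف"))
```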

Case 3

Sometimes we add extra information such as (:construction: *in process*) after the author name

<div dir="rtl">

* [عنوان بالعربي](URL) - Author Name (meta data)
* [Title In LTR&rlm;](URL) - Author Name (meta data)
</div>

Result:
image

It looks correct, but we read from right to left, so it would be better to have this information on the left: the reader then sees the author name first and the information after it.

To solve this, we just put &rlm; after the author name

<div dir="rtl">

* [عنوان بالعربي](URL) - Author Name&rlm; (meta data)
* [Title In LTR&rlm;](URL) - Author Name&rlm; (meta data)
</div>

Result:
image
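
A small Python sketch of this last fix: it inserts &rlm; between the author name and a trailing parenthesized note. The pattern and the function name `mark_author` are assumptions made for illustration; real entries may need a more forgiving pattern.

```python
import re

# "* [title](url) - author (meta)" with the three parts captured separately.
ENTRY = re.compile(
    r"^(?P<head>\* \[.+\]\(.+\) - )(?P<author>[^()]+?)(?P<meta> \(.+\))$"
)


def mark_author(line):
    """Insert &rlm; after the author name when a (meta) note follows it."""
    m = ENTRY.match(line)
    if m is None or m["author"].endswith("&rlm;"):
        return line  # no trailing note, or already marked
    return f'{m["head"]}{m["author"]}&rlm;{m["meta"]}'


print(mark_author("* [Title In LTR&rlm;](URL) - Author Name (meta data)"))
```

Lines without a trailing note, and lines that already carry the mark, are returned unchanged, so the function can be run repeatedly over the same file.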

@AhmedElTabarani
Contributor Author

AhmedElTabarani commented Feb 10, 2022

If we add a section about this solution to the Guidelines in CONTRIBUTING (after we finish discussing it here, of course), other contributors can do the same with their own RTL languages.

@eshellman
Collaborator

Thanks for adding this. We can leave it open for a while.

@davorpa added the good first issue, help wanted, and 👥 discussion labels Feb 11, 2022
@davorpa
Member

davorpa commented Feb 11, 2022

As commented in #6715, these marks, whether written as an HTML entity or as the raw Unicode character, break the alphabetize plugin, and it is even worse when they are placed at the beginning of a sentence (for the reason, see
https://github.com/vhf/remark-lint-alphabetize-lists/blob/ee5f968040acf941c9c4d61fefb2bb1e3b1e8a7b/lib/alphabetical-list-items.js#L5-L14)

From Windows11 charmap.exe
image

Moreover, the raw non-printable character should be used instead of the HTML entity. Remember that Markdown markup should be HTML-agnostic.
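
To make both points concrete, here is a hedged Python sketch: one helper swaps the entities for the raw characters, and another builds a sort key that ignores the marks. This is not the code of the alphabetize plugin; `to_raw_marks` and `sort_key` are hypothetical names used only for illustration.

```python
LRM = "\u200e"  # raw character behind &lrm;
RLM = "\u200f"  # raw character behind &rlm;


def to_raw_marks(text):
    """Replace the HTML entities with the raw, non-printable characters."""
    return text.replace("&rlm;", RLM).replace("&lrm;", LRM)


def sort_key(title):
    """Comparison key that ignores direction marks and letter case."""
    raw = to_raw_marks(title)
    return "".join(ch for ch in raw if ch not in (LRM, RLM)).casefold()


titles = ["C", RLM + "B"]
print(sorted(titles) == titles)                 # a leading mark sorts after "C"
print(sorted(titles, key=sort_key) == titles)   # with the key, "B" comes first
```

A naive comparison places the entry that starts with a mark after every ASCII title, because the mark's code point is higher than any Latin letter; stripping the marks before comparing avoids that.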

davorpa added a commit to davorpa/remark-lint-alphabetize-lists that referenced this issue Feb 11, 2022
@AhmedElTabarani changed the title from "Solve some problems & solutions with RTL Languages" to "Solve some problems with RTL Languages" Feb 11, 2022
@AhmedElTabarani
Contributor Author

AhmedElTabarani commented Sep 21, 2022

@davorpa I can write regex patterns for all these cases.
Would that help you detect them automatically in the future?

@davorpa
Member

davorpa commented Sep 21, 2022

> @davorpa I can write regex patterns for all these cases. Would that help you detect them automatically in the future?

Go ahead 😉. It can be helpful to any maintainer ❤️

@Mayank7225

@AhmedElTabarani Hello sir, can I work on this?

@AhmedElTabarani
Contributor Author

> @AhmedElTabarani Hello sir, can I work on this?

About the regex patterns? OK, no problem at all.

I was working on it, but I have been very busy these past weeks.

I had decided to write a JavaScript script to detect all of these cases, plus some unit tests to keep everything organized.

This is the last thing I ended up with; maybe it will help you.

Case 0 (It is enough to make a div with dir='rtl')
* [تعلم البرمجة](URL) - Author Name
Regex:
^\* \[[^\w\d\?><;,\{\}\[\]\-_\+=!@\#\$%^&\*\|\']+\]\(.+\) - .+(?<!\(.+\))$


Case 1
* [تعلم HTML](URL) - Author Name
Regex:
^\* \[[\u04c7-\u0591\u05D0-\u05EA\u05F0-\u05F4\u0600-\u06FF-\u0621-\u064A\d\?><;,\{\}\[\]\-_\+=!@\#\$%^&\*\|\' ]+[\w\d]+\]\(.+\) - [\w\ ]+$

Case 2
* [Learn HTML](URL) - Author Name
Regex:
^\* \[[^\u04c7-\u0591\u05D0-\u05EA\u05F0-\u05F4\u0600-\u06FF-\u0621-\u064A]+[\w\d]\]\(.+\) - [\w\ ]+$

Extra Case 1
* C#
* [تعلم لغة C# الرائعة](URL) - إسم المؤلف


Extra Case 1.1
* [Learn C#](URL) - Author Name


Extra Case 2 (It is enough to make a div with dir='rtl')
* [Learn HTML](URL) - إسم المؤلف

Extra case 3
* [عنوان بالعربي](URL) - Author Name (meta data)
* [Title In LTR&rlm;](URL) - Author Name (meta data)
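
For the cases that have no pattern yet, one alternative to hand-written character ranges is to classify each part of an entry by the Unicode bidirectional class of its characters. The sketch below is in Python rather than the JavaScript mentioned above, and it is only an illustration under that assumption: `direction` and `classify` are hypothetical names, and the entry pattern is a simplification.

```python
import re
import unicodedata

ENTRY = re.compile(r"^\* \[(?P<title>.+)\]\((?P<url>.+)\) - (?P<author>.+)$")
MARK_ENTITY = re.compile(r"&(?:rlm|lrm);")
RTL = {"R", "AL"}


def direction(text):
    """Return 'rtl', 'ltr', 'mixed' or 'neutral' for a piece of text."""
    text = MARK_ENTITY.sub("", text)  # entity letters must not count as LTR
    classes = {unicodedata.bidirectional(ch) for ch in text}
    has_rtl = bool(classes & RTL)
    has_ltr = "L" in classes
    if has_rtl and has_ltr:
        return "mixed"
    if has_rtl:
        return "rtl"
    return "ltr" if has_ltr else "neutral"


def classify(line):
    """Return (title direction, author direction), or None for other lines."""
    m = ENTRY.match(line)
    if m is None:
        return None
    return direction(m["title"]), direction(m["author"])


print(classify("* [تعلم HTML](URL) - Author Name"))
print(classify("* [Learn HTML](URL) - إسم المؤلف"))
```

The returned pair can then be mapped to the cases listed above, for example a mixed or LTR title with an LTR author corresponds to Case 1 and Case 2, while an LTR title with an RTL author corresponds to Extra Case 2.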

@avipars
Contributor

avipars commented Oct 20, 2022

The main RTL languages are Arabic, Persian and Hebrew... which are only 3 out of all the languages translated on this repo... might be better to have a special section for these languages... as it is not relevant for all the LTR ones.

@CryptoMitch

Have you tried the following?

  • Update the CONTRIBUTING.md file to include a section for RTL languages, explaining the issues, solutions, and usage of Unicode marks (RLM and LRM) for different cases.

  • Create a separate section or a separate file specifically for Arabic, Persian, and Hebrew languages in the repository, as @avipars suggested. This would help maintain a better organization for RTL languages and make it easier to manage content for these languages separately.

@eshellman
Collaborator

some good ideas in this issue. Would welcome a PR.

@nerdberg792

Does this issue still need to be fixed?

@JatinSainiOO7

Can I work on this issue?
Thank you.
