Programming language (i.e. Java) code editor and Arabic text #113

Closed
tomerm opened this issue Apr 28, 2017 · 3 comments

Comments

tomerm commented Apr 28, 2017

I would like to raise two use cases for discussion:

  1. Plain text which can appear inside comments / constants in Java (or any other programming language) code. The expected direction of such text would be RTL, while the majority of code editors (as part of various Integrated Development Environments) do not enforce it.

  2. Layout of the Java editor (or any other programming language editor). While math formulas can appear in RTL order, Java code (even if it includes Arabic text in various contexts, such as constants, variable names, etc.) should appear in LTR flow (which is the natural flow of Java or any other programming language syntax).
    Attaching an example from an Eclipse-based IDE in which the Java code editor is LTR-oriented:

[Image: arabicdevenv_javacode]

behnam (Member) commented Apr 28, 2017

Thanks, @tomerm, for filing an issue, but I think "text rendering in the context of a programming language" is out of the scope of the work of this Task Force. Of course, most of what we cover in ALReq will be useful for various use cases, including IDEs, but because of the focus being publication of web documents and e-books, we won't be able to cover programming languages or IDEs even as a special case.

That said, I agree that IDEs are very inconsistent in how they handle RTL. GTK+-based editors usually apply auto-direction per line (with cascading fallbacks) to every single line, including comments. But most other IDEs stick with LTR paragraph direction.
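To make the per-line auto-direction idea concrete, here is a minimal Python sketch of a first-strong heuristic. It is not GTK+'s actual implementation: the function names and the fallback rule (a line with no strong character inherits the direction of the previous line) are assumptions for illustration, and the sketch ignores directional isolates and embeddings.

```python
import unicodedata


def first_strong_direction(line, fallback="ltr"):
    """Return 'rtl' or 'ltr' from the first strong bidi character in the line."""
    for ch in line:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-like or Arabic letter
            return "rtl"
        if bidi_class == "L":  # Latin and other left-to-right letters
            return "ltr"
    # No strong character found (empty line, digits, punctuation only).
    return fallback


def directions_per_line(text, default="ltr"):
    """Resolve one direction per line.

    Assumed fallback: a neutral line inherits the previous line's direction,
    and the first line falls back to `default`.
    """
    result = []
    previous = default
    for line in text.splitlines():
        previous = first_strong_direction(line, fallback=previous)
        result.append(previous)
    return result
```

With this heuristic a comment line such as `// مرحبا` resolves to RTL because the slashes and the space are not strong characters, while a line starting with a Java keyword resolves to LTR.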

Now, with almost no real RTL (localized) programming language still available, I would say sticking with LTR would be a better decision for now.

Besides usability issues, if there are any security concerns that may occur in real use cases, I suppose we can redirect the comment to Unicode's UAX #31: Unicode Identifier and Pattern Syntax.

tomerm (Author) commented Apr 30, 2017

@behnam thanks a lot for the detailed and swift response. I understand why you decided to exclude text rendering in the context of a programming language from the scope. Still, with your kind permission, I would like to share several observations (in doing so I have no intention of affecting your decision):

1. IDEs moved to the web quite a while ago. Editing of text as part of program source code now happens in web-based editors. For example:

2. Web documents are also undergoing a transformation. This goes beyond:
  • a simple transition of various office software to the web-based world (Google Docs, IBM Docs, etc.)
  • a document becoming a live creature with its own life cycle. Several people can work on the content of the same document simultaneously and author different portions of it.

In many cases a document serves as an execution environment for data scientists. For example, a Jupyter notebook (http://jupyter.org/) may include both:

  • rich text paragraphs (which are characteristic of a classical document or e-book), and
  • pieces of code which are executed inside the document itself. The results of this execution are also embedded in the body of the same document (the result of execution may be text or a chart). This reminds me of the living / interactive newspapers from the Harry Potter movies (https://www.youtube.com/watch?v=xaBEFqFVSE8).

In other words, web documents and e-books quite often include code written in some programming language.

I never dreamed of having a localized programming language. I am aware of localized languages for creating logical rules (if my salary is > 100 K and my ... then take mortgage (500 K) from bank ...), but not of a localized programming language. Thus my assumption has always been that programming language syntax / flow will always be LTR. However, as pointed out in UAX #31: Unicode Identifier and Pattern Syntax, non-Latin characters can still be present in such contexts (at least as part of constants / comments). The Monaco editor handles such cases reasonably well. Namely, it preserves the syntax of the programming language on display, and by doing so contributes tremendously to the readability of the code (for more details please see: https://github.com/Microsoft/monaco-editor/issues/280). However, even the Monaco editor does not handle constants / comments in any special way. They are all displayed with LTR text direction. This is of course not optimal for languages such as Arabic.
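As a small illustration of non-Latin characters being accepted by identifier rules, here is a sketch in Python, whose identifier syntax is documented as being based on UAX #31. It says nothing about Java's own rules; the helper name `is_valid_identifier` and the sample names are made up for the example.

```python
def is_valid_identifier(name):
    """Check a name against Python's identifier syntax (based on UAX #31)."""
    return name.isidentifier()


# Hypothetical sample names, used only for illustration.
samples = ["متغير", "قيمة_1", "1متغير", "total"]
for name in samples:
    print(repr(name), is_valid_identifier(name))
```

An Arabic word is accepted as an identifier, and so is one followed by an underscore and a digit, while a name starting with a digit is rejected regardless of script.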

I am closing this issue as irrelevant.

tomerm closed this as completed Apr 30, 2017

behnam (Member) commented May 2, 2017

Right, @tomerm. I generally agree on most of the issues mentioned. There are a few different situations discussed in your comment, so let's talk about each separately.

a) Block-level embedding in an rtl/bidi document

This is the case where images/videos/listings (like code examples) get embedded in a document. We will be covering this in the Page layout chapter.

b) Block-level embedding of rtl/bidi content inside another document

I think this is an area that we can also cover in the Page layout chapter.

c) Inline rtl/bidi content in a non-text context

These non-text contexts are what Unicode calls a higher-level protocol, and handling them is totally application-dependent. For example, as mentioned in https://github.com/Microsoft/monaco-editor/issues/280, you may want to enforce LTR direction on all lines of a programming language, and then apply auto paragraph direction detection to every string literal.
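One way such a higher-level protocol could be prototyped is sketched below in Python. It assumes the editor already forces LTR base direction on each line, and it only wraps the contents of double-quoted string literals in FIRST STRONG ISOLATE / POP DIRECTIONAL ISOLATE characters for display, so each literal resolves its own direction. The regular expression and the function name are assumptions for illustration; the pattern is deliberately naive and does not handle comments, character literals, or multi-line strings.

```python
import re

FSI = "\u2068"  # FIRST STRONG ISOLATE: direction taken from first strong character
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: ends the isolate

# Naive pattern for a double-quoted literal with backslash escapes (assumption).
STRING_LITERAL = re.compile(r'"((?:\\.|[^"\\])*)"')


def isolate_string_literals(line):
    """Return a display copy of the line with each literal's contents isolated.

    The quotes stay outside the isolate, so they remain part of the LTR
    code flow; only the text between them gets auto direction.
    """
    return STRING_LITERAL.sub(
        lambda match: '"' + FSI + match.group(1) + PDI + '"', line
    )
```

The returned string is meant for rendering only. The stored source should stay untouched, since the added control characters would otherwise become part of the literal.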

As much as I agree that there's a lot of room for work here, I think it's very application specific, and one of those areas that has no precedent nor any existing common practice, needing experiments and prototypes to get started. I think this lack of existing information would be the main reason that we cannot cover this as a Special Case in ALReq, at least not in version 1.0.

d) Cursor movement in bidi documents

This is one of the recurring issues, in general. For example, there are still many problems with this in the OpenOffice/LibreOffice authoring applications, especially when you need to jump across words or lines.

The TF being focused on layout, I'm afraid we don't have plans for any authoring/editing UX material. The i18n WG may have plans about that for the near future. (@r12a?)


2 participants