determining cell text direction #620
Comments
Thanks for the comment @r12a. The text direction annotation on a column (and hence on a cell) is determined by the

We made a decision a while ago not to support setting annotations on individual rows, as there were very few use cases where it was actually required. This might be something we revisit in vNext if we find that the global notes annotation is being used to set these kinds of annotations at individual row or cell level (this also answers your last point).

We have referenced the Unicode Bidirectional Algorithm as the mechanism for determining how the contents of a cell should be displayed. This is a fairly complex algorithm, dealing with nesting of different directional markers within the text. I believe that we need to view CSV as a Higher-Level Protocol while using that defined algorithm, and that we should be customising it according to HL1 by setting the "paragraph embedding level" based on the text direction annotation on the cell. But this only sets the initial level (i.e. whether you start off thinking the paragraph is LTR or RTL): as I understand it, this will be overridden by the first strong character.

Plainly this isn't explained well enough in the current text (and possibly not above either). If you have a suggestion about how to rephrase this section to make it clearer, I'd welcome that.
Setting the text direction in a cell according to its first strong character actually defeats setting the direction with a text direction annotation. From JeniT's comment, it seems to me that the text direction annotation will always need to be specified for an RTL table, and will then be inherited by columns or respecified for some or all columns. If column text direction overrides the first-strong algorithm, then this algorithm will only be used for tables where text direction is nowhere specified, which means LTR tables by default, which is a rather surprising way of supporting bidi. If column text direction is overridden by the first-strong algorithm, then there is no point setting it, except for cells with no strong character at all, such as pure numbers.
As Mati said, this is incorrect. Bidi is always a difficult area to discuss and understand. In order to help with answering this thread, i have begun (and so far only begun) an article that discusses bidi in plain text, and makes some observations about CSV. It's incomplete at the moment, but if you read it you may be able to help me finish it. See http://r12a.github.io/docs/bidi-plain-text/
This discussion continues at #638. The text there is sufficiently long and complicated that i didn't try to bring the discussion back to this thread.
the 'base direction' of a string is crucial to correct display when dealing with bidirectional text (see http://www.w3.org/International/articles/inline-bidi-markup/uba-basics for an explanation of why). The base direction can be determined by metadata (such as the dir attribute in html) or by testing the data (typically identifying the direction of the first strong character). The latter approach does not always produce the correct base direction, eg. where latin characters appear at the start of a bidirectional string.
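A minimal sketch of first-strong detection, to make the failure case above concrete. It uses Python's standard `unicodedata.bidirectional` to read each character's bidi class; the function name `first_strong_direction` and its `default` parameter are illustrative assumptions, not anything defined by the CSV on the Web drafts. It is also a simplification of the UBA rules, since it does not skip text inside directional isolates.

```python
import unicodedata

# Bidi classes treated as "strong" by the Unicode Bidirectional Algorithm.
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def first_strong_direction(text, default="ltr"):
    """Guess a base direction from the first strong character.

    Hypothetical helper: returns 'ltr' or 'rtl', or `default` when the
    string contains no strong character at all (e.g. pure numbers).
    Simplification: characters inside directional isolates are not skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            return "ltr"
        if bidi_class in STRONG_RTL:
            return "rtl"
    return default


# A Hebrew string followed by Latin text is detected as rtl.
hebrew_first = first_strong_direction("\u05e9\u05dc\u05d5\u05dd world")

# The same Hebrew word preceded by a Latin acronym is detected as ltr,
# even if the author intended the string to be read right-to-left.
latin_first = first_strong_direction("W3C \u05e9\u05dc\u05d5\u05dd")

# Digits are not strong characters, so the default is returned.
digits_only = first_strong_direction("12345")
```

The second call is the case described above: the heuristic picks `ltr` because of the leading Latin characters, which is why metadata is needed to state the base direction reliably.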
as far as i can tell, the overall 'direction' annotation for the table sets the direction of columns on display, but is not used to set the default direction of text in cells.
6.4 Parsing Cells says that the cell annotation for text direction takes its value from the column annotation for text direction.
it seems, then, that it is possible to indicate that the direction of all cells in a particular column should be, say, rtl, by specifying direction for that column.
it's not clear to me whether the last paragraph in 6.5.1 is contradicting this by suggesting that the presence of strong characters in the cell will cause the UBA's first-strong algorithm to kick in, or simply that the contents of the cell will be treated as normal by the UBA, given the base direction set by the 'text direction' cell annotation. I'm particularly unclear, since the preceding sentence seems to be applying different rules for strings with no strong characters, and is followed by 'However...'.
here are some thoughts/issues/questions:
i would have thought that it would be easier to use the direction annotation of the table to establish the default base direction for cell values, rather than only being able to specify it per column. (Then it would be analogous to the dir attribute on the html element.)
i also think it may be useful sometimes to specify the direction for a given row.
if the direction of a table is not set, or the direction is set to auto/default, then i would expect that the first-strong algorithm would be applied to cells to determine the base direction.
on the other hand, if the direction of a cell is specified by an annotation, then the cell should probably take its direction from that.
am i correct in thinking that it is also possible to use metadata to set the direction for a particular cell? If so, then the algorithm should not overwrite that with the value in the column annotation.
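The precedence suggested in the points above could be sketched roughly as follows. This illustrates the proposal in this issue only, not the behaviour of the specification: the function `resolve_cell_direction`, its parameters, and the final `ltr` fallback are all assumptions made for the example. Row-level direction is left out because the points above do not say how it would rank against a column setting.

```python
import unicodedata


def first_strong(text):
    """Return 'ltr' or 'rtl' from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


def resolve_cell_direction(value, cell=None, column=None, table=None):
    """Pick a base direction for one cell (hypothetical precedence).

    The most specific explicit annotation wins: cell, then column, then
    table. Anything other than 'ltr' or 'rtl' (None, 'auto', 'default')
    is treated as "not set". Only when nothing is set does the
    first-strong heuristic apply, with 'ltr' as an assumed last resort.
    """
    for annotation in (cell, column, table):
        if annotation in ("ltr", "rtl"):
            return annotation
    return first_strong(value) or "ltr"


# A column annotation fixes the direction even for Latin-only content.
by_column = resolve_cell_direction("abc", column="rtl")

# Cell-level metadata is not overwritten by the column annotation.
by_cell = resolve_cell_direction("abc", cell="ltr", column="rtl")

# With direction left as 'auto', first-strong decides from the content.
by_content = resolve_cell_direction("\u0645\u0631\u062d\u0628\u0627", table="auto")
```

Under this ordering an explicit annotation is never overridden by the cell content, and first-strong detection is only a fallback for tables where no direction is specified.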