Update the "backwards deletion" Q+A article #520

Merged
merged 22 commits into from
Jan 12, 2024

aphillips
Contributor

@aphillips aphillips requested review from xfq and r12a August 24, 2023 14:54
@netlify

netlify bot commented Aug 24, 2023

Deploy Preview for i18n-drafts ready!

Name Link
🔨 Latest commit 179dfa0
🔍 Latest deploy log https://app.netlify.com/sites/i18n-drafts/deploys/65a1575c89752f00097a0ebd
😎 Deploy Preview https://deploy-preview-520--i18n-drafts.netlify.app/questions/qa-backwards-deletion.en

@xfq
Member

xfq commented Aug 25, 2023

I updated the article to match the latest template.

We also need to:

  • add a qa-backwards-deletion/translations.js file
  • add alt attribute to the images

<div class="try">
<h4>Try it in your browser</h4>
<p>Try selecting, cursoring, deleting, and backspacing with this word in Hindi (in the Devanagari script). The word means "Unicode" and contains <em>four</em> graphemes and <em>seven</em> Unicode code points.</p>
<p><input id="tryHindi" type="text" name="tryHindi" lang="hi" class="try" value="&#x92f;&#x942;&#x928;&#x93f;&#x915;&#x94b;&#x921;"></input>
Member

@xfq xfq Aug 25, 2023

Note that input elements don't need a close tag. (Same for other instances.)
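For anyone wanting to check the "four graphemes, seven code points" claim in the snippet above, here is a minimal sketch. It is an approximation only: the helper `approx_clusters` is a hypothetical name, and it simply attaches characters whose general category starts with `M` to the preceding character. Real UAX #29 segmentation has more rules (ZWJ sequences, Hangul, SARA AM and similar spacing marks, conjuncts in some implementations), so this should not be read as what any browser does.

```python
import unicodedata


def approx_clusters(text: str) -> list[str]:
    """Group each character with the combining marks that follow it.

    Assumption: a "mark" is any code point with general category
    Mn, Mc, or Me. This is a simplification of UAX #29, not a
    conformant grapheme cluster segmenter.
    """
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch  # attach the mark to the previous cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


# Same code points as the value attribute of the tryHindi input.
word = "\u092f\u0942\u0928\u093f\u0915\u094b\u0921"

print(len(word))                   # code point count: 7
print(len(approx_clusters(word)))  # approximate grapheme count: 4
```

For this particular word the approximation agrees with the numbers in the article text; that will not hold for every script.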

aphillips and others added 6 commits August 25, 2023 07:47
Co-authored-by: Fuqiao Xue <xfq@w3.org>
- add a note about vertical text
- fix tamil tag
- reword about thai's relationship to indic
I took it to the logical conclusion and just made a small section about
vertical text at the end. It makes the document more attractive.
@aphillips aphillips requested a review from xfq August 26, 2023 16:35
@aphillips aphillips requested a review from xfq August 28, 2023 14:40

<p>In vertical text, the left arrow moves left one row and the right arrow right one row, while up and down arrows move one visual character up or down in the row of text.</p>
<p>In vertical text, the left arrow moves left one row and the right arrow right one row, while up and down arrows move one user-perceived character (grapheme) up or down in the row of text.</p>

<h3 id="text_selection_desc">Selecting text via the keyboard</h3>

<p>Text selection begins much like cursoring, by positioning the cursor at the start (or end) of the desired text and then selecting to the other end of the desired text. This can be done using a pointing instrument, such as a mouse, or using keyboard gestures such as holding "shift" and cursoring through the text. Unlike cursoring, text selection is constrained by the need to select logical characters, so a different number of keystrokes or gestures may be required compared to simple cursoring. This is particularly true for bidirectional text.</p>

<p>Selection using a pointing device, such as a mouse, is subtly different in most implementations than using the cursor keys to extend a selection. When using a pointing device, the selection is entirely logical, between the start and end point of the selection. At least on most physical keyboards, the user can access text selection, usually by holding down the "shift" key while cursoring in the text. As noted before, the cursor keys always move visually and in the indicated direction of the key. For certain bidirectional texts this can mean that the entire text cannot be selected via the cursor keys alone!</p>
Contributor

I believe this is incorrect. I just checked, using some bidi text in my Arabic picker, and when the shift key is held down and the cursor keys used to extend the highlight, the left and right cursor keys extend the highlight in the opposite direction to that observed when simply moving the cursor.

Contributor

Try it on Firefox. Chrome does things differently.

@r12a
Contributor

r12a commented Aug 29, 2023

Generally speaking, most text navigation and editing follows the user-perceived character boundaries. For most implementations this corresponds to Unicode's definition of "default extended grapheme cluster" boundaries [UAX29]. The main exception to this is backspacing, which usually follows Unicode code point boundaries in the underlying encoded text (although there are exceptions to this). For the simplest scripts and languages, these often amount to the same thing.

This and other parts of the document strike me as over-simplified and in places incorrect, but there are terminological problems (which we are already familiar with) that cloud the issue. My experience in working with these things has led me to view the world in terms of code points, which are grouped into grapheme clusters, which are in turn grouped into orthographic syllables. (I'm in the process of writing that up more clearly, elsewhere...)

I'm inclined to agree with Norbert that this idea of user-perceived character boundaries is too vague and not clearly substantiated enough to be used as the name of a unit of segmentation. Rather it's merely a way of helping people imagine why code point units are not sufficient in some cases. The distinction between grapheme clusters and orthographic syllables is not informed by its use, but is crucial to the information provided by this article.

My experience has shown that browsers use these 3 different units for text operations such as cursor movement and deletion, depending on the language, and sometimes inconsistently within a single language, but also from browser to browser. I've been investigating this and writing up results for the various browsers in my orthography notes, under the section "Graphemes". It may be worth going to https://r12a.github.io/scripts/switch.html and selecting the 'graphemes' segment id, then cycling through the orthographies using the control "Select an orthography". You should especially look for the subheading "Browser behaviour", where it exists, to find the results per browser.

(I was wondering whether it would be useful to list behaviour against orthography in a table of some sort – not necessarily in this article, but somewhere.)

That said, it's not clear to me what is your source of authoritative information about how cursoring and deletion should work. I don't think that it is made clear in the UAX how things should work, but is rather left up to the application to decide the exact mechanism.(?) Or are you meaning to describe what browsers currently do? I think it would be good to make that much clearer.

I also think that the article should make it much clearer (actually, I think it's hardly mentioned at all other than for one Thai example) that very different segmentation rules may apply for other operations on the text, such as line breaking, justification, text spacing, and the like – and that this is not an issue, but is useful.

The exceptions section alludes to the importance of orthographic syllables, but this isn't really an exception - even in terms of current browser support. Again it varies by browsers and by orthography, but it's something that needs to be mentioned either together with or given equal importance to the section entitled "Combining characters".
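
The simplified model in the paragraph quoted at the start of this comment (backspace removes one code point, other operations work on a whole cluster) can be sketched as follows. This only illustrates that model; as noted above, real browsers differ by orthography and from one another. The function names are hypothetical, the cluster grouping is an approximation based on general category rather than UAX #29, and the sample text is just the first four code points of the Thai word used in the article.

```python
import unicodedata


def approx_clusters(text: str) -> list[str]:
    """Approximate clusters: attach Mn/Mc/Me marks to the preceding character."""
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def backspace(text: str) -> str:
    """Model of backspace in the quoted text: drop the last code point only."""
    return text[:-1]


def forward_delete_at_start(text: str) -> str:
    """Model of forward delete at the start: drop the whole first cluster."""
    return "".join(approx_clusters(text)[1:])


cluster = "\u0e2b\u0e49"          # HO HIP + MAI THO (base letter + tone mark)
sample = cluster + "\u0e2d\u0e07"  # followed by O ANG and NGO NGU

print(backspace(cluster) == "\u0e2b")                     # True: only the tone mark goes
print(forward_delete_at_start(sample) == "\u0e2d\u0e07")  # True: letter and mark go together
```

A model that worked on orthographic syllables instead would need a third grouping step on top of the clusters, which this sketch does not attempt.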


<p>Indic scripts, such as the Devanagari and Tamil examples above, are not the only scripts affected by this. The same can be found for combining marks in many languages. For example, the first cluster in this Thai word: <q>คืออะไร</q>. [get better example; demonstrate middle cursor deletion effects in Thai]</p>
<p>South-Asian scripts, such as the Devanagari and Tamil examples above, are not the only ones affected by this; similar behavior can be found in any script that employs combining marks. For example, the first cluster in this Thai word <q lang="th">ห้องน้ำ</q> has similar behavior. The end of this word shows additional complexity: the <span class="codepoint" translate="no"><bdi lang="th">&#xe33;</bdi><code class="uname">U+0E33 THAI CHARACTER SARA AM</code></span> appears as a separate typographical unit for effects such as inter-character spacing, but behaves as a single grapheme for the purposes of selection, cursoring, and forward deletion.</p>
Contributor

The description of the SARA AM at the end of the word is not quite correct. It doesn't appear as a single typographic unit for text spacing: rather it is split into a combining mark and a letter, and the space is introduced before the latter – so actually it is split into 2 typographic units, the first of which includes the preceding letter and its combining tone mark.

Contributor Author

Thanks Richard for your comments.

I'm not unaware that this article doesn't really do the job and I agree with the points in your long comment above. Part of me wants to rubbish the whole thing. A proper job would require a thorough rewrite, which is more work than I think I'm willing to do, so I may look for a volunteer to take it over.

@xfq
Member

xfq commented Oct 29, 2023

I think it might be useful to add an example of IVS. For example, the characters on this page are made of two code points (U+9F8D + an ideographic variation selector), but for users, they should be input, selected, and deleted as a whole. Regarding input methods, many input methods can already input IVS. We can mention cursor movement, selection, and deletion here.
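
As a rough illustration of the IVS point, the sketch below treats a base character plus a trailing variation selector as one deletable unit. The selector used here (U+E0100, VARIATION SELECTOR-17) is an assumption chosen for illustration, since the comment does not say which selector the linked page uses, and the function names are hypothetical. It models the desired behaviour only and ignores other combining marks.

```python
def is_variation_selector(ch: str) -> bool:
    """True for VS1..VS16 (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def delete_last_unit(text: str) -> str:
    """Delete the last character together with any variation selectors after it."""
    end = len(text)
    # Skip back over trailing variation selectors first.
    while end > 0 and is_variation_selector(text[end - 1]):
        end -= 1
    # Then remove the base character they modify (if there is one).
    return text[:max(end - 1, 0)]


base = "\u9f8d"
selector = "\U000e0100"  # assumed selector, for illustration only
ivs = base + selector

print(len(ivs))                           # 2 code points
print(delete_last_unit("A" + ivs) == "A")  # True: base and selector removed as a whole
```

Deleting by code point instead would leave the bare base character behind, silently changing which glyph variant the text requests.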

@aphillips
Contributor Author

The working group elected not to complete work on this QA document. However, I don't want to lose the invested effort. I'm merging the changes for now. I should probably add a visible deprecation too.

@aphillips aphillips merged commit 3dc7ec7 into w3c:gh-pages Jan 12, 2024
4 checks passed

3 participants