Composition dictionary should be changed [Bug 22059] #4

AFBarstow · 2015-10-12T13:51:21Z

This issue is a copy of comment 0 of W3C Bugzilla bug 22059. Additional comments on this issue contain the other comments from the original Bugzilla bug.

= Comment 0: Takayoshi Kochi 2013-05-16 05:42:53 EDT

As the spec dropped the JavaScript IME spec, the Composition dictionary
doesn't have to be a separate dictionary but can be a part of InputMethodContext.

In 20130404 WD:
dictionary Composition {
readonly attribute Node text;
readonly attribute Range caret;
};

In Microsoft's proposal
https://dvcs.w3.org/hg/ime-api/raw-file/default/proposals/IMEProposal.html

interface InputMethodContext : EventTarget {
...
readonly attribute DOMString compositionText;
readonly attribute unsigned long compositionStartOffset;
readonly attribute unsigned long compositionEndOffset;
....
};

The rationale for this is "
For composition dictionary in current proposal, we can see exposing IME clauses as child nodes of text node, and making them real DOM nodes with styles being useful for a JS-based IME as the IME needs to tell the web application how to render the composition, but if JS IME is not a goal, is there any other scenarios that will benefit from this? If not, how about a simple design that expose the text being composed as DOMString?

For caret range, if it’s for enabling JS-based IME, then exposing the caret ranges of IME clauses is helpful, but if it’s not for JS IME, is there any other usage? We understand that web applications want to know about the whole string of the tentative composition, but we are not sure in which case they want to know how the whole tentative composition string is divided into several parts. Another issue is that the range type only tells the start and end offsets of the composition from its immediate parent. Web application usually wants to know the offset from the beginning of the text field so that it could combine the composition alternate with the text before it to create a full text string. But the beginning of the text field can be up in the parent tree if it’s a contentEditable element and requires JavaScript code to trace up in the parent tree to get the right offset.

So instead of a dictionary type for composition, we suggest compositionText, compositionStartOffset and compositionEndOffset as a simpler design. Please let us know if you have scenarios that need to be the other way."
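The point in the rationale about contentEditable (a Range offset is relative to its immediate parent, so script has to walk the tree to get an offset from the start of the field) can be sketched as follows. This is only an illustration over a hypothetical, minimal tree model (`TextLeaf`, `ElementNode`, `offsetFromRoot` are invented names, not DOM or IME API members); it assumes the root offset is counted over the concatenated text of all text leaves in document order, as textContent does.

```typescript
// Hypothetical minimal tree model standing in for DOM nodes (assumption).
interface TextLeaf { kind: "text"; data: string }
interface ElementNode { kind: "element"; children: TreeNode[] }
type TreeNode = TextLeaf | ElementNode;

// Converts an offset inside `target` into an offset from the start of
// `root`'s flattened text. Returns null if `target` is not under `root`.
function offsetFromRoot(
  root: TreeNode,
  target: TextLeaf,
  localOffset: number
): number | null {
  let consumed = 0; // characters of text seen before reaching `target`
  function walk(node: TreeNode): number | null {
    if (node.kind === "text") {
      if (node === target) return consumed + localOffset;
      consumed += node.data.length;
      return null;
    }
    for (const child of node.children) {
      const found = walk(child);
      if (found !== null) return found;
    }
    return null;
  }
  return walk(root);
}

// "Hello, " sits directly under the editable root; the composition lives in
// a nested element, so its local offset 2 is not the offset from the root.
const composing: TextLeaf = { kind: "text", data: "world" };
const editable: ElementNode = {
  kind: "element",
  children: [
    { kind: "text", data: "Hello, " },
    { kind: "element", children: [composing] },
  ],
};
const rootOffset = offsetFromRoot(editable, composing, 2); // 7 + 2 = 9
```

With plain offset attributes such as compositionStartOffset, this walk would be done by the browser instead of by page script.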

AFBarstow · 2015-10-12T13:52:08Z

Comment #1 from https://www.w3.org/Bugs/Public/show_bug.cgi?id=22059 @TakayoshiKochi

Takayoshi Kochi 2013-07-04 03:35:40 EDT

As suggested by James Su, I'd like to incorporate composition dictionary
within InputMethodContext.

It would look like:
interface InputMethodContext {
...
readonly attribute DOMString text;
readonly attribute long selectionStart;
readonly attribute long selectionEnd;
readonly attribute Uint32Array segments;
....
}

where selectionStart/End means identical to that for <input>/<textarea>,
and added segments information for dividing the text into clauses.

AFBarstow · 2015-10-12T13:53:09Z

Comment #2 https://www.w3.org/Bugs/Public/show_bug.cgi?id=22059#c2 @travisleithead

Travis Leithead [MSFT] 2013-08-20 14:21:57 EDT

(In reply to comment #1)

As suggested by James Su, I'd like to incorporate composition dictionary
within InputMethodContext.

It would look like:
interface InputMethodContext {
...
readonly attribute DOMString text;

The interface is labelled "InputMethodContext" and so "text" is a little ambiguous in my opinion. I liked "compositionText" better, but I could be OK with this.

readonly    attribute long        selectionStart;
readonly    attribute long        selectionEnd;

Selection & composition are two completely different underlying concepts that shouldn’t be combined. I think calling these "selection.." is confusing with normal text selection. The currently selected text will already be available via the input and textarea's selection properties--no need to duplicate the functionality. Offset (in the MS proposal) makes it clear that it’s character positions and not DOM nodes. These offset character positions mark the actual "active" composition range (which may be different from what is currently selected). Maybe for brevity: "startOffset"/ "endOffset"? or "textContentStart"/"textContentEnd"?

readonly    attribute Uint32Array segments;

OK. This is not relevant to all IMEs though. I suppose we could implement this for other IMEs by always returning only 1 segment.

where selectionStart/End means identical to that for <input>/<textarea>,
and added segments information for dividing the text into clauses.

No need for the redundancy. What we found is that we actually needed the "active" composition offsets, not the selected text which varies depending on the state of the IME. See above.

AFBarstow · 2015-10-12T13:54:16Z

Comment 3 https://www.w3.org/Bugs/Public/show_bug.cgi?id=22059#c3 @TakayoshiKochi

Takayoshi Kochi 2013-10-02 01:42:00 EDT

Sorry for my belated response.

(In reply to Travis Leithead [MSFT] from comment #2)

(In reply to comment #1)
readonly    attribute long        selectionStart;
readonly    attribute long        selectionEnd;
Selection & composition are two completely different underlying concepts
that shouldn’t be combined. I think calling these "selection.." is confusing
with normal text selection. The currently selected text will already be
available via the input and textarea's selection properties--no need to
duplicate the functionality. Offset (in the MS proposal) makes it clear that
it’s character positions and not DOM nodes. These offset character positions
mark the actual "active" composition range (which may be different from what
is currently selected). Maybe for brevity: "startOffset"/ "endOffset"? or
"textContentStart"/"textContentEnd"?

I agree this is a fair argument.

I don't have a strong preference among these,
1 startOffset / endOffset
2 textContentStart / textContentEnd
3 activeSegmentStart / activeSegmentEnd
4 activeSegmentStartOffset / activeSegmentEndOffset
5 etc. etc.

but 1 is too simple and may be confusing, and 2 may also be confused with a
DOM node's textContent. How about 3?

readonly    attribute Uint32Array segments;
OK. This is not relevant to all IMEs though. I suppose we could implement
this for other IMEs by always returning only 1 segment.

(FYI now it's spec'ed as "sequence<unsigned long> getSegments();"
https://dvcs.w3.org/hg/ime-api/raw-file/default/Overview.html#widl-Composition-getSegments-sequence-unsigned-long )

For non-segmenting IMEs (most non-Japanese IMEs) return just one '0' element.
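As a rough sketch of how a web app might consume such a segments list: the helper below assumes each entry is the starting offset of a clause within the composition text, in ascending order, which is how the "just one '0' element" case reads. `splitIntoClauses` is a hypothetical name, not part of the spec.

```typescript
// Hypothetical helper: splits composition text into clauses, assuming
// `segments` holds the starting offset of each clause in ascending order.
function splitIntoClauses(text: string, segments: number[]): string[] {
  const clauses: string[] = [];
  for (let i = 0; i < segments.length; i++) {
    const start = segments[i];
    // A clause runs up to the next clause's start, or to the end of the text.
    const end = i + 1 < segments.length ? segments[i + 1] : text.length;
    clauses.push(text.slice(start, end));
  }
  return clauses;
}

// A segmenting (e.g. Japanese) IME reporting two clauses.
const segmented = splitIntoClauses("本を読む", [0, 2]); // ["本を", "読む"]

// A non-segmenting IME reporting the single element 0: one clause, whole text.
const unsegmented = splitIntoClauses("hello", [0]); // ["hello"]
```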

where selectionStart/End means identical to that for <input>/<textarea>,
and added segments information for dividing the text into clauses.

No need for the redundancy. What we found is that we actually needed the
"active" composition offsets, not the selected text which varies depending
on the state of the IME. See above.

See above, too ;)

AFBarstow · 2015-10-12T13:56:54Z

Comments 4 through 16 (listed newest first)

= Comment 16 Takayoshi Kochi 2014-04-08 01:22:20 EDT

Reopening this.

= Comment 15 Takayoshi Kochi 2014-01-27 00:24:27 EST

Okay, thanks for the comment.
I'll work on updating the spec accordingly.

= Comment 14 Jianfeng Lin 2014-01-21 20:15:52 EST

We use offset because the key scenario we were trying to tackle is the search suggestion in <input>, in which case it has to be an offset within the element's textContent. For contentEditable a range object could be more useful and the API could support both offset and range there.

= Comment 13 Takayoshi Kochi 2013-12-13 03:06:08 EST

I would like to ask for clarification - the original proposal[1] says:

on an element with the contentEditable flag set, then this is the
starting offset relative to the target's textContent property
(textContent is a linear view of all the text under an element)

But the current MSDN document[2] (as of today, Dec. 13, 2013) doesn't
mention the behavior when compositionStartOffset/End are used in
contenteditable.

The way a browser generates textContent from the DOM tree and the way
it tracks where an IME composition is located are not usually compatible -
is there really a use case to get offsets within contenteditable?

I personally suppose that for contenteditable it is reasonable to return
Ranges before and after the IME composition within the contenteditable
(as different attributes, of course) - but I am not sure yet.

What do you think?

[1] https://dvcs.w3.org/hg/ime-api/raw-file/tip/proposals/IMEProposal.html#widl-InputMethodContext-compositionStartOffset
[2] http://msdn.microsoft.com/en-us/library/ie/dn433247(v=vs.85).aspx

= Comment 12 Jianfeng Lin 2013-12-02 21:40:10 EST

Closing the bug as we agree with having composition{Start,End}Offset directly under InputMethodContext interface and moving active segment to a separate document.

= Comment 11 Takayoshi Kochi 2013-12-02 21:09:17 EST

The document has been moved:
https://dvcs.w3.org/hg/ime-api/raw-file/default/Annex.html

See example 2.

= Comment 10 Takayoshi Kochi 2013-11-07 04:49:41 EST

(In reply to Takayoshi Kochi from comment #9)

For active segments, it will be used for rendering composition by webapps,
not browsers.

See example 2 of the spec.
https://dvcs.w3.org/hg/ime-api/raw-file/default/Overview.html

= Comment 9 Takayoshi Kochi 2013-11-06 22:38:51 EST

It is because composition{Start,End}Offset are relative to its parent's
"value" and external to the composition itself.

For active segments, it will be used for rendering composition by webapps,
not browsers.

= Comment 8 Jianfeng Lin 2013-11-06 20:03:53 EST

Thanks for accepting the proposal, Takayoshi. I saw that you put it right under InputMethodContext interface. Why not under the "composition" attribute of that interface? Since this is information about the composition, it makes more sense to be inside the composition attribute, and there you could simplify the name to be "startOffset/endOffset", so developers can reference them by element.inputMethodContext.composition.startOffset.

I'm still curious about the use cases for active segments.

= Comment 7 Takayoshi Kochi 2013-11-05 23:52:29 EST

As composition{Start,End}Offset have been added to the spec, closing this.

For hasComposition()/compositionText, see
https://www.w3.org/Bugs/Public/show_bug.cgi?id=22028

= Comment 6 Takayoshi Kochi 2013-11-05 23:29:42 EST

Thanks Jianfeng for clarification.

Added compositionStartOffset/compositionEndOffset.
https://dvcs.w3.org/hg/ime-api/raw-file/8c061ee19f99/Overview.html#widl-InputMethodContext-compositionStartOffset

= Comment 5 Jianfeng Lin 2013-10-21 21:31:17 EDT

Takayoshi, the compositionStartOffset / compositionEndOffset we proposed are different from the activeSegmentStartOffset / activeSegmentEndOffset you suggested, so please don't replace them. For example, when the user types "honnwoyomu" in a Japanese IME and hits space, the whole sentence will be in composition while only the first part "本を" will be the active segment you mentioned. So the text between compositionStartOffset and compositionEndOffset should be "本を読む" while the text between activeSegmentStartOffset and activeSegmentEndOffset should be "本を". We are not against exposing the information about where the active segment is, but exposing the position of the composition is more important.
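The "honnwoyomu" example can be written out as a small sketch. `CompositionSnapshot` is a hypothetical plain object, not the InputMethodContext interface; to sidestep the question of what each offset pair is measured from, it assumes the field contains only the composition, so the composition starts at offset 0.

```typescript
// Hypothetical snapshot of a text field during composition (assumption:
// the field holds nothing but the composition, so all offsets start at 0).
interface CompositionSnapshot {
  value: string;
  compositionStartOffset: number;
  compositionEndOffset: number;
  activeSegmentStartOffset: number;
  activeSegmentEndOffset: number;
}

// State after typing "honnwoyomu" and pressing space, per the example above.
const snapshot: CompositionSnapshot = {
  value: "本を読む",
  compositionStartOffset: 0,
  compositionEndOffset: 4,
  activeSegmentStartOffset: 0,
  activeSegmentEndOffset: 2,
};

// The whole tentative composition.
const compositionText = snapshot.value.slice(
  snapshot.compositionStartOffset,
  snapshot.compositionEndOffset
); // "本を読む"

// Only the clause the IME is currently converting.
const activeSegmentText = snapshot.value.slice(
  snapshot.activeSegmentStartOffset,
  snapshot.activeSegmentEndOffset
); // "本を"
```

The two slices differ, which is why one pair of offsets cannot stand in for the other.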

= Comment 4 Takayoshi Kochi 2013-10-10 04:11:13 EDT

changed to activeSegmentStart/End
https://dvcs.w3.org/hg/ime-api/rev/10a3d6ec9336
