innerText issues #1679

zcorpan · 2016-08-16T13:16:05Z

When working on #1678 I found the following issues.

cc @rocallahan

Getter

What should non-CSS UAs do? (45253f9)
Is "content order" for CSS boxes defined? (Step 2.5.) (Issue moved to [css-text] Define "content order" for innerText w3c/csswg-drafts#421 )
The rp check only works for direct children of rp, not descendants (e.g. <ruby><rp>(</rp>...).
Do we want that to work? Or change the content model of rp to "text"?
(Change <rp>'s content model to Text #1690)
Issue inline in the spec:

This algorithm is amenable to being generalized to work on ranges. Then we can use it as the basis for Selection's stringifier and maybe expose it directly on ranges. See Bugzilla bug 10583.
@annevk said:

The only thing I'd consider changing further is making the recursion less declarative. Have that be some algorithm that is invoked for each child and also put the result in a variable of some kind the rest of the algorithm uses.

@zcorpan said:

Yeah, it could use some more cleanup. Possibly also switch to iterative traversal instead of recursive?

2202c6c
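
A rough sketch of what the "iterative traversal instead of recursive" idea could look like, using an explicit stack. This is only an illustration on plain objects, not the spec algorithm: the node shape (`data` for text, `childNodes` for elements) and the name `collectTextIteratively` are hypothetical, and the per-node processing the real getter performs after visiting children is left out.

```javascript
// Hypothetical node shape: text nodes carry `data`, elements carry `childNodes`.
function collectTextIteratively(root) {
  const results = [];
  // Seed the stack with the children reversed, so the first child is popped first.
  const stack = [...root.childNodes].reverse();
  while (stack.length > 0) {
    const node = stack.pop();
    if (typeof node.data === "string") {
      // Text node: record its data in tree order.
      results.push(node.data);
    } else {
      // Element: push its children in reverse so they are visited left to right.
      for (let i = node.childNodes.length - 1; i >= 0; --i) {
        stack.push(node.childNodes[i]);
      }
    }
  }
  return results;
}

const sampleTree = {
  childNodes: [
    { data: "a" },
    { childNodes: [{ data: "b" }, { data: "c" }] },
    { data: "d" },
  ],
};
console.log(collectTextIteratively(sampleTree));
```

Anything that has to happen after an element's children are processed would need extra bookkeeping on the stack, which is one reason a recursive formulation may still read more naturally.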

Setter

The setter needs to be better defined, in terms of DOM concepts so mutation observers are invoked correctly etc. (0783a61)
innerText = null; WebKit/Chromium "", Gecko/IE "null". (618de84 , Test innerText = ""/null/undefined web-platform-tests/wpt#3482)
Chromium throws for  .innerText = "x" etc, Gecko does not.
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/html/HTMLElement.cpp?sq=package:chromium&dr=CSs&rcl=1471276009&l=140
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/html/HTMLElement.cpp?sq=package:chromium&dr=CSs&rcl=1471276009&l=454

(Test that setting innerText on etc doesn't throw web-platform-tests/wpt#3491)
Only update existing text node's data if element has just 1 text node?
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/editing/serializers/Serialization.cpp?sq=package:chromium&dr=CSs&rcl=1471276009&l=640

(innerText setter should replace existing text node web-platform-tests/wpt#3493)
If the assigned value starts with a newline, WebKit/Chromium insert an empty text node, Gecko/IE don't. (Current spec matches Gecko/IE.)

(innerText setter should not result in empty text nodes web-platform-tests/wpt#3492)

(Filed chromium bug 639064, webkit bug 160971, edge bug 8536472 for some of the above bullet points)
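
The null handling and leading-newline points above can be illustrated with a small model. This is a sketch on plain objects rather than real DOM nodes: `renderedTextFragment` and the `{ type, data }` node shape are hypothetical names, null is mapped to the empty string as the `TreatNullAs=EmptyString` annotation mentioned in this thread suggests, and treating CRLF as a single line break is an assumption of this sketch.

```javascript
// Hypothetical model: nodes are plain objects, not real DOM nodes.
function renderedTextFragment(input) {
  // Model of [TreatNullAs=EmptyString]: null becomes "", anything else is stringified.
  const text = input === null ? "" : String(input);
  const nodes = [];
  // Split on CRLF, CR, or LF; each separator becomes a line-break marker.
  const lines = text.split(/\r\n|\r|\n/);
  lines.forEach((line, index) => {
    // Skip empty lines so that no empty text node is ever produced.
    if (line !== "") {
      nodes.push({ type: "text", data: line });
    }
    // Every line except the last is followed by a break.
    if (index < lines.length - 1) {
      nodes.push({ type: "br" });
    }
  });
  return nodes;
}

console.log(renderedTextFragment("\nfoo"));
console.log(renderedTextFragment(null));
```

With this model a value starting with a newline produces a break followed by a text node, with no empty text node in front, which is the Gecko/IE behavior described above.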

Other known issues


domenic · 2016-08-16T13:32:19Z

Which of these do you think we'd need to resolve before merging #1678? In my opinion the setter issue should block merging but the rest are OK-ish...

domenic · 2016-08-16T13:33:00Z

rocallahan/innerText-spec#5 also seems kind of bad

rocallahan · 2016-08-16T13:52:41Z

What should non-CSS UAs do?

Just return textContent as if the node was display:none?

Chromium throws for  .innerText = "x" etc, Gecko does not.

That adds complexity and it's hard to imagine it's required for Web compat.

Only update existing text node's data if element has just 1 text node?

Ditto.

rocallahan/innerText-spec#5 also seems kind of bad

Yeah that should be fixed, but you could fix it after merging.

zcorpan · 2016-08-17T11:17:39Z

In my opinion the setter issue should block merging but the rest are OK-ish...

Fixed the setter in 0783a61

zcorpan · 2016-08-17T11:34:43Z

Found a new issue today (updated OP)

If the assigned value starts with a newline, WebKit/Chromium insert an empty text node, Gecko/IE don't. (Current spec matches Gecko/IE.)

From https://rocallahan.github.io/innerText-spec/ with the following normative changes: * Defined behavior for non-CSS UAs. * The setter is better defined. * Added [CEReactions, TreatNullAs=EmptyString] to the IDL. Fixes #465. Remaining issues: #1679

domenic · 2016-08-17T21:43:10Z

I think we forgot to remove a mention of innerText as nonstandard in the CEReactions section.
We should send a PR to the downstream repo with a redirect or other pointer link so as not to cause confusion going forward.

zcorpan · 2016-08-18T13:31:46Z

The rendering section in HTML has this

User agents that do not support correct ruby rendering are expected to render parentheses
around the text of rt elements in the absence of rp elements.

Is this something that people want to do in innerText? (Edit: if we do that, maybe we should also change the algorithm to just ignore rp elements)


domfarolino · 2017-07-10T10:26:18Z

I'd like to help clean up some of the aforementioned comments on how the algorithm performs recursion if that'd be helpful, as I think it is a little odd. Personally I prefer a recursive definition to an iterative one in this case given it is a classic DFS, but here are some things I noticed:

1.) Cleaning up recursion

Step 2 mentions much of what substep 1 mentions. It is like having the following algorithm:

function topProcedure(element) {
  const list = [];
  for (let i = 0; i < element.children.length; ++i) {
    // append every item produced for this child
    list.push(...recursiveProcedure(element.children[i]));
  }
  return list;
}

function recursiveProcedure(childElement) {
  const list = [];
  for (let i = 0; i < childElement.children.length; ++i) {
    list.push(...recursiveProcedure(childElement.children[i]));
  }

  // do some parsing/concat with `list`
  return list;
}

...we unnecessarily repeat ourselves here when we could just call recursiveProcedure on the original element. With that said, could we make the whole innerText algorithm recursive? Right now it isn't (only the substeps are). Is this for performance, in that steps 3-6 are expensive enough that we want to run them only once at the end? I also first thought that maybe it isn't recursive so that we can avoid checking whether a given element is being rendered at each stage (to prevent tons of style recalculations). Then I noticed substep 3 checks whether a given node has a CSS box, which is pretty much the definition of being rendered. Since this algorithm doesn't explicitly define how to act on SVG elements (I think?), could we replace the CSS box check (substep 3) with the full "being rendered" check and benefit from that explicitness?
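
To make the suggestion concrete, here is a minimal sketch of the "call one recursive procedure on the original element" shape. It runs on plain objects, not the DOM; `collectSteps`, `getInnerTextModel`, and the node shape (`data` for text, `childNodes` for elements) are hypothetical, and the real per-node processing is reduced to a placeholder.

```javascript
// Hypothetical node shape: text nodes carry `data`, elements carry `childNodes`.
function collectSteps(node) {
  // Recurse first, over every child node, in tree order.
  const items = [];
  for (const child of node.childNodes ?? []) {
    items.push(...collectSteps(child));
  }
  // Placeholder for the per-node processing done after recursing.
  if (typeof node.data === "string") {
    items.push(node.data);
  }
  return items;
}

function getInnerTextModel(element) {
  // The top level has no loop of its own; it post-processes the result once.
  return collectSteps(element).join("");
}

const paragraph = {
  childNodes: [{ data: "Hello " }, { childNodes: [{ data: "world" }] }],
};
console.log(getInnerTextModel(paragraph));
```

The expensive post-processing still runs only once, in `getInnerTextModel`, so removing the duplicated loop need not change how often the later steps execute.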

2.) Element (children) vs Node (childNodes) traversal

Step 2 mentions "applying the following recursive procedure to each child node node of...". Later in substep 1, the algorithm receives a given node node and mentions "recursively applying this procedure to each child of node in...". The former indicates that the recursive algorithm will be given all child nodes of the given element (I assume similar to what would be returned with element.childNodes). The latter indicates the substeps are applied only to the HTMLElement children of future nodes (similar to the collection returned from node.children). Does this matter? Is there a reason for this? I'm assuming no, and that substep 1 is just missing a word (thus making it even more like the outer algorithm + making it more enticing to eliminate duplication).

3.) This one gets me. Is there a reason why the two sections in fiddle should have different results? I can't really think of one but it is entirely possible I'm just missing something.

domenic · 2017-07-10T17:16:34Z

So @zcorpan is our innerText expert but he's out for a while. Let me see if I can answer these satisfactorily...

Overall it would indeed be great to clean up the recursion to make it more explicit. "Applying the following recursive procedure" is hard to understand; I think we all knew that. So cleaning it up to be more like normal programming would make a lot of sense.

1.) I think you are right that it could be refactored nicely without any normative change.
2.) I'm not sure where you're seeing children. "child" is a synonym for "child node" in the DOM tree context. We would say "element child node" if we meant only elements. It's very important we hit all child nodes, not just element child nodes, as otherwise we'd miss all the text.
3.) This just seems like something where we're codifying majority legacy behavior, without rhyme or reason. It's notable that Edge at least gives the empty string for both, but Chrome and Firefox give the different results like the spec. Not sure about Safari, but assuming they match Chrome and Firefox, it's probably not worth changing everyone.
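
The point about child nodes versus element children can be shown with a small model. This is only a sketch on plain objects; `gather` and the tree are hypothetical, and the numeric constants mirror the values of `Node.ELEMENT_NODE` and `Node.TEXT_NODE`.

```javascript
const ELEMENT = 1; // same value as Node.ELEMENT_NODE
const TEXT = 3; // same value as Node.TEXT_NODE

// Collect text data, visiting either all child nodes or only element children.
function gather(node, onlyElementChildren) {
  if (node.nodeType === TEXT) {
    return [node.data];
  }
  const kids = onlyElementChildren
    ? node.childNodes.filter((child) => child.nodeType === ELEMENT)
    : node.childNodes;
  return kids.flatMap((kid) => gather(kid, onlyElementChildren));
}

const tree = {
  nodeType: ELEMENT,
  childNodes: [
    { nodeType: TEXT, data: "a" },
    { nodeType: ELEMENT, childNodes: [{ nodeType: TEXT, data: "b" }] },
  ],
};

console.log(gather(tree, false)); // visits every child node
console.log(gather(tree, true)); // visits element children only
```

Restricting the traversal to element children never reaches a Text node, so nothing is collected at all.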

domfarolino · 2017-07-10T17:24:25Z

I think you are right that it could be refactored nicely without any normative change.

Cool I can submit a PR!

It's very important we hit all child nodes, not just element child nodes, as otherwise we'd miss all the text.

Perfect that's what I wasn't sure about. I guess I was referring to how in step 2 we apply the procedure to "each child node node of...." and in substep 1 we apply the procedure to "each child of..." and didn't know if the lack of node in the second example was significant at all.

...different results like the spec. Not sure about Safari, but assuming they match Chrome and Firefox, it's probably not worth changing everyone.

IIRC I believe Safari matches Chrome + Firefox FWIW. Cool thanks!

domfarolino · 2017-07-10T23:26:02Z

Are Chrome, Firefox, and Safari in violation of the spec at this time?

Consider the following HTML:

<p>
  <span style="display: none">Does this text count as innerText?</span>
</p>

From my understanding the spec indicates that the text "Does this text...." should be returned from p.innerText, though Chrome + Firefox + Safari return nothing.

Explanation:

When calling p.innerText we eventually slip into the substep recursive procedure which first recurses all the way down to the deepest node first (the Text node). Recursion bottoms out and we eventually hit substep 4 since we're at a Text node. We return a list that looks like this: «"Does this text count as innerText?"».

When we're unwinding upward we hit the stack frame that was passed the span element. In this frame the list is equal to «"Does this text count as innerText?"» since that is what was returned from our previous recursive call. We hit substep 3 since this node does not have any CSS boxes (display: none right?) and return the list. In other words, we don't do anything special or choose to disregard the list even though it came from a node that is not "rendered". Eventually this list propagates up to the top and gets returned as the value of p.innerText. Is this right?

Edit

@domenic I assume by

but Chrome and Firefox give the different results like the spec.

You didn't mean they give different values from what the spec says they should (aka they're in violation); but if that's what you meant then sorry this is duplicate info. ...and if they are indeed in violation I assume we should change the spec right?

rocallahan · 2017-07-11T00:16:44Z

Recursion bottoms out and we eventually hit substep 4 since we're at a Text node.

No, we don't, because the Text node has no CSS boxes so we bail out at step 3 before reaching step 4.

domfarolino · 2017-07-11T05:04:39Z

~~@rocallahan Thank you very much. I had no idea Text node does not have CSS boxes, apparently my understanding of that concept is deficient!~~

domfarolino · 2017-07-11T05:16:42Z

@rocallahan Hm does that mean we'd never meet the condition for substep 4 then? Because if we did, that means we're operating on a Text node, and if we're operating on a Text node we'd always bail out early (substep 3) no?

rocallahan · 2017-07-11T15:54:13Z

A display:none Text node has no CSS boxes but a regular display:inline Text node does.
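
A small model of the ordering being discussed, assuming each node simply records whether it has a CSS box. The `hasBox` flag, `collectRendered`, and the node shape are hypothetical; the sketch only mirrors the order of the two substeps as described in this thread.

```javascript
// Hypothetical node shape: `hasBox` stands in for "has any CSS boxes".
function collectRendered(node) {
  const items = (node.childNodes ?? []).flatMap(collectRendered);
  // Modeled substep 3: without a box, return what the children produced, unchanged.
  if (!node.hasBox) {
    return items;
  }
  // Modeled substep 4: a Text node that does have a box contributes its data.
  if (typeof node.data === "string") {
    items.push(node.data);
  }
  return items;
}

const label = "Does this text count as innerText?";

// Text inside a display:none span: neither the span nor the text has a box.
const hidden = {
  hasBox: true,
  childNodes: [{ hasBox: false, childNodes: [{ hasBox: false, data: label }] }],
};

// The same structure with the span rendered normally.
const visible = {
  hasBox: true,
  childNodes: [{ hasBox: true, childNodes: [{ hasBox: true, data: label }] }],
};

console.log(collectRendered(hidden));
console.log(collectRendered(visible));
```

In the hidden case the Text node bails out before contributing anything, so the list that reaches the span is already empty.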


domenic mentioned this issue Aug 17, 2016

Upstream the innerText spec #1678

Merged

domenic added the compat Standard is not web compatible or proprietary feature needs standardizing label Aug 17, 2016

zcorpan mentioned this issue Aug 18, 2016

Change content model of <rp> to "text" #1689

Closed

zcorpan added a commit that referenced this issue Aug 18, 2016

Editorial: remove innerText from a note about non-standard APIs

c37a1fc

Ref. #1679 (comment)

zcorpan mentioned this issue Aug 18, 2016

Editorial: remove innerText from a note about non-standard APIs #1691

Merged

annevk pushed a commit that referenced this issue Aug 18, 2016

Editorial: remove innerText from a note about non-standard APIs

066cb98

See #1679 (comment) for context.

zcorpan added a commit to web-platform-tests/wpt that referenced this issue Aug 18, 2016

Test that setting innerText on etc doesn't throw

203878e

Ref. whatwg/html#1679

zcorpan mentioned this issue Aug 18, 2016

Test that setting innerText on etc doesn't throw web-platform-tests/wpt#3491

Merged

zcorpan added a commit to web-platform-tests/wpt that referenced this issue Aug 18, 2016

innerText setter should not result in empty text nodes

c71a3a1

Ref. whatwg/html#1679

zcorpan mentioned this issue Aug 18, 2016

innerText setter should not result in empty text nodes web-platform-tests/wpt#3492

Merged

zcorpan added a commit to web-platform-tests/wpt that referenced this issue Aug 18, 2016

innerText setter should replace existing text node

83f41fa

Ref. whatwg/html#1679

This was referenced Aug 18, 2016

innerText setter should replace existing text node web-platform-tests/wpt#3493

Merged

Point to new location for spec and tests rocallahan/innerText-spec#6

Closed

[css-text] Define "content order" for innerText w3c/csswg-drafts#421

Closed

zcorpan mentioned this issue Dec 30, 2016

Should innerText have a blocklist of elements it doesn't work on? #2222

Closed

zcorpan added a commit to web-platform-tests/wpt that referenced this issue Jan 10, 2017

Test that setting innerText on etc doesn't throw

4451bd8

Ref. whatwg/html#1679

Ms2ger pushed a commit to web-platform-tests/wpt that referenced this issue Jan 17, 2017

innerText setter should replace existing text node

aae3e1b

Ref. whatwg/html#1679

zcorpan added a commit to web-platform-tests/wpt that referenced this issue Mar 14, 2017

innerText setter should not result in empty text nodes

7cfa65b

Ref. whatwg/html#1679

annevk pushed a commit to web-platform-tests/wpt that referenced this issue Mar 14, 2017

innerText setter should not result in empty text nodes

f5f8d4f

Ref. whatwg/html#1679.

domfarolino mentioned this issue Jul 12, 2017

Clean up innerText recursion #2831

Merged

domenic pushed a commit that referenced this issue Jul 14, 2017

Editorial: clean up innerText recursion

2202c6c

Helps with #1679.

alice pushed a commit to alice/html that referenced this issue Jan 8, 2019

Editorial: remove innerText from a note about non-standard APIs

675c82d

See whatwg#1679 (comment) for context.

alice pushed a commit to alice/html that referenced this issue Jan 8, 2019

Editorial: clean up innerText recursion

20934f1

Helps with whatwg#1679.
