Update getNumberOfChars() definition #200

jarek-foksa · 2016-07-12T12:41:30Z

The current definition of getNumberOfChars() method is as follows:

The getNumberOfChars method is used to determine the total number of characters available for rendering within the current element. When getNumberOfChars() is called, the following steps are run:

Let text be the result that would be obtained from getting the textContent IDL attribute on the current element.

Return the length of text.

This does not look right to me. ThetextContent includes characters inside never-rendered elements such as <title> and <desc> or <tspan style="display: none">. Those characters should not be counted.

I would first spec SVGElement.innerText in a similar fashion to HTMLElement.innerText and then I would update the definition of getNumberOfChars() to rely on innerText rather than textContent.

The text was updated successfully, but these errors were encountered:

jarek-foksa · 2016-07-12T12:47:49Z

Chrome and Safari do not seem to follow the current definition, I haven't checked the other user agents.

AmeliaBR · 2016-07-12T17:00:42Z

Good catch, @jarek-foksa.

We do need to distinguish between title/desc text and actual rendered text content. I'm less convinced whether a JS method like getNumberOfChars should reflect CSS display: none; I can see use-cases for both options.

At this point, we can't rely on innerText for anything normative, since it doesn't have a normative specification.

Also, if we do re-write this method in a way that reflects rendered text as currently styled, I would want to write it in a way that would be inclusive of CSS generated text (as discussed in #125).

jarek-foksa · 2016-07-12T18:53:57Z

@AmeliaBR If the goal is to follow the existing implementations, then I would just make the spec state the only characters to be counted are those that have corresponding glyphs rendered on the screen.

That would exclude any characters belonging to never-rendered elements (<title>, <desc>), renderable elements hidden with CSS (<tspan style="display: none;">) and renderable elements disconnected from the current document / render tree.

nikosandronikos · 2016-07-13T07:26:47Z

Ok, so here's a test:
https://jsfiddle.net/dodgeyhack/Lbhtyh1t/
(I can't guarantee the arabic code points will render and combine correctly on your browser)

Some findings:

Code points are counted, not glyphs.
textContent.length on HTML and SVG content DOES include display:none elements.
getNumberOfChars() does NOT include display:none
textContent.length on SVG content DOES include title and desc elements..
getNumberOfChars() does NOT include title and desc elements
Safari, Chrome, and Firefox on OS X or MacOS or whatever it's called today are interoperable

jarek-foksa · 2016-07-13T09:52:32Z

For getNumberOfChars() and textContent.length, title and desc are NOT included

From my testing on Chrome, Safari and Firefox, textContent.length DOES indlude <title>, <desc> and <tspan style="display: none"> while getNumberOfChars() DOES NOT include them: https://jsfiddle.net/Lbhtyh1t/12/

nikosandronikos · 2016-07-13T10:08:16Z

@jarek-foksa Yep, you're right. I fixed my test and updated my comment above.

AmeliaBR · 2016-07-13T16:28:55Z

The spec reference to .textContent is new in SVG 2, so let's get rid of that and focus on the prose (# of renderable characters) and existing implementations.

The SVG 1.1 definition was:

Returns the total number of characters available for rendering within the current element, which includes referenced characters from ‘tref’ reference, regardless of whether they will be rendered. Effectively, this is equivalent to the length of the Node::textContent attribute from DOM Level 3 Core ([DOM3], section 1.4), if that attribute also expanded ‘tref’ elements.

Removing the <tref> mentions and focusing on the normative text instead of the incorrect "Effectively..." sentence, we get:

Returns the total number of characters available for rendering within the current element, regardless of whether they will be rendered.

An algorithmic definition thereof would look something like:

Returns the sum of the renderable text length for all child nodes of this element, where the renderable text length is as follows:

For a text node, the length of its text content, after normalizing whitespace according to the value of CSS white-space property on this element;

For a text content element, <a> element, or SVGUnknownElement, the result of recursively calling this algorithm on that element;

For all other elements or nodes, 0.

However, since you need to compute CSS styles for white-space, there's no additional performance or utility impact of also making the function depend on display: none, to match the current behavior.

That could be introduced into the recursive algorithm by either checking whether this element is rendered (and always returning 0 if it isn't), or by looking explicitly at the display property for each child element. I'd prefer the second approach, so that you can still measure the usable text length of an element even when it's inside a <defs> or a <symbol> or such.

However, because there is a dependency on CSS, this algorithm would not be defined for an element that is currently detached from a document, unless we explicitly defined it to use the initial values of the properties.

Also, on the question of display: none, we probably need more explicit guidance on how character-by-character positioning attributes are assigned from a parent element if a nested element is not displayed. The only guidance from SVG 1.1 was that elements with display: none "do not affect text layout". SVG 2 explicitly defines "addressable characters" to exclude those "discarded during layout", but it doesn't explicitly mention display: none.

E.g., for something like the following JSBin demo:

<text x="10 20 30 40 50">12<tspan style="display: none">34</tspan>5</text>

Chrome & Edge render it to match:

<text x="10 20 30">125</text>

But Firefox renders it to match:

<text x="10 20 50">125</text>

Assuming the Chrome/Edge behavior is deemed correct, so that display: none content is removed from per-character positioning, then it makes sense that those characters are also removed from getNumberOfChars().

_But_ the same behavior would also have to be used for all the other character-count methods. E.g., if getNumberOfChars() returns 3, getExtentOfChar(2) better not return an empty box from one of the non-displayed characters.

That behavior, of only counting, positioning, and querying the final rendered text after CSS processing, is consistent with the approach to whitespace (which is or isn't included in the count, based on current collapsing rules) and keeps us open to allowing ::before/::after content. In fact, if we define all SVG text positioning and interfaces to use the rendered text string, I see no reason not to allow CSS generated content now (as @tabatkins would support and @Tavmjong would disagree with, according to the last discussion in #125).

(Aside, this will get really fun when you throw a text-transform: full-width style on it. That would change the byte count of the rendered string!)

For backwards consistency (and consistency with existing implementations), text length must still be measured in UTF-16 code points, equivalent to String.length in JavaScript. Once ECMAScript support for proper counting of mutli-byte characters is stable, we'll probably want to introduce parallel versions of the SVG methods that use those new rules, and an attribute or property on the elements to switch how x/y/dx/dy/rotate are counted.

Tavmjong · 2016-07-21T09:19:53Z

General comment about character-by-character positioning attributes: The goal should be to have predictability. The Firefox rendering of the test case meets this goal. You don't want characters to be jumping around if you turn on/off the display of a tspan. The "addressable characters" is meant to describe white space that has been discarded. That could be made more clear in the definition.

AmeliaBR · 2016-07-21T18:04:19Z

You don't want characters to be jumping around if you turn on/off the display of a tspan.

But that's exactly what happens in normal text layout. You remove some text, and the rest moves over to replace it.

If you don't want the other characters to move, you should be using visibility: hidden, not display: none.

AmeliaBR · 2016-07-21T18:06:42Z

See also this demo of hacking support for conditional line breaks, which is broken in Firefox because x/y attributes on the un-rendered <tspan> itself nonetheless affect subsequent characters.

Tavmjong · 2016-07-22T20:33:01Z

@AmeliaBR Your last example could be handled by disabling dx/dy attributes in un-rendered <tspan> elements... but characters inside the <tspan> would still count in assigning the values from a wrapping <text> or <tspan> element:

<text dx="5 3 0 0 5 3">AB<tspan dx="2 2">CD</tspan>EF<text>

This way you get predictability in setting the spacing between E & F without having to wrap them in a new <tspan>.

Tavmjong · 2016-07-22T20:44:52Z

This issue should be run by @heycam .

AmeliaBR · 2016-07-22T21:09:54Z

I think the whitespace issue (the number of characters can change based on computed CSS style) makes "predictable" a bit of a lost cause. As mentioned previously, I also would like to keep the option open to have CSS-generated text for auto-numbering tick marks and so on, and I don't see how that's possible if coordinates are assigned to characters before applying CSS modifications.

That said, my top concerns are that (1) all behavior is clearly defined and (2) attributes that are declared directly on a display-none element do not affect layout.

character set. Part of #200.

nikosandronikos · 2016-07-29T04:42:17Z

@AmeliaBR Could you review those commits please? I'm not so familiar with the text chapter.

@Tavmjong

Including some final edits with respect to issues #200 and #210, as well as general typo fixes, formatting, and clarification. Changes the definition and example of `shape-margin` to reflect the fact that this is a property of the excluded element, not the element from which the exclusion is subtracted. (attn @Tavmjong) Adds a note about ::before and ::after being possible in the future (see #125). Makes ::selection and :focus styles a requirement for interactive user agents, while leaving open how that is implemented. (Most web browsers use background color as part of a selection style, and we don't have a way of specifying background color on a span of text!) Not the full edit I'd like to have done, but hopefully caught most normative issues.

nikosandronikos · 2016-08-01T01:35:53Z

Thanks!

nikosandronikos · 2016-08-01T01:36:24Z

Milestone achieved =)

AmeliaBR · 2016-08-01T01:39:04Z

@nikosandronikos

I did an end-to-end review of the chapter, found a couple other places which needed adjustments based on this decision,

For the getNumberOfChars method in particular, I re-factored the algorithm to avoid confusing DOM "text nodes" with "SVG text content elements". I also switched the variable to count instead of length to be consistent with the other algorithms in the section. My version is:

Let node be the element or node upon which this method was called
If node is a DOM text node, return the length of the text content of node, after normalizing whitespace according to the value of the 'white-space' property on its parent element.
If node is an Element:
- If the element is not rendered (e.g., because the 'display' property has the used value none), then return 0;
- Otherwise, set count to 0, and for each child of node:
  - Recursively call this algorithm and add the returned value to count.
  Return count.
For all other node types (e.g., DOM comments), return 0.

I was going to suggest that you review & close, but it seems you've already got there!

Now I just have to commit a fix for some markup errors I discovered when previewing this post...

AmeliaBR added Text chapter SVG Core Needs WG input labels Jul 12, 2016

AmeliaBR added this to the CR milestone Jul 12, 2016

AmeliaBR added the Agenda+ label Jul 20, 2016

nikosandronikos added Needs editing and removed Agenda+ Needs WG input labels Jul 22, 2016

nikosandronikos assigned Tavmjong Jul 22, 2016

AmeliaBR mentioned this issue Jul 29, 2016

Spec behavior of ::before and ::after within SVG text #125

Open

nikosandronikos added a commit that referenced this issue Jul 29, 2016

Define getNumberOfChars to match implementations. Part of #200.

a04a878

nikosandronikos added a commit that referenced this issue Jul 29, 2016

Make chracters in display:none elements not part of the addressable

76aa2ba

character set. Part of #200.

nikosandronikos closed this as completed Aug 1, 2016

nikosandronikos added DoC_noResponse DoC_accepted and removed Needs editing labels Aug 7, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update getNumberOfChars() definition #200

Update getNumberOfChars() definition #200

jarek-foksa commented Jul 12, 2016 •

edited

Loading

jarek-foksa commented Jul 12, 2016

AmeliaBR commented Jul 12, 2016

jarek-foksa commented Jul 12, 2016 •

edited

Loading

nikosandronikos commented Jul 13, 2016 •

edited

Loading

jarek-foksa commented Jul 13, 2016 •

edited

Loading

nikosandronikos commented Jul 13, 2016

AmeliaBR commented Jul 13, 2016

Tavmjong commented Jul 21, 2016

AmeliaBR commented Jul 21, 2016

AmeliaBR commented Jul 21, 2016

Tavmjong commented Jul 22, 2016

Tavmjong commented Jul 22, 2016

AmeliaBR commented Jul 22, 2016

nikosandronikos commented Jul 29, 2016

nikosandronikos commented Aug 1, 2016

nikosandronikos commented Aug 1, 2016

AmeliaBR commented Aug 1, 2016

Update getNumberOfChars() definition #200

Update getNumberOfChars() definition #200

Comments

jarek-foksa commented Jul 12, 2016 • edited Loading

jarek-foksa commented Jul 12, 2016

AmeliaBR commented Jul 12, 2016

jarek-foksa commented Jul 12, 2016 • edited Loading

nikosandronikos commented Jul 13, 2016 • edited Loading

jarek-foksa commented Jul 13, 2016 • edited Loading

nikosandronikos commented Jul 13, 2016

AmeliaBR commented Jul 13, 2016

Tavmjong commented Jul 21, 2016

AmeliaBR commented Jul 21, 2016

AmeliaBR commented Jul 21, 2016

Tavmjong commented Jul 22, 2016

Tavmjong commented Jul 22, 2016

AmeliaBR commented Jul 22, 2016

nikosandronikos commented Jul 29, 2016

nikosandronikos commented Aug 1, 2016

nikosandronikos commented Aug 1, 2016

AmeliaBR commented Aug 1, 2016

jarek-foksa commented Jul 12, 2016 •

edited

Loading

jarek-foksa commented Jul 12, 2016 •

edited

Loading

nikosandronikos commented Jul 13, 2016 •

edited

Loading

jarek-foksa commented Jul 13, 2016 •

edited

Loading