Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[css-highlight-api] invalidation of static ranges #4597

Closed
frivoal opened this issue Dec 13, 2019 · 21 comments
Closed

[css-highlight-api] invalidation of static ranges #4597

frivoal opened this issue Dec 13, 2019 · 21 comments

Comments

@frivoal
Copy link
Collaborator

frivoal commented Dec 13, 2019

As far as I am aware, uses of StaticRange prior to css-highlight-api were for ranges created by the User Agent and passed to the author.
In css-highlights-api, it's the other way around, which raises (for the first time?) the question of invalidation of static ranges, as they may have become stale by the time the browser needs to use them.

https://drafts.csswg.org/css-highlight-api-1/#range-invalidation defines a way to evaluate that. Is that reasonable? If not, what should we do instead?

@dandclark
Copy link
Contributor

The invalidation logic at https://drafts.csswg.org/css-highlight-api-1/#range-invalidation seems reasonable.

In addition to checking whether the start and end node are both in a document tree, I wonder if this should check if they are in the same document tree. For example if either the start or end container is adopted into a same-domain iframe document we should probably treat the range as invalid.

Checking that the range doesn't cross the boundary of a contained element per #4598 could also be an addition here.

@sanketj
Copy link
Member

sanketj commented Feb 11, 2021

I have proposed a set of criteria for validating StaticRange here: whatwg/dom#947. These criteria can be used for any StaticRange, not just static ranges used in the highlight API. I think #4598 should be an additional restriction for the highlight API, but not something that applies generally to StaticRange.

@annevk
Copy link
Member

annevk commented Feb 15, 2021

Why does this not use live ranges internally? Is any implementation actually planning on using static ranges internally?

cc @smaug----

@sanketj
Copy link
Member

sanketj commented Feb 15, 2021

There are certainly benefits to the highlight API using live ranges internally:

  • The Highlight constructor can be modified to only accept StaticRange. The way I see it is that highlights would create and maintain their own "highlight ranges" internally, which are live, and the static ranges passed into the constructor are just inputs for creating these internal ranges.
  • We wouldn't need to define any invalidation behavior for the highlight API, and [css-highlight-api][css-contain] static ranges and css-contain #4598 wouldn't be an issue anymore.

However, I don't think we have answers yet regarding the performance tradeoff - live range maintenance cost vs. static range and highlight validation cost. It would be great to know if Safari ran any numbers on this (since they have an experimental implementation already) and what their findings were. cc: @hober, @megangardner

Regardless of which option is ultimately chosen, we will need to document this in the spec, as there are painting behavior differences depending on which way we go.

@annevk
Copy link
Member

annevk commented Feb 16, 2021

Are the painting differences elaborated on somewhere?

@sanketj
Copy link
Member

sanketj commented Feb 16, 2021

Consider the example below:

<!DOCTYPE html>
<html>
  <head>
    <style>
      div::highlight(test) {
        background-color: red;
      }
    </style>
  </head>
<body>
    <div>abd</div>
    <input type='button' value='click' onclick='onClick()'>
    <script>
      let div = document.querySelector('div');
      let textNode = div.firstChild;
      let range = new StaticRange({startContainer: textNode, startOffset: 1, endContainer: textNode, endOffset: 3});
      let highlight = new HighlightRangeGroup(range); 
      CSS.highlights.set('test', highlight);

      function onClick() {
        textNode.insertData(2, 'c');
      }
    </script>
</body>
</html>

After the button is clicked and 'c' is inserted, the div will have text "abcd". If the highlight was represented with a live range internally, the range's end offset will become 4 and "bcd" will be painted with the highlight styles. If we were just painting the static range internally, the range endpoints will not change and only "bc" will be highlighted.

@annevk
Copy link
Member

annevk commented Feb 16, 2021

Another consideration here is what would be better or preferred by web developers, but I realize that might be hard to get data on.

@smaug----
Copy link

https://drafts.csswg.org/css-highlight-api-1/#range-invalidation is missing things like shadow DOM, and how is that supposed to work if the start point has been moved in DOM to be after the end point?
All those would need to be validated during paint time, and if there are lots of highlights around, like there could be for example if one implements a spellchecker using the API, the overhead during paint time might be rather significant.

@sanketj
Copy link
Member

sanketj commented Feb 17, 2021

That's why I had proposed the StaticRange validation criteria in whatwg/dom#947. That should cover the shadow DOM cases and the overlapped endpoints case. That criteria seems pretty generic, not highlight API specific, hence I proposed adding those to the StaticRange spec. To validate static highlights, we would need to run those checks, as well as an additional check for #4598. Of course, all of the above is only applicable if the highlight API paints static ranges directly, instead of using live ranges internally.

If we used live ranges internally, there are some additional questions we'll need to answer.

  • How would authors be able to remove a range from a Highlight? If the static ranges passed into the Highlight constructor are just inputs that the Highlight uses to create its own internal live ranges that are not exposed to JS, then authors don't have access to the actual range objects being painted and can't remove them individually.
  • At what point would we create the internal live ranges maintained by the highlight? A highlight is only painted if it is registered (ie. added to CSS.highlights). If we create the internal ranges when we construct the Highlight object, then we start paying the cost of maintaining them during DOM mutations, even though the highlight isn't being painted yet. Instead, we could create the internal ranges only when the Highlight is added to CSS.highlights. However, this would require storing the passed-in static ranges internally until that moment, and they may become invalid by then.
  • At what point do we "drop" the internal live ranges? A Highlight can also be deregistered (ie. removed from CSS.highlights). Let's say that after the Highlight is removed, we just wait for it to go out of scope and get GC'ed, which is when the internal ranges will be released. Until these ranges are garbage collected, we are still paying the cost to maintain them during DOM mutations, even though the highlight is no longer registered. Instead, we could drop the ranges when the Highlight is deregistered. However, how would we re-establish the old range positions if the Highlight is subsequently re-registered?

With regards to the performance tradeoff, we are starting prototyping in Chromium, so we can run some perf tests to compare the static range painting overhead vs. the added maintenance cost during DOM mutations if we used live ranges internally. Hopefully that can give us a better idea on which way to go. If other implementors could do the same, that would be great too.

@smaug----
Copy link

Highlight API is effectively a form of Selection API, with support for multiple Ranges, that is what Gecko has had basically forever.
It is used for example for spellchecker and IME and find, in addition to normal selection (which also supports multiple ranges - try with ctrl+mouse). So there is some experience on using Ranges.
Doing boundary point comparisons during paint time does cause some significant performance issues currently (with WebRender because WebRender code paths haven't been optimized yet), but I'd prefer to remove all such checks from paint time.

@frivoal
Copy link
Collaborator Author

frivoal commented Mar 25, 2021

Why does this not use live ranges internally?

I think if that's the expected implementation, we should probably drop static ranges from this API entirely. The intended use case for using static ranges rather than live ones is when the author intends to run their own range invalidation logic upon DOM changes, and would like to avoid paying the cost of having the UA update the ranges first, just before they trash all these needlessly updated ranges and insert their own new ones.

If the internal implementation uses live ranges anyway, then this is pointless, and we might as well restrict authors to describing their highlights with live ranges in the first place.

@dandclark
Copy link
Contributor

@ffiori ran some experiments and confirmed that at least in Chromium, adding a large number of live Ranges to a document increases the cost of DOM node moves due to the cost of keeping the Ranges up-to-date. On the other hand, with an implementation that caches StaticRange validity there is no scenario where StaticRange has significantly worse performance for Highlights.

It would be interesting to know if WebKit has done any similar experiments on their implementation.

There are realistic scenarios where a site might want to use a large number of Highlights, like a site with a lot of text implementing its own find-on-page. Allowing the use of StaticRange, without creating StaticRanges internally, could be helpful for performance in such scenarios.

I also agree with @frivoal's reply above that if we're always going to use live Ranges internally anyway, then I don't see any reason to include StaticRange in the API.

@css-meeting-bot
Copy link
Member

The CSS Working Group just discussed Invalidation of Static Ranges.

The full IRC log of that discussion <fantasai> Topic: Invalidation of Static Ranges
<fantasai> github: https://github.com//issues/4597#issuecomment-892202307
<fantasai> dandclark: Static ranges can get into weird states, like if I remove one boundary from the DOM
<fantasai> dandclark: Sanket came up with criteria for deciding if a range is valid or invalid
<fantasai> dandclark: but the qustion is, if user adds range to a highlight should we use a live range, or use static range internally
<fantasai> dandclark: reason to use live range, don't have to check validity in the same way because they're kept up to date
<fantasai> dandclark: We discussed awhile, and decided not to back static ranges with a live range
<fantasai> dandclark: first, perf reasons -- live ranges, they can slow down DOM mutations because live ranges have to be notified every time there's a DOM mutation
<fantasai> dandclark: static range doesn't have that issue
<fantasai> dandclark: we did some research showing that this perf issue is observable
<fantasai> dandclark: and we also found that with caching, it's possible to cache static range validdity
<fantasai> dandclark: which reduces perf cost during painting
<fantasai> dandclark: but if they were backed by live ranges internally, that wouldn't hold
<fantasai> dandclark: because DOM mutations would have to update, so lose perf benefits
<fantasai> dandclark: Sanket also raised some issues with using live ranges internally
<fantasai> dandclark: Authors don't have direct references to those live ranges
<fantasai> dandclark: so difficult, how would they remove or modify a highlight that they added?
<fantasai> dandclark: even if we added a map, can become out of sync with tree
<fantasai> dandclark: and not clear when safe to drop live range, because author might de-regeister and re-register a static range
<fantasai> dandclark: so confusing developer-confusing scenarios there, tricky answers for spec
<fantasai> dandclark: so proposal is to allow static ranges and live ranges to be added to highlights
<fantasai> dandclark: and there's no live range backing for static ranges
<fantasai> astearns: My first question was, can developer use either
<fantasai> astearns: is it clear to devs that live range could have perf implications?
<fantasai> dandclark: It's in line with current use of live ranges
<fantasai> dandclark: Spec issue states that static ranges solve this perf
<fantasai> dandclark: Use of live ranges in genera, here or otherwise, has these perf costs
<fantasai> dandclark: I think it's a documentation and devrel issue
<fantasai> dandclark: of when live ranges vs static ranges are appropriate
<fantasai> sanketj: HTML doc also mentions this (?)
<fantasai> sanketj: It has been documented
<fantasai> sanketj: Static ranges are fairly new concept, previously only had option of live ranges
<fantasai> astearns: OK, out of time
<fantasai> astearns: my understanding is that with this impl experience, you would like to close issue no change?
<fantasai> dandclark: yes
<fantasai> astearns: OK, we'll expect to close no change, and will resolve next week
<fantasai> Meeting closed.

@css-meeting-bot
Copy link
Member

The CSS Working Group just discussed highlight API, and agreed to the following:

  • RESOLVED: When an author uses a static range for highlight API, it should actually be a static range internally, not backed by a hidden live range.
  • RESOLVED: Publish a new Highlight API WD when changes are made
The full IRC log of that discussion <TabAtkins> Topic: highlight API
<Rossen_> github: https://github.com//issues/4597
<TabAtkins> github: https://github.com//issues/4597
<TabAtkins> dandclark: This was originally what the criteria was to invalidate a static range during DOM mutations
<TabAtkins> dandclark: discussion evolved into use of StaticRange more generally
<TabAtkins> dandclark: So when an author adds a staticrange to a highlight, should we internally back it with a live range? Or just let it stay static
<TabAtkins> dandclark: I want to advocate we use statics internally
<TabAtkins> dandclark: One for perf; static range is better for perf in some cases when there's a lot of DOM mutations
<TabAtkins> dandclark: Team has done some benchmarks and shown the perf diffs aren't theoretical
<TabAtkins> dandclark: And issues with invalidation of static ragnes during paint time can be addressed with invalidation caches
<chrishtr> q+
<TabAtkins> dandclark: So if they're backed with live ranges internally, we pretty much lose all the benefits and there's no reason to support it
<TabAtkins> dandclark: Second is some issues sanketj pointed out
<TabAtkins> dandclark: If a static is backed internally with a live range, the author doesn't ahve a ref to it, so how do they manipulate this
<TabAtkins> dandclark: Like if a highlihgt is unregistered, should we maintain it? Highlight can be reregistered, unsure what the lifetime is
<TabAtkins> dandclark: So I'd like to get agreement that the internal range is indeed static
<florian> q?
<Rossen_> ack chrishtr
<TabAtkins> chrishtr: So your proposal is that the static range is not live internally, and as a result, when the highlight api uses a static range it's ignored if it's invalid
<TabAtkins> chrishtr: [missed]
<TabAtkins> dandclark: There are some cases like "start before end" that you might have to treewalk to determine; it's not constant-time to determine
<TabAtkins> dandclark: You can largely cache validity status and be fine; if there's no dom mutations you're guaranteed right
<florian> q+
<TabAtkins> chrishtr: But if the dom mutation was non-trivial, the UA might have to treewalk in the worst case?
<TabAtkins> dandclark: Yeah
<TabAtkins> chrishtr: And you couldn't avoid that situation?
<TabAtkins> dandclark: I think UAs could in theory do some optimizations, but the simplest way is to just cache the data; you won't have mutations between every paint in practice
<TabAtkins> chrishtr: So the script usually will make a static range and then immediately add it to the doc
<TabAtkins> dandclark: yeah
<GameMaker> q+
<Rossen_> ack florian
<TabAtkins> florian: agree. I think this is the right thing to do; it's why we proposed having static ranges in the first place
<TabAtkins> florian: Live ranges can be costly, and it might not even be a useful cost
<TabAtkins> florian: If the DOM changes and the UA updates the range, whether or not the updated range is what the author actually wanted isn't clear
<TabAtkins> florian: So paying the cost of updating the live range, only to immediately discard it and regen the range, is silly.
<sanketj> +1 to the proposed resolution
<chrishtr> +1
<GameMaker> +1
<TabAtkins> Rossen_: Any furhter comments?
<Rossen_> q?
<Rossen_> ack GameMaker
<TabAtkins> GameMaker: Agreeement; webkit has the same perf issues that Chrome does, so +1
<TabAtkins> RESOLVED: When an author uses a static range for highlight API, it should actually be a static range internally, not backed by a hidden live range.
<TabAtkins> florian: Do we think we need a spec change to reflect this? Or is the spec already sufficiently implying this?
<TabAtkins> dandclark: I think the spec is clear as-is; we'd only need a change if we decided the other way
<TabAtkins> dandclark: There's another issues about invalidation we coudl discuss; it was the original issue.
<TabAtkins> sanketj: I think it would be good to have a note about using static ranges internally
<TabAtkins> sanketj: So for interop we should have an explanation
<TabAtkins> florian: We say already that when you use a static range they shouldn't be updated, but we can work over the note
<TabAtkins> dandclark: I can look over it
<Rossen_> ack fantasai
<Zakim> fantasai, you wanted to ask about publication
<TabAtkins> fantasai: last pub of this draft was Dec 2020; given changes, we should get an update onto TR
<TabAtkins> fantasai: ED claims there's only been minor changes since last WD, that probably isn't true
<TabAtkins> florian: agree. i'll start making sure the changes section is up to date
<TabAtkins> Rossen_: Should we get a resolution to republish the WD when it's ready?
<TabAtkins> RESOLVED: Publish a new Highlight API WD when changes are made

@frivoal
Copy link
Collaborator Author

frivoal commented Aug 26, 2021

Now that we've resolved on the meta-question of whether to use static ranges internally at all, let's go back to the original question: is the invalidation logic given in the spec reasonable? Recent discussions seem to indicate that it is good and we should accept it, but I'd like an actual resolution so that we can close this issue.

@dandclark
Copy link
Contributor

A good approach might be to try to land the steps in whatwg/dom#947 into the DOM spec (I can work on that), and update the Highlight API to reference that. It could be useful to have those steps in DOM where they can be referenced by other potential users of StaticRange.

One difference between those steps and the current Highlight API spec text is that those steps consider a StaticRange to be invalid if the start boundary is after the position of the end boundary. I think that's a good addition.

@smaug----
Copy link

It is unclear to me from the resolution how using Static Ranges works if one modifies DOM. If one has 2 sibling nodes, and a StaticRange selecting them both, and then one add a new node between those siblings, StaticRange would select the first two ones, but not the last node, yet it is the last node which the user of the API wanted to be selected. Rather odd API behavior.

@dandclark
Copy link
Contributor

It is unclear to me from the resolution how using Static Ranges works if one modifies DOM. If one has 2 sibling nodes, and a StaticRange selecting them both, and then one add a new node between those siblings, StaticRange would select the first two ones, but not the last node, yet it is the last node which the user of the API wanted to be selected. Rather odd API behavior.

For StaticRange, the developer has the responsibility to update the ranges when DOM mutations occur. So if the developer fails to do this, they'll observe odd behavior like in the case you pointed out. There is a deliberate tradeoff here between between ease-of-use and performance for live range vs static range.

@smaug----
Copy link

smaug---- commented Aug 31, 2021

Then we need to have performance tests which check how much overhead it is to observe mutations and update highlights based on them. Simple validity check isn't enough on the caller side, but one needs to check if the end points of the StaticRange are pointing to were they should, and that is costly.

@dandclark
Copy link
Contributor

Then we need to have performance tests which check how much overhead it is to observe mutations and update highlights based on them. Simple validity check isn't enough on the caller side, but one needs to check if the end points of the StaticRange are pointing to were they should, and that is costly.

We've updated the test document with a couple of tests along these lines (Tests 6-7, where 6 is the more realistic of the two and 7 is showing a worst-case scenario for StaticRange). When considering the cost of author code to maintain StaticRanges, StaticRange will still often be much better than live range because the StaticRange updates can be batched at the end of a series of DOM mutations, where live range updates always happen during every individual change in a series of DOM mutations.

It's also important to consider that in many scenarios there would also be a cost for author code to update live range positions, because the live range repositioning done by the platform will probably not always be the exact behavior that authors need.

For example, consider a live range highlight marking a misspelled word for a custom spell-checker. If the text of the word grows or shrinks, the author will probably need to have some code run to keep the range repositioned around the word (or to remove the range if it’s now spelled correctly). It’s unlikely that the browser engine would update the offsets of the live range in exactly the way required by the author, so in practice there is still cost for running author code to check and update live ranges.

@css-meeting-bot
Copy link
Member

The CSS Working Group just discussed Invalidation of Static Ranges, and agreed to the following:

  • RESOLVED: Reference DOM's definition of invalidity for static ranges for highlight API
  • RESOLVED: Republish highlight API
The full IRC log of that discussion <fantasai> Topic: Invalidation of Static Ranges
<fantasai> github: https://github.com//issues/4597
<fantasai> dandclark: Discussed structures for representing highlights
<fantasai> dandclark: author could choose live or static ranges, with different perf considerations each
<fantasai> dandclark: issue was originally open to discuss, given static ranges can become stale
<dandclark> https://dom.spec.whatwg.org/#staticrange-valid
<fantasai> dandclark: what is an invalid static range that we should not try to paint?
<florian> q+
<fantasai> dandclark: Do we want to have highlight API spec point to these definitions in DOM spec, or do we want a different definition
<fantasai> dandclark: ...
<fantasai> dandclark: [reads criteria]
<fantasai> florian: I think it's likely a good idea to point to DOM
<fantasai> florian: reason this is new is that it's first time in the platform that the static range is created by user and passed to UA
<fantasai> florian: other uses the UA passes to user
<fantasai> florian: The criteria listed are reasonable, but ...
<fantasai> florian: but is sensitive to the length of the node
<Rossen_> ack florian
<fantasai> sanketj: Should just follow DOM definition
<fantasai> fantasai: what about the length?
<Rossen_> ack fantasai
<fantasai> fantasai: what is the length of a node?
<fantasai> sanketj: for text it's number of characters
<cbiesinger> is that code points? code units?
<fantasai> florian: so if you have an offset bigger than length of node, one behavior is to treat whole thing as invalid
<fantasai> florian: other behavior that we had originally was to clamp to within the length
<fantasai> florian: going with DOM's definition seems reasonable
<fantasai> Rossen_: Any objections to using DOM's definition of invalid for static ranges?
<fantasai> RESOLVED: Reference DOM's definition of invalidity for static ranges for highlight API
<sanketj> @cbiesinger, yes, code units
<fantasai> fantasai: Do we need to republish?
<fantasai> fantasai: Last publication with 2020
<fantasai> RESOLVED: Republish highlight API

@frivoal frivoal added Closed Accepted by CSSWG Resolution Tested Memory aid - issue has WPT tests labels Sep 8, 2021
@frivoal frivoal closed this as completed Sep 8, 2021
sanketj pushed a commit that referenced this issue Dec 1, 2021
…anges internally for static ranges #4597 (#6545)

* Add note clarifying that user agents must not use live ranges internally for static ranges

* Use spec reference syntax

* Pull the note into the normative paragraph.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants