Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GC for cached attr-associated elements? #8545

Open
alice opened this issue Nov 24, 2022 · 26 comments
Open

GC for cached attr-associated elements? #8545

alice opened this issue Nov 24, 2022 · 26 comments
Labels
topic: reflect For issues with reflected IDL attributes and friends.

Comments

@alice
Copy link
Contributor

alice commented Nov 24, 2022

This came out of the discussion on #8442.

@annevk noticed that while the explicitly set attr-elements for an element are lists of weak references, the cached attr-associated elements are not explicitly specced as a weak reference.

Does the spec need to clarify this?

@annevk
Copy link
Member

annevk commented Nov 25, 2022

I was thinking you could have a weak reference to it, but that doesn't work if you set an expando on it and then after GC observe the expando is gone.

Currently the getter has this step:

If the contents of elements is equal to the contents of this's cached attr-associated elements, then return this's cached attr-associated elements.

I think a problem here is that that even if someone mutates the node tree, an implementation cannot toss away "cached attr-associated elements", as further mutations to that node tree could make "attr-associated elements" return the same list. And you really only know whether that is the case the moment the getter is invoked again.

So as long as the element is alive you leak cached attr-associated elements and everything therein, even if everything therein was eligible for collection.

@smaug---- @rniwa @domenic thoughts?

@annevk
Copy link
Member

annevk commented Feb 20, 2023

@mrego how did you implement this in Chromium without a leak?

@annevk annevk added the agenda+ To be discussed at a triage meeting label Feb 20, 2023
@annevk
Copy link
Member

annevk commented Feb 20, 2023

I cannot attend triage until Feb 23 it seems. If folks could give this a look before then I'd appreciate it, but otherwise I can explain then.

@mrego
Copy link
Member

mrego commented Feb 20, 2023

@mrego how did you implement this in Chromium without a leak?

I actually implemented this in WebKit at https://commits.webkit.org/251237@main

Chromium doesn't have an implementation for this yet.

@annevk
Copy link
Member

annevk commented Feb 20, 2023

It looks like that just accepts we leak this for the lifetime of the element, right? Maybe that's okay then.

annevk added a commit that referenced this issue Feb 22, 2023
Other changes:

* Remove reflection of unrestricted double as it is buggy and unused.
* The DOMString getter steps did not account for a missing attribute.
* The native accessibility semantics map was renamed to the internal content attribute map as it's now a more general reflection concept.

Corresponding ARIA PR: w3c/aria#1876.

Fixes #8442.

Follow-up:

* w3c/core-aam#152
* w3c/aria#1110
* #3238
* #8544
* #8545
* #8926
* #8927
* #8928
* #8930
@past past removed the agenda+ To be discussed at a triage meeting label Feb 23, 2023
@annevk
Copy link
Member

annevk commented Feb 24, 2023

Closing this as nobody on the triage call seemed interested in this and there are implementations as well.

@annevk annevk closed this as completed Feb 24, 2023
@smaug----
Copy link
Collaborator

I think this needs to be fixed. Leaking is not ok.

@smaug---- smaug---- reopened this Feb 27, 2023
@domenic
Copy link
Member

domenic commented Feb 27, 2023

To expand a bit on the discussion in the triage call:

If you do

const x = {};
x.y = [z];

then z, and the associated array, "leak" for the lifetime of x. This is fine and not generally what we consider a "memory leak".

The situation is the same with cached attr-associated elements, so nobody on the call thought this was particularly problematic.

@smaug----
Copy link
Collaborator

But here that leak happens in a rather hidden way. You don't know that there is [z] associated to your element and you don't have a way to clear that easily, no?

@domenic
Copy link
Member

domenic commented Feb 27, 2023

You do and you do. You know because you assigned it there, and you can clear it by setting the value of the IDL attribute to the empty array or the content attribute to the empty string.

@smaug----
Copy link
Collaborator

And we expect web devs to clear it? Web sites leak so often that I'd rather avoid adding yet more weird hazards for web devs.

@annevk
Copy link
Member

annevk commented Feb 27, 2023

I guess we're expecting that the encompassing subtree is removed in a typical setup.

@nolanlawson
Copy link

nolanlawson commented Feb 27, 2023

AIUI it's possible to leak in this scenario:

element.setAttribute('aria-labelledby', 'valid-id')
element.ariaLabelledByElements // invoking the getter causes the cached attr-associated elements to be set

From a web dev's perspective, it is a bit surprising that element now has a strong reference to the element with id="valid-id". If some code elsewhere calls document.getElementByid('valid-id').remove(), then the element would continue to leak even when detached from the DOM.

@annevk annevk added the topic: reflect For issues with reflected IDL attributes and friends. label Mar 3, 2023
@annevk
Copy link
Member

annevk commented Mar 17, 2023

Based on the discussion with @alice in #8932 I thought of an idea. We could require connectedness before we return anything and a have a disconnected listener that clears the cached list.

It's already the case that when nodes are disconnected the relationship might not be exposed to JS if they are in separate trees. This would make that the case even if they are in the same tree, but it gives user agents more opportunity to reclaim memory.

Thoughts?

cc @nt1m

@nt1m
Copy link
Member

nt1m commented Mar 18, 2023

@mrego did the implementation in WebKit of attr-associated elements. No specific thoughts from me.

@annevk
Copy link
Member

annevk commented Mar 18, 2023

@smaug---- pointed out that if the elements you point to get disconnected you still leak, which could only be solved by the user agent adding removal listeners to all elements involved. The setup gets pretty hairy at that point though so I don't think we want to go there.

@alice
Copy link
Contributor Author

alice commented Mar 22, 2023

From a web dev's perspective, it is a bit surprising that element now has a strong reference to the element with id="valid-id".

Urgh, indeed, I hadn't put that together.

Is the caching invariant really worth all of these counter-intuitive side effects?

@annevk
Copy link
Member

annevk commented Mar 22, 2023

I don't know caching invariant refers to. The third bullet point of https://www.w3.org/TR/design-principles/#attributes-like-data perhaps? We shouldn't break that.

@alice
Copy link
Contributor Author

alice commented Mar 22, 2023

Yes, that. I'm asking whether sticking to that principle in this instance is worth the trade-off with all the negative effects it has. Perhaps you could explain why it is.

@annevk
Copy link
Member

annevk commented Mar 22, 2023

Well, I don't think we have clearly demonstrated the negative effects as-of-yet. Nor have we fully explored possible solutions to those issues.

Changing API invariants such as that one however makes the web platform brittle and gives web developers a bad time. We should have used a different API then.

@alice
Copy link
Contributor Author

alice commented Mar 22, 2023

Any option to maintain that invariant will inherently require strong references to be held, at least under certain circumstances, to any element referred to. (That might have been obvious to everyone else, but it wasn't to me, so I figure it might be worth saying.) So, I strongly (heh) doubt that we're going to find a solution which neither causes leaks in the sense referred to here, nor requires a lot of bookkeeping in order to keep back-references from referred-to elements to the element hosting the attribute (which I think has been ruled out multiple times).

So, I think realistically our options are limited to:

  • Live with leaks (possibly with some mitigations like GC for cached attr-associated elements? #8545 (comment)), or
  • Scrap the feature as-designed, replace it with something more method-shaped, or
  • Modify the design such that there's a 1:1 association between attrElements and explicitly set attr-elements, so that we never need to compute and return the attr-associated elements to the getter, and explicitly set attr-elements become easily observable strong references, or
  • Drop the caching invariant (in lieu of a different pithy name I'm going to keep calling it that now that we're all on the same page about what it means).

I was more comfortable with the first option until Nolan pointed out what it means for elements which are associated by ID only but retrieved via the getter. I'm pretty sure I could still live with it, though, per Domenic's reasoning that in practice leaks will be short-lived the vast majority of the time. (Is that what you're referring to about demonstrating negative effects?)

I'm not wild about the second, obviously, since I've invested a lot of time into this version of the API, and I think it would be less friendly for developers. I'd be reluctantly prepared to explore it if there's literally no other option everyone can agree on.

The third option means, of course, that elements inside Shadow DOM are fair game for the getter, plus we would lose the closer parallel between attrElements and attr-associated elements which are potentially exposed to AT APIs.

As far as the fourth option goes - I think it's already bending the spirit of the rule that you can get el.attrElements on two separate occasions without setting it and still get back two different results. If we want to stick to the absolute letter of the rule, we could cache the result very very briefly, for example until there is any tree mutation at all?

@annevk
Copy link
Member

annevk commented Mar 22, 2023

That's not bending the rules. Getters returning different things because of state changes is perfectly acceptable. Getters returning different things purely because of multiple invocations of the getter is what that rule is concerned with.

@alice
Copy link
Contributor Author

alice commented Mar 29, 2023

I spent some time poring over the original PR and saw that @domenic had suggested ObservableArray might work, but at the time they hadn't been implemented anywhere.

Thanks to the passage of time, and CSSOM, they seem to now be implemented; should we revisit that option?

@domenic
Copy link
Member

domenic commented Mar 29, 2023

I think that might work! There's some complexity there in getting the exact right semantics---i.e., I think it would be a tricky spec to write, and you'd need to spend extra time double-checking the resulting behavior. But an ObservableArray can maintain its identity over time while changing its contents, so I think it would allow you to thread the needle here.

@nolanlawson
Copy link

Just to loop back on this: I don't think the memory leak issue is a dealbreaker per se. To truly leak memory in my example, the element would need to remain omnipresent in the DOM (whereas what it references with ariaLabelledByElements would not).

In practice, I imagine that the two elements would be added to and removed from the DOM together, so the GC of both would happen normally. (E.g. an <input> and its <label>.)

And even in the worst case, it is possible for the author to simply clear element.ariaLabelledByElements when removing its target from the DOM.

@annevk
Copy link
Member

annevk commented Mar 22, 2024

cc @petervanderbeken

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: reflect For issues with reflected IDL attributes and friends.
Development

No branches or pull requests

8 participants