Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upTemplate objects are eternal when put into WeakMap #840
Comments
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
allenwb
Mar 7, 2017
Member
template objects are canonicalize so that template handlers can use them as cache keys (possibly weak) to memoize any preprocessing of the template object.
I assume the intention is to allow this registry to be implemented internally as a weak map: if the template object itself is GCed, then the identity of the canonical object becomes unobservable, and thus can be removed from the template registry.
Not necessarily. The original idea was that template objects corresponded to a specific "site" in the source code and that any evaluation (for the duration of its realm) of the template literal at that site would always produce the same template objct. That's why the metavariables that holds the result of calls to GetTemplateObject are named siteObj. GetTemplateObject and the registry it uses were just a specification mechanism for saying that each evaluation of template literal at particular source code site must yield the same template site object.
At some point we (presumably @erights and me, but I don't recall) must have realized that because the "site" template objects were immutable, the same object could be used by different template literal sites within the same realm (same realm because they reference %ArrayPrototype%) and so the registry doesn't include the parse node ("the site") as part of the key. Unfortunately, that in combination with the ESs ability to dynamically load code implies that once a template object is created in a realm it must exist for the remaining lifetime of the realm. I'm not sure whether that was in fact the original intent when we were thinking of template objects as being site specific.
|
template objects are canonicalize so that template handlers can use them as cache keys (possibly weak) to memoize any preprocessing of the template object.
Not necessarily. The original idea was that template objects corresponded to a specific "site" in the source code and that any evaluation (for the duration of its realm) of the template literal at that site would always produce the same template objct. That's why the metavariables that holds the result of calls to GetTemplateObject are named siteObj. GetTemplateObject and the registry it uses were just a specification mechanism for saying that each evaluation of template literal at particular source code site must yield the same template site object. At some point we (presumably @erights and me, but I don't recall) must have realized that because the "site" template objects were immutable, the same object could be used by different template literal sites within the same realm (same realm because they reference %ArrayPrototype%) and so the registry doesn't include the parse node ("the site") as part of the key. Unfortunately, that in combination with the ESs ability to dynamically load code implies that once a template object is created in a realm it must exist for the remaining lifetime of the realm. I'm not sure whether that was in fact the original intent when we were thinking of template objects as being site specific. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
erights
Mar 7, 2017
Yes. Given evaluators, there were ambiguities regarding the definition of "site". Rather than resolve these, we decided that all identical template object contents within the same realm be made identical through a realm-specific registry. Had the identity been specific to "site", the identity needs to last as long as the possibility that the "same" "site" be able to be mentioned again, for whatever the concept of "site" is. This emphasizes why the "site" concept is tricky to get right.
The realm is a good replacement for site. Since the template object inherits from some Array.prototype intrinsic, clearly nothing coarser-grained than realm is possible.
I'm sorry if the issue of what-is-weak got confused in the process. I may indeed have been confused about this during the transition. Here's how I see it now:
The per-realm registry must indeed hold onto template objects strongly. It is a strong interning table. The important weak maps we were always trying to enable, and still are, are the weak maps that the template tags can choose to use in order to memoize any preprocessing work it does on a particular template object, so it can amortize that work over all the substitution values that template object is reused with.
My own https://github.com/erights/quasiParserGenerator is a good example: The bnf is part of the literal template text. The action rules are represented by first class functions as substitution values. From one evaluation of a bnf`...` expression to the next, the literal bnf grammar is always the same but the action rules can differ. All the work generating a parser from the bnf text is reused even though the resulting parser may be used with different action rules.
For present purposes, a strong per-realm interning registry does enable working weakmaps per tag for memoizing any per-template preprocessing, for amortizing effort per template expression evaluation of the "same" expression, for a workable definition of "same".
However, the quasi-parser generator also demonstrates that per-realm interning is too coarse for a completely different reason I'll explain in another post. That reason does not affect the conclusion that the per-realm registry should be strong.
erights
commented
Mar 7, 2017
•
|
Yes. Given evaluators, there were ambiguities regarding the definition of "site". Rather than resolve these, we decided that all identical template object contents within the same realm be made identical through a realm-specific registry. Had the identity been specific to "site", the identity needs to last as long as the possibility that the "same" "site" be able to be mentioned again, for whatever the concept of "site" is. This emphasizes why the "site" concept is tricky to get right. The realm is a good replacement for site. Since the template object inherits from some I'm sorry if the issue of what-is-weak got confused in the process. I may indeed have been confused about this during the transition. Here's how I see it now: The per-realm registry must indeed hold onto template objects strongly. It is a strong interning table. The important weak maps we were always trying to enable, and still are, are the weak maps that the template tags can choose to use in order to memoize any preprocessing work it does on a particular template object, so it can amortize that work over all the substitution values that template object is reused with. My own https://github.com/erights/quasiParserGenerator is a good example: The bnf is part of the literal template text. The action rules are represented by first class functions as substitution values. From one evaluation of a For present purposes, a strong per-realm interning registry does enable working weakmaps per tag for memoizing any per-template preprocessing, for amortizing effort per template expression evaluation of the "same" expression, for a workable definition of "same". However, the quasi-parser generator also demonstrates that per-realm interning is too coarse for a completely different reason I'll explain in another post. That reason does not affect the conclusion that the per-realm registry should be strong. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
ajklein
Mar 7, 2017
Contributor
@caitp wrote the v8 implementation; a cursory examination suggests that we hold onto the objects strongly (we use a JS Map as the storage).
|
@caitp wrote the v8 implementation; a cursory examination suggests that we hold onto the objects strongly (we use a JS Map as the storage). |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
erights
Mar 7, 2017
Why is a per-realm interning based on only the currently specified contents of a template object too coarse?
Image that a template tag function is a quasi-parser for an interesting dsl. The https://github.com/erights/quasiParserGenerator is a fine example, where the dsl is bnf[1]. Now, let's say that a given bnf`...` expression itself has a syntax error -- it is not a syntactically valid bnf specification. The bnf tag must certainly reject it, and do so with an error message for helping the programmer to find and fix the syntax error. Of all the parts of a syntax-error-diagnostic, by far the most useful part is the source location.
Given the current template literal specification, it is IMPOSSIBLE for a template tag function to figure out what source location to tell the user about.
Instead, we need, as another immutable part of the template object, a source map mapping from locations in the template back to source locations. But what is a source location? Both the de-facto source map draft standard and the error stacks proposal traffics in source locations, but neither tries to pin down what they are. Whatever they are, this means that the per-registry strong interning table should take these into account as part of the interning. This will make finer grain distinctions among templates in the same realm than can be made within the current spec.
Attn @disnet
[1] Actually, parsing expression grammars rather than bnf. The difference is irrelevant to the present point.
erights
commented
Mar 7, 2017
•
|
Why is a per-realm interning based on only the currently specified contents of a template object too coarse? Image that a template tag function is a quasi-parser for an interesting dsl. The https://github.com/erights/quasiParserGenerator is a fine example, where the dsl is bnf[1]. Now, let's say that a given Given the current template literal specification, it is IMPOSSIBLE for a template tag function to figure out what source location to tell the user about. Instead, we need, as another immutable part of the template object, a source map mapping from locations in the template back to source locations. But what is a source location? Both the de-facto source map draft standard and the error stacks proposal traffics in source locations, but neither tries to pin down what they are. Whatever they are, this means that the per-registry strong interning table should take these into account as part of the interning. This will make finer grain distinctions among templates in the same realm than can be made within the current spec. Attn @disnet [1] Actually, parsing expression grammars rather than bnf. The difference is irrelevant to the present point. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
syg
Mar 7, 2017
Member
I'm not particularly please about speccing something that in practice un-GC-able. Nevertheless, thanks for verifying that the registry must be strong.
|
I'm not particularly please about speccing something that in practice un-GC-able. Nevertheless, thanks for verifying that the registry must be strong. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
erights
Mar 7, 2017
A realm as a whole is gc-able. But yes, so long as anything retains the realm, this is not gc-able.
erights
commented
Mar 7, 2017
|
A realm as a whole is gc-able. But yes, so long as anything retains the realm, this is not gc-able. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
WebReflection
Mar 8, 2017
Chiming in, since I've based part of my recent work trusting that the following would be true:
(a=>a)`a${1}b${2}` === (a=>a)`a${3}b${4}`I also think template literals are widely used as one-off so, beside those cases where these are held, it might make more sense to have them weakly referenced?
I understand that "one-off" situation could be repeated in the future indirectly through runtime loaded code, but is it really worth to keep all of them always in memory, considering Mobile devices and IoT implications?
Just speaking as a developer, also grateful Firefox eventually decided to face this issue.
Thank you all!
WebReflection
commented
Mar 8, 2017
|
Chiming in, since I've based part of my recent work trusting that the following would be true: (a=>a)`a${1}b${2}` === (a=>a)`a${3}b${4}`I also think template literals are widely used as one-off so, beside those cases where these are held, it might make more sense to have them weakly referenced? I understand that "one-off" situation could be repeated in the future indirectly through runtime loaded code, but is it really worth to keep all of them always in memory, considering Mobile devices and IoT implications? Just speaking as a developer, also grateful Firefox eventually decided to face this issue. Thank you all! |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
littledan
Apr 12, 2017
Member
To summarize, this thread raises a number of significant problems:
- (from @syg) Template strings may live forever due to putting them in a WeakMap
- (from @ajklein) Because of this logic, it's rather difficult to implement any GC at all of template objects
- (from @erights) The template string facility cannot be extended to expose anything about the site, as it's site-independent
At this point, it sounds like it could be reasonable to return to the original "site"-based definition. There's a chance that this could lead to compat issues, but these are reduced a bit if Firefox hasn't shipped site-independent caching yet.
Spec-wise, would it be acceptable to just put the Parse Node in the Realm's [[TemplateMap]] in place of the [[Strings]] field? We could compare them with language like, "If they are the same Parse Node...".
@WebReflection What are you using that invariant for? From the above discussion, it doesn't sound like it was core to the design of template strings.
As far as Mobile/IoT, ECMAScript already forces implementations to keep the much of the source code in memory (due to Function.prototype.toString); do you expect cached template literals to be a big proportion of that? Of course it's possible to create many more template literals than functions using eval, but such a specialized case could probably reduce its memory usage with manual usage of a Map. This is just the same as if the user is creating many Arrays that all have the same contents.
|
To summarize, this thread raises a number of significant problems:
At this point, it sounds like it could be reasonable to return to the original "site"-based definition. There's a chance that this could lead to compat issues, but these are reduced a bit if Firefox hasn't shipped site-independent caching yet. Spec-wise, would it be acceptable to just put the Parse Node in the Realm's [[TemplateMap]] in place of the [[Strings]] field? We could compare them with language like, "If they are the same Parse Node...". @WebReflection What are you using that invariant for? From the above discussion, it doesn't sound like it was core to the design of template strings. As far as Mobile/IoT, ECMAScript already forces implementations to keep the much of the source code in memory (due to Function.prototype.toString); do you expect cached template literals to be a big proportion of that? Of course it's possible to create many more template literals than functions using |
added a commit
to littledan/ecma262
that referenced
this issue
Apr 12, 2017
added a commit
to littledan/test262
that referenced
this issue
Apr 12, 2017
This was referenced Apr 12, 2017
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
littledan
Apr 12, 2017
Member
I wrote up a strawman PR to go back to a site-based definition at #890 . If anyone has advice about what's the proper way to write this sort of thing, that would be appreciated; I don't know whether it's OK to just put a Parse Node inside a List of Records (but I don't see why not) and talk about whether it's "the same one", though I don't see why not.
@syg Are these the semantics that Firefox is currently shipping? The new tests pass on my build of SpiderMonkey, but I don't know if there are some additional subtleties.
|
I wrote up a strawman PR to go back to a site-based definition at #890 . If anyone has advice about what's the proper way to write this sort of thing, that would be appreciated; I don't know whether it's OK to just put a Parse Node inside a List of Records (but I don't see why not) and talk about whether it's "the same one", though I don't see why not. @syg Are these the semantics that Firefox is currently shipping? The new tests pass on my build of SpiderMonkey, but I don't know if there are some additional subtleties. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
anba
Apr 12, 2017
Contributor
If anyone has advice about what's the proper way to write this sort of thing, that would be appreciated; I don't know whether it's OK to just put a Parse Node inside a List of Records (but I don't see why not) and talk about whether it's "the same one", though I don't see why not.
ES6 rev28 used for GetTemplateObject (called GetTemplateCallSite back then):
- If a call site object for the source code corresponding to templateLiteral has already been created
(see step 12 below) by a previous call to this abstract operation, then
- Return that call site object.
- ...
- Remember an association between the source code corresponding to templateLiteral and siteObj such that siteObj can be retrieve in subsequent calls to this abstract operation.
- Return siteObj.
ES6 rev28 used for GetTemplateObject (called GetTemplateCallSite back then):
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
syg
Apr 12, 2017
Member
@littledan SM Nightly is supposed to have per-content caching. Which version of the repo are you building the shell from?
|
@littledan SM Nightly is supposed to have per-content caching. Which version of the repo are you building the shell from? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
@syg I used a checkout from two months ago. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
Jamesernator
Jul 10, 2017
I feel like this has some bad implications if weak references happens, not only would your own weak reference be essentially useless if another bit of code happened to have a strong reference to the same template object (as you can't even control your own object), but you can even check if anything else has a reference to that template object.
Jamesernator
commented
Jul 10, 2017
•
|
I feel like this has some bad implications if weak references happens, not only would your own weak reference be essentially useless if another bit of code happened to have a strong reference to the same template object (as you can't even control your own object), but you can even check if anything else has a reference to that template object. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
erights
Jul 11, 2017
Attn @dtribble
not only would your own weak reference be essentially useless if another bit of code happened to have a strong reference to the same template object (as you can't even control your own object),
I don't understand this objection. In general, if you weakly point at something that is still strongly reachable from roots, it sticks around. This might seem to be the only case (and therefore a new case) where this sharing can be brought about among objects that share only code, but it is not. This sharing only happens within a realm. Objects sharing code in a realm also share intrinsics, and from the intrinsics can navigate to other objects. These non-intrinsic objects may be differentially retained or gc'ed in the same manner.
but you can even check if anything else has a reference to that template object.
This is a known side channel not specific to template objects. The proposal calls attention to this side channel and puts the makeWeakRef function (or WeakRef constructor) on the System object (or the import.meta object) for exactly this reason.
erights
commented
Jul 11, 2017
|
Attn @dtribble
I don't understand this objection. In general, if you weakly point at something that is still strongly reachable from roots, it sticks around. This might seem to be the only case (and therefore a new case) where this sharing can be brought about among objects that share only code, but it is not. This sharing only happens within a realm. Objects sharing code in a realm also share intrinsics, and from the intrinsics can navigate to other objects. These non-intrinsic objects may be differentially retained or gc'ed in the same manner.
This is a known side channel not specific to template objects. The proposal calls attention to this side channel and puts the |
DanielRosenwasser
referenced this issue
Aug 26, 2017
Closed
Tagged template literals compiled incorrectly, TemplateStringsArray not cached #17956
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
WebReflection
Sep 6, 2017
What are you using that invariant for? From the above discussion, it doesn't sound like it was core to the design of template strings.
Sorry I didn't receive any notification about your question.
Both me and recently Polymer Labs too, use template literals as unique reference to understand if same static content (within the template) needs to be analyzed.
Once you are sure one static content is identical to any version of itself, parsing it, analyzing it, procesing it as DOM tree, makes no sense because the only changes you want to consider are interpolations and nothing else.
Accordingly, having the first argument of any function tag as in this scenario:
(a=>a)`a${1}b${2}` === (a=>a)`a${3}b${4}`is essential to figure out if the static parts of the template need to be parsed or not.
AFAIK neither my project nor Polymer Labs one use a WeakMap to hold those templates ... but that's because we both use a Map, which would retain such static content forever regardless.
I wouldn't care if WeakMap could discard them, as long as a Map works and the template literal remains unique (per realm, I wouldn't care at all if it doesn't cross realm).
Hopefully I've explained the invariant.
WebReflection
commented
Sep 6, 2017
Sorry I didn't receive any notification about your question. Both me and recently Polymer Labs too, use template literals as unique reference to understand if same static content (within the template) needs to be analyzed. Once you are sure one static content is identical to any version of itself, parsing it, analyzing it, procesing it as DOM tree, makes no sense because the only changes you want to consider are interpolations and nothing else. Accordingly, having the first argument of any function tag as in this scenario: (a=>a)`a${1}b${2}` === (a=>a)`a${3}b${4}`is essential to figure out if the static parts of the template need to be parsed or not. AFAIK neither my project nor Polymer Labs one use a WeakMap to hold those templates ... but that's because we both use a Map, which would retain such static content forever regardless. I wouldn't care if Hopefully I've explained the invariant. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
littledan
Sep 6, 2017
Member
@WebReflection With this proposed change, your caching system would still work. It would also still successfully get cache hits, much of the time. The cache misses would only come when a distinct piece of source code is loaded which happens to have the same template literal. Then, some duplicate work would be done, but only the first time for that second piece of code. Would this be acceptable for you?
|
@WebReflection With this proposed change, your caching system would still work. It would also still successfully get cache hits, much of the time. The cache misses would only come when a distinct piece of source code is loaded which happens to have the same template literal. Then, some duplicate work would be done, but only the first time for that second piece of code. Would this be acceptable for you? |
added a commit
to littledan/ecma262
that referenced
this issue
Oct 12, 2017
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
Is this outstanding census? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
WebReflection
Nov 24, 2017
@littledan gosh I still don't receive notifications here ...
The cache misses would only come when a distinct piece of source code is loaded which happens to have the same template literal.
if this includes dynamic import it might be an issue. is that the case?
WebReflection
commented
Nov 24, 2017
|
@littledan gosh I still don't receive notifications here ...
if this includes dynamic |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
littledan
Nov 24, 2017
Member
We got consensus on the general idea of the change in the September 2017 meeting, but some more editorial changes are needed on the spec patch before it can land.
|
We got consensus on the general idea of the change in the September 2017 meeting, but some more editorial changes are needed on the spec patch before it can land. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
I apologize for the delay here! |
syg commentedMar 7, 2017
As with all fun things this came out of a discussion with @anba and @arai-a. We ran into it as I was fixing template object canonicalization in Firefox.
So GetTemplateObject canonicalizes template objects using a realm-wide template registry. I assume the intention is to allow this registry to be implemented internally as a weak map: if the template object itself is GCed, then the identity of the canonical object becomes unobservable, and thus can be removed from the template registry.
Funnily enough, using the template object as a JS WeakMap key makes it observable forever. Since from the spec's point of view the template objects are eternal anyways, the spec calls for the following behavior (test case courtesy of @arai-a):
In other words, putting a template object as a key in a WeakMap actually makes it live forever and unable to be GCed.
@ajklein @bterlson @msaboff How do your respective engines implement this?