-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8372948: Store end positions directly in JCTree #28610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
👋 Welcome back cushon! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
Mailing list message from Jonathan Gibbons on javadoc-dev: Without looking in detail at this specific proposal, I wonder if you considered the alternative to only store end positions in the subtypes of JCTree that actually "need" them. In other words, you only need store end positions in tree nodes that "end" in a lexical token and not in a child tree node. Effectively, you only need store the end position in tree nodes that would otherwise have entries in the EndPosTable. -- Jon On Tue, Dec 2, 2025, at 8:12 AM, Liam Miller-Cushon wrote: |
Good question--I hadn't investigated that option. It seems do-able, perhaps with a shared interface for subtypes that needed end positions to simplify the handling of them. What tradeoffs do you see here, would only declaring the field on trees that need it be mostly about saving memory? Also is that unique to end positions? Or could javac potentially avoid storing start positions for nodes that don't start with a lexical token as well? |
|
Mailing list message from Jonathan Gibbons on javadoc-dev:
Probably, yes, as well as just being a closer equivalent to the existing code.
Every JCTree node already has a 'pos' field, representing the position of the first character that is unique to the tree node. That means it is the 'start' position for those nodes that begin with a lexical token, and so no additional field is necessary. The asymmetry between start and end positions is indicative of why there is only an EndPosTable, without any need for a StartPosTable. -- Jon On Tue, Dec 2, 2025, at 8:30 AM, Liam Miller-Cushon wrote: |
|
Mailing list message from Jonathan Gibbons on javadoc-dev: I'm not sure a shared interface gets you anything significant, since you cannot inherit a shared field that way. Instead, you could have a `setEndPos` on `JCTree` that is a no-op on subtypes that do not need it, and which sets a locally declared field on subtypes that do need it. -- Jon On Tue, Dec 2, 2025, at 8:30 AM, Liam Miller-Cushon wrote:
-------------- next part -------------- |
Partly I was thinking the interface could make helpers that set end positions type safe, e.g. for the protected <T extends JCTree & JCTree.HasEndPos> T storeEnd(T tree, int endpos) { ... }But there are other places that store end positions on a
I think the set of trees that end positions are stored for is: I got that by instrumenting And then we could add the following snippet to all of those classes: private int endpos;
public int getEndPos() {
return endpos;
}
public void setEndPos(int endpso) {
this.endpos = endpos;
}My feeling is that perhaps it's worth the extra memory to not have to duplicate that code for all of those |
I tend to agree with your assessment. Having code duplication is kind of bad, unless we can somehow "common" the code -- and that's is probably possible, but not straightforward due the JCStatement vs. JCExpression split (and also JCFunctionalExpression). Also, it's hard to estimate which trees might need this... for instance the trees you show in your analysis don't include some patterns (e.g. record patterns), but that's just because probably there's no record pattern in the JDK, not because the end pos is not useful there. If we exclude JCSkip and maybe LetExpr (as that's only used by the backend), I'm not sure there's much stuff that actually doesn't require an end position? Perhaps with some keywords like One more interesting experiment could be to try to enable end position in all trees, then run the JDK build and compare with mainline, to see what the memory usage looks like (maybe enabling |
|
Mailing list message from Jonathan Gibbons on javadoc-dev: On Wed, Dec 3, 2025, at 1:47 AM, Liam Miller-Cushon wrote:
Yeah, you've done the analysis I think I would have done. And, it does seem l;ke there has been a fundamental shift in the desire/need to keep end positions around since times past. Simplicity says to go with a field in every JCTree these days. A more informed opinion would probably require more detailed analysis. I note that you can offset the space of the endPos fields that are not strictly required against the savings of the EndPosTable itself. I also wonder, just for curiousity sake, whether this would help with some of the (uncommon) problem cases in the existing situation, where the endPos of a non-terminal end element might not be the endPos of the entire tree -- IIRC, the issue was with C-style array declarations. The one test I would suggest to add (if necessary) in the "tree position" set of tests would be to verify that the new field is set in all applicable cases, with maybe some thought being given to synthetic nodes. -- Jon |
|
Mailing list message from Jonathan Gibbons on javadoc-dev: On Wed, Dec 3, 2025, at 11:13 AM, Maurizio Cimadamore wrote:
Doesn't the endPos records the position of the semicolon, not the end of the keyword? -- Jon |
|
Mailing list message from Jonathan Gibbons on javadoc-dev: On Wed, Dec 3, 2025, at 11:56 AM, Jonathan Gibbons wrote:
A useful heuristic is to check the `visit...` methods in `JCPretty`. If there is a call to `print(String)` or `print(char)` before the `} catch (IOException` then the node should have an endPos. In other words, an endPos is required for all nodes that end in a specific lexical token. For example, compare this from `visitLambda` printExpr(tree.body); and this from `visitParens` print(')'); |
I attempted this experiment. I ran configure with With this PR the output was something like And without these changes, it looks like it peaked at 36M instead of 37M diff --git a/make/common/JavaCompilation.gmk b/make/common/JavaCompilation.gmk
index 33f5d10535a..e9a800bce5a 100644
--- a/make/common/JavaCompilation.gmk
+++ b/make/common/JavaCompilation.gmk
@@ -254,7 +254,7 @@ define SetupJavaCompilationBody
javacserver.Main --conf=$$($1_JAVAC_SERVER_CONFIG)
else
# No javac server
- $1_JAVAC := $$(INTERIM_LANGTOOLS_ARGS) -m jdk.compiler.interim/com.sun.tools.javac.Main
+ $1_JAVAC := -verbose:gc $$(INTERIM_LANGTOOLS_ARGS) -m jdk.compiler.interim/com.sun.tools.javac.Main
ifeq ($$($1_SMALL_JAVA), true)
$1_JAVAC_CMD := $$(JAVA_SMALL) $$($1_JAVA_FLAGS) $$($1_JAVAC) |
This change adds a field to
JCTreeto store end positions, instead of using a separateEndPosTablemap. See also this compile-dev@ thread.I performed the refactoring in stages, preserving existing semantics at each step.
There are two known places where this changes existing behaviour that are reflected in changes to tests:
test/langtools/tools/javac/api/TestJavacTask_Lock.java- this test asserts that callingJavacTask#parsefirst and then calling#callor#parsesecond will fail. The assertion that the test is currently expecting is thrown when theEndPosTablegets set a second time, and this change means that no longer results in an exception. If desiredJavacTask#parsecould be updated to explicitly check if it is called twice and fail, instead of indirectly relying on theEndPosTablefor that.test/langtools/tools/javac/diags/DiagnosticGetEndPosition.java- there's a comment that 'ideally would be "0", but the positions are not fully set yet', and with the new approach the end position is available to the test, so it resolves the commentProgress
Issue
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28610/head:pull/28610$ git checkout pull/28610Update a local copy of the PR:
$ git checkout pull/28610$ git pull https://git.openjdk.org/jdk.git pull/28610/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28610View PR using the GUI difftool:
$ git pr show -t 28610Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28610.diff
Using Webrev
Link to Webrev Comment