New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Autogenerate IR tree #4560
Autogenerate IR tree #4560
Conversation
7e6c6c2
to
5cbe605
Compare
@mcpiroman Thank you for your amazing work! Frankly speaking I'm not responsible to merge this commits but there are some technical suggestion that I hope will speed up further process:
|
@max-kammerer I may make the first commit only generate the tree and not use it, but then there will be duplicated classes - one generated and one hand-written - so it won't compile. Assuming you don't want commits that don't compile I would have not to commit those in the former, spliting them into 'everything which does not affect the existing behavior' and 'everything else' fwiw. Secondly, I have left the |
Ahh, missclick |
@mcpiroman If you suggest to perform splitting in
then it would be much better. Regarding of approach you will take I should note that there could be some clients that could be affected by changes in first (it's depends how would be kept existing behaviour) or/and second parts. So I would suggest to keep as much as possible transparency and clearancy in part that affects current ir declarations and visitors/transfomers/etc (maybe combining some commits, or making actual substitution to generated ones in last commit) and keep classes/files histories (that already done in current approach). Please avoid
It looks ok for me. Could you explain what difficulties are here and what is estimated plan here?
If you mean switching current pr with 'everything which does not affect the existing behavior' I would prefer to keep current solution so other team folks could participate in discussion and add notes having current full view |
@max-kammerer I don't quite get it overall. As for the review I suggest you just look at the accumulated difference, not commits, even after dividing them anyhow. I feel like in this case it will be more helpful if I describe any changes as needed rather than doing something on the commit layer. I will write the per folder change overview below. Therefore merge commits won't cause any problems, as I expect you will squash everything on merge, right?
fun IrStringConcatenation.addArgument(argument: IrExpression) {
arguments.add(argument)
} which I think are no needed anymore, but to remove them I'd have to modify some use-sides to use the direct collection access. So, should I also commit 1. or 2. with this PR or a follow-up one? Edit: I've updated the list of changes in the PR's header and I think with those everything should be clear to review as is. |
Usually we don't squash commits on merge (if they are not explicitly marked for that) keeping original structure and making it's through I will review pr deeper in few day to give non-technical part of feedback. And once again, final decision here would required review from other folks too |
a068ae5
to
735c16c
Compare
|
||
public abstract var body: IrBlockBody | ||
|
||
public override fun <R, D> accept(visitor: IrElementVisitor<R, D>, `data`: D): R = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why data
with ``?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The library for code generation I use, https://github.com/square/kotlinpoet, seemingly tries to be super safe and always escapes even soft keywords. If you have some other concerns about formatting of the generated code, like line a wrapping, it is also because of it (and that particular case is also by design).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to avoid such artifacts in code, maybe it's worth to file issue in kotlinpoet about that (if there are no way to switch this behaviour off). I didn't compare other changes deeply in generated code as I wrote many generated files lost their history and it's quite complicated to review them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The policy of this library is to generate possibly ugly but firmly working code. I've filed an issue about an option to disable (or control) line wrapping and they declined. I may try with the unneeded escaping but I guess without success. Note that It would be quite hard, sometimes impossible, to track the context of the code and follow the rules of where a particular soft keyword is allowed or not (especially given the language may change). Here is a very relevant example on why this is tricky.
But on the other hand note the generated code won't be seen much either, I expect, mostly from intellisence where this doesn't matter. Or, to get an overview of a given element (like its properties), just go to the DSL (IrTree.kt
) as it's much more conscience. (I've just got an idea to for each generated class autogenerate a link in a comment pointing to the line the element is defined in the DSL, don't know if that's possible tho).
many generated files lost their history and it's quite complicated to review them
My advice is to not rely on git here, I don't see how can it help. If you just want to see the generated elements then go to the gen
folder. Or if you want to compere them with their hand written original, manually find its class by the name. I recall only one file that is really moved, other than that just treat them as separate entities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to add postprocessor here: with replacing all 'data'
with data
. And removing public
modifiers from generated code. Maybe it's also solve issue with file moving (it will require reset and recommit of switch commit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mcpiroman Thank you!
Additonal suggestions:
- Avoid
: Unit
return type generation. By postprocessing or via kotlinpoet if it's possible:kotlin/compiler/ir/ir.tree/gen/org/jetbrains/kotlin/ir/declarations/IrAnonymousInitializer.kt
Line 37 in bbe2389
override fun <D> acceptChildren(visitor: IrElementVisitor<Unit, D>, data: D): Unit { - Avoid additional indentation for function with expressions body:
kotlin/compiler/ir/ir.tree/gen/org/jetbrains/kotlin/ir/declarations/IrAnonymousInitializer.kt
Line 35 in bbe2389
visitor.visitAnonymousInitializer(this, data) - While not keep old scheme for
override fun visitConst(expression: IrConst<*>): IrExpression
with type parameter? If we can avoid unneccessery source level breakage I would prefer to not do them - I would also prefer to support some migration for
visitVariableAccess
: have deprecated one delegating to new 'visitValueAccess'. And moving lower switching tovisitValueAccess
in separate commit
There is still problem with file movement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@demiurg906
Well, in the IR tree source there are also no parent members. That's why I proposed listing them in the KDoc, in some sane format so that they don't take too much space. But isn't that also an oversight in the (or in the usage of) Intellij? I'd assume it is frequent enough case that it should be possible to list all available members.
- Can strip that
- Don't quite know how to do this, but see below
- It is easier for generator, makes more sense (when overridden, the function doesn't visit some consts of type
T
, it visits all consts), IIRC it also allowed to remove some@Suppress("UNSAFE_CAST")
.
But why do you care about the breakage? At least regarding compiler plugins I was told not to worry too much about them as their still in alpha, I'd also expect their authors do look at some git diff of the tree structure between compiler versions anyway. - Same as above. This is generated code so it's not super easy to do such exceptions. Also note there is only one usage of this function in the entire codebase.
But since you want the generated code to be pretty, it should be easier to run some linter over it rather than applying manual corrections, which I don't think would help that much either (I've also dig onto some issue at kotlinpoet and they suggest the same approach). Preferably the same linter as for the rest of the project, right now there isn't one though, right? I've heard about some new service of IntelliJ to lint code outside of the IDE, might be helpful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also expect their authors do look at some git diff of the tree structure between compiler versions anyway.
Yes, it's true. But in current approach it's quite hard to understand whats going on as file history is broken. Same thing makes full review complicated. That blocks me (and I assume other folks) to perform it in proper way.
Change for 'visitConst' might be ok (as migration here is quite straightforward and it even keeps binary compatibility:)).
At least regarding compiler plugins I was told not to worry too much about them as their still in alpha,
It could be OK in many cases. But here (from my side) core part of IR lowering infrastructure is touched and I would prefer to have some migration mechanism to allow smooth migration (that would be much more important in future but not critical for this PR). I can be wrong but look like in this particular case with 'visitValueAccess' it will require 2-3 additional if
in code generator.
But since you want the generated code to be pretty, it should be easier to run some linter over it rather than applying manual corrections, which I don't think would help that much either (I've also dig onto some issue at kotlinpoet and they suggest the same approach).
Most likely you speak about something like code formatter. We used built in one in IDEA and most likely it could be used separately (but it could require some big dependencies that I'm not sure is good to have here). As was already pointed by @demiurg906 IR base declaration assumed to be read in different scenarios and "it's important to have this generated code written in general code style"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most likely you speak about something like code formatter.
Yes, in particular the newly announced qodana, though now that I looked it doesn't quite seem to support formatting.
So how about postprocessing with ktlint? It would most probably require a few new simple rules for best effect which can be easily plugged-in by creating new module alongside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, same code style is important as was written before, but it's last priority concern that I have for this PR //cc @bashor
Once again there are a lot of files that is not recognized by git as modified and moved, but as deleted and newly added (it's breaks file history and most likely would cause problems with commits cherrypicking to 1.6.0/10 release branch that would applied over pr ones). You can manually move these files as is to new folders in master branch and commit them (0 commit), after that you can rebase (or apply patch from pr changes) your pr over 0 commit. (I don't know how you are familiar with git, maybe this article also would be helpful: https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History) |
I would prefer to have changes for non-generated files with |
Assuming you talk about moving the IrElement files from
So here it may make more sense to actually unmark some of the files git marked as moved for the sake of consistency. |
Ok, but that will produce a not compiling commit |
It's a good points but I'm afraid that such approach will give a lot frustration during review and many problems for this code clients that we have now (I haven't yet taken into account compatibility problems with changes in code itself). So I would prefer to keep matching at least for identically named files. |
Just for note: Ideal solution for me would be in splitting all changes into
and keep file (and class) history in more proper way |
735c16c
to
590273d
Compare
I've divided the commits, but I can only guarantee the last will compile. I also don't know how to mark files as moved in git - as far as I know git recognizes moves by itself and the only thing one can do is to change the threshold of the differences to count as a move. |
590273d
to
bedc050
Compare
Thank you for commit splitting, now logical structure of changes much more clearer! And thank you for additional notes in commit message! Why you think that smth will affect compilation (do you mean last commit or new commit structure)?
I know just straightforward approach with manual file movement to new folder that I've wrote above |
d532c7d won't compile for sure because the missing |
Yes, it would be compilation fail on warnings (though there are still several modules where this check is not yet enabled). |
bedc050
to
48c87b9
Compare
@mcpiroman Sorry for the enormous delay, and thank you for the great work. I agree with my colleagues that it'd be better to keep git history. I have a few ideas and can help you with that. I would also like to separate some changes to make the code easier to review. So if you rebase your branch on the current master, and enable "allow edits from maintainers" in this PR (might already be enabled, I don't know), I'd like to make a few changes to your branch to make it mergeable to the current Kotlin master. |
@bashor Thanks for the review! |
c5aa261
to
1e45101
Compare
I've pushed the first part to master, and rebased the PR on top of it. I'm still trying to figure out whether the aforementioned performance regression exists, so this is still in progress by me. In addition to things listed above, I've changed generator a bit following the internal feedback:
I've made these changes as separate commits just to make it clear what I changed, but will probably squash them later into the main commit. The whole tree is regenerated only once in the main commit. |
compiler/ir/ir.tree/tree-generator/src/org/jetbrains/kotlin/ir/generator/print/Elements.kt
Outdated
Show resolved
Hide resolved
compiler/ir/ir.tree/tree-generator/src/org/jetbrains/kotlin/ir/generator/IrTree.kt
Show resolved
Hide resolved
1e45101
to
eb3dbac
Compare
|
eb3dbac
to
a0b9a9c
Compare
I've made a few more changes. As before, I'm going to push the commits which are before the main one separately, and fixup the ones starting with "(fixup)" into the main one. Most notably, I've made Also, I've deduplicated the interface/abstract-class solver copied from FIR tree generator. |
a0b9a9c
to
d4d6f8d
Compare
Should I rebase somehow soon in case I have free time? |
@mcpiroman Sorry, this has taken a lot longer than it needed and it's completely my fault. I was hoping to do a few remaining changes before merging the PR, but then got caught up with other urgent stuff. I will rebase it myself this week, take another look at it, and hopefully proceed with merging the PR if there are no objections from the team. |
No warries, just thought that rebasing might be a blocker to you. |
e9df7cf
to
9c27470
Compare
compiler/ir/ir.tree/gen/org/jetbrains/kotlin/ir/declarations/IrProperty.kt
Outdated
Show resolved
Hide resolved
To help git understand that they are moved in the subsequent commit.
Co-authored-by: Alexander Udalov <alexander.udalov@jetbrains.com>
Changelog after internal review (hopefully the last one):
|
Thank you for the contribution! |
So now what about protected abstract val typeArgumentsByIndex: Array<IrType?>
protected abstract val argumentsByParameterIndex: Array<IrExpression?> To support autogeneration, those should be made a regular public 'MutableList's (via
abstract class IrMemberAccessExpression<S : IrSymbol> : IrDeclarationReference() {
...
fun getValueArgument(index: Int) = valueArguments[index]
fun getTypeArgument(index: Int) = typeArguments[index]
fun putValueArgument(index: Int, valueArgument: IrExpression?) {
valueArguments[index] = valueArgument
}
fun putTypeArgument(index: Int, type: IrType?) {
typeArguments[index] = type
}
val valueArgumentsCount get() = valueArguments.size
val typeArgumentsCount get() = typeArguments.size
}
override val argumentsByParameterIndex: Array<IrExpression?>
get() = throw UnsupportedOperationException("Property reference $symbol has no value arguments") To support walking, these throws now need to be replaced by empty list/array. But because those elements just don't have value arguments, I think it is better (more type-safe) to move that list from
val IrMemberAccessExpression<*>.typeArguments get() = List(typeArgumentsCount) { getTypeArgument(it) }` Those could be eliminated to use the now exposed list/array.
Now, it could be just left hand written or made dirty-special-cased in the generator, but for me its better long-term to have all the tree elements constrained in structure, as enforced by the generator ,especially that it is already done for all but this one. It would also much help with exploration of the ideas I wrote about here: https://youtrack.jetbrains.com/issue/KT-51427. WDYT? |
Discussion: https://youtrack.jetbrains.com/issue/KT-47294
The autogeneration is analogous to the one of the FIR tree, although, as suggested, it does not reuse much of the FIR's generator.
It covers
expressions
(except forIrMemberAccessExpression
),declarations
andvisitors
(except forIrElementTransformerVoid
).List of changes (I tried to keep them to minimum):
compiler:ir.tree:tree-generator
, which generates files intocompiler/ir/ir.tree/gen
.Generate IR tree
, analogous to theGenerate FIR tree
.compiler/ir/ir.tree/src/(declarations|expressions)
removed, except where they had some extension members on the corresponding element.IrMemberAccessExpression
to make it possible to autogen its child elements (see explanation: Autogenerate IR tree #4560 (comment)).IrStatementContainer
anIrElement
. It looks like it was just forgotten to mark it so before.ObsoleteDescriptorBasedAPI
is now applied consistently, which means I also had to add a bunch of@OptIn(ObsoleteDescriptorBasedAPI::class)
here and there throughout the codebase.accept
,transform
,acceptChildren
andtransformChildren
are now all in the base (ir.tree) layer, instead of being spread between base and impl - just as in FIR.fun <T> visitConst(expression: IrConst<T>)
tofun visitConst(expression: IrConst<*>)
. TheT
parameter was not used anywhere.visitVariableAccess(expression: IrValueAccessExpression)
tovisitValueAccess
to match the element's name.accept
methods now return more concrete types, so I removed now unnecessary casts.IrScript
changedprovidedProperties: List<Pair<IrValueParameter, IrPropertySymbol>>
to a parallel lists ofprovidedProperties
andprovidedPropertiesParameters
.