Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[kotlin] Add support for Kotlin #419

Closed
SUPERCILEX opened this issue May 31, 2017 · 53 comments · Fixed by #3808
Closed

[kotlin] Add support for Kotlin #419

SUPERCILEX opened this issue May 31, 2017 · 53 comments · Fixed by #3808
Labels
a:new-language Proposal to add a new PMD or CPD language to the main distribution a:suggestion An idea, with little analysis on feasibility, to be considered
Milestone

Comments

@SUPERCILEX
Copy link
Contributor

Since Google announced official support for Kotlin, a lot more people are going to be using the language (including me 😀) so it would be nice to have some rules for it.

@jsotuyod jsotuyod added a:new-language Proposal to add a new PMD or CPD language to the main distribution a:suggestion An idea, with little analysis on feasibility, to be considered labels May 31, 2017
@vanniktech
Copy link
Contributor

As a side note. Lint will start operating on UAST (Universal Abstract Syntax Tree) which means, that the Lint rules will also work with Kotlin. It might be possible to achieve the same here since some of the rules are working on an abstraction of the syntax, if not all (not that familiar with the internals of PMD).

@ryan-gustafson
Copy link
Contributor

ryan-gustafson commented Jun 1, 2017

The method for adding new language support to PMD is well established. The particulars are unique to the language however. PMD is basically an AST visitation framework, with Rules being unique visitation patterns. Some languages have custom parsers/AST created (e.g. Java), while others leverage available object graphs to wrap as AST (e.g. XML DOM). If a canonical parser/AST is available this could be used to quickly bootstrap Kotlin language support. Since languages evolve over time, it is useful for the AST to be relatively stable across versions of the language, or Rules may stop working.

A general outline of this process is here: https://pmd.github.io/pmd-5.7.0/customizing/new-language.html

AFAIK, there is not currently support for meta-language Rules, so if you want a Rule that works for Java to work for Kotlin, you'd need to re-implement it for Kotlin.

@jsotuyod
Copy link
Member

jsotuyod commented Aug 2, 2017

Rules ideas for the day we get this on:

https://ktlint.github.io/#rules
https://github.com/arturbosch/detekt#rulesets

@xanderblinov
Copy link

xanderblinov commented Jun 18, 2018

Hi guys, any news here? Are you planing to do it?

@jsotuyod
Copy link
Member

@xanderblinov this is still on the roadmap. As previously stated, this needs either:

  1. Full support for Antlr grammars (hopefully part of PMD 7 /cc @matifraga)
  2. An abstraction on top of Antlr, similar to what we did for Apex

We are inclined on 1 as it's a one time cost for multiple languages we want to fully support (swift, kotlin, go...), while 2 is an effort needed per language. However, PRs are always welcomed to help get this shipped faster.

If anyone wants to contribute, reach out to me on gitter https://gitter.im/pmd/pmd

@yanex
Copy link

yanex commented Jun 25, 2018

Kotlin developer here.
What are you willing to do with just an ANTLR grammar? IMO any sensible static analysis will be impossible without the type/reference resolution support. And trying to "emulate" (re-implement) it is really a bad idea. It's a very complex task, and the result won't reflect the reality anyway.
I suggest you call Kotlin compiler directly or use UAST.

@xanderblinov
Copy link

@jsotuyod what do you think about directly complier call and using UAST?

@jsotuyod
Copy link
Member

jsotuyod commented Jul 14, 2018

I'm hesitant to do so. It would require PMD to bundle the kotlin compiler, which is dependent on the Kotlin version in use (experimental features may break even in minor versions, and there are no compatibility promises for an eventual Kotlin 2 release, as per https://kotlinlang.org/docs/reference/compatibility.html).

As far as I know, the Kotlin compiler, unlike the Java compiler, can't be told "this code is in version X", so supporting multiple Kotlin versions require bundling different versions of the compiler, which in turn means a really heavyweight PMD binary, and not doing so means forcing users to stick to an outdated PMD version or upgrade their codebases to whatever Kotlin version we decide to support.

Unless IntelliJ decides to join forces with us (I'd love to!), the same way Salesforce did for Apex / VisualForce support, and work with us to avoid these issues, I've some doubts this approach will be sustainable.

Using the Kotlin parser to get the UAST still needs mapping those nodes to our own Node structures for the pmd-core to be able to work on them. This is exactly what we do on Salesforce Apex code for instance, so the main difference is wether we use a grammar implementation from someone else, or our own. Both approaches have their benefits and their cons, I don't think one is clearly better over the other, so I'm ok with either on this particular aspect and it's a tradeoff I think we can work with.

On the other hand, we do have type and reference resolution in place for JVM-based languages. I know Kotlin can compile to other platforms, but probably most of our users for the time being are focused on Android support, so we are probably just fine were we stand at the moment.

@jsotuyod jsotuyod added a:suggestion An idea, with little analysis on feasibility, to be considered and removed a:suggestion An idea, with little analysis on feasibility, to be considered labels Jul 16, 2018
@JLLeitschuh
Copy link

JLLeitschuh commented Jul 31, 2018

This is technically a "Fork" of the work done by Poeschl which is, itself, a fork of the
original project pmd-kotlin from jk1.

People are welcome to expand upon this project and make the tests pass. I had to kill a lot of the old tests because the compiler API's that were used to make the checks were removed/changed.

I'd like to publish this project eventually so I can use the CPD feature in my companies builds.

I'm also happy to roll this into the PMD project if there's an interest.

https://github.com/JLLeitschuh/pmd-kotlin

@Zhuinden
Copy link

Zhuinden commented Jan 2, 2019

Do I understand correctly that the latest PMD release (6.10.0) supports CPD for Kotlin? 🙏

@jsotuyod
Copy link
Member

jsotuyod commented Jan 2, 2019

@Zhuinden yes, you understand perfectly.

@jsotuyod
Copy link
Member

For everyone following this, starting from PMD 6.13.0 the CPD support for Kotlin is now complete, which means it now supports suppressions through comments and CPD-OFF / CPD-ON. Check the docs for more detail.

Full Kotlin support for PMD rules will probably be part of PMD 7.0.0 (although we may have no actual rules at that point yet, but contributions are welcomed)

@matifraga
Copy link
Contributor

We will also add a wiki page so you can enable full CPD support for your second favourite language ❤️ in a few steps

@adangel adangel removed the has:pr label Apr 27, 2020
@Jeeppler
Copy link

@jsotuyod are there any rules for Kotlin? What one has to do to add new rules for Kotlin?

@jborgers
Copy link
Contributor

Curious about the status of Kotlin support by PMD. Is it going to be in 7.0? Is there an early version to try out?

@adangel
Copy link
Member

adangel commented Jun 15, 2021

If someone wants to start the Kotlin for PMD approach, I would suggest to use PMD 7 as the basis, as there we have support for Antlr-based grammars. There is now a official antlr-grammar available for Kotlin - https://github.com/Kotlin/kotlin-spec
This could be used to add full support with moderate effort.

Having support for a language is one piece, the other piece would be to provide interesting and useful rules.

@jborgers
Copy link
Contributor

We are looking into this. What would be a good way to contribute? Open an issue to discuss it?

@adangel
Copy link
Member

adangel commented Sep 2, 2021

Thanks for your offer. I think, this issue here is the issue to discuss adding support for Kotlin.
As described in my earlier comment, I'd suggest to integrate it into PMD 7 - since we have there full antlr support. Otherwise you would need to reimplement Kotlin grammar in javacc.

PMD 7 development happens in the branch pmd/7.0.x - so make sure to base your PRs against this branch (and not master).

In PMD 7, there is one language that uses Antlr, where you can have a look, how this is integrated: Swift.
The source for that can be found here: https://github.com/pmd/pmd/tree/pmd/7.0.x/pmd-swift/src/main/java/net/sourceforge/pmd/lang/swift

The PMD 7 documentation page is here: https://docs.pmd-code.org/pmd-doc-7.0.0-SNAPSHOT/ - e.g. Swift has 4 rules.

The Antlr integration documentation is not integrated yet - so the docs Adding PMD support for a new language only describes the javacc way. But there is also a pull request #1881 which contains the new doc for Antlr-based languages.

@jborgers
Copy link
Contributor

jborgers commented Sep 3, 2021 via email

@adangel
Copy link
Member

adangel commented Sep 4, 2021

@jborgers I've updated the documentation and merged the PR. The doc is now available at https://docs.pmd-code.org/pmd-doc-7.0.0-SNAPSHOT/pmd_devdocs_major_adding_new_language_antlr.html

Please let us know, if there is anything wrong/unclear/doesn't work.

Also make sure to clone pmd/pmd - be careful when cloning forks, they might not be up-to-date...

@jborgers
Copy link
Contributor

jborgers commented Sep 6, 2021

@adangel thanks, this is really helpful.

How can I provide feedback on typo's/unclearness etc.?

Yes, using my fresh fork now (jborgers/pmd), branch 7.0

@jborgers
Copy link
Contributor

jborgers commented Sep 14, 2021

Interesting suggestion. I just did. The parser grammar needs to be the same as the file name, so Kotlin instead of KotlinParser. The result is that indeed the KotlinVisitor etc. classes have the right name, however, the KotlinParser class is now called Kotlin.
Renaming that to KotlinParser and running class generation again without clean, gets me further.

So maybe this renaming needs to be automated in one of the scripts or a pom?

@jborgers
Copy link
Contributor

jborgers commented Sep 14, 2021

My generated KotlinParser now has errors: static classes extending org.antlr.v4.runtime.ParserRuleContext call super.acceptVisitor(visitor, data) while this method does not exist.
There is an accept(visitor) however in org.antlr.v4.runtime.RuleContext.
And I see an acceptVisitor(visitor, data) in KotlinInnerNode.

In SwiftParser, I see the static classes extend Swift specific classes. That didn't get generated for Kotlin(Parser).

@jborgers
Copy link
Contributor

@adangel @oowekyala Help appreciated, I am stuck.

@oowekyala
Copy link
Member

I'm not sure where you pushed your changes. Could you open a draft PR with the relevant branch, and give us write access (a checkbox when you open the PR)? This way we could take a look easily

@jborgers jborgers mentioned this issue Sep 15, 2021
4 tasks
@jborgers
Copy link
Contributor

Okay, I just did. Let me know if easily taking a look works now

@jborgers
Copy link
Contributor

jborgers commented Sep 16, 2021

Great, it seems to work now! With generated files, all compiles.
What did you do? Does the documentation need to be updated, the steps?
Could we get the designer tool to work as well?

@oowekyala
Copy link
Member

What did you do? Does the documentation need to be updated, the steps?

I mostly just fixed the Ant script, I'll update the docs later.

Could we get the designer tool to work as well?

The designer should just work, but you'll have to follow the instructions at the top of this page to use it with pmd 7.

@jborgers
Copy link
Contributor

jborgers commented Sep 27, 2021

We could eventually managed to get designer working. We for instance also needed build-tools and IntelliJ was hanging on pmd-designer.

image
Example rule: AvoidShortFunctionNames

@jborgers
Copy link
Contributor

jborgers commented Sep 27, 2021

But something seems wrong in the grammar, we get a very deeply nested syntax tree for the simplest expressions:
image
Not sure how to proceed, help appreciated.

@adangel
Copy link
Member

adangel commented Sep 27, 2021

The syntax tree is correct - that's how the grammar is written. @oowekyala made me aware, that ANTLR doesn't create a abstract syntax tree but instead creates a parse tree. And that is, what you are seeing. Basically everything down to the tokens is a node in the tree.

What we would need is some kind of post-processing of the parse tree, to make it look nicer and simpler to use for static analysis. E.g. the name of the function would be a perfect fit for an attribute on the node "FunctionDeclaration" (instead of having the node "T-Identifier").

The deep tree also comes from the expression production. In the Javacc grammar for Java, we already simplify this part of the tree with the help of a javacc feature (conditional tree - it creates only a tree node if there are >1 nodes).

While the tree doesn't look as good as it could, it's still usable. Most important is, that you write tests for the rules, you create (see https://pmd.github.io/latest/pmd_userdocs_extending_testing.html) because then you can ensure that the rules work in case we improve the tree later.

@stokpop
Copy link

stokpop commented Sep 29, 2021

Thanks. Yes, we will create tests for the rules, like we have for our performance Java rules.

Starting with the designer to create the first Kotlin XPath rule, we have some questions:

  1. In our Java XPath rules we use the typeIs() function a lot, as in pmd-java:typeIs('java.lang.String'). Can we use the same function in Kotlin XPath rules? Maybe pmd-kotlin:typeIs() is available or should be created?

Update: it seems doable to create a custom XPath function for kotlin in a similar way as in net.sourceforge.pmd.lang.java.rule.xpath.internal.BaseContextNodeTestFun. Only the net.sourceforge.pmd.lang.java.types.TypeTestUtil#isA(...) method used to check the java type seems more difficult to implement for kotlin. It makes use of a JTypeMirror interface. Question: where do these J* interfaces come from? Can we have something similar for kotlin? If not, what would be a way to check types in/for kotlin?

  1. XPath attributes are really useful for our XPath rule creation. For example, in Java rules we use @Image tags often. The PMD designer shows that the @Image attribute and @joinTokenText are deprecated. @Text is available, but only at the lowest level, like T-Identifier. So for matching an import package: //ImportList/ImportHeader/Identifier[starts-with(@joinTokenText, "com.netflix.hystrix")]. The @joinTokenText seems very convenient. Is there an easy alternative?

  2. The Kotlin code below gives Errors in the AST. Also in the ANLTR plugin in intellij there are errors starting at the first {. Could this be an issue in the Kotlin antlr definition. Maybe similar to this: Grammar: empty body after delegate parsed as lambda expression Kotlin/kotlin-spec#69. Should we raise an issue in the kotlin-spec project for this?

object TestFoo {
    fun testBar() = 1
}

@jborgers
Copy link
Contributor

jborgers commented Oct 4, 2021

@adangel @oowekyala any ideas?

@oowekyala
Copy link
Member

oowekyala commented Oct 4, 2021

I'm afraid all of those are actually very real limitations, for which I have no good solution...

  1. the typeIs function in pmd-java uses our own implementation of a symbol and type resolver. This is a complicated component as it has to eg perform type inference. The Java type resolver in pmd 7, which was rewritten from scratch, is 40 000+ lines, excluding its tests. Building the same thing for Kotlin would be an incredibly (impossibly) massive undertaking. Kotlin has much more complicated scoping rules and type inference, it even has flow-sensitive types. I can only say that building and maintaining such a component is completely out of scope of PMD as I currently understand it. So supporting a typeIs function for kotlin would need, I think, maybe an integration with the kotlin compiler.
  2. Attributes are also something we implement manually, in javacc languages. Antlr does not let us replace or add members to the classes it generates. The recommended way to use antlr is in fact to build (manually) another AST on top of the parse tree. In those AST classes, we could write and expose our own attributes. That would also fix the problem that the tree is too big and has irrelevant nodes, as you pointed out. However, this is not how PMD's Antlr integration is using Antlr. [core] Node support for Antlr-based languages #1658 was based on the assumption that we could use Antlr's parse tree directly as an AST, and in retrospect I think it was pretty naive, and that continuing down this route will only cause more problems.
    One alternative would be to say that @Image and @joinTokenText are not deprecated in Antlr, because we don't have anything better anyway... But this would be setting up Antlr language modules to be second-class citizens and is honestly not a decent solution.
    Another alternative would be to manually add attributes to antlr nodes using a visitor. That would actually be pretty easy, but would only improve the situation for XPath rules. We could rely on utility classes/kotlin extensions to add API to nodes for non-xpath rules to use. But the tree would still be horribly big and undocumented.
    I'm afraid none of this sounds very compelling to me...
  3. This is probably a problem with the Antlr grammar and yes, it should be reported to kotlin-spec

@stokpop
Copy link

stokpop commented Oct 6, 2021

@oowekyala thanks for your elaborate answer!

As there are not many options for point 1 and 2, we will start by creating a very simple XPath rule and make some unit tests.
We will also investigate one of our Java rules and see if we can get a working prototype for kotlin with the current limitations.

For point 3: I checked the same kotlin code in the KotlinParser.g4 file of https://github.com/Kotlin/kotlin-spec and that parses without errors. So there might be a diff somewhere: replacing the verbatim list of unicodes with import UnicodeClasses; in the KotlinLexer.g4 makes it work without error in the antlr plugin in Intellij. Must be a reason why that list was not used via the import? The error in manven with import is: "can't find or load grammar". I managed to get it working using the import by using a <libDirectory> in the antlr4-maven-plugin. The errors are now also gone in the pmd-designer.

@jborgers
Copy link
Contributor

jborgers commented Oct 7, 2021

That was my doing, putting everything in one .g4 file like for Swift. It seemed to work. Great you solved it.

@stokpop
Copy link

stokpop commented Oct 18, 2021

Update:

  • added a simple rule XPath rule "FunctionNameTooShort" with a working unit test in net.sourceforge.pmd.lang.kotlin.rule.bestpractices.FunctionNameTooShortTest
  • added a working unit test for the KotlinParser in net.sourceforge.pmd.lang.kotlin.ast.KotlinParserTests
  • added pmd-kotlin:hasChildren(1) XPath function as PoC in net.sourceforge.pmd.lang.kotlin.rule.xpath.internal.BaseContextNodeTestFun

Next, we will try to migrate one of our Java performance rules into a Kotlin rule and we will check if we can call the "FunctionNameTooShort" from existing tools/plugins, e.g. in Sonar.

@jborgers
Copy link
Contributor

We are still working on this, see: [kotlin] Kotlin support #3505

@adangel adangel mentioned this issue Feb 25, 2022
8 tasks
@adangel adangel linked a pull request Jun 30, 2022 that will close this issue
8 tasks
@adangel adangel added this to the 7.0.0 milestone Jun 30, 2022
adangel added a commit to adangel/pmd that referenced this issue Jun 30, 2022
@adangel
Copy link
Member

adangel commented Jan 17, 2023

Status update for this issue:

  • The "basic" Kotlin support has been merged into pmd/7.0.x branch
  • It will be part of PMD 7
  • There are certain limitations currently that need to be improved later on

I'll create issues for these points and close this one as completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:new-language Proposal to add a new PMD or CPD language to the main distribution a:suggestion An idea, with little analysis on feasibility, to be considered
Projects
None yet
Development

Successfully merging a pull request may close this issue.