Skip to content

build: replace invalidJavaOnlyPatterns source-patterns with ast-grep #14812

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 19, 2025

Conversation

rmuir
Copy link
Member

@rmuir rmuir commented Jun 18, 2025

The validation.source-patterns.gradle is intimidating, there's a lot going on here. Many of the rules are doing regex matches, which is tricky to maintain and can't give good error messages, autofixes in IDE, relevant explanations/URLs, etc.

Start with the 'invalidJavaOnlyPatterns' because it is one of many special cases, and better to just implement as ast-grep java rules.

The validation.source-patterns.gradle is intimidating, there's a lot
going on here. Many of the rules are doing regex matches, which is
tricky to maintain and can't give good error messages, autofixes in IDE,
relevant explanations/URLs, etc.

Start with the 'invalidJavaOnlyPatterns' because it is one of many
special cases, and better to just implement as ast-grep java rules.
Copy link

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@rmuir
Copy link
Member Author

rmuir commented Jun 18, 2025

This is just starting small. I really want to next clean up some of the rat interaction in this file, especially given the changing APIs and thread-unsafeness that makes the gradle code complex.

A lot of the logic around license/headers can be done here instead, especially since it isn't a "legal" check (rat-sources does that). I already use queries to match these headers in my editor and hide, e.g.:

;; fold and initially close license headers
(program
  (block_comment) @fold @foldclose
  (#lua-match? @foldclose "^/[*%s]*Licensed to the Apache.*"))

We can do similar stuff with yaml rules and ensure licenses aren't javadoc, come before package declaration, whatever all is going on here, and simplify the build.

rule:
pattern: import java.lang.$REF
kind: import_declaration
fix: ""
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example of rule with a "fix". If it is safe and easy, it can make things nice, especially if rule is on the nitpicky side.

  • in editor it will show up as light-bulb, user can choose fix as a code action.
  • scan -U could be integrated if we wanted for some reason. though i'd recommend "tidy" after any such thing.
  • error messages improve, the problem+fix displays as diff, which helps clarify the issue:
> Task :applyAstGrepRules
lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
error[java-lang-import]: unnecessary import of `String` from java.lang
@@ -20,7 +20,7 @@
21 21│ 
22 22│ import java.io.Closeable;
23 23│ import java.io.IOException;
24   │-import java.lang.String;
   24│+
25 25│ import java.time.Instant;
26 26│ import java.util.ArrayDeque;
27 27│ import java.util.ArrayList;
Note:
classes in java.lang are implicitly imported

Error: 1 error(s) found in code.
Help: Scan succeeded and found error level diagnostics in the codebase.

> Task :applyAstGrepRules FAILED

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in editor it will show up as light-bulb, user can choose fix as a code action.

Can you elaborate on what's your toolstack? Is this the lang server in the background? What editor is it? Just curious - it looks very interesting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just nvim + lang server. the ast-grep lang-server will provide diagnostics and code action fixes.

you can get similar setup easily / without hassle with just vscode + lang server. the help/IDEs has the setup procedure. In this case you would also want to install and configure ast-grep extension.

Comment on lines 18 to 21
# does the javac compiler allow this? did it ever?
rule:
pattern: var $$$ = new $$$<>($$$)
kind: local_variable_declaration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can actually be valid code, as in:

    var list = new ArrayList<String>();
    var foo = new ArrayList<>(list);

which compiles just fine but breaks the rule. This pattern doesn't exist anywhere in Lucene - I'm actually surprised it doesn't - I use this quite a lot elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dweiss there's not a problem here: the first line is valid. the second line is invalid. same as with the source-pattern today.

i'm not arguing for what the rule is trying to achieve, just trying to replace the regex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your explanation: I removed the TODO and renamed rule from "impossible" to "confusing" :)

Copy link

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

Copy link

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@rmuir rmuir merged commit 3a57b73 into apache:main Jun 19, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants