Skip to content

Checkstyle: Add checkstyle rule for %d in Preconditions.checkArgument#5057

Merged
szehon-ho merged 3 commits intoapache:masterfrom
xrl1:checkstyle-preconditions-check-argument-int-format
Jun 27, 2022
Merged

Checkstyle: Add checkstyle rule for %d in Preconditions.checkArgument#5057
szehon-ho merged 3 commits intoapache:masterfrom
xrl1:checkstyle-preconditions-check-argument-int-format

Conversation

@xrl1
Copy link
Contributor

@xrl1 xrl1 commented Jun 15, 2022

Closes #4618

From the issue:
Preconditions.checkArgument only works with %s, as %d does not parse and returns a bad error message.

Created a multiline regex to prevent this.

The regex is as follows:

Preconditions\.checkArgument\([^;]+%d[^;]+\);

The logic behind it: Find the function call, then any characters (except ;) until %d, and finally search until the function closes and a ';'.

  • Used [^;]* instead of .*, because this is a multiline search and the dotall regex param (enabled by matchAcrossLines config) is aggressive and this is the best way to break its search.

Also, fixed %d usages that the Checkstyle rule found

@szehon-ho
Copy link
Member

Looks like some violation in Spark 3.0? Also fyi , @kbendick

@xrl1 xrl1 force-pushed the checkstyle-preconditions-check-argument-int-format branch from dc65746 to d5aceef Compare June 22, 2022 07:28
@xrl1
Copy link
Contributor Author

xrl1 commented Jun 22, 2022

@szehon-ho I committed a fixup to the violation (missed it while running the checkstyle locally in the IDE for some reason)

@szehon-ho
Copy link
Member

Sounds good, sorry i am pretty bad at regex, what do you mean:

Used [^;]* instead of .*, because this is a multiline search and the dotall regex param (enabled by matchAcrossLines config) is aggressive and this is the best way to break its search.

It can be removed and the "\s+" can be enough

The [^;]* can be removed? The purpose now is to not match %d in the Precondition.checkArgument()'s first argument (or is it for something else)?

@xrl1
Copy link
Contributor Author

xrl1 commented Jun 23, 2022

@szehon-ho I'll explain a little bit more:
As requested, I used a multi-line regex search as asked, and I set matchAcrossLines to true to enable "DOTALL flag" as we don't know on which line the "%d" expression will appear after the function call.
Once this value is set, if I insert .* before the %d in the expression (to match the first argument to checkArgument, and not only to skip %d in it as you asked), it will search for matches for the second part of the regex, in all the file, even after checkArugment statement.
For example:

Preconditions.checkArgument(partitionFields.size() >= 1, "some text %s", 1);
someOtherFunc("%d", 100);

might be mistakenly matched.

To avoid matching it, the [^;] will terminate the search upon the first ; it encounters, and problem solved.

@xrl1 xrl1 force-pushed the checkstyle-preconditions-check-argument-int-format branch from d5aceef to e1e1208 Compare June 23, 2022 09:09
@xrl1
Copy link
Contributor Author

xrl1 commented Jun 23, 2022

Updated the PR again, and now also tested the build locally to pass, sorry for that.

@szehon-ho
Copy link
Member

@xrl1 thanks for the explanation for those points. How about the last point:

Added the quote and comma to avoid matching the first argument, but I can simplify the regex and not handle the edge cases where there is "%d" in the first argument

Do you think we should simplify this? I don't see much possibility of %d in the first argument of Precondition (which is evaluating a boolean), other than a modulo operation but in that case there is a space after %. Or let me know if I missed what you mean.

xrl1 added 3 commits June 25, 2022 16:16
Preconditions.checkArgument only works with %s, as %d does not parse and returns a bad error message
Added a RegexpMultiline to find wrong usages.
@xrl1 xrl1 force-pushed the checkstyle-preconditions-check-argument-int-format branch from e1e1208 to 2154578 Compare June 25, 2022 13:17
@xrl1
Copy link
Contributor Author

xrl1 commented Jun 25, 2022

@szehon-ho Yeah you are right, it is really unlikely to encounter %d in the first argument
I simplified it to:

Preconditions\.checkArgument\([^;]+%d[^;]+\);

@szehon-ho
Copy link
Member

Great, thanks for spending time on this. Will merge once it passes

Copy link
Contributor

@kbendick kbendick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

@kbendick
Copy link
Contributor

@szehon-ho Yeah you are right, it is really unlikely to encounter %d in the first argument I simplified it to:

Preconditions\.checkArgument\([^;]+%d[^;]+\);

Could you update the PR description with the explanation of this regex (so we don't have to look it up in the conversation) @xrl1? I believe the original regex is still there.

@xrl1
Copy link
Contributor Author

xrl1 commented Jun 27, 2022

@kbendick Yeah sure, updated.

@szehon-ho szehon-ho merged commit 4956ec6 into apache:master Jun 27, 2022
@szehon-ho
Copy link
Member

Thanks @xrl1 and @kbendick for additional review

namrathamyske pushed a commit to namrathamyske/iceberg that referenced this pull request Jul 10, 2022
namrathamyske pushed a commit to namrathamyske/iceberg that referenced this pull request Jul 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add checkstyle rule for %d in Preconditions.checkArgument

3 participants