Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve error-prone configuration for int-overflow bugs #11910

Closed
rmuir opened this issue Nov 9, 2022 · 10 comments · Fixed by #11923
Closed

improve error-prone configuration for int-overflow bugs #11910

rmuir opened this issue Nov 9, 2022 · 10 comments · Fixed by #11923

Comments

@rmuir
Copy link
Member

rmuir commented Nov 9, 2022

Description

As mentioned by @dweiss and @benwtrent in #11905, some static analysis could help here. Maybe it can force us to do a pass reviewing other sketchy possible bugs like this in the codebase, and help prevent new ones from popping up in the future.

I already looked into ecj (including their latest mainline) and don't see any applicable checks. So, let's look into error-prone?

@rmuir rmuir added the type:task label Nov 9, 2022
@benwtrent
Copy link
Member

I think static analysis will be a significant help.

So, I did a quick check at turning on IntLongMath for error-prone and it is noisy.

Its flagging things like this as bad behavior:

long function(int v) {
   return v - 2;
}

or this

long foo() {
  return 2 * 2
}

As an aside, almost every single long ramBytesUsed() is flagged as auto-cast since it returns long but there are things like 2 * Integer.BYTE_VALUE.

We may need something more nuanced or focused besides error-prone

@rmuir
Copy link
Member Author

rmuir commented Nov 9, 2022

maybe we could even turn it on, and dump its output here as a one-time thing? we can just take a pass through it, and filter out the noisy ones.

I realize it isn't ideal and doesn't improve the build, but it still helps us try to look for similar bugs.

@dweiss
Copy link
Contributor

dweiss commented Nov 9, 2022

I can understand where it signals a problem but can't determine the domain (the v - 2 can underflow if you pass v small enough)... The 2* 2 case is odd, I'm surprised it doesn't see a constant there.

I'm not a big fan of many of those tools - I agree they're noisy - but from time to time they do find real problems.

@rmuir rmuir added this to the 9.4.2 milestone Nov 10, 2022
@rmuir
Copy link
Member Author

rmuir commented Nov 11, 2022

I'm not looking into IntLongMath check but instead a couple targeted checks:

These checks are finding some interesting stuff.

I wish i knew a way to make error-prone show all the failures (like javac would). Instead the plugin fails on the first error and I have to fix that one and run it again (slowly) to see the next. Its brutally painful.

@risdenk
Copy link
Contributor

risdenk commented Nov 11, 2022

I wish i knew a way to make error-prone show all the failures (like javac would). Instead the plugin fails on the first error and I have to fix that one and run it again (slowly) to see the next. Its brutally painful.

So I'm 95% sure this is possible. There are two parts to it.

  • Make errorprone only report as warnings
  • don't fail the gradle build on warnings

I was able to get the first part to work by adding -XepAllErrorsAsWarnings in gradle/validation/error-prone.gradle. I gave up trying to get gradle build to stop failing on warnings. I'm sure there is a way to do it.

PS FWIW I enabled a whole bunch of errorprone rules in Solr recently:

@dweiss
Copy link
Contributor

dweiss commented Nov 11, 2022

-Pjavac.failOnWarnings=false

should do it. Look at javac.gradle, we specifically enable -Werror:

      if (propertyOrDefault("javac.failOnWarnings", true).toBoolean()) {
        options.compilerArgs += "-Werror"
      }

@risdenk
Copy link
Contributor

risdenk commented Nov 11, 2022

Example of doing this in PR here: #11921

run with:

./gradlew check -x test -Pvalidation.errorprone=true -Pjavac.failOnWarnings=false -Pvalidation.git.failOnModified=false

should do what you want.

@risdenk
Copy link
Contributor

risdenk commented Nov 11, 2022

I put the output for enabling IntLongMath in the PR comment: #11921 (comment)

@rmuir
Copy link
Member Author

rmuir commented Nov 11, 2022

OK i checked out old git hash before #11905 commit, seems like "NarrowCalculation" is the best one? I think i made a mistake trying to turn on too many checks at once.

I enabled the check like this:

diff --git a/gradle/validation/error-prone.gradle b/gradle/validation/error-prone.gradle
index 962e237cc03..587c0611ffe 100644
--- a/gradle/validation/error-prone.gradle
+++ b/gradle/validation/error-prone.gradle
@@ -68,6 +68,7 @@ allprojects { prj ->

         options.errorprone.disableWarningsInGeneratedCode = true
         options.errorprone.errorproneArgs = [
+            '-XepAllErrorsAsWarnings',
             '-Xep:InlineMeSuggester:OFF', // We don't use this annotation

             // test
@@ -142,7 +143,6 @@ allprojects { prj ->
             '-Xep:ModifiedButNotUsed:OFF',
             '-Xep:MutablePublicArray:OFF',
             '-Xep:NarrowingCompoundAssignment:OFF',
-            '-Xep:NarrowCalculation:OFF',
             '-Xep:NonAtomicVolatileUpdate:OFF',
             '-Xep:NonCanonicalType:OFF',
             '-Xep:ObjectToString:OFF',

And I run checker with ./gradlew assemble -Pvalidation.errorprone=true -Pjavac.failOnWarnings=false > ~/overflow.log 2>&1

Here's what this check said for the old buggy code with overflow, it seems to find it?:

/home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java:402: warning: [NarrowCalculation] This product of integers could overflow before being implicitly
 cast to a long.
          graphOffsetsByLevel[level] = (1 + (M * 2)) * Integer.BYTES * numNodesOnLevel0;
                                                                     ^
    (see https://errorprone.info/bugpattern/NarrowCalculation)
  Did you mean 'graphOffsetsByLevel[level] = (1 + (M * 2)) * Integer.BYTES * ((long) numNodesOnLevel0);'?
/home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java:406: warning: [NarrowCalculation] This product of integers could overflow before being implicitly
 cast to a long.
              graphOffsetsByLevel[level - 1] + (1 + M) * Integer.BYTES * numNodesOnPrevLevel;
                                                                       ^
    (see https://errorprone.info/bugpattern/NarrowCalculation)
  Did you mean 'graphOffsetsByLevel[level - 1] + (1 + M) * Integer.BYTES * ((long) numNodesOnPrevLevel);'?
/home/rmuir/workspace/lucene/lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94HnswVectorsReader.java:440: warning: [NarrowCalculation] This product of integers could overflow before being implicitly cast to a long.
      this.bytesForConns0 = ((long) (entry.M * 2) + 1) * Integer.BYTES;
                                             ^
    (see https://errorprone.info/bugpattern/NarrowCalculation)
  Did you mean 'this.bytesForConns0 = ((long) (entry.M * 2L) + 1) * Integer.BYTES;'?

Here's the log file against current main branch.

overflow.log

This is about as narrow as I can get, as far as a check that would find the previous bug, to look for similar bugs and reduce the noise. I think this check looks worth enabling, we can just add some (long) casts to the rambytesUsed and change some x / 1024 / 1024. to be x / 1024. / 1024. in the merge policy stuff to address those cases.

Turning on other checks such as narrow-compound-assignment would be good and reveals even more interesting stuff but its a separate amount of work.

@rmuir
Copy link
Member Author

rmuir commented Nov 11, 2022

i've worked thru src/java and now i gotta deal with src/test, then ill make a PR so we can see what it looks like

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants