Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-24452][SQL][Core] Avoid possible overflow in int add or multiple #21481

Closed
wants to merge 3 commits into from

Conversation

kiszk
Copy link
Member

@kiszk kiszk commented Jun 1, 2018

What changes were proposed in this pull request?

This PR fixes possible overflow in int add or multiply. In particular, their overflows in multiply are detected by Spotbugs

The following assignments may cause overflow in right hand side. As a result, the result may be negative.

long = int * int
long = int + int

To avoid this problem, this PR performs cast from int to long in right hand side.

How was this patch tested?

Existing UTs.

@SparkQA
Copy link

SparkQA commented Jun 2, 2018

Test build #91404 has finished for PR 21481 at commit 324fd5c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member Author

kiszk commented Jun 2, 2018

cc @ueshin @hvanhovell

@kiszk
Copy link
Member Author

kiszk commented Jun 2, 2018

cc @cloud-fan

@JoshRosen
Copy link
Contributor

Hey @kiszk, thanks for tracking this down. This change looks good to me.

I have a couple of questions, mostly aimed towards figuring out how we can categorically solve this problem:

  1. What's the impact of this issue? Have you observed actual crashes or silent data corruption caused by this problem? I ask because it looks like this could be a plausible cause of a data corruption bug that I've been investigating.
  2. Are these five files the only occurrence of this problem or are there potentially others? I've noticed that all of the files modified here are .java files: is that a coincidence or is that the result of running some Java linting tool? If the latter, is it possible that we have similar issues in other files which we haven't found yet?
  3. Do you have any ideas for how we can categorically prevent this class of problem in the future? Are there compiler plugins or linters which can warn on these cases and turn them into compile-time errors? Or code-review checklists that we can employ to more easily spot these potential overflows? This is a hard problem, but I think it's worth brainstorming on categorical solutions since this isn't the first time we've hit this class of problem (but hopefully it can be the last!).

I think we should definitely backport this fix, at least to 2.3 and possibly earlier.

@cloud-fan
Copy link
Contributor

+1 on @JoshRosen 's ideas.

I've already seen several overflow fixes from @kiszk , it will be good if we have some tools to check it, even we need to run the tool manually. One idea may be to force to use java 8's safe math functions: https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#addExact-int-int-

@kiszk
Copy link
Member Author

kiszk commented Jun 3, 2018

Good questions.

For 2, at first I found one of these issues when I looked at a file. Then, I ran grep command with long .*=.*\* and long .*=.*\+ in .java file. Then, I picked them up manually. It looks labor-intensive.

For 3, here is my thought.
SpotBugs may be a good candidate to check it.
SpotBug is a successor of findBugs. When I ran FindBugs before, I found some problems regarding possible overflow and then made a PR that was integrated. On the other hand, these issues may not be detected at that time.

I will look at SpotBugs after my presentation at SAIS will be finished :)

@kiszk
Copy link
Member Author

kiszk commented Jun 8, 2018

@JoshRosen @cloud-fan
Here is an update.

I have just apply findBugs to OffHeapColumnVector.java and UnsafeArrayData.java.
In OffHeapColumnVector.java, most of possible overflows are detected. But, not all.
In UnsafeArrayData.java, two possible overflows are detected. Line 86 and 456. I overlooked Line 86.

@cloud-fan
Copy link
Contributor

is findBugs available for scala code as well?

@kiszk
Copy link
Member Author

kiszk commented Jun 8, 2018

Since it is Java bytecode analysis, it is available for Scala code, too.
In my quick test, findBugs overlooked a possible overflow. On the other hand, findBugs found another redundant null check.

@kiszk
Copy link
Member Author

kiszk commented Jun 10, 2018

Since I found an plug-in for maven, I will also include a patch to add findBugs/SpotBugs into maven in this PR.

@JoshRosen
Copy link
Contributor

Let's merge this as-is and do the build improvements in a separate PR. That's important because we may want to backport the overflow fix to maintenance branches and may want to do so independent of the build changes.

@kiszk
Copy link
Member Author

kiszk commented Jun 11, 2018

Thank you for your comment. I will create another PR for integrating findBugs/SpotBugs into maven.

@@ -703,7 +703,7 @@ public boolean append(Object kbase, long koff, int klen, Object vbase, long voff
// must be stored in the same memory page.
// (8 byte key length) (key) (value) (8 byte pointer to next value)
int uaoSize = UnsafeAlignedOffset.getUaoSize();
final long recordLength = (2 * uaoSize) + klen + vlen + 8;
final long recordLength = (2L * uaoSize) + (long)klen + (long)vlen + 8L;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to cast everything to long? I think long + int + int + int is safe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. It was too conservative. (2L * uaoSize) + klen + vlen + 8 can generate LMUL or LADD as follows:

    LDC 2
    ILOAD 9
    I2L
    LMUL
    ILOAD 4
    I2L
    LADD
    ILOAD 8
    I2L
    LADD
    LDC 8
    LADD
    LSTORE 10

@@ -41,7 +41,7 @@
@Override
public UnsafeRow appendRow(Object kbase, long koff, int klen,
Object vbase, long voff, int vlen) {
final long recordLength = 8 + klen + vlen + 8;
final long recordLength = 8L + (long)klen + (long)vlen + 8L;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@cloud-fan
Copy link
Contributor

cloud-fan commented Jun 11, 2018

@kiszk when you were running findBugs locally, did you find more overflow bugs that are not present in this PR? Let's put all discovered overflow bugs in this PR and have another PR to integrate findBugs with maven. Thanks!

@kiszk kiszk changed the title [SPARK-24452][SQL][Core] Avoid possible overflow in int add or multiple [WIP][SPARK-24452][SQL][Core] Avoid possible overflow in int add or multiple Jun 12, 2018
@SparkQA
Copy link

SparkQA commented Jun 12, 2018

Test build #91717 has finished for PR 21481 at commit 50a2585.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk kiszk changed the title [WIP][SPARK-24452][SQL][Core] Avoid possible overflow in int add or multiple [SPARK-24452][SQL][Core] Avoid possible overflow in int add or multiple Jun 15, 2018
@kiszk
Copy link
Member Author

kiszk commented Jun 15, 2018

@cloud-fan addressed all of the possible integer overflows detected by SpotBugs.

@SparkQA
Copy link

SparkQA commented Jun 15, 2018

Test build #91922 has finished for PR 21481 at commit a87c417.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master/2.3!

@asfgit asfgit closed this in 90da7dc Jun 15, 2018
asfgit pushed a commit that referenced this pull request Jun 15, 2018
This PR fixes possible overflow in int add or multiply. In particular, their overflows in multiply are detected by [Spotbugs](https://spotbugs.github.io/)

The following assignments may cause overflow in right hand side. As a result, the result may be negative.
```
long = int * int
long = int + int
```

To avoid this problem, this PR performs cast from int to long in right hand side.

Existing UTs.

Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>

Closes #21481 from kiszk/SPARK-24452.

(cherry picked from commit 90da7dc)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants