Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid NPE on preallocated exception without message #1409

Closed
wants to merge 2 commits into from

Conversation

zkejid
Copy link
Contributor

@zkejid zkejid commented Jul 22, 2017

Hello.

I'm trying to make my own research on bitconj codebase. During test execution I ran into NPE in code added by commit 83995e9. There is check, which doesn't expect exception message to be null. So I just made additional null-check, so it will not throw NPE and will not log noisy messages.

The cause is an optimisation in HotSpot. Here is an article about it: http://jawspeak.com/2010/05/26/hotspot-caused-exceptions-to-lose-their-stack-traces-in-production-and-the-fix/

My solution is not so good. The best solution would be to completely avoid 'legal' exceptions in this case. However, my knowledge is not enough for it.

Hope this helps.

 - There is hot exception usage during tests execution, so HotSpot makes
   optimization.
   Optimization could be disabled with -XX:-OmitStackTraceInFastThrow.
   However this case can be solved with additional check in code.
 - Null message means there was a lot of exceptions before, so no need
   to log.warn about it one more time.
@Polve
Copy link

Polve commented Jul 22, 2017

Doe this problem arise only on very old versions of java?

@zkejid
Copy link
Contributor Author

zkejid commented Jul 22, 2017

Doe this problem arise only on very old versions of java?

No. Preallocated exception (NPE, ClassCastException, ArrayIndexOutOfBoundsException, etc) is the feature of HotSpot JVM. If your code produces a lot of exceptions, like it does during tests execution, JVM starts throwing exception without message and stacktrace. It improves performance. But code in catch clause should work properly with such empty exception instance.

I reproduced the issue running org.bitcoinj.script.ScriptTest#dataDrivenInvalidTransactions until failure. This test breaks on second hundred of executions.

here is my environment:

java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) Server VM (build 25.131-b11, mixed mode)

@zkejid
Copy link
Contributor Author

zkejid commented Jul 22, 2017

Oops. Looks like there is one more "until failure" issue in org.bitcoinj.wallet.WalletTest#basicSpendingWithEncryptedWallet. Test fails on ~175 sweep with exception

java.lang.IllegalArgumentException: private key exceeds 32 bytes: {} bits [376]

	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:180)
	at org.bitcoinj.core.ECKey.<init>(ECKey.java:193)
	at org.bitcoinj.crypto.DeterministicKey.<init>(DeterministicKey.java:69)
	at org.bitcoinj.crypto.DeterministicKey.derivePrivateKeyDownwards(DeterministicKey.java:427)
	at org.bitcoinj.crypto.DeterministicKey.findOrDeriveEncryptedPrivateKey(DeterministicKey.java:406)
	at org.bitcoinj.crypto.DeterministicKey.decrypt(DeterministicKey.java:377)
	at org.bitcoinj.crypto.DeterministicKey.decrypt(DeterministicKey.java:43)
	at org.bitcoinj.core.ECKey.decrypt(ECKey.java:1113)
	at org.bitcoinj.crypto.DeterministicKey.decrypt(DeterministicKey.java:388)
	at org.bitcoinj.crypto.DeterministicKey.decrypt(DeterministicKey.java:43)
	at org.bitcoinj.wallet.DecryptingKeyBag.maybeDecrypt(DecryptingKeyBag.java:50)
	at org.bitcoinj.wallet.DecryptingKeyBag.findKeyFromPubHash(DecryptingKeyBag.java:67)
	at org.bitcoinj.core.TransactionOutPoint.getConnectedRedeemData(TransactionOutPoint.java:169)
	at org.bitcoinj.core.TransactionInput.getConnectedRedeemData(TransactionInput.java:295)
	at org.bitcoinj.wallet.Wallet.signTransaction(Wallet.java:4091)
	at org.bitcoinj.wallet.Wallet.completeTx(Wallet.java:4028)
	at org.bitcoinj.wallet.WalletTest.basicSpendingCommon(WalletTest.java:340)
	at org.bitcoinj.wallet.WalletTest.basicSpendingWithEncryptedWallet(WalletTest.java:164)

@schildbach
Copy link
Member

Thanks for finding this! I was already trying to investigate into these sporadic unit test failures, but failed and would have searched forever.

I wonder if we should perhaps disable this optimization for the tests. According to the article you linked, the -XX:-OmitStackTraceInFastThrow flag would accomplish this. Any thoughts?

@zkejid
Copy link
Contributor Author

zkejid commented Jul 23, 2017

PR addresses issue in production code, and user can face the same problem while using bitconj. Also, this code goes against javadoc for java.lang.Throwable#getMessage:

 @return  the detail message string of this {@code Throwable} instance
          (which may be {@code null}).

Don't know if there could be more ways to get null in message. So I think it is better to change code than Travis configuration.

Flag should help. It is OK as a temporary solution. Or if you'd like to see all stacktraces in build log, no matter how many exceptions were thrown before. However this option would shadow bugs similar to this one.

@zkejid
Copy link
Contributor Author

zkejid commented Jul 23, 2017

Also I can look for more use cases of getMessage, which can lead to the same NPE.

@schildbach
Copy link
Member

Ok I see your point about production code. Ideally production code should never depend on the exception message. However in this case I guess the upstream API just didn't think about this case, and upstream APIs are difficult to fix/enhance. I'm willing to merge your patch, but I think we should document it a bit in code.

As far as I could see, there are no other dependencies of production code on the exception message (except printing it to a log, wrapping it a domain specific exception, etc.). However, we have tons of unit tests that assert for a specific exception message. In this case imho we have no other choice but to fail the test, be it through an (unexpected) NullPointerException or by properly failing the assert – the outcome will be the same, the test will fail. At the same time, running the tests will likely be the only "usecase" that raises enough exceptions to trigger the HotSpot optimization. Therefore, I still suggest disabling the optimization for the tests.

@schildbach
Copy link
Member

Yes, please feel free to look for other instances of production code depending on the value of or the non-null-ness of getMessage().

@zkejid
Copy link
Contributor Author

zkejid commented Jul 23, 2017

I'm willing to merge your patch, but I think we should document it a bit in code.

Should I change my PR somehow? (I'm sorry, I don't get it clear enough from your message, so have to ask :) )

@schildbach
Copy link
Member

If you find other similar cases in production code, feel free to add more commits. I will write a line or two of documentation when merging.

Meanwhile, I've proposed a patch for disabling the optimization in unit tests: #1412

@zkejid
Copy link
Contributor Author

zkejid commented Jul 23, 2017

There is one more case potentially leading to NPE. Proto.Error builder doesn't allow null arguments and throws NPE. I've put an empty string for null messages.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 72.601% when pulling 165d6fa on zkejid:preallocatedExceptionNPE into 89fb8c3 on bitcoinj:master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 72.601% when pulling 9d7857b on zkejid:preallocatedExceptionNPE into 89fb8c3 on bitcoinj:master.

@@ -527,7 +527,7 @@ private void error(String message, Protos.Error.ErrorCode errorCode, CloseReason
Protos.Error.Builder errorBuilder;
errorBuilder = Protos.Error.newBuilder()
.setCode(errorCode)
.setExplanation(message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, explanation is optional in the protobuf. This means the better way would be to not set the explanation at all rather than set it to the empty string.

@@ -527,7 +527,7 @@ private void error(String message, Protos.Error.ErrorCode errorCode, CloseReason
Protos.Error.Builder errorBuilder;
errorBuilder = Protos.Error.newBuilder()
.setCode(errorCode)
.setExplanation(message);
.setExplanation(message != null ? message : "");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…same here.

 - 'explanation' argument in builder should not be null.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) to 72.55% when pulling d9f0aa8 on zkejid:preallocatedExceptionNPE into 6b42e47 on bitcoinj:master.

@schildbach
Copy link
Member

I squashed, added some comments and merged. Again, thanks a lot for investing your time into this issue! I'm glad this is now fixed.

@schildbach schildbach closed this Jul 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants