Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: abort transaction in all cases #881

Merged
merged 1 commit into from Apr 13, 2022

Conversation

emasab
Copy link
Contributor

@emasab emasab commented Feb 18, 2022

catching RuntimeException all other subclasses of Exception are not caught and don't abort the transaction

catching RuntimeException all other subclasses of Exception are not caught and don't abort the transaction
@esabellico-facephi
Copy link

a bit of context: maybe this can happen only in Kotlin you can actually compile code that throws checked exceptions or calling java code that throws checked exceptions without declaring them with throws in the TransactionBody interface

@emasab
Copy link
Contributor Author

emasab commented Mar 12, 2022

here is some code to reproduce the error https://github.com/emasab/mongo-client-fix-881

@jyemin jyemin requested a review from stIncMale April 5, 2022 20:09
@@ -209,7 +209,7 @@ public <T> T withTransaction(final TransactionBody<T> transactionBody, final Tra
try {
startTransaction(options);
retVal = transactionBody.execute();
} catch (RuntimeException e) {
} catch (Throwable e) {
Copy link
Member

@stIncMale stIncMale Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such a tiny change, and so many things come to mind when thinking about it.

Errors in Java the way they are, make no sense

My view on this is expressed here. A consequence of this view is that we should not catch Errors, which means we should catch Exceptions instead of catching Throwables each time we want to catch "everything" that makes sense catching (surely, there are exceptions from this rule, like catching assertion errors in test frameworks, or catching specific Errors that should have been Exceptions, but were made Errors instead).

Unlike Java, Rust uses two drastically different mechanisms to represent recoverable and unrecoverable errors:

  • Recoverable errors are represented via return values. Unlike in C, the compiler forces a caller to either handle them or to explicitly propagate as the caller's return value via the ? operator.
  • Unrecoverable errors are called panics, which can be implemented via unwinding or aborting. Unwinding is similar to throwing an exception in Java, and an unwinding panic can even be caught.

It is worth pointing out that the ability to catch unwinding panics in Rust should not be interpreted as "see, even in Rust catching unrecoverable errors is possible, which suggests that trying to handle unrecoverable errors (whether in Java or in Rust) may make sense!". The reason Rust allows catching unwinding panics is that "Rust's unwinding strategy is not specified to be fundamentally compatible with any other language's unwinding. As such, unwinding into Rust from another language, or unwinding into another language from Rust is Undefined Behavior. You must absolutely catch any panics at the FFI boundary!" This point leads us directly to the issue that this PR is trying to address.

Kotlin's exception handling is not compatible with Java's exception handling

Unlike Java, Kotlin does not have the concept of checked exceptions (good for Kotlin). However, this means that Kotlin's exception mechanism is not compatible with Java's exception mechanism, and just like Rust code must catch any unwinding panics at the FFI boundary, Kotlin generally should catch all exceptions that are considered checked in Java (the check is simple: they are of the Exception type and not of the RuntimeException type) at the Kotlin-Java boundary. Unfortunately, the Kotlin's interop guide erroneously does not require anything like this. It partially addresses the incompatibility for a subset of cases, but that subset does not include the case addressed in this PR.

Conclusion

  1. Even though I consider catching Throwables instead of Exceptions an undesirable practice, the code of the driver contains many places where this is done, which is why introducing one more such catch changes nothing.
  2. Even though having unchecked exception thrown from Kotlin code that implements a Java API method that does not declare checked exceptions is a problem in the Kotlin code, we better work around it as suggested in the PR. Also, from now on we should remember to consider any method as being able to throw checked exception even it is not declared1 (despite this not being true for all methods, for simplicity and robustness it is better to treat all of them this way).

1 One may argue that even without calling Kotlin code, sneaky throws are possible in Java due to the bug-feature it introduced. However, in this case it is easy to argue that whoever uses the technique is fully responsible for the consequences. The same is much harder to say about Kotlin code because it is more abundant and Kotlin's documentation fails to require the right thing to be done for interoperability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice reasoning. I add that if you decompile a Kotlin class file you see that a checked exception is wrapped inside a Throwable and then thrown, I've not tested it but I'd guess that it's done in Scala and Groovy too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, it is not obvious to me why Kotlin compiler wraps checked exceptions in Throwable given that an exception of the Throwable type is considered checked by Java compiler.

@stIncMale
Copy link
Member

Thanks, @emasab, @esabellico-facephi, for informing the driver team about the problem. JAVA-4575 Driver code should catch Exception instead of RuntimeException even if no checked exception is declared will address the problem in other places in the code.

@stIncMale stIncMale merged commit ebada8e into mongodb:master Apr 13, 2022
@emasab
Copy link
Contributor Author

emasab commented Apr 13, 2022

Not at all @stIncMale. Correction about the Kotlin bytecode: it's not wrapped, it's casted to Throwable. And as it's checked in Java, it should be the Kotlin compiler that, not doing the checks, allows to compile to bytecode. If you decompile the Kotlin bytecode to Java it gives the error: you have to declare it with throws.

@emasab
Copy link
Contributor Author

emasab commented Apr 27, 2022

@stIncMale may I ask another question, given that we're still experiencing a case where the same sessionId is used with two different transaction ids even if the session is started and closed with the transaction.

In the current version: https://github.com/emasab/mongo-java-driver/blob/1294de8a3daf0bcc0403d127f8c932511a51ea18/driver-sync/src/main/com/mongodb/client/internal/ClientSessionImpl.java

  1. at line 238 shouln't it abort the transaction before calling start transaction again?
    } else if (e.hasErrorLabel(TRANSIENT_TRANSACTION_ERROR_LABEL)) { continue outer; }
  2. at line 241 shouln't it abort the transaction before giving up after 2 minutes have passed?
    } throw e; }

Thanks!

@stIncMale
Copy link
Member

stIncMale commented Apr 29, 2022

@emasab I have not been familiar with this code, but it appears that the code is written with the following idea behind it: a client may try to either commit or rollback a transaction, but cannot try both:

  • Both commitTransaction and abortTransaction change transactionState to COMMITTED/ABORTED respectively regardless of whether they succeed or fail.
  • commitTransaction fails if transactionState is ABORTED. abortTransaction fails if transactionState is COMMITTED.

clearTransactionContextOnError called when commitTransaction fails must be doing the client-side cleanup needed to either retry commitTransaction or abandon the transaction; and the server must cleanup resources related to an abandoned transaction eventually.

we're still experiencing a case where the same sessionId is used with two different transaction ids even if the session is started and closed with the transaction

startTransaction may be executed multiple times per a withTransaction execution. This method advances the transaction number. That is, it appears to be normal to see more than one txnNumber for a session even if there is only one withTransaction for that session.

If you still think you are observing a bug, it may be worth reporting it via Jira. And if you have a reproducer, it would be very helpful.

@emasab
Copy link
Contributor Author

emasab commented May 4, 2022

@stIncMale thanks for the explanation it's much clearer now, about the txnId it's giving this error Caused by: com.mongodb.MongoQueryException: Query failed with error code 251 and error message 'cannot continue txnId 28 for session f1949373-d84d-468e-8667-98389bc480b9 - W/Ao6W6j9bCrE7yhAkO0wEm49NrHTolA71DjF3jfPsw= with txnId 29 but it's using a version of withTransaction that is copied from this one and adapted to Kotlin's coroutines so if debugging further I find something that applies to this code too, I'm sending some test to reproduce it.

ispringer pushed a commit to evergage/mongo-java-driver that referenced this pull request May 16, 2023
See JAVA-4575 for more details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants