changes in connection #182
|
MySQL and SQL more generally do not support nested transactions. Users should not get to the situation when they need to check whether they are inside a transaction because code that runs within transactions should never start new transactions. No syntactic clutter. If they attempt to start a transaction from within a transaction, they should get a hard error not a warning. This allows them to realize their mistake and decide what to do with the ongoing transaction. The risk of ignoring the start of an inner transaction is in incorrect execution of the outer transaction. When the inner transaction commits, it will prematurely commit the outer transaction. Throwing an error and preventing entering nested transactions is the only cogent behavior in this case. |
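A minimal sketch of the "hard error" behavior argued for above. `GuardedConnection` and `NestedTransactionError` are hypothetical names used only for illustration, and the class tracks state in memory without talking to a database; it is not the project's actual connection class.

```python
class NestedTransactionError(RuntimeError):
    """Raised when a transaction is started while another one is open."""


class GuardedConnection:
    """Toy connection that tracks a single transaction level (no real database)."""

    def __init__(self):
        self.in_transaction = False

    def start_transaction(self):
        if self.in_transaction:
            # Hard error, not a warning: the caller must decide what to do
            # with the transaction that is already open.
            raise NestedTransactionError("Nested transactions are not supported.")
        self.in_transaction = True

    def commit_transaction(self):
        self.in_transaction = False

    def cancel_transaction(self):
        self.in_transaction = False


conn = GuardedConnection()
conn.start_transaction()
try:
    conn.start_transaction()  # second start inside the open transaction
except NestedTransactionError as err:
    print(f"refused: {err}")
    conn.cancel_transaction()  # the caller chooses to roll back explicitly
```

The point of the design is that the decision about the open transaction stays with the caller instead of being made silently by the connection.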
|
Just curious: what's the significance of |
|
I'm also concerned about the proposed handling of the transaction. When a person writes |
|
Anything other than throwing an error may lead to data loss and/or corruption. |
|
@eywalker: I think this would actually be the correct behaviour. If you start a transaction, you want it rolled back if anything goes wrong, even if part of the code runs through. |
How is this used?
Yeah, I'm also curious what this does? I'm guessing that you have encountered a problem with a large query hitting the packet size limit?
When I encountered the broken pipe issues in pymysql, I found that pymysql has its own max_allowed_packet, which for some reason does not reflect the server's max_allowed_packet. Therefore I set it to the maximum value MySQL allows, so that the server settings decide which packet sizes are accepted. Alternatively, a cleaner solution would be to put that value in dj.config.
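A rough sketch of the "put it in the config" alternative, assuming the value is passed through pymysql's `max_allowed_packet` connection argument. The `config` dict and the `connection_kwargs` helper are hypothetical stand-ins, not the project's `dj.config`, and the block only builds the arguments without opening a connection.

```python
# Largest value MySQL accepts for max_allowed_packet (1 GiB).
MYSQL_MAX_ALLOWED_PACKET = 1024 ** 3

# Hypothetical stand-in for a settings object such as dj.config.
config = {"host": "localhost", "user": "root", "password": "secret"}


def connection_kwargs(settings):
    """Return driver arguments, defaulting the client-side packet limit to the
    MySQL maximum so that the server setting is the effective limit."""
    kwargs = dict(settings)
    kwargs.setdefault("max_allowed_packet", MYSQL_MAX_ALLOWED_PACKET)
    return kwargs


kwargs = connection_kwargs(config)
print(kwargs["max_allowed_packet"])
# Passing these to the driver would look like: pymysql.connect(**kwargs)
```

A user who wants a smaller client-side limit could then set `max_allowed_packet` in the settings, and the default would not override it.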
|
Yes, but what I'm saying is that if a person writes code assuming they have an inner transaction, they would expect that completing the inner transaction code means the inner transaction is fully committed, even if the rest of the outer transaction fails afterwards. Obviously this is not true, because there is only one level of transaction and any inner transaction calls are ignored by the current implementation. However, even allowing such nested transaction calls to be executed may lead users to incorrect assumptions about how transactions are handled. Since there is no such thing as a nested transaction, I think it is safer to throw an error upon any attempt to start one. |
|
I'm also generally curious what kind of use case results in a potential call to |
|
Users should never allow for nested transactions in their code because they are not supported. If they allow such a situation, they should get an error and correct their code. Here is pseudocode that would produce data loss or corruption if you do anything other than throw an error:
start_transaction 1
insert(A)
start_transaction 2
insert(B)
commit or rollback 2
insert(C)
commit or rollback 1 |
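The pseudocode can be played through in memory. This is a toy model under stated assumptions, not MySQL itself: `FlatConnection` is a hypothetical class with a single transaction level that silently ignores a nested start, autocommits inserts made outside a transaction, and is run here for the case where transaction 2 rolls back and transaction 1 commits.

```python
class FlatConnection:
    """Toy stand-in for a connection with a single transaction level."""

    def __init__(self):
        self.committed = []   # rows that are durably stored
        self.pending = []     # rows written inside the open transaction
        self.in_transaction = False

    def start_transaction(self):
        # Assumed policy under discussion: a nested start is silently ignored.
        self.in_transaction = True

    def insert(self, row):
        if self.in_transaction:
            self.pending.append(row)
        else:
            self.committed.append(row)  # autocommit outside a transaction

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()
        self.in_transaction = False

    def rollback(self):
        self.pending.clear()
        self.in_transaction = False


conn = FlatConnection()
conn.start_transaction()  # start_transaction 1
conn.insert("A")
conn.start_transaction()  # start_transaction 2 (ignored)
conn.insert("B")
conn.rollback()           # rollback 2: discards A and B, ends transaction 1
conn.insert("C")          # no transaction is open any more, so C autocommits
conn.commit()             # commit 1: nothing left to commit
print(conn.committed)     # ['C']
```

In this model the outer code believes it committed A, B, and C together, yet only C is stored.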
|
But he is suggesting that inner transactions do NOT actually issue start and commit transaction statements. Given that, no data corruption will occur per se. If an error occurs anywhere, everything rolls back.
|
|
In the pseudocode above, if the inner transaction decides to rollback its transaction, it will rollback If instead the user got an error when starting the inner transaction, they could decide to commit |
|
I think whether that's unnecessary depends, right? If the outer transaction wants to ensure that the inner statement insert(B) completes before it declares all of its queries good and commits, it only makes sense that an error within causes a rollback of everything in the outer transaction, including insert(A). That being said, this behavior is not what one would expect of "nested transactions", and given that MySQL doesn't support nested transactions, this is an expected problem...
|
|
@eywalker, Yes, if the inner transaction rolled back, it should not roll back the outer transaction. Otherwise, it would be equivalent to a single outer transaction. The inner transaction should not commit either, since the outcome of the outer transaction is still pending. The inner transaction cannot be considered a transaction in any sense and should not be allowed. |
|
Hmm, it sounds like you are suggesting that the inner transaction should act like a thread fork that basically should not affect the parent thread? I guess at least you are suggesting that the success or failure of the inner transaction should not affect that of the outer one? I thought it made sense to say that the outer transaction required the success of the inner one. I totally agree (and I have since the beginning) that we cannot have a meaningful inner transaction since MySQL doesn't support it, but I'm also not sure what the behavior of a nested transaction would be should MySQL support it... Without a good idea of what a real nested transaction should be like, I'm having a hard time telling what the expected behavior should be when the inner transaction fails. This being said, I have a feeling that @fabee is considering quite a different use case/perspective in wanting this implementation.
|
|
Ok, maybe to clarify: I put in that behaviour, because I decorated Regarding nested transactions, I am not convinced by your arguments for the following reason: If I start a transaction, I want anything that gets inserted during that transaction to be either inserted together or not at all (unlike @dimitri-yatsenko's expected behaviour, in line with @eywalker's expected behaviour). If the "inner" transaction fails, then parts of the data could not be inserted. Therefore, the outer transaction should fail as well. In that sense, @dimitri-yatsenko is right: A nested transaction does not make sense. What you really want is:
This is exactly what I do not see how this can lead to data loss or corruption, because the alternative in @dimitri-yatsenko example would have been to run |
|
@fabiansinz, Using the pseudocode above, if transaction 2 rolls back, what will happen to |
|
It will not be inserted as it should because the outer transaction is only complete if A, B, and C are inserted. B fails so C does not need to be inserted. |
|
Exactly. How does the outer transaction know not to |
|
okay, @eywalker just explained to me that our disagreement stemmed from the fact that @fabiansinz entangled exception handling with transaction handling. So here is an example that disentangles them and exposes problems with the suggested solution:
def fun():
    try:
        with conn.transaction:
            insert(B1)
            insert(B2)
    except:
        handle_all_my_errors
# the following code does not know that fun() uses transactions
with conn.transaction:
    insert(A)
    fun()
    insert(C)
Does this clarify why emulating nested transactions is a bad idea? What will happen if |
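To make the example concrete, here is a runnable sketch under explicit assumptions: `JoiningConnection` is a hypothetical in-memory model of a context manager that joins an already open transaction, only the outermost block commits or rolls back, and `insert(B2)` is assumed to be the statement that fails.

```python
from contextlib import contextmanager


class JoiningConnection:
    """Toy connection whose nested `transaction` blocks join the outer one."""

    def __init__(self):
        self.committed = []
        self.pending = []
        self.depth = 0

    @property
    def transaction(self):
        return self._transaction()

    @contextmanager
    def _transaction(self):
        outermost = self.depth == 0
        self.depth += 1
        try:
            yield
        except Exception:
            if outermost:
                self.pending.clear()  # only the outermost block rolls back
            raise
        else:
            if outermost:
                self.committed.extend(self.pending)
                self.pending.clear()
        finally:
            self.depth -= 1

    def insert(self, row, fail=False):
        if fail:
            raise ValueError(f"cannot insert {row}")
        (self.pending if self.depth else self.committed).append(row)


conn = JoiningConnection()


def fun():
    try:
        with conn.transaction:
            conn.insert("B1")
            conn.insert("B2", fail=True)  # assumed failure
    except ValueError:
        pass  # fun() handles all of its own errors


# the following code does not know that fun() uses transactions
with conn.transaction:
    conn.insert("A")
    fun()
    conn.insert("C")

print(conn.committed)  # ['A', 'B1', 'C']
```

Under these assumptions B1 is committed without B2, so the all-or-nothing guarantee that `fun()` relied on is lost.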
|
Yes, this "entanglement" is how this context manager works. If you don't want that behaviour, don't use it, but Regarding your example, I am not convinced because it is basically a misuse of the context manager. You could have made the example just like
with conn.transaction:
    insert(A)
    try:
        insert(C)
    except:
        # handle all my errors
        pass
and caused an inconsistent state because of the misuse of the context manager. Alternatively, you can run into the same situation without using the context manager at all.
def fun():
    try:
        insert(B1)
        insert(B2)
    except:
        # handle_all_my_errors
        pass
start_transaction()
try:
    insert(A)
    fun()
    insert(C)
except:
    rollback()
else:
    commit_transaction()
Same inconsistent state because the outer You did convince me though that a failure in the inner transaction should already rollback the outer transaction. This would solve your example. If you are unhappy with potential misunderstandings, what about naming the context manager |
|
@fabiansinz , No, in my example above the errors were handled outside the context manager in |
|
In your second example The whole point of my example is that |
|
Here is a clearer example then:
Example 3
def fun():
    start_transaction()
    try:
        insert(B1)
        insert(B2)
    except:
        rollback()
    else:
        commit()
# the author of the following code does not know that fun() uses transactions
with conn.transaction:
    insert(A)
    fun()
    insert(C) |
|
I think the main assumption that you are making is that any subroutine that rolls back a transaction also raises an exception immediately afterwards. This cannot generally be true. Rolling back a transaction can be handled by subroutines without raising an exception all the way up the entire call stack. We should be able to create routines with transactions that take care of their own exceptions. |
|
This is exactly what I am assuming with the context manager. It is the same logic as implemented in That said, I am not trying to introduce nested transactions. I just want a context manager that ensures that a transaction is open. I am happy to give it a better name like |
|
@fabiansinz No, you misunderstood. Even if you use the context manager in Example 3, you are also assuming that all functions inside the context manager at any depth also use context managers and none of them at any level handle exceptions raised below. Please consider Example 3 carefully. The problem arises from the fact that |
|
An Example 4
def fun():
    start_transaction()
    try:
        insert(B1)
        insert(B2)
    except:
        rollback()
    else:
        commit()
# the author of the following code does not know that fun() uses transactions
with conn.ensure_transaction:
    insert(A)
    fun()
    insert(C)
If |
|
Example 4 will cause a rollback automatically because you are trying to open a transaction in a transaction. Maybe I wasn't clear enough. As I already wrote before, the assumption of the context manager is: Any error that would cause an inconsistent state should be passed to the context manager. It does not matter whether the error is re-raised by another context manager or whether it is re-raised by any other exception handling. If the error is not passed to the context manager, the assumption is violated and it will break. You provided enough examples for that. By now I would like to know whether somebody will accept the pull request with the following updated solution:
|
|
Ah, I see, my bad. I oversimplified. The following example will break things. What conventions does it break?
Example 5
def sub():
    with conn.transaction:
        insert(B1)
        insert(B2)
def fun():
    try:
        sub()
    except: pass
# the author of the following code does not know that fun() uses transactions
with conn.ensure_transaction:
    insert(A)
    fun()
    insert(C)
If |
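Example 5 can also be run against a toy model of the revised proposal, in which a failure in any block rolls back the whole open transaction. Everything here is an assumption made for illustration: `EnsuringConnection` is a hypothetical in-memory class, the same `ensure_transaction` manager is used for both the inner and the outer block, inserts outside a transaction autocommit, and `insert(B2)` is assumed to fail.

```python
from contextlib import contextmanager


class EnsuringConnection:
    """Toy connection: blocks join an open transaction, any failure rolls it back."""

    def __init__(self):
        self.committed = []
        self.pending = []
        self.active = False

    @property
    def ensure_transaction(self):
        return self._ensure_transaction()

    @contextmanager
    def _ensure_transaction(self):
        opened_here = not self.active
        self.active = True
        try:
            yield
        except Exception:
            # A failure at any depth rolls back the whole open transaction.
            self.pending.clear()
            self.active = False
            raise
        else:
            # Only the block that opened the transaction may commit it,
            # and only if nothing rolled it back in the meantime.
            if opened_here and self.active:
                self.committed.extend(self.pending)
                self.pending.clear()
                self.active = False

    def insert(self, row, fail=False):
        if fail:
            raise ValueError(f"cannot insert {row}")
        (self.pending if self.active else self.committed).append(row)


conn = EnsuringConnection()


def sub():
    with conn.ensure_transaction:
        conn.insert("B1")
        conn.insert("B2", fail=True)  # assumed failure


def fun():
    try:
        sub()
    except ValueError:
        pass  # the error never reaches the outer block


# the author of the following code does not know that fun() uses transactions
with conn.ensure_transaction:
    conn.insert("A")
    fun()
    conn.insert("C")

print(conn.committed)  # ['C']
```

In this model A is rolled back while C is stored on its own, because the outer block never learns that its transaction has already ended.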
|
In Example 5, |
|
The correct solution is to never create code that uses transactions in some cases and not others. It should either use transactions correctly or not use them. Here is a scenario that is quite realistic and closer to home.
Example 6
For example, if you override |
|
No, I don't think I can accept a method that implicitly decides to use transactions in some contexts but not others. If something does not need to be enclosed in a transaction, it should never be enclosed in a transaction. If something does, then it always does. Let's not cheapen transactions by making them break occasionally in ways that will be difficult to debug. |
|
What's wrong with the logic:
|
|
The code that uses this logic uses transactions in some contexts and not others. "Jumping on the ongoing transaction" is synonymous with "not using a transaction" as you described. So you need to rephrase: What's wrong with the logic:
Remember, the calling code might not be aware that it's inside a transaction or not. So you might be debugging without a transaction but running inside a transaction. |
|
A transaction is a contract with the calling function "I will execute everything correctly or nothing." This logic breaks that contract, effectively, annulling the function of transactions. |
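The all-or-nothing contract can be seen with a real database engine. The sketch below uses Python's built-in `sqlite3` module purely because it runs in-process; it is a substitute for illustration and not the MySQL/pymysql setup discussed in this thread. The table name `t` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT PRIMARY KEY)")
conn.commit()

try:
    # Used as a context manager, the connection commits on success
    # and rolls back if the block raises.
    with conn:
        conn.execute("INSERT INTO t VALUES ('A')")
        conn.execute("INSERT INTO t VALUES ('A')")  # violates the primary key
except sqlite3.IntegrityError:
    pass

rows = conn.execute("SELECT name FROM t").fetchall()
print(rows)  # []
```

The first insert succeeded on its own, but because the block failed as a whole, nothing was stored.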
|
Example 6 above illustrates one such problem that would require extensive knowledge of how datajoint works in order to debug. Instead, we should simply not allow code with transactions to be called from |
conn_info since pymysql does not fetch that info from the server.