client: `Txn.CleanupOnError` logs if transaction is already aborted #10490

bdarnell · 2016-11-06T09:24:11Z

Txn.CleanupOnError doesn't return an error, but it calls log.Errorf if an error occurs during rollback, which will happen if the transaction is already known to be aborted (this error comes from the client package; no request is sent to the server in this case).

We should at least handle the case of an already-aborted transaction silently, and for other kinds of errors log.Error is probably not the right way to handle it - this should probably go into the trace and not the server's log.

The text was updated successfully, but these errors were encountered:

dianasaur323 · 2017-04-05T15:03:26Z

@bdarnell Assigning you on this, but please feel free to reassign!

a-robinson · 2017-04-24T13:55:54Z

I don't fully understand the interaction between different errors here. It doesn't look possible to get a proper TransactionAbortedError from the rollback code within txn.CleanupOnError. Are the "already-aborted" transactions you're talking about hitting the "does not exist" errors that I see on some of our production nodes?

a-robinson · 2017-04-24T16:29:26Z

@bdarnell ^

bdarnell · 2017-04-24T17:41:30Z

The error that prompted this issue was

internal/client/txn.go:602: attempting to use transaction with wrong status or finalized: ABORTED; abort caused by: retry txn "sql/executor.go:464 sql txn" id=b31485b7 key=/Table/51/1/159918331268857857/0 rw=true pri=0.02860422 iso=SERIALIZABLE stat=PENDING epo=1 ts=1468873601.672242849,1 orig=1468873601.672242849,1 max=1468873601.909616479,0 wto=false rop=false

This might be some sort of concurrency problem because we check for Status == PENDING in CleanupOnError. It may also no longer be an issue any more because there have been other changes to client.Txn (including Nathan's work for concurrency).

This is mainly a log spam issue - if it's happening a lot, we should probably change it from log.Errorf to log.Eventf. If it's no longer happening, we can leave it alone and close this issue. Which prod cluster were you seeing this on?

a-robinson · 2017-04-24T17:51:06Z

Hm, I didn't see any of those.

On blue, I only saw two rollback errors, caused by deadlines being exceeded.
On cobalt, there weren't any rollback errors.
On adriatic, there were hundreds of rollback errors caused by the transaction not existing.
On indigo, there are hundreds of "does not exist" errors on 2 of the 9 nodes.

Although I decided to check cyan now as well, and it does have a single instance of the transaction already being aborted. It has 5-10 "already committed" errors per node, though.

None of this seems particularly bad to me. If anything, I'd say the error really worth quieting down a bit is the "does not exist" error, since it appears that the draining process before shutdown can lead to a lot of them.

petermattis added this to the 1.0 milestone Feb 23, 2017

dianasaur323 assigned bdarnell Apr 5, 2017

bdarnell assigned a-robinson and unassigned bdarnell Apr 18, 2017

a-robinson mentioned this issue Apr 25, 2017

client: Don't log frequent, expected abort errors #15326

Merged

a-robinson closed this as completed in #15326 Apr 25, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

client: `Txn.CleanupOnError` logs if transaction is already aborted #10490

client: `Txn.CleanupOnError` logs if transaction is already aborted #10490

bdarnell commented Nov 6, 2016

dianasaur323 commented Apr 5, 2017

a-robinson commented Apr 24, 2017

a-robinson commented Apr 24, 2017

bdarnell commented Apr 24, 2017

a-robinson commented Apr 24, 2017

client: Txn.CleanupOnError logs if transaction is already aborted #10490

client: Txn.CleanupOnError logs if transaction is already aborted #10490

Comments

bdarnell commented Nov 6, 2016

dianasaur323 commented Apr 5, 2017

a-robinson commented Apr 24, 2017

a-robinson commented Apr 24, 2017

bdarnell commented Apr 24, 2017

a-robinson commented Apr 24, 2017

client: `Txn.CleanupOnError` logs if transaction is already aborted #10490

client: `Txn.CleanupOnError` logs if transaction is already aborted #10490