Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: fix the isolation problem when rollback in AT mode #2918

Merged
merged 9 commits into from
Aug 4, 2020

Conversation

yangxb2010000
Copy link
Contributor

Ⅰ. Describe what this PR did

执行回滚阶段检查数据是否被其他程序修改过时,为查询数据sql添加排他锁。防止数据被查询出来之后,执行undosql之前再被其他程序修改时可能导致脏数据的问题。

Ⅱ. Does this pull request fix one issue?

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

@@ -68,7 +68,7 @@
* template of check sql
* TODO support multiple primary key
*/
private static final String CHECK_SQL_TEMPLATE = "SELECT * FROM %s WHERE %s";
private static final String CHECK_SQL_TEMPLATE = "SELECT * FROM %s FOR UPDATE WHERE %s";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for update It should be at the end of the statement

@funky-eyes funky-eyes added the first-time contributor first-time contributor label Jul 22, 2020
@yangxb2010000 yangxb2010000 changed the title bugfix: AT模式下回滚阶段检查数据是否被修改过的逻辑存在数据库并发问题 bugfix: fix the isolation problem when rollback in AT mode Jul 22, 2020
@codecov-commenter
Copy link

codecov-commenter commented Jul 22, 2020

Codecov Report

Merging #2918 into develop will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@              Coverage Diff              @@
##             develop    #2918      +/-   ##
=============================================
- Coverage      50.22%   50.21%   -0.02%     
+ Complexity      3055     3053       -2     
=============================================
  Files            599      599              
  Lines          19465    19465              
  Branches        2396     2396              
=============================================
- Hits            9776     9774       -2     
  Misses          8706     8706              
- Partials         983      985       +2     
Impacted Files Coverage Δ Complexity Δ
...seata/rm/datasource/undo/AbstractUndoExecutor.java 66.90% <ø> (ø) 27.00 <0.00> (ø)
...o/seata/server/coordinator/DefaultCoordinator.java 54.63% <0.00%> (-0.52%) 28.00% <0.00%> (-1.00%)
...in/java/io/seata/server/session/GlobalSession.java 83.71% <0.00%> (-0.46%) 71.00% <0.00%> (-1.00%)

@l81893521
Copy link
Contributor

When undo failed. Seata will keep retry, so this time, maybe the data will always keep locked. I think this idea maybe not that good. And we should use @GlobalTransactional or @GlobalLock to keep the transaction isolation.

@yangxb2010000
Copy link
Contributor Author

When undo failed. Seata will keep retry, so this time, maybe the data will always keep locked. I think this idea maybe not that good. And we should use @GlobalTransactional or @GlobalLock to keep the transaction isolation.

可能我没描述清楚场景,我再详细描述一下:

背景:
Seata的AT模式,在执行undosql阶段会检查对应的数据是否已经被修改了(被分布式事务外的程序修改了),如果被修改了则不再执行回滚,直接返回。代码如下:

io.seata.rm.datasource.undo.AbstractUndoExecutor.class
if (IS_UNDO_DATA_VALIDATION_ENABLE && !dataValidationAndGoOn(conn)) {
  return;
}

在检查数据是否被修改过的逻辑dataValidationAndGoOn中,会先从数据库中查询、比对,如果数据没有变更则执行undosql。由于该查询操作没有加排他锁,如果在查询数据之后,执行undosql之前数据被分布式事务外的程序修改,就会导致数据被undosql覆盖了。

解决方案
查询数据库数据时添加排他锁,防止查询之后、执行undosql之前数据被修改。

@@ -68,7 +68,7 @@
* template of check sql
* TODO support multiple primary key
*/
private static final String CHECK_SQL_TEMPLATE = "SELECT * FROM %s WHERE %s";
private static final String CHECK_SQL_TEMPLATE = "SELECT * FROM %s WHERE %s FOR UPDATE";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it fails, how do release the X lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undo操作所有的sql执行是在一个事务中的,事务提交之后排它锁就会释放。

您的意思是不是这样的:

如果数据被分布式事务外的程序修改了,undo操作会抛异常,然后TC会无限重试执行undo操作(1s重试一次),从而导致TC长期持有该锁。

如果是的话,这个排它锁可能还要等TC的重试策略优化之后才能添加,比如重试时间随着重试次数的增加而不断加长,或者超时等待人工介入

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no more questions, this can be optimized according to the lock retry strategy.

@l81893521
Copy link
Contributor

When undo failed. Seata will keep retry, so this time, maybe the data will always keep locked. I think this idea maybe not that good. And we should use @GlobalTransactional or @GlobalLock to keep the transaction isolation.

可能我没描述清楚场景,我再详细描述一下:

背景:
Seata的AT模式,在执行undosql阶段会检查对应的数据是否已经被修改了(被分布式事务外的程序修改了),如果被修改了则不再执行回滚,直接返回。代码如下:

io.seata.rm.datasource.undo.AbstractUndoExecutor.class
if (IS_UNDO_DATA_VALIDATION_ENABLE && !dataValidationAndGoOn(conn)) {
  return;
}

在检查数据是否被修改过的逻辑dataValidationAndGoOn中,会先从数据库中查询、比对,如果数据没有变更则执行undosql。由于该查询操作没有加排他锁,如果在查询数据之后,执行undosql之前数据被分布式事务外的程序修改,就会导致数据被undosql覆盖了。

解决方案
查询数据库数据时添加排他锁,防止查询之后、执行undosql之前数据被修改。

Sorry, maybe I m think about the wrong side, I already know what s your mean, thanks for you contribution.

@l81893521 l81893521 added the module/rm-datasource rm-datasource module label Jul 27, 2020
@l81893521 l81893521 added this to the 1.4.0 milestone Jul 27, 2020
Copy link
Member

@slievrly slievrly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@l81893521 l81893521 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@l81893521 l81893521 merged commit d1e6355 into apache:develop Aug 4, 2020
l81893521 pushed a commit to l81893521/seata that referenced this pull request Oct 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
first-time contributor first-time contributor module/rm-datasource rm-datasource module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants