Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: fixed a bug that after branch deletion, there are still remaining branch lock #3056

Merged
merged 19 commits into from
Aug 24, 2020

Conversation

li469791221
Copy link
Contributor

Ⅰ. Describe what this PR did

修改一个bug, 当二阶段回滚的时候,删除了分支事务之后,准备删除分支事务锁的时候突然由于某些原因造成删除锁失败。那么这个锁数据将会一直存在,造成之后的事务无法获取此锁。
fix a bug, when the branch phase two rollbacked, and the branch info had been delete from the branch_table. But
then delete the branch lock from the table of lock_table, an error occurred which cause the branch lock delete failed.
Then the branch lock will be always in the table, and cause new transactions could not get this lock.

Ⅱ. Does this pull request fix one issue?

fixes #2976

Ⅲ. Why don't you add test cases (unit test/integration test)?

This problem occurs only in exceptional cases, test cases are difficult to write
此问题在某些特殊情况下才会发生,用例不好写。

Ⅳ. Describe how to verify it

My test is to manually throw an exception directly in the unlock and delete branch transaction table。
我的测试方式是在解锁和删除分支事务表直接手动抛出异常。
branchSession.unlock();
// throw an exception
for (SessionLifecycleListener lifecycleListener : lifecycleListeners) {
lifecycleListener.onRemoveBranch(this, branchSession);
}

Ⅴ. Special notes for reviews

问题发生的原因就是删除了分支事务表,但是锁没有删掉。如果两个都还在,那么重试的时候会通过分支事务数据删除该锁。所以问题的关键在于锁需要先于分支事务删除。
因为GlobalSession.removeBranch()仅发生在二阶段相应的分支已经完成seata-server的回滚和响应请求之后。也就是说此时RM的相应操作已经完成,比如删除undoLog,或者设置undoLog状态为完成,异常等等。所以RM的任务已经完成了,此时已经可以删除锁了。所以即使在删除锁后分支事务表还在,也没有事情,当他发起重试的时候,RM已经完成了操作,不会重复再次执行一篇。

@codecov-commenter
Copy link

codecov-commenter commented Aug 24, 2020

Codecov Report

Merging #3056 into develop will decrease coverage by 0.01%.
The diff coverage is 0.00%.

Impacted file tree graph

@@              Coverage Diff              @@
##             develop    #3056      +/-   ##
=============================================
- Coverage      50.38%   50.37%   -0.02%     
+ Complexity      3090     3088       -2     
=============================================
  Files            600      600              
  Lines          19518    19519       +1     
  Branches        2410     2411       +1     
=============================================
- Hits            9835     9832       -3     
- Misses          8697     8698       +1     
- Partials         986      989       +3     
Impacted Files Coverage Δ Complexity Δ
...in/java/io/seata/server/session/GlobalSession.java 82.43% <0.00%> (-1.28%) 70.00 <2.00> (-1.00)
...o/seata/server/coordinator/DefaultCoordinator.java 54.63% <0.00%> (-0.52%) 28.00% <0.00%> (-1.00%)

@l81893521 l81893521 changed the base branch from master to develop August 24, 2020 08:00
@funky-eyes
Copy link
Contributor

首先那个issue描述的不是pr中处理方式的问题。
其次,lifecycleListener.onRemoveBranch(this, branchSession); 会出现异常吗?
希望可以提供个某些场景下会出现删除分支事务不删除锁的描述

@@ -246,10 +246,10 @@ public void addBranch(BranchSession branchSession) throws TransactionException {

@Override
public void removeBranch(BranchSession branchSession) throws TransactionException {
branchSession.unlock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is false, please throw exception, otherwise the problem will still exist.

@li469791221
Copy link
Contributor Author

首先那个issue描述的不是pr中处理方式的问题。
其次,lifecycleListener.onRemoveBranch(this, branchSession); 会出现异常吗?
希望可以提供个某些场景下会出现删除分支事务不删除锁的描述

以数据库存储方式为例,lifecycleListener.onRemoveBranch(this, branchSession)这里是删除分支事务的逻辑,之后进行branchSession.unlock()解锁的时候,如果出现数据库网络波动,或者sever挂了等情况,就会出现删除分支事务不删除锁的情况。

@funky-eyes
Copy link
Contributor

首先那个issue描述的不是pr中处理方式的问题。
其次,lifecycleListener.onRemoveBranch(this, branchSession); 会出现异常吗?
希望可以提供个某些场景下会出现删除分支事务不删除锁的描述

以数据库存储方式为例,lifecycleListener.onRemoveBranch(this, branchSession)这里是删除分支事务的逻辑,之后进行branchSession.unlock()解锁的时候,如果出现数据库网络波动,或者sever挂了等情况,就会出现删除分支事务不删除锁的情况。

你说的没错,修改一下标题以bugfix: 开头,内容描述你pr改动的原因,简短易懂一些

@li469791221 li469791221 changed the title fix issue #2976 bugfix: #2976 修改全局事务回滚删除分支锁的时候,由于网络等原因造成锁未删除的问题 Aug 24, 2020
@li469791221
Copy link
Contributor Author

首先那个issue描述的不是pr中处理方式的问题。
其次,lifecycleListener.onRemoveBranch(this, branchSession); 会出现异常吗?
希望可以提供个某些场景下会出现删除分支事务不删除锁的描述

以数据库存储方式为例,lifecycleListener.onRemoveBranch(this, branchSession)这里是删除分支事务的逻辑,之后进行branchSession.unlock()解锁的时候,如果出现数据库网络波动,或者sever挂了等情况,就会出现删除分支事务不删除锁的情况。

你说的没错,修改一下标题以bugfix: 开头,内容描述你pr改动的原因,简短易懂一些

好的,改了。

@wangliang181230
Copy link
Contributor

wangliang181230 commented Aug 24, 2020

标题改成英文吧。
另外,我提的问题也修复一下吧。

@funky-eyes
Copy link
Contributor

funky-eyes commented Aug 24, 2020

#2976 不需要放在标题,内容里有就可以里。
标题要英文,大概意思就是,修复分支删除后锁还残留的bug,自己翻译翻译看看还能不能更通俗易懂一些

@li469791221 li469791221 changed the title bugfix: #2976 修改全局事务回滚删除分支锁的时候,由于网络等原因造成锁未删除的问题 bugfix: #2976 Fixed a bug that after branch deletion, there are still remaining branch lock Aug 24, 2020
@funky-eyes funky-eyes changed the title bugfix: #2976 Fixed a bug that after branch deletion, there are still remaining branch lock bugfix: fixed a bug that after branch deletion, there are still remaining branch lock Aug 24, 2020
@funky-eyes funky-eyes added module/server server module first-time contributor first-time contributor labels Aug 24, 2020
@funky-eyes funky-eyes added this to the 1.4.0 milestone Aug 24, 2020
@li469791221
Copy link
Contributor Author

我提的问题也修复一下吧。

好的,加了

Copy link
Contributor

@wangliang181230 wangliang181230 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@funky-eyes funky-eyes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -246,10 +246,12 @@ public void addBranch(BranchSession branchSession) throws TransactionException {

@Override
public void removeBranch(BranchSession branchSession) throws TransactionException {
if (!branchSession.unlock()) {
throw new RuntimeException("Unlock branch lock failed!");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw TransactionException, not RuntimeException, otherwise, some forEach will break.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@l81893521 l81893521 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for @wangliang181230

@l81893521 l81893521 merged commit 7b1fcac into apache:develop Aug 24, 2020
Copy link
Member

@slievrly slievrly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

l81893521 pushed a commit to l81893521/seata that referenced this pull request Oct 22, 2020
hicf pushed a commit to hicf/seata that referenced this pull request Nov 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
first-time contributor first-time contributor module/server server module
Projects
None yet
7 participants