bugfix: fix rollback transaction metrics are inaccurate #4662
Codecov Report
@@ Coverage Diff @@
## develop #4662 +/- ##
=============================================
+ Coverage 48.43% 48.61% +0.17%
- Complexity 4035 4054 +19
=============================================
Files 735 735
Lines 25614 25616 +2
Branches 3162 3163 +1
=============================================
+ Hits 12406 12452 +46
+ Misses 11869 11826 -43
+ Partials 1339 1338 -1
@@ -135,6 +135,7 @@ public static void endCommitted(GlobalSession globalSession, boolean retryGlobal
beginTime, retryBranch);
} else {
MetricsPublisher.postSessionDoneEvent(globalSession, false, false);
globalSession.changeGlobalStatus(GlobalStatus.WaitingCommittedFinished); |
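There are two reasons no status was added here before: 1. locks and branches may be left behind; 2. it saves a second round of network I/O plus disk I/O (changing the status to committed, then deleting), which makes phase-two commit/rollback faster. Adding a new status now gives up the benefit of point 2. We need to consider whether there is another way to both record the outcome and handle this case.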
…metrics # Conflicts: # changes/en-us/develop.md # changes/zh-cn/develop.md
@tuwenlin
LGTM
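When a transaction rolls back, cleanup is currently deferred for 2 minutes to prevent leftover branches and locks. During that window the transaction status stays at Rollbacking even though the rollback has already succeeded, which looks odd in the console.
In addition, Prometheus does not count rollbacked transactions or their TPS.
Solution: add two new statuses. WaitingCommittedFinished means the transaction has committed successfully and is waiting for cleanup 2 minutes later; it is an intermediate state between Committed and AfterCommitted, indicating that the commit succeeded and wrap-up work is in progress. WaitingRollbackedFinished is the counterpart: an intermediate state between Rollbacked and AfterRollbacked, indicating that the rollback succeeded and wrap-up work is in progress.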
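To make the proposed lifecycle concrete, here is a minimal sketch of how the intermediate statuses could let the outcome be counted as soon as phase two succeeds, with the "after" event counted once the deferred cleanup runs. Everything in it is an assumption for illustration: `SessionStatusSketch`, its `Status` enum, the metric names, and the choice of terminal status after cleanup are hypothetical and do not reproduce Seata's `GlobalStatus` or metrics classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; the names only mirror the PR description
// and are not Seata's actual GlobalStatus or metrics classes.
class SessionStatusSketch {

    enum Status {
        Committing, WaitingCommittedFinished, Committed,
        Rollbacking, WaitingRollbackedFinished, Rollbacked
    }

    // Stand-in for a metrics registry: metric name -> number of events.
    private final Map<String, Integer> counters = new HashMap<>();

    // Phase two has succeeded but branch/lock cleanup is deferred:
    // record the outcome now and move to the intermediate status.
    Status phaseTwoSucceeded(Status current) {
        switch (current) {
            case Committing:
                counters.merge("committed", 1, Integer::sum);
                return Status.WaitingCommittedFinished;
            case Rollbacking:
                counters.merge("rollbacked", 1, Integer::sum);
                return Status.WaitingRollbackedFinished;
            default:
                throw new IllegalStateException("unexpected status: " + current);
        }
    }

    // Deferred cleanup has run: record the "after" event and move to the
    // terminal status (the terminal status chosen here is an assumption).
    Status cleanupFinished(Status current) {
        switch (current) {
            case WaitingCommittedFinished:
                counters.merge("afterCommitted", 1, Integer::sum);
                return Status.Committed;
            case WaitingRollbackedFinished:
                counters.merge("afterRollbacked", 1, Integer::sum);
                return Status.Rollbacked;
            default:
                throw new IllegalStateException("unexpected status: " + current);
        }
    }

    int count(String metric) {
        return counters.getOrDefault(metric, 0);
    }

    public static void main(String[] args) {
        SessionStatusSketch sketch = new SessionStatusSketch();
        Status s = sketch.phaseTwoSucceeded(Status.Rollbacking);
        System.out.println(s + " rollbacked=" + sketch.count("rollbacked"));
        s = sketch.cleanupFinished(s);
        System.out.println(s + " afterRollbacked=" + sketch.count("afterRollbacked"));
    }
}
```

In this sketch the rollbacked counter increases before cleanup happens, which is the behavior the test results in this description are meant to demonstrate; the cost of the extra status change raised in the review comment is not modeled.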
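The test thread group is as follows:
1. Scenario 1: two microservices, both performing inserts, testing rollback in AT mode:
Before the change, the rollbacked transaction count in Prometheus was always 0:
Before the change, the rollbacked TPS in Prometheus was always 0:
Before the change, the afterRollbacked count in Prometheus was normal:
After the change, the rollbacked transaction count in Prometheus is normal:
After the change, the rollbacked TPS in Prometheus is normal and roughly matches JMeter:
After the change, the timestamps show that 2 minutes after the test started, the afterRollbacked count in Prometheus is normal:
2. Scenario 2: commit in TCC mode. Successfully committed transactions show no difference before and after the change, so the results are not shown in detail.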