Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: fix AsyncWorker potential OOM problem #3258

Merged
merged 13 commits into from
Dec 21, 2020

Conversation

selfishlover
Copy link
Contributor

Ⅰ. Describe what this PR did

Fix AsyncWorker may cause OOM when TC continuously send async commit message to RM.

Ⅱ. Does this pull request fix one issue?

fixes #3240

Ⅳ. Describe how to verify it

use method descripted in the issue.

@codecov-io
Copy link

codecov-io commented Nov 5, 2020

Codecov Report

Merging #3258 (c63fe60) into develop (de56e17) will increase coverage by 0.11%.
The diff coverage is 46.66%.

Impacted file tree graph

@@              Coverage Diff              @@
##             develop    #3258      +/-   ##
=============================================
+ Coverage      51.44%   51.56%   +0.11%     
- Complexity      3327     3329       +2     
=============================================
  Files            616      616              
  Lines          20206    20192      -14     
  Branches        2536     2529       -7     
=============================================
+ Hits           10395    10412      +17     
+ Misses          8766     8736      -30     
+ Partials        1045     1044       -1     
Impacted Files Coverage Δ Complexity Δ
...java/io/seata/rm/datasource/DataSourceManager.java 24.39% <28.57%> (-10.31%) 4.00 <1.00> (-3.00)
.../main/java/io/seata/rm/datasource/AsyncWorker.java 47.29% <48.52%> (+32.29%) 9.00 <8.00> (+4.00)
...in/java/io/seata/server/session/GlobalSession.java 84.09% <0.00%> (+0.45%) 72.00% <0.00%> (+1.00%)

@funky-eyes funky-eyes added the type: bug Category issues or prs related to bug. label Nov 5, 2020
@funky-eyes funky-eyes added this to the 1.5.0 milestone Nov 5, 2020
@funky-eyes funky-eyes added the module/rm-datasource rm-datasource module label Nov 5, 2020
Copy link
Contributor

@ls9527 ls9527 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, 给我的话,做法是达到指定循环次数就跳出循环。 目前你的做法更棒,甘拜下风。

修改的内容如下:
配置中心的缓冲区配置,虽然不知道有没有用, 配上总是没错的。
用drainTo的方式+toMap提前确认数据源需要操作的数据。
Lists.partition 的方式逻辑分割数据。

DataSourceManager resourceManager = (DataSourceManager) DefaultResourceManager.get().getResourceManager(BranchType.AT);
DataSourceProxy dataSourceProxy = resourceManager.get(resourceId);
if (dataSourceProxy == null) {
LOGGER.warn("Failed to find resource for {}", resourceId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里是不是应该直接抛出异常?而不是打了日志之后return?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里应该不能抛出异常, 抛出异常的话for循环处理就中断了。
在多数据源的场景下,第一个数据源找不到影响第二个数据源无法提交也不合理。
是不是搞个大catch,让这个dealWithGroupedContexts不抛出异常比较好?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

大而全地catch的话,不好区分异常发生的地方(以便针对性地输出日志)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

分组函数有点丑

Copy link
Member

@xingfudeshi xingfudeshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@jsbxyyx jsbxyyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@funky-eyes funky-eyes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@funky-eyes funky-eyes merged commit 80ea72a into apache:develop Dec 21, 2020
@selfishlover selfishlover deleted the bugfix-AsyncWorker-OOM branch December 21, 2020 02:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module/rm-datasource rm-datasource module type: bug Category issues or prs related to bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Asyncworker in queue.poll will OOM
8 participants