[ISSUE #2732] Fix message loss problem when rebalance with LitePullConsumer #2832

Merged
merged 2 commits into from
Apr 27, 2021

areyouok
Contributor

Make sure to set the target branch to develop

What is the purpose of the change

See issue #2732.

This commit helps reproduce the bug: areyouok@a54599e

Brief changelog

  1. The method updatePullOffset should check that the processQueue is still valid.
  2. If the processQueue is invalid, skip the pull task.
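
The second changelog item can be sketched in isolation as follows. This is a minimal model under stated assumptions, not the RocketMQ source: `SketchProcessQueue`, `SketchPullTask`, and the `dropped` flag are hypothetical stand-ins, and the sketch assumes that rebalance marks a process queue as dropped when its message queue is reassigned.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a process queue; rebalance is assumed to mark it dropped.
class SketchProcessQueue {
    private volatile boolean dropped = false;

    boolean isDropped() {
        return dropped;
    }

    void setDropped(boolean dropped) {
        this.dropped = dropped;
    }
}

// Hypothetical pull task bound to the process queue that was current when it was scheduled.
class SketchPullTask {
    private final SketchProcessQueue processQueue;
    private final AtomicLong pullOffset;

    SketchPullTask(SketchProcessQueue processQueue, AtomicLong pullOffset) {
        this.processQueue = processQueue;
        this.pullOffset = pullOffset;
    }

    // Returns true if the offset was advanced, false if the task was skipped.
    boolean runOnce(long nextOffset) {
        if (processQueue.isDropped()) {
            // The queue was reassigned: a stale task must not move the offset.
            return false;
        }
        pullOffset.set(nextOffset);
        return true;
    }
}

class StalePullTaskDemo {
    public static void main(String[] args) {
        AtomicLong offset = new AtomicLong(100);
        SketchProcessQueue pq = new SketchProcessQueue();
        SketchPullTask task = new SketchPullTask(pq, offset);

        System.out.println(task.runOnce(110) + " " + offset.get()); // true 110
        pq.setDropped(true); // simulate rebalance taking the queue away
        System.out.println(task.runOnce(120) + " " + offset.get()); // false 110
    }
}
```

In this model the stale task leaves the offset at 110, so messages between 110 and 120 are not skipped by whichever task pulls next.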

Verifying this change

Follow this checklist to help us incorporate your contribution quickly and easily. Note that it would be helpful if you could complete the following checklist items (the last one is optional) before requesting a community review of your PR.

  • Make sure there is a Github issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a Github issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Format the pull request title like [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write the necessary unit tests (over 80% coverage) to verify your logic correction; prefer mocks where cross-module dependencies exist. If a new feature or significant change is committed, please remember to add integration tests in the test module.
  • Run mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle to make sure basic checks pass. Run mvn clean install -DskipITs to make sure unit tests pass. Run mvn clean test-compile failsafe:integration-test to make sure integration tests pass.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@coveralls

coveralls commented Apr 22, 2021

Coverage decreased (-0.08%) to 51.801% when pulling dee9230 on areyouok:fix_litepull into bc4ecb3 on apache:develop.

@RongtongJin RongtongJin self-requested a review April 23, 2021 02:15
@panzhi33
Contributor

Is the range of this lock a bit large? If a consumer allocates more queues, will many pull tasks get stuck in updatePullOffset?

@@ -83,10 +83,15 @@ public long getPullOffset(MessageQueue messageQueue) {
         return -1;
     }

-    public void updatePullOffset(MessageQueue messageQueue, long offset) {
+    public void updatePullOffset(MessageQueue messageQueue, long offset, ProcessQueue processQueue) {
         MessageQueueState messageQueueState = assignedMessageQueueState.get(messageQueue);
Contributor

If this returns null, it may throw an NPE.

Contributor Author

fixed

Comment on lines 88 to 95
        synchronized (this.assignedMessageQueueState) {
            if (messageQueueState.getProcessQueue() != processQueue) {
                return;
            }
            if (messageQueueState != null) {
                messageQueueState.setPullOffset(offset);
            }
        }
Contributor

Synchronized here seems to be unnecessary

Member

It seems that placing the synchronized block before assignedMessageQueueState.get would be better.

Contributor Author

It seems the synchronized lock can be removed.
I have updated this branch.
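
A minimal sketch of the shape this review thread converges on: look up the state, return early if it is missing (avoiding the NPE raised above), compare the caller's process queue with the current one by identity, and only then set the offset, without a `synchronized` block. This is an illustration under assumptions, not the merged code: `SketchAssignedQueues`, `SketchState`, the `String` queue keys, and the boolean return value are all hypothetical, and the sketch assumes a `ConcurrentHashMap` plus a `volatile` offset field are enough for visibility.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-queue state: the current process queue and its pull offset.
class SketchState {
    final Object processQueue;      // compared by identity, never by equals()
    volatile long pullOffset = -1;

    SketchState(Object processQueue) {
        this.processQueue = processQueue;
    }
}

// Hypothetical stand-in for the assigned-queue bookkeeping discussed in the review.
class SketchAssignedQueues {
    private final ConcurrentHashMap<String, SketchState> states = new ConcurrentHashMap<>();

    // Simulates rebalance assigning (or reassigning) a queue with a fresh process queue.
    void assign(String queue, Object processQueue) {
        states.put(queue, new SketchState(processQueue));
    }

    long getPullOffset(String queue) {
        SketchState state = states.get(queue);
        return state == null ? -1 : state.pullOffset;
    }

    // Returns true if the offset was updated, false if the update was ignored.
    boolean updatePullOffset(String queue, long offset, Object processQueue) {
        SketchState state = states.get(queue);
        if (state == null) {
            return false;           // queue no longer assigned: null check comes first
        }
        if (state.processQueue != processQueue) {
            return false;           // caller holds a stale process queue
        }
        state.pullOffset = offset;
        return true;
    }
}

class NullSafeUpdateDemo {
    public static void main(String[] args) {
        SketchAssignedQueues queues = new SketchAssignedQueues();
        Object oldPq = new Object();
        queues.assign("queue-0", oldPq);
        System.out.println(queues.updatePullOffset("queue-0", 5, oldPq));  // true

        queues.assign("queue-0", new Object());                            // rebalance
        System.out.println(queues.updatePullOffset("queue-0", 9, oldPq));  // false
        System.out.println(queues.updatePullOffset("missing", 1, oldPq));  // false
    }
}
```

The check and the write are not atomic here; whether that remaining window matters depends on how rebalance replaces the state, which this sketch does not model.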

@codecov-commenter

Codecov Report

Merging #2832 (dee9230) into develop (bc4ecb3) will increase coverage by 0.06%.
The diff coverage is 0.00%.

@@              Coverage Diff              @@
##             develop    #2832      +/-   ##
=============================================
+ Coverage      46.52%   46.59%   +0.06%     
- Complexity      3426     3427       +1     
=============================================
  Files            307      307              
  Lines          29059    29065       +6     
  Branches        4172     4175       +3     
=============================================
+ Hits           13520    13543      +23     
+ Misses         13697    13674      -23     
- Partials        1842     1848       +6     

| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...tmq/client/impl/consumer/AssignedMessageQueue.java | 61.66% <0.00%> (-1.05%) | 21.00 <0.00> (ø) |
| ...ent/impl/consumer/DefaultLitePullConsumerImpl.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (ø) |
| ...nt/impl/consumer/ConsumeMessageOrderlyService.java | 38.98% <0.00%> (-2.53%) | 16.00% <0.00%> (-3.00%) |
| ...a/org/apache/rocketmq/store/StoreStatsService.java | 29.50% <0.00%> (-1.32%) | 26.00% <0.00%> (-2.00%) |
| ...che/rocketmq/namesrv/kvconfig/KVConfigManager.java | 59.18% <0.00%> (-1.03%) | 11.00% <0.00%> (-1.00%) |
| ...he/rocketmq/client/impl/consumer/ProcessQueue.java | 57.67% <0.00%> (-0.94%) | 31.00% <0.00%> (ø%) |
| ...main/java/org/apache/rocketmq/store/CommitLog.java | 66.32% <0.00%> (-0.21%) | 79.00% <0.00%> (ø%) |
| ...ketmq/client/impl/consumer/PullMessageService.java | 75.55% <0.00%> (ø) | 9.00% <0.00%> (ø%) |
| ...org/apache/rocketmq/store/DefaultMessageStore.java | 55.31% <0.00%> (+0.19%) | 109.00% <0.00%> (ø%) |
| .../apache/rocketmq/logging/inner/LoggingBuilder.java | 64.08% <0.00%> (+0.31%) | 3.00% <0.00%> (ø%) |

... and 7 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update bc4ecb3...dee9230.

@vongosling vongosling added this to the 4.9.0 milestone Apr 26, 2021
@vongosling vongosling linked an issue Apr 26, 2021 that may be closed by this pull request
@RongtongJin RongtongJin merged commit f77d7fe into apache:develop Apr 27, 2021
vongosling added a commit that referenced this pull request May 19, 2021
* [ISSUE #1233] Fix CVE-2011-1473

* fix Multiple instances in the same application share MQClientInstance

* [ISSUE #2748] Fix deleteSubscriptionGroup not remove consumer offset

* [ISSUE #2745] Changed the support time of the request/reply feature to 4.6.0.

Co-authored-by: von gosling <vongosling@apache.org>

* [ISSUE #2729] Replace with Math.min method call

* [ISSUE #2801]Fix NamesrvAddr connot set in Producer

* [ISSUE 2800] optimize: the spelling of topicSynFlag

Co-authored-by: ph3636 <tianxingguang@kanzhun.com>

* [ISSUE #2803] Fix the endpoint cannot get instanceId without http (#2804)

* fix the endpoint cannot get instanceId without http

* fix the endpoint cannot get instanceId without http

* add unit test

* add unit test

* add unit test

Co-authored-by: panzhi33 <wb-pz502261@alibaba-inc.com>

* fix messageArrivingListener NPE

* [ISSUE #2538]Optimize log output when message trace saving fails

* [ISSUE #2811] Fix the wrong topic was consumed in the DefaultMessageStoreTest test program

* [ISSUE #2821] Overriding the ServiceThread#shutdown in HAClient class

* [ISSUE #2805] remove redundant package imports

* [ISSUE #2833] Support trace for TranscationProducer (#2834)

* [ISSUE #2732] Fix message loss problem when rebalance with LitePullConsumer (#2832)

* [ISSUE #2732] Fix message loss problem when rebalance with LitePullConsumer

* Fix message loss problem when rebalance with LitePullConsumer, update 2

* [ISSUE #2846]fix -E might not port to other systems

* fix some nonconformity after checkstyle

* Support OpenTracing(#2861)

* [ISSUE #2872] remove log files created by integration test when mvn clean

* [ISSUE #2872] move log files created by integration test to target dir

* Change log level to debug: "Half offset {} has been committed/rolled back"

* Fix unit test stability

Bump mockito-core to 3.10.0, remove powermock dependency, suppress useless logging

* [ISSUE #2898] Resolve rocketmq-example project failed during checkstyle execution (#2899)

Co-authored-by: SSpirits <shadowyspirits@outlook.com>
Co-authored-by: panzhi33 <wb-pz502261@alibaba-inc.com>
Co-authored-by: panzhi <panzhi33@qq.com>
Co-authored-by: ArronHuang <41609451+ArronHuang@users.noreply.github.com>
Co-authored-by: von gosling <vongosling@apache.org>
Co-authored-by: drgnchan <40224023+drgnchan@users.noreply.github.com>
Co-authored-by: zhangjidi2016 <zhangjidi@cmss.chinamobile.com>
Co-authored-by: ph3636 <38041490+ph3636@users.noreply.github.com>
Co-authored-by: ph3636 <tianxingguang@kanzhun.com>
Co-authored-by: BurningCN <1015773611@qq.com>
Co-authored-by: francis lee <francislee.cn@outlook.com>
Co-authored-by: 灼华 <43363120+BurningCN@users.noreply.github.com>
Co-authored-by: yuz10 <845238369@qq.com>
Co-authored-by: huangli <areyouok@gmail.com>
Co-authored-by: chenrl <raymond2366@outlook.com>
Co-authored-by: ayanamist <ayanamist@gmail.com>
Co-authored-by: zhangjidi2016 <1017543663@qq.com>
GenerousMan pushed a commit to GenerousMan/rocketmq that referenced this pull request Aug 12, 2022
…PullConsumer (apache#2832)

* [ISSUE apache#2732] Fix message loss problem when rebalance with LitePullConsumer

* Fix message loss problem when rebalance with LitePullConsumer, update 2
pulllock pushed a commit to pulllock/rocketmq that referenced this pull request Oct 19, 2023
…PullConsumer (apache#2832)

* [ISSUE apache#2732] Fix message loss problem when rebalance with LitePullConsumer

* Fix message loss problem when rebalance with LitePullConsumer, update 2
Development

Successfully merging this pull request may close these issues.

possbile LitePullConusmer rebalance bug
7 participants