Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove failed stale producer from the connection #4741

Merged
merged 1 commit into from
Jul 18, 2019

Conversation

rdhabalia
Copy link
Contributor

Motivation

As described in #4138, broker tries to clean up stale failed-producer from the connection however, while cleaning up producer-future, it tries to remove newly created producer-future rather old-failed producer because of that broker still gives below error

17:22:00.700 [pulsar-io-21-26] WARN  org.apache.pulsar.broker.service.ServerCnx - [/1.1.1.1:1111][453] Producer with id persistent://prop/cluster/ns/topic is already present on the connection

@rdhabalia rdhabalia added this to the 2.5.0 milestone Jul 17, 2019
@rdhabalia rdhabalia self-assigned this Jul 17, 2019
@rdhabalia
Copy link
Contributor Author

rerun java8 tests

@sijie
Copy link
Member

sijie commented Jul 17, 2019

retest this please

@merlimat
Copy link
Contributor

@rdhabalia The check for the same future on delete was there to prevent deleting the wrong future, eg. after a timeout from client that resend the create-producer command. The tests were actually simulating this condition.

@rdhabalia
Copy link
Contributor Author

rdhabalia commented Jul 17, 2019

@merlimat
No, actually check was added as part of #4145 and #4138 to fix the same issue (Producer/Consumer is already connected) but that PR was trying to remove wrong producer/consumer which is not present into the cache so, underlying issue was not fixed.

I modified the test because test was wrong and having exact same issue which we are trying to fix: test-sequence was

  1. try to create consumer (which will not be created immediately because test has delayed topic creation logic)
  2. close the consumer: which fails the pending-consumer future and doesn't clean up from the cache.
  3. so, any subsequent consumer request always receives error: Consumer with id is already connected. Test was not checking failure reason of consumer : consumer is already connected else we could have caught it long time back.
    Because broker was never cleaning failed future and because of the consumer will never be able to connect.

Same thing was happening for producers and we are keep seeing this issue when broker unloads the bundle to different broker and same bundle again comes back to the old broker. the entire issue was described into #4138. So, we tried to fix in #4138 but tried to remove producer-future for current request which is definitely not in the cache and therefore, client keeps seeing error Producer is already connected and we have to manually restart the broker to fix this issue.

@rdhabalia
Copy link
Contributor Author

rerun integration tests

@sijie
Copy link
Member

sijie commented Jul 18, 2019

@rdhabalia you might need to rebase to latest master to fix the integration tests issues.

remove failed consumer from cnx

fix test
@rdhabalia
Copy link
Contributor Author

retest this please

@rdhabalia rdhabalia merged commit bbac857 into apache:master Jul 18, 2019
@rdhabalia rdhabalia deleted the srv_prod_cnx branch July 18, 2019 23:01
easyfan pushed a commit to easyfan/pulsar that referenced this pull request Jul 26, 2019
rdhabalia added a commit to rdhabalia/pulsar that referenced this pull request Aug 8, 2019
@wolfstudy
Copy link
Member

Change the Milestone to 2.4.2, because of conflict.

@wolfstudy wolfstudy modified the milestones: 2.5.0, 2.4.2 Nov 19, 2019
wolfstudy pushed a commit that referenced this pull request Nov 20, 2019
remove failed consumer from cnx

fix test

(cherry picked from commit bbac857)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants