Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Broker] Fix LeaderElectionService.getCurrentLeader and add support for empheralOwner in MockZooKeeper #13066

Merged

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Dec 1, 2021

Motivation

LeaderElectionService.getCurrentLeader() always returned Optional.empty() on all other than the leader broker.
This has several severe consequences in the Pulsar Broker code base.

When a lookup is made to a topic that isn't currently loaded, the decision will be made in a distributed fashion on the follower brokers since the information about the leader broker is missing (because LeaderElectionService.getCurrentLeader() always returned Optional.empty()). This leads to races when assigning the topic ownership to a broker since the decision isn't made centrally on the leader broker.
I'll provide a separate PR for improving and fixing the topic ownership assignment. That PR is #13069 and it contains a failing test case for the topic ownership assignment / lookup problem.

Modifications

  • Add support for empheralOwner in MockZooKeeper
  • Add MultiBrokerBaseTest base test class for writing tests that involve multiple brokers
  • Add test case that fails without the fix
  • Make the fix to LeaderElectionImpl

@lhotari lhotari added type/bug The PR fixed a bug or issue reported a bug area/broker doc-not-needed Your PR changes do not impact docs labels Dec 1, 2021
@lhotari lhotari added this to the 2.10.0 milestone Dec 1, 2021
@lhotari lhotari self-assigned this Dec 1, 2021
@lhotari lhotari changed the title [Broker] Fix LeaderElectionService.getCurrentLeader and support for empheralOwner in MockZooKeeper [Broker] Fix LeaderElectionService.getCurrentLeader and add support for empheralOwner in MockZooKeeper Dec 1, 2021
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@merlimat can you please take a look ?

@lhotari my understanding is that this patch should go in 2.8 as well

@lhotari
Copy link
Member Author

lhotari commented Dec 1, 2021

my understanding is that this patch should go in 2.8 as well

@eolivelli yes, 2.8.x and 2.9.x are also impacted.

@merlimat
Copy link
Contributor

merlimat commented Dec 1, 2021

BTW: I also plan to get rid of MockZookeeper in the short term :)

@lhotari
Copy link
Member Author

lhotari commented Dec 1, 2021

BTW: I also plan to get rid of MockZookeeper in the short term :)

@merlimat +1, that would reduce the risk that it's the MockZookeeper that causes certain behavior in tests.

@merlimat
Copy link
Contributor

merlimat commented Dec 1, 2021

BTW: I also plan to get rid of MockZookeeper in the short term :)

@merlimat +1, that would reduce the risk that it's the MockZookeeper that causes certain behavior in tests.

Yes, the way it is, it doesn't help much. We've been chasing down many times issues that stemmed from it semantic being different from the real ZK in subtle ways.

For unit tests, I believe the LocalMemory is a better alternative, although we need to add the ephemeral owner there too.

@lhotari
Copy link
Member Author

lhotari commented Dec 1, 2021

Closing and reopening to get fix to flaky test for builds.

@lhotari lhotari closed this Dec 1, 2021
@lhotari lhotari reopened this Dec 1, 2021
@lhotari lhotari force-pushed the lh-fix-LeaderElectionService.getCurrentLeader branch from dad26de to e266578 Compare December 2, 2021 12:19
@lhotari lhotari force-pushed the lh-fix-LeaderElectionService.getCurrentLeader branch from e266578 to f68da4c Compare December 2, 2021 15:29
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;

public abstract class MultiBrokerBaseTest extends MockedPulsarServiceBaseTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:
this is actually not a "Test", we should not name it "**Test"

btw this is the current code style in Pulsar, so no big deal, just sharing this thought

@eolivelli
Copy link
Contributor

@linlinnn we must pick this patch to branch-2.8 before cutting a release.

2.8 is too unstable due to this problem

@lhotari lhotari merged commit 36a45ee into apache:master Dec 2, 2021
lhotari added a commit that referenced this pull request Dec 2, 2021
…or empheralOwner in MockZooKeeper (#13066)

* Add leader election unit test that uses multiple brokers

* Support empheralOwner in MockZooKeeper

* Use unique sessions for all brokers

* Add failing test case for leader broker information not available on other brokers

* Add test for reading the current leader

* Fix issue in leader election

* Address review feedback: make methods static

* Use unique-session only in MultiBrokerBaseTests

* Move tenant and namespace creation to it's own method

* Improve cleanup

* Add alwaysRun to BeforeClass

* Fix leaks in locking in MockZooKeeper

* Reduce code duplication

* Fix NPE when CreateMode is null

(cherry picked from commit 36a45ee)
@lhotari lhotari added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Dec 2, 2021
lhotari added a commit that referenced this pull request Dec 2, 2021
…or empheralOwner in MockZooKeeper (#13066)

* Add leader election unit test that uses multiple brokers

* Support empheralOwner in MockZooKeeper

* Use unique sessions for all brokers

* Add failing test case for leader broker information not available on other brokers

* Add test for reading the current leader

* Fix issue in leader election

* Address review feedback: make methods static

* Use unique-session only in MultiBrokerBaseTests

* Move tenant and namespace creation to it's own method

* Improve cleanup

* Add alwaysRun to BeforeClass

* Fix leaks in locking in MockZooKeeper

* Reduce code duplication

* Fix NPE when CreateMode is null

(cherry picked from commit 36a45ee)
@lhotari lhotari added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Dec 2, 2021
@lhotari
Copy link
Member Author

lhotari commented Dec 2, 2021

@eolivelli @linlinnn I cherry picked to branch-2.8 and branch-2.9 .

@lhotari
Copy link
Member Author

lhotari commented Dec 2, 2021

@linlinnn @eolivelli PR #13069 would also be necessary for 2.8 to fix the problems with topic ownership assignments.

lhotari added a commit to lhotari/pulsar that referenced this pull request Dec 3, 2021
…or empheralOwner in MockZooKeeper (apache#13066)

* Add leader election unit test that uses multiple brokers

* Support empheralOwner in MockZooKeeper

* Use unique sessions for all brokers

* Add failing test case for leader broker information not available on other brokers

* Add test for reading the current leader

* Fix issue in leader election

* Address review feedback: make methods static

* Use unique-session only in MultiBrokerBaseTests

* Move tenant and namespace creation to it's own method

* Improve cleanup

* Add alwaysRun to BeforeClass

* Fix leaks in locking in MockZooKeeper

* Reduce code duplication

* Fix NPE when CreateMode is null

(cherry picked from commit 36a45ee)
(cherry picked from commit 72690be)
lhotari added a commit that referenced this pull request Dec 3, 2021
lhotari added a commit to lhotari/pulsar that referenced this pull request Dec 3, 2021
- fixes cherry picking of apache#13066 in 72690be

(cherry picked from commit 59d8ad9)
BewareMyPower pushed a commit to streamnative/kop that referenced this pull request Dec 6, 2021
…951)

Fix https://github.com/streamnative/kop/runs/4425667410?check_suite_focus=true#step:5:283

#950

### Motivation

MockZookeeper API was changed in apache/pulsar#13066, the `CreateMode` shouldn't be null. We should use MetadataStore instead of mockZooKeeper since KoP already remove all zkClient usage.

### Modifications

Use MetadataStore instead of mockZooKeeper in ProducerIdManagerTest
BewareMyPower pushed a commit to streamnative/kop that referenced this pull request Dec 6, 2021
…951)

Fix https://github.com/streamnative/kop/runs/4425667410?check_suite_focus=true#step:5:283

#950

### Motivation

MockZookeeper API was changed in apache/pulsar#13066, the `CreateMode` shouldn't be null. We should use MetadataStore instead of mockZooKeeper since KoP already remove all zkClient usage.

### Modifications

Use MetadataStore instead of mockZooKeeper in ProducerIdManagerTest
BewareMyPower pushed a commit to streamnative/kop that referenced this pull request Dec 6, 2021
…951)

Fix https://github.com/streamnative/kop/runs/4425667410?check_suite_focus=true#step:5:283

#950

### Motivation

MockZookeeper API was changed in apache/pulsar#13066, the `CreateMode` shouldn't be null. We should use MetadataStore instead of mockZooKeeper since KoP already remove all zkClient usage.

### Modifications

Use MetadataStore instead of mockZooKeeper in ProducerIdManagerTest
BewareMyPower pushed a commit to streamnative/kop that referenced this pull request Dec 6, 2021
Fix #950

### Motivation

apache/pulsar#13066 introduced a breaking change to the `MockZooKeeper` API and causes test failure. 

### Modifications

* Fix failure test
* Upgrade pulsar version to 2.8.1.0
fxbing pushed a commit to fxbing/pulsar that referenced this pull request Dec 19, 2021
…or empheralOwner in MockZooKeeper (apache#13066)

* Add leader election unit test that uses multiple brokers

* Support empheralOwner in MockZooKeeper

* Use unique sessions for all brokers

* Add failing test case for leader broker information not available on other brokers

* Add test for reading the current leader

* Fix issue in leader election

* Address review feedback: make methods static

* Use unique-session only in MultiBrokerBaseTests

* Move tenant and namespace creation to it's own method

* Improve cleanup

* Add alwaysRun to BeforeClass

* Fix leaks in locking in MockZooKeeper

* Reduce code duplication

* Fix NPE when CreateMode is null
@lhotari
Copy link
Member Author

lhotari commented Sep 1, 2022

Further improvement in #17401

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.2 release/2.9.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants