Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix leader broker cannot be determined when the advertised address and advertised listeners are configured #21894

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Jan 13, 2024

Fixes #21897

Motivation

There's currently a problem that the leader broker cannot be determined when the advertised address and advertised listeners are configured.
The workaround to the problem is to properly configure an internal advertised listener which matches the advertised address. However, applying the workaround is brittle.

The code base has had inconsistent ways for a unique identifier for a broker. In some cases, there's a dummy "http://" prefix for the identifier even when the port is the https port. This is very confusing.
This PR makes the broker identifier match the "lookup service address" in all cases. The "lookup service address" has been also renamed in this PR to "broker id" since that is the true meaning of the previously called "lookup service address".

Modifications

  • consistently use the "broker id" as the broker's identifier in Pulsar load balancing.
    • Get rid of the dummy "http://" prefix (which isn't a http url since it might point to a https port)
  • for backwards compatibility & mixed version support, add brokerId field to the leader broker information and keep the previous serviceUrl without changing it's meaning and content.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: lhotari#170

@lhotari lhotari added the type/bug The PR fixed a bug or issue reported a bug label Jan 13, 2024
@lhotari lhotari added this to the 3.3.0 milestone Jan 13, 2024
@lhotari lhotari self-assigned this Jan 13, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 13, 2024
@lhotari lhotari requested a review from merlimat January 13, 2024 01:22
@lhotari
Copy link
Member Author

lhotari commented Jan 13, 2024

bug report on Pulsar Slack, https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1705041494476059
Waiting for the user to create an issue. In the meantime, error messages (to make this PR searchable with error messages):

05:19:57.541 [pulsar-2-5] WARN  org.apache.pulsar.broker.namespace.NamespaceService - The current leader broker LeaderBroker(serviceUrl=http://xx.yy.zz.xy:8080/) isn't active. Handling load manager decisions in a decentralized way. NamespaceBundle[tenant/ns/0x2fffffff_0x3269f69d]

...

05:19:55.176 [pulsar-2-6] WARN  org.apache.pulsar.broker.namespace.NamespaceService - Broker http://xx.yy.zz.xy:8080/ (xx.yy.zz.xy:8080) couldn't be found in available brokers prod-xx-1-pulsar-broker-10:8080,prod-xx-1-pulsar-broker-yy-7:8080,prod-xx-1-pulsar-broker-zz-5:8080,prod-xx-1-pulsar-broker-7:8080,prod-xx-1-pulsar-broker-zz-2:8080,prod-xx-1-pulsar-broker-yy-1:8080,prod-xx-1-pulsar-broker-yy-4:8080,prod-xx-1-pulsar-broker-2:8080,prod-xx-1-pulsar-broker-12:8080,prod-xx-1-pulsar-broker-yy-2:8080,prod-xx-1-pulsar-broker-15:8080

UPDATE: reported as #21897

@codecov-commenter
Copy link

codecov-commenter commented Jan 13, 2024

Codecov Report

Attention: 49 lines in your changes are missing coverage. Please review.

Comparison is base (c834feb) 73.72% compared to head (96fe11b) 73.60%.
Report is 13 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #21894      +/-   ##
============================================
- Coverage     73.72%   73.60%   -0.12%     
- Complexity    32405    32406       +1     
============================================
  Files          1859     1861       +2     
  Lines        138273   138592     +319     
  Branches      15158    15182      +24     
============================================
+ Hits         101939   102015      +76     
- Misses        28488    28686     +198     
- Partials       7846     7891      +45     
Flag Coverage Δ
inttests 24.12% <22.22%> (-0.16%) ⬇️
systests 23.65% <19.19%> (-0.25%) ⬇️
unittests 72.90% <73.23%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...n/java/org/apache/pulsar/broker/PulsarService.java 82.31% <100.00%> (+0.16%) ⬆️
...g/apache/pulsar/broker/admin/impl/BrokersBase.java 70.86% <100.00%> (+0.25%) ⬆️
...lsar/broker/loadbalance/LeaderElectionService.java 100.00% <100.00%> (ø)
...ker/loadbalance/extensions/BrokerRegistryImpl.java 84.76% <100.00%> (ø)
...dbalance/extensions/ExtensibleLoadManagerImpl.java 79.56% <100.00%> (-0.40%) ⬇️
...xtensions/channel/ServiceUnitStateChannelImpl.java 84.39% <100.00%> (-0.07%) ⬇️
...e/extensions/policies/IsolationPoliciesHelper.java 81.81% <100.00%> (ø)
...roker/loadbalance/impl/ModularLoadManagerImpl.java 84.65% <100.00%> (-0.20%) ⬇️
...er/loadbalance/impl/ModularLoadManagerWrapper.java 94.44% <100.00%> (+3.14%) ⬆️
...rg/apache/pulsar/broker/service/BrokerService.java 81.09% <ø> (-0.16%) ⬇️
... and 14 more

... and 82 files with indirect coverage changes

Copy link
Contributor

@dragosvictor dragosvictor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix! These URLs were always confusing.

Left a couple of comments below.

@lhotari
Copy link
Member Author

lhotari commented Jan 14, 2024

Nice fix! These URLs were always confusing.

Left a couple of comments below.

Thanks for the review and suggestions. I made the suggested changes.

@lhotari
Copy link
Member Author

lhotari commented Jan 14, 2024

There's also a test case that reproduces the issue.

Copy link
Contributor

@dragosvictor dragosvictor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@Meet0861
Copy link

@lhotari can this be cherry-picked in 2.10.4 and 2.11 also please

@lhotari
Copy link
Member Author

lhotari commented Jan 18, 2024

@lhotari can this be cherry-picked in 2.10.4 and 2.11 also please

@Meet0861 I don't think that a cherry-pick patch could be applied cleanly. A backport would be needed.

@lhotari lhotari merged commit 3158fd3 into apache:master Jan 19, 2024
47 checks passed
@lhotari
Copy link
Member Author

lhotari commented Jan 19, 2024

There seems to be 2 other patches that are related and might be needed when backporting to older branches: #21015 and #21633 .

lhotari added a commit that referenced this pull request Jan 21, 2024
…sed address and advertised listeners are configured (#21894)

(cherry picked from commit 3158fd3)
@lhotari lhotari modified the milestones: 3.3.0, 3.2.0 Jan 21, 2024
lhotari added a commit that referenced this pull request Jan 26, 2024
…sed address and advertised listeners are configured (#21894)

(cherry picked from commit 3158fd3)
@lhotari
Copy link
Member Author

lhotari commented Jan 26, 2024

@lhotari can this be cherry-picked in 2.10.4 and 2.11 also please

@Meet0861 I don't think that a cherry-pick patch could be applied cleanly. A backport would be needed.

@Meet0861 The support for 2.10.x and 2.11.x has ended according to the Apache Pulsar Release Policy. There's a mailing list discussion about making an explicit decision about 2.10 and 2.11 . Please participate in the discussion.

lhotari added a commit that referenced this pull request Jan 26, 2024
…sed address and advertised listeners are configured (#21894)

(cherry picked from commit 3158fd3)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 1, 2024
…sed address and advertised listeners are configured (apache#21894)

(cherry picked from commit 3158fd3)
(cherry picked from commit 358d122)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 6, 2024
…sed address and advertised listeners are configured (apache#21894)

(cherry picked from commit 3158fd3)
(cherry picked from commit 358d122)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
10 participants