[fix][broker] Fix UniformLoadShedder selecting the wrong overloaded and underloaded brokers #19

Closed
wants to merge 74 commits into from

Conversation

aloyszhang
Owner

This PR is for running tests for upstream PR apache#21025.

heesung-sn and others added 28 commits August 18, 2023 21:26
…currently (apache#20971)

### Motivation

**Background**: when calling `pulsar-admin topics stats --get-earliest-time-in-backlog <topic name>`, Pulsar will read the first entry which is not acknowledged, and respond with the entry write time. The flow is like this:
- get the mark-deleted position of the subscription
- if there is no backlog, respond with `-1`
- otherwise, read the entry at the position after the mark-deleted position, and respond with its write time.

**Issue**: if the command `pulsar-admin topics stats --get-earliest-time-in-backlog <topic name>` and `consumer.acknowledge` are executed at the same time, the flow above can pass the backlog check and then compute a next position that is larger than the last confirmed position (LAC), which leads to a read-entry error.

| time | `pulsar-admin topics stats --get-earliest-time-in-backlog <topic name>` | `consumer.acknowledge` |
| --- | --- | --- |
| 1 | mark deleted position is `3:1` and LAC is `3:2` now | |
| 2 | the check `whether has backlog` is passed | |
| 3 | | acknowledged `3:2`, mark deleted position is `3:2` now |
| 4 | calculate next position: `3:3` | |
| 5 | Read `3:3` and get an error: `read entry failed` | |

Note: the test in this PR is not intended to reproduce the issue.

### Modifications

Respond with `-1` if the position after the mark-deleted position is larger than the LAC.
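
The guard described above can be sketched roughly as follows. This is a minimal, simplified model and not the actual Pulsar code: `BacklogCheck`, `Pos`, `EntryReader`, and `earliestTimeInBacklog` are hypothetical names introduced only for illustration, and `next()` ignores ledger rollover.

```java
// Hypothetical, simplified model of the guard; not Pulsar's real classes.
final class BacklogCheck {

    /** A ledgerId:entryId pair such as 3:2 (hypothetical stand-in for a position). */
    static final class Pos implements Comparable<Pos> {
        final long ledgerId;
        final long entryId;

        Pos(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        /** Next entry in the same ledger (simplification: ignores ledger rollover). */
        Pos next() {
            return new Pos(ledgerId, entryId + 1);
        }

        @Override
        public int compareTo(Pos other) {
            int byLedger = Long.compare(ledgerId, other.ledgerId);
            return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
        }
    }

    /** Hypothetical reader that returns the write time of the entry at a position. */
    interface EntryReader {
        long readWriteTime(Pos position);
    }

    /**
     * Returns -1 when the position after the mark-deleted position is beyond the LAC,
     * instead of attempting a read that would fail.
     */
    static long earliestTimeInBacklog(Pos markDeleted, Pos lac, EntryReader reader) {
        Pos next = markDeleted.next();
        if (next.compareTo(lac) > 0) {
            return -1L; // nothing readable: a concurrent ack moved mark-deleted up to the LAC
        }
        return reader.readWriteTime(next);
    }
}
```

With the values from the table, a mark-deleted position of `3:2` and an LAC of `3:2` give a next position of `3:3`, so the sketch returns `-1` without reading.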
…data. sec ver. (apache#20620)

Co-authored-by: wangjinlong <wangjinlong@zhihu.com>
…ic migration (apache#21029)

Co-authored-by: Vineeth Polamreddy <vineeth.polamreddy@verizonmedia.com>
### Motivation

Remove the `webUrl` null check, because `webUrl` can never be null.

### Modifications

Removing `webUrl` null-check
…ed on topic, when dedup is enabled and no producer is there (apache#20951)
…pache#21035)

Motivation: After [PIP-118: reconnect broker when ZooKeeper session expires](apache#13341), the broker no longer shuts down in the default configuration when it loses the connection to the local metadata store. However, BK online and offline events that occur before the ZK client reconnects are lost, which leaves incorrect BK info in memory. You can reproduce the issue with the test `BkEnsemblesChaosTest.testBookieInfoIsCorrectEvenIfLostNotificationDueToZKClientReconnect` (it reproduces the issue about 90% of the time; run it again if the issue does not occur).

Modifications: Refresh BK info in memory after the ZK client is reconnected.
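
As a rough illustration of the idea (not the actual change), the sketch below reloads a cached bookie set when a reconnect event arrives. `BookieInfoCache`, its `SessionEvent` enum, and the `Supplier` that stands in for reading from the metadata store are all hypothetical names assumed for this example.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: reload the cached bookie set after the metadata client reconnects.
final class BookieInfoCache {

    /** Hypothetical session events; the real metadata store API may differ. */
    enum SessionEvent { CONNECTION_LOST, RECONNECTED }

    // Stands in for "read the current bookie list from the metadata store" (assumption).
    private final Supplier<Set<String>> metadataStore;
    private final AtomicReference<Set<String>> bookies = new AtomicReference<>();

    BookieInfoCache(Supplier<Set<String>> metadataStore) {
        this.metadataStore = metadataStore;
        refresh();
    }

    /** Replaces the in-memory view with a fresh snapshot from the store. */
    void refresh() {
        bookies.set(Set.copyOf(metadataStore.get()));
    }

    /** Notifications missed while disconnected are compensated by a full refresh. */
    void onSessionEvent(SessionEvent event) {
        if (event == SessionEvent.RECONNECTED) {
            refresh();
        }
    }

    Set<String> bookies() {
        return bookies.get();
    }
}
```

A full refresh on reconnect trades one extra read for not having to know which individual notifications were lost.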
liangyepianzhou and others added 27 commits September 1, 2023 16:06
…and schema. (apache#21093)

Fixes apache#21075 

### Motivation

When a loaded topic is deleted, its topic-level policy is deleted too, if topic-level policies are enabled. If the topic is not loaded, however, it is deleted directly through the managed ledger factory, and the topic policy is left behind. The next time the topic is created, it picks up the old topic policy.

### Modifications

When deleting the topic, delete the schema and topic policies even if the topic is not loaded.
## Motivation
Handle the ack hole case.
For example:
```markdown
                     Chunk-1 sequence ID: 0, chunk ID: 0, msgID: 1:1
                     Chunk-2 sequence ID: 0, chunk ID: 1, msgID: 1:2
                     Chunk-3 sequence ID: 0, chunk ID: 0, msgID: 1:3
                     Chunk-4 sequence ID: 0, chunk ID: 1, msgID: 1:4
                     Chunk-5 sequence ID: 0, chunk ID: 2, msgID: 1:5
```
The consumer acks a chunked message via `ChunkMessageIdImpl`, which consists of all the chunks in that chunked message (Chunk-3, Chunk-4, Chunk-5). Chunk-1 and Chunk-2 are not included in the `ChunkMessageIdImpl`, so they need to be processed here.
## Modification
Ack chunk-1 and chunk-2.
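
A minimal sketch of the ack hole handling, under stated assumptions: `ChunkAssembler` is a hypothetical class, it tracks a single sequence ID only, and it uses plain strings for message IDs. When a chunk with chunk ID 0 arrives while earlier chunks are still pending, the earlier chunks are returned so the caller can ack them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch for one sequence ID; not the real consumer implementation.
final class ChunkAssembler {

    // Message IDs of the chunks collected so far for the in-progress chunked message.
    private final List<String> pending = new ArrayList<>();

    /**
     * Records a received chunk and returns the message IDs of stale chunks that
     * should be acked because a new chunk 0 restarted the chunked message.
     */
    List<String> onChunk(int chunkId, String msgId) {
        List<String> stale = new ArrayList<>();
        if (chunkId == 0 && !pending.isEmpty()) {
            // Earlier chunks will never be part of the final chunked message ID.
            stale.addAll(pending);
            pending.clear();
        }
        pending.add(msgId);
        return stale;
    }

    /** Copy of the message IDs that would make up the final chunked message ID. */
    List<String> pendingIds() {
        return new ArrayList<>(pending);
    }
}
```

Feeding the five chunks from the example in order, the call for `1:3` returns `1:1` and `1:2` as stale, and the pending IDs end up as `1:3`, `1:4`, `1:5`.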
### Modifications
After upgrading Pulsar from 2.9.2 to 2.10.3, the isolated group feature stopped working.
The cause is in `IsolatedBookieEnsemblePlacementPolicy`: when it gets the bookie rack info from the metadata store cache, it uses `future.isDone()` to avoid a synchronous operation. If the future is incomplete, it returns empty blacklists.
The cache entry may expire because of the Caffeine cache `getExpireAfterWriteMillis` config, and once it has expired, the future may be incomplete. (apache#21095 will correct the behavior)

In 2.9.2, the data was fetched from the metadata store synchronously, and we should keep that behavior.
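
The difference between the two approaches can be illustrated with plain `CompletableFuture` calls. This is only a sketch: `IsolationLookup` and its method names are hypothetical, and the timeout is an assumption of the example rather than something stated in the PR.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch contrasting the non-blocking check with a synchronous get.
final class IsolationLookup {

    /** Non-blocking variant: silently yields an empty set while the load is in flight. */
    static Set<String> nonBlocking(CompletableFuture<Set<String>> future) {
        if (future.isDone() && !future.isCompletedExceptionally()) {
            return future.join();
        }
        return Collections.emptySet(); // an expired cache entry looks like "no isolation"
    }

    /** Synchronous variant: waits for the load (with an assumed timeout) before deciding. */
    static Set<String> blocking(CompletableFuture<Set<String>> future, long timeoutMs) {
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading bookie info", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("failed to load bookie info", e);
        }
    }
}
```

The non-blocking variant cannot distinguish "no bookies to exclude" from "data not loaded yet", which is why an expired cache entry can disable isolation without any error.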
… the fatal exception (apache#21079)

### Motivation

This PIP is to improve the current function exception handler. It will be applied to both Pulsar Function and Pulsar IO Connector.

For more details, please refer to `pip-297.md`
Signed-off-by: Jiwe Guo <technoboy@apache.org>