Ensure broker is fully bootstrapped before load manager registers itself #2935

Merged
sijie merged 3 commits into apache:master from merlimat:fix-init-sequence
Nov 20, 2018

@merlimat (Contributor) commented Nov 6, 2018

Motivation

In some cases the broker can be assigned traffic before it is fully bootstrapped.

This happens because the load manager registers the broker in ZK before some of the initialization steps have completed.

This results in an NPE such as:

Caused by: java.lang.NullPointerException
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.hasSchema(PersistentTopic.java:1815) ~[org.apache.pulsar-pulsar-broker-2.2.0-streamlio-22.jar:2.2.0-streamlio-22]
	at org.apache.pulsar.broker.service.ServerCnx.lambda$25(ServerCnx.java:836) ~[org.apache.pulsar-pulsar-broker-2.2.0-streamlio-22.jar:2.2.0-streamlio-22]
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:656) ~[?:1.8.0_181]

Modifications

  • Register the broker in ZK only after the full start sequence has completed. This ensures other brokers will not discover this broker before it is ready.
  • Expose the "is ready" state in the VipStatus. This will be used to make sure the load balancer does not direct any lookup request to the broker before it is ready.
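
The ordering described in the two points above can be illustrated with a minimal sketch. It is not the code from this patch: `BrokerStartupSketch`, `Registrar`, `Broker`, and `statusCheck` are hypothetical names, the ZK registration is replaced by a callback, and the relative order of registering and flipping the readiness flag is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class BrokerStartupSketch {

    /** Hypothetical stand-in for the load manager writing the broker entry to ZK. */
    interface Registrar {
        void register(String brokerId);
    }

    /** Hypothetical broker; none of these names come from the Pulsar code base. */
    static class Broker {
        private final String id;
        private final Registrar registrar;
        private final List<String> events;
        private final AtomicBoolean ready = new AtomicBoolean(false);

        Broker(String id, Registrar registrar, List<String> events) {
            this.id = id;
            this.registrar = registrar;
            this.events = events;
        }

        void start() {
            // 1. Finish every local initialization step first.
            events.add("init-complete");
            // 2. Only now advertise the broker so that peers can discover it.
            registrar.register(id);
            // 3. Flip the readiness flag last (order of steps 2 and 3 is an assumption).
            ready.set(true);
        }

        boolean isReady() {
            return ready.get();
        }

        /** Hypothetical status check that a load balancer could poll. */
        String statusCheck() {
            return isReady() ? "OK" : "NOT_READY";
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        Broker broker = new Broker("broker-1", id -> events.add("zk-register:" + id), events);
        System.out.println(broker.statusCheck()); // status before start()
        broker.start();
        System.out.println(broker.statusCheck()); // status after start()
        System.out.println(events);               // initialization precedes registration
    }
}
```

The point of the sketch is only the ordering: registration is the last externally visible step of `start()`, and the status check reports not-ready until the whole sequence has run.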

@merlimat added the type/bug label (The PR fixed a bug or issue reported a bug) Nov 6, 2018
@merlimat added this to the 2.2.1 milestone Nov 6, 2018
@merlimat self-assigned this Nov 6, 2018
@merlimat (Contributor Author) commented Nov 6, 2018

run integration tests

@merlimat (Contributor Author) commented Nov 6, 2018

run integration tests

@merlimat (Contributor Author) commented Nov 6, 2018

run integration tests

@ivankelly (Contributor) left a comment

looks good. Minor comments.

Probably out of scope for this patch, but perhaps we should run a healthcheck before advertising?

@merlimat (Contributor Author) commented Nov 6, 2018

run cpp tests
run integration tests

@merlimat (Contributor Author) commented Nov 6, 2018

run integration tests

@merlimat (Contributor Author) commented Nov 6, 2018

run integration tests

@merlimat (Contributor Author) commented Nov 7, 2018

run integration tests

@merlimat (Contributor Author) commented Nov 8, 2018

run integration tests

@sijie (Member) commented Nov 16, 2018

run integration tests

@sijie (Member) commented Nov 19, 2018

@ivankelly can you review this, since @merlimat already addressed your comments?

@rdhabalia (Contributor) left a comment

👍

@sijie dismissed ivankelly’s stale review November 20, 2018 20:10

The review comments are addressed.

@sijie merged commit 7efce98 into apache:master Nov 20, 2018
merlimat added a commit that referenced this pull request Dec 13, 2018