[PIP 70][Issue 8617] Introduce lightweight broker entry metadata #8618

Merged
merged 26 commits into from
Dec 12, 2020

Conversation

aloyszhang
Contributor

Fixes #8617

Motivation

Introduce lightweight raw message metadata; details can be found in PIP-70.

Modifications

  1. Wire protocol: add RawMessageMetadata, and add supports_raw_message_meta to FeatureFlags
  2. Change how a produced message is saved in BookKeeper: add raw metadata to the message
  3. Change how seek-by-time locates a message
  4. Change how a message is sent back to the consumer: skip the raw metadata if the consumer does not support it
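
Items 2 and 4 can be pictured with a small framing sketch. This is not the code from this PR: the class name, the method names, the 2-byte marker value, and the 4-byte length prefix are all assumptions chosen for illustration, and it uses java.nio.ByteBuffer rather than the buffers used in the broker.

```java
import java.nio.ByteBuffer;

/** Illustrative sketch only; the framing layout here is an assumption, not the PR's wire format. */
final class RawMetadataFraming {
    // Hypothetical marker value; the PR description does not state the real one.
    static final short MAGIC = (short) 0x0e02;
    // 2 bytes of marker plus 4 bytes of length.
    static final int HEADER_SIZE = 6;

    /** Write path: prepend [marker][length][raw metadata] to the payload sent by the producer. */
    static ByteBuffer addRawMetadata(byte[] rawMetadata, ByteBuffer payload) {
        ByteBuffer out = ByteBuffer.allocate(HEADER_SIZE + rawMetadata.length + payload.remaining());
        // duplicate() keeps the caller's read position untouched.
        out.putShort(MAGIC).putInt(rawMetadata.length).put(rawMetadata).put(payload.duplicate());
        out.flip();
        return out;
    }

    /** Dispatch path: return a view that starts after the raw metadata when the marker is present. */
    static ByteBuffer skipRawMetadataIfExists(ByteBuffer entry) {
        ByteBuffer view = entry.duplicate();
        if (view.remaining() >= HEADER_SIZE && view.getShort(view.position()) == MAGIC) {
            int length = view.getInt(view.position() + 2);
            // Only skip when the declared length actually fits in the buffer.
            if (length >= 0 && length <= view.remaining() - HEADER_SIZE) {
                view.position(view.position() + HEADER_SIZE + length);
            }
        }
        return view;
    }
}
```

With a layout like this, an entry stored without raw metadata passes through the skip step unchanged, which is what would let a consumer that lacks the feature flag keep reading both old and new entries.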

Verifying this change

  • Added tests for parsing/skipping raw message metadata
  • Added a test for seeking messages by broker timestamp

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • The wire protocol: (yes / no)

@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@jiazhai jiazhai changed the title from "[Issue 8617] Introduce lightweight raw Message metadata" to "[PIP 70][Issue 8617] Introduce lightweight raw Message metadata" on Nov 19, 2020
@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

1 similar comment
@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@jiazhai
Member

jiazhai commented Nov 21, 2020

@BewareMyPower Would you please also take a look?

@BewareMyPower
Contributor

@jiazhai Ok

Contributor

@BewareMyPower BewareMyPower left a comment

I think we should add a method isRawMetadataEnable() to ManagedLedgerConfig like:

public boolean isRawMetadataEnable() {
    return isBrokerTimestampForMessageEnable();
}

Then, if we want to support more features such as a message sequence id, we'll only need to change this method to return isConfig1() || isConfig2() || .... Other occurrences, like OpAddEntry#initiate(), need no changes.

@aloyszhang
Contributor Author

@BewareMyPower Would you give some details about what the method ManagedLedgerConfig#isRawMetadataEnable() is used for?

@BewareMyPower
Contributor

See

if (config.isBrokerTimestampForMessageEnable()) {
    existsOp = OpAddEntry.create(existsOp.ml, existsOp.dataWithRawMetadata, existsOp.callback, existsOp.ctx);
} else {

and

if (ml.getConfig().isBrokerTimestampForMessageEnable()) {
    duplicateBuffer = Commands.addRawMessageMetadata(duplicateBuffer);
    dataWithRawMetadata = duplicateBuffer.retainedDuplicate();

These two code blocks both check isBrokerTimestampForMessageEnable and determine whether to serialize or deserialize RawMetadata. However, if we add a new field to RawMessageMetadata, e.g. message_sequence_id, the check needs to be changed to

if (config.isBrokerTimestampForMessageEnable() || config.isMessageSequenceIdEnable())

Right? So we should wrap the check into a single method, like:

public boolean isRawMetadataEnable() {
    return isBrokerTimestampForMessageEnable() /* || isMessageSequenceIdEnable() or more checks */;
}
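
A self-contained sketch of this suggestion follows. The class name ManagedLedgerConfigSketch and the messageSequenceIdEnable flag are hypothetical; the second flag only mirrors the message_sequence_id example above and does not exist in the PR.

```java
/** Illustrative stand-in for ManagedLedgerConfig; not the real class. */
final class ManagedLedgerConfigSketch {
    private boolean brokerTimestampForMessageEnable;
    // Hypothetical future flag, taken from the message_sequence_id example in this thread.
    private boolean messageSequenceIdEnable;

    ManagedLedgerConfigSketch setBrokerTimestampForMessageEnable(boolean enable) {
        this.brokerTimestampForMessageEnable = enable;
        return this;
    }

    ManagedLedgerConfigSketch setMessageSequenceIdEnable(boolean enable) {
        this.messageSequenceIdEnable = enable;
        return this;
    }

    boolean isBrokerTimestampForMessageEnable() {
        return brokerTimestampForMessageEnable;
    }

    boolean isMessageSequenceIdEnable() {
        return messageSequenceIdEnable;
    }

    /** Single place that decides whether raw metadata must be written or parsed at all. */
    boolean isRawMetadataEnable() {
        return isBrokerTimestampForMessageEnable() || isMessageSequenceIdEnable();
    }
}
```

Call sites would then test only isRawMetadataEnable(), so adding another field to the raw metadata touches this one method rather than every serialize/deserialize branch.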

@aloyszhang
Contributor Author

@BewareMyPower
Thanks for your suggestion.
I'll add ManagedLedgerConfig#isRawMetadataEnable() and check it before serializing/deserializing.

@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@aloyszhang
Contributor Author

@BewareMyPower I applied your comment, PTAL.

Contributor

@BewareMyPower BewareMyPower left a comment

LGTM

@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

4 similar comments
@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@jiazhai
Member

jiazhai commented Nov 24, 2020

/pulsarbot run-failure-checks

@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@jiazhai
Member

jiazhai commented Nov 24, 2020

@aloyszhang Thanks for the great work. Would you please help add a backward compatibility test? There are already some tests that do this, e.g. SmokeTest2_5. Maybe we only need to change broker.conf to enable this feature and test that an old client can read data correctly.

@aloyszhang
Contributor Author

aloyszhang commented Nov 25, 2020

@jiazhai I will handle the backward compatibility test soon.

@jiazhai
Member

jiazhai commented Dec 1, 2020

/pulsarbot run-failure-checks

1 similar comment
@jiazhai
Member

jiazhai commented Dec 2, 2020

/pulsarbot run-failure-checks

@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@@ -865,6 +866,11 @@
"please enable the system topic first.")
private boolean topicLevelPoliciesEnabled = false;

@FieldContext(
category = CATEGORY_SERVER,
doc = "List of interceptors for broker metadata.")
Member

Suggested change
- doc = "List of interceptors for broker metadata.")
+ doc = "List of interceptors for entry metadata.")

Contributor Author

will fix this.


ByteBuf duplicateBuffer = data.retainedDuplicate();
if (ml.getConfig().isBrokerEntryMetaEnabled()) {
duplicateBuffer = Commands.addBrokerEntryMetadata(duplicateBuffer,
Member

I am still not convinced why we need to do this here. ManagedLedger only handles serialized entries. The entry metadata should be appended at the broker level. I think the right place for this logic is https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java#L342.

Contributor Author

As we described in PIP-70, this feature can support a continuous sequenceId for Pulsar entries in the future, so I think doing these operations here favours those future features. For the broker-timestamp feature alone, moving this logic as you suggested is a good choice. But for further extension, it may be better to do this logic in ManagedLedger. What's your opinion on this?

try {
msg = MessageImpl.deserialize(entry.getDataBuffer());
return msg.isExpired(messageTTLInSeconds);
pair = MessageImpl.deserializeWithBrokerEntryMetaData(entry.getDataBuffer());
Member

We should expose the broker metadata in the Message. So this would avoid using Pair and a lot of if-else logic.

We can improve msg.isExpired logic. If entry metadata is present, use broker timestamp; otherwise use client timestamp.
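
The fallback described above could look roughly like the sketch below. It is only an illustration: ExpirySketch, its parameters, and the use of OptionalLong to model "entry metadata present or absent" are assumptions, and the current time is passed in so the logic can be checked deterministically.

```java
import java.util.OptionalLong;
import java.util.concurrent.TimeUnit;

/** Illustrative sketch of the timestamp fallback; not the MessageImpl implementation. */
final class ExpirySketch {
    /**
     * @param brokerTimestampMs timestamp from the entry metadata, empty when the entry has none
     * @param publishTimeMs     client-side publish time from the message metadata
     * @param ttlSeconds        message TTL; a non-positive value is treated here as "never expires"
     * @param nowMs             current time in milliseconds
     */
    static boolean isExpired(OptionalLong brokerTimestampMs, long publishTimeMs,
                             int ttlSeconds, long nowMs) {
        // Prefer the broker timestamp; fall back to the client timestamp otherwise.
        long baseMs = brokerTimestampMs.orElse(publishTimeMs);
        return ttlSeconds > 0 && nowMs > baseMs + TimeUnit.SECONDS.toMillis(ttlSeconds);
    }
}
```

Keeping the choice of timestamp in one place like this is what would remove the need for a Pair return value and the branching at each call site.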

Contributor Author

@aloyszhang aloyszhang Dec 11, 2020

Thanks for your suggestion. I'll optimize the implementation with the following steps:

  1. Expose the entry metadata to MessageImpl
  2. If the entry metadata exists, skip the deserialization of MessageMetadata
  3. If it does not exist, deserialize the MessageMetadata

@sijie sijie changed the title from "[PIP 70][Issue 8617] Introduce lightweight raw Message metadata" to "[PIP 70][Issue 8617] Introduce lightweight broker entry metadata" on Dec 10, 2020
@codelipenghui codelipenghui added this to the 2.8.0 milestone Dec 11, 2020
@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

@aloyszhang
Contributor Author

/pulsarbot run-failure-checks

Member

@sijie sijie left a comment

@aloyszhang the change looks great now! I left a couple of comments. Once you address them, I think this change is ready to merge.

@sijie
Member

sijie commented Dec 12, 2020

@aloyszhang great job!

@jiazhai jiazhai merged commit 6275297 into apache:master Dec 12, 2020
@aloyszhang aloyszhang deleted the raw-meta branch December 25, 2020 03:07
@Anonymitaet
Member

@aloyszhang thanks for your great work.

Shall the changes be documented in the user guide?

If so, could you please help add the docs accordingly? Then you can ping me to review, thanks

Successfully merging this pull request may close these issues.

[Feature]Introduce lightweight raw Message metadata
6 participants