Functions metadata compaction #7377

Merged: 39 commits, Jul 1, 2020


srkukarni
Contributor

(If this PR fixes a github issue, please add Fixes #<xyz>.)

Fixes #

(or if this PR is one task of a github issue, please add Master Issue: #<xyz> to link to the master issue.)

Master Issue: #

Motivation

Currently we do not compact the FunctionMetadata topic. This PR adds the ability to do so.

Modifications

Describe the modifications you've done.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)
  • The rest endpoints: (yes / no)
  • The admin cli options: (yes / no)
  • Anything that affects deployment: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@srkukarni srkukarni added this to the 2.7.0 milestone Jun 27, 2020
@srkukarni srkukarni self-assigned this Jun 27, 2020
@jerrypeng
Contributor

@srkukarni the reader in the FunctionMetadataTopicTailer needs to

.readCompacted(true)

@srkukarni
Contributor Author

> @srkukarni the reader in the FunctionMetadataTopicTailer needs to .readCompacted(true)

https://github.com/apache/pulsar/pull/7377/files#diff-fbc6eb611de17f87b29ba52e30bb7fcbR140
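
For readers unfamiliar with compaction, here is a minimal plain-Java sketch of what a compacted read is meant to deliver: only the latest message per key. It is a toy model and not Pulsar code; the class `CompactionSketch`, the type `KeyedMessage`, and the function names used as keys are all hypothetical. In the Pulsar Java client the corresponding reader option is `readCompacted(true)` on the reader builder, as the comment above suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of topic compaction: only the latest value per key survives. */
final class CompactionSketch {

    /** Hypothetical stand-in for a keyed message on the metadata topic. */
    static final class KeyedMessage {
        final String key;
        final String value;

        KeyedMessage(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + "=" + value;
        }
    }

    /** Returns the compacted view: one message per key, holding its latest value. */
    static List<KeyedMessage> compact(List<KeyedMessage> log) {
        Map<String, KeyedMessage> latest = new LinkedHashMap<>();
        for (KeyedMessage m : log) {
            // Remove first so the survivor sits at the position of its latest write.
            latest.remove(m.key);
            latest.put(m.key, m);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<KeyedMessage> log = new ArrayList<>();
        log.add(new KeyedMessage("tenant/ns/fn-a", "v1"));
        log.add(new KeyedMessage("tenant/ns/fn-b", "v1"));
        log.add(new KeyedMessage("tenant/ns/fn-a", "v2"));
        // A non-compacted read would see all three messages; a compacted read sees two.
        System.out.println(compact(log));
    }
}
```

A reader that does not ask for the compacted view still replays every message, which is why the tailer needs the option set explicitly.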

break;
default:
log.warn("Received request with unrecognized type: {}", serviceRequest);
if (workerConfig.getCompactMetadataTopic()) {
Contributor

@jerrypeng jerrypeng Jun 29, 2020

Or should we just check whether the message has data or not?

Contributor Author

I actually prefer an explicit setting.
Plus, with this change, the key is set whether or not you are compacting the topic.

Contributor

Is there a way for a user who is already using functions to turn compaction on? Is there a migration path for these users?

Contributor

The simplest thing may be to change this check from

if (workerConfig.getCompactMetadataTopic())

to

if (message.getData() == null)

so that old messages, as well as new messages in the new format, get processed correctly.

Contributor Author

The format of the messages is now different.

Contributor

Just the version is added as part of the properties, right? We can still process the messages differently based on whether data == null or not.

Contributor Author

Not just the version. We actually write the function metadata when compaction is turned on. When the topic is not compacted, we write the service request for backwards compatibility.
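
The two write paths described here can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `MetadataWriteSketch`, `Outgoing`, and `build` are hypothetical names, the payloads are opaque byte arrays, and setting a key only on the new format is an assumption taken from a later review comment in this thread.

```java
/** Sketch of choosing the payload format from the worker's compaction flag. */
final class MetadataWriteSketch {

    /** Hypothetical outgoing message: an optional key plus payload bytes. */
    static final class Outgoing {
        final String key;     // null when no key is set
        final byte[] payload;

        Outgoing(String key, byte[] payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    static Outgoing build(boolean useCompactedMetadataTopic,
                          String functionKey,
                          byte[] functionMetaDataBytes,
                          byte[] serviceRequestBytes) {
        if (useCompactedMetadataTopic) {
            // New format: the function metadata itself, keyed so that
            // compaction can keep the latest entry per function.
            return new Outgoing(functionKey, functionMetaDataBytes);
        }
        // Old format: the service request, written for backwards compatibility.
        // Leaving the key unset here is an assumption, see the lead-in.
        return new Outgoing(null, serviceRequestBytes);
    }

    public static void main(String[] args) {
        Outgoing compacted = build(true, "tenant/ns/fn", new byte[] {1}, new byte[] {2});
        Outgoing legacy = build(false, "tenant/ns/fn", new byte[] {1}, new byte[] {2});
        System.out.println("compacted key: " + compacted.key);
        System.out.println("legacy key: " + legacy.key);
    }
}
```

Because the two branches write different payload types, a reader has to know which format to expect, which is the crux of the disagreement in this thread.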

category = CATEGORY_FUNC_METADATA_MNG,
doc = "Should the metadata topic be compacted?"
)
private Boolean compactMetadataTopic = false;
Contributor

Given the current approach, this is a dangerous flag. If a user mistakenly flips the flag on, it could corrupt their cluster. We should perhaps add more warnings about what this config will do. Instead of "compactMetadataTopic", maybe we can rename it to "useCompactedMetadataTopic" so that what it does is clearer?

Contributor Author

Will change the name, although the possibility of corruption is low: before a leader can actually start writing, it needs to read all existing messages, and the messages will not deserialize either way.

Contributor Author

Changed

Contributor

I would suggest adding some more content to the doc annotation on what the impact of flipping the flag is.
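
As an illustration of this suggestion, a longer doc string might look something like the sketch below. The annotation `ConfigDoc` is a hypothetical stand-in defined here so the example compiles on its own (it is not the annotation used in the worker config), and the warning text is only one possible wording based on the concerns raised in this thread.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class DocAnnotationSketch {

    /** Hypothetical stand-in for the worker config's field annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface ConfigDoc {
        String doc();
    }

    static final class WorkerConfigSketch {
        @ConfigDoc(doc = "Should the metadata topic be compacted? "
                + "WARNING: messages written with this flag on use a different format "
                + "from messages written with it off. Do not flip this flag on a cluster "
                + "whose metadata topic already holds messages in the other format.")
        private Boolean useCompactedMetadataTopic = false;
    }

    /** Reads the doc text back via reflection, e.g. for generated config docs. */
    static String docOf(String fieldName) {
        try {
            return WorkerConfigSketch.class
                    .getDeclaredField(fieldName)
                    .getAnnotation(ConfigDoc.class)
                    .doc();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("unknown field: " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(docOf("useCompactedMetadataTopic"));
    }
}
```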

lastMessageSeen = exclusiveLeaderProducer.newMessage()
.key(key)
.value(toWrite)
.property("version", Long.toString(functionMetaData.getVersion()))
Contributor

Let's use a

private static final String

constant for "version".

Contributor Author

Ok

Contributor Author

Changed
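
A minimal sketch of the requested change, assuming the property name is shared between the code that writes the message and the code that reads it back. The constant name `VERSION_PROPERTY` and the helper methods are hypothetical; the merged code may name them differently.

```java
import java.util.HashMap;
import java.util.Map;

final class VersionPropertySketch {

    // Single definition of the property name, shared by writer and reader.
    private static final String VERSION_PROPERTY = "version";

    /** Builds the message properties carrying the function metadata version. */
    static Map<String, String> propertiesFor(long version) {
        Map<String, String> props = new HashMap<>();
        props.put(VERSION_PROPERTY, Long.toString(version));
        return props;
    }

    /** Reads the version back; throws if the property is absent or not a number. */
    static long versionOf(Map<String, String> props) {
        String raw = props.get(VERSION_PROPERTY);
        if (raw == null) {
            throw new IllegalArgumentException("missing property: " + VERSION_PROPERTY);
        }
        return Long.parseLong(raw);
    }

    public static void main(String[] args) {
        System.out.println(versionOf(propertiesFor(7L)));
    }
}
```

Keeping the literal in one place means the producer and the tailer cannot drift apart on the property name.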

try {
lastMessageSeen = exclusiveLeaderProducer.send(serviceRequest.toByteArray());
lastMessageSeen = exclusiveLeaderProducer.newMessage()
.key(key)
Contributor

This will add a key for both the old format and the new format, right? I'm not sure we should add a key for the old format. Perhaps checking whether the message has a key can be used to determine whether it's the old format or the new format.

Contributor Author

Changed

break;
default:
log.warn("Received request with unrecognized type: {}", serviceRequest);
if (workerConfig.getUseCompactedMetadataTopic()) {
Contributor

Should we change this check to whether a key exists or not? That would create an avenue by which an existing cluster can transition to using a compacted metadata topic.

Contributor Author

I don't think that's a good idea. I would rather have the worker fail here unless it is specifically configured to have compaction enabled or disabled.
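
The two positions in this exchange can be contrasted in a small sketch. Everything here is hypothetical (`FormatCheckSketch`, `detectByKey`, `requireConfigured`), and the key-presence mismatch check is only a stand-in for the deserialization failure the author describes; the merged code may behave differently.

```java
/** Contrasts key-based format detection with strict, config-driven handling. */
final class FormatCheckSketch {

    enum Format { SERVICE_REQUEST, FUNCTION_METADATA }

    /** Reviewer's suggestion: infer the format from whether the message has a key. */
    static Format detectByKey(String key) {
        return key != null ? Format.FUNCTION_METADATA : Format.SERVICE_REQUEST;
    }

    /**
     * Author's preference: the flag dictates the expected format, and a message
     * in the other format makes the worker fail instead of being tolerated.
     */
    static Format requireConfigured(boolean useCompactedMetadataTopic, String key) {
        Format expected = useCompactedMetadataTopic
                ? Format.FUNCTION_METADATA
                : Format.SERVICE_REQUEST;
        if (detectByKey(key) != expected) {
            throw new IllegalStateException(
                    "metadata topic message does not match configured format " + expected);
        }
        return expected;
    }

    public static void main(String[] args) {
        // Lenient: a mixed topic is processed message by message.
        System.out.println(detectByKey(null));
        System.out.println(detectByKey("tenant/ns/fn"));
        // Strict: an old-format message on a worker configured for compaction fails.
        try {
            requireConfigured(true, null);
        } catch (IllegalStateException e) {
            System.out.println("worker fails: " + e.getMessage());
        }
    }
}
```

The lenient variant gives existing clusters a migration path at the cost of silently accepting mixed topics; the strict variant trades that path for an early, explicit failure.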

@srkukarni srkukarni merged commit 3d94553 into apache:master Jul 1, 2020
@srkukarni srkukarni deleted the functions_metadata_compaction branch July 1, 2020 04:27
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
* Function workers re-direct call update requests to the leader

* Fixed test

* tests pass

* Working version

* Fix test

* Short circuit update

* Fix test

* Fix test

* Fix tests

* Added one more catch

* Added one more catch

* Seperated internal and external errors

* Fix test

* Address feedback

* Do not expose updateOnLeader to functions

* hide api

* hide api

* removed duplicate comments

* Do leadership changes in function metadata manager

* make the function sync

* Added more comments

* Throw error

* Changed name

* address comments

* Deleted unused classes

* Rework metadata manager

* Working

* Fix test

* A better way for test

* Address feedback

* Added an option to compact function metadata topic

* Address feedback

* Incorporate feedback

Co-authored-by: Sanjeev Kulkarni <sanjeevk@splunk.com>