-
Notifications
You must be signed in to change notification settings - Fork 14.5k
KAFKA-12467: Add controller-side snapshot generation #10366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
da4d236
to
5c4ed91
Compare
9ae7f3c
to
6cc1e49
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @cmccabe, this looks nice 👍
Returning an iterator of batches works well when the control managers only have a single iterable thing that needs to be translated to records. Do you think we might have cases (eventually) where the control classes have more complex state? It might be more flexible to take a consumer rather than return an iterable:
// In the control manager
void toRecords(Consumer<List<ApiMessageAndVersion>> batchConsumer) {
...
}
A few other comments/questions inline.
|
||
ClientQuotaControlIterator(long epoch) { | ||
this.epoch = epoch; | ||
this.iterator = clientQuotaData.entrySet(epoch).iterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're getting the entry set of a particular epoch, does this mean we'll be getting consistent snapshots?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, snapshots are consistent.
metadata/src/main/java/org/apache/kafka/controller/QuorumController.java
Show resolved
Hide resolved
if (result < 0) { | ||
return OptionalLong.empty(); | ||
} else if (result > 0) { | ||
return OptionalLong.of(exponentialBackoff.backoff(numWriteTries - 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use result
here instead of numWriteTries
? If not, maybe we can change generateBatch() to return a boolean instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to use result
here, I guess.
Hmm... I don't think that would be a problem. You could have a single iterator that iterates over multiple data structures X and Y. You would just need to store the state of whether you were done with X before starting Y. In general one of the goals of controller snapshots is to avoid "stopping the world" while the snapshot is generated. So a synchronous API would not be useful here. The iterator API allows us to take only a few batches of records at a time. |
Implement controller-side snapshot generation.
Fix spotbugs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about the iterator vs consumer. All this LGTM
The only thing I might consider is to move SnapshotGeneratorManager into its own class, but that's not critical to for this PR
Implement writing snapshots in the controller, and reading them back.