Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't retain abstract req in response #1

Closed
wants to merge 3 commits into from

Conversation

seglo
Copy link
Owner

@seglo seglo commented Nov 7, 2017

I picked up ticket 5859 because it was marked with the newbie flag and it seemed like a good opportunity to learn more about about the request and response lifecycle in the Kafka broker.

We currently store AbstractRequest in RequestChannel.Request.bodyAndSize. RequestChannel.Request is, in turn, stored in RequestChannel.Response. We keep the latter until the response is sent to the client.

However, after KafkaApis.handle, we no longer need AbstractRequest apart from its string representation for logging. We could potentially replace AbstractRequest with a String representation (if the relevant logging is enabled). The String representation is generally small while some AbstractRequest subclasses can be pretty large. The largest one is ProduceRequest and we clear the underlying ByteBuffer explicitly in KafkaApis.handleProduceRequest. We could potentially remove that special case if AbstractRequest subclasses were not retained.

This was originally suggested by Jason Gustafson in the following PR apache#3801 (comment)

I removed the parsed AbstractRequest that was stored in RequestChannel.Request in the private bodyAndSize: RequestAndSize variable. Ismael points out that the AbstractRequest is mainly used by KafkaApi, so we should only need a reference there and nowhere else. However when I investigated I found several other usages within RequestChannel.Request itself. I've summarized these usages below:

  • updateRequestMetrics - Checks to see if the parsed request is a FetchRequest to access the isFromFollower value. Ismail mentioned in the ticket 5859 that "it needs the ApiKeys instance", but I don't know what he means by that
  • updateRequestMetrics - Requires the parsed size of the request in bytes to update the requestBytesHist metric
  • updateRequestMetrics - Calls requestDesc(true) or requestDesc(false) depending on whether trace logging enabled. It looks like it's used when request logging is enabled.
  • Request constructor has a trace logging line that calls requestDesc(true)

I made some decisions on my own. Below is a summary for review.

  • Parse the request in each handler in KafkaApi. Ideally this would be done a higher level and passed into each handler, but I was hesitant about changing all the handler method signatures. Effectively the response is only parsed once since only one handler can be called per request AFAIK.
  • I generate a detailed request description in the default constructor of RequestChannel.Request. I generate a detailed description because the codebase isn't consistent about requiring details or not. For example, there's a usage that's called for every instance of RequestChannel.Request (https://github.com/seglo/kafka/blob/trunk/core/src/main/scala/kafka/network/RequestChannel.scala#L107), so I thought it would be acceptable to parse the request once for details and retain it for use in other logging RequestChannel.Request.requestDesc.
  • I handle a special case while I have a local reference of the parsed request to determine if it's a FetchRequest and cache its isFromFollower property for use in updateRequestMetrics. I'm not sure how else to gain this information.
  • This implementation has the consequence of parsing the AbstractRequest twice, once for KafkaApi and once for the RequestChannel.Request constructor. I would be interested in finding a non-invasive way to avoid this, while maintaining immutability as requested by Ismael in the original ticket.

@seglo seglo force-pushed the dont-retain-abstractreq-5859 branch from a738d27 to aecfbba Compare November 7, 2017 20:11
@seglo
Copy link
Owner Author

seglo commented Nov 12, 2017

Closed in favour of the implementation in #2

@seglo seglo closed this Nov 12, 2017
seglo pushed a commit that referenced this pull request Oct 29, 2019
…pache#7305)

A partition log in initialized in following steps:

1. Fetch log config from ZK
2. Call LogManager.getOrCreateLog which creates the Log object, then
3. Registers the Log object

Step #3 enables Configuration update thread to deliver configuration
updates to the log. But if any update arrives between step #1 and #3
then that update is missed. It breaks following use case:

1. Create a topic with default configuration, and immediately after that
2. Update the configuration of topic

There is a race condition here and in random cases update made in
second step will get dropped.

This change fixes it by tracking updates arriving between step #1 and #3
Once a Partition is done initializing log, it checks if it has missed any
update. If yes, then the configuration is read from ZK again.

Added unit tests to make sure a dirty configuration is refreshed. Tested
on local cluster to make sure that topic configuration and updates are
handled correctly.

Reviewers: Jason Gustafson <jason@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant