Don't retain abstract req in response #1

seglo · 2017-11-07T15:55:10Z

I picked up ticket 5859 because it was marked with the newbie flag and it seemed like a good opportunity to learn more about about the request and response lifecycle in the Kafka broker.

We currently store AbstractRequest in RequestChannel.Request.bodyAndSize. RequestChannel.Request is, in turn, stored in RequestChannel.Response. We keep the latter until the response is sent to the client.

However, after KafkaApis.handle, we no longer need AbstractRequest apart from its string representation for logging. We could potentially replace AbstractRequest with a String representation (if the relevant logging is enabled). The String representation is generally small while some AbstractRequest subclasses can be pretty large. The largest one is ProduceRequest and we clear the underlying ByteBuffer explicitly in KafkaApis.handleProduceRequest. We could potentially remove that special case if AbstractRequest subclasses were not retained.

This was originally suggested by Jason Gustafson in the following PR apache#3801 (comment)

I removed the parsed AbstractRequest that was stored in RequestChannel.Request in the private bodyAndSize: RequestAndSize variable. Ismael points out that the AbstractRequest is mainly used by KafkaApi, so we should only need a reference there and nowhere else. However when I investigated I found several other usages within RequestChannel.Request itself. I've summarized these usages below:

updateRequestMetrics - Checks to see if the parsed request is a FetchRequest to access the isFromFollower value. Ismail mentioned in the ticket 5859 that "it needs the ApiKeys instance", but I don't know what he means by that
updateRequestMetrics - Requires the parsed size of the request in bytes to update the requestBytesHist metric
updateRequestMetrics - Calls requestDesc(true) or requestDesc(false) depending on whether trace logging enabled. It looks like it's used when request logging is enabled.
Request constructor has a trace logging line that calls requestDesc(true)

I made some decisions on my own. Below is a summary for review.

Parse the request in each handler in KafkaApi. Ideally this would be done a higher level and passed into each handler, but I was hesitant about changing all the handler method signatures. Effectively the response is only parsed once since only one handler can be called per request AFAIK.
I generate a detailed request description in the default constructor of RequestChannel.Request. I generate a detailed description because the codebase isn't consistent about requiring details or not. For example, there's a usage that's called for every instance of RequestChannel.Request (https://github.com/seglo/kafka/blob/trunk/core/src/main/scala/kafka/network/RequestChannel.scala#L107), so I thought it would be acceptable to parse the request once for details and retain it for use in other logging RequestChannel.Request.requestDesc.
I handle a special case while I have a local reference of the parsed request to determine if it's a FetchRequest and cache its isFromFollower property for use in updateRequestMetrics. I'm not sure how else to gain this information.
This implementation has the consequence of parsing the AbstractRequest twice, once for KafkaApi and once for the RequestChannel.Request constructor. I would be interested in finding a non-invasive way to avoid this, while maintaining immutability as requested by Ismael in the original ticket.

seglo · 2017-11-12T02:50:56Z

Closed in favour of the implementation in #2

…pache#7305) A partition log in initialized in following steps: 1. Fetch log config from ZK 2. Call LogManager.getOrCreateLog which creates the Log object, then 3. Registers the Log object Step #3 enables Configuration update thread to deliver configuration updates to the log. But if any update arrives between step #1 and #3 then that update is missed. It breaks following use case: 1. Create a topic with default configuration, and immediately after that 2. Update the configuration of topic There is a race condition here and in random cases update made in second step will get dropped. This change fixes it by tracking updates arriving between step #1 and #3 Once a Partition is done initializing log, it checks if it has missed any update. If yes, then the configuration is read from ZK again. Added unit tests to make sure a dirty configuration is refreshed. Tested on local cluster to make sure that topic configuration and updates are handled correctly. Reviewers: Jason Gustafson <jason@confluent.io>

seglo added 2 commits November 7, 2017 10:05

Don't retain abstract req in response

6c79015

convert sizeOfBodyInBytes to constant

aecfbba

seglo force-pushed the dont-retain-abstractreq-5859 branch from a738d27 to aecfbba Compare November 7, 2017 20:11

Unmangle imports in KafkaApis

ba84962

seglo mentioned this pull request Nov 12, 2017

Use RequestSummary in RequestChannel.Response #2

Closed

seglo closed this Nov 12, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't retain abstract req in response #1

Don't retain abstract req in response #1

seglo commented Nov 7, 2017 •

edited

seglo commented Nov 12, 2017

Don't retain abstract req in response #1

Don't retain abstract req in response #1

Conversation

seglo commented Nov 7, 2017 • edited

seglo commented Nov 12, 2017

seglo commented Nov 7, 2017 •

edited