Currently, the `minFetchSize` property of a requested (topic, partition) in consume requests represents bytes; that is, it is a hint to the Tank broker so that, if the request needs to tail the partition, Tank will not respond as soon as it has any data, but as soon as it has at least that many bytes worth of (bundles) data in the topic partition.
However, unlike Kafka, we can and do track the number of messages appended to the segment for requests tailing the partition -- we can do that because that information is encoded in the publish request, whereas Kafka would have to parse the message sets, which would potentially mean decompressing them.
So maybe we should change the semantics of that property to mean number of messages, not number of bytes, so that setting it to e.g. 5 would ask the Tank broker to return a consume response as soon as 5 or more new messages have been appended on the broker. That seems far more useful to me.
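To make the proposed change concrete, here is a minimal sketch of how the tail-wakeup check would differ under the two semantics. All names (`PendingConsume`, `shouldRespondBytes`, `shouldRespondMsgs`) are hypothetical and not Tank's actual broker code; this just illustrates the decision point:

```cpp
#include <cstdint>
#include <cstdio>

struct PendingConsume {
        uint64_t minFetchSize;     // hint from the consume request
        uint64_t bytesAccumulated; // bundle bytes appended since the request began tailing
        uint64_t msgsAccumulated;  // message count, tracked per publish request
};

// Current semantics: respond once at least minFetchSize *bytes* are available.
static bool shouldRespondBytes(const PendingConsume &p) {
        return p.bytesAccumulated >= p.minFetchSize;
}

// Proposed semantics: respond once at least minFetchSize *messages* are available.
// Tank can track this cheaply because the count is encoded in each publish request,
// so no bundle needs to be parsed or decompressed.
static bool shouldRespondMsgs(const PendingConsume &p) {
        return p.msgsAccumulated >= p.minFetchSize;
}

int main() {
        // e.g. minFetchSize = 5: a single 4KB message satisfies the bytes check
        // long before 5 messages exist; the messages check waits for 5 appends.
        PendingConsume p{5, 4096, 1};
        std::printf("bytes semantics: %d, messages semantics: %d\n",
                    shouldRespondBytes(p), shouldRespondMsgs(p));
        return 0;
}
```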
We should also respect the (minFetchSize, maxWait) hints not just when processing a 'tail' request, but also when consuming from a specific sequence number. See the sketch after this comment.
This is mostly about supporting edge cases where we repeatedly ask for new messages: if, say, only one message has been appended since our last request, the broker currently does not wait until more data accumulates but immediately responds with that single message, which may not be optimal for the application.
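A hedged sketch of that behavior, with hypothetical names (`ConsumeReq`, `readyToRespond`) rather than Tank's actual API: defer the response until enough new data has accumulated or the wait budget runs out, instead of replying with whatever happens to be on hand.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>

struct ConsumeReq {
        uint64_t minFetch;                             // threshold (bytes today; messages if the change above lands)
        std::chrono::milliseconds maxWait;             // upper bound on how long to hold the response
        std::chrono::steady_clock::time_point arrivedAt;
};

// Decide whether to respond now or keep the request parked, regardless of
// whether it is a 'tail' request or one for a specific sequence number.
static bool readyToRespond(const ConsumeReq &req, uint64_t availableSinceRequest,
                           std::chrono::steady_clock::time_point now) {
        if (availableSinceRequest >= req.minFetch)
                return true;                           // enough data accumulated
        return (now - req.arrivedAt) >= req.maxWait;   // or we ran out of patience
}

int main() {
        using namespace std::chrono;
        const auto t0 = steady_clock::now();
        ConsumeReq req{5, milliseconds(100), t0};
        // Only 1 message available and 0ms elapsed: park the request.
        std::printf("respond now: %d\n", readyToRespond(req, 1, t0));
        // Same request once maxWait has elapsed: respond with what we have.
        std::printf("respond after maxWait: %d\n",
                    readyToRespond(req, 1, t0 + milliseconds(150)));
        return 0;
}
```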