Add possibility to consume EOF messages #803
…nable this flag When the flag is not set, the behaviour doesn't change - refactor consume to return some of the errors as messages instead - eof messages include partition and offset
Is there anything I can do to get some feedback on this? @webmakersteve @iradul ? |
Thank you for the PR! The library is certainly missing this feature. |
Hey, thanks for your reply :) Good point. I see some pros and cons. As for cons:
The pro is that it doesn't need this new switch to enable a different mode. I could also imagine offering both (always having the callbacks, but then also a switch to enable them as messages). Okay, now for questions about how this would work. I guess we would introduce a new callback, something like |
I don't see this as a significant performance cost but it's certainly worth testing and comparing.
Good point. Guess we'll have to look at the source and figure out how to implement this to get the answer.
Yes, I think |
Okay. I'll try this callback approach for now, let's see where that leads us. |
Ah, if we use |
Hm, I wonder how callback registration should work. The other callbacks are configured as part of the configuration for the consumer. But that's because they're part of librdkafka CONFIGURATION. |
You are right. Setting |
So you mean like it is done in this PR? Maybe we don't even need this |
Yes, if we emit EOF events we don't need any extra setting. When |
Okay. Then I'll remove the |
A good name for the EOF event could be |
Where would that name appear? |
I didn't entirely get your vision yet for how we communicate the EOF events back. Are you okay with mixing them with normal messages now? Or through what other means should they be communicated back (as we discarded the callback idea AFAICT)? Then the question is what the JSON should look like, and I would expect maybe something like
Maybe also adding a field for the error description (which would be |
I removed the |
Let me know when the changes go in the right direction code-wise. Then I can also add some documentation to the README and add the feature to the other APIs. |
3250b88 to 26bc529
I don't think we should mix normal messages with EOF events. Here is how I see this:
I don't think error message and error code are useful since they're constant so we can omit those. |
Ah that makes sense yes 👍 |
Oh, that way we also won't have the additional switches between js and c++ land in the |
…h messages. Also add typescript bindings
Co-authored-by: Gabriel Assis Bezerra <gabriel.bezerra@gmail.com>
3872501 to 739dfa0
@iradul Could you review again, please? I implemented it now as you suggested and also added a few e2e tests :). |
Thank you! |
Hey @iradul, cool that the PR finally got merged :). I wonder, did you get the emails I sent some time ago to you and @webmakersteve where I asked about getting contribution rights and being more involved in the overall project? Best Fabian |
librdkafka supports sending EOF messages when the end of a partition is reached. This has to be enabled through the `enable.partition.eof` setting, which is documented in librdkafka's configuration reference.

In our application, we need to receive these messages to distinguish between having reached the end of a partition and the case where we just haven't received all data yet.
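As a rough illustration of the setting mentioned above, a consumer configuration might look like the following. The broker address and group id are placeholder values assumed for this sketch, not taken from the PR.

```javascript
// Sketch of a consumer configuration object.
// 'localhost:9092' and 'example-group' are hypothetical placeholders.
const consumerConfig = {
  'metadata.broker.list': 'localhost:9092',
  'group.id': 'example-group',
  // The librdkafka setting this PR description refers to.
  'enable.partition.eof': true,
};

// With node-rdkafka installed, this object would typically be passed as the
// first argument to the consumer constructor:
//   const Kafka = require('node-rdkafka');
//   const consumer = new Kafka.KafkaConsumer(consumerConfig, {});
console.log(consumerConfig['enable.partition.eof']);
```

The object itself has no dependency on the library, so it can be inspected or logged on its own.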
This PR adds the possibility to receive these EOF messages to the non-flowing mode of the standard API. It would also be possible to add this to the other APIs. This PR is backward-compatible: behaviour only changes when the feature is enabled by calling a function on the consumer. This function is modelled in the same way as `setDefaultTimeout`, i.e. it sets an instance variable on the consumer.

When designing this feature, the question came up whether these EOF messages should be handled as errors (i.e. `consume` should return a promise error) or as just another kind of message. I decided to return them as another message type. That is, with this change, the resulting array in `consume` can contain not only content messages but also EOF error messages when this feature is enabled. It can be enabled by calling `consumer.setErrorsAsMessages(true)`. The reasons for this decision are as follows:

`consume` discards errors when it already has some messages to return, because it can't return both an error and messages. I think it's useful in general to have an option to return these errors as messages instead, as that way both can be returned and neither is discarded. As this would be a breaking change, a flag is added to enable it. In this PR, only the EOFs are returned in this way though. But we already saw another important use case for us: we want to know when we get an `OffsetOutOfRange` error. Currently, this is sometimes not forwarded from `node-rdkafka`: if we get some messages from one partition and then get an OffsetOutOfRange error from another partition, we miss this error. For the EOF messages, this is even more problematic, as they are usually returned after some other messages.

Internally, this is implemented by changing `KafkaConsumer::Consume` so that some errors are not turned into `Baton`s but are kept as messages. This is done because we need more information than just the error code: we also want to return the `offset` and `partition` for EOF messages, as they may be useful for consumers. I first tried to add these additional fields to the `Baton` but found it too complicated to figure out how to delete the data at the right time. This approach also makes serializing the data to JSON easier, as `ToV8Object` just had to be extended for the EOF message type. I checked all use sites of this function and changed them accordingly.
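To make the trade-off described above concrete, here is a simplified model of the two behaviours. It is not the node-rdkafka implementation: `collectBatch`, the item shapes, and the `'PARTITION_EOF'` string are all hypothetical names chosen for this sketch.

```javascript
// Simplified model only; collectBatch and the item shapes are hypothetical.
// Each polled item is either { value, partition, offset }
// or { error, partition, offset }.
function collectBatch(polled, errorsAsMessages) {
  const messages = [];
  for (const item of polled) {
    if (!item.error) {
      messages.push(item);
      continue;
    }
    if (errorsAsMessages) {
      // Keep the error in the result so its partition and offset survive.
      messages.push(item);
      continue;
    }
    // A callback-style result can carry an error or messages, not both.
    if (messages.length > 0) {
      return { err: null, messages }; // the error is dropped here
    }
    return { err: item.error, messages: [] };
  }
  return { err: null, messages };
}

const polled = [
  { value: 'a', partition: 0, offset: 41 },
  { error: 'PARTITION_EOF', partition: 0, offset: 42 },
];

const legacy = collectBatch(polled, false);
const withFlag = collectBatch(polled, true);
console.log(legacy.messages.length);   // 1: the EOF after the message is lost
console.log(withFlag.messages.length); // 2: the EOF is returned alongside it
```

In this model the flag decides whether an error that arrives after ordinary messages is dropped or returned next to them, which is the situation the description calls most common for EOFs.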