
Move error handling of Consumer.consume timeouts to Node and allow to disable #443

Closed

Conversation

JaapRood
Contributor

Finally I had the opportunity to follow up on the discussions in #404. By moving the error handling logic for consuming a number of messages to the Node layer, it can be further controlled to implement such things as exponential back-off when being throttled by the broker.

In order not to introduce any breaking changes, I added an optional options argument to consumer.consume. The timeoutErrors option can be set to true to prevent time-out errors from being handled, so callers can handle those themselves. By default they are ignored and an empty array of messages is returned instead.
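For concreteness, here is a minimal sketch of how a caller might use the proposed option. The `timeoutErrors` name comes from this PR, but the exact argument order, defaults, and the configuration values are assumptions for illustration:

```js
const Kafka = require('node-rdkafka');

// Illustrative configuration; group id and broker list are placeholders.
const consumer = new Kafka.KafkaConsumer({
  'group.id': 'example-group',
  'metadata.broker.list': 'localhost:9092'
}, {});

consumer.on('ready', () => {
  consumer.subscribe(['example-topic']);

  // Proposed default: time-out errors are swallowed and the callback
  // simply receives an empty array of messages.
  consumer.consume(100, (err, messages) => {
    // messages.length === 0 when the consume call timed out
  });

  // With the proposed option enabled, time-out errors reach the caller,
  // who can then implement something like exponential back-off.
  consumer.consume(100, { timeoutErrors: true }, (err, messages) => {
    if (err) {
      // back off before calling consume again
      return;
    }
    // process messages
  });
});

consumer.connect();
```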

@JaapRood
Contributor Author

There's some work left to be done, especially documentation-wise, but I was hoping to get @webmakersteve's thoughts on this before investing more time into that.

@webmakersteve
Contributor

Going to need to hold off on doing this right away. It is definitely a breaking change, and I would almost prefer that the functionality just change outright, rather than providing an option to not handle the error cases.

The point of the streams (Writable and Readable) is that they can manage the lower-level APIs of consuming from Kafka for people who do not want to deal with error cases, back-off, full queues, etc. It makes sense for that logic to be handled there, but for the lower-level functions, like consume, the library should strive to provide 1:1 mappings.
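To illustrate the layering being described (a sketch using the existing node-rdkafka stream API, not code from this PR): the Readable stream already wraps the lower-level consumer and deals with errors and queueing internally, while `consume` itself stays close to librdkafka and leaves that to the caller.

```js
const Kafka = require('node-rdkafka');

// Higher level: the Readable stream manages errors, back-off and queueing
// internally. Configuration values are placeholders.
const stream = Kafka.KafkaConsumer.createReadStream({
  'group.id': 'example-group',
  'metadata.broker.list': 'localhost:9092'
}, {}, { topics: ['example-topic'] });

stream.on('data', (message) => {
  console.log(message.value.toString());
});
```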

So... I like the direction we are moving in, but it is not extreme enough! I will likely be making a feature branch to work on this, but my time is very limited right now!

@JaapRood
Contributor Author

I like where you're going with the changes not being extreme enough! On top of this change I'm currently running a reimplemented KafkaConsumerStream, which pauses assignments rather than polling consumer.consume when messages aren't being consumed, fixing a whole class of problems like #440. Most notably, it also moves the retry logic once again.
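For context, here is a rough sketch of that pause/resume idea. This is not the actual KafkaConsumerStream reimplementation, just the general pattern, assuming node-rdkafka's `pause`, `resume` and `assignments` calls on an already-connected consumer:

```js
const { Readable } = require('stream');

// Sketch: stop filling the local queue while downstream isn't reading,
// by pausing the current assignments instead of polling consume().
class PausingConsumerStream extends Readable {
  constructor(consumer, opts) {
    super({ ...opts, objectMode: true });
    this.consumer = consumer; // an already-connected, subscribed KafkaConsumer
    this.assignmentsPaused = false;
  }

  _read() {
    // Downstream wants data again: resume any paused assignments.
    if (this.assignmentsPaused) {
      this.consumer.resume(this.consumer.assignments());
      this.assignmentsPaused = false;
    }

    this.consumer.consume(100, (err, messages) => {
      if (err) return this.destroy(err);

      for (const message of messages) {
        if (!this.push(message)) {
          // Back-pressure: pause assignments rather than keep polling.
          this.consumer.pause(this.consumer.assignments());
          this.assignmentsPaused = true;
          break;
        }
      }
      // A real implementation would also retry when no messages arrive,
      // handle destroy/cleanup, and avoid overlapping consume calls.
    });
  }
}
```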

That said, it's quite a big change with many opinions applied, which I can imagine will take some time to get through. I was hoping that, by using this backwards-compatible API, we could take a step in the right direction, enabling other implementations like that to be tried on top while shipping as much as possible straight away. It should hopefully get us to a point where we can try multiple things and not make a breaking change until we've got a proven way forward. The more common ground, rather than local hacks, the better I'd say!

@JaapRood
Contributor Author

Maybe it's an idea to merge this but leave it undocumented for now? That would give those interested in collaborating common ground to do so, without this intermediate step ever being officially supported.

@JaapRood
Contributor Author

JaapRood commented Jan 2, 2019

Having done more work on consumer streams lately, I've come to the realisation that we actually don't want these changes. The time-outs described are not time-outs from librdkafka fetching from Kafka, as fetching happens through a mechanism separate from consuming messages. Therefore, it would be totally ineffective to change consumption behaviour based on these time-out errors.

JaapRood closed this Jan 2, 2019