Not consuming the first message on the topic #62
I'm using the following versions, if it helps:
|
Updated to |
Verified that it happens in both sync and async commitmode. Line 432 in 00a6a9e
Next message on topic "foo" (offset 1) is ignored as offset was already committed. Don't understand why this happens only after restart and not while running 🤔 |
@astubbs : Added a simple integration-test that reproduces the issue: JorgenRingen@ad52540 |
Hi @mauricioszabo! Welcome to the project and thanks for the feedback! Sorry I missed this issue - been a bit crazy over the New Year :) That’s awesome you’re using it with Clojure! It’s been on my list of languages to play with for years… I'll put this at the top of my todo list now - I'm very much a bugs-first kind of guy. Ah, good old off-by-one bugs :) Thank you again @JorgenRingen! Great detective work there, people - though I’m sad that this wasn’t already covered in our test suite :/ Perhaps it is, and it only happens against a real broker 🤔 ... I've iterated a bit on the test and added another one that demonstrates what @mauricioszabo describes, while keeping @JorgenRingen's identification of where it first goes wrong. On another note, @mauricioszabo - if you come across any area where the API could be improved for Clojure users, please let us know! It's possible we can still tweak things, or create a Clojure wrapper in a separate module. (That goes for any JVM language, BTW.) |
Btw, did you come across this by accident, or were you intentionally boundary testing? Have you had good results with the library otherwise? I wonder if there are other interesting boundary conditions that aren't covered by integration tests. |
Ok, I think I've got it. Try out the linked PR? It's caused when resuming when there is no encoded offset information - the first message will always be skipped. The reason it works on the 3rd invocation is that nothing was committed in the 2nd (no messages were processed, because the 2nd message was skipped). In the third invocation, it again skips the 2nd message (the +1 bug) and continues from what it sees as the +2 message in the queue (which it is incorrectly told was the previous resume point) - something like that - the code probably explains it better :) |
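For readers following along, here is a minimal sketch of the off-by-one described above. It is a toy model, not the library's code: the class `OffsetResumeSketch` and all of its methods are hypothetical names made up for illustration. The only fact it relies on is the Kafka convention that a committed offset points at the next record to read, so a resume that adds one more skips a record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the resume logic discussed in this issue; NOT the library's code.
class OffsetResumeSketch {

    // Kafka convention: the committed offset is the offset of the NEXT record to read.
    static long correctResumeOffset(long committedOffset) {
        return committedOffset;
    }

    // Off-by-one variant: treats the committed offset as if it were already processed.
    static long buggyResumeOffset(long committedOffset) {
        return committedOffset + 1;
    }

    // One consumer run: reads every record from the resume point up to the end of the log
    // and returns the offsets it consumed.
    static List<Long> consume(long resumeOffset, long logEndOffset) {
        List<Long> consumed = new ArrayList<>();
        for (long offset = resumeOffset; offset < logEndOffset; offset++) {
            consumed.add(offset);
        }
        return consumed;
    }

    // Commit only when something was processed; otherwise the previous commit stays.
    static long commitAfter(List<Long> consumed, long previousCommit) {
        return consumed.isEmpty() ? previousCommit : consumed.get(consumed.size() - 1) + 1;
    }

    public static void main(String[] args) {
        long committed = 0;

        // Run 1: one record in the log, fresh start at offset 0.
        List<Long> run1 = consume(committed, 1);
        committed = commitAfter(run1, committed);

        // Run 2: a second record arrives; the buggy resume starts one past the commit.
        List<Long> run2 = consume(buggyResumeOffset(committed), 2);
        committed = commitAfter(run2, committed);

        // Run 3: a third record arrives; the commit never moved, so offset 1 is skipped again.
        List<Long> run3 = consume(buggyResumeOffset(committed), 3);

        System.out.println(run1 + " " + run2 + " " + run3); // prints "[0] [] [2]"
    }
}
```

Swapping `buggyResumeOffset` for `correctResumeOffset` in runs 2 and 3 makes the model consume offsets 1 and 2 in order, which matches the behaviour expected in the report.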
Just tested PR locally and issue seems to be solved as far as I can see 👍 Would be nice if @mauricioszabo could verify :)
Agree, thought it might be uncovered in the |
@astubbs Is it possible to prioritize this as a patch-release? Application-restarts are very common on platforms like kubernetes, so it causes a few to many skipped messages :-) (current workaround is using offset=latest and unique consumer-group id on startup) |
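A rough sketch of what that workaround could look like as consumer configuration, assuming "offset=latest" refers to the standard Kafka consumer setting `auto.offset.reset=latest`. The class name `RestartWorkaroundConfig`, the helper `consumerProperties`, and the `my-app` prefix are hypothetical; only the two property keys are real Kafka consumer settings. Connection settings such as `bootstrap.servers` are left out.

```java
import java.util.Properties;
import java.util.UUID;

// Hypothetical helper illustrating the workaround; names are made up for this sketch.
class RestartWorkaroundConfig {

    static Properties consumerProperties(String groupIdPrefix) {
        Properties props = new Properties();
        // A new group id on every startup means the group has no committed offsets to resume from.
        props.setProperty("group.id", groupIdPrefix + "-" + UUID.randomUUID());
        // With no committed offset, the consumer falls back to this policy and starts at the log end.
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProperties("my-app");
        System.out.println(props.getProperty("group.id"));
        System.out.println(props.getProperty("auto.offset.reset"));
    }
}
```

Because each start gets a fresh group, the resume path that triggers the skip is never taken; the trade-off is that the consumer starts from the end of the log rather than from a previously committed position.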
Releasing today :) |
…e encoded in metadata #62 Added simple test for reproducing issue 62 where offset is skipped after restart Test for full example as described Test for first error
…o offsets are encoded in metadata confluentinc#62 The off by one issue would cause the first message to be skipped in some situations. Added simple test for reproducing issue 62 where offset is skipped after restart Test for full example as described Test for first error
Tested and verified 👍 |
Ok, this is a really strange bug: I'm trying to use parallel-consumer with Clojure. I'm using a local Kafka cluster (only one broker) to test things on my machine.
If I send a single message, and fire up the consumer, it consumes that message. So far, so good.
If I immediately stop the consumer, send another message, and fire up the consumer again... nothing happens. The message is not consumed at all.
BUT, if I stop the consumer, send ANOTHER message, and fire up the consumer again, it consumes ONLY the new message... not the old one.
If I send a batch of messages, the same problem happens: it ignores the first message, and consumes the rest. Here's a video for reference:
parallel-error.mp4
The code that causes this error is the following: please notice that there's nothing special about it - just instantiates a single consumer that prints the received message: