Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race in e2e tests: consumer is not up and running after calling .run() #1629

Open
Baptiste-Garcin opened this issue Oct 6, 2023 · 1 comment

Comments

@Baptiste-Garcin
Copy link

Hi,

Describe the bug
I am facing a really frustrating issue while coding some e2e tests around kafka consumers. This issue only triggers during CI making it really hard to reproduce.

I am testing a component that subscribe to a topic A, execute some business logic and publish the result in a topic B. In order to test it properly, I add a dummyConsumer who will receive the event from topic B and make it available for a bunch of expect.

My problem is that most of the times, the first test fails because the dummyConsumer never received the message to be tested. I am using wait-for-expect so I am waiting around 5 seconds before concluding that the message is lost somewhere. I notice, and that's the workaround I am using, that if I explicitly wait for 1 second between subscribing with the dummy consumer and starting to publish the message, my issue is gone.

To Reproduce

  1. Start a consumer that subscribe to topic A and forward every message to topic B
  2. Start a consumer (dummyConsumer) that subscribe to topic B
  3. Start a series of test that publish in topic A and run expect on the message received in topic B

As it is a data race, here are my hooks and what they do:

  • beforeAll: start the dummyConsumer and the producer we will use in the coming tests
  • beforeEach: instantiate a new version of the consumer we are testing and subscribe to the relevent topic.
  • afterEach: shutdown the tested consumer
  • afterAll: shutdown the producer and the dummyConsumer

Expected behavior
Every message should be received by the dummyConsumer.

Observed behavior
The first (and sometimes second) messages never reach the dummyConsumer. If I add a timeOut of 1 second before starting each test (so in the beforeEach), this issue is gone. This last workaround makes me think that the promise returned by consumer.run() does not imply that the consumer is up and ready to receive message.

Environment:

  • OS: alpine: 3.17
  • KafkaJS version 2.2.4 (^2.1.0 in the package.json)
  • Kafka version 3.5.1
  • NodeJS version 18.18
  • Vitest version 0.31.4

Additional context
I checked the "Instrumentation Events" documentation but I didn't find a relevant event. I tried waiting for some of those signals but I never succeed in avoiding to wait explicitly 1 second between each test.

@wrslatz
Copy link

wrslatz commented Mar 16, 2024

I was able to work around this issue by using https://kafka.js.org/docs/admin#a-name-reset-offsets-by-timestamp-a-reset-consumer-group-offsets-by-timestamp implemented in #604.

Specifically, using

await admin.setOffsets({ groupId, topic, partitions: await admin.fetchTopicOffsetsByTimestamp(topic, timestamp) })

and using a consumer on the same groupId and topic causes that consumer to start consuming messages at the required offset. We then followed this example for a promise that resolves when a matching message is found within consumer.run.

Hope this helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants