[misc] Commit read offsets in Kafka integration tests #4310
Conversation
Shouldn't we actually be deleting the topics after the test finishes?
Yes, and as far as I know, we are doing this. Why do you ask?
@pnowojski only checking that we're actually not reading the same topic again in the tests.
Maybe some method Javadoc explaining that would be nice.
I would like to step back a bit and revisit tests that use this method by first discussing:
For the consumer or mapper side it is natural to use that kind of validating mapper, because you can just add it at the end of your pipeline. For producer tests it isn't, because you would need to spawn an additional Flink job for this purpose, which seems unnatural to me. It would add a test dependency on the consumer code (a bug in the consumer would, or could, break producer tests, making the error messages very confusing). Furthermore, using a second Flink job would definitely be heavier and more time- and resource-consuming: this second job would need to execute exactly the same code as those methods, but wrapped in an additional layer (a Flink application). Lastly, this wrapping would add extra complexity that could make these tests more prone to intermittent failures and timeouts. If you have the data written somewhere, why wouldn't you want to read it directly? One more bonus reason for doing it as it is: it makes it possible to test producers without spawning any Flink job at all in some mini IT cases (which I'm doing in tests for
@pnowojski alright, that makes sense.
Travis seems to have a large number of abnormal timeouts, though. I'm not sure whether that is really related to this change or not. Could you do a rebase on the latest master so that the recent Travis build changes are included, and we wait for another Travis run?
Previously offsets were not committed, so the same records could be read more than once. It was not a big issue, because so far these methods were used only for at-least-once tests.
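The offset semantics behind that fix can be illustrated with a small self-contained sketch. This is plain Java, not the actual Flink or Kafka client API; the class and method names here are hypothetical, and the "topic" and "consumer group" are modeled as an in-memory list and map:

```java
import java.util.*;

// Illustrative model of Kafka offset-commit semantics (not real client code):
// a "topic" is a list of records, and each consumer group tracks the offset
// it has committed for that topic.
public class OffsetCommitSketch {
    static final List<String> topic = Arrays.asList("r0", "r1", "r2");
    static final Map<String, Integer> committed = new HashMap<>();

    // Read all records from the group's last committed offset onward,
    // optionally committing the new offset afterwards.
    static List<String> read(String group, boolean commit) {
        int from = committed.getOrDefault(group, 0);
        List<String> out = new ArrayList<>(topic.subList(from, topic.size()));
        if (commit) {
            committed.put(group, topic.size()); // commit the read offset
        }
        return out;
    }

    public static void main(String[] args) {
        // Without committing, a second read sees the same records again
        // (the duplicate-read problem this PR's commit message describes).
        System.out.println(read("no-commit", false)); // [r0, r1, r2]
        System.out.println(read("no-commit", false)); // [r0, r1, r2]
        // With committing, the second read starts after the last record.
        System.out.println(read("commit", true));
        System.out.println(read("commit", true)); // []
    }
}
```

For an at-least-once test the duplicate reads were tolerable, which is why the missing commit went unnoticed until now.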
I didn't know that you can have a disconnected graph in Flink :) It shouldn't be caused by this commit, since it is included in my other PR. Rebased, and let's make sure that it passes.
yes, Travis passes @tzulitai :)
Ok :) LGTM, merging.
Previously offsets were not committed, so the same records could be read more than once. It was not a big issue, because so far these methods were used only for at-least-once tests. This closes #4310.
Thanks!