Skip to content

Conversation

@tweise
Copy link
Contributor

@tweise tweise commented Mar 4, 2019

What is the purpose of the change

This PR adds an end to end test for Kinesis consumer and producer, similar to what we already have for Kafka. The test is based on Kinesalite (running as docker container).

Brief change log

Verifying this change

This change adds a test and can be verified as follows:

The test can be run as single test via:

FLINK_DIR=/your/flink/dir/flink-1.9-SNAPSHOT flink-end-to-end-tests/run-single-test.sh ./flink-end-to-end-tests/test-scripts/test_streaming_kinesis.sh

Does this pull request potentially affect one of the following parts:

This change is test code only.

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 4, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

new Thread(
() -> {
try {
env.execute();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tillrohrmann the test currently fails (after it actually succeeded) because the job isn't cancelled and generates exceptions in the log (after Kinesalite was terminated). I could not quite figure out how I can accommodate both the verification and execution within the main method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue is resolved by stopping the flink cluster before Kinesalite in the script. But please see below for need for System.exit(0) to bypass the CLI job check. Is there a better way to fork the job and have main run the test logic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check for exceptions in the log is a piece of questionable design that is hopefully improved in the future. I think so far, we added certain exceptions that can safely be ignored to a white-list in the script function that checks the logs for exceptions. So you could add the Kinesalite exception there to not have it fail your test. If the exception is of a very generic type ... this is where the questionable design part becomes problematic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer an issue. The exceptions are avoided with graceful shutdown, terminating the job prior to Kinesalite.

@tweise tweise force-pushed the FLINK-9007.kinesisTest branch 2 times, most recently from e1c4ed4 to 0811e2e Compare March 5, 2019 18:43
@tweise tweise force-pushed the FLINK-9007.kinesisTest branch 3 times, most recently from 58994a3 to 20eeb11 Compare March 10, 2019 17:49
@tweise tweise changed the title [WIP] [FLINK-9007] [kinesis] Add Kinesis e2e test [FLINK-9007] [kinesis] [e2e] Add Kinesis end-to-end test Mar 10, 2019
@tweise tweise requested a review from StefanRRichter March 10, 2019 17:51
@tweise tweise force-pushed the FLINK-9007.kinesisTest branch 2 times, most recently from b9af7ee to f57c396 Compare March 10, 2019 18:12
@tweise
Copy link
Contributor Author

tweise commented Mar 11, 2019

@StefanRRichter @tillrohrmann PTAL

executeException.set(e);
}
});
executeThread.start();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to also join on this thread somewhere later, maybe before we enter the readMessages part (assuming the generator will not block, if it does join can still happen somewhere later).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is possible before the loop, we can also check the exception status just once right there.

Copy link
Contributor Author

@tweise tweise Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thread that runs the job does not terminate (I don't have a reference to the job and cannot cancel it). The job needs to start after the streams are created and run in parallel to the validation logic. Once the expected results are in, the driver exits and the job/cluster is terminated from the script, prior to terminating Kinesalite.

@tweise
Copy link
Contributor Author

tweise commented Mar 12, 2019

@StefanRRichter could you approve the PR or let me know if anything else should be done here?

@tweise tweise force-pushed the FLINK-9007.kinesisTest branch from 77999cd to 05672ba Compare March 12, 2019 16:59
@zentol
Copy link
Contributor

zentol commented Mar 13, 2019

I recently opened a PR containing a prototype of a java-based E2E framework here.
From what I can tell this should make the implementation easier since you can better separate the actual job execution and preparation/validation logic.

It may make sense to delay this PR until mine has been approved.

@tweise
Copy link
Contributor Author

tweise commented Mar 13, 2019

@StefanRRichter thanks for taking a look.

It's good to learn that we have an effort to provide a Java based e2e framework. It should simplify new tests and make existing ones more maintainable.

This Kinesis test was under discussion for a while and it is a current gap in coverage. I would like to merge it and also backport it to 1.8.x. Happy to make changes to adopt the new framework when it is ready, but we should not block on it.

As for separating test logic and program, that's more or less already the case. The pipeline code, the test driver and the Kinesis client (in the future "resource") live in separate classes. Should be easy to port to the new framework when it is ready.

@StefanRRichter
Copy link
Contributor

Test looks good to me, if @zentol is ok with delaying the port to the new framework I would approve and merge.

@zentol
Copy link
Contributor

zentol commented Mar 15, 2019

I won't block this PR.

Copy link
Contributor

@StefanRRichter StefanRRichter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@tweise tweise merged commit 19570c6 into apache:master Mar 15, 2019
@tweise
Copy link
Contributor Author

tweise commented Mar 15, 2019

@zentol I subscribed to #7605 - let me know once it is far enough and I will port this test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants