Skip to content

Conversation

@gaoyunhaii
Copy link
Contributor

@gaoyunhaii gaoyunhaii commented Oct 29, 2020

What is the purpose of the change

This PR adds the e2e tests for the new FileSink. In the e2e tests we mainly focus on cases of TaskMangers get killed, thus we only test the streaming mode since for batch mode, in this version the new file sink still not be able to achieve exactly-once if the ResultPartition gets lost. The stream & batch mode for cases of task failover is covered by FileSinkITCase.

Therefore, the PR is roughly a copy of the E2E test of the StreamingFileSink.

Brief change log

  • 86e3c59 adds the test program and the test scripts.

Verifying this change

This change is a test for itself.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Copy link
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 86e3c59 (Thu Oct 29 11:04:08 UTC 2020)

Warnings:

  • 6 pom.xml files were touched: Check for build and licensing issues.
  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Oct 29, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@gaoyunhaii gaoyunhaii force-pushed the file_sink_e2e_test branch 2 times, most recently from 23f664b to 87f6e43 Compare November 5, 2020 10:41
@aljoscha aljoscha self-requested a review November 5, 2020 11:08
@aljoscha
Copy link
Contributor

aljoscha commented Nov 5, 2020

Instead of copying the test_streaming_file_sink.sh, could we just parameterize that existing test with the test program jar? And maybe rename it to test_file_sink.sh?

@kl0u
Copy link
Contributor

kl0u commented Nov 5, 2020

I would also second @aljoscha 's opinion if this is possible. If I recall correctly it is using the OnCheckpointPolicy so it does not need processing time support. But we have to check again.

@gaoyunhaii
Copy link
Contributor Author

Very thanks for the review! The initial thought is that we might need also support batch mode in the future, which would change the scripts and test program, but I also agree with that it would be better to reuse the current codes to decrease the duplication until they diverge. I'll modify the PR to reuse the existing test~

@gaoyunhaii gaoyunhaii force-pushed the file_sink_e2e_test branch 2 times, most recently from c9f6bed to 62cc004 Compare November 5, 2020 16:39
@gaoyunhaii gaoyunhaii closed this Nov 6, 2020
@gaoyunhaii gaoyunhaii reopened this Nov 6, 2020
@aljoscha
Copy link
Contributor

aljoscha commented Nov 6, 2020

Merged!

@aljoscha aljoscha closed this Nov 6, 2020
@rmetzger
Copy link
Contributor

rmetzger commented Nov 6, 2020

Note: This PR has been merged despite a build failure, which now permanently breaks build on master.

@rmetzger
Copy link
Contributor

rmetzger commented Nov 6, 2020

I reverted the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants