Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-14366][tests] Annotate MiniCluster tests in flink-tests with AlsoRunWithSchedulerNG #9900

Conversation

zhuzhurk
Copy link
Contributor

@zhuzhurk zhuzhurk commented Oct 15, 2019

What is the purpose of the change

This PR is to annotate all MiniCluster tests with AlsoRunWithSchedulerNG in flink-tests, so that we can know breaking changes in time when further improving the new generation scheduler.

All annotated tests should pass with both legacy and ng scheduler.

7 failed tests are not annotated and will be fixed in separate PRs:

Brief change log

  • annotated all MiniCluster tests with AlsoRunWithSchedulerNG in flink-tests
  • fixed failed tests

Verifying this change

This change is already covered by the annotated tests themselves.
It's also manually verified by running mvn verify -Dscheduler-ng.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented Oct 15, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 52978d1 (Thu Oct 24 18:02:04 UTC 2019)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Oct 15, 2019

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build

@zhuzhurk
Copy link
Contributor Author

@flinkbot run travis

@GJL GJL self-assigned this Oct 16, 2019
@zhuzhurk
Copy link
Contributor Author

@flinkbot run travis

simplejason pushed a commit to simplejason/flink that referenced this pull request Oct 17, 2019
@zhuzhurk
Copy link
Contributor Author

@flinkbot run travis

@GJL GJL removed their assignment Oct 17, 2019
@tillrohrmann tillrohrmann self-assigned this Oct 17, 2019
Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @zhuzhurk. The changes look good to me. I'll run the tests locally to see that they pass with the new scheduler. If they pass and Travis gives green light, then I'll merge this PR.

walterddr pushed a commit to walterddr/flink that referenced this pull request Oct 18, 2019
@zhuzhurk zhuzhurk force-pushed the FLINK_14366_AlsoRunWithSchedulerNG_tests branch from 5eb806c to cfe3e5a Compare October 18, 2019 08:38
@zhuzhurk
Copy link
Contributor Author

Some failed tests were left behind. I have fixed or removed the annotation for them.

Here's a result of the testing with scheduler NG https://travis-ci.com/zhuzhurk/flink/builds/132499016.
image

Thescheduler_ng job runs all annotated tests in core modules and flink-tests.

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WindowCheckpointingITCase failed on my machine when running the tests locally. This seems to be reproducible. Maybe we may not activate this tests to run with the new scheduler right now.

@zhuzhurk
Copy link
Contributor Author

zhuzhurk commented Oct 18, 2019

The WindowCheckpointingITCase failed on my machine when running the tests locally. This seems to be reproducible. Maybe we may not activate this tests to run with the new scheduler right now.

You are right. The test failure is unstable but reproducible. The issue looks similar to that of EventTimeWindowCheckpointingITCase, which should be caused by FLINK-14375. I tried run it with local fixes of FLINK-14375 and no failure happens after a hundred of retries.

Will remove the annotation from it in this PR. And annotate it in FLINK-14371 along with EventTimeWindowCheckpointingITCase.

…lsoRunWithSchedulerNG

AbstractTestBase in flink-test-utils is also annotated here to enabled tests based on it.
7 failed tests are not included and will be fixed in separate PRs:
* ClassLoaderITCase, EventTimeWindowCheckpointingITCase and WindowCheckpointingITCase in FLINK-14371
* KeyedStateCheckpointingITCase in FLINK-14372
* ZooKeeperHighAvailabilityITCase in FLINK-14373
* RegionFailoverITCase in FLINK-14374
* BatchFineGrainedRecoveryITCase in FLINK-14440
@zhuzhurk zhuzhurk force-pushed the FLINK_14366_AlsoRunWithSchedulerNG_tests branch from cfe3e5a to 52978d1 Compare October 18, 2019 14:00
@tillrohrmann
Copy link
Contributor

Thanks for the update @zhuzhurk. I'll try to run it again.

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are passing now. Merging this PR now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants