-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-27518][tests] Refactor migration tests to support version update automatically #21736
Conversation
@XComp could you have a look at the PR? Very thanks! |
f7edc0b
to
c2c8cbd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gaoyunhaii for providing this test data automation. I think, it's a good way to improve the release procedure. I didn't go through all the test classes. The points I made in StatefulJobSnapshotMigrationITCase
seem to apply also in other test classes. Therefore, I will leave the comments like this for now and wait for your response and/or updates to the PR before going through the rest of the code.
flink-annotations/src/main/java/org/apache/flink/FlinkVersion.java
Outdated
Show resolved
Hide resolved
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Outdated
Show resolved
Hide resolved
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Show resolved
Hide resolved
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Outdated
Show resolved
Hide resolved
String normalizedVersionName = | ||
"v" + versionMatcher.group(1) + "_" + versionMatcher.group(2); | ||
FlinkVersion version = FlinkVersion.valueOf(normalizedVersionName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could move this logic into FlinkVersion
providing a method valueOf(int majorVersion, int minorVersion)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move it to a new constructor function for FlinkVersion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You created the constructor but didn't use it. I'm also not sure whether we should actually use a constructor here. It feels odd to use a constructor outside of the enum definition. I still would vote for providing a static method that returns the right enum value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes indeed, that makes sense, I'll change it to a static method.
...ts/src/test/java/org/apache/flink/test/checkpointing/StatefulJobSnapshotMigrationITCase.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/flink/test/checkpointing/StatefulJobWBroadcastStateMigrationITCase.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/flink/test/checkpointing/StatefulJobWBroadcastStateMigrationITCase.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/flink/test/checkpointing/StatefulJobWBroadcastStateMigrationITCase.java
Outdated
Show resolved
Hide resolved
...ts/src/test/java/org/apache/flink/test/checkpointing/StatefulJobSnapshotMigrationITCase.java
Outdated
Show resolved
Hide resolved
c2c8cbd
to
3759c53
Compare
Hi @XComp sorry it took some time for the fixes, I have update the PR, could you have another round of look? Very thanks for the revewing! |
...tion-test-utils/src/main/java/org/apache/flink/test/migration/SnapshotGeneratorExecutor.java
Outdated
Show resolved
Hide resolved
String normalizedVersionName = | ||
"v" + versionMatcher.group(1) + "_" + versionMatcher.group(2); | ||
FlinkVersion version = FlinkVersion.valueOf(normalizedVersionName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You created the constructor but didn't use it. I'm also not sure whether we should actually use a constructor here. It feels odd to use a constructor outside of the enum definition. I still would vote for providing a static method that returns the right enum value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did another pass over the code but didn't manage to touch all the classes. But I though it was already worth it to push my comments out.
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Outdated
Show resolved
Hide resolved
flink-annotations/src/main/java/org/apache/flink/FlinkVersion.java
Outdated
Show resolved
Hide resolved
...ts/src/test/java/org/apache/flink/test/checkpointing/StatefulJobSnapshotMigrationITCase.java
Outdated
Show resolved
Hide resolved
* <p>For regenerating the binary snapshot files run {@link #writeSnapshot(FlinkVersion)} on the | ||
* corresponding Flink release-* branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we rather refer to the test data generation framework here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove the outdated comments since generating will be done by the RM as a whole.
...rc/test/scala/org/apache/flink/api/scala/migration/StatefulJobSavepointMigrationITCase.scala
Outdated
Show resolved
Hide resolved
.../test/java/org/apache/flink/test/state/operator/restore/unkeyed/ChainLengthIncreaseTest.java
Outdated
Show resolved
Hide resolved
.../test/java/org/apache/flink/test/state/operator/restore/unkeyed/ChainLengthDecreaseTest.java
Outdated
Show resolved
Hide resolved
...ts/src/test/java/org/apache/flink/test/checkpointing/StatefulJobSnapshotMigrationITCase.java
Outdated
Show resolved
Hide resolved
...-core/src/test/java/org/apache/flink/api/common/typeutils/TypeSerializerUpgradeTestBase.java
Show resolved
Hide resolved
3759c53
to
649e590
Compare
Thanks @XComp for the review! I have updated the PR according to the comments. I'll further update the PR if #21736 (comment) is acceptable. |
Thanks for letting me know, I'm gonna have another pass over it. Could you rebase the branch in the meantime? |
Ok, got that, I'll rebase the branch to the latest master. |
649e590
to
10e162c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @gaoyunhaii and sorry for the late response. The 1.17 release keeps me busy. I did another pass and only found minor things when looking over the code change. Could you rebase this branch and check what's going on with the CI failures?
...-core/src/test/java/org/apache/flink/api/common/typeutils/TypeSerializerUpgradeTestBase.java
Outdated
Show resolved
Hide resolved
...-core/src/test/java/org/apache/flink/api/common/typeutils/TypeSerializerUpgradeTestBase.java
Show resolved
Hide resolved
|
||
```xml | ||
<profile> | ||
<id>generate-snapshots</id> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go for generate-migration-test-data
. "data" is already plural - adding the plural to the "test" keyword doesn't add any value but is rather unusual, in my opinion.
10e162c
to
4d5e0e2
Compare
Hi @XComp very thanks for the review! The CI failure is due to that now we have change the test versions to be [xx, recently published version] for all the migration tests, now the latest version has been increased to 1.18, but we have not generate the snapshots for 1.17 yet. I generated the snapshots with this tool. For the formal process in the future, I think the generating might happen at the time of cutting branch, before we adding the new version tag. Do you think this would be reasonable? |
Yeah, I guess that makes sense. About this PR: Could you squash all the commits into reasonable chunks? Especially, the test data generation should be covered in a separate commit. I will do a final pass over it after that is done. |
Hi @XComp I have some concern with this point: for the formal process, if we want to split the generated files, it will still require a lot of manual operations, since there is no explicit mappings from the generated files to the migration classes. Do we think it is necessary to split the commits? Since now it looks to me that they are some kind of auto-managed generated binary files, and the developers seems not have requirements to check its history (In fact it will only have one log that the files are added). If we still think it is necessary to split the commits, I think we might need to extend this functionality to support some kind of post-processing actions that could execute external scripts to commit files for each migrating test classes. What do you think about this point? |
Not sure, whether I understand you in the right way here: I didn't mean to push the generated test da in multiple commits (e.g. one commit per test class). I meant that we want to prepare this PR to have one commit for the refactoring of the test data generation and one commit for the generated data. Does that make sense?🤔 But on the other note: I reiterated on your proposal for how the process should look like. I came to the conclusion that we shouldn't create the test data when creating the release branch. We still have to do it after the new minor release (in our current case 1.17.0) is published. The test data should be generated using the code version of the minor version's git tag (i.e. |
Hi @XComp For the first point, sorry it is indeed my misunderstanding, and we are in fact consistent. For the second point, after some more thoughts it now looks to me we indeed need to do the data generation on publishing instead of cutting branch, since there are still new codes merged during the period before formally published. For migration test data generation, we in fact have a parameter to specify the target version so that the snapshots data could be located in the right location, thus I think this step is ok. The main issue is that currently the migration tests relies on the mostRecentlyPublishedVersion() to list the versions to test against. Before 1.17 is published, there is no test data for release 1.17, and the versions should be [some start version, 1.16], after 1.17 is published, the versions would become [some start version, 1.17]. It looks its not easy to distinguish the two cases without more information. The git tags might not always exist (for example, users download sources from the website) and is not easy to acquire from the java code. Do you think it is ok for us to have a constant in
and the RM will update the variable on publishing after generating the test data ? |
Yeah, one option would be to use a constant variable. The RM would need to update that one along generating the test data. A minor thing: The variable should be named Alternatively, we could write this information into a metadata file that's located in the |
Hi @XComp the option of having the information in the resource direction also looks good to me, I'll update the PR and also squash the commits. |
Hi @XComp sorry when I try to implement the logic of update the file that records the most-recently published file, I found that it is not easy to locate the file:
Do you think it is also ok that we still leave the most-recently flink version in file, but we update it with the bash scripts on publishing? |
should be fine, I guess. |
4d5e0e2
to
92a8586
Compare
Hi @XComp sorry for the long delay due to being bound recently, I have merged the previous commits except for the last change that reads latest published version from the file. I'll further squash it after this piece get reviewed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @gaoyunhaii . I added a few comments. PTAL
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Show resolved
Hide resolved
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Outdated
Show resolved
Hide resolved
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Show resolved
Hide resolved
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Outdated
Show resolved
Hide resolved
...est-utils/src/main/java/org/apache/flink/test/migration/MigrationTestsSnapshotGenerator.java
Outdated
Show resolved
Hide resolved
.../test/java/org/apache/flink/test/state/operator/restore/AbstractOperatorRestoreTestBase.java
Show resolved
Hide resolved
.../test/java/org/apache/flink/test/state/operator/restore/AbstractOperatorRestoreTestBase.java
Show resolved
Hide resolved
...rc/test/scala/org/apache/flink/api/scala/migration/StatefulJobSavepointMigrationITCase.scala
Show resolved
Hide resolved
...t/scala/org/apache/flink/api/scala/migration/StatefulJobWBroadcastStateMigrationITCase.scala
Show resolved
Hide resolved
...igration-test-utils/src/main/java/org/apache/flink/test/migration/PublishedVersionUtils.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...just additional thoughts after I went through generating the data manually for FLINK-31593.
.../test/java/org/apache/flink/test/state/operator/restore/AbstractOperatorRestoreTestBase.java
Outdated
Show resolved
Hide resolved
...fs-tests/src/test/java/org/apache/flink/hdfstests/ContinuousFileProcessingMigrationTest.java
Show resolved
Hide resolved
...t-utils-parent/flink-migration-test-utils/src/main/resources/most_recently_published_version
Outdated
Show resolved
Hide resolved
Hi @gaoyunhaii , I'm wondering what the status of the PR is. I merged FLINK-31593 now which is causing the conflicts, I guess. Would it be possible to finalize this PR rather sooner than later to be able to close this issue? |
Hi @XComp sorry for the long delay due to being heavily occupied in the previous weeks, I'll update the PR today and will try to make it being mergable in this week. |
92a8586
to
3425682
Compare
Hi @XComp Very sorry for the long delay, I updated the PR according to the comments. |
CI is still failing. I didn't do an entire pass over the PR because of that but rather focused on the open discussions. |
Hi @XComp sorry for failing the Ci previously, the CI is now passed, could you have another look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gaoyunhaii . I have just a few nitty comments. But generally, it looks good. I also tried the test data generation and it generated files. I would suggest squashing everything together and rebasing the branch.
Just as a test run: Could you generate the data and commit them so that we would get a full CI run with the generated data? We wouldn't merge that data commit to master
. This would be just so that we have a test run with the generated data. WDYT?
flink-annotations/src/main/java/org/apache/flink/FlinkVersion.java
Outdated
Show resolved
Hide resolved
.../test/java/org/apache/flink/test/state/operator/restore/AbstractOperatorRestoreTestBase.java
Outdated
Show resolved
Hide resolved
.../test/java/org/apache/flink/test/state/operator/restore/AbstractOperatorRestoreTestBase.java
Outdated
Show resolved
Hide resolved
…d refactor existing tests.
71b7e8d
to
6db6455
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍 Great, this took quite some effort. Good job, @gaoyunhaii :-) I verified the CI build in #22567 as well (see related comment). I guess, we're good to go. Could you update the Create Flink Release wiki article accordingly as a follow-up?
Thanks @XComp a lot for the review all the way! I'll update the wiki today. |
What is the purpose of the change
This PR refactors the state migration tests so that when cutting branch, we need only add new version and could generates the states of stale version automatically.
In general, there are two options:
We finally choose the option 2. This is because Maven have a bad support for depending on the tests classes of other modules, we could only use the
test-jar
, which do not support transitive dependency and make it hard to manage these transitive dependencies.Since previously the generating methods are written with JUnit tests, some of them are bounded with the JUnit infrastructures, like
@ClassRule
,@Rule
. To avoid the burden of re-written the generating methods, we have to have some minimum support for JUnit tests interfaces.Except for the generating, during the refactoring we also make each migration tests use a dynamic version lists:
[start, FlinkVersion.last()]
, which free us from manually change the list on cutting branch for each version.Brief change log
Verifying this change
Manually verified the process of
FlinkVersion
.mvn clean package -Pgenerate-snapshots -Dgenerate.version=1.17 -nsu -DskipRat -Dcheckstyle.skip -Drat.ignoreErrors=true -DspotlessFiles=ineffective -Dfast -DskipTests
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation