Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-9283] Disable caching ValidatesRunner tests tasks #14640

Merged
merged 1 commit into from Apr 28, 2021

Conversation

iemejia
Copy link
Member

@iemejia iemejia commented Apr 25, 2021

It seems the Java 11 VR tests on our Jenkins CI are being skipped because they have been previously cached.
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Java11/
This PR disables caches so they are always run.

R: @ibzib

@iemejia iemejia requested a review from ibzib April 25, 2021 20:19
@iemejia
Copy link
Member Author

iemejia commented Apr 25, 2021

Run Spark ValidatesRunner

@iemejia
Copy link
Member Author

iemejia commented Apr 25, 2021

Run Spark ValidatesRunner Java 11

@iemejia
Copy link
Member Author

iemejia commented Apr 25, 2021

Run Spark StructuredStreaming ValidatesRunner

@iemejia
Copy link
Member Author

iemejia commented Apr 25, 2021

Run Flink ValidatesRunner

@ibzib
Copy link
Contributor

ibzib commented Apr 26, 2021

My understanding is, this will cause the validates runner tasks to not be cached, but the build artifacts they depend on could be. Could that cause problems if the build artifacts were built with Java 8, or do we only care about the version used at runtime?

@iemejia
Copy link
Member Author

iemejia commented Apr 27, 2021

For the VR tests specifically what we want is to simulate the real execution use case. Users get the Beam module binaries compiled in Java 8 so we want to build them on Java 8 (because we publish those). Users run then their pipelines (in our case the VR tests) with their code on the specific Java versions they have.

We don't publish Java 11 specific artifacts and we probably won't for some time.

Copy link
Contributor

@ibzib ibzib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@iemejia iemejia merged commit ea107d0 into apache:master Apr 28, 2021
@kennknowles
Copy link
Member

Why should these not be cached? That saves re-running tests when no input to the test has changed. We only need to disable caching when we are deflaking, right?

@iemejia
Copy link
Member Author

iemejia commented Apr 28, 2021

Because the VR tests were not being run at all for Java 11 so we could not have guarantees that there was not breakage related to Java versions.

@kennknowles
Copy link
Member

So it was because of same test target on different Java versions? I am surprised the cache key does not account for this.

@iemejia
Copy link
Member Author

iemejia commented Apr 29, 2021

Well maybe it is the way the job was designed I am not familiar with how the cache key is determined, so if you have suggestions on how to guarantee the execution of the VR tests only when needed (when the commit of the git master has changed between runs) I would be ok to fix it.

Notice that I discovered this when looking at the execution time of the Postcommit jobs, for example
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Java11/184/
took 2m 55s to run which was noticeable short for the full VR set that is around 20 mins long
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Java11/190/

I suspected this might be due to the fact that the git SHA of the previous job did not change so gradle was smart and did not run it, but it does not seem to be the case:
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Java11/184/ - e13e548
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Java11/183/ - 38e879b

Notice that this was also happening for the normal Java 8 Spark VR tests suite too:
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/8823/ - 2m 26s - e13e548
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/8822/ - 2m 24s - 38e879b

And I just looked at Flink and it has the same issue:
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/8853/ - 2m 5s - 9ea3884
https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/8852/ - 15m -
325471d

@kennknowles
Copy link
Member

Gradle does not know about git SHA. It just notices the inputs to the gradle task. So the problem will be if you have a Gradle task that depends on things that are not listed as inputs.

@kennknowles
Copy link
Member

Specifically, https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Java11/184/ correctly does not run the tests, because all the inputs to the test are the same.

@kennknowles
Copy link
Member

With the examples you have presented here, I think this change should be reverted. If there are examples where (a) the sources changes and (b) the tests did not run, then we should keep this PR until we can fix the Gradle config.

@iemejia
Copy link
Member Author

iemejia commented May 6, 2021

Well the issue is that we need to run the VR tests for both Java 8 and Java 11 to guarantee that the runner works consistently in both versions, and I expected this to happen for every commit but what I found on the CI history was that it was not even ran, I don't have a problem reverting this but how can we guarantee that the VR tests are run?

@kennknowles
Copy link
Member

OK yea it seems we need clear separate tasks for the JVMs, or somehow to have the JDK be an input. This is an interesting question for Gradle perhaps. I am surprised if it does not include the JDK in the cache key.

@adude3141
Copy link
Contributor

Gradle should reflect the java version

If this is not working, it might be an issue with building on jdk8, but running on jdk11. This would need some investigations. Declaring some input env var could possibly mitigate this. Like explicitly specifying jdk11 differentiator. Dunno.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants