[BEAM-9162] Upgrade Jackson to version 2.10.2 #10643
Run JavaPortabilityApi PreCommit |
Can you run the linkage checker against all the Beam modules that have a direct dependency on Jackson and enumerate the before and after, similar to the PR description in #10631? |
$ grep -l -R "library.java.jackson" --include \*.gradle . | sort
./runners/apex/build.gradle
./runners/core-construction-java/build.gradle
./runners/core-java/build.gradle
./runners/direct-java/build.gradle
./runners/extensions-java/metrics/build.gradle
./runners/flink/flink_runner.gradle
./runners/gearpump/build.gradle
./runners/google-cloud-dataflow-java/build.gradle
./runners/google-cloud-dataflow-java/worker/build.gradle
./runners/google-cloud-dataflow-java/worker/legacy-worker/build.gradle
./runners/samza/build.gradle
./runners/spark/build.gradle
./sdks/java/core/build.gradle
./sdks/java/extensions/google-cloud-platform-core/build.gradle
./sdks/java/extensions/jackson/build.gradle
./sdks/java/extensions/sql/build.gradle
./sdks/java/harness/build.gradle
./sdks/java/io/amazon-web-services2/build.gradle
./sdks/java/io/amazon-web-services/build.gradle
./sdks/java/io/cassandra/build.gradle
./sdks/java/io/elasticsearch/build.gradle
./sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/build.gradle
./sdks/java/io/google-cloud-platform/build.gradle
./sdks/java/io/hadoop-file-system/build.gradle
./sdks/java/io/hcatalog/build.gradle
./sdks/java/io/kafka/build.gradle
./sdks/java/io/kinesis/build.gradle
./sdks/java/io/synthetic/build.gradle
./sdks/java/maven-archetypes/examples/build.gradle
./sdks/java/testing/nexmark/build.gradle |
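A minimal sketch for turning the build-file list above into Gradle project paths, which can be handy when invoking a task per module. It assumes each project path mirrors its directory (for example `./sdks/java/core/build.gradle` becomes `:sdks:java:core`); that is an assumption about the build layout, not something confirmed in this thread, and it says nothing about artifact ids. The function name `to_project_paths` is hypothetical.

```shell
# Reads build-file paths (one per line) on stdin and prints Gradle project paths.
# Assumption: the project path mirrors the directory of its build.gradle.
# Files not named build.gradle (e.g. flink_runner.gradle) are skipped.
to_project_paths() {
  sed -n 's|^\./\(.*\)/build\.gradle$|:\1|p' | tr '/' ':'
}
```

It can be fed directly from the grep above: `grep -l -R "library.java.jackson" --include \*.gradle . | sort | to_project_paths`.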
Is there any generic way to analyze all the modules at once? Artifact ids don't coincide with directories, so building the artifact list manually is painful and error-prone. |
We could create a Jenkins task for that and just consult it in the future if we end up doing it that way, no? |
@suztomo maybe you know a way I can run the linkage checker analysis on the full set of Beam modules? I think it is more scalable to have a task that we invoke during PRs to validate that no regressions are introduced, as suggested by Luke. (I can do that in Maven but my gradle-fu is still not good enough). |
I want that job too! The challenge is that because of the many existing linkage errors, I'd have to compare
Like a code coverage report. As I don't know how to do that, I'm still doing it |
Well, the manual comparison is not ideal but we can cope with that for the moment. What I don't want is to type the command for the 31 modules of this PR and then have to change it for every other dependency upgrade. I just want some sort of
Let me think about that this week. https://issues.apache.org/jira/browse/BEAM-9206 For this PR, I would only check the modules that use jackson: |
Hehe, so the 31 that I mentioned above. Mmm, not an easy proposition to sell. On the other hand, I can help with the Jenkins part if you come up with an incantation that works locally for all modules. |
If the linkage checker had a way to ignore pre-existing linkage failures, we could turn it into a test by enumerating all known failures statically. The linkage checker would complain if there was a new failure that wasn't pre-existing, or if a pre-existing failure wasn't being reported anymore (allowing us to maintain the list over time). The vendored gRPC 1.26.0 reduced the number of warnings in beam-sdks-java-core down to 4. Also, running the linkage checker per module would be useful, and I can help with the Gradle bit if there was some good way to have the linkage checker main return a non-zero status code on linkage errors, and if it supported enumerating pre-existing failures somehow. |
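The baseline idea described above can be sketched outside the linkage checker itself. This is only an illustration under assumptions: it presumes each run's errors have already been normalized to one error per line in a text file (a hypothetical format, not the checker's actual output), and the function name `check_against_baseline` is made up for this example.

```shell
# check_against_baseline BASELINE CURRENT
# Both files hold one linkage error per line (hypothetical normalized format).
# Returns 0 only when the current run reports exactly the baseline set.
check_against_baseline() (
  LC_ALL=C; export LC_ALL           # same collation for sort and comm
  base=$(mktemp) && cur=$(mktemp) || exit 2
  trap 'rm -f "$base" "$cur"' EXIT
  sort -u "$1" > "$base"
  sort -u "$2" > "$cur"
  new=$(comm -13 "$base" "$cur")    # reported now, not in the baseline
  fixed=$(comm -23 "$base" "$cur")  # in the baseline, no longer reported
  status=0
  if [ -n "$new" ]; then
    printf 'New linkage errors:\n%s\n' "$new"
    status=1
  fi
  if [ -n "$fixed" ]; then
    printf 'Baseline entries no longer reported (remove them):\n%s\n' "$fixed"
    status=1
  fi
  exit "$status"
)
```

Failing on stale baseline entries as well as on new errors is what keeps the static list from drifting over time.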
@lukecwik do you think we can get just the Gradle part that runs the linkage checker and outputs the errors for all modules in the meantime? That would allow me to run it and do the manual comparison so we can make progress on this PR. |
If you want to check linkage for all relevant modules then use:
I believe that is the entire set of modules that aren't archetypes, model, or vendored dependencies. I think there is an issue with beam-sdks-java-extensions-sql-zetasql and beam-sdks-java-extensions-zetasketch which you might need to exclude from the list above. |
I wouldn't rely on the result of the linkage check for an artifact list. To ensure each Beam artifact is checked independently, I created a shell script that runs "checkJavaLinkage" for each artifact:
For this issue, today I made a release of Linkage Checker 1.1.3 and I just raised PR #10721. |
Ok so finally back to this one after soooo long. Can you PTAL @lukecwik I really would like to have this one as part of 2.20.0. Thx |
@suztomo I found that when we run the linkage checks on the hcatalog module it takes way too long and then ends up with a weird OOM error. I am wondering if there is something odd going on, like the checker assuming some recursive dependency. Can you PTAL? |
I've been working on the slowness of Linkage Checker. GoogleCloudPlatform/cloud-opensource-java#1145 |
Awesome @suztomo, my issue here was more about the OOM error, but it is probably related. Eager to retest when ready. |
We use a mix of jackson-dataformat-csv / jackson-dataformat-xml since it is brought in transitively through our dependencies such as To improve our current usage we need to ensure that we declare 2.10.2 versions of jackson libraries in
I would say that this PR is a net positive since the prior version of Jackson didn't match the dataformat versions anyway but your analysis points to some simple additional changes we could do to improve consistency because of what a downstream dependency is bringing in. Filed https://issues.apache.org/jira/browse/BEAM-9352 for further improvements. |
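To spot the kind of version mismatch between the Jackson core and dataformat libraries discussed above, a small filter over a list of resolved dependencies can help. This is a sketch under assumptions: the input is taken to be plain Maven coordinates of the form `group:artifact:version`, one per line (producing that list from the build is left out), and `find_jackson_mismatches` is a hypothetical helper name.

```shell
# find_jackson_mismatches EXPECTED_VERSION
# Reads Maven coordinates (group:artifact:version, one per line) on stdin and
# prints every com.fasterxml.jackson* coordinate whose version differs from
# EXPECTED_VERSION. Exit status is 1 if any mismatch was printed, else 0.
find_jackson_mismatches() {
  awk -F: -v want="$1" '
    $1 ~ /^com\.fasterxml\.jackson/ && $3 != want { print; bad = 1 }
    END { exit (bad ? 1 : 0) }
  '
}
```

Usage would look like `find_jackson_mismatches 2.10.2 < coordinates.txt`, where `coordinates.txt` is a placeholder for whatever dependency listing is available.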
Thanks Luke, I will take a look at the other ticket. |
Regarding the output of Linkage Checker, the conflict from optional dependencies shouldn't cause a problem. The checker is conservative and reports errors even in optional or provided dependencies. |
Agreed |
Forgot I had one last question around this one. @lukecwik what command did you invoke to output the list of modules that you suggested above? I would like to be aware of it and document it somewhere for the future. |
Another suggestion for the script, @suztomo: I had multiple intermediate failures while trying to validate the modules of this PR, so being able to produce output eagerly could help. Currently we execute the validation for all modules but produce no output until the end. |
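A sketch of what eager, per-artifact output could look like. This is not the script from the gist; it is an illustration under assumptions. In particular, the Gradle flags `-Ppublishing` and `-PjavaLinkageArtifactIds` are my understanding of how `checkJavaLinkage` is invoked in Beam and are not confirmed in this thread, and `run_linkage_checks` and the `GRADLEW` override are hypothetical names.

```shell
# run_linkage_checks OUTDIR ARTIFACT...
# Runs the linkage check once per artifact and reports each result file as
# soon as that artifact finishes, instead of waiting for the whole list.
# Assumption: the Gradle flags below; set GRADLEW to use another launcher.
run_linkage_checks() (
  outdir=$1; shift
  mkdir -p "$outdir" || exit 2
  failed=0
  for artifact in "$@"; do
    out="$outdir/$artifact.txt"
    if ${GRADLEW:-./gradlew} -Ppublishing \
         -PjavaLinkageArtifactIds="$artifact" :checkJavaLinkage > "$out" 2>&1; then
      echo "done: $out"
    else
      echo "FAILED: $out"   # keep going so later artifacts still get checked
      failed=1
    fi
  done
  exit "$failed"
)
```

Because every artifact writes to its own file and the file name is printed immediately, a failing module can be inspected or diffed while the remaining ones are still running.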
@iemejia How about running the diff command while waiting for the result? (The script outputs the file name.) I thought about comparing the results early (see "Early Exit on Failure" below), but it would increase overall execution time.
Current
As of now the script consists of (1) install artifacts and (2) run multiple checkJavaLinkage, for 2 branches.
Total time for "install artifacts" is ~10 minutes (2 * 5).
Early Exit on Failure
I considered changing the loop order as below, but the number of "install artifacts" runs increases if the for-loop over the Beam artifacts is on the outside:
Total time for "install artifacts" is ~30 minutes (3 * 2 * 5). |
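The arithmetic behind the two estimates can be reproduced with a small counting sketch. It assumes the factors in "2 * 5" and "3 * 2 * 5" stand for 2 branches, 5 minutes per install, and 3 artifacts in the example; the last reading is my assumption, and the function names are hypothetical.

```shell
# count_installs ORDER BRANCHES ARTIFACTS
#   branch-outer   : install once per branch, then check every artifact
#   artifact-outer : reinstall for every (artifact, branch) pair
count_installs() (
  order=$1; branches=$2; artifacts=$3; installs=0
  b=0
  case $order in
    branch-outer)
      while [ "$b" -lt "$branches" ]; do
        installs=$((installs + 1))     # one install covers all artifacts
        b=$((b + 1))
      done ;;
    artifact-outer)
      a=0
      while [ "$a" -lt "$artifacts" ]; do
        b=0
        while [ "$b" -lt "$branches" ]; do
          installs=$((installs + 1))   # reinstall after each branch switch
          b=$((b + 1))
        done
        a=$((a + 1))
      done ;;
    *) exit 2 ;;
  esac
  echo "$installs"
)

# install_minutes ORDER BRANCHES ARTIFACTS MINUTES_PER_INSTALL
install_minutes() (
  n=$(count_installs "$1" "$2" "$3") || exit 2
  echo $((n * $4))
)
```

With these assumptions, `install_minutes branch-outer 2 3 5` gives 10 and `install_minutes artifact-outer 2 3 5` gives 30, matching the two totals quoted above.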
I looked through the gist that you provided and copied out the relevant lines. |
R: @lukecwik