-
Notifications
You must be signed in to change notification settings - Fork 29.1k
[SPARK-30491][INFRA][2.4] Enable dependency audit files to tell dependency classifier #27178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your first contribution, @mengCareers . |
|
ok to test |
|
Test build #116571 has finished for PR 27178 at commit
|
|
@dongjoon-hyun @gengliangwang May I adjust the comments for this PR first? If this PR is finalized, I would apply the same changes for #27177, and then ask you for a final review. |
|
Test build #116735 has finished for PR 27178 at commit
|
|
Test build #116743 has finished for PR 27178 at commit
|
|
Test build #116794 has finished for PR 27178 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @mengCareers . It looks good to me. Could you update your PR on master branch now?
We need to finalize and merge on master branch first. Also, we are in the middle of vote on Apache Spark 2.4.5. So, branch-2.4 is currently locked for this type of patch backporting.
|
@dongjoon-hyun how does voting for 2.4.5 affect this since it will not be part of the artifact? |
|
It could in theory break the build, so matters w.r.t. managing an RC process. But indeed this won't affect any functionality of Spark, so would be easy enough to merge for a next RC if necessary, or after 2.4.5 is finalized. |
|
My PR against the master branch has been updated. |
|
Since RC1 vote is cancelled with several feedbacks, I'll merge this for next RC. Thank you all. |
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Merged to branch-2.4.
…dency classifier ### What changes were proposed in this pull request? Enable dependency audit files to tell the value of artifact id, version, and classifier of a dependency. For example, `avro-mapred-1.8.2-hadoop2.jar` should be expanded to `avro-mapred/1.8.2/hadoop2/avro-mapred-1.8.2-hadoop2.jar` where `avro-mapred` is the artifact id, `1.8.2` is the version, and `haddop2` is the classifier. ### Why are the changes needed? Dependency audit files are expected to be consumed by automated tests or downstream tools. However, current dependency audit files under `dev/deps` only show jar names. And there isn't a simple rule on how to parse the jar name to get the values of different fields. For example, `hadoop2` is the classifier of `avro-mapred-1.8.2-hadoop2.jar`, in contrast, `incubating` is the version of `htrace-core-3.1.0-incubating.jar`. Reference: There is a good example of the downstream tool that would be enabled as yhuai suggested, > Say we have a Spark application that depends on a third-party dependency `foo`, which pulls in `jackson` as a transient dependency. Unfortunately, `foo` depends on a different version of `jackson` than Spark. So, in the pom of this Spark application, we use the dependency management section to pin the version of `jackson`. By doing this, we are lifting `jackson` to the top-level dependency of my application and I want to have a way to keep tracking what Spark uses. What we can do is to cross-check my Spark application's classpath with what Spark uses. Then, with a test written in my code base, whenever my application bumps Spark version, this test will check what we define in the application and what Spark has, and then remind us to change our application's pom if needed. In my case, I am fine to directly access git to get these audit files. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Code changes are verified by generated dependency audit files naturally. Thus, there are no tests added. Closes #27178 from mengCareers/depsOptimize2.4. Lead-authored-by: Xinrong Meng <meng.careers@gmail.com> Co-authored-by: mengCareers <meng.careers@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
What changes were proposed in this pull request?
Enable dependency audit files to tell the value of artifact id, version, and classifier of a dependency.
For example,
avro-mapred-1.8.2-hadoop2.jarshould be expanded toavro-mapred/1.8.2/hadoop2/avro-mapred-1.8.2-hadoop2.jarwhereavro-mapredis the artifact id,1.8.2is the version, andhaddop2is the classifier.Why are the changes needed?
Dependency audit files are expected to be consumed by automated tests or downstream tools.
However, current dependency audit files under
dev/depsonly show jar names. And there isn't a simple rule on how to parse the jar name to get the values of different fields. For example,hadoop2is the classifier ofavro-mapred-1.8.2-hadoop2.jar, in contrast,incubatingis the version ofhtrace-core-3.1.0-incubating.jar.Reference: There is a good example of the downstream tool that would be enabled as @yhuai suggested,
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Code changes are verified by generated dependency audit files naturally. Thus, there are no tests added.