
Use "provided" hadoop-* dependencies in Spark metadata client #4399

Closed
arielshaqed opened this issue Oct 19, 2022 · 0 comments · Fixed by #4400

Comments

@arielshaqed
Contributor

hadoop-* dependencies are not well shaded. By including them in our
assembly we make users' lives harder. Make them "provided": any Spark
installation in actual use already provides them (and if not, users
should include e.g. hadoop-aws explicitly).
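
In sbt terms, the change amounts to something like the following sketch (the coordinates, version value, and module list here are assumptions, not the project's actual build definition). A "provided" dependency is compiled against but left out of the assembly jar, so the Spark runtime's own hadoop classes are used instead:

```scala
// build.sbt -- sketch only; coordinates and version are assumptions.
val hadoopVersion = "3.2.1"

libraryDependencies ++= Seq(
  // "provided": available at compile time, excluded from the assembly
  // jar; the Spark installation supplies these classes at runtime.
  "org.apache.hadoop" % "hadoop-common" % hadoopVersion % "provided",
  "org.apache.hadoop" % "hadoop-aws"    % hadoopVersion % "provided"
)
```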

arielshaqed self-assigned this Oct 19, 2022
arielshaqed added a commit that referenced this issue Oct 20, 2022
Fixes #4399.

Tested by manually including hadoop-aws packages in Spark.
arielshaqed added a commit that referenced this issue Oct 20, 2022
* Use "provided" hadoop dependencies

Fixes #4399.

Tested by manually including hadoop-aws packages in Spark.

* Instruct users to provide hadoop-aws to the metadata client's `spark-submit` invocation (see the sketch after this list)

* Document hadoop-aws 3.2.1 with Hadoop 3

(Was 2.7.7, which is just wrong)

* [CR] Add CHANGELOG
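
As a hedged example of the `spark-submit` instruction above (the class and jar names are placeholders; only the hadoop-aws coordinate follows the documented Hadoop 3 version), users would now pass the dependency explicitly since it is no longer bundled:

```sh
# hadoop-aws is no longer bundled in the assembly, so supply it via
# --packages. Class and jar names below are placeholders.
spark-submit \
  --packages org.apache.hadoop:hadoop-aws:3.2.1 \
  --class com.example.MetadataClientMain \
  metadata-client-assembly.jar
```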