
Use "provided" hadoop-* dependencies in Spark metadata client #4399

Closed
arielshaqed opened this issue Oct 19, 2022 · 0 comments · Fixed by #4400

Comments

@arielshaqed
Contributor

hadoop-* dependencies are not well shaded. By including them in our
assembly we make users' lives harder. Make them "provided": any Spark
installation in actual use already provides them (and if not, users
should include e.g. hadoop-aws explicitly).
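
In sbt terms, the change amounts to something like the following sketch (the coordinates, version value, and module list here are assumptions, not the project's actual build definition). A "provided" dependency is compiled against but left out of the assembly jar, so the Spark runtime's own hadoop classes are used instead:

```scala
// build.sbt -- sketch only; coordinates and version are assumptions.
val hadoopVersion = "3.2.1"

libraryDependencies ++= Seq(
  // "provided": available at compile time, excluded from the assembly
  // jar; the Spark installation supplies these classes at runtime.
  "org.apache.hadoop" % "hadoop-common" % hadoopVersion % "provided",
  "org.apache.hadoop" % "hadoop-aws"    % hadoopVersion % "provided"
)
```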

arielshaqed self-assigned this Oct 19, 2022
arielshaqed added a commit that referenced this issue Oct 20, 2022
Fixes #4399.

Tested by manually including hadoop-aws packages in Spark.
arielshaqed added a commit that referenced this issue Oct 20, 2022
* Use "provided" hadoop dependencies

Fixes #4399.

Tested by manually including hadoop-aws packages in Spark.

* Instruct users to provide hadoop-aws to the metadata client's `spark-submit` invocation (see the sketch after this list)

* Document hadoop-aws 3.2.1 with Hadoop 3

(Was 2.7.7, which is just wrong)

* [CR] Add CHANGELOG
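
As a hedged example of the `spark-submit` instruction above (the class and jar names are placeholders; only the hadoop-aws coordinate follows the documented Hadoop 3 version), users would now pass the dependency explicitly since it is no longer bundled:

```sh
# hadoop-aws is no longer bundled in the assembly, so supply it via
# --packages. Class and jar names below are placeholders.
spark-submit \
  --packages org.apache.hadoop:hadoop-aws:3.2.1 \
  --class com.example.MetadataClientMain \
  metadata-client-assembly.jar
```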