SPARK-5134 [BUILD] Bump default Hadoop version to 2+ #5027
Conversation
… version reported by published Maven artifacts.)
Test build #28609 has finished for PR 5027 at commit
I want to double- and triple-check this. I'm in favor, and I think @pwendell is in favor, since it reflects how Spark is already published against Hadoop 2.2. It doesn't remove support for older Hadoop. I'd like to merge tomorrow.
Looks good - thanks for committing this, Sean.
This PR seems to have broken spark-perf. Not sure why, but the executor stderr logs have the following:
cc @JoshRosen
Suspicion is it's just a Hadoop 1 vs. 2 issue, since spark-ec2 (which we use for spark-perf testing) launches clusters with Hadoop 1 by default. Will confirm.
Confirmed. Simply building Spark with the Hadoop version explicitly set to 1.0.4 resolves this issue. |
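For reference, pinning the Hadoop version at build time would look something like the following. This is an illustrative invocation based on how Spark's Maven build exposes `hadoop.version`, not a command quoted from this thread:

```shell
# Build Spark against Hadoop 1.0.4 explicitly, overriding the default
# (illustrative sketch; exact profiles/flags may differ by Spark version).
mvn -Dhadoop.version=1.0.4 -DskipTests clean package
```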
How about setting up Hadoop 2 on EC2 by default? |
Yeah, I asked about that some time ago, and I believe the concern was about surprising users by changing defaults, plus the fact that the Hadoop 2 distro used by spark-ec2 is somehow not a "real" distro. @shivaram could explain more.
Yeah spark-ec2 does not support Hadoop 2 right now, though there has been a patch sitting around for a while now |
Bump default Hadoop version to 2.2.0. (This is already the dependency version reported by published Maven artifacts.) See JIRA for further discussion.
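A change like this typically amounts to bumping a single property in the parent `pom.xml`. The snippet below is a sketch of that kind of edit; the property name matches Spark's Maven build, but the surrounding context and the old value shown are assumptions, not quoted from the diff:

```xml
<!-- Sketch: bump the default Hadoop dependency version in the parent pom.xml.
     Users can still override it at build time with -Dhadoop.version=... -->
<properties>
  <hadoop.version>2.2.0</hadoop.version>  <!-- was a Hadoop 1.x default before this change -->
</properties>
```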