Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33288][YARN][FOLLOW-UP][test-hadoop2.7] Fix type mismatch error #30375

Closed
wants to merge 1 commit into from
Closed

[SPARK-33288][YARN][FOLLOW-UP][test-hadoop2.7] Fix type mismatch error #30375

wants to merge 1 commit into from

Conversation

wangyum
Copy link
Member

@wangyum wangyum commented Nov 14, 2020

What changes were proposed in this pull request?

This pr fix type mismatch error:

[error] /home/jenkins/workspace/spark-master-test-sbt-hadoop-2.7-hive-2.3/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala:320:52: type mismatch;
[error]  found   : Long
[error]  required: Int
[error]         Resource.newInstance(resourcesWithDefaults.totalMemMiB, resourcesWithDefaults.cores)
[error]                                                    ^
[error] one error found

Why are the changes needed?

Fix compile issue.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing test.

@github-actions github-actions bot added the YARN label Nov 14, 2020
@@ -317,7 +317,7 @@ private[yarn] class YarnAllocator(
customSparkResources
}
val resource =
Resource.newInstance(resourcesWithDefaults.totalMemMiB, resourcesWithDefaults.cores)
Resource.newInstance(resourcesWithDefaults.totalMemMiB.toInt, resourcesWithDefaults.cores)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SparkQA
Copy link

SparkQA commented Nov 14, 2020

Test build #131090 has finished for PR 30375 at commit 6002504.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Copy link
Member Author

wangyum commented Nov 14, 2020

cc @tgravescs

@SparkQA
Copy link

SparkQA commented Nov 14, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35693/

@SparkQA
Copy link

SparkQA commented Nov 14, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35693/

@wangyum
Copy link
Member Author

wangyum commented Nov 14, 2020

retest this please.

@SparkQA
Copy link

SparkQA commented Nov 14, 2020

Test build #131093 has finished for PR 30375 at commit 6002504.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 14, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35696/

@mridulm
Copy link
Contributor

mridulm commented Nov 14, 2020

Given the value is in MB's, perhaps we should maintain it as int and not do .toInt from a long, thoughts @tgravescs ?

@SparkQA
Copy link

SparkQA commented Nov 14, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35696/

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @wangyum .

Ya. Hadoop 2.7 profile is easily outdated in these days. Given that, hopefully, Apache Spark 3.1 can give more benefits and help to migrate to Hadoop 3.2+ and we can remove Hadoop 2.7 maintenance burden eventually.

BTW, @wangyum and @mridulm and @tgravescs . Do you think it's possible for us to start discussion for dropping Hadoop 2.7 at Apache Spark 3.2?

@mridulm
Copy link
Contributor

mridulm commented Nov 14, 2020

BTW, @wangyum and @mridulm and @tgravescs . Do you think it's possible for us to start discussion for dropping Hadoop 2.7 at Apache Spark 3.2?

Is the proposal to drop 2.7 and move to 2.10 ? Or to drop 2.x entirely and move to hadoop 3.x ?

Hadoop 2.7.4, which we use, was released 3 years back - while 2.10 was released 1 year back.
Given hadoop 3.x is a major version change, IMO we need to support 2.x branch for longer until larger set of users have migrated.

Assuming we are still supporting 2.10, will the issues we are facing get addressed ?

@dongjoon-hyun
Copy link
Member

My initial question was about dropping hadoop-2.x profile completely like we did for hive-1.2, @mridulm . It was just a question for the possibility~

If you think so, I will focus on supporting Hadoop 2.x LTS officially in Apache Spark 3 too.

@dongjoon-hyun
Copy link
Member

I made a PR to protect Hadoop 2 profile. Could you review that, @mridulm ?

@wangyum
Copy link
Member Author

wangyum commented Nov 15, 2020

+1 for supporting Hadoop 2.x LTS officially in Apache Spark 3.

@wangyum wangyum closed this in f660946 Nov 16, 2020
@wangyum
Copy link
Member Author

wangyum commented Nov 16, 2020

Merged to master.

@wangyum wangyum deleted the SPARK-33288 branch November 16, 2020 03:29
@dongjoon-hyun
Copy link
Member

Thank you, @wangyum .

HyukjinKwon pushed a commit that referenced this pull request Nov 16, 2020
### What changes were proposed in this pull request?

This PR aims to protect `Hadoop 2.x` profile compilation in Apache Spark 3.1+.

### Why are the changes needed?

Since Apache Spark 3.1+ switch our default profile to Hadoop 3, we had better prevent at least compilation error with `Hadoop 2.x` profile at the PR review phase. Although this is an additional workload, it will finish quickly because it's compilation only.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the GitHub Action.
- This should be merged after #30375 .

Closes #30378 from dongjoon-hyun/SPARK-33454.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@tgravescs
Copy link
Contributor

thanks for fixing, I don't think we should drop Hadoop 2.x, I think to many people are still using it.

@mridulm
Copy link
Contributor

mridulm commented Nov 16, 2020

@tgravescs any thoughts on this comment ?
This probably got merged before you got to it.

@tgravescs
Copy link
Contributor

sorry missed your comment @mridulm yeah we can just use int since this are all MiB, I'll file a followup and put up a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
7 participants