[SPARK-34384][CORE] Add missing docs for ResourceProfile APIs #31496

Closed
wants to merge 8 commits into from

Member

@Ngone51 Ngone51 commented Feb 6, 2021

What changes were proposed in this pull request?

This PR adds missing docs for ResourceProfile-related APIs. It also includes a few minor API changes:

  • ResourceProfileBuilder.build -> ResourceProfileBuilder.build()
  • Provides a Java-specific API, allSupportedExecutorResourcesJList
  • Makes ResourceAllocator private, since it was mistakenly exposed previously

Why are the changes needed?

Add missing API docs

Does this PR introduce any user-facing change?

No, as Apache Spark 3.1 hasn't been officially released.

How was this patch tested?

Updated unit tests due to the signature change of build().
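
For readers unfamiliar with the distinction behind the build -> build() change, here is a minimal Scala sketch. `Profile` and `ProfileBuilder` are hypothetical stand-ins and not Spark's ResourceProfile classes. It assumes the common Scala convention that methods doing real work (such as constructing a new object) are declared with empty parentheses, while the parenthesis-free form is kept for field-like accessors.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark classes.
final case class Profile(cores: Int)

class ProfileBuilder {
  private var _cores: Int = 1

  // Fluent setter: records the value and returns the builder for chaining.
  def cores(n: Int): this.type = { _cores = n; this }

  // Declared with empty parentheses because it constructs a new object
  // rather than exposing an existing field.
  def build(): Profile = Profile(_cores)
}

object BuildDemo {
  def main(args: Array[String]): Unit = {
    val profile = new ProfileBuilder().cores(4).build()
    println(profile)
  }
}
```

Java callers write `build()` in either case, so the difference is only visible to Scala callers. A method declared without parentheses cannot be called with them, which is one reason existing call sites may need updating after such a change.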

@@ -25,7 +25,7 @@ import org.apache.spark.SparkException
  * Trait used to help executor/worker allocate resources.
  * Please note that this is intended to be used in a single thread.
  */
-trait ResourceAllocator {
+private[spark] trait ResourceAllocator {
Member Author

This was probably exposed by mistake in 3.0.

Contributor

Would this impact folks working on schedulers outside of org.apache.spark?

Member Author

I don't think it would, since there's no way to plug a custom ResourceAllocator into Spark yet.
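
To make the visibility question in this thread concrete, here is a small sketch of how a package-qualified `private[...]` modifier behaves in Scala. The package names (`example.spark`, `example.thirdparty`), the simplified trait body, and the `GpuInfo` and `Demo` helpers are all assumptions made for illustration; they do not reproduce Spark's actual code.

```scala
package example.spark {
  package resource {
    // Visible to any code under example.spark, hidden from all other packages.
    private[spark] trait ResourceAllocator {
      protected def resourceAddresses: Seq[String]
      def availableAddrs: Seq[String] = resourceAddresses
    }
  }

  package executor {
    import example.spark.resource.ResourceAllocator

    // Code inside example.spark can still extend the trait.
    private[spark] class GpuInfo(addrs: Seq[String]) extends ResourceAllocator {
      protected def resourceAddresses: Seq[String] = addrs
    }

    // Public entry point that uses the hidden types internally.
    object Demo {
      def available: Seq[String] = new GpuInfo(Seq("0", "1")).availableAddrs
    }
  }
}

package example.thirdparty {
  object Main {
    def main(args: Array[String]): Unit = {
      // Outside code can only go through the public API.
      println(example.spark.executor.Demo.available)
      // Uncommenting the next line fails to compile, because the trait
      // cannot be accessed from outside example.spark:
      // class Custom extends example.spark.resource.ResourceAllocator
    }
  }
}
```

Under these assumptions, code outside the owning package that previously extended or referenced the trait directly would stop compiling, which is the compatibility concern raised above.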

@Ngone51
Member Author

Ngone51 commented Feb 6, 2021

cc @tgravescs @jiangxb1987 @mengxr

cc release manager @HyukjinKwon

@SparkQA

SparkQA commented Feb 6, 2021

Test build #134953 has finished for PR 31496 at commit 199ce8d.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39536/

@SparkQA

SparkQA commented Feb 6, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39536/

@SparkQA

SparkQA commented Feb 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39537/

@SparkQA

SparkQA commented Feb 6, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39537/

@SparkQA

SparkQA commented Feb 6, 2021

Test build #134954 has finished for PR 31496 at commit d248c8d.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 7, 2021

Test build #134967 has started for PR 31496 at commit 8e72f87.

@SparkQA

SparkQA commented Feb 7, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39551/

@SparkQA

SparkQA commented Feb 7, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39551/

@tgravescs
Contributor

I'm -1 for any changes here except maybe docs... it is way too late in the release to be changing APIs. This was reviewed and went in.

We have already put out an RC and this is not a blocking bug. I don't want last-minute changes to mess up things that were already thought through.

@tgravescs
Contributor

I'd really like to express my concern over this PR. There is a reason we go through a SPIP process and review the APIs during design and PRs. Many things were discussed, sometimes compromises were made based on feedback, some things may be there to support future enhancements, etc. I really hope this isn't happening in other places in the code.

@HyukjinKwon
Member

HyukjinKwon commented Feb 8, 2021

I don't think this blocks the RC I am preparing now. Same here. Hope we can do this earlier next time.

@tgravescs, though, some of them, like https://github.com/apache/spark/pull/31496/files#diff-ee7e90474f1ce0390fce28f5e4d1d1be689c905ed13069bd869c8689a177e154R150, seem to make sense. Maybe it wouldn't hurt to do a fine-grained review.

For API changes such as https://github.com/apache/spark/pull/31496/files#diff-a6d96a65d9905b310451b125acac6610ffbd6b4548461bd1d5a18dc29282814aL57, we might have to follow the standard deprecation process if we decide to make them.

@mengxr
Contributor

mengxr commented Feb 9, 2021

@tgravescs I think there are two separate questions:

  1. Does it block 3.1 release? I agree with you that it shouldn't. It is indeed too late in the release process and the proposed changes are not critical bugs.
  2. Are the current APIs final? I think they are not. That is why they are labeled Evolving or DeveloperApi. So we still want to hear feedback from users, discuss them, and improve the APIs until we feel confident to mark them as stable.

So how about the following (without blocking the 3.1 release):

  1. In this PR, we complete the ScalaDoc and hide ResourceAllocator, which should have been marked private in 3.0. It seems everyone is okay with those changes.
  2. I hope we can agree on the following minor changes:

If we manage to get the above changes into 3.1 in time, I think it would improve API clarity and consistency. Then we can discuss the other proposed renaming/refactoring afterwards. Does that sound good?

@tgravescs
Contributor

The API, as you state, is Evolving, and that is on purpose so we can extend and change it as people use it and we learn more. I'm happy to hear feedback and improve the API during normal development periods. At this point, when we have already had a 3.1 RC, it is not appropriate to throw in API changes at the last minute; most of them were discussed and chosen for a reason. This just leads to introducing more issues. I asked repeatedly for people to help review the SPIP and the code along the way. That was the time to help with API changes, not right before a release goes out. I think this has happened way too much, and I believe it was even brought up for the 3.0.0 release.

That said, non-API changes or obvious small issues should be fixed.
I'm fine with 1, and I'm fine with 2 for:
Seq -> Array: https://github.com/apache/spark/pull/31496/files#diff-319a3f0dfd7de6045eb11ad960180230c47e6f52b1b27d3c9a1f2d72f1615d9dR284
build -> build(): https://github.com/apache/spark/pull/31496/files#diff-a6d96a65d9905b310451b125acac6610ffbd6b4548461bd1d5a18dc29282814aR71

The APIs in the third point were intentionally added, so I'm against that change at this point. It does look like I forgot to add automated tests for them; we can file a JIRA and I will add them. We shouldn't remove an API because tests were missed. This is a user-facing API, so the fact that it's not used internally doesn't mean it's not useful. You can argue whether a builder should have a clear/remove API, but it really depends on how users use this set of APIs. I have seen many builders with remove APIs.

@mengxr
Contributor

mengxr commented Feb 9, 2021

Sounds good. Let's do minimal changes in this PR and open two JIRAs as follow-ups: 1) complete the missing tests, 2) collect API feedback. cc: @Ngone51 @HyukjinKwon

@HyukjinKwon
Member

SGTM

Contributor

@holdenk holdenk left a comment

Did a quick skim, thanks for working on improving the javadocs :)

@Ngone51
Member Author

Ngone51 commented Feb 18, 2021

Thanks for all the feedback. I've updated the PR to include only 2 API changes (Seq -> Array and build -> build()), as @tgravescs agreed. I've also filed follow-up JIRAs: SPARK-34460, SPARK-34461

Please take another look, thanks!

@SparkQA

SparkQA commented Feb 18, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39800/

@SparkQA

SparkQA commented Feb 18, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39800/

@SparkQA

SparkQA commented Feb 18, 2021

Test build #135219 has finished for PR 31496 at commit ea08237.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tgravescs
Contributor

Can we please rename the PR and JIRA and update the description? It would be good for the JIRA to have a description as well.

@Ngone51 Ngone51 changed the title from "[SPARK-34384][CORE] API cleanup for ResourceProfile" to "[SPARK-34384][CORE] Improve javadoc for ResourceProfile APIs" Feb 19, 2021
@Ngone51 Ngone51 changed the title from "[SPARK-34384][CORE] Improve javadoc for ResourceProfile APIs" to "[SPARK-34384][CORE] Add missing docs for ResourceProfile APIs" Feb 19, 2021
@Ngone51
Member Author

Ngone51 commented Feb 19, 2021

Thanks @tgravescs, I have updated both the PR and the JIRA.

@SparkQA

SparkQA commented Feb 19, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39826/

@SparkQA

SparkQA commented Feb 19, 2021

Test build #135246 has finished for PR 31496 at commit 9e2bd4b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 19, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39826/

@HyukjinKwon
Member

@tgravescs and @holdenk, I plan to cut the RC as soon as possible (after the blocker #31550 is merged), but it seems like this PR includes two small changes that might matter in compatibility:

The changes look pretty fine to me, and I would like to merge this soon. I would appreciate it if you could take another look when you find some time :-).

@srowen
Member

srowen commented Feb 19, 2021

There is also marking ResourceAllocator as private[spark] - everyone is OK with that, to be clear?

@tgravescs
Contributor

Yes, I think marking ResourceAllocator as private is fine.

@Ngone51
Member Author

Ngone51 commented Feb 19, 2021

ResourceAllocator was mistakenly exposed previously, and there's no way to plug a custom ResourceAllocator into Spark. So it should be fine.

@srowen
Member

srowen commented Feb 19, 2021

OK sounds good to merge for 3.1 before the RC? I'll do so if someone doesn't beat me to it.

* Return all supported Spark built-in executor resources, custom resources like GPUs/FPGAs
* are excluded.
*/
def allSupportedExecutorResources: Seq[String] = _allSupportedExecutorResources
Member

Can we just def allSupportedExecutorResources: Array[String] = Array(CORES, MEMORY, OVERHEAD_MEM, PYSPARK_MEM, OFFHEAP_MEM) for both cases?

Member Author

I see, this also looks ok to me.

Contributor

This looks fine to me ... hope you are ok with it @tgravescs
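
As a rough illustration of the suggestion in this thread, the sketch below defines the accessor as a `def` that builds its `Array[String]` on every call. The `ResourceNames` object and the string values assigned to the constants are assumptions made for this example; only the constant names and the method signature come from the review comment. One consequence of this shape is that each caller gets its own array, so mutating a returned array does not affect later callers. From Java, a Scala `Array[String]` is seen as a plain `String[]`.

```scala
// Hypothetical holder object; the string values are illustrative assumptions.
object ResourceNames {
  val CORES = "cores"
  val MEMORY = "memory"
  val OVERHEAD_MEM = "memoryOverhead"
  val PYSPARK_MEM = "pyspark.memory"
  val OFFHEAP_MEM = "offHeap"

  // A def, not a val: a fresh array is allocated on each call.
  def allSupportedExecutorResources: Array[String] =
    Array(CORES, MEMORY, OVERHEAD_MEM, PYSPARK_MEM, OFFHEAP_MEM)
}

object ArrayDemo {
  def main(args: Array[String]): Unit = {
    val first = ResourceNames.allSupportedExecutorResources
    first(0) = "mutated" // only changes this caller's copy

    val second = ResourceNames.allSupportedExecutorResources
    println(second.mkString(", "))
  }
}
```

The trade-off is a small allocation per call, which is usually negligible for a five-element array.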

@SparkQA

SparkQA commented Feb 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39869/

@SparkQA

SparkQA commented Feb 20, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39869/

@SparkQA

SparkQA commented Feb 20, 2021

Test build #135290 has finished for PR 31496 at commit 5ce6fbe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 20, 2021

Test build #135300 has finished for PR 31496 at commit 96d0760.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@HyukjinKwon HyukjinKwon left a comment

Looks good. I will wait for @tgravescs's input at #31496 (comment) (I am going to cut the RC on the next working day).

@HyukjinKwon
Member

@tgravescs, I will merge this one, since I guess we'd all be fine with #31496 (comment), but I will cut the RC on the next working day just in case you have a different thought.

@HyukjinKwon
Member

Merged to master and branch-3.1.

HyukjinKwon pushed a commit that referenced this pull request Feb 21, 2021
### What changes were proposed in this pull request?

This PR adds missing docs for ResourceProfile-related APIs. It also includes a few minor API changes:

* ResourceProfileBuilder.build -> ResourceProfileBuilder.build()
* Provides a Java-specific API, `allSupportedExecutorResourcesJList`
* Makes `ResourceAllocator` private, since it was mistakenly exposed previously

### Why are the changes needed?

Add missing API docs

### Does this PR introduce _any_ user-facing change?

No, as Apache Spark 3.1 hasn't been officially released.

### How was this patch tested?

Updated unit tests due to the signature change of `build()`.

Closes #31496 from Ngone51/resource-profile-api-cleanup.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit 546d2eb)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@Ngone51
Member Author

Ngone51 commented Feb 21, 2021

Thanks all!!
