Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33912][SQL] Refactor DependencyUtils ivy property parameter #30928

Closed
wants to merge 8 commits into from

Conversation

AngersZhuuuu
Copy link
Contributor

What changes were proposed in this pull request?

According to #29966 (comment)
we'd better refactor ivy property parameters.

Why are the changes needed?

Refactor code to use Option instead of null

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existed UT

@github-actions github-actions bot added the CORE label Dec 25, 2020
@AngersZhuuuu
Copy link
Contributor Author

I see many parameter use null as default value, if we can refactor all parameter to Option?

@AngersZhuuuu
Copy link
Contributor Author

FYI @xkrogen @maropu @dongjoon-hyun

@SparkQA
Copy link

SparkQA commented Dec 25, 2020

Test build #133381 has finished for PR 30928 at commit 83686af.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 27, 2020

Test build #133403 has finished for PR 30928 at commit 978e994.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -588,7 +588,7 @@ private[spark] class SparkSubmit extends Logging {
OptionAssigner(args.deployMode, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,
confKey = SUBMIT_DEPLOY_MODE.key),
OptionAssigner(args.name, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES, confKey = "spark.app.name"),
OptionAssigner(args.ivyRepoPath, ALL_CLUSTER_MGRS, CLIENT, confKey = "spark.jars.ivy"),
OptionAssignerWrapper(args.ivyRepoPath, ALL_CLUSTER_MGRS, CLIENT, confKey = "spark.jars.ivy"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine either way honestly.
My main concern is consistency - why are these options handled differently from those using OptionAssigner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine either way honestly.
My main concern is consistency - why are these options handled differently from those using OptionAssigner?

Since I want to add a constructor to support Option[String]. Emmm seems will cause None.get exception.
Change to to use orNull

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, OptionAssigner takes an object or null, not an Option. Ok either way. It's fine as is now.

@SparkQA
Copy link

SparkQA commented Dec 28, 2020

Test build #133416 has started for PR 30928 at commit 7da53f2.

var packages: String = null
var repositories: String = null
var ivyRepoPath: String = null
var packages: Option[String] = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but why do we bother change this? Looks like there are a lot of nulls above too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but why do we bother change this? Looks like there are a lot of nulls above too.

Yea, that's why I asked about #30928 (comment)

Since want to refactor parameter to use Option[String], but here are just String, so change this place .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String vs null is a valid way probably in Java context. Option might be preferred since we're in Scala but I wouldn't change it just for this reason. The gain is small compared to the size of change and potential maintenance overhead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String vs null is a valid way probably in Java context. Option might be preferred since we're in Scala but I wouldn't change it just for this reason. The gain is small compared to the size of change and potential maintenance overhead.

So would you prefer to keep current or revert this change and just make it as Option when use method

      val resolvedMavenCoordinates = DependencyUtils.resolveMavenDependencies(
        packagesTransitive = true, Option(args.packagesExclusions), Option(args.packages),
        Option(args.repositories), Option(args.ivyRepoPath), Option(args.ivySettingsPath)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wouldn't particularly mind when I read these codes. If you still prefer, feel free to wait for other committers' approval or review. I don't mind if any committer prefers and wants to merge it in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, wait for other committer's review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be neutral on it - but if we're just changing a few instances? not sure it's worth the inconsistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be neutral on it - but if we're just changing a few instances? not sure it's worth the inconsistency.

I decide to change this since there is already a Option[String]

var ivySettingsPath: Option[String] = None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm neutral on this change, too. I find xxx: String = null in many places of the core package and I'm not sure that this partial fix can make any benefit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm neutral on this change, too. I find xxx: String = null in many places of the core package and I'm not sure that this partial fix can make any benefit.

Ok, revert this change. about this.How about current

@SparkQA
Copy link

SparkQA commented Dec 28, 2020

Test build #133432 has finished for PR 30928 at commit e09d8e0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Dec 29, 2020

Test build #133468 has finished for PR 30928 at commit e09d8e0.

  • This patch fails from timeout after a configured wait of 500m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Dec 29, 2020

Test build #133497 has finished for PR 30928 at commit e09d8e0.

  • This patch fails from timeout after a configured wait of 500m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 30, 2020

Test build #133511 has started for PR 30928 at commit c7d1aee.

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against it, but, is this simplifying anything?

@AngersZhuuuu
Copy link
Contributor Author

retest this please

@AngersZhuuuu
Copy link
Contributor Author

I'm not against it, but, is this simplifying anything?

Just a refactor about method parameter, to avoid handle null and empty string such as getOrElse(""), make these method more readable from a Scala perspective.

@SparkQA
Copy link

SparkQA commented Dec 31, 2020

Test build #133552 has finished for PR 30928 at commit c7d1aee.

  • This patch fails from timeout after a configured wait of 500m.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@xkrogen xkrogen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current diff seems like a good tradeoff to me, I think it brings more consistency between the SparkSubmitUtils.(build|load)IvySettings and resolveMavenDependencies and generally makes the intent more clear. IMO Option is better, at least in this case where it isn't leveraged from Java code, and we're already using it in nearby places so even if it is a small improvement it's still helpful.

That being said, I recognize the previous comments that null is used in many other places as well, so we still haven't made everything consistent.

Comment on lines 168 to 172
if (packagesExclusions.nonEmpty) {
packagesExclusions.map(_.split(",")).get
} else {
Nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use idiomatic Option processing, either a match or packagesExclusions.map(_.split(",")).getOrElse(Nil)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to use idiomatic Option processing, either a match or packagesExclusions.map(_.split(",")).getOrElse(Nil)

DOne

}

SparkSubmitUtils.resolveMavenCoordinates(packages, ivySettings,
SparkSubmitUtils.resolveMavenCoordinates(packages.get, ivySettings,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be packages.orNull ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be packages.orNull ?

Yea

@SparkQA
Copy link

SparkQA commented Jan 5, 2021

Test build #133625 has finished for PR 30928 at commit de0eee1.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38214/

@SparkQA
Copy link

SparkQA commented Jan 5, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38214/

@SparkQA
Copy link

SparkQA commented Jan 5, 2021

Test build #133636 has finished for PR 30928 at commit de1225d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38224/

@SparkQA
Copy link

SparkQA commented Jan 5, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38224/

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Apr 16, 2021
@github-actions github-actions bot closed this Apr 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants