Skip to content

[SPARK] Config methods simplification at SparkSession#Builder#16823

Closed
pfcoperez wants to merge 1 commit intoapache:masterfrom
pfcoperez:sparksession_buildersimplification
Closed

[SPARK] Config methods simplification at SparkSession#Builder#16823
pfcoperez wants to merge 1 commit intoapache:masterfrom
pfcoperez:sparksession_buildersimplification

Conversation

@pfcoperez
Copy link

@pfcoperez pfcoperez commented Feb 6, 2017

What changes were proposed in this pull request?

SparkSession' companion object Builder inner class presents several implementations for config setter which are exact copies of the same method just varying the value input type.

There are setters for Double, Long and Boolean. And the three of them transform the input value into an string via the default .toString method.

These are basic types and, therefore, child classes of AnyVal. Why explicitly enumerating them in a set of methods replicating code?

The config implementation for String values can be invoke by a single method.

This PR tries to:

  • Reduce the amount of replicated code. That implies that further code updates will require modifying different methods with the dangers derived from possible oversights.
  • Increase the code readability.
  • Add all Scala basic types as possible inputs for the builder configuration pseudo-dsl implemented by SparkSession#Builder

How was this patch tested?

This PRs changes the implementation of already tested, by the current test suites. Hence, no more testing has been added as no actual functionality is added.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

…g code readibility and reducing the probability of forgeting to propagate changes across methods
@andrewor14
Copy link
Contributor

This is a bad idea! First it breaks backward compatibility, and second, we intentionally didn't want to make it so general that the user can pass in any objects. Can you please close this PR?

@srowen
Copy link
Member

srowen commented Feb 6, 2017

Yeah, non-starter I'm afraid.

@pfcoperez
Copy link
Author

pfcoperez commented Feb 6, 2017

@andrewor14 @srowen OK, no discussion: I'll close it.

Just an observation:

image

AnyVal is not Any hence: "any objects" will hardly fit in the input type.

@pfcoperez pfcoperez closed this Feb 6, 2017
@pfcoperez
Copy link
Author

@andrewor14 @srowen In any case, I just wanted to add that copying code is, basically the worst strategy.

If you wanted to constraint the types for those tree and not just AnyVal sub-classes I would recommend doing something as:

def config(key: String, value: Double): Builder = config(key, value.toString)
def config(key: String, value: Boolean): Builder = config(key, value.toString)
def config(key: String, value: Long): Builder = config(key, value.toString)		

Exactly same interface, no copy-paste code.

@srowen
Copy link
Member

srowen commented Feb 6, 2017

Agree, though we are talking about duplicating 1 line of code in 3 nearby places. It's not meaningfully duplicating anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants