Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MINOR] Remove inappropriate type notation and extra anonymous closure within functional transformations #12413

Closed
wants to merge 11 commits into from

Conversation

HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Apr 15, 2016

What changes were proposed in this pull request?

This PR removes

  • Inappropriate type notations
    For example, from

    words.foreachRDD { (rdd: RDD[String], time: Time) =>
    ...

    to

    words.foreachRDD { (rdd, time) => 
    ...
  • Extra anonymous closure within functional transformations.
    For example,

    .map(item => {
      ...
    })

    which can be just simply as below:

    .map { item => 
      ...
    }

and corrects some obvious style nits.

How was this patch tested?

This was tested after adding rules in scalastyle-config.xml, which ended up with not finding all perfectly.

The rules applied were below:

  • For the first correction,
<check customId="NoExtraClosure" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
    <parameters><parameter name="regex">(?m)\.[a-zA-Z_][a-zA-Z0-9]*\(\s*[^,]+s*=>\s*\{[^\}]+\}\s*\)</parameter></parameters>
</check>
<check customId="NoExtraClosure" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
    <parameters><parameter name="regex">\.[a-zA-Z_][a-zA-Z0-9]*\s*[\{|\(]([^\n>,]+=>)?\s*\{([^()]|(?R))*\}^[,]</parameter></parameters>
</check>
  • For the second correction
<check customId="TypeNotation" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
    <parameters><parameter name="regex">\.[a-zA-Z_][a-zA-Z0-9]*\s*[\{|\(]\s*\([^):]*:R))*\}^[,]</parameter></parameters>
</check>

Those rules were not added

@HyukjinKwon
Copy link
Member Author

cc @srowen

@@ -149,8 +149,8 @@ case class ExceptionFailure(
this(e, accumUpdates, preserveCause = true)
}

def exception: Option[Throwable] = exceptionWrapper.flatMap {
(w: ThrowableSerializationWrapper) => Option(w.exception)
def exception: Option[Throwable] = exceptionWrapper.flatMap { w =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: in this case, it could even be flatMap(w => Option(w.exception)). Is this different at all as far as Scala is concerned, without the braces at all? I could look it up but figured you might already know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must be my mistake (while trying to be careful). Let me correct this soon. Thank you.

@srowen
Copy link
Member

srowen commented Apr 15, 2016

LGTM with a few questions

@SparkQA
Copy link

SparkQA commented Apr 15, 2016

Test build #55914 has finished for PR 12413 at commit bd7dec4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Apr 15, 2016

@srowen the commits I just submitted change some more obviously meaningless names to anonymous names and correct parentheses in the classes I changed.

@SparkQA
Copy link

SparkQA commented Apr 15, 2016

Test build #55922 has finished for PR 12413 at commit c6b3cb8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 15, 2016

Jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 15, 2016

Test build #55923 has finished for PR 12413 at commit 0ffb4d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 15, 2016

Test build #55924 has finished for PR 12413 at commit 08de54d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 15, 2016

Test build #55927 has finished for PR 12413 at commit 08de54d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 15, 2016

Huh, pretty sure that means it actually passed, since there are 2/3 passes and no changes in between.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Apr 15, 2016

Recently I just have noticed Docker integration test is being failed time to time (eg. #12382 (comment)). Please allow me to run a test again.

@HyukjinKwon
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Apr 15, 2016

Test build #55950 has finished for PR 12413 at commit 08de54d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Apr 16, 2016

Test build #55978 has finished for PR 12413 at commit 08de54d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Apr 16, 2016

Am I doing something wrong?

@HyukjinKwon
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Apr 16, 2016

Test build #55987 has finished for PR 12413 at commit 08de54d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 16, 2016

Jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 16, 2016

Test build #55996 has finished for PR 12413 at commit 08de54d.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 16, 2016

Jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 16, 2016

Test build #56002 has finished for PR 12413 at commit 08de54d.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Apr 16, 2016

I really don't get it. I have been looking into this deeper for a whole day but I really have no idea. Let me merge upstream and if that doesn't work, I will try to take the changes out class by class.

@srowen
Copy link
Member

srowen commented Apr 16, 2016

These all look spurious

@srowen
Copy link
Member

srowen commented Apr 16, 2016

Jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 16, 2016

Test build #56006 has finished for PR 12413 at commit 08de54d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Apr 16, 2016

Merged to master

@asfgit asfgit closed this in 9f678e9 Apr 16, 2016
@rxin
Copy link
Contributor

rxin commented Apr 16, 2016

Hey why are we removing the explicit types? Some of them are here to provide strong guards on the input. I'm not sure what this is accomplishing

@rxin
Copy link
Contributor

rxin commented Apr 16, 2016

Also I don't think we should ever have a rule that forbids explicit type annotation. That is just asking for trouble.

@srowen
Copy link
Member

srowen commented Apr 16, 2016

The title is somewhat misleading. 95% of the change is removing extra layers of braces. There are a few instances where types were removed where they appeared redundant, along the way. "Inappropriate" isn't quite the right word; maybe superfluous. No style rules have been added (see PR), though he wrote down what he tried but didn't enact.

So the question is about things like this?

new JavaDoubleRDD(rdd.flatMap(fn).map((x: jl.Double) => x.doubleValue()))

becoming

new JavaDoubleRDD(rdd.flatMap(fn).map(_.doubleValue()))

?

In the method, fn is already known to produce jl.Double so this is redundant. It does not add type safety that I can see. I supported removing a few instances of this type merely to make it more readable. The others are in examples, where they show user code with these types that aren't necessary.

If I missed a reason for that -- or if you just really want that type -- those changes are minor either way and could be undone. But I reviewed all the changes again and they look correct as intended.

@rxin
Copy link
Contributor

rxin commented Apr 16, 2016

How is the type (rdd: RDD[(String, Int)], time: Time) obvious? I'd rather roll that back. But more importantly, while I think it is worthwhile to get the codebase to some consistent style (e.g. the {{ one is good), some of the things changed in this pr (the ones that removed explicit typing, or replaced x => with _. are very questionable. They don't really improve clarity or achieve any stronger consistency (because we have tons of other places that are using either style).

@@ -431,7 +431,7 @@ private[sql] object StatFunctions extends Logging {
s"exceed 1e4. Currently $columnSize")
val table = counts.groupBy(_.get(0)).map { case (col1Item, rows) =>
val countsRow = new GenericMutableRow(columnSize + 1)
rows.foreach { (row: Row) =>
rows.foreach { row =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here it is definitely unclear what the input type is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but there are a thousand places in the code where a local val's type isn't obvious (e.g. what's rows above?). The question is readability, and I do agree this isn't obviously better on that dimension. I wouldn't have touched it for its own sake. I scanned this as a line where we were already changing the braces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are agreeing with me that it is not obvious, and making the type more explicit is good here, but yet you are arguing we should change it to make it less obvious?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW in Spark SQL there are internal row types and external row types, so it is actually really not obvious what the row type is here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the type isn't obvious, but that can't be the standard, or else we'd write types almost everywhere in Scala. While I don't see the reason this is a special case (you identify a decent reason right above though), I also would not have changed it just for its own sake. I personally would accept the change if the line were being modified, but it wasn't. Hence on a second look I'd be neutral, and wouldn't oppose undoing this, even if I think it's really 6 of one, half-dozen of the other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"I agree the type isn't obvious, but that can't be the standard"

That should be the standard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a tiny point but is actually a decent opportunity to talk about how we collectively should write the most readable code. As I say, either you mean all the types in this method are obvious to you (they aren't to me!), or you mean you'd actually write just about every type in this method, and I dare say no Scala programmer would. Putting aside cases where the type is essential (methods, or where the inferred reference type must be overridden), this is why I'd say the standard is certainly readability, not strictly whether the type is obvious. I bet we mean the same thing. It's going to be a judgment call that deserves relatively wide latitude, but that does argue against changing a call in the first place unless it seems materially better, and that standard was not met on this line, I agree in retrospect. While I might undo this, not some others you're pointing out, if you feel strongly about reverting a few of these types of changes and that lets us close this out, that LGTM

@rxin
Copy link
Contributor

rxin commented Apr 16, 2016

@HyukjinKwon I think we should either ...

  1. revert this patch and you submit another one that just fixes the straightforward problems
    or
  2. you go ahead and submit another pr that reverts the questionable changes.

In general I think we should not change code for the sake of changing code if the impact is minimal, and definitely not if they are regressing.

@srowen
Copy link
Member

srowen commented Apr 16, 2016

The type isn't that obvious, but it's also not necessary code-wise or to understand the example. I think. Generally the code doesn't write these types of course. This is an example however, for people to learn from. But, I also see most examples don't write types in this case. Seems reasonable to be consistent.

I suggested the further changes to lines that were being changed already (cf. above). There are some additional changes sort of in the same spirit that were not on lines already being modified for the core purpose of the change. Each of the changes looks OK to me, but the gain is very small. If I could wave a wand and make all the code follow one style -- it would be of the sort reflected in this change. We're not going to literally do that, but chipping a way a little is a tiny net improvement.

While I don't think it's worth modifying further, and certainly not undoing the core {{...}} changes, I wouldn't object if you undo the other changes, if you feel strongly. The blame "damage" is done, I'm afraid.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Apr 17, 2016

I understood the type does not look obvious but I think in the way similar with @srowen.
I do not feel strongly I should undo the changes such as removed explicit typing and replaced x => with _.
I can't decide for now.
Let me please take an action maybe after hearing other opinions.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Apr 17, 2016

@rxin Let me simply just submit a minor PR to remove the changes such as removed explicit typing and replaced x => with _. (I asked this to dev-mailing list, here)

@rxin
Copy link
Contributor

rxin commented Apr 17, 2016

no need to do the x => to _ one. I wouldn't do the change in the first place, but now they are done, changing them back doesn't really do much either so let's just leave them there. The explicit type ones are the one that actually matter.

@rxin
Copy link
Contributor

rxin commented Apr 17, 2016

btw thanks!

asfgit pushed a commit that referenced this pull request Apr 18, 2016
… StatFunctions)

## What changes were proposed in this pull request?

This PR reverts some changes in #12413. (please see the discussion in that PR).

from
```scala
    words.foreachRDD { (rdd, time) =>
    ...
```

to
```scala
    words.foreachRDD { (rdd: RDD[String], time: Time) =>
    ...
```

Also, this was discussed in dev-mailing list, [here](http://apache-spark-developers-list.1001551.n3.nabble.com/Question-about-Scala-style-explicit-typing-within-transformation-functions-and-anonymous-val-td17173.html)

## How was this patch tested?

This was tested with `sbt scalastyle`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #12452 from HyukjinKwon/revert-explicit-typing.
lw-lin pushed a commit to lw-lin/spark that referenced this pull request Apr 20, 2016
…e within functional transformations

## What changes were proposed in this pull request?

This PR removes

- Inappropriate type notations
    For example, from
    ```scala
    words.foreachRDD { (rdd: RDD[String], time: Time) =>
    ...
    ```
    to
    ```scala
    words.foreachRDD { (rdd, time) =>
    ...
    ```

- Extra anonymous closure within functional transformations.
    For example,
    ```scala
    .map(item => {
      ...
    })
    ```

    which can be just simply as below:

    ```scala
    .map { item =>
      ...
    }
    ```

and corrects some obvious style nits.

## How was this patch tested?

This was tested after adding rules in `scalastyle-config.xml`, which ended up with not finding all perfectly.

The rules applied were below:

- For the first correction,

```xml
<check customId="NoExtraClosure" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
    <parameters><parameter name="regex">(?m)\.[a-zA-Z_][a-zA-Z0-9]*\(\s*[^,]+s*=>\s*\{[^\}]+\}\s*\)</parameter></parameters>
</check>
```

```xml
<check customId="NoExtraClosure" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
    <parameters><parameter name="regex">\.[a-zA-Z_][a-zA-Z0-9]*\s*[\{|\(]([^\n>,]+=>)?\s*\{([^()]|(?R))*\}^[,]</parameter></parameters>
</check>
```

- For the second correction
```xml
<check customId="TypeNotation" level="error" class="org.scalastyle.file.RegexChecker" enabled="true">
    <parameters><parameter name="regex">\.[a-zA-Z_][a-zA-Z0-9]*\s*[\{|\(]\s*\([^):]*:R))*\}^[,]</parameter></parameters>
</check>
```

**Those rules were not added**

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#12413 from HyukjinKwon/SPARK-style.
lw-lin pushed a commit to lw-lin/spark that referenced this pull request Apr 20, 2016
… StatFunctions)

## What changes were proposed in this pull request?

This PR reverts some changes in apache#12413. (please see the discussion in that PR).

from
```scala
    words.foreachRDD { (rdd, time) =>
    ...
```

to
```scala
    words.foreachRDD { (rdd: RDD[String], time: Time) =>
    ...
```

Also, this was discussed in dev-mailing list, [here](http://apache-spark-developers-list.1001551.n3.nabble.com/Question-about-Scala-style-explicit-typing-within-transformation-functions-and-anonymous-val-td17173.html)

## How was this patch tested?

This was tested with `sbt scalastyle`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#12452 from HyukjinKwon/revert-explicit-typing.
@HyukjinKwon HyukjinKwon deleted the SPARK-style branch January 2, 2018 03:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants