Skip to content

use collectFirst instand of a var and iterate#3293

Closed
XuefengWu wants to merge 1 commit intoapache:masterfrom
XuefengWu:clean_code_utils
Closed

use collectFirst instand of a var and iterate#3293
XuefengWu wants to merge 1 commit intoapache:masterfrom
XuefengWu:clean_code_utils

Conversation

@XuefengWu
Copy link

No description provided.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this continue to try to delete the remaining directories if one fails? That depends on the laziness of the underlying data structure, right?

This is certainly more concise, but I'm not sure it's clearer. I think the latter is more important, especially when dealing with potential failure scenarios.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scala collection is not lazy, only scala Stream is lazy.

Yours, Xuefeng Wu 吴雪峰 敬上

On 2014年11月17日, at 上午9:21, Aaron Davidson notifications@github.com wrote:

In core/src/main/scala/org/apache/spark/util/Utils.scala:

@@ -759,18 +759,9 @@ private[spark] object Utils extends Logging {
if (file != null) {
try {
if (file.isDirectory && !isSymlink(file)) {

  •      var savedIOException: IOException = null
    
  •      for (child <- listFilesSafely(file)) {
    
  •        try {
    
  •          deleteRecursively(child)
    
  •        } catch {
    
  •          // In case of multiple exceptions, only last one will be thrown
    
  •          case ioe: IOException => savedIOException = ioe
    
  •        }
    
  •      }
    
  •      if (savedIOException != null) {
    
  •        throw savedIOException
    
  •      }
    
  •      listFilesSafely(file).map(child => Try(deleteRecursively(child))).
    
    Will this continue to try to delete the remaining directories if one fails? That depends on the laziness of the underlying data structure, right?

This is certainly more concise, but I'm not sure it's clearer. I think the latter is more important, especially when dealing with potential failure scenarios.


Reply to this email directly or view it on GitHub.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterators are lazy, however. If listFilesSafely were changed to return an iterator instead of list, this would subtly change behavior here. I am not for this change, as it makes the behavior less obvious. The prior form clearly indicated the intended and actual behavior; here it is not clear which one is intended, even if it is clear to the particular reader which one is actually occurring.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if listFilesSafely return a lazy list, the current code doest not work too. the scala for comprehension is only a sugar, it is same as map if there is one iterator.
so
for (child <- listFilesSafely(file)) { try { deleteRecursively(child) } catch { // In case of multiple exceptions, only last one will be thrown case ioe: IOException => savedIOException = ioe } }
is the same as
listFilesSafely(file).map{child => try { deleteRecursively(child) } catch { // In case of multiple exceptions, only last one will be thrown case ioe: IOException => savedIOException = ioe } }
the savedIOException always be null.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterator(1,2,3).map{i => println(i);i} nothing happened for a lazy collection until consume asked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your point about the for comprehension is incorrect in that, without a yield, it actually turns into a foreach, which expands lazy iterators:

for (i <- Iterator(1, 2,3)) { println(i) } works as expected.

@rxin
Copy link
Contributor

rxin commented Nov 17, 2014

While the old code is more verbose, I find it much more intuitive and readable so I wouldn't really change it.

@XuefengWu
Copy link
Author

sound reasonable. thanks.

@XuefengWu
Copy link
Author

I close this.

@XuefengWu XuefengWu closed this Nov 17, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants