Use collectFirst instead of a var and iterate #3293
XuefengWu wants to merge 1 commit into apache:master from
Can one of the admins verify this patch?
Will this continue to try to delete the remaining directories if one fails? That depends on the laziness of the underlying data structure, right?
This is certainly more concise, but I'm not sure it's clearer. I think the latter is more important, especially when dealing with potential failure scenarios.
Scala collections are not lazy; only Scala's Stream is lazy.
Yours respectfully, Xuefeng Wu 吴雪峰
On November 17, 2014, at 9:21 AM, Aaron Davidson notifications@github.com wrote:
In core/src/main/scala/org/apache/spark/util/Utils.scala:
@@ -759,18 +759,9 @@ private[spark] object Utils extends Logging {
if (file != null) {
try {
if (file.isDirectory && !isSymlink(file)) {
var savedIOException: IOException = null
for (child <- listFilesSafely(file)) {
  try {
    deleteRecursively(child)
  } catch {
    // In case of multiple exceptions, only last one will be thrown
    case ioe: IOException => savedIOException = ioe
  }
}
if (savedIOException != null) {
  throw savedIOException
}
Will this continue to try to delete the remaining directories if one fails? That depends on the laziness of the underlying data structure, right?
listFilesSafely(file).map(child => Try(deleteRecursively(child))).
This is certainly more concise, but I'm not sure it's clearer. I think the latter is more important, especially when dealing with potential failure scenarios.
Iterators are lazy, however. If listFilesSafely were changed to return an iterator instead of a list, this would subtly change the behavior here. I am not in favor of this change, as it makes the behavior less obvious. The prior form clearly indicated the intended and actual behavior; here it is not clear which one is intended, even if it is clear to the particular reader which one is actually occurring.
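To illustrate the concern, here is a minimal sketch, not the PR's actual code. It assumes the truncated expression in the diff continues with collectFirst over the failures, as the PR title suggests. `fakeDelete` and the child names are hypothetical stand-ins for deleteRecursively and listFilesSafely.

```scala
import java.io.IOException
import scala.collection.mutable.ArrayBuffer
import scala.util.{Failure, Try}

object LazinessDemo {
  // Hypothetical stand-in for deleteRecursively: records the attempt, fails on "b".
  def fakeDelete(name: String, attempted: ArrayBuffer[String]): Unit = {
    attempted += name
    if (name == "b") throw new IOException(s"cannot delete $name")
  }

  // Strict List: map runs fakeDelete on every child before collectFirst
  // inspects the results, so all three deletions are attempted.
  def attemptedWithList(): List[String] = {
    val attempted = ArrayBuffer.empty[String]
    List("a", "b", "c")
      .map(child => Try(fakeDelete(child, attempted)))
      .collectFirst { case Failure(e) => e }
    attempted.toList
  }

  // Lazy Iterator: map is deferred, so collectFirst pulls elements only
  // until it finds the first Failure and never reaches "c".
  def attemptedWithIterator(): List[String] = {
    val attempted = ArrayBuffer.empty[String]
    Iterator("a", "b", "c")
      .map(child => Try(fakeDelete(child, attempted)))
      .collectFirst { case Failure(e) => e }
    attempted.toList
  }
}
```

Under these assumptions the same expression attempts every child when the source is a List but stops after the first failure when the source is an Iterator, which is the subtle behavior change described above.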
But if listFilesSafely returned a lazy list, the current code would not work either. The Scala for comprehension is only syntactic sugar; with a single generator it is the same as map.
So
for (child <- listFilesSafely(file)) {
  try {
    deleteRecursively(child)
  } catch {
    // In case of multiple exceptions, only last one will be thrown
    case ioe: IOException => savedIOException = ioe
  }
}
is the same as
listFilesSafely(file).map { child =>
  try {
    deleteRecursively(child)
  } catch {
    // In case of multiple exceptions, only last one will be thrown
    case ioe: IOException => savedIOException = ioe
  }
}
In that case savedIOException would always be null.
Iterator(1, 2, 3).map { i => println(i); i } prints nothing: a lazy collection does no work until a consumer asks for its elements.
Your point about the for comprehension is incorrect in that, without a yield, it actually turns into a foreach, which expands lazy iterators:
for (i <- Iterator(1, 2, 3)) { println(i) } works as expected.
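The difference between the two desugarings can be checked directly. The sketch below is illustrative only: the object and method names are made up, and it records side effects in a buffer instead of printing so the result can be inspected.

```scala
import scala.collection.mutable.ArrayBuffer

object DesugarDemo {
  // A for loop without yield desugars to foreach, which drains the
  // iterator immediately, so every side effect happens.
  def forWithoutYield(): List[Int] = {
    val seen = ArrayBuffer.empty[Int]
    for (i <- Iterator(1, 2, 3)) { seen += i }
    seen.toList
  }

  // map on an Iterator only wraps it; the function body has not run
  // by the time map returns, because nothing has consumed `mapped`.
  def mapWithoutConsuming(): List[Int] = {
    val seen = ArrayBuffer.empty[Int]
    val mapped = Iterator(1, 2, 3).map { i => seen += i; i }
    seen.toList
  }

  // A for comprehension with yield desugars to map, so on an Iterator
  // it is just as lazy as the explicit map call.
  def forWithYield(): List[Int] = {
    val seen = ArrayBuffer.empty[Int]
    val mapped = for (i <- Iterator(1, 2, 3)) yield { seen += i; i }
    seen.toList
  }
}
```

Only the yield form is equivalent to map. The loop in the original code has no yield, so it visits every child even if listFilesSafely were to return an Iterator.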
While the old code is more verbose, I find it much more intuitive and readable, so I wouldn't really change it.
Sounds reasonable. Thanks.
I'll close this.