[SPARK-16613] [CORE] RDD.pipe returns values for empty partitions #14260
srowen wants to merge 2 commits into apache:master
Test build #62521 has finished for PR 14260 at commit
Is it possible that the underlying command always returns something, even for 0 rows? For example, if it is counting the number of elements?
Yeah, that's the 'problem' -- consider
Test build #62595 has finished for PR 14260 at commit
Jenkins retest this please
Test build #62601 has finished for PR 14260 at commit
LGTM
Merging in master/2.0.
## What changes were proposed in this pull request?

Document RDD.pipe semantics; don't execute process for empty input partitions.

Note this includes the fix in #14256 because it's necessary to even test this. One or the other will merge the fix.

## How was this patch tested?

Jenkins tests including new test.

Author: Sean Owen <sowen@cloudera.com>

Closes #14260 from srowen/SPARK-16613.

(cherry picked from commit 4b079dc)
Signed-off-by: Reynold Xin <rxin@databricks.com>
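
For readers skimming the thread, here is a rough, Spark-free sketch of the behavior being discussed: a line-counting command prints a value even when it receives no input, so piping an empty partition through it yields a spurious "0". This is not the Spark implementation. `pipe_partition`, `pipe_all`, the `skip_empty` flag, and the `COUNT_LINES` command are hypothetical names made up for illustration, and the flag only models "before" and "after" as I read the description above.

```python
import subprocess
import sys

# Hypothetical stand-in for an external command that counts its input lines
# (similar in spirit to `wc -l`). It prints a number even for empty input.
COUNT_LINES = [sys.executable, "-c", "import sys; print(sum(1 for _ in sys.stdin))"]


def pipe_partition(lines, command, skip_empty=True):
    """Feed one partition's elements to `command`, one per line, and return its output lines."""
    if skip_empty and not lines:
        # Modeled on the PR description: no process is started for an empty partition.
        return []
    stdin_text = "".join(f"{line}\n" for line in lines)
    result = subprocess.run(
        command, input=stdin_text, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def pipe_all(partitions, command, skip_empty=True):
    """Pipe every partition through `command` and concatenate the outputs."""
    out = []
    for part in partitions:
        out.extend(pipe_partition(part, command, skip_empty))
    return out


# Three partitions, the middle one empty (illustrative data only).
partitions = [["a", "b"], [], ["c"]]
with_skip = pipe_all(partitions, COUNT_LINES, skip_empty=True)
without_skip = pipe_all(partitions, COUNT_LINES, skip_empty=False)
print(with_skip)
print(without_skip)
```

With `skip_empty=False` the empty partition still contributes an output line, which is the situation the title describes. With `skip_empty=True` it contributes nothing, at the cost that a command meant to report on empty input is never run for that partition.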