[SPARK-1955][GraphX]: VertexRDD can incorrectly assume index sharing #4705
Can one of the admins verify this patch?
/cc @ankurdave could you review, thanks!
ok to test
Test build #27804 has started for PR 4705 at commit
Test build #27804 has finished for PR 4705 at commit
Test PASSed.
LGTM although @ankurdave should probably have the final say.
@ankurdave any thoughts? This is blocking progress on a few other GraphX JIRAs (SPARK-4600, SPARK-5790), so whenever you get a chance that'd be great, thanks!
bump
@brennonyork I'll add unit tests for your patch before or after the patch is merged.
@maropu thanks!
Fixes the issue whereby VertexRDDs that are `diff`ed, `innerJoin`ed, or `leftJoin`ed and have different partition sizes fail under the `zipPartitions` method. This fix tests whether the partitions are equal and, if they are not, repartitions the other VertexRDD to match the partition size of the calling VertexRDD.

Author: Brennon York <brennon.york@capitalone.com>

Closes #4705 from brennonyork/SPARK-1955 and squashes the following commits:

0882590 [Brennon York] updated to properly handle differently-partitioned vertexRDDs

(cherry picked from commit 9f603fc)
Signed-off-by: Ankur Dave <ankurdave@gmail.com>
Thanks! Merged into master, branch-1.3, and branch-1.2.
Fixes the issue whereby VertexRDDs that are `diff`ed, `innerJoin`ed, or `leftJoin`ed and have different partition sizes fail under the `zipPartitions` method. This fix tests whether the partitions are equal and, if they are not, repartitions the other VertexRDD to match the partition size of the calling VertexRDD.
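
For readers who want to see the check-then-repartition idea in isolation, here is a minimal, Spark-free sketch in plain Scala. It is not the code from this patch. `PartitionAlignSketch`, `Partitioned`, `partitionBy`, and the local `zipPartitions` and `innerJoin` are hypothetical stand-ins that only model the behavior described above, and the sketch assumes both inputs were built with the same modulo-hash `partitionBy`, so an equal partition count implies an equal layout.

```scala
// Hypothetical model, not Spark or GraphX code: a partitioned dataset is a
// Vector of partitions, each partition a Map from vertex id to attribute.
object PartitionAlignSketch {
  type Partition[V] = Map[Long, V]
  type Partitioned[V] = Vector[Partition[V]]

  // Hash-partition (id, value) pairs into `n` partitions by id modulo n.
  def partitionBy[V](pairs: Iterable[(Long, V)], n: Int): Partitioned[V] = {
    val groups = pairs.groupBy { case (id, _) => (((id % n) + n) % n).toInt }
    Vector.tabulate(n)(i => groups.get(i).map(_.toMap).getOrElse(Map.empty[Long, V]))
  }

  // Stand-in for zipPartitions: pairs partition i with partition i and
  // rejects inputs whose partition counts differ.
  def zipPartitions[A, B, C](left: Partitioned[A], right: Partitioned[B])(
      f: (Partition[A], Partition[B]) => Partition[C]): Partitioned[C] = {
    require(left.size == right.size, "unequal numbers of partitions")
    left.zip(right).map { case (l, r) => f(l, r) }
  }

  // Inner join that checks the layouts first and repartitions `other`
  // to match `self` when they differ (assumption: same hash scheme).
  def innerJoin[A, B](self: Partitioned[A], other: Partitioned[B]): Partitioned[(A, B)] = {
    val aligned =
      if (other.size == self.size) other
      else partitionBy(other.flatten, self.size)
    zipPartitions[A, B, (A, B)](self, aligned) { (l, r) =>
      l.collect { case (id, a) if r.contains(id) => (id, (a, r(id))) }
    }
  }

  def main(args: Array[String]): Unit = {
    val a = partitionBy(Seq((1L, "a"), (2L, "b"), (3L, "c")), 2) // 2 partitions
    val b = partitionBy(Seq((2L, 20), (3L, 30), (4L, 40)), 3)    // 3 partitions
    println(innerJoin(a, b).flatten.toMap) // only ids 2 and 3 appear in both inputs
  }
}
```

Calling the local `zipPartitions` directly on `a` and `b` fails the `require` check because the partition counts differ, which loosely mirrors the failure this PR describes. Going through `innerJoin` succeeds because the second input is repartitioned first.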