fix RDD.reduce when rdd contains empty partitions #84

Merged
merged 1 commit into from May 3, 2019

tools4origins
Collaborator

Fixes #83

This lets the TypeError be thrown rather than checking beforehand whether the partition is empty: values is a generator, and it seems better not to affect it.

@tools4origins tools4origins changed the title fix RDD.reduce when rdd contain empty partitions fix RDD.reduce when rdd contains empty partitions Apr 7, 2019
@svenkreiss
Owner

Thanks for submitting this and sorry for the long delay!

I have a concern about the nested reducer. I think (though I haven't tested it) that it is not importable, which creates problems for the built-in pickle. Generally, pysparkling must work without cloudpickle, but it turns out there was no unit test for that. #86 adds one, and I can probably merge it rather quickly. Do you mind testing your branch with this additional unit test? I think it will require a small change to make the reducer importable.
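For context, here is a general illustration of why importability matters to the built-in pickle. It is not a claim about this PR's code. The helpers `can_pickle` and `make_nested_reducer` are hypothetical; the sketch relies only on the standard behavior that pickle serializes functions by reference to their importable name.

```python
import operator
import pickle


def can_pickle(obj):
    """Return True if the built-in pickle can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exception type for local objects differs between Python versions.
        return False


def make_nested_reducer():
    # Hypothetical stand-in for a function defined inside another function.
    def nested(a, b):
        return a + b
    return nested


# operator.add can be looked up by name at import time.
print(can_pickle(operator.add))
# A function local to another function cannot be looked up by name.
print(can_pickle(make_nested_reducer()))
```

This only becomes a problem when the nested function itself has to be serialized; a nested function that is only called locally never goes through pickle.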

@svenkreiss
Owner

Actually, I'll do it the other way around. Your PR is good for the current set of tests. So I will merge that now and see how to make #86 work afterwards.

@svenkreiss svenkreiss merged commit 181f73a into svenkreiss:master May 3, 2019
@svenkreiss
Owner

And I was wrong. Your change is not a problem for the built-in pickle.
Thanks a lot for your contribution!

tools4origins pushed a commit to tools4origins/pysparkling that referenced this pull request May 9, 2020
fix RDD.reduce when rdd contains empty partitions

Successfully merging this pull request may close these issues.

rdd.reduce does not handle empty partitions