[SEDONA-64] Broadcast dedupParams to improve performance #545

Merged (1 commit) on Sep 21, 2021

umartin (Contributor) commented Sep 21, 2021

Is this PR related to a proposed Issue?

https://issues.apache.org/jira/browse/SEDONA-64

What changes were proposed in this PR?

Broadcast dedupParams to reduce task size.
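
To illustrate why broadcasting shrinks each task, here is a minimal, Spark-free sketch that compares the serialized size of a task that embeds the parameters with one that only carries a handle to them. Every name in it (`TaskSizeSketch`, `DedupParamsSketch`, `InlineTask`, `BroadcastTask`) is hypothetical, and the assumption that the parameters grow with the partition count is mine; it is not the code changed in this PR. In Spark itself the handle would typically come from `JavaSparkContext.broadcast(...)` and be read with `Broadcast.value()`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class TaskSizeSketch {

    // Hypothetical stand-in for dedupParams: one bounding box per partition (assumption).
    static final class DedupParamsSketch implements Serializable {
        final List<double[]> partitionExtents;

        DedupParamsSketch(int partitionCount) {
            partitionExtents = new ArrayList<>(partitionCount);
            for (int i = 0; i < partitionCount; i++) {
                // minX, minY, maxX, maxY of a made-up extent
                partitionExtents.add(new double[] {i, i, i + 1, i + 1});
            }
        }
    }

    // Task whose serialized form carries the full parameters.
    static final class InlineTask implements Serializable {
        final int partitionId;
        final DedupParamsSketch params;

        InlineTask(int partitionId, DedupParamsSketch params) {
            this.partitionId = partitionId;
            this.params = params;
        }
    }

    // Task that carries only a small handle; the payload would be shipped separately.
    static final class BroadcastTask implements Serializable {
        final int partitionId;
        final long broadcastId;

        BroadcastTask(int partitionId, long broadcastId) {
            this.partitionId = partitionId;
            this.broadcastId = broadcastId;
        }
    }

    // Number of bytes produced by Java serialization for the given object.
    static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        DedupParamsSketch params = new DedupParamsSketch(1000);
        System.out.println("inline task bytes:    " + serializedSize(new InlineTask(0, params)));
        System.out.println("broadcast task bytes: " + serializedSize(new BroadcastTask(0, 42L)));
    }
}
```

Under these assumptions the inline task grows with the partition count while the handle-only task stays at a fixed size, which is consistent with the gain being reported for jobs with a large partition count.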

How was this patch tested?

The existing tests in Sedona cover the change.

Tested with production workloads. We see a 20% reduction in execution time for jobs with a large partition count.

Did this PR include necessary documentation updates?

Yes. Javadoc was added for the new public method.

umartin (Contributor, Author) commented Sep 21, 2021

One of the checks fails with a timeout while trying to download an artifact from a Maven repository. It looks like an intermittent failure. I don't have the option to rerun the checks.

@jiayuasu jiayuasu added this to the sedona-1.1.0 milestone Sep 21, 2021
@jiayuasu jiayuasu merged commit 4761ac0 into apache:master Sep 21, 2021
jiayuasu (Member) commented:

Thank you for your contribution!
