combine should be parallelizable in many cases #20892

@damccorm

Description

Relevant discussion: https://lists.apache.org/thread.html/r9e7d9527eb1d4c9c097c91c010a25dabf4a5f8053d50dc3b6d90d36a%40%3Cdev.beam.apache.org%3E

Currently we require Singleton partitioning for combine() because func might operate on the full dataset. In many cases, however, func is actually an elementwise method, so it could safely be applied to each partition independently. We should detect this when possible (e.g. when func is an np.ufunc), and/or provide a flag that lets the user indicate that the function is elementwise.
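
To illustrate the idea, here is a rough sketch using plain NumPy arrays rather than Beam itself. The names is_elementwise, combine_in_chunks, centered_diff and the elementwise flag are hypothetical and not part of any existing API; the chunking only mimics what per-partition execution would do. It shows that a ufunc gives the same result whether it is applied per chunk or to the full data, while a function that depends on the whole column does not.

```python
import numpy as np


def is_elementwise(func, elementwise=None):
    """Hypothetical check: trust an explicit user flag, else detect np.ufunc."""
    if elementwise is not None:
        return bool(elementwise)
    return isinstance(func, np.ufunc)


def combine_in_chunks(left, right, func, num_chunks=2):
    """Apply func to aligned chunks and concatenate the results.

    This imitates running func independently on each partition.
    """
    left_parts = np.array_split(left, num_chunks)
    right_parts = np.array_split(right, num_chunks)
    return np.concatenate(
        [func(l, r) for l, r in zip(left_parts, right_parts)])


def centered_diff(a, b):
    """Not elementwise: the result depends on the mean of all of b."""
    return a - b.mean()


left = np.array([1.0, 5.0, 3.0, 8.0])
right = np.array([4.0, 2.0, 6.0, 7.0])

# An elementwise ufunc is unaffected by how the data is split.
ufunc_matches = np.array_equal(
    combine_in_chunks(left, right, np.minimum), np.minimum(left, right))

# A function that looks at the whole column changes under splitting.
centered_matches = np.array_equal(
    combine_in_chunks(left, right, centered_diff),
    centered_diff(left, right))
```

Under these assumptions, automatic detection would only cover ufuncs. A user-supplied flag would still be needed for ordinary Python functions that happen to be elementwise, since that property cannot be inferred from the function object alone.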

Imported from Jira BEAM-12351. Original Jira may contain additional context.
Reported by: bhulette.
