Hot key fanout should not distribute keys to all shards. #19089

@kennknowles

Description

The goal of hot key fanout is to reduce the number of values sent to a single post-GBK worker. If combiner lifting happens, however, each bundle sends a single value per sub-key. With a fanout of N, this causes an N-fold blowup in shuffle data, and each of the N reducers has the same amount of data to consume as the single reducer in the non-fanout case.
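The arithmetic can be illustrated with a small standalone model (plain Python, not Beam code). It assumes combiner lifting emits one pre-combined record per distinct sub-key a bundle touched, and that per-element fanout cycles elements over all sub-keys. The function names and the `per_bundle_shard` option are hypothetical, used only to contrast the current behavior with one possible alternative in which each bundle sticks to a single sub-key.

```python
from collections import Counter


def shuffle_records(num_bundles, values_per_bundle, fanout, per_bundle_shard):
    """Count the pre-combined records each sub-key reducer receives for one hot key.

    Assumption: with combiner lifting, a bundle emits exactly one record
    per distinct sub-key it touched.
    """
    per_reducer = Counter()
    for bundle in range(num_bundles):
        if per_bundle_shard:
            # Hypothetical alternative: the whole bundle uses a single sub-key.
            sub_keys = {bundle % fanout}
        else:
            # Per-element spreading: elements cycle over all sub-keys.
            sub_keys = {i % fanout for i in range(values_per_bundle)}
        for sub_key in sub_keys:
            per_reducer[sub_key] += 1
    return per_reducer


def summarize(per_reducer):
    """Return (total shuffled records, records seen by the busiest reducer)."""
    return sum(per_reducer.values()), max(per_reducer.values())


if __name__ == "__main__":
    bundles, values, n = 100, 1000, 10
    print("no fanout:        ", summarize(shuffle_records(bundles, values, 1, False)))
    print("fanout, all shards:", summarize(shuffle_records(bundles, values, n, False)))
    print("fanout, one shard: ", summarize(shuffle_records(bundles, values, n, True)))
```

With 100 bundles and a fanout of 10, the model reports (100, 100) without fanout, (1000, 100) when every bundle hits all sub-keys, and (100, 10) when each bundle uses one sub-key. Only the last case actually reduces the load on any single reducer.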

Imported from Jira BEAM-4565. Original Jira may contain additional context.
Reported by: robertwb.
