Pipeline proto seems to be incorrect for Combine.GroupedValues #18585

@kennknowles

Description

It looks like CombineTest$BasicTests#testHotKeyCombining on Dataflow (and possibly other runners) is creating an invalid pipeline proto, since the transform doesn't have an environment (and possibly a spec):


I0610 16:05:23.791430   14054 fnapi_instruction_graph_rewriter.cc:230]   transforms {
I0610 16:05:23.791402   14054 fnapi_instruction_graph_rewriter.cc:230]     key: "HotMean/PostCombine/Combine.GroupedValues"
I0610 16:05:23.791404   14054 fnapi_instruction_graph_rewriter.cc:230]     value {
I0610 16:05:23.791406   14054 fnapi_instruction_graph_rewriter.cc:230]       inputs {
I0610 16:05:23.791408   14054 fnapi_instruction_graph_rewriter.cc:230]         key: "org.apache.beam.sdk.values.PCollection.<init>:400#56b99bb29b40d50c"
I0610 16:05:23.791410   14054 fnapi_instruction_graph_rewriter.cc:230]         value: "HotMean/PostCombine/GroupByKey.out"
I0610 16:05:23.791412   14054 fnapi_instruction_graph_rewriter.cc:230]       }
I0610 16:05:23.791414   14054 fnapi_instruction_graph_rewriter.cc:230]       outputs {
I0610 16:05:23.791416   14054 fnapi_instruction_graph_rewriter.cc:230]         key: "org.apache.beam.sdk.values.PCollection.<init>:400#4fa4d31096ca160c"
I0610 16:05:23.791419   14054 fnapi_instruction_graph_rewriter.cc:230]         value: "HotMean/PostCombine/Combine.GroupedValues/ParDo(Anonymous)/ParMultiDo(Anonymous).output"
I0610 16:05:23.791421   14054 fnapi_instruction_graph_rewriter.cc:230]       }
I0610 16:05:23.791423   14054 fnapi_instruction_graph_rewriter.cc:230]       unique_name: "HotMean/PostCombine/Combine.GroupedValues"
I0610 16:05:23.791426   14054 fnapi_instruction_graph_rewriter.cc:230]     }
I0610 16:05:23.791428   14054 fnapi_instruction_graph_rewriter.cc:230]   }
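
For anyone trying to spot similar entries in a dumped pipeline, here is a minimal sketch of the kind of check that would flag the transform above. It is not Beam code: the transforms are modelled as plain Python dicts, the field names `spec`, `environment_id` and `subtransforms` are assumed to correspond to the proto fields of the same names, and the helper `find_suspect_transforms` and the `Example/WellFormed` entry are hypothetical. It flags transforms that have no subtransforms (so they look like leaves) but carry no spec or no environment.

```python
def find_suspect_transforms(transforms):
    """Return ids of leaf-looking transforms that lack a spec or an environment id.

    `transforms` is a hypothetical dict-of-dicts stand-in for the pipeline
    proto's transforms map; it is not a real Beam data structure.
    """
    suspects = []
    for transform_id, transform in sorted(transforms.items()):
        # A transform with no subtransforms is treated as a leaf here.
        is_leaf = not transform.get("subtransforms")
        missing_spec = not transform.get("spec")
        missing_env = not transform.get("environment_id")
        if is_leaf and (missing_spec or missing_env):
            suspects.append(transform_id)
    return suspects


transforms = {
    # Mirrors the logged entry: inputs, outputs and unique_name only.
    "HotMean/PostCombine/Combine.GroupedValues": {
        "unique_name": "HotMean/PostCombine/Combine.GroupedValues",
        "inputs": {
            "org.apache.beam.sdk.values.PCollection.<init>:400#56b99bb29b40d50c":
                "HotMean/PostCombine/GroupByKey.out",
        },
        "outputs": {
            "org.apache.beam.sdk.values.PCollection.<init>:400#4fa4d31096ca160c":
                "HotMean/PostCombine/Combine.GroupedValues/ParDo(Anonymous)/ParMultiDo(Anonymous).output",
        },
    },
    # Hypothetical well-formed leaf, included only for contrast.
    "Example/WellFormed": {
        "unique_name": "Example/WellFormed",
        "spec": {"urn": "beam:transform:pardo:v1"},
        "environment_id": "hypothetical_env",
    },
}

suspects = find_suspect_transforms(transforms)
print(suspects)
```

This is only a heuristic: transforms that a runner executes itself may not need an environment, so anything it reports is a candidate to inspect rather than a confirmed error.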

Imported from Jira BEAM-10266. Original Jira may contain additional context.
Reported by: lcwik.
Subtask of issue #18583
