Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Error consuming Substrait plan which uses count function: "only unary aggregate functions are currently supported" #33566

Closed
asfimport opened this issue Nov 24, 2022 · 0 comments · Fixed by #15083

Comments

@asfimport
Copy link
Collaborator

ARROW-17523 added support for the Substrait extension function "count", but when I write code which produces a Substrait plan which calls it, and then try to run it in Acero, I get an error.

The plan:

message of type 'substrait.Plan' with 3 fields set
extension_uris {
  extension_uri_anchor: 1
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml"
}
extension_uris {
  extension_uri_anchor: 2
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml"
}
extension_uris {
  extension_uri_anchor: 3
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml"
}
extensions {
  extension_function {
    extension_uri_reference: 3
    function_anchor: 2
    name: "count"
  }
}
relations {
  rel {
    aggregate {
      input {
        project {
          common {
            emit {
              output_mapping: 9
              output_mapping: 10
              output_mapping: 11
              output_mapping: 12
              output_mapping: 13
              output_mapping: 14
              output_mapping: 15
              output_mapping: 16
              output_mapping: 17
            }
          }
          input {
            read {
              base_schema {
                names: "int"
                names: "dbl"
                names: "dbl2"
                names: "lgl"
                names: "false"
                names: "chr"
                names: "verses"
                names: "padded_strings"
                names: "some_negative"
                struct_ {
                  types {
                    i32 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    bool_ {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    bool_ {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                }
              }
              local_files {
                items {
                  uri_file: "file:///tmp/RtmpsBsoZJ/file1915f604cff4a"
                  parquet {
                  }
                }
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 1
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 2
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 3
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 4
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 5
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 6
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 7
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 8
                }
              }
              root_reference {
              }
            }
          }
        }
      }
      groupings {
        grouping_expressions {
          selection {
            direct_reference {
              struct_field {
                field: 3
              }
            }
            root_reference {
            }
          }
        }
      }
      measures {
        measure {
          function_reference: 2
          phase: AGGREGATION_PHASE_INITIAL_TO_RESULT
          output_type {
            i64 {
              nullability: NULLABILITY_NULLABLE
            }
          }
          invocation: AGGREGATION_INVOCATION_ALL
        }
      }
    }
  }
}

The error:

Error: NotImplemented: Only unary aggregate functions are currently supported
/home/nic2/arrow/cpp/src/arrow/engine/substrait/relation_internal.cc:587  converter(aggregate_call)
/home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:153  FromProto(plan_rel.has_root() ? plan_rel.root().input() : plan_rel.rel(), ext_set, conversion_options)

I have no idea what the "phase" and "invocation" fields above do, but previous attempts to get Acero to consume this plan led to errors due to me using default values instead of the ones specified there (e.g. "Not Implemented: Unsupported aggregation phase 'AGGREGATION_PHASE_UNSPECIFIED'"), so I just changed them to see if it helped.

Reporter: Nicola Crane / @thisisnic
Assignee: Felipe Oliveira

PRs and other links:

Note: This issue was originally created as ARROW-18403. Please see the migration documentation for further details.

westonpace added a commit that referenced this issue Jan 26, 2023
…#15083)

 - [x] Add ability to pass 0 or more than 1 target fields via the Aggregate API
 - [x] Add support for nullary `count` -- `count(*)`
 - [x] Add a n-ary aggregate function to test changes `*`
 
 `*` I implemented a `"covariant(y, x)"` aggregation function and used it to test the Aggregate API changes, but it's not present in this PR now that I intend to focus on passing the CI tests and get a final review
* Closes: #33566

Lead-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
@westonpace westonpace added this to the 12.0.0 milestone Jan 26, 2023
sjperkins pushed a commit to sjperkins/arrow that referenced this issue Feb 10, 2023
…ctions (apache#15083)

 - [x] Add ability to pass 0 or more than 1 target fields via the Aggregate API
 - [x] Add support for nullary `count` -- `count(*)`
 - [x] Add a n-ary aggregate function to test changes `*`
 
 `*` I implemented a `"covariant(y, x)"` aggregation function and used it to test the Aggregate API changes, but it's not present in this PR now that I intend to focus on passing the CI tests and get a final review
* Closes: apache#33566

Lead-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
gringasalpastor pushed a commit to gringasalpastor/arrow that referenced this issue Feb 17, 2023
…ctions (apache#15083)

 - [x] Add ability to pass 0 or more than 1 target fields via the Aggregate API
 - [x] Add support for nullary `count` -- `count(*)`
 - [x] Add a n-ary aggregate function to test changes `*`
 
 `*` I implemented a `"covariant(y, x)"` aggregation function and used it to test the Aggregate API changes, but it's not present in this PR now that I intend to focus on passing the CI tests and get a final review
* Closes: apache#33566

Lead-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants