Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 7 additions & 7 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -196,18 +196,18 @@ Below is a checklist of what you need to do to add a new scalar function to Data
Below is a checklist of what you need to do to add a new aggregate function to DataFusion:

- Add the actual implementation of an `Accumulator` and `AggregateExpr`:
- [here](datafusion/src/physical_plan/string_expressions.rs) for string functions
- [here](datafusion/src/physical_plan/math_expressions.rs) for math functions
- [here](datafusion/src/physical_plan/datetime_expressions.rs) for datetime functions
- create a new module [here](datafusion/src/physical_plan) for other functions
- In [src/physical_plan/aggregates](datafusion/src/physical_plan/aggregates.rs), add:
- a new variant to `BuiltinAggregateFunction`
- [here](datafusion/physical-expr/src/string_expressions.rs) for string functions
- [here](datafusion/physical-expr/src/math_expressions.rs) for math functions
- [here](datafusion/physical-expr/src/datetime_expressions.rs) for datetime functions
- create a new module [here](datafusion/physical-expr/src) for other functions
- In [datafusion/expr/src](datafusion/expr/src/aggregate_function.rs), add:
- a new variant to `AggregateFunction`
- a new entry to `FromStr` with the name of the function as called by SQL
- a new line in `return_type` with the expected return type of the function, given an incoming type
- a new line in `signature` with the signature of the function (number and types of its arguments)
- a new line in `create_aggregate_expr` mapping the built-in to the implementation
- tests to the function.
- In [tests/sql.rs](datafusion/tests/sql.rs), add a new test where the function is called through SQL against well known data and returns the expected result.
- In [tests/sql](datafusion/core/tests/sql), add a new test where the function is called through SQL against well known data and returns the expected result.

## How to display plans graphically

Expand Down
35 changes: 35 additions & 0 deletions datafusion/core/tests/sql/aggregates.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1777,6 +1777,41 @@ async fn simple_avg() -> Result<()> {
Ok(())
}

#[tokio::test]
async fn simple_mean() -> Result<()> {
let schema = Schema::new(vec![Field::new("a", DataType::Int32, false)]);

let batch1 = RecordBatch::try_new(
Arc::new(schema.clone()),
vec![Arc::new(Int32Array::from_slice(&[1, 2, 3]))],
)?;
let batch2 = RecordBatch::try_new(
Arc::new(schema.clone()),
vec![Arc::new(Int32Array::from_slice(&[4, 5]))],
)?;

let ctx = SessionContext::new();

let provider = MemTable::try_new(Arc::new(schema), vec![vec![batch1], vec![batch2]])?;
ctx.register_table("t", Arc::new(provider))?;

let result = plan_and_collect(&ctx, "SELECT MEAN(a) FROM t").await?;

let batch = &result[0];
assert_eq!(1, batch.num_columns());
assert_eq!(1, batch.num_rows());

let values = batch
.column(0)
.as_any()
.downcast_ref::<Float64Array>()
.expect("failed to cast version");
assert_eq!(values.len(), 1);
// mean(1,2,3,4,5) = 3.0
assert_eq!(values.value(0), 3.0_f64);
Ok(())
}

#[tokio::test]
async fn query_sum_distinct() -> Result<()> {
let schema = Arc::new(Schema::new(vec![
Expand Down
1 change: 1 addition & 0 deletions datafusion/expr/src/aggregate_function.rs
Original file line number Diff line number Diff line change
Expand Up @@ -105,6 +105,7 @@ impl FromStr for AggregateFunction {
"max" => AggregateFunction::Max,
"count" => AggregateFunction::Count,
"avg" => AggregateFunction::Avg,
"mean" => AggregateFunction::Avg,
"sum" => AggregateFunction::Sum,
"approx_distinct" => AggregateFunction::ApproxDistinct,
"array_agg" => AggregateFunction::ArrayAgg,
Expand Down