-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Closed
Labels
bugSomething isn't workingSomething isn't working
Description
Describe the bug
This is a MRE for a query that results in this error:
InvalidArgumentError("Column 'COUNT(DISTINCT tbl.col2)[count distinct]' is declared as non-nullable but contains null values")
The input table (tbl) looks like this:
+------+------+
| col1 | col2 |
+------+------+
| A | |
| B | b |
| C | c |
+------+------+
This is the query
SELECT count(DISTINCT "col2"), min(0) AS "zero"
FROM "tbl"
GROUP BY "col1"The error goes away without the min(0) AS "zero" column expression
To Reproduce
Steps to reproduce the behavior:
#[cfg(test)]
mod tests {
use std::sync::Arc;
use datafusion::arrow::array::{ArrayRef, StringArray};
use datafusion::arrow::datatypes::{DataType, Field, Schema, SchemaRef};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::arrow::util::pretty::pretty_format_batches;
use datafusion::datasource::MemTable;
use datafusion::prelude::SessionContext;
#[tokio::test]
async fn count_distinct_error() {
let col1 = Arc::new(StringArray::from(
vec!["A", "B", "C"]
)) as ArrayRef;
let col2 = Arc::new(StringArray::from(
vec![None, Some("b"), Some("c")]
)) as ArrayRef;
let schema = Arc::new(Schema::new(vec![
Field::new("col1", DataType::Utf8, true),
Field::new("col2", DataType::Utf8, true)
])) as SchemaRef;
let batch = RecordBatch::try_new(schema.clone(), vec![col1, col2]).unwrap();
let mem_table = MemTable::try_new(schema, vec![vec![batch]]).unwrap();
// Create context and register table
let ctx = SessionContext::new();
ctx.register_table("tbl", Arc::new(mem_table)).unwrap();
// Print input table
let res = ctx.sql("SELECT * from tbl").await.unwrap().collect().await.unwrap();
let formatted = pretty_format_batches(res.as_slice()).unwrap();
println!("{}", formatted);
// Perform query
let res = ctx.sql(r#"
SELECT count(DISTINCT "col2"), min(0) AS "zero"
FROM "tbl"
GROUP BY "col1"
"#).await.unwrap().collect().await.unwrap();
let formatted = pretty_format_batches(res.as_slice()).unwrap();
println!("{}", formatted);
}
}Expected behavior
Expect the query to complete without errors
Additional context
This may be a duplicate of #4040, but the query setup there was more complex. If a maintainer thinks it's the same bug I'd be happy to move this MRE over to that issue.
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working