feat: add DataFrame fill_nan#22702

Open
Nagato-Yuzuru wants to merge 2 commits into apache:main from Nagato-Yuzuru:issue/14770

@Nagato-Yuzuru

Which issue does this PR close?

What changes are included in this PR?

Adds fill_nan and its tests, modeled on the existing fill_null.

Are these changes tested?

Yes

Are there any user-facing changes?

Yes. Adds a new DataFrame method, fill_nan.

Copilot AI review requested due to automatic review settings June 1, 2026 17:29
@github-actions github-actions Bot added the core Core DataFusion crate label Jun 1, 2026
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new fill_nan method to DataFrame that replaces NaN values in floating-point columns with a user-provided scalar, leveraging the nanvl math function. Includes tests for both column-scoped and all-columns variants.

Changes:

  • New DataFrame::fill_nan(value, columns) API that builds a projection with nanvl(col, fill_value) for matching float columns.
  • Falls back to the original column when the fill value cannot be cast to the column type.
  • Adds two tokio tests verifying behavior for specified columns and for all columns.
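
The NaN-replacement semantics described above can be illustrated without DataFusion. The following is a minimal standard-library sketch, not the PR's code: the local `nanvl` function only mimics the behavior the review attributes to the nanvl math function, and `fill_nan_column` is a hypothetical helper. Leaving nulls (`None`) untouched is an assumption of this sketch, chosen to show that NaN and null are distinct.

```rust
/// Scalar NaN replacement: returns `fallback` when `x` is NaN, otherwise `x`.
/// This is a local stand-in, not the DataFusion function itself.
fn nanvl(x: f64, fallback: f64) -> f64 {
    if x.is_nan() {
        fallback
    } else {
        x
    }
}

/// Hypothetical helper: applies `nanvl` to every non-null entry of a
/// nullable float column. Nulls are passed through unchanged (assumption).
fn fill_nan_column(column: &[Option<f64>], fill: f64) -> Vec<Option<f64>> {
    column.iter().map(|v| v.map(|x| nanvl(x, fill))).collect()
}

fn main() {
    let column = vec![Some(1.0), Some(f64::NAN), None, Some(4.5)];
    let filled = fill_nan_column(&column, 0.0);
    // prints: [Some(1.0), Some(0.0), None, Some(4.5)]
    println!("{:?}", filled);
}
```

Only the NaN entry changes; the null entry would still need a separate fill_null-style pass.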

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| datafusion/core/src/dataframe/mod.rs | Implements fill_nan using nanvl, with column resolution and type-cast handling. |
| datafusion/core/tests/dataframe/mod.rs | Adds create_nan_table helper and tests for fill_nan. |

Comment on lines +2582 to +2593
        match value.clone().cast_to(field.data_type()) {
            Ok(fill_value) => Expr::Alias(Alias {
                expr: Box::new(Expr::ScalarFunction(ScalarFunction {
                    func: nanvl(),
                    args: vec![col(field.name()), lit(fill_value)],
                })),
                relation: None,
                name: field.name().to_string(),
                metadata: None,
            }),
            Err(_) => col(field.name()),
        }
    })
    .collect::<Vec<_>>();

self.clone().select(projections)
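
The Ok/Err shape of the quoted lines can be modeled in plain Rust to make the fallback behavior easier to reason about. This is a simplified sketch under stated assumptions, not DataFusion code: `ColumnType`, `Projection`, `cast_fill`, and `build_projections` are all hypothetical names, and the "cast" is approximated by parsing a string as a float.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Float32,
    Float64,
    Int64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
enum Projection {
    /// Models `nanvl(column, fill) AS column`.
    FillNan { column: String, fill: f64 },
    /// Models the plain `col(column)` fallback.
    PassThrough { column: String },
}

/// Hypothetical stand-in for a scalar cast: parse the raw fill value as f64.
fn cast_fill(raw: &str) -> Result<f64, std::num::ParseFloatError> {
    raw.trim().parse::<f64>()
}

/// Builds one projection per column. Float columns are wrapped when the cast
/// succeeds; every other case silently passes the column through.
fn build_projections(schema: &[(&str, ColumnType)], raw_fill: &str) -> Vec<Projection> {
    schema
        .iter()
        .map(|(name, ty)| {
            let is_float = matches!(ty, ColumnType::Float32 | ColumnType::Float64);
            match (is_float, cast_fill(raw_fill)) {
                (true, Ok(fill)) => Projection::FillNan { column: name.to_string(), fill },
                _ => Projection::PassThrough { column: name.to_string() },
            }
        })
        .collect()
}

fn main() {
    let schema = [
        ("a", ColumnType::Float32),
        ("b", ColumnType::Float64),
        ("c", ColumnType::Int64),
        ("d", ColumnType::Utf8),
    ];
    // A castable fill value wraps the float columns only.
    println!("{:?}", build_projections(&schema, "0"));
    // An uncastable fill value leaves every column untouched, with no error.
    println!("{:?}", build_projections(&schema, "abc"));
}
```

In this model a failed cast produces no error at all, so a caller cannot tell "nothing to fill" apart from "fill value rejected". Whether that silent fallback is the desired behavior for fill_nan is a design choice for the PR to settle.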
Comment thread datafusion/core/tests/dataframe/mod.rs
Comment thread datafusion/core/tests/dataframe/mod.rs
Ok(())
}

async fn create_nan_table() -> Result<DataFrame> {
Labels

core Core DataFusion crate

Development

Successfully merging this pull request may close these issues.

Add DataFrame fill_nan

2 participants