Skip to content

Conversation

@T2MIX
Copy link
Contributor

@T2MIX T2MIX commented Nov 15, 2025

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added core Core DataFusion crate execution Related to the execution crate physical-plan Changes to the physical-plan crate labels Nov 15, 2025
@T2MIX T2MIX force-pushed the feat/explain-analyze-readable-formatting branch from aa67ff6 to 2e1c52b Compare November 15, 2025 13:05
Copy link
Contributor

@2010YOUY01 2010YOUY01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is well structured and tested, thank you!

The explain analyze format looks much nicer now

> explain analyze
select *
from generate_series(10000000) as t1(v1)
where v1 % 2 = 0;
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan                                                                                                                                                                                         |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | ProjectionExec: expr=[value@0 as v1], metrics=[output_rows=5.00 M, elapsed_compute=148.99µs, output_bytes=76.3 MB]                                                                           |
|                   |   FilterExec: value@0 % 2 = 0, metrics=[output_rows=5.00 M, elapsed_compute=30.53ms, output_bytes=76.3 MB, selectivity=50% (5.00 M/10.00 M)]                                                 |
|                   |     RepartitionExec: partitioning=RoundRobinBatch(14), input_partitions=1, metrics=[output_rows=10.00 M, elapsed_compute=598.09µs, output_bytes=76.3 MB]                                     |
|                   |       LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=0, end=10000000, batch_size=8192], metrics=[output_rows=10.00 M, elapsed_compute=10.25ms, output_bytes=76.3 MB] |
|                   |                                                                                                                                                                                              |
+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.018 seconds.

@2010YOUY01 2010YOUY01 requested a review from Copilot November 16, 2025 02:16
Copilot finished reviewing on behalf of 2010YOUY01 November 16, 2025 02:18
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds human-readable formatting to EXPLAIN ANALYZE metrics to improve readability by displaying large numbers with K, M, B, T suffixes and durations with appropriate time units (ns, µs, ms, s).

Key Changes

  • Added human_readable_count() and human_readable_duration() utility functions for formatting numeric values and time durations
  • Updated metric display implementations (Count, Time, PruningMetrics, RatioMetrics, Gauge) to use the new formatting functions
  • Added comprehensive test coverage for the new formatting functions

Reviewed Changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.

File Description
datafusion/execution/src/memory_pool/mod.rs Added human_readable_count() and human_readable_duration() utility functions with tests
datafusion/physical-plan/src/metrics/value.rs Updated Display implementations for Count, Time, PruningMetrics, RatioMetrics, and Gauge types to use human-readable formatting; added comprehensive tests
datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs Updated test assertion to expect human-readable format in output (99872 → 99.87 K)
Cargo.lock Standard dependency version updates

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

assert_eq!(human_readable_count(1_000), "1.00 K");
assert_eq!(human_readable_count(10_100), "10.10 K");
assert_eq!(human_readable_count(1_532), "1.53 K");
assert_eq!(human_readable_count(99_999), "100.00 K");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be displayed as 100.0 K ? With one decimal place ?

// Format with appropriate precision
// For values >= 100, show 1 decimal place (e.g., 123.4 K)
// For values < 100, show 2 decimal places (e.g., 10.12 K)
if value >= 100.0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to round it first to make it consistent (i.e. 99999 -> 100.0 K):

Suggested change
if value >= 100.0 {
let rounded = (value * 100.0).round() / 100.0;
if rounded >= 100.0 {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, let's leave it to an optional follow-up.

@2010YOUY01
Copy link
Contributor

Thanks again @T2MIX and @martin-g

@2010YOUY01 2010YOUY01 added this pull request to the merge queue Nov 18, 2025
Merged via the queue into apache:main with commit a732dc3 Nov 18, 2025
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate execution Related to the execution crate physical-plan Changes to the physical-plan crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Minor enhancements to EXPLAIN ANALYZE output formatting

3 participants