Show Statistics in explain Verbose without enabling show statistics #8111

@NGA-TRAN

Description

Is your feature request related to a problem or challenge?

Currently, statistics appear in the explain output only if the flag datafusion.explain.show_statistics is on. In the case of InfluxDB IOx, we disable flag settings to avoid turning on other unexpected features such as DDL and DML. However, we still want to see statistics in explain.

Here is the current behavior

Create a table

create table t1(state string, city string, min_temp float, area int, time timestamp) as values 
    ('MA', 'Boston', 70.4, 1, 50);

Run explain with default settings. No statistics in the plan

explain select * from t1 where time <= to_timestamp(350);
+---------------+----------------------------------------------------------------+
| plan_type     | plan                                                           |
+---------------+----------------------------------------------------------------+
| logical_plan  | Filter: t1.time <= TimestampNanosecond(350000000000, None)     |
|               |   TableScan: t1 projection=[state, city, min_temp, area, time] |
| physical_plan | CoalesceBatchesExec: target_batch_size=8192                    |
|               |   FilterExec: time@4 <= 350000000000                           |
|               |     MemoryExec: partitions=1, partition_sizes=[1]              |
|               |                                                                |
+---------------+----------------------------------------------------------------+

Turn show_statistics on, and the explain includes statistics

set datafusion.explain.show_statistics = true;
0 rows in set. Query took 0.002 seconds.

❯ explain select * from t1 where time <= to_timestamp(350);
+---------------+--------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                             |
+---------------+--------------------------------------------------------------------------------------------------+
| logical_plan  | Filter: t1.time <= TimestampNanosecond(350000000000, None)                                       |
|               |   TableScan: t1 projection=[state, city, min_temp, area, time]                                   |
| physical_plan | CoalesceBatchesExec: target_batch_size=8192, statistics=[Rows=Absent, Bytes=Absent]              |
|               |   FilterExec: time@4 <= 350000000000, statistics=[Rows=Absent, Bytes=Absent]                     |
|               |     MemoryExec: partitions=1, partition_sizes=[1], statistics=[Rows=Exact(1), Bytes=Exact(2896)] |
|               |                                                                                                  |
+---------------+--------------------------------------------------------------------------------------------------+

Describe the solution you'd like

Always show statistics in explain verbose, regardless of the value of datafusion.explain.show_statistics.
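
As a sketch of the requested usage, the statement below reuses the t1 table and query from the example above, in a session where show_statistics has never been set. The hope is that each physical operator would carry the same statistics=[...] annotation shown in the second plan. The output is omitted because this behavior does not exist yet.

```sql
-- Desired: statistics appear here without running
--   set datafusion.explain.show_statistics = true;
explain verbose select * from t1 where time <= to_timestamp(350);
```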

Describe alternatives you've considered

No response

Additional context

No response

Labels: enhancement
