-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-9683: [Rust][DataFusion] Add debug printing to physical plans and associated types #7925
Conversation
@@ -61,6 +73,12 @@ pub struct DatasourcePartition { | |||
batch_iter: Arc<Mutex<dyn RecordBatchReader + Send + Sync>>, | |||
} | |||
|
|||
impl Debug for DatasourcePartition { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RecordBatchReader
does not implement Debug
so had to implement directly here.
@@ -110,6 +112,12 @@ struct ParquetPartition { | |||
iterator: Arc<Mutex<dyn RecordBatchReader + Send + Sync>>, | |||
} | |||
|
|||
impl Debug for ParquetPartition { | |||
fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result { | |||
f.debug_struct("ParquetPartition").finish() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this is not #derive
because RecordBatchReader
does not implement Debug
.field("name", &self.name) | ||
.field("args", &self.args) | ||
.field("return_type", &self.return_type) | ||
.field("fun", &"<FUNC>") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ScanarUdf
link is typedef'd to a function pointer and does not implement Debug
leading to the two implementations in this file
…nd associated types
e21b7a5
to
9a59730
Compare
Rebased to pick up 37ee600 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @alamb
…nd associated types For ARROW-9653, I was trying to debug the execution plan and I would have found it easier if there had been a way to display the execution plan to better understand and isolate the issue. This would also be nice to have as part of EXPLAIN plan functionality in ARROW-9654 In general, for debugging purposes, we would like to be able to dump out an execution plan. To do so in the idiomatic rust way, I made `ExecutionPlan` also implement `std::fmt::Debug` and then followed `rustc` guidance until I got everything that was needed Here is an example plan for `"SELECT c1, c2, MIN(c3) FROM aggregate_test_100 GROUP BY c1, c2"` when printed using `println!("{:#?}", plan)`: ``` physical plan is HashAggregateExec { group_expr: [ Column { name: "c1", }, Column { name: "c2", }, ], aggr_expr: [ Min { expr: Column { name: "c3", }, }, ], input: DataSourceExec { schema: Schema { fields: [ Field { name: "c1", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, }, Field { name: "c2", data_type: UInt32, nullable: false, dict_id: 0, dict_is_ordered: false, }, Field { name: "c3", data_type: Int8, nullable: false, dict_id: 0, dict_is_ordered: false, }, ], metadata: {}, }, partitions.len: 1, }, schema: Schema { fields: [ Field { name: "c1", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, }, Field { name: "c2", data_type: UInt32, nullable: true, dict_id: 0, dict_is_ordered: false, }, Field { name: "MIN(c3)", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, }, ], metadata: {}, }, } ``` Closes apache#7925 from alamb/alamb/ARROW-9683-debug Authored-by: alamb <andrew@nerdnetworks.org> Signed-off-by: Andy Grove <andygrove@nvidia.com>
For ARROW-9653, I was trying to debug the execution plan and I would have found it easier if there had been a way to display the execution plan to better understand and isolate the issue. This would also be nice to have as part of EXPLAIN plan functionality in ARROW-9654
In general, for debugging purposes, we would like to be able to dump out an execution plan. To do so in the idiomatic rust way, I made
ExecutionPlan
also implementstd::fmt::Debug
and then followedrustc
guidance until I got everything that was neededHere is an example plan for
"SELECT c1, c2, MIN(c3) FROM aggregate_test_100 GROUP BY c1, c2"
when printed usingprintln!("{:#?}", plan)
: