Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The current Dataset TableProvider is ugly: it converts DataFusion Exprs to PyArrow expressions in large match statements. I propose we move to Substrait as the method for translating between DataFusion and PyArrow. Substrait support was not yet available when this TableProvider was written.
Describe the solution you'd like
Now that both PyArrow and DataFusion support Substrait, we could clean up and improve the PyArrow Dataset TableProvider and ExecutionPlan by using pyarrow.substrait to execute the scan.
Describe alternatives you've considered
Keep the existing ugly Dataset code.