[C++] Add an order_by node which can reassign an ordering mid-plan #34248

Closed
westonpace opened this issue Feb 18, 2023 · 0 comments · Fixed by #34249 or #34654

Describe the enhancement requested

#34136 added ordering to exec plans. We can now turn our order_by_sink node into an actual order_by node that can be placed anywhere in a plan, not only at the end. It can reuse the same naive implementation as order_by_sink.
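
To illustrate what a mid-plan order_by node means, here is a minimal sketch in plain Python rather than Acero code. It assumes the "naive implementation" amounts to buffering all input, sorting once, and then emitting the result downstream; that reading is an assumption, not something stated in this issue. The names `order_by_node`, `filter_node`, and `run_plan` are hypothetical and exist only for this example.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

# Hypothetical stand-in for a record batch: a list of row dicts.
Batch = List[Dict[str, Any]]


def order_by_node(batches: Iterable[Batch], key: str,
                  descending: bool = False) -> Iterator[Batch]:
    """Naive order-by (assumed behavior): buffer all input, sort once, emit."""
    rows = [row for batch in batches for row in batch]
    rows.sort(key=lambda row: row[key], reverse=descending)
    if rows:
        # Emit everything as one batch; a real node would likely re-chunk.
        yield rows


def filter_node(batches: Iterable[Batch],
                predicate: Callable[[Dict[str, Any]], bool]) -> Iterator[Batch]:
    """Downstream node that keeps the row order it receives."""
    for batch in batches:
        kept = [row for row in batch if predicate(row)]
        if kept:
            yield kept


def run_plan() -> List[int]:
    # source -> order_by -> filter: the ordering is assigned mid-plan,
    # and later nodes consume already ordered batches.
    source: List[Batch] = [[{"a": 3}, {"a": 1}], [{"a": 2}, {"a": 5}]]
    ordered = order_by_node(source, key="a")
    filtered = filter_node(ordered, lambda row: row["a"] > 1)
    return [row["a"] for batch in filtered for row in batch]


print(run_plan())
```

The point of the sketch is only that the sort no longer has to be the final step: nodes after it can rely on the ordering it assigns.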

Component(s)

C++

jorisvandenbossche added a commit that referenced this issue Mar 3, 2023
…eOptions classes (#34102)

First step for GH-33976, adding basic bindings for the different ExecNodeOptions classes and the Declaration class to combine those in a query.

Some notes on what is and what is not included in this PR:

* For source nodes, didn't expose the generic `SourceNodeOptions` et al., only the concrete `TableSourceNodeOptions` (should probably also add `RecordBatchReaderSourceNodeOptions`).
* Didn't yet expose any sink nodes. The table sink is used implicitly by `Declaration.to_table()`, and since there is currently no explicit API to manually convert to an ExecPlan and execute it, explicit table sink node bindings didn't seem necessary.
* Also didn't yet expose the order_by sink node, because it requires a custom sink when collecting as a Table, and it's not immediately clear how to do that with the Declaration interface. This requires #34248 to be fixed first.
* Leaving dataset-based scan and write nodes for a follow-up PR.
* Added a basic `Declaration` class with a `to_table()` method to execute the plan and collect the result into a Table, and a `to_reader()` method to get a RecordBatchReader (could also add a `to_batches()` method).

--

* Issue: #33976

Lead-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
westonpace added a commit that referenced this issue Mar 21, 2023
* Closes: #34248

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
@westonpace westonpace added this to the 12.0.0 milestone Mar 21, 2023
jorisvandenbossche added a commit that referenced this issue Mar 22, 2023
Adds Python bindings for the OrderByNode added in #34249 
* Closes: #34248

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>