[Python] Consolidate shared methods of RecordBatch and Table #30559
Comments
Suhail Razzak: |
Todd Farmer / @toddfarmer: |
apache#34980)

### Rationale for this change

This is an incremental first step towards apache#30559

### What changes are included in this PR?

Introduce `class _Table` in `table.pxi`.

### Are these changes tested?

Existing pytests will check for regression.

### Are there any user-facing changes?

No

* Closes: apache#34979

Authored-by: Dane Pitkin <dane@voltrondata.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
### Rationale for this change

These methods are present on `Table` but missing on `RecordBatch`:

* `add_column`
* `append_column`
* `remove_column`
* `set_column`
* `drop_columns`
* `rename_columns`
* `cast`

We also should probably accept a `dict` as input to `pa.record_batch` like we do for `pa.table`.

### What changes are included in this PR?

Add the methods.

### Are these changes tested?

Yes.

* Parent issue: #36399
* Related: #30559
* Closes #30915
* GitHub Issue: #30915

Lead-authored-by: Judah Rand <17158624+judahrand@users.noreply.github.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
`RecordBatch` and `Table` have a number of similar methods that don't directly interact with the C++ pointer and could therefore be shared in a common base class.

In addition, some methods on `Table` would also be useful for `RecordBatch` (e.g. `cast`, `group_by`, `drop`, `select`, `sort_by`, `rename_columns`); these could also be shared through a common mixin.

Reporter: Joris Van den Bossche / @jorisvandenbossche
Related issues:

* `add_column` method missing in `pyarrow.RecordBatch` #30915 (relates to)

Other ideas without dedicated issues:

* `pa.record_batch(..)` constructor consistent with `pa.table(..)` (e.g. accepting a dict of column names -> column values)

Note: This issue was originally created as ARROW-15042. Please see the migration documentation for further details.
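
To make the base-class / mixin idea a bit more concrete, here is a rough pure-Python sketch. It is not the pyarrow implementation (the real classes live in Cython in `table.pxi`), and every name in it (`_TabularMixin`, `ToyRecordBatch`, `ToyTable`) is hypothetical. The point is only that methods such as `select`, `drop_columns` and `rename_columns` can be written once against a few primitives (`column_names`, `column`, `from_arrays`) and never touch the underlying storage. The toy `from_pydict` also illustrates accepting a dict of column names -> column values.

```python
from abc import ABC, abstractmethod


class _TabularMixin(ABC):
    """Hypothetical mixin: shared methods built only on three primitives."""

    @property
    @abstractmethod
    def column_names(self):
        """Return the column names as a list of str."""

    @abstractmethod
    def column(self, i):
        """Return the i-th column in the subclass's own representation."""

    @classmethod
    @abstractmethod
    def from_arrays(cls, arrays, names):
        """Build a new instance of the concrete class."""

    @classmethod
    def from_pydict(cls, mapping):
        # Accept a dict of column name -> column values for both classes.
        return cls.from_arrays(list(mapping.values()), list(mapping.keys()))

    # Shared methods: none of these look at the underlying storage.
    def select(self, names):
        indices = [self.column_names.index(name) for name in names]
        return self.from_arrays([self.column(i) for i in indices], list(names))

    def drop_columns(self, names):
        missing = [name for name in names if name not in self.column_names]
        if missing:
            raise KeyError(f"Column(s) not found: {missing}")
        keep = [name for name in self.column_names if name not in names]
        return self.select(keep)

    def rename_columns(self, new_names):
        if len(new_names) != len(self.column_names):
            raise ValueError("new_names must have one entry per column")
        columns = [self.column(i) for i in range(len(new_names))]
        return self.from_arrays(columns, list(new_names))


class ToyRecordBatch(_TabularMixin):
    """Toy stand-in for RecordBatch: each column is one contiguous list."""

    def __init__(self, arrays, names):
        self._arrays = [list(a) for a in arrays]
        self._names = list(names)

    @property
    def column_names(self):
        return list(self._names)

    def column(self, i):
        return self._arrays[i]

    @classmethod
    def from_arrays(cls, arrays, names):
        return cls(arrays, names)


class ToyTable(_TabularMixin):
    """Toy stand-in for Table: each column is a list of chunks."""

    def __init__(self, chunked_arrays, names):
        self._chunked = [[list(chunk) for chunk in col] for col in chunked_arrays]
        self._names = list(names)

    @property
    def column_names(self):
        return list(self._names)

    def column(self, i):
        return self._chunked[i]

    @classmethod
    def from_arrays(cls, arrays, names):
        return cls(arrays, names)


batch = ToyRecordBatch.from_pydict({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
table = ToyTable([[[1], [2]], [[3], [4]]], ["a", "b"])

print(batch.drop_columns(["b"]).column_names)
print(table.rename_columns(["x", "y"]).column_names)
```

Both toy classes get the three shared methods without any duplicated code, and each method returns an instance of the concrete class it was called on. Methods that need the C++ object (for example anything that dispatches to a compute kernel) would presumably still have to be implemented per class.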