GH-33926: [Python] DataFrame Interchange Protocol for pyarrow.RecordBatch #34294

Merged: 11 commits, Feb 28, 2023

AlenkaF (Member) commented Feb 22, 2023

Rationale for this change

Add an implementation of the DataFrame Interchange Protocol for pyarrow.RecordBatch. The protocol is already implemented for pyarrow.Table; see #14804.

Are these changes tested?

Yes, tests are added to:

  • python/pyarrow/tests/interchange/test_interchange_spec.py
  • python/pyarrow/tests/interchange/test_conversion.py

AlenkaF marked this pull request as ready for review February 23, 2023 10:29

jorisvandenbossche (Member) left a comment

Already looking good! Seeing the code that uses RecordBatch reminds me that we should add some small conveniences to RecordBatch that already exist on Table and would be easy to add here as well.

Review threads (all resolved):

  • python/pyarrow/interchange/dataframe.py (outdated)
  • python/pyarrow/interchange/dataframe.py (outdated)
  • python/pyarrow/table.pxi
  • python/pyarrow/interchange/from_dataframe.py (outdated)
  • python/pyarrow/tests/interchange/test_conversion.py (outdated)

AlenkaF force-pushed the dataframe-protocol-recordbatch branch from 1eb8cf3 to ef6983d on February 27, 2023 10:47

jorisvandenbossche (Member) left a comment

Thanks for the updates, looks good to me!

jorisvandenbossche merged commit 6cf5e89 into apache:main on Feb 28, 2023
AlenkaF deleted the dataframe-protocol-recordbatch branch on February 28, 2023 10:04

ursabot commented Mar 1, 2023

Benchmark runs are scheduled for baseline = 61c9a74 and contender = 6cf5e89. 6cf5e89 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.46% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.13% ⬆️0.13%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 6cf5e898 ec2-t3-xlarge-us-east-2
[Finished] 6cf5e898 test-mac-arm
[Finished] 6cf5e898 ursa-i9-9960x
[Finished] 6cf5e898 ursa-thinkcentre-m75q
[Finished] 61c9a749 ec2-t3-xlarge-us-east-2
[Finished] 61c9a749 test-mac-arm
[Finished] 61c9a749 ursa-i9-9960x
[Finished] 61c9a749 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Linked issue: [Python] DataFrame Interchange Protocol for pyarrow.RecordBatch
3 participants