[Python] Add convenience function to convert pandas.DataFrame to pyarrow.Buffer containing a file or stream representation #16228

asfimport · 2017-03-03T14:28:42Z

Reporter: Wes McKinney / @wesm
Assignee: Phillip Cloud / @cpcloud

Note: This issue was originally created as ARROW-596. Please see the migration documentation for further details.

asfimport · 2017-03-03T14:35:11Z

Matthew Rocklin / @mrocklin:
For network applications a nice interface is something that we can pass to socket.send. This might be something like a bytes, bytearray, memoryview, or sequence of those.
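As a rough illustration of the interface described above, the sketch below passes bytes, bytearray, and memoryview objects to a socket unchanged. It uses only the standard library; `send_all_kinds` is a hypothetical helper name, and the local socket pair stands in for a real network connection.

```python
import socket


def send_all_kinds(payloads):
    """Send each bytes-like payload over a local socket pair; return the received bytes."""
    left, right = socket.socketpair()
    try:
        for payload in payloads:
            # sendall accepts any object that supports the buffer protocol
            left.sendall(payload)
        left.shutdown(socket.SHUT_WR)  # signal end of stream to the reader

        chunks = []
        while True:
            chunk = right.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        left.close()
        right.close()


data = bytearray(b"arrow-stream")  # placeholder payload, not real Arrow data
received = send_all_kinds([b"hdr:", data, memoryview(data)[:5]])
```

Any object that exposes the buffer protocol could be sent the same way, which is why a zero-copy view is preferable to an intermediate bytes object.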

asfimport · 2017-03-03T14:38:39Z

Wes McKinney / @wesm:
pyarrow.Buffer has a to_pybytes method as a last resort. Converting to bytes is undesirable because it copies the memory.

Is there a way to do a zero-copy memory handoff to a memoryview? I can dig into how NumPy provides a zero-copy buffer interface, but I'm asking in case you know the right tool offhand.
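The difference between copying to bytes and handing off a memoryview can be seen with standard-library types alone. This is only a sketch: a bytearray stands in for the memory owned by a pyarrow.Buffer, which is an assumption made for illustration.

```python
source = bytearray(b"abcdef")  # stand-in for memory owned by another library

copied = bytes(source)         # allocates new memory and copies the contents
view = memoryview(source)      # shares the same memory, no copy

# Mutate the underlying memory after both objects were created.
source[0] = ord("z")

copy_unchanged = copied[:1] == b"a"           # the copy still holds the old byte
view_sees_change = bytes(view[:1]) == b"z"    # the view reflects the new byte
shares_owner = view.obj is source             # the view keeps a reference to its exporter
```

The view also keeps its exporter alive through `view.obj`, so the memory cannot be freed while a consumer still holds the memoryview.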

asfimport · 2017-03-03T14:46:21Z

Matthew Rocklin / @mrocklin:
I've never had to construct one myself. I just grab my_numpy_array.data and pass that around. I'll ask Antoine Pitrou to chime in here. I suspect that he would have a better understanding.

asfimport · 2017-03-13T08:51:12Z

Antoine Pitrou / @pitrou:
Cython allows you to implement the buffer protocol: see https://cython.readthedocs.io/en/latest/src/userguide/buffer.html . I've never used it but it looks similar to what you would do in C.

Note that pyarrow.Buffer needs to be a fixed-size buffer for that operation to make sense. If not, then getbuffer should lock the buffer size until releasebuffer is called.
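The size-locking behaviour mentioned above can be observed with the built-in bytearray, which refuses to resize while a buffer export is outstanding. This is a sketch of the general buffer-protocol contract using a standard-library type, not a description of how pyarrow.Buffer is implemented.

```python
buf = bytearray(b"fixed")
view = memoryview(buf)          # acquires a buffer export (getbuffer)

try:
    buf.extend(b"-more")        # resizing while exported is rejected
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()                  # releases the export (releasebuffer)
buf.extend(b"-more")            # resizing is allowed again
```

A fixed-size buffer avoids this bookkeeping entirely, since there is no resize operation to guard against.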

asfimport · 2017-03-13T14:29:57Z

Wes McKinney / @wesm:
Thanks @pitrou, it is a fixed-size buffer, so I think that will work out fine. This is being worked on right now in #369

asfimport · 2017-04-07T15:55:09Z

Wes McKinney / @wesm:
related to ARROW-376

asfimport · 2017-04-29T18:09:40Z

Wes McKinney / @wesm:
This is part of ARROW-881 #612

asfimport · 2017-04-30T20:28:42Z

Wes McKinney / @wesm:
I moved this to 0.4 since we need to come up with a standard spec for the index metadata that other serializers (e.g. fastparquet) can also conform to.

asfimport · 2017-06-06T20:56:16Z

Wes McKinney / @wesm:
This was done in bed0197#diff-c05ae8aa370c62f401b6d29ae2e8ea3e

asfimport closed this as completed Jun 6, 2017

asfimport assigned cpcloud Jan 10, 2023
