[Python] Add from_pylist() and to_pylist() to pyarrow.Table to convert list of records #22407
Comments
Wes McKinney / @wesm: https://github.com/apache/arrow/blob/master/python/pyarrow/table.pxi#L1021 I think |
David Lee / @davlee1972: |
Antoine Pitrou / @pitrou: |
Table.from_dict in 0.14.1 looks fine. The code I originally reviewed iterated through the ordered dictionary keys instead of the schema field names. Here are some test samples for to_pylist() and from_pylist():
import pyarrow as pa

# from_pylist() and to_pylist() are assumed to be the helper functions
# from the code attached to this issue; they are not defined here.
test_schema = pa.schema([
    pa.field('id', pa.int16()),
    pa.field('struct_test', pa.list_(pa.struct([pa.field("child_id", pa.int16()), pa.field("child_name", pa.string())]))),
    pa.field('list_test', pa.list_(pa.int16()))
])
test_data = [
    {'id': 1, 'struct_test': [{'child_id': 11, 'child_name': '_11'}, {'child_id': 12, 'child_name': '_12'}], 'list_test': [1, 2, 3]},
    {'id': 2, 'struct_test': [{'child_id': 21, 'child_name': '_21'}], 'list_test': [4, 5]}
]
test_tbl = from_pylist(test_data, schema=test_schema)
test_list = to_pylist(test_tbl)
test_tbl
test_list
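To illustrate the point about iterating over schema field names rather than dictionary keys, here is a minimal pure-Python sketch. It is not the code from this issue or from pyarrow; `records_to_columns` and its `field_names` parameter are hypothetical names, with `field_names` standing in for the names of a schema's fields.

```python
def records_to_columns(records, field_names):
    """Pivot a list of row dicts into a dict of column lists.

    Columns follow `field_names` (standing in for schema field names),
    not the key order of any individual record. Missing keys become
    None; keys not listed in `field_names` are ignored.
    """
    return {name: [row.get(name) for row in records] for name in field_names}


rows = [
    {"list_test": [1, 2, 3], "id": 1},  # keys in a different order
    {"id": 2, "extra": "dropped"},      # one missing key, one extra key
]
columns = records_to_columns(rows, ["id", "list_test"])
print(columns)  # {'id': [1, 2], 'list_test': [[1, 2, 3], None]}
```

Driving the loop from the field names means the column order and column count are fixed by the schema, whatever each record happens to contain.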
Joris Van den Bossche / @jorisvandenbossche: Since we have now |
Joris Van den Bossche / @jorisvandenbossche:
I personally would not add such functionality to For So I think new methods such as |
Antoine Pitrou / @pitrou: However, a question remains: does |
Joris Van den Bossche / @jorisvandenbossche: |
I noticed that pyarrow.Table.to_pydict() exists, but pyarrow.Table.from_pydict() doesn't exist. There is a proposed ticket to create one, but it doesn't take into account potential mismatches between column order and number of columns.
I'm including some code I've written, which I've been using to handle Arrow conversions to ordered dictionaries and lists of dictionaries. I've also included an example where this can be used to speed up pandas.to_dict() by a factor of 6.
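The reporter's code is not reproduced in this thread. As a rough illustration only, the column-to-records direction (the shape a to_pylist() helper would return) could be sketched in plain Python as follows; `columns_to_records` is a hypothetical name, and the input is assumed to be a dictionary of equal-length column lists, as returned by to_pydict().

```python
def columns_to_records(columns):
    """Turn a dict of equal-length column lists into a list of row dicts."""
    names = list(columns)
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    # zip(*values) walks the columns in parallel, yielding one row at a time.
    return [dict(zip(names, row)) for row in zip(*columns.values())]


cols = {"id": [1, 2], "list_test": [[1, 2, 3], [4, 5]]}
records = columns_to_records(cols)
print(records)
# [{'id': 1, 'list_test': [1, 2, 3]}, {'id': 2, 'list_test': [4, 5]}]
```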
Here are my benchmarks comparing the pandas to Arrow to Python route against pandas.to_dict():
Reporter: David Lee / @davlee1972
Assignee: Alenka Frim / @AlenkaF
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-6001. Please see the migration documentation for further details.