ARROW-2014: [Python] Document read_pandas method in pyarrow.parquet #1820


@@ -68,7 +68,8 @@ Let's look at a simple table:
df = pd.DataFrame({'one': [-1, np.nan, 2.5],
'two': ['foo', 'bar', 'baz'],
-'three': [True, False, True]})
+'three': [True, False, True]},
table = pa.Table.from_pandas(df)
We write this to Parquet format with ``write_table``:
@@ -94,6 +95,13 @@ the whole file (due to the columnar layout):
pq.read_table('example.parquet', columns=['one', 'three'])
When reading a subset of columns from a file that was written from a pandas
DataFrame, we use ``read_pandas`` to preserve any additional index column data:

.. ipython:: python

   pq.read_pandas('example.parquet', columns=['two']).to_pandas()

We need not use a string to specify the origin of the file. It can be any of:

* A file path as a string