Python library for Apache Arrow
This library provides a Python API for functionality provided by the Arrow C++ libraries, along with tools for Arrow integration and interoperability with pandas, NumPy, and other software in the Python ecosystem.
Across platforms, you can install a recent version of pyarrow with the conda package manager:
conda install pyarrow -c conda-forge
On Linux, macOS, and Windows, you can also install binary wheels from PyPI with pip:
pip install pyarrow
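After either install method, one way to confirm that the package is visible to your interpreter is to query its installed version. This is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical and not part of pyarrow.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pyarrow")
    if version is None:
        print("pyarrow is not installed in this environment")
    else:
        print("pyarrow", version, "is installed")
```

Checking package metadata rather than importing the module keeps the check cheap and avoids loading the native libraries.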
We follow a PEP8-like coding style similar to that of the pandas project. The code must pass flake8 (available from pip or conda) or it will fail the build. Check for style errors before submitting your pull request with:
flake8 .
flake8 --config=.flake8.cython .
Building from Source
See the Development page in the documentation.
Running the unit tests
We use pytest to develop our unit test suite. After building the project with setup.py build_ext --inplace, you can run its unit tests like so:
pytest pyarrow
The project has a number of custom command line options for its test suite; for example, some tests are disabled by default. To see all the options, run
pytest pyarrow --help
and look for the "custom options" section.
For running the benchmarks, see the Sphinx documentation.
Building the documentation
pip install -r ../docs/requirements.txt
python setup.py build_sphinx -s ../docs/source