ARROW-819: Public Cython and C++ API in the style of lxml, arrow::py::import_pyarrow method #680

Closed
wants to merge 5 commits into from

wesm
Member

@wesm wesm commented May 12, 2017

I have been looking at LXML's approach to creating both a public Cython API and C++ API

https://github.com/lxml/lxml

While this may seem like a somewhat radical reorganization of the code, putting all of the main symbols in a single Cython extension makes generating a C++ API for them significantly simpler. By using .pxi files we can break the codebase into as small pieces as we like (as long as there are no circular dependencies). As a convenient side effect, the build times are shorter.

wesm added 4 commits May 12, 2017 17:26
…c C API easier

Change-Id: I3879a8f0546c88959f4468f69f41044455c7978a
Change-Id: Ifc6b97f7bae539c7c072e6c4ddbba8b8dbd06bc5
Change-Id: I8eb4db35868420d374e03ebb9305e2884b3cac44
Change-Id: I21e6a311273cda7eb793f8430581ba06b2393913
@wesm
Member Author

wesm commented May 12, 2017

OK, now we have

$ nm -g debug/libarrow_python.so | c++filt | grep import
000000000010e802 T arrow::py::import_pyarrow()

$ nm -g debug/libarrow_python.so | c++filt | grep wrap
000000000010e885 T arrow::py::wrap_array(std::shared_ptr<arrow::Array> const&)
000000000010e849 T arrow::py::wrap_field(std::shared_ptr<arrow::Field> const&)
000000000010e8df T arrow::py::wrap_table(std::shared_ptr<arrow::Table> const&)
000000000010e80d T arrow::py::wrap_buffer(std::shared_ptr<arrow::Buffer> const&)
000000000010e8c1 T arrow::py::wrap_column(std::shared_ptr<arrow::Column> const&)
000000000010e867 T arrow::py::wrap_schema(std::shared_ptr<arrow::Schema> const&)
000000000010e8a3 T arrow::py::wrap_tensor(std::shared_ptr<arrow::Tensor> const&)
000000000010e82b T arrow::py::wrap_data_type(std::shared_ptr<arrow::DataType> const&)
000000000010e8fd T arrow::py::wrap_record_batch(std::shared_ptr<arrow::RecordBatch> const&)

I haven't tried using these in a C extension yet (open to ideas on how to test this), but I think this works!
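As a starting point for testing, the call order a client would follow can be sketched without pyarrow at all. The block below assumes the usual Cython `api` mechanism, in which the generated import function resolves function pointers at run time and returns 0 on success. Everything inside namespace `sketch` is a hypothetical stand-in: only the names `import_pyarrow` and `wrap_array` mirror the symbols listed above, and returning `nullptr` before the import step is a choice made for this sketch, not verified pyarrow behavior.

```cpp
#include <cassert>
#include <memory>

namespace sketch {

// Hypothetical stand-ins for arrow::Array and PyObject; not the real types.
struct Array {
  int length;
};
struct PyObjectStub {
  std::shared_ptr<Array> wrapped;
};

using WrapArrayFn = PyObjectStub* (*)(const std::shared_ptr<Array>&);

// Slot that stays null until the import step has resolved it.
static WrapArrayFn wrap_array_slot = nullptr;

// Stands in for the implementation that lives inside the extension module.
static PyObjectStub* wrap_array_in_module(const std::shared_ptr<Array>& arr) {
  return new PyObjectStub{arr};
}

// Mirrors the shape of arrow::py::import_pyarrow(): returns 0 on success.
int import_pyarrow() {
  wrap_array_slot = &wrap_array_in_module;
  return 0;
}

// Mirrors the shape of arrow::py::wrap_array(); the null check is an
// assumption of this sketch so that misuse is observable in a test.
PyObjectStub* wrap_array(const std::shared_ptr<Array>& arr) {
  if (wrap_array_slot == nullptr) return nullptr;
  return wrap_array_slot(arr);
}

}  // namespace sketch
```

A real test would replace the stand-ins with a small extension module that calls `arrow::py::import_pyarrow()` once during initialization, checks the return code, and only then calls the `wrap_*` functions.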

@wesm wesm changed the title from "WIP ARROW-819: Work toward public Cython and C API in the style of lxml" to "ARROW-819: Public Cython and C API in the style of lxml, arrow::py::import_pyarrow method" May 12, 2017
@wesm
Member Author

wesm commented May 12, 2017

This conflicts with #679; I can rebase this after that is merged.

Member

@xhochy xhochy left a comment


👍 , amazing work! This is probably the last piece I needed for the turbodbc work to continue.

return ::import_pyarrow__lib();
}

PyObject* wrap_buffer(const std::shared_ptr<Buffer>& buffer) {
Member

You won't be able to call these from C without wrapping them in `extern "C" { .. }` due to name mangling. Also, I would be astonished if plain C code could instantiate `std::shared_ptr`.

Member Author

I updated the PR title and comments to indicate this is a C++ API and not a C API (because the arguments are STL types)
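The distinction raised in this review thread can be illustrated with a minimal, self-contained sketch. It is not code from this PR: `Buffer`, the `demo` namespace, and every `demo_buffer_*` function are hypothetical names chosen for illustration. It shows a C++-only function that takes a `std::shared_ptr`, next to an `extern "C"` shim that hides the smart pointer behind an opaque handle so that only C types appear in the exported signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for a C++ class such as arrow::Buffer.
struct Buffer {
  std::string data;
};

namespace demo {
// C++-only API: the symbol is name-mangled and the argument is an STL type,
// so plain C code can neither link to it by name nor construct its argument.
std::size_t buffer_size(const std::shared_ptr<Buffer>& buffer) {
  return buffer->data.size();
}
}  // namespace demo

// C-compatible shim: unmangled names and an opaque handle instead of shared_ptr.
extern "C" {
typedef struct demo_buffer_handle demo_buffer_handle;

// Allocates a heap-held shared_ptr and returns it as an opaque handle.
demo_buffer_handle* demo_buffer_create(const char* text) {
  auto* owner =
      new std::shared_ptr<Buffer>(std::make_shared<Buffer>(Buffer{text}));
  return reinterpret_cast<demo_buffer_handle*>(owner);
}

// Forwards to the C++ API after recovering the shared_ptr from the handle.
size_t demo_buffer_size(const demo_buffer_handle* handle) {
  const auto* owner =
      reinterpret_cast<const std::shared_ptr<Buffer>*>(handle);
  return demo::buffer_size(*owner);
}

// Releases the handle and drops its reference to the Buffer.
void demo_buffer_destroy(demo_buffer_handle* handle) {
  delete reinterpret_cast<std::shared_ptr<Buffer>*>(handle);
}
}
```

The functions added in this PR take the first form, which is why the title was changed to say C++ API rather than C API.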

@wesm wesm changed the title from "ARROW-819: Public Cython and C API in the style of lxml, arrow::py::import_pyarrow method" to "ARROW-819: Public Cython and C++ API in the style of lxml, arrow::py::import_pyarrow method" May 13, 2017
Change-Id: Icad57e6d5e9ee5302e8623664c3c58ac363bdd69
@wesm
Member Author

wesm commented May 13, 2017

I will merge this as soon as I get a green build, then rebase #679 since there may be some more discussion about names for the IPC reader/writer classes

@asfgit asfgit closed this in 9e875a6 May 13, 2017
@wesm wesm deleted the ARROW-819 branch May 13, 2017 19:55
jeffknupp pushed a commit to jeffknupp/arrow that referenced this pull request Jun 3, 2017
…:import_pyarrow method

I have been looking at LXML's approach to creating both a public Cython API and C++ API

https://github.com/lxml/lxml

While this may seem like a somewhat radical reorganization of the code, putting all of the main symbols in a single Cython extension makes generating a C++ API for them significantly simpler. By using `.pxi` files we can break the codebase into as small pieces as we like (as long as there are no circular dependencies). As a convenient side effect, the build times are shorter.

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes apache#680 from wesm/ARROW-819 and squashes the following commits:

9e6ee24 [Wes McKinney] Fix up optional extensions
cff757d [Wes McKinney] Expose pyarrow C API in arrow/python/pyarrow.h
b39d19c [Wes McKinney] Fix test suite. Move _config into lib
ff1b5e5 [Wes McKinney] Rename things a bit
d4a8391 [Wes McKinney] Reorganize Cython code in the style of lxml so make declaring a public C API easier
pcmoritz pushed a commit to pcmoritz/arrow that referenced this pull request Jun 11, 2017