Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interface with libtopotoolbox through scikit-build-core and pybind11 #12

Merged
merged 1 commit into from
Apr 18, 2024

Conversation

wkearn
Copy link
Contributor

@wkearn wkearn commented Apr 12, 2024

The current approach using ctypes requires separately building and installing libtopotoolbox in an appropriate directory, which is tricky to get right in a cross-platform way without substantial intervention from users. This set of changes introduces a new build system using scikit-build-core and pybind11 to build an extension module that includes an interface to our library.

pyproject.toml is modified to switch the build backend to scikit-build-core to install pybind11 as a build-time dependency.

src/lib.cpp contains the bindings for libtopotoolbox, defined using pybind11. The PYBIND11_MODULE declaration exposes a Python module _lib within the topotoolbox package that contains the bindings. Right now, it only binds has_topotoolbox, which has a trivial binding because it requires no marshalling of Python data into C/C++ types. Other functions such as fillsinks will require more complex binding declarations, but pybind11 provides an interface to working with NumPy arrays that should be helpful.

CMakeLists.txt is used by scikit-build-core to build the package. It follows the guidance at
https://scikit-build-core.readthedocs.io/en/latest/getting_started.html with the exception of adding libtopotoolbox using
FetchContent. This will download the source from GitHub and then build a /static/ library for libtopotoolbox and link it into the /shared/ library that is built by pybind11.

src/topotoolbox/init.py shows how to access the binding from the _lib module. In practice, we probably don't want to expose the raw bindings from the topotoolbox module but instead use them to implement methods within the proposed GridObject class.

Next steps

It should be possible with these changes to run pip install . from the pytopotoolbox repository and have everything built and installed properly.

I am not committed to this approach being the way forward. On the positive site, it seems to work and simplify things a lot for the user. What I have read suggests that the C extension approach generally performs better than using ctypes, though I don't think we would really see the effects of performance problems yet. I also think using pybind11 gives us a lot more control over how data are passed into and out of the library that might come in handy as things start to get more complex.

On the negative side, it does require us to maintain the bindings in src/lib.cpp alongside our C library and the Python package: the bindings are not automatically generated from the libtopotoolbox header file or anything like that. It would be instructive to implement the pybind11 bindings for a more complicated function like fillsinks to see whether this becomes a real pain to develop and maintain.

@Teschl what are your thoughts?

If we decide to go this route, we should probably merge #11 first, then rebase this on those changes. Then we can add the fillsinks method to GridObject and create a binding for it in src/lib.cpp.

If merged, this solves #10.

@Teschl
Copy link
Contributor

Teschl commented Apr 12, 2024

I agree, the next step should definitely be to implement the pybind for fillsinks. As of right now, I don't understand the functionality completely, especially how to pass the array from the python class to C++. I think we should merge #11 anyway, so that we have a better base to move forward. The name of Gridobject is something we should be able to change very quickly if the need arises.

@Teschl
Copy link
Contributor

Teschl commented Apr 18, 2024

I have now built a version that includes fillsinks with pybind11. I think we should definitely use it over ctypes, just for the ease of building with pip. In order to pass arrays to our libtopotoolbox, we have to create a wrapper for every function, so that we can pass the pointer properly. That means the lib.cpp file will probably get quite big; I will look into separating it into a few (maybe one file for GRIDobj, FLOWobj, ...) files. Other than that, it's very simple to use, which is great.

@wkearn, I was thinking of adding the fillsinks function to the #11 pull request, since I based it on that file structure it would make things very easy. What do you think?

@wkearn
Copy link
Contributor Author

wkearn commented Apr 18, 2024 via email

The current approach using `ctypes` requires separately building and
installing libtopotoolbox in an appropriate directory, which is tricky
to get right in a cross-platform way without substantial intervention
from users. This set of changes introduces a new build system using
scikit-build-core and pybind11 to build an extension module that
includes an interface to our library.

pyproject.toml is modified to switch the build backend to
scikit-build-core to install pybind11 as a build-time dependency.

src/lib.cpp contains the bindings for libtopotoolbox, defined using
pybind11. The PYBIND11_MODULE declaration exposes a Python module _lib
within the topotoolbox package that contains the bindings. Right now,
it only binds `has_topotoolbox`, which has a trivial binding because
it requires no marshalling of Python data into C/C++ types. Other
functions such as `fillsinks` will require more complex binding
declarations, but pybind11 provides an interface to working with NumPy
arrays that should be helpful.

CMakeLists.txt is used by scikit-build-core to build the package. It
follows the guidance at
https://scikit-build-core.readthedocs.io/en/latest/getting_started.html
with the exception of adding libtopotoolbox using
FetchContent. This will download the source from GitHub and then build
a /static/ library for libtopotoolbox and link it into the /shared/
library that is built by pybind11.

src/topotoolbox/__init__.py shows how to access the binding from the
_lib module. In practice, we probably don't want to expose the raw
bindings from the topotoolbox module but instead use them to implement
methods within the proposed `GridObject` class.
@wkearn wkearn marked this pull request as ready for review April 18, 2024 14:44
@wkearn wkearn merged commit a5f394f into TopoToolbox:main Apr 18, 2024
@wkearn
Copy link
Contributor Author

wkearn commented Apr 18, 2024

@Teschl: #11 and #12 are now merged, so feel free to make a pull request for your fillsinks binding whenever you are ready.

@Teschl
Copy link
Contributor

Teschl commented Apr 18, 2024

Great, will do!

@wkearn wkearn added the build label May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants