DM-23308: Add CMake support #34

Merged
merged 6 commits into master from tickets/DM-23308 on Nov 3, 2021

Conversation

@fritzm (Contributor) commented Oct 28, 2021

Adds ability to build package with CMake in addition to eupspkg (allows to be included as submodule in other CMake projects, e.g. Qserv.)
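
For reference, a minimal sketch of how a parent CMake project such as Qserv could consume this package as a git submodule; the directory layout, project name, and target names here are illustrative assumptions, not the actual Qserv or sphgeom configuration.

# Hypothetical parent project CMakeLists.txt (all names are placeholders).
cmake_minimum_required(VERSION 3.15)
project(parent_project LANGUAGES CXX)

# sphgeom checked out as a git submodule under extern/sphgeom;
# its own CMakeLists.txt (added by this PR) defines its library target.
add_subdirectory(extern/sphgeom)

add_executable(spatial_tool main.cc)

# Link against the library target assumed to be provided by sphgeom's CMake build.
target_link_libraries(spatial_tool PRIVATE sphgeom)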

@timj (Member) commented Oct 28, 2021

Please add a new github action that builds the package and runs the tests using cmake.

@fritzm fritzm marked this pull request as ready for review October 28, 2021 17:24
@TallJimbo (Member) commented Oct 28, 2021

(repost from the utils PR, where I accidentally posted this first)

If this CMake build includes building the pybind11 modules, I think I'd be in favor of publishing sphgeom on conda-forge via that, and using conda to get it in (e.g.) daf_butler CI, and then retiring the pip stuff. I think that is always going to remain a hack due to the difficulty (impossibility?) of properly building C++ libraries with setuptools.

Now I'm probably getting way ahead of myself, but having the main stack use a separately-semver'd sphgeom from conda-forge and dropping the scons/eupspkg build could then seem pretty attractive, in this particular case, at least.

@rra (Member) commented Oct 28, 2021

The image cutout service (and any other IVOA services SQuaRE writes) uses pip for installation, which means it wants to express a dependency on daf-butler via pip. I'm not sure how that would work if daf-butler were only available via conda. My guess is "not well." I'm also quite dubious about switching them to conda, although I guess I haven't checked to see how much of what I rely on is missing from conda.

That said, I strongly suspect that the IVOA frontends are going to care about a very small subset of Butler functionality and, in a client/server Butler world, only the client. So maybe the solution is to split off the small bit of Butler that they care about and which presumably wouldn't need a dependency on sphgeom and upload only that bit to PyPI.

src/CMakeLists.txt (outdated review thread, resolved)
@@ -70,10 +70,10 @@ class NormalizedAngle {
NormalizedAngle const & b);

/// This constructor creates a NormalizedAngle with a value of zero.
NormalizedAngle() {}
A contributor commented:

We did a clang tidy run on afw. Eventually this could globally be run on all C++ code.

A contributor commented:

What compiler/version warns about default ctors? Not sure I recall seeing these with the scons build, but I would need to check.

@fritzm (Contributor, Author) commented:

@mwittgen I encountered the warning under gcc 8.4.1 in the centos:8 Qserv build containers. The actual warning was that the compiler would/could not implicitly generate operator= since a non-default ctor had been declared. Moving to the default ctor (which these effectively were anyway, and which should be less error prone in the long term?) seemed preferable to adding explicit "=default" versions of operator= just to squelch the warning.
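
For context, a minimal standalone sketch of the kind of change being discussed (not the actual sphgeom diff): the user-provided empty-body default constructor is replaced with an explicitly defaulted one.

// Sketch only; class and member names are placeholders, not sphgeom code.

// Before: user-provided default constructor with an empty body.
class AngleBefore {
public:
    AngleBefore() {}
private:
    double _value = 0.0;
};

// After: explicitly defaulted constructor. Behaviour is the same here,
// but the constructor is compiler-generated rather than user-provided.
class AngleAfter {
public:
    AngleAfter() = default;
private:
    double _value = 0.0;
};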

@ktlim (Contributor) commented Oct 28, 2021

I'm not averse to moving first-party packages into rubin-env, but conda-only/non-pip packaging seems hostile. Seems like having cmake build wheels and publish to pypi is what this scikit-build thing is about?

@TallJimbo (Member) commented:

> Seems like having cmake build wheels and publish to pypi is what this scikit-build thing is about?

That seems like a step in the right direction, in that we could have a real build system (CMake) in sphgeom without duplicating (as much) stuff into setuptools configuration. But does that actually give us the possibility of distributing a standalone C++ library (and headers?) via pip? Or would we still be stuffing the standalone C++ symbols into the Python module? If so, that still feels like a recipe for confusion (but maybe something we could mitigate with more verbose package names).

> That said, I strongly suspect that the IVOA frontends are going to care about a very small subset of Butler functionality and, in a client/server Butler world, only the client. So maybe the solution is to split off the small bit of Butler that they care about and which presumably wouldn't need a dependency on sphgeom and upload only that bit to PyPI.

The sphgeom usage is pretty central; it provides the data structures for all of our spatial geometry objects and the predicates we use to compare them. If we can't distribute a sphgeom that C++ code can build against via pip, and you can't use conda, then I think I agree that we're stuck, and maybe the best we can do is the renaming that I alluded to earlier. But I'm a bit worried that even if the current pip version of sphgeom works for you now, taking this approach to dependencies is basically an assertion that any C/C++ code in any dependencies of these services must not depend on C/C++ code in any other dependencies. That's common practice in the broader Python world, and maybe it would have been a good idea for us if we could start over, but it's not at all how we've done things. So if it's our C++ code in dependencies, it's really an assertion that you will only ever need sphgeom to have C++, and that's a somewhat scary statement to make from a performance perspective.
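
(For reference to the scikit-build discussion above, a minimal sketch of what a scikit-build based setup can look like, with setuptools delegating the compiled build to CMake; the package name, layout, and options below are placeholders for illustration, not a concrete proposal for sphgeom's packaging.)

# setup.py -- generic scikit-build sketch; metadata and paths are placeholders.
from skbuild import setup  # scikit-build's drop-in replacement for setuptools.setup

setup(
    name="example-sphpkg",
    version="0.1.0",
    packages=["example_sphpkg"],
    package_dir={"": "python"},
    # Extra arguments forwarded to the CMake configure step.
    cmake_args=["-DCMAKE_BUILD_TYPE=Release"],
)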

@rra (Member) commented Oct 28, 2021

I'm sure there's a lot missing from my mental model here. Could you fill me in on why the client portion of a client/server Butler would need spatial geometry objects? Naively, I would assume that it's making REST calls to a Butler server, so all of the objects have to be serialized as floats anyway and the comparisons would be done on the server, and hence I'm not sure why the client benefits from dedicated objects implemented in C++.

@timj (Member) commented Oct 28, 2021

A DatasetRef can have a fully expanded DataCoordinate which will eventually point to a sphgeom region. Therefore the client needs to have access to sphgeom to know how to unpack that from the JSON.

@TallJimbo (Member) commented:

> I'm sure there's a lot missing from my mental model here. Could you fill me in on why the client portion of a client/server Butler would need spatial geometry objects? Naively, I would assume that it's making REST calls to a Butler server, so all of the objects have to be serialized as floats anyway and the comparisons would be done on the server, and hence I'm not sure why the client benefits from dedicated objects implemented in C++.

I would have thought it was the Butler server that didn't need much C++ (potentially just sphgeom) while the client effectively needs the full Science Pipelines stack to be useful, because the things one fetches with Butler.get are usually objects written in C++.

In the more limited context of the butler client being used inside an IVOA service implementation that's really just getting butler metadata, not actually fetching datasets, you may well be right that you don't need any of the non-sphgeom C++, and the sphgeom dependency is just a refactoring problem. The data structures you would use often have a sphgeom.Region attribute, but one you would never access; we would just need to replace them with a placeholder of some kind.

I was imagining image-cutout as the kind of service we were talking about - that does need a lot of C++, including sphgeom - but maybe it's atypical.

Happy to discuss on a live call or slack if I'm still not actually answering the question you asked.

@rra (Member) commented Oct 28, 2021

Ah, thanks, I see.

The image cutout frontend (the part installed with pip) probably does not care about any of these concepts, since it receives parameters as floats in ASCII and passes those parameters to its backend as JSON objects. The only thing it would need is enough machinery to do basic input validation, which probably doesn't require anything more than astropy. Only the cutout backend that performs the cutout requires the C++ data structures, and it's separate software running on a stack container since it's using a pipeline to do the work. However, the frontend does need to look up datarefs via Butler and get their URLs.

So far as I know, the pip-compatible way of doing what the project wants to do with C++ libraries is to build and ship the C++ library independent of the Python library that binds to it and require that any user of the Python library have already obtained and installed the C++ library via some other mechanism. Then only the glue required to make a Python dynamic module linked against the C++ library is included in the library uploaded to PyPI. I don't believe PyPI has a mechanism to provide C++ libraries as such because it's only a Python module distribution system, not a C++ library distribution system; it can distribute a Python dynamic module with C++ code built into it, but not a library that can be independently linked against. (Typically one would package the C++ library as an RPM or deb or both, since those packaging systems are designed to work with shared libraries.)

I realize this isn't at all how the project is currently architected, and it would pose a bunch of additional challenges.

@timj changed the title from "Add CMake support" to "DM-23308: Add CMake support" on Oct 29, 2021
@fritzm (Contributor, Author) commented Nov 2, 2021

@timj GHA added to do the cmake build and run the C++ tests -- please have a look?


- name: CMake test
working-directory: ${{github.workspace}}/build
run: ctest --parallel `nproc`
A member commented:

This test doesn't seem to do anything to test the python code. You are going to have to install pytest as I do in the other action and then run the tests to make sure it really does work from python.

@fritzm (Contributor, Author) commented:

@timj Added python tests to the cmake GHA; please have another look?

@timj (Member) left a review comment:

Thanks for adding the python tests -- this makes it less likely someone will break cmake. If you also install pytest-xdist you can add -n 2 to the pytest invocation to make it run in parallel and be a little faster.
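
(For reference, a sketch of what the suggested workflow steps could look like; the step names, working directory, and test path are assumptions rather than the actual workflow file.)

# Hypothetical GitHub Actions steps; names and paths are placeholders.
- name: Install test dependencies
  run: pip install pytest pytest-xdist

- name: Python tests
  working-directory: ${{github.workspace}}
  run: pytest -n 2 tests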

@fritzm (Contributor, Author) commented Nov 2, 2021

Thanks @timj -- at less than 3s for the Python tests compared to the C++ build/link times, I'm inclined not to worry about it at this point. :-)

Let me know when the version tool lands, and I'll happily help out with the retrofitting of that into the cmake here and over in log.

@fritzm merged commit 1e25446 into master on Nov 3, 2021
@fritzm deleted the tickets/DM-23308 branch on November 3, 2021 at 01:39