-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update python build system #701
Conversation
I think the Windows failure is due to the loss of the flag TileDB-Vector-Search and tiledb-vector-search-feedstock are different. The recipe does not have a separate output for the C++ library. Since the C++ library is always vendored with the Python package, there is no need to have a command line flag to help the Python package find it. TileDB-VCF is more similar to TileDB-SOMA, and it is still using I don't know if we have an example Python package that uses scikit-build-core but also maintains the feature of finding an existing shared library. @dudoslav and @ihnorton will have to advise on that aspect. |
Also note that the macOS Python build works without TileDB-VCF/.github/workflows/macos.yml Lines 75 to 79 in 23565d2
Also, as an aside, I see that this macOS job suffers from the same SIP problems I just fixed yesterday in single-cell-data/TileDB-SOMA#2435. That is why |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉. One high-level comment: we should probably make the libtiledbvcf
cmake build do a configured install that can be found with find_package
instead of the individual find_library
calls for libtiledbvcf. Probably not necessary to get this PR working, but if we don't do it here then @dudoslav could take a look as a backlog item.
apis/python/CMakeLists.txt
Outdated
find_library( | ||
TILEDBVCF_LIB | ||
NAMES tiledbvcf | ||
PATHS "${CMAKE_CURRENT_SOURCE_DIR}/../../dist/lib" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is the direct cause of the error, but this will almost certainly be /bin
on windows. I believe if you remove /lib
then find_library
will search both on Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, hmm maybe not IIUC this override
TileDB-VCF/libtiledbvcf/src/CMakeLists.txt
Line 335 in 23565d2
LIBRARY DESTINATION lib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made windows happy with proper path manipulation, which could move to an improved libtiledbvcf
cmake structure.
PATHS "${CMAKE_CURRENT_SOURCE_DIR}/../../dist/lib" | ||
) | ||
endif() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably check TILEDBVCF_LIB-FOUND
here to make sure one of the statements succeeded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added code to check for success. Moving to find_package
would definitely simplify this CMakeLists.txt
file.
Thanks @jdblischak, this was the problem. We can now pass in the location of libtiledbvcf with the
I agree @ihnorton. This PR works for now and @dudoslav could improve it in the future to be more idiomatic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor nit-picks and questions. Before merging please:
- Create
sdist
andwheel
packages - For each of these, install them and print out their version (something like
print(tilevcf.version)
if it exists) - Optional: Also test on Windows, because I remember issues with it in TileDB-Vector-Search
"Programming Language :: Python :: 3.10", | ||
"Programming Language :: Python :: 3.11", | ||
] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that this pyproject.toml
is missing dependencies, however, even the setup.py
does not have any requires (except the build_requires). Doesn't this project really depend on any other packages? (tiledb, tiledb-cloud, numpy, etc?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been a long-standing frustration of mine since it makes it difficult to know what needs to be installed (eg for nightly builds or conda recipes), eg most recently in #669.
I think this overhaul of the Python build system is a good time to explicitly list the required and optional dependencies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Since VCF has been a conda only install so far, we've push dependency management to the conda environments. We will make the dependencies clear in pyproject.toml
.
apis/python/pyproject.toml
Outdated
build-dir = "build/{wheel_tag}" | ||
metadata.version.provider = "scikit_build_core.metadata.setuptools_scm" | ||
sdist.include = ["src/tiledbvcf/version.py"] | ||
sdist.cmake = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this is needed. This just runs CMake during sdist
packaging process. In Vector-Search we use this to generate version files/query git status, since .git
is not present in sdist
package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we can remove this.
cmake_minimum_required(VERSION 3.21) | ||
|
||
project( | ||
${SKBUILD_PROJECT_NAME} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer not to use SKBUILD_PROJECT_NAME
because it implies that we are going to run this CMakeLists.txt
only in scikit-build-core
environment. Same goes for the version. But this is just a suggestion, if you think that no-one will ever call this cmake file on its own then keep it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect this CMakeLists.txt
will only be called in the scikit-build-core
environment. Let's keep it as.
message(STATUS "Searching for libtiledbvcf in ${LIBTILEDBVCF_SEARCH_PATHS}") | ||
|
||
# TODO: Use find_package() instead of find_library() to find libtiledbvcf. | ||
find_library( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we use find_package? Or if this is not a different package but build by this project then add_subdirectory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was @ihnorton's comment too. The libtiledbvcf
cmake files will need some work to support a different approach. I'll add that task as a backlog item instead of blocking this PR.
apis/python/CMakeLists.txt
Outdated
|
||
target_include_directories( | ||
${VCF_TARGET_NAME} PRIVATE | ||
${pybind11_INCLUDE_DIR} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed. PYBIND headers are included automatically by pybind11_add_module.
@@ -0,0 +1,19 @@ | |||
#!/usr/bin/env bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use cibuildwheel
instead of this script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. These are VCF's first steps towards building wheels and suggestions for standard processes are welcome.
apis/python/requirements-dev.txt
Outdated
@@ -0,0 +1,7 @@ | |||
black |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between the dev requirements and requirements for the package? (This is genuine question I don't know :D )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my comment above (https://github.com/TileDB-Inc/TileDB-VCF/pull/701/files#r1565992899), I agree that this file requirements-dev.txt
is confusing because it includes more than just development-only dependencies. Here is how I would classify them based on my current understanding:
- dev (only needed for local development and CI): black
- runtime (required to
import tiledbvcf
because they are hit by__init__.py
): pandas, pyarrow, pyarrow-hotfix - optional (conditionally used if installed): dask
- test (required to run the test suite): dask, pytest, tiledb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The requirements-dev.txt
was only there for building a local dev environment. As mentioned above, we'll make the dependencies clear in a more standard way.
Great! Glad this was able to fix the Windows build
@gspowley how strong is your preference for
|
I tested this PR in my fork of tiledb-vcf-feedstock. Note that the failing win-64 build is a known issue due to linking to libfmt (discussed in TileDB-Inc/tiledb-vcf-feedstock#116). The error messages are the same as in the failing CI for those most recently merged commit, TileDB-Inc/tiledb-vcf-feedstock@761620c |
@jdblischak not very strong. I was pattern matching the environment variables for finding |
I tested the most recent changes in the feedstock. The linux and osx builds are now failing with the following configuration error that has something to do with aws-c-auth. Note that I replaced the long environment path with -- Found PythonInterp: $CONDA_PREFIX/bin/python (found suitable version "3.10.14", minimum required is "3.6")
-- Found PythonLibs: $CONDA_PREFIX/lib/libpython3.10.so
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Found pybind11: $CONDA_PREFIX/lib/python3.10/site-packages/pybind11/include (found version "2.12.0")
-- Searching for libtiledbvcf in /home/conda/feedstock_root/build_artifacts/tiledb-vcf_1713276872393/work/apis/python/../../dist/lib;
-- Found libtiledbvcf: $CONDA_PREFIX/lib/libtiledbvcf.so
CMake Error at CMakeLists.txt:122 (install):
install FILES given directory
"$CONDA_PREFIX/lib/aws-c-auth"
to install.
-- Configuring incomplete, errors occurred!
*** CMake configuration failed Also, today I noticed some CMake warnings. These were there previously, so they aren't fatal, but they might be worth addressing:
|
Thanks for the heads up. I'll follow up with a PR to fix this later today. |
The goal of this PR is to update the python build system to the more modern
scikit-build-core
, with inspiration from TileDB-Inc/TileDB-Vector-Search.Looking for feedback from build experts, especially related to the current Windows failure and releasing to pypi in the future.