Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Python build to scikit-build #4818

Merged
merged 44 commits into from
Aug 2, 2022

Conversation

dantegd
Copy link
Member

@dantegd dantegd commented Jul 18, 2022

PR does the required changes for Scikit-build using RAPIDS-CMake.

  • Update .gitignore
  • Create python/cuml/CMakeLists.txt file
  • Add CMakeLists.txt using RAPIDS-CMake to Python folders
  • Update setup.py
  • Update build.sh
  • Update CI files
  • Update conda env files
  • Clean code

@github-actions github-actions bot added CMake conda conda issue CUDA/C++ Cython / Python Cython or Python issue labels Jul 18, 2022
@dantegd dantegd added feature request New feature or request breaking Breaking change and removed conda conda issue Cython / Python Cython or Python issue CMake CUDA/C++ labels Jul 18, 2022
@github-actions github-actions bot added CMake conda conda issue CUDA/C++ Cython / Python Cython or Python issue labels Jul 20, 2022
@dantegd dantegd marked this pull request as ready for review July 28, 2022 15:57
@dantegd dantegd requested review from a team as code owners July 28, 2022 15:57
v22.08 Release automation moved this from PR-WIP to PR-Reviewer approved Jul 28, 2022
conda/recipes/cuml/build.sh Show resolved Hide resolved
@@ -7,4 +7,4 @@ if [ -n "$MACOSX_DEPLOYMENT_TARGET" ]; then
export MACOSX_DEPLOYMENT_TARGET=10.11
fi

./build.sh -n clean libcuml prims -v --allgpuarch
./build.sh -n libcuml prims -v --allgpuarch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a debug change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the clean target of build.sh ends up calling setup.py clean, which imports skbuild which is not part of the libcuml requirements. This seemed like the simplest change to fix that since clean doesn't really seem to be necessary for buulding the conda packages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does skbuild need to be brought into libcuml? Shouldn't it only be needed by cuml?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is why we needed to remove clean here, so that skbuild is not needed by libcuml.

The alternative would be to have separate C++ clean and Python clean targets, because right now clean calls both C++ and setup.py clean, the latter of which does import skbuild...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this hasn't been an issue with other RAPIDS libs because their clean target in build.sh manually runs rm etc rather than invoking python setup.py clean. Any attempt to use setup.py will now require skbuild. Personally I'm not sure that there is a truly clean solution for this problem other than having separate clean targets for different components, but that's a broader change to be made across the various RAPIDS libs.

@@ -58,3 +58,6 @@ done
sed_runner "s/export UCX_PY_VERSION=.*/export UCX_PY_VERSION='${NEXT_UCX_PY_VERSION}'/g" ci/checks/style.sh
sed_runner "s/export UCX_PY_VERSION=.*/export UCX_PY_VERSION='${NEXT_UCX_PY_VERSION}'/g" ci/cpu/build.sh
sed_runner "s/export UCX_PY_VERSION=.*/export UCX_PY_VERSION='${NEXT_UCX_PY_VERSION}'/g" ci/gpu/build.sh

# rapids-cmake version
sed_runner 's/'"branch-.*\/RAPIDS.cmake"'/'"branch-${NEXT_SHORT_TAG}\/RAPIDS.cmake"'/g' fetch_rapids.cmake
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should replace line 35 since that RAPIDS.cmake entry no longer exists

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 35 still exists

python/CMakeLists.txt Show resolved Hide resolved
Comment on lines 56 to 57
set(CUML_VERSION_MAJOR "22")
set(CUML_VERSION_MINOR "08")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look to be debug and not needed

Suggested change
set(CUML_VERSION_MAJOR "22")
set(CUML_VERSION_MINOR "08")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment that variables are used in get_raft.cmake while we temporarily use that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be computed from the cuml-python_VERSION_MAJOR and cuml-python_VERSION_MINOR values so that they don't go out of sync on new branch creation.

v22.08 Release automation moved this from PR-Reviewer approved to PR-Needs review Jul 28, 2022
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please double-check all the RPATHs, they all need to be repointed to search the library dir. Aside from that, just a few minor additional requests, but I think they should be addressed before merging.

Comment on lines 19 to 21
include(rapids-cmake)
include(rapids-cpm)
include(rapids-find)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any of this is necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these still need removing.

option(FIND_CUML_CPP "Search for existing CUML C++ installations before defaulting to local files" OFF)
option(SINGLEGPU "Disable all mnmg components and comms libraries" OFF)

message(VERBOSE "CUML_PY: Searching for existing CUML C++ installations before defaulting to local files: ${FIND_CUML_CPP}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW you can use the CMAKE_MESSAGE_CONTEXT variable to help set prefixes like CUML_PY for all messages within a scope.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 56 to 59
set(CUML_VERSION_MAJOR "22")
set(CUML_VERSION_MINOR "08")
set(CUML_USE_RAFT_NN ON)
set(CUML_USE_RAFT_DIST ON)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are any of these getting used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they are needed by get_raft.cmake

set(CUML_USE_RAFT_NN ON)
set(CUML_USE_RAFT_DIST ON)

# We need to call get_raft due to cuML asing for raft::nn and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# We need to call get_raft due to cuML asing for raft::nn and
# We need to call get_raft due to cuML asking for raft::nn and


# We need to call get_raft due to cuML asing for raft::nn and
# raft::distance targets
include(../cpp/cmake/thirdparty/get_raft.cmake)
Copy link
Contributor

@vyasr vyasr Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the raft targets be exported from the libcuml target? I see that they're part of the public link interface and raft::raft is part of the export sets. Should get_raft.cmake be using PKG_USE_RAFT_DIST, PKG_USE_RAFT_NN, and/or RAFT_COMPILE_LIBRARIES to decide whether or not to manually add raft::nn and/or raft::distance to the export sets as well? I'm thinking we need to use rapids_export_package manually on those targets here. CC @robertmaynard

Copy link
Member Author

@dantegd dantegd Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually thinking about it for a bit, I think we just need to add raft::nn and raft::distance (conditionally) to

GLOBAL_TARGETS raft::raft
should be enough to properly add them to libcuml's exported targets, will try it locally. So I think there's no need for an additional rapids_export_package

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the raft::nn and raft::distance are made GLOBAL when they are brought in as well. ( if not they should be )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems then there is a bug somewhere, without calling get_raft.cmake, or modifying get_raft.cmake in cuML, we run into the following issue:

CMake Error at /home/ursa/miniconda3/envs/skbuild4/lib/cmake/cuml/cuml-targets.cmake:56 (set_target_properties):
  The link interface of target "cuml::cuml++" contains:

    raft::nn

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /home/ursa/miniconda3/envs/skbuild4/lib/cmake/cuml/cuml-config.cmake:77 (include)
  CMakeLists.txt:68 (find_package)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependency tracking in rapids-CMake doesn't understand COMPONENTs and therefore cuml is find raft but not enabling any components.
I think we can re call find package raft with the right components to unblock us.

Please file an issue on rapids-CMake about this

endforeach()

foreach(cython_module IN LISTS RAPIDS_CYTHON_CREATED_TARGETS)
set_target_properties(${cython_module} PROPERTIES INSTALL_RPATH "\$ORIGIN;\$ORIGIN/cpp")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure this RPATH goes the necessary extra level higher.

# fixed in https://gitlab.kitware.com/cmake/cmake/-/merge_requests/7410 and will be available in
# CMake 3.24, so we can remove the Development component once we upgrade to CMake 3.24.
find_package(Python REQUIRED COMPONENTS Development NumPy)
set(targets_using_numpy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a quick scan of a handful of these and I didn't see any numpy logic in the pyx files. Is the dependency inherited (maybe from raft), or is this not really necessary?

Comment on lines +45 to +48
foreach(target IN LISTS targets_using_numpy)
target_include_directories(${target} PRIVATE "${Python_NumPy_INCLUDE_DIRS}")
endforeach()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
foreach(target IN LISTS targets_using_numpy)
target_include_directories(${target} PRIVATE "${Python_NumPy_INCLUDE_DIRS}")
endforeach()

python/setup.py Outdated
cmdclass["build"] = cuml_build
cmdclass["build_ext"] = cuml_build_ext
cmdclass = versioneer.get_cmdclass()
cmdclass["build_ext"] = build_ext
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant, skbuild injects its own commands automatically.

Suggested change
cmdclass["build_ext"] = build_ext

setup_requires=['cython'],
url="https://github.com/rapidsai/cudf",
setup_requires=['Cython>=0.29,<0.30'],
packages=find_packages(include=['cuml', 'cuml.*']),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all the other RAPIDS packages we include the pxd files as part of the package data in case someone wants to build Cython against the Cython in our libs. Not strictly relevant here, but I thought I'd point it out in case it's an unintentional oversight. If needed the code can be copied from one of the other RAPIDS libs.

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One very minor outstanding comment, otherwise this looks fine to me (pending Rob's comments as well).

set_target_properties(${cython_module} PROPERTIES INSTALL_RPATH "\$ORIGIN;\$ORIGIN/cpp")
endforeach()

# todo: ml_cuda_utils.h should be in the include folder of cuML or the functionality
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

option(FIND_CUML_CPP "Search for existing CUML C++ installations before defaulting to local files" OFF)
option(SINGLEGPU "Disable all mnmg components and comms libraries" OFF)

message(VERBOSE "CUML_PY: Searching for existing CUML C++ installations before defaulting to local files: ${FIND_CUML_CPP}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 19 to 21
include(rapids-cmake)
include(rapids-cpm)
include(rapids-find)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these still need removing.

ci/release/update-version.sh Show resolved Hide resolved
Comment on lines 56 to 57
set(CUML_VERSION_MAJOR "22")
set(CUML_VERSION_MINOR "08")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be computed from the cuml-python_VERSION_MAJOR and cuml-python_VERSION_MINOR values so that they don't go out of sync on new branch creation.

@codecov-commenter
Copy link

Codecov Report

Merging #4818 (0d3274a) into branch-22.08 (5324d52) will increase coverage by 0.39%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##           branch-22.08    #4818      +/-   ##
================================================
+ Coverage         77.62%   78.02%   +0.39%     
================================================
  Files               180      180              
  Lines             11384    11385       +1     
================================================
+ Hits               8837     8883      +46     
+ Misses             2547     2502      -45     
Flag Coverage Δ
dask 46.21% <ø> (+0.69%) ⬆️
non-dask 67.27% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
python/cuml/feature_extraction/_vectorizers.py 89.93% <0.00%> (+0.37%) ⬆️
python/cuml/common/import_utils.py 59.82% <0.00%> (+0.85%) ⬆️
.../dask/extended/linear_model/logistic_regression.py 92.00% <0.00%> (+57.33%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5324d52...0d3274a. Read the comment docs.

v22.08 Release automation moved this from PR-Needs review to PR-Reviewer approved Aug 2, 2022
@dantegd
Copy link
Member Author

dantegd commented Aug 2, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 33c0170 into rapidsai:branch-22.08 Aug 2, 2022
v22.08 Release automation moved this from PR-Reviewer approved to Done Aug 2, 2022
jakirkham pushed a commit to jakirkham/cuml that referenced this pull request Feb 27, 2023
PR does the required changes for Scikit-build using RAPIDS-CMake. 

- [x] Update .gitignore
- [x] Create `python/cuml/CMakeLists.txt` file
- [x] Add `CMakeLists.txt` using RAPIDS-CMake to Python folders
- [x] Update `setup.py`
- [x] Update `build.sh`
- [x] Update CI files
- [x] Update conda env files
- [x] Clean code

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Sevag H (https://github.com/sevagh)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Robert Maynard (https://github.com/robertmaynard)

URL: rapidsai#4818
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change CMake conda conda issue CUDA/C++ Cython / Python Cython or Python issue feature request New feature or request gpuCI gpuCI issue
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

7 participants