
ARROW-1424: [Python] Add CUDA support to pyarrow #2536

Closed · wants to merge 39 commits

Conversation

@pearu (Contributor) commented Sep 10, 2018

This PR adds CUDA support to pyarrow. To use it, the Arrow C++ library must be built with the following cmake options:

-DARROW_PYTHON=on -DARROW_GPU=ON -DARROW_IPC=ON

To enable CUDA support in pyarrow, it must be built with the --with-cuda option, e.g.

python setup.py build_ext --with-cuda <other options>

or the environment must define

PYARROW_WITH_CUDA=1

This CUDA support implementation is fairly complete: all Arrow C++ GPU-related functions and classes are exposed to Python, the new methods and functions are documented, and test coverage is close to 100%.
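As a taste of the API, here is a minimal round-trip sketch. The names follow pyarrow.cuda as eventually merged; the exact naming was still under discussion in this PR (see item 1 below), so treat the calls as illustrative assumptions:

import numpy as np
from pyarrow import cuda

ctx = cuda.Context(0)               # CUDA context on GPU device 0
arr = np.arange(8, dtype=np.int64)
cbuf = ctx.buffer_from_data(arr)    # copy host data into a device buffer
back = np.frombuffer(cbuf.copy_to_host(), dtype=np.int64)
assert (back == arr).all()          # round trip through device memory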

However, there are some issues that need to be tackled and questions to be answered (hence the WIP label):

  1. Is the naming convention of the new methods appropriate? Are there any changes needed in the new API?
  2. The IPC test (see test_IPC in python/pyarrow/tests/test_gpu.py) fails when calling open_ipc_buffer: cuIpcOpenMemHandle fails with code 201. Currently, I don't know why; any hint or help on this is much appreciated. [FIXED: using multiprocessing.Process in spawn mode; see the sketch after this list]
  3. Anything else?
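For item 2, a minimal sketch of the spawn-mode pattern that fixed the test: a forked child inherits the parent's initialized CUDA state, while a spawned child starts a fresh interpreter and creates its own context. The pyarrow.cuda calls mirror the merged API and are assumptions for this WIP:

import multiprocessing

def child(handle_bytes):
    from pyarrow import cuda
    ctx = cuda.Context(0)                                # fresh context in the child
    handle = cuda.IpcMemHandle.from_buffer(handle_bytes)
    cbuf = ctx.open_ipc_buffer(handle)                   # cuIpcOpenMemHandle under the hood
    assert cbuf.size > 0

if __name__ == '__main__':
    from pyarrow import cuda
    ctx = cuda.Context(0)
    cbuf = ctx.new_buffer(1024)                          # device buffer to share
    handle_bytes = bytes(cbuf.export_for_ipc().serialize())
    mp = multiprocessing.get_context('spawn')            # spawn, not fork
    p = mp.Process(target=child, args=(handle_bytes,))
    p.start()
    p.join()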

@wesm (Member) commented Sep 10, 2018

Looks like you need to rebase; let me know if you need help.

@wesm (Member) commented Sep 10, 2018

I went ahead and fixed the branch (an ugly merge that crossed the parquet-cpp repo merge). Will review when I can.

@pearu (Contributor, author) commented Sep 10, 2018

Thanks, Wes!

@pitrou (Member) commented Sep 10, 2018

I get a core dump when running the test suite:

$ python -m pytest -v --tb=native pyarrow/tests/test_gpu.py 
=============================================================== test session starts ===============================================================
platform linux -- Python 3.7.0, pytest-3.7.2, py-1.5.4, pluggy-0.7.1 -- /home/antoine/miniconda3/envs/pyarrow/bin/python
cachedir: .pytest_cache
rootdir: /home/antoine/arrow/python, inifile: setup.cfg
plugins: timeout-1.3.1, faulthandler-1.5.0
collected 15 items                                                                                                                                

pyarrow/tests/test_gpu.py::test_manager_num_devices PASSED                                                                                  [  6%]
pyarrow/tests/test_gpu.py::test_manage_allocate_free_host Fatal Python error: Aborted

Current thread 0x00007f87077fd740 (most recent call first):
  File "/home/antoine/arrow/python/pyarrow/tests/test_gpu.py", line 81 in test_manage_allocate_free_host
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/python.py", line 196 in pytest_pyfunc_call
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/callers.py", line 180 in _multicall
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 61 in <lambda>
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 67 in _hookexec
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/hooks.py", line 258 in __call__
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/python.py", line 1430 in runtest
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/runner.py", line 111 in pytest_runtest_call
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/callers.py", line 180 in _multicall
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 61 in <lambda>
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 67 in _hookexec
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/hooks.py", line 258 in __call__
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/runner.py", line 183 in <lambda>
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/runner.py", line 201 in __init__
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/runner.py", line 185 in call_runtest_hook
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/runner.py", line 161 in call_and_report
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/runner.py", line 81 in runtestprotocol
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/runner.py", line 66 in pytest_runtest_protocol
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/callers.py", line 180 in _multicall
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 61 in <lambda>
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 67 in _hookexec
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/hooks.py", line 258 in __call__
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/main.py", line 236 in pytest_runtestloop
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/callers.py", line 180 in _multicall
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 61 in <lambda>
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 67 in _hookexec
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/hooks.py", line 258 in __call__
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/main.py", line 215 in _main
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/main.py", line 178 in wrap_session
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/main.py", line 208 in pytest_cmdline_main
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/callers.py", line 180 in _multicall
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 61 in <lambda>
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/manager.py", line 67 in _hookexec
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pluggy/hooks.py", line 258 in __call__
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/_pytest/config/__init__.py", line 65 in main
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/pytest.py", line 68 in <module>
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/runpy.py", line 85 in _run_code
  File "/home/antoine/miniconda3/envs/pyarrow/lib/python3.7/runpy.py", line 193 in _run_module_as_main
Aborted (core dumped)

Here is the truncated gdb backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff7805801 in __GI_abort () at abort.c:79
#2  0x00007fffde548428 in arrow::internal::CerrLog::~CerrLog (this=0x7fffffff78a0, __in_chrg=<optimized out>) at ../src/arrow/util/logging.h:130
#3  0x00007fffdb122243 in arrow::gpu::CudaHostBuffer::~CudaHostBuffer (this=0x555556a48490, __in_chrg=<optimized out>)
    at ../src/arrow/gpu/cuda_memory.cc:153
#4  0x00007fffdb120dff in __gnu_cxx::new_allocator<arrow::gpu::CudaHostBuffer>::destroy<arrow::gpu::CudaHostBuffer> (this=0x555556a48490, 
    __p=0x555556a48490) at /usr/include/c++/7/ext/new_allocator.h:140
#5  0x00007fffdb120d97 in std::allocator_traits<std::allocator<arrow::gpu::CudaHostBuffer> >::destroy<arrow::gpu::CudaHostBuffer> (__a=..., 
    __p=0x555556a48490) at /usr/include/c++/7/bits/alloc_traits.h:487
#6  0x00007fffdb120b71 in std::_Sp_counted_ptr_inplace<arrow::gpu::CudaHostBuffer, std::allocator<arrow::gpu::CudaHostBuffer>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x555556a48480) at /usr/include/c++/7/bits/shared_ptr_base.h:535
#7  0x00007fffdef09ba8 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x555556a48480)
    at /usr/include/c++/7/bits/shared_ptr_base.h:154
#8  0x00007fffdef053bf in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffd9898be0, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr_base.h:684
#9  0x00007fffdef021da in std::__shared_ptr<arrow::Buffer, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffd9898bd8, 
    __in_chrg=<optimized out>) at /usr/include/c++/7/bits/shared_ptr_base.h:1123
#10 0x00007fffdef021f6 in std::shared_ptr<arrow::Buffer>::~shared_ptr (this=0x7fffd9898bd8, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/shared_ptr.h:93
#11 0x00007fffdef09564 in __Pyx_call_destructor<std::shared_ptr<arrow::Buffer> > (x=...)
    at /home/antoine/arrow/python/build/temp.linux-x86_64-3.7/lib.cpp:281
#12 0x00007fffdee9ed92 in __pyx_tp_dealloc_7pyarrow_3lib_Buffer (o=0x7fffd9898bc0)
    at /home/antoine/arrow/python/build/temp.linux-x86_64-3.7/lib.cpp:120931
#13 0x00007fffdb171578 in __pyx_tp_dealloc_7pyarrow_7lib_gpu_CudaHostBuffer (o=0x7fffd9898bc0)
    at /home/antoine/arrow/python/build/temp.linux-x86_64-3.7/lib_gpu.cpp:12759
#14 0x00007ffff2089509 in array_dealloc ()
   from /home/antoine/miniconda3/envs/pyarrow/lib/python3.7/site-packages/numpy/core/multiarray.cpython-37m-x86_64-linux-gnu.so

@pitrou (Member) left a comment

Just a preliminary review. Haven't gone through the actual code yet. Is this PR still WIP?

Review threads (since resolved) on: cpp/cmake_modules/FindArrow.cmake, python/pyarrow/__init__.py, python/pyarrow/lib_gpu.pxd, python/setup.py (×2).
@pearu (Contributor, author) commented Sep 10, 2018

This PR is still WIP, primarily due to the test_IPC issue; see the description above.

@wesm (Member) left a comment

Thanks @pearu for working through this! The code looks clean and reasonable.

I left a number of stylistic comments and questions. I think we should probably also change our terminology from "GPU" to "CUDA". We might want to change this in the C++ library too, in case we have demand for supporting OpenCL at some point.

Review threads (since resolved) on: cpp/cmake_modules/FindArrow.cmake, python/CMakeLists.txt (×3). On this snippet:
if (ARROW_GPU_FOUND)
  ADD_THIRDPARTY_LIB(arrow_gpu
                     SHARED_LIB ${ARROW_GPU_SHARED_LIB})
endif()
wesm: see comments above

More review threads (since resolved) on: python/pyarrow/tests/test_gpu.py (×4), python/setup.py.
@pearu (Contributor, author) commented Sep 10, 2018

@pitrou I cannot reproduce the core dump here on Ubuntu 16.04 with Python 3.6.6 or 3.7.0.

However, I suspect the core dump could be related to the use of manager.free_host(buf) in test_manage_allocate_free_host. Could you try commenting out the calls to the free_host method and rerunning the tests?
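To illustrate the suspicion with a toy Python model (not pyarrow code): if the explicit free releases the pinned memory and the buffer's destructor later calls cuMemFreeHost on the same pointer, the second call fails and aborts.

class HostBuffer:
    # Toy stand-in for CudaHostBuffer: frees its pointer on destruction.
    def __init__(self, manager, ptr):
        self.manager, self.ptr = manager, ptr
    def __del__(self):
        self.manager.free_host(self)

class Manager:
    # Toy stand-in for the WIP manager; tracks which pointers are live.
    def __init__(self):
        self.live, self.counter = set(), 0
    def allocate_host(self, nbytes):
        self.counter += 1
        buf = HostBuffer(self, self.counter)   # fake pointer
        self.live.add(buf.ptr)
        return buf
    def free_host(self, buf):
        if buf.ptr not in self.live:
            print('second cuMemFreeHost on the same pointer: aborts in real code')
            return
        self.live.remove(buf.ptr)

manager = Manager()
buf = manager.allocate_host(1024)
manager.free_host(buf)   # explicit free, as in test_manage_allocate_free_host
del buf                  # destructor frees again: the suspected double-free path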

@codecov-io commented
Codecov Report

Merging #2536 into master will increase coverage by 0.55%.
The diff coverage is 9.47%.


@@            Coverage Diff             @@
##           master    #2536      +/-   ##
==========================================
+ Coverage   87.67%   88.23%   +0.55%     
==========================================
  Files         372      311      -61     
  Lines       57914    54578    -3336     
==========================================
- Hits        50777    48155    -2622     
+ Misses       7063     6423     -640     
+ Partials       74        0      -74
Impacted Files Coverage Δ
python/pyarrow/ipc.pxi 70.64% <ø> (ø) ⬆️
python/pyarrow/lib.pxd 0% <ø> (ø) ⬆️
python/pyarrow/tests/test_gpu.py 7.58% <7.58%> (ø)
python/pyarrow/__init__.py 70.27% <72.72%> (-0.05%) ⬇️
rust/src/record_batch.rs
go/arrow/datatype_nested.go
rust/src/util/bit_util.rs
go/arrow/math/uint64_amd64.go
go/arrow/internal/testing/tools/bool.go
go/arrow/internal/bitutil/bitutil.go
... and 57 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a42d4bf...a544787.

@pitrou (Member) commented Sep 10, 2018

If I comment out the free_host calls, the tests instead crash in test_manage_allocate_autofree_host. With gdb I managed to extract the failure status message: "IOError: Cuda Driver API call in ../src/arrow/gpu/cuda_context.cc at line 148 failed with code 1: cuMemFreeHost(data)".

@pitrou (Member) left a comment

I took the time to do a more thorough review. Please see comments below.

Review threads (since resolved) on: cpp/cmake_modules/FindArrow.cmake, python/pyarrow/_cuda.pxd. On this snippet from python/setup.py (outdated):
if self.with_cuda:
    cmake_options.append('-DPYARROW_BUILD_CUDA=on')
else:
    cmake_options.append('-DPYARROW_BUILD_CUDA=off')
pitrou:

Not sure this is desired. IMHO we should probably let the extension build by default, where possible.

pearu (author) replied:

Removed -DPYARROW_BUILD_CUDA=off as redundant. Detecting Arrow CUDA is implemented in FindArrowCuda.cmake; moving the detection algorithm to setup.py would be involved and against the current logic in the cmake/setup files, IMHO.

An option is to make --with-cuda the default, that is, replace it with --without-cuda.

pearu (author) added:

I guess it would be possible to analyze the output of cmake to detect whether the libarrow_gpu library is available and then define the with_cuda flag accordingly. It sounds hackish, though.

pitrou replied:

Why not try to build and fail silently, as for Parquet and Orc?

pearu (author) replied:

setup.py-wise, _cuda is configured and built exactly like _parquet or _orc, so I don't understand your analogy, especially because parquet and orc are disabled by default.

I suggest the following behavior (a sketch of the flag resolution follows the two cases):
Case 1: libarrow_gpu can be detected by cmake, then

  • python setup.py build_ext --with-cuda will build _cuda extension
  • PYARROW_BUILD_CUDA=1 python setup.py build_ext will build _cuda extension
  • python setup.py build_ext will succeed, no _cuda extension is built
  • PYARROW_BUILD_CUDA=0 python setup.py build_ext will succeed, no _cuda extension is built

Case 2: libarrow_gpu is not detected by cmake, then

  • python setup.py build_ext --with-cuda will fail
  • PYARROW_BUILD_CUDA=1 python setup.py build_ext will fail
  • python setup.py build_ext will succeed, no _cuda extension is built
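A minimal sketch of how setup.py could resolve the flag to get this behavior; resolve_with_cuda and the accepted truthy strings are illustrative assumptions, not the PR's actual code:

import os

def resolve_with_cuda(cli_flag_given):
    # PYARROW_BUILD_CUDA in the environment overrides the CLI flag;
    # with neither given, the _cuda extension is not built.
    env = os.environ.get('PYARROW_BUILD_CUDA')
    if env is not None:
        return env.strip().lower() in ('1', 'true', 'on', 'yes')
    return cli_flag_given

# If this returns True but cmake cannot detect libarrow_gpu, the
# build fails (Case 2); otherwise the build proceeds without _cuda.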

More review threads on: python/setup.py, python/pyarrow/tests/test_cuda.py (×2), python/pyarrow/_cuda.pyx (×4), python/CMakeLists.txt (all since resolved).
@xhochy (Member) commented Sep 12, 2018

What about the licensing of CUDA? I guess we cannot include it in our official Arrow wheels?

@pitrou (Member) commented Sep 12, 2018

The CUDA toolkit is huge; I don't think you want to include it in any case.

@scopatz commented Sep 12, 2018

@xhochy - According to Attachment A of the cudatoolkit EULA, there are actually a fairly large number of files you are allowed to redistribute. These are mostly shared objects that let you run (but not compile) CUDA code. Of course, you'd still need the NVIDIA drivers installed...

@xhochy (Member) commented Sep 12, 2018

> The CUDA toolkit is huge; I don't think you want to include it in any case.

I may have missed the important point: could we build the CUDA support in such a fashion that we link against CUDA but don't depend on things that are not allowed by ASF policy? I'm not really informed about the ABI stability of CUDA, but it seems that Torch, for example, ships wheels with CUDA support. Looking at them, it rather seems like you need to build for every major version, which is quite hard to package.

@scopatz commented Sep 12, 2018

Yeah, I don't know the details of ASF policy here.

Another review thread (since resolved) on python/pyarrow/_cuda.pyx.
@pearu changed the title from "ARROW-1424: Add CUDA support to pyarrow (WIP)" to "ARROW-1424: Add CUDA support to pyarrow (REVIEW)" on Sep 13, 2018
@pearu (Contributor, author) commented Sep 13, 2018

@pitrou, @wesm, @scopatz: I believe I have addressed all your concerns regarding this PR, and it is ready for review again. Thank you all for your detailed comments so far; pyarrow.cuda has certainly improved thanks to you.

@pitrou (Member) commented Sep 13, 2018

@pearu Thanks a lot for doing this! I will review again in a few days, unless someone beats me to it.

@wesm changed the title from "ARROW-1424: Add CUDA support to pyarrow (REVIEW)" to "ARROW-1424: [Python] Add CUDA support to pyarrow" on Sep 13, 2018
@wesm (Member) left a comment

+1. I skimmed this but it looks great; thanks for iterating so much on this. Left a couple small comments.

My laptop has an NVIDIA GPU, so I'll test this out locally before merging.

On this snippet:

        return self.host_buffer.get().size()


cdef class BufferReader(NativeFile):
wesm:
It could be clearer to call this CudaBufferReader, but this is fine for now.

On this snippet from python/pyarrow/tests/test_cuda.py:

import numpy as np
import sysconfig

cuda = pytest.importorskip("pyarrow.cuda")
wesm:
This should not fail silently (skipping all tests) like it will now. I suggest we improve this aspect when we get CI set up for this code (somehow).
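One possible stricter guard, as a sketch (the error-message check is a heuristic assumption): skip the module only when pyarrow was built without CUDA, and let any other import failure propagate.

import pytest

try:
    from pyarrow import cuda  # noqa: F401
except ImportError as exc:
    # Skip only the "built without CUDA" case; re-raise real breakage.
    if 'cuda' in str(exc).lower():
        pytest.skip('pyarrow built without CUDA', allow_module_level=True)
    raise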

@wesm closed this in 99190d0 on Sep 13, 2018.