ARROW-1424: [Python] Add CUDA support to pyarrow #2536

Closed
wants to merge 39 commits into from
Changes from 30 commits
Commits
a2d7557
Minimal GPU support for pyarrow. WIP.
pearu Aug 29, 2018
49b0190
Expose CudaDeviceManager, CudaContext, CudaIpcMemHandle to pyarrow.
pearu Aug 30, 2018
cf42941
Expose Message for lib_gpu.
pearu Aug 31, 2018
ae3cd3f
Exposed all Arrow GPU C++ classes and functions to pyarrow. WIP
pearu Aug 31, 2018
11cba54
Add copy_to_host and copy_from_host methods to CudaBuffer
pearu Aug 31, 2018
9eebf19
Document copy_to_host and copy_from_host methods. Complete unittests …
pearu Aug 31, 2018
9b8cb1b
Complete copy_from/to_host implementations and tests.
pearu Sep 4, 2018
102f277
Document all methods and functions.
pearu Sep 5, 2018
5e2c1ba
Unittest for allocate/free host buffer.
pearu Sep 5, 2018
36396c6
Test memory management of CudaBuffer and CudaHostBuffer
pearu Sep 5, 2018
e2b14df
Impl CudaBuffer device_buffer and slice methods. Impl tests for CudaB…
pearu Sep 6, 2018
7dc6d88
Impl CudaBufferWriter.writeat, CudaBufferReader.read_buffer. Complete…
pearu Sep 7, 2018
c2799e6
Impl IPC tests.
pearu Sep 7, 2018
584d94b
Flake it.
pearu Sep 7, 2018
d237b34
Implement GPU support in pyarrow
pearu Sep 10, 2018
9e365bd
Merge branch 'pearu-cuda-pyarrow' of github.com:Quansight/arrow into …
pearu Sep 10, 2018
3913e29
Improve detecting availability of gpu support.
pearu Sep 10, 2018
7053df8
Improve detecting availability of gpu support (2nd try).
pearu Sep 10, 2018
a544787
Fix formatting for flake8.
pearu Sep 10, 2018
8faf1ee
Add missing import of pandas_compat.
pearu Sep 10, 2018
61659c4
Introduce pyarrow.cuda module as the CUDA UI.
pearu Sep 10, 2018
cb89ee3
Remove usage of FreeHost to avoid double-freeing issue.
pearu Sep 10, 2018
94989a7
Rename lib_gpu to _cuda, ARROW_GPU to CUDA, --with-arrow-gpu to --wit…
pearu Sep 10, 2018
1defcb4
Remove spaces around * and &
pearu Sep 10, 2018
ce1d3bb
Remove _freed attribute from CudaHostBuffer
pearu Sep 10, 2018
dff5bd4
More space removal. Fix CudaBufferWriter and cuda_read_record_batch d…
pearu Sep 10, 2018
106b071
Removed cuda prefix everywhere except in CudaBuffer.
pearu Sep 10, 2018
f3c41db
Added cdefs to CMemoryPool variables
pearu Sep 10, 2018
ff7faa2
Removed DeviceManager, use 'ctx=Context(<device_number>)'. Introduced…
pearu Sep 11, 2018
2126eba
Fixes for flake8
pearu Sep 11, 2018
88961fa
cmake: moved Arrow CUDA detection to FindArrowCuda.
pearu Sep 12, 2018
66d704e
Silence flake8 on Arrow CUDA api module.
pearu Sep 12, 2018
4ece809
Remove redundant -DPYARROW_BUILD_CUDA=on
pearu Sep 12, 2018
38ddfff
Apply feedback from PR. WIP.
pearu Sep 12, 2018
e1bcb08
Apply feedback from PR. 2.
pearu Sep 12, 2018
7555c65
Raise BufferError in CudaBuffer.__getbuffer__. Fix pytest.raises usages.
pearu Sep 13, 2018
5e171e8
Revised buffer_from_data.
pearu Sep 13, 2018
c86e165
Minor clean up.
pearu Sep 13, 2018
7a01846
Minor clean up 2.
pearu Sep 13, 2018
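
The commits above describe a `Context(<device_number>)` entry point, `buffer_from_data`, and `copy_to_host`/`copy_from_host` on `CudaBuffer`. A minimal sketch of a host-to-device-and-back round trip follows. It assumes the names and signatures found in released `pyarrow.cuda`, which may differ from the API at this point in the PR; the helper `gpu_roundtrip` is hypothetical and returns `None` when CUDA support is not available.

```python
def gpu_roundtrip(payload: bytes, device_number: int = 0):
    """Copy payload to a CUDA device and back; return None if CUDA is unavailable."""
    try:
        # Raises ImportError when pyarrow is missing or was built without CUDA.
        from pyarrow import cuda
    except ImportError:
        return None
    try:
        ctx = cuda.Context(device_number)
    except Exception:
        # Assumption: a CUDA-enabled build without a usable device fails here.
        return None
    device_buf = ctx.buffer_from_data(payload)  # host -> device copy
    host_buf = device_buf.copy_to_host()        # device -> host copy
    return host_buf.to_pybytes()


result = gpu_roundtrip(b"arrow")
print("CUDA unavailable" if result is None else result)
```

Returning `None` instead of raising keeps the sketch usable on machines without a GPU; real code would more likely let the error propagate.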
50 changes: 50 additions & 0 deletions cpp/cmake_modules/FindArrow.cmake
@@ -79,6 +79,12 @@ find_library(ARROW_PYTHON_LIB_PATH NAMES arrow_python
NO_DEFAULT_PATH)
get_filename_component(ARROW_PYTHON_LIBS ${ARROW_PYTHON_LIB_PATH} DIRECTORY)

find_library(ARROW_GPU_LIB_PATH NAMES arrow_gpu
PATHS
${ARROW_SEARCH_LIB_PATH}
NO_DEFAULT_PATH)
get_filename_component(ARROW_GPU_LIBS ${ARROW_GPU_LIB_PATH} DIRECTORY)

if (MSVC)
SET(CMAKE_FIND_LIBRARY_SUFFIXES ".lib" ".dll")

@@ -95,6 +101,13 @@ if (MSVC)
PATH_SUFFIXES "bin" )
get_filename_component(ARROW_SHARED_LIBS ${ARROW_SHARED_LIBRARIES} PATH )
get_filename_component(ARROW_PYTHON_SHARED_LIBS ${ARROW_PYTHON_SHARED_LIBRARIES} PATH )

if (PYARROW_BUILD_CUDA)
find_library(ARROW_GPU_SHARED_LIBRARIES NAMES arrow_gpu
PATHS ${ARROW_HOME} NO_DEFAULT_PATH
PATH_SUFFIXES "bin" )
get_filename_component(ARROW_GPU_SHARED_LIBS ${ARROW_GPU_SHARED_LIBRARIES} PATH )
endif()
endif ()

if (ARROW_INCLUDE_DIR AND ARROW_LIBS)
@@ -117,10 +130,41 @@ if (ARROW_INCLUDE_DIR AND ARROW_LIBS)
endif()
endif()

if (PYARROW_BUILD_CUDA AND ARROW_GPU_LIBS)
  set(ARROW_GPU_FOUND TRUE)
  set(ARROW_GPU_LIB_NAME arrow_gpu)
  if (MSVC)
    set(ARROW_GPU_STATIC_LIB ${ARROW_GPU_LIBS}/${ARROW_GPU_LIB_NAME}${ARROW_MSVC_STATIC_LIB_SUFFIX}${CMAKE_STATIC_LIBRARY_SUFFIX})
    set(ARROW_GPU_SHARED_LIB ${ARROW_GPU_SHARED_LIBS}/${ARROW_GPU_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
    set(ARROW_GPU_SHARED_IMP_LIB ${ARROW_GPU_LIBS}/${ARROW_GPU_LIB_NAME}.lib)
  else()
    set(ARROW_GPU_STATIC_LIB ${ARROW_LIBS}/lib${ARROW_GPU_LIB_NAME}.a)
    set(ARROW_GPU_SHARED_LIB ${ARROW_LIBS}/lib${ARROW_GPU_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
  endif()
endif()


if (ARROW_FOUND)
  if (NOT Arrow_FIND_QUIETLY)
    message(STATUS "Found the Arrow core library: ${ARROW_LIB_PATH}")
    message(STATUS "Found the Arrow Python library: ${ARROW_PYTHON_LIB_PATH}")
    if (PYARROW_BUILD_CUDA)
      if (ARROW_GPU_FOUND)
        message(STATUS "Found the Arrow GPU library: ${ARROW_GPU_LIB_PATH}")
      else()
        set(ARROW_ERR_MSG "Could not find the Arrow GPU library. Looked for libs")
        set(ARROW_ERR_MSG "${ARROW_ERR_MSG} in ${ARROW_SEARCH_LIB_PATH}")
        if (Arrow_FIND_REQUIRED)
          message(FATAL_ERROR "${ARROW_ERR_MSG}")
        else (Arrow_FIND_REQUIRED)
          message(STATUS "${ARROW_ERR_MSG}")
        endif()
        set(ARROW_GPU_FOUND FALSE)
      endif()
    else()
      message(STATUS "Found but not using the Arrow GPU library: ${ARROW_GPU_LIB_PATH}")
      set(ARROW_GPU_FOUND FALSE)
    endif()
  endif ()
else ()
if (NOT Arrow_FIND_QUIETLY)
@@ -134,6 +178,7 @@ else ()
endif (Arrow_FIND_REQUIRED)
endif ()
set(ARROW_FOUND FALSE)
set(ARROW_GPU_FOUND FALSE)
endif ()

if (MSVC)
@@ -145,6 +190,9 @@ if (MSVC)
ARROW_PYTHON_STATIC_LIB
ARROW_PYTHON_SHARED_LIB
ARROW_PYTHON_SHARED_IMP_LIB
ARROW_GPU_STATIC_LIB
ARROW_GPU_SHARED_LIB
ARROW_GPU_SHARED_IMP_LIB
)
else()
mark_as_advanced(
@@ -153,5 +201,7 @@ else()
ARROW_SHARED_LIB
ARROW_PYTHON_STATIC_LIB
ARROW_PYTHON_SHARED_LIB
ARROW_GPU_STATIC_LIB
ARROW_GPU_SHARED_LIB
)
endif()
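
The nested conditions in the FindArrow.cmake hunks above decide the final value of `ARROW_GPU_FOUND`. As a reading aid, here is a rough Python model of that decision, based only on the lines visible in this diff. The function `arrow_gpu_found` and its parameters are hypothetical names; `gpu_lib_dir` stands in for `ARROW_GPU_LIBS`, assumed to be empty when `find_library` fails, and `RuntimeError` stands in for `message(FATAL_ERROR ...)`.

```python
def arrow_gpu_found(arrow_found, build_cuda, gpu_lib_dir,
                    quiet=False, required=False):
    """Model of how the diff above settles ARROW_GPU_FOUND (illustrative only)."""
    # if (PYARROW_BUILD_CUDA AND ARROW_GPU_LIBS) -> set(ARROW_GPU_FOUND TRUE)
    gpu_found = bool(build_cuda and gpu_lib_dir)

    if arrow_found:
        if not quiet:  # if (NOT Arrow_FIND_QUIETLY)
            if build_cuda:
                if not gpu_found:
                    msg = "Could not find the Arrow GPU library."
                    if required:  # Arrow_FIND_REQUIRED
                        raise RuntimeError(msg)  # stands in for FATAL_ERROR
                    gpu_found = False
            else:
                # "Found but not using the Arrow GPU library"
                gpu_found = False
    else:
        gpu_found = False
    return gpu_found


print(arrow_gpu_found(True, True, "/opt/arrow/lib"))
print(arrow_gpu_found(True, False, "/opt/arrow/lib"))
```

In this model the GPU library counts as found only when Arrow itself was found, `PYARROW_BUILD_CUDA` is on, and the library directory is non-empty. The missing-library error can only fire when the find is not quiet.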
30 changes: 28 additions & 2 deletions python/CMakeLists.txt
@@ -58,6 +58,9 @@ endif()

# Top level cmake dir
if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}")
option(PYARROW_BUILD_CUDA
"Build the PyArrow CUDA support"
OFF)
option(PYARROW_BUILD_PARQUET
"Build the PyArrow Parquet integration"
OFF)
@@ -275,6 +278,12 @@ if (PYARROW_BUNDLE_ARROW_CPP)
ABI_VERSION ${ARROW_ABI_VERSION}
SO_VERSION ${ARROW_SO_VERSION})

if (ARROW_GPU_FOUND)
bundle_arrow_lib(ARROW_GPU_SHARED_LIB
ABI_VERSION ${ARROW_ABI_VERSION}
SO_VERSION ${ARROW_SO_VERSION})
endif()

# boost
if (PYARROW_BOOST_USE_SHARED AND PYARROW_BUNDLE_BOOST)
set(Boost_USE_STATIC_LIBS OFF)
@@ -305,6 +314,9 @@ if (PYARROW_BUNDLE_ARROW_CPP)
if (MSVC)
bundle_arrow_implib(ARROW_SHARED_IMP_LIB)
bundle_arrow_implib(ARROW_PYTHON_SHARED_IMP_LIB)
if (ARROW_GPU_FOUND)
bundle_arrow_implib(ARROW_GPU_SHARED_IMP_LIB)
endif()
endif()
endif()

@@ -313,11 +325,19 @@ if (MSVC)
SHARED_LIB ${ARROW_SHARED_IMP_LIB})
ADD_THIRDPARTY_LIB(arrow_python
SHARED_LIB ${ARROW_PYTHON_SHARED_IMP_LIB})
if (ARROW_GPU_FOUND)
ADD_THIRDPARTY_LIB(arrow_gpu
SHARED_LIB ${ARROW_GPU_SHARED_IMP_LIB})
endif()
else()
ADD_THIRDPARTY_LIB(arrow
SHARED_LIB ${ARROW_SHARED_LIB})
ADD_THIRDPARTY_LIB(arrow_python
SHARED_LIB ${ARROW_PYTHON_SHARED_LIB})
if (ARROW_GPU_FOUND)
ADD_THIRDPARTY_LIB(arrow_gpu
SHARED_LIB ${ARROW_GPU_SHARED_LIB})
endif()
Review comment (Member): see comments above

endif()

############################################################
@@ -330,12 +350,18 @@ endif()

set(CYTHON_EXTENSIONS
  lib
)

set(LINK_LIBS
  arrow_shared
  arrow_python_shared
)

if (ARROW_GPU_FOUND)
  set(LINK_LIBS ${LINK_LIBS} arrow_gpu_shared)
  set(CYTHON_EXTENSIONS ${CYTHON_EXTENSIONS} _cuda)
endif()

if (PYARROW_BUILD_PARQUET)
## Parquet
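
Because `_cuda` is appended to `CYTHON_EXTENSIONS` only when `ARROW_GPU_FOUND` is set, a build without CUDA support should not ship a `pyarrow._cuda` module at all. A small sketch of probing for that at runtime, using only the standard library, is given here. The helper name `cuda_extension_built` is hypothetical, and locating the module does not guarantee that a CUDA device is usable.

```python
import importlib.util


def cuda_extension_built() -> bool:
    """Return True if the optional pyarrow._cuda extension module can be located."""
    try:
        # find_spec imports the parent package first, so a missing pyarrow
        # raises ModuleNotFoundError (a subclass of ImportError) here.
        return importlib.util.find_spec("pyarrow._cuda") is not None
    except ImportError:
        return False


print("pyarrow._cuda available:", cuda_extension_built())
```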
62 changes: 62 additions & 0 deletions python/pyarrow/_cuda.pxd
@@ -0,0 +1,62 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

from pyarrow.lib cimport *
from pyarrow.includes.common cimport *
from pyarrow.includes.libarrow cimport *
from pyarrow.includes.libarrow_cuda cimport *


cdef class Context:
    cdef:
        shared_ptr[CCudaContext] context
        int device_number

    cdef void init(self, const shared_ptr[CCudaContext] &ctx)


cdef class IpcMemHandle:
    cdef:
        shared_ptr[CCudaIpcMemHandle] handle

    cdef void init(self, shared_ptr[CCudaIpcMemHandle] &h)


cdef class CudaBuffer(Buffer):
    cdef:
        shared_ptr[CCudaBuffer] cuda_buffer

    cdef void init_cuda(self, const shared_ptr[CCudaBuffer] &buffer)


cdef class HostBuffer(Buffer):
    cdef:
        shared_ptr[CCudaHostBuffer] host_buffer

    cdef void init_host(self, const shared_ptr[CCudaHostBuffer] &buffer)


cdef class BufferReader(NativeFile):
    cdef:
        CCudaBufferReader* reader
        CudaBuffer buffer


cdef class BufferWriter(NativeFile):
    cdef:
        CCudaBufferWriter* writer
        CudaBuffer buffer
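
The declarations above make `CudaBuffer` and `HostBuffer` subclasses of `Buffer`, and `BufferReader` and `BufferWriter` subclasses of `NativeFile`. A short sketch of checking that hierarchy from Python follows. It assumes these classes are re-exported from `pyarrow.cuda` under the same names, as in released pyarrow; the helper `cuda_class_bases` is hypothetical and returns `None` when the CUDA module cannot be imported.

```python
def cuda_class_bases():
    """Report whether the CUDA classes derive from the expected pyarrow bases."""
    try:
        import pyarrow as pa
        from pyarrow import cuda  # ImportError in builds without CUDA support
    except ImportError:
        return None
    return {
        "CudaBuffer": issubclass(cuda.CudaBuffer, pa.Buffer),
        "HostBuffer": issubclass(cuda.HostBuffer, pa.Buffer),
        "BufferReader": issubclass(cuda.BufferReader, pa.NativeFile),
        "BufferWriter": issubclass(cuda.BufferWriter, pa.NativeFile),
    }


print(cuda_class_bases())
```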