ARROW-1424: [Python] Add CUDA support to pyarrow #2536

Closed
wants to merge 39 commits into from
Changes from 33 commits
a2d7557
Minimal GPU support for pyarrow. WIP.
pearu Aug 29, 2018
49b0190
Expose CudaDeviceManager, CudaContext, CudaIpcMemHandle to pyarrow.
pearu Aug 30, 2018
cf42941
Expose Message for lib_gpu.
pearu Aug 31, 2018
ae3cd3f
Exposed all Arrow GPU C++ classes and functions to pyarrow. WIP
pearu Aug 31, 2018
11cba54
Add copy_to_host and copy_from_host methods to CudaBuffer
pearu Aug 31, 2018
9eebf19
Document copy_to_host and copy_from_host methods. Complete unittests …
pearu Aug 31, 2018
9b8cb1b
Complete copy_from/to_host implementations and tests.
pearu Sep 4, 2018
102f277
Document all methods and functions.
pearu Sep 5, 2018
5e2c1ba
Unittest for allocate/free host buffer.
pearu Sep 5, 2018
36396c6
Test memory management of CudaBuffer and CudaHostBuffer
pearu Sep 5, 2018
e2b14df
Impl CudaBuffer device_buffer and slice methods. Impl tests for CudaB…
pearu Sep 6, 2018
7dc6d88
Impl CudaBufferWriter.writeat, CudaBufferReader.read_buffer. Complete…
pearu Sep 7, 2018
c2799e6
Impl IPC tests.
pearu Sep 7, 2018
584d94b
Flake it.
pearu Sep 7, 2018
d237b34
Implement GPU support in pyarrow
pearu Sep 10, 2018
9e365bd
Merge branch 'pearu-cuda-pyarrow' of github.com:Quansight/arrow into …
pearu Sep 10, 2018
3913e29
Improve detecting availability of gpu support.
pearu Sep 10, 2018
7053df8
Improve detecting availability of gpu support (2nd try).
pearu Sep 10, 2018
a544787
Fix formatting for flake8.
pearu Sep 10, 2018
8faf1ee
Add missing import of pandas_compat.
pearu Sep 10, 2018
61659c4
Introduce pyarrow.cuda module as the CUDA UI.
pearu Sep 10, 2018
cb89ee3
Remove usage of FreeHost to avoid double-freeing issue.
pearu Sep 10, 2018
94989a7
Rename lib_gpu to _cuda, ARROW_GPU to CUDA, --with-arrow-gpu to --wit…
pearu Sep 10, 2018
1defcb4
Remove spaces around * and &
pearu Sep 10, 2018
ce1d3bb
Remove _freed attribute from CudaHostBuffer
pearu Sep 10, 2018
dff5bd4
More space removal. Fix CudaBufferWriter and cuda_read_record_batch d…
pearu Sep 10, 2018
106b071
Removed cuda prefix everywhere except in CudaBuffer.
pearu Sep 10, 2018
f3c41db
Added cdefs to CMemoryPool variables
pearu Sep 10, 2018
ff7faa2
Removed DeviceManager, use 'ctx=Context(<device_number>)'. Introduced…
pearu Sep 11, 2018
2126eba
Fixes for flake8
pearu Sep 11, 2018
88961fa
cmake: moved Arrow CUDA detection to FindArrowCuda.
pearu Sep 12, 2018
66d704e
Silence flake8 on Arrow CUDA api module.
pearu Sep 12, 2018
4ece809
Remove redundant -DPYARROW_BUILD_CUDA=on
pearu Sep 12, 2018
38ddfff
Apply feedback from PR. WIP.
pearu Sep 12, 2018
e1bcb08
Apply feedback from PR. 2.
pearu Sep 12, 2018
7555c65
Raise BufferError in CudaBuffer.__getbuffer__. Fix pytest.raises usages.
pearu Sep 13, 2018
5e171e8
Revised buffer_from_data.
pearu Sep 13, 2018
c86e165
Minor clean up.
pearu Sep 13, 2018
7a01846
Minor clean up 2.
pearu Sep 13, 2018
1 change: 1 addition & 0 deletions cpp/cmake_modules/FindArrow.cmake
@@ -95,6 +95,7 @@ if (MSVC)
PATH_SUFFIXES "bin" )
get_filename_component(ARROW_SHARED_LIBS ${ARROW_SHARED_LIBRARIES} PATH )
get_filename_component(ARROW_PYTHON_SHARED_LIBS ${ARROW_PYTHON_SHARED_LIBRARIES} PATH )

endif ()

if (ARROW_INCLUDE_DIR AND ARROW_LIBS)
124 changes: 124 additions & 0 deletions cpp/cmake_modules/FindArrowCuda.cmake
@@ -0,0 +1,124 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# - Find ARROW CUDA (arrow/gpu/cuda_api.h, libarrow_gpu.a, libarrow_gpu.so)
#
# This module requires Arrow from which it uses
# ARROW_FOUND
# ARROW_SEARCH_HEADER_PATHS
# ARROW_SEARCH_LIB_PATH
# ARROW_HOME
#
# This module defines
# ARROW_CUDA_INCLUDE_DIR, directory containing headers
# ARROW_CUDA_LIBS, directory containing the Arrow CUDA libraries
# ARROW_CUDA_STATIC_LIB, path to libarrow_gpu.a
# ARROW_CUDA_SHARED_LIB, path to libarrow_gpu's shared library
# ARROW_CUDA_SHARED_IMP_LIB, path to libarrow_gpu's import library (MSVC only)
# ARROW_CUDA_FOUND, whether Arrow CUDA has been found

#
# TODO(ARROW-3209): rename arrow/gpu to arrow/cuda, arrow_gpu to arrow_cuda
#

include(FindPkgConfig)
include(GNUInstallDirs)

if (NOT DEFINED ARROW_FOUND)
if (ArrowCuda_FIND_REQUIRED)
find_package(Arrow REQUIRED)
else()
find_package(Arrow)
endif()
endif()

if (NOT ARROW_FOUND)
set(ARROW_CUDA_FOUND FALSE)
return()
endif()

find_path(ARROW_CUDA_INCLUDE_DIR arrow/gpu/cuda_api.h PATHS
${ARROW_SEARCH_HEADER_PATHS}
NO_DEFAULT_PATH
)

if (NOT (ARROW_CUDA_INCLUDE_DIR STREQUAL ARROW_INCLUDE_DIR))
set(ARROW_CUDA_WARN_MSG "Mismatch of Arrow and Arrow CUDA include directories:")
set(ARROW_CUDA_WARN_MSG "${ARROW_CUDA_WARN_MSG} ARROW_INCLUDE_DIR=${ARROW_INCLUDE_DIR}")
set(ARROW_CUDA_WARN_MSG "${ARROW_CUDA_WARN_MSG} ARROW_CUDA_INCLUDE_DIR=${ARROW_CUDA_INCLUDE_DIR}")
message(WARNING ${ARROW_CUDA_WARN_MSG})
endif()

find_library(ARROW_CUDA_LIB_PATH NAMES arrow_gpu
PATHS
${ARROW_SEARCH_LIB_PATH}
NO_DEFAULT_PATH)
get_filename_component(ARROW_CUDA_LIBS ${ARROW_CUDA_LIB_PATH} DIRECTORY)

if (MSVC)
find_library(ARROW_CUDA_SHARED_LIBRARIES NAMES arrow_gpu
PATHS ${ARROW_HOME} NO_DEFAULT_PATH
PATH_SUFFIXES "bin" )
get_filename_component(ARROW_CUDA_SHARED_LIBS ${ARROW_CUDA_SHARED_LIBRARIES} PATH )
endif()


if (ARROW_CUDA_INCLUDE_DIR AND ARROW_CUDA_LIBS)
set(ARROW_CUDA_FOUND TRUE)
set(ARROW_CUDA_LIB_NAME arrow_gpu)
if (MSVC)
set(ARROW_CUDA_STATIC_LIB ${ARROW_CUDA_LIBS}/${ARROW_CUDA_LIB_NAME}${ARROW_MSVC_STATIC_LIB_SUFFIX}${CMAKE_STATIC_LIBRARY_SUFFIX})
set(ARROW_CUDA_SHARED_LIB ${ARROW_CUDA_SHARED_LIBS}/${ARROW_CUDA_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
set(ARROW_CUDA_SHARED_IMP_LIB ${ARROW_CUDA_LIBS}/${ARROW_CUDA_LIB_NAME}.lib)
else()
set(ARROW_CUDA_STATIC_LIB ${ARROW_CUDA_LIBS}/lib${ARROW_CUDA_LIB_NAME}.a)
set(ARROW_CUDA_SHARED_LIB ${ARROW_CUDA_LIBS}/lib${ARROW_CUDA_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
endif()

if (ARROW_CUDA_FOUND)
if (NOT ArrowCuda_FIND_QUIETLY)
message(STATUS "Found the Arrow CUDA library: ${ARROW_CUDA_LIB_PATH}")
endif()
else()
if (NOT ArrowCuda_FIND_QUIETLY)
set(ARROW_CUDA_ERR_MSG "Could not find the Arrow CUDA library. Looked for headers")
set(ARROW_CUDA_ERR_MSG "${ARROW_CUDA_ERR_MSG} in ${ARROW_SEARCH_HEADER_PATHS}, and for libs")
set(ARROW_CUDA_ERR_MSG "${ARROW_CUDA_ERR_MSG} in ${ARROW_SEARCH_LIB_PATH}")
if (ArrowCuda_FIND_REQUIRED)
message(FATAL_ERROR "${ARROW_CUDA_ERR_MSG}")
else(ArrowCuda_FIND_REQUIRED)
message(STATUS "${ARROW_CUDA_ERR_MSG}")
endif (ArrowCuda_FIND_REQUIRED)
endif ()
set(ARROW_CUDA_FOUND FALSE)
endif ()

if (MSVC)
mark_as_advanced(
ARROW_CUDA_INCLUDE_DIR
ARROW_CUDA_STATIC_LIB
ARROW_CUDA_SHARED_LIB
ARROW_CUDA_SHARED_IMP_LIB
)
else()
mark_as_advanced(
ARROW_CUDA_INCLUDE_DIR
ARROW_CUDA_STATIC_LIB
ARROW_CUDA_SHARED_LIB
)
endif()
36 changes: 34 additions & 2 deletions python/CMakeLists.txt
@@ -17,6 +17,9 @@
#
# Includes code assembled from BSD/MIT/Apache-licensed code from some 3rd-party
# projects, including Kudu, Impala, and libdynd. See python/LICENSE.txt
#
# TODO(ARROW-3209): rename arrow_gpu to arrow_cuda
#

cmake_minimum_required(VERSION 2.7)
project(pyarrow)
@@ -58,6 +61,9 @@ endif()

# Top level cmake dir
if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_CURRENT_SOURCE_DIR}")
option(PYARROW_BUILD_CUDA
"Build the PyArrow CUDA support"
OFF)
option(PYARROW_BUILD_PARQUET
"Build the PyArrow Parquet integration"
OFF)
@@ -330,12 +336,38 @@ endif()

set(CYTHON_EXTENSIONS
lib
)

set(LINK_LIBS
arrow_shared
arrow_python_shared
)

if (PYARROW_BUILD_CUDA)
## Arrow CUDA
find_package(ArrowCuda)
if(NOT ARROW_CUDA_FOUND)
message(FATAL_ERROR "Unable to locate Arrow CUDA libraries")
else()
if (PYARROW_BUNDLE_ARROW_CPP)
bundle_arrow_lib(ARROW_CUDA_SHARED_LIB
ABI_VERSION ${ARROW_ABI_VERSION}
SO_VERSION ${ARROW_SO_VERSION})
if (MSVC)
bundle_arrow_implib(ARROW_CUDA_SHARED_IMP_LIB)
endif()
endif()
if (MSVC)
ADD_THIRDPARTY_LIB(arrow_gpu
SHARED_LIB ${ARROW_CUDA_SHARED_IMP_LIB})
else()
ADD_THIRDPARTY_LIB(arrow_gpu
SHARED_LIB ${ARROW_CUDA_SHARED_LIB})
endif()
set(LINK_LIBS ${LINK_LIBS} arrow_gpu_shared)
set(CYTHON_EXTENSIONS ${CYTHON_EXTENSIONS} _cuda)
endif()
endif()

if (PYARROW_BUILD_PARQUET)
## Parquet
62 changes: 62 additions & 0 deletions python/pyarrow/_cuda.pxd
@@ -0,0 +1,62 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

from pyarrow.lib cimport *
from pyarrow.includes.common cimport *
from pyarrow.includes.libarrow cimport *
from pyarrow.includes.libarrow_cuda cimport *


cdef class Context:
cdef:
shared_ptr[CCudaContext] context
int device_number

cdef void init(self, const shared_ptr[CCudaContext] &ctx)


cdef class IpcMemHandle:
cdef:
shared_ptr[CCudaIpcMemHandle] handle

cdef void init(self, shared_ptr[CCudaIpcMemHandle] &h)


cdef class CudaBuffer(Buffer):
cdef:
shared_ptr[CCudaBuffer] cuda_buffer

cdef void init_cuda(self, const shared_ptr[CCudaBuffer] &buffer)


cdef class HostBuffer(Buffer):
cdef:
shared_ptr[CCudaHostBuffer] host_buffer

cdef void init_host(self, const shared_ptr[CCudaHostBuffer] &buffer)


cdef class BufferReader(NativeFile):
cdef:
CCudaBufferReader* reader
CudaBuffer buffer


cdef class BufferWriter(NativeFile):
cdef:
CCudaBufferWriter* writer
CudaBuffer buffer
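
The declarations above back the pyarrow.cuda module that this PR introduces as the CUDA UI, and the _cuda extension is only built when PYARROW_BUILD_CUDA is enabled. Below is a minimal sketch of how calling code might guard against a build without CUDA support. The helper names load_cuda_module and describe_cuda_support are hypothetical and not part of the PR; the Context(<device_number>) call follows the usage named in commit ff7faa2. The exact exception raised when no device is usable is an assumption, so the sketch catches Exception broadly.

```python
def load_cuda_module():
    """Return the pyarrow.cuda module, or None when it cannot be imported.

    Hypothetical helper: the import fails when pyarrow is missing or was
    built without the _cuda extension.
    """
    try:
        from pyarrow import cuda
    except ImportError:
        return None
    return cuda


def describe_cuda_support(device_number=0):
    """Return a one-line, human-readable status string (hypothetical helper)."""
    cuda = load_cuda_module()
    if cuda is None:
        return "pyarrow.cuda unavailable"
    try:
        # Usage taken from the PR: ctx = Context(<device_number>)
        cuda.Context(device_number)
    except Exception as exc:  # assumption: the concrete error type may vary
        return "pyarrow.cuda present, but device {} is not usable: {}".format(
            device_number, exc)
    return "pyarrow.cuda ready on device {}".format(device_number)


if __name__ == "__main__":
    print(describe_cuda_support())
```

Keeping the import inside a function means a program that only optionally uses the GPU can still start on a machine where pyarrow was built without CUDA.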