
Collaborations on columnar data structures #1

Closed
wesm opened this issue May 8, 2017 · 6 comments
Labels
proposal Change current process or code

Comments

@wesm
Contributor

wesm commented May 8, 2017

Excited to see this new org created. I am interested to see if Apache Arrow (i.e. contiguous columnar data, validity bitmap for nulls) is the appropriate data model for data on the GPU, and if we can collaborate on some aspects of the code. It seems that CUDA 7 now supports C++11, so in theory we could compile the Arrow C++ libraries with nvcc and provide necessary APIs to enable Numba to interact with the raw memory buffers. This might simplify IPC with GPU main memory (record batch loading and unloading) and make less work for you here. I have an NVIDIA GPU on my home desktop, so I could help with testing.
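For readers unfamiliar with the layout being discussed: Arrow stores each column as one contiguous data buffer plus a validity bitmap with one bit per value (least-significant bit first, per the Arrow format). A minimal pure-Python sketch of that layout (helper names here are illustrative, not Arrow's API):

```python
import array

def make_column(values):
    """Build an Arrow-style column: a contiguous int64 data buffer
    plus a validity bitmap (1 bit per value, LSB-first)."""
    data = array.array("q", (v if v is not None else 0 for v in values))
    bitmap = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v is not None:
            bitmap[i // 8] |= 1 << (i % 8)
    return data, bitmap

def is_valid(bitmap, i):
    """Check the validity bit for row i."""
    return bool(bitmap[i // 8] & (1 << (i % 8)))

data, bitmap = make_column([1, None, 3])
```

Null slots keep a placeholder (here zero) in the data buffer; consumers are expected to check the bitmap before reading a value.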

@tmostak

tmostak commented May 16, 2017

Hi @wesm, thanks for this. Yes, we are excited about Arrow (even though we only support a subset at the moment) because it provides interoperability with many other systems and makes sense as a way to represent columnar data. I don't see any reason why it should not be performant on the GPU, as the MapD native format is quite similar (except we store nulls in-line when possible to save space and bandwidth). Would it make sense to set up a call with the project members so we can discuss ways to collaborate?

@wesm
Contributor Author

wesm commented May 16, 2017

That sounds good to me. Adding @julienledem @xhochy since they will be interested, and maybe others from the Apache Arrow team.

I am interested in

  • Ingest data into MapD from Arrow record batches (zero-copy, preferably)
  • Export data as Arrow record batches
  • UDF protocol for batch-based UDFs
  • Benchmarks and analysis of pros/cons of different columnar memory layouts on the GPU (you say you store nulls inline -- does that mean sentinel values? Otherwise I am not sure how you could be more efficient than 1 bit per value for data that has nulls).
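To make the trade-off in that last point concrete: a validity bitmap costs ceil(N/8) extra bytes per column, while sentinel nulls cost no extra memory but give up one representable value. A quick back-of-the-envelope sketch (INT32_MIN is an illustrative sentinel choice, not necessarily what MapD uses):

```python
# Bitmap overhead for one million int32 values: 125 KB on top of ~4 MB of data.
N = 1_000_000
bitmap_bytes = (N + 7) // 8           # 1 bit per value, rounded up to bytes
data_bytes = 4 * N                    # int32 payload
overhead = bitmap_bytes / data_bytes  # ~3.1% extra memory

# Sentinel encoding: nulls live in-line in the data buffer itself.
INT32_MIN = -2**31                    # illustrative sentinel value
values = [7, INT32_MIN, 42]           # the second entry is "null"
mask = [v != INT32_MIN for v in values]
```

The bitmap costs a predictable ~3% for int32 data; sentinels are free in space but require a comparison per value on read and shrink the valid domain of the type.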

As background, I did some GPU development for accelerating Bayesian inference problems years ago and did a fair amount of CUDA C and PyCUDA work, so I've had a long-standing interest in architecting data structures and memory access patterns for the GPU.

@billmaimone

Bingo on all fronts; all of this was mentioned in the talk I gave last week at GTC. We also have some basic work to do to support the rest of the data types (the prototype handled only simple, uncompressed numerics to keep things simple).

@wesm
Contributor Author

wesm commented May 18, 2017

Does the GPU benefit from columnar compression techniques like CPU-based columnar databases do?

@m1mc

m1mc commented May 18, 2017

@wesm, we already have some compression in the core engine, like dictionary encoding. We are also planning to tokenize any string column that contains only digits to save memory. These techniques don't need to be columnar, if you mean something like RLE or HCC. Either way, the aim is to keep GPU decoding fast.
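For context, dictionary encoding replaces each value with a small integer index into a dictionary of distinct values, which is what makes it GPU-friendly: decoding is a single gather. A pure-Python sketch of the idea (function names are illustrative, not the engine's API):

```python
def dictionary_encode(column):
    """Replace each value with an index into a list of distinct values."""
    dictionary, indices, seen = [], [], {}
    for v in column:
        if v not in seen:
            seen[v] = len(dictionary)
            dictionary.append(v)
        indices.append(seen[v])
    return dictionary, indices

def dictionary_decode(dictionary, indices):
    # Decoding is a gather: one lookup per row, trivially parallel on a GPU.
    return [dictionary[i] for i in indices]

dictionary, indices = dictionary_encode(["nyc", "sf", "nyc", "nyc"])
```

For low-cardinality string columns the savings are large: the strings are stored once, and each row shrinks to a small fixed-width index.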

@billmaimone

billmaimone commented Jun 7, 2017 via email

@mike-wendt mike-wendt added the proposal Change current process or code label Aug 6, 2018
@mike-wendt mike-wendt changed the title Collaborations on columnar data structures Aug 8, 2018
kkraus14 pushed a commit that referenced this issue Aug 23, 2018
mike-wendt pushed a commit that referenced this issue Oct 26, 2018
kkraus14 pushed a commit that referenced this issue Nov 27, 2018
harrism added a commit that referenced this issue Jan 24, 2019
ayushdg pushed a commit to ayushdg/cudf that referenced this issue May 8, 2019
raydouglass pushed a commit that referenced this issue May 13, 2019
shwina referenced this issue in shwina/cudf May 24, 2019
rjzamora pushed a commit to rjzamora/cudf that referenced this issue Jun 19, 2019
kkraus14 pushed a commit that referenced this issue Feb 10, 2020
OlivierNV added a commit to OlivierNV/cudf that referenced this issue Feb 21, 2020
kkraus14 pushed a commit that referenced this issue Apr 6, 2020
codereport added a commit to codereport/cudf that referenced this issue Jun 19, 2020
codereport added a commit to codereport/cudf that referenced this issue Jun 26, 2020
codereport added a commit to codereport/cudf that referenced this issue Jun 29, 2020
codereport added a commit to codereport/cudf that referenced this issue Jul 2, 2020
sperlingxx pushed a commit to sperlingxx/cudf that referenced this issue Oct 23, 2020
mythrocks added a commit to mythrocks/cudf that referenced this issue May 4, 2021
rapids-bot bot pushed a commit that referenced this issue Jul 22, 2022
bwyogatama referenced this issue in bwyogatama/cudf Aug 19, 2022
rapids-bot bot pushed a commit that referenced this issue Sep 28, 2022
rapids-bot bot pushed a commit that referenced this issue Jun 9, 2023
rapids-bot bot pushed a commit that referenced this issue Sep 22, 2023
raydouglass pushed a commit that referenced this issue Nov 7, 2023