
[FEA] Add 64-bit size type option at build-time for libcudf #13159

Open
10 tasks
GregoryKimball opened this issue Apr 17, 2023 · 2 comments
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request libcudf Affects libcudf (C++/CUDA) code.

Comments

@GregoryKimball
Contributor

GregoryKimball commented Apr 17, 2023

Many libcudf users have expressed interest in using a 64-bit size type (see #3958 for reference). The cudf::size_type uses an int32_t data type, which limits libcudf columns to INT_MAX (~2.1 billion) elements. For string columns this imposes a ~2 GB limit on character data, for int32 columns it imposes a ~8 GB limit, and for list columns it caps the leaf element count below 2.1 billion. Downstream libraries must partition their data to stay within these limits.

We expect that using a 64-bit size type will incur significant penalties to memory footprint and data throughput. Memory footprint will double for all offset vectors, and runtime of most functions will increase due to the larger data sizes. Kernel performance may degrade even further due to increased register count and unoptimized shared memory usage.

As GPU memory capacities grow, the limit from a 32-bit cudf::size_type will force data partitions to become ever smaller fractions of device memory. Excessive partitioning also carries performance penalties, so libcudf should enable its community to start experimenting with a 64-bit size type. Scoping for 64-bit size types in the cuDF-python layer will be tracked in a separate issue (#TBD).

  • Consult with thrust/cub experts about outstanding issues with 64-bit indexing. Some libcudf functions may depend on upstream changes in CCCL; please see cccl/47, thrust/1271, and cub/212. copy_if, reduce, parallel_for, merge, and sort may have unresolved issues.
  • Consult with thrust/cub experts about making the 32-bit kernels optional. Currently the 64-bit kernels are disabled in libcudf builds. Disabling the 32-bit kernels would avoid large increases in compile time and binary size when we enable 64-bit thrust/cub kernels.
  • Verify compatibility of 64-bit size type with cuco data structures (needs additional scoping)
  • Audit custom kernels in libcudf for the impact of a 64-bit size type. Introduce conditional logic to adjust shared memory allocations and threads per block as needed based on the size type. Identify implementation details that take a 32-bit size type for granted.
  • Audit cuIO size types and their interaction with cudf::size_type
  • Resolve compilation errors from using a 64-bit size type
  • Resolve test failures from using a 64-bit size type
  • Review performance impact of a 64-bit size type using libcudf microbenchmark results
  • Add a build-time option for advanced users to use a 64-bit size type instead of a 32-bit size type.
  • Add a CI step to build and test the 64-bit size type option.

From this stage we will have a better sense of the impact and value of using a 64-bit size type with libcudf.

@GregoryKimball GregoryKimball added feature request New feature or request 0 - Backlog In queue waiting for assignment libcudf Affects libcudf (C++/CUDA) code. labels Apr 17, 2023
@GregoryKimball GregoryKimball changed the title [FEA] Add a build-time option for libcudf to use a 64-bit size type [FEA] Add 64-bit size type option at build-time option for libcudf Jul 22, 2023
@GregoryKimball GregoryKimball changed the title [FEA] Add 64-bit size type option at build-time option for libcudf [FEA] Add 64-bit size type option at build-time for libcudf Aug 2, 2023
@GregoryKimball
Contributor Author

GregoryKimball commented Sep 1, 2023

On branch-23.10 at commit ad9fa501192, I ran build.sh libcudf with a 64-bit size type and collected the unique lines that threw compilation errors.

Dictionary errors

cudf/cpp/src/groupby/sort/group_single_pass_reduction_util.cuh(64): error: no suitable conversion function from "cudf::dictionary32" to "cudf::size_type" exists
cudf/cpp/include/cudf/dictionary/detail/iterator.cuh(88): error: no suitable conversion function from "cudf::dictionary32" to "cudf::size_type" exists
cudf/cpp/include/cudf/detail/aggregation/aggregation.cuh(346): error: no suitable conversion function from "cudf::dictionary32" to "cudf::size_type" exists
cudf/cpp/src/groupby/sort/group_correlation.cu(87): error: no suitable conversion function from "cudf::dictionary32" to "cudf::size_type" exists
cudf/cpp/src/groupby/hash/multi_pass_kernels.cuh(107): error: no suitable conversion function from "cudf::dictionary32" to "cudf::size_type" exists
cudf/cpp/include/cudf/dictionary/detail/iterator.cuh(41): error: no suitable conversion function from "cudf::dictionary32" to "cudf::size_type" exists

AtomicAdd errors

cudf/cpp/include/cudf/detail/copy_if_else.cuh(97): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/include/cudf/detail/null_mask.cuh(108): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/groupby/sort/group_std.cu(153): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/bitmask/null_mask.cu(303): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/include/cudf/detail/valid_if.cuh(68): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(274): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(375): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(207): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(204): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(264): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(266): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(281): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(209): error: no instance of overloaded function "atomicAdd" matches the argument list
cudf/cpp/src/io/csv/csv_gpu.cu(283): error: no instance of overloaded function "atomicAdd" matches the argument list

Thrust

cudf/cpp/src/groupby/groupby.cu(269): error: no instance of overloaded function "std::transform" matches the argument list
cudf/cpp/src/copying/contiguous_split.cu(631): error: no instance of overloaded function "std::transform" matches the argument list
cudf/cpp/src/groupby/groupby.cu(304): error: no instance of overloaded function "std::all_of" matches the argument list
cudf/cpp/src/hash/md5_hash.cu(343): error: no instance of overloaded function "thrust::for_each" matches the argument list
cudf/cpp/include/cudf/lists/detail/scatter.cuh(245): error: no instance of overloaded function "thrust::sequence" matches the argument list
cudf/cpp/src/groupby/groupby.cu(313): error: no instance of overloaded function "std::transform" matches the argument list
cudf/cpp/src/filling/repeat.cu(124): error: no instance of overloaded function "thrust::upper_bound" matches the argument list

Device span errors

cudf/cpp/include/cudf/table/experimental/row_operators.cuh(848): error: no instance of constructor "std::optional<_Tp>::optional [with _Tp=cudf::device_span<const int, 18446744073709551615UL>]" matches the argument list
cudf/cpp/include/cudf/table/experimental/row_operators.cuh(848): error: no instance of constructor "std::optional<_Tp>::optional [with _Tp=cudf::device_span<const int32_t, 18446744073709551615UL>]" matches the argument list

int typing errors

cudf/cpp/src/binaryop/compiled/binary_ops.cuh(272): error: no instance of function template "cudf::util::div_rounding_up_safe" matches the argument list
cudf/cpp/include/cudf/detail/utilities/cuda.cuh(169): error: no instance of overloaded function "std::clamp" matches the argument list

Assorted errors

cudf/cpp/src/hash/spark_murmurhash3_x86_32.cu(230): error: no instance of overloaded function "std::max" matches the argument list
cudf/cpp/src/copying/purge_nonempty_nulls.cu(93): error: no instance of function template "cudf::detail::gather" matches the argument list
cudf/cpp/include/cudf/detail/copy_if.cuh(166): error: more than one instance of overloaded function "min" matches the argument list:

@revans2
Contributor

revans2 commented Sep 1, 2023

The Java code currently hard-codes a signed 32-bit size type in many places. We can switch it to 64 bits everywhere, along with a dynamic check depending on how the code is compiled. But just so you are aware, Spark has a top-level limitation of a signed 32-bit int for the number of rows in a table. We can work around this in some places, but moving the Spark plugin over to a 64-bit index is not going to be super simple.

Projects
Status: In Progress
Status: To be revisited