[RELEASE] cudf v22.04 #10512

ajschmidt8 · 2022-03-24T20:36:31Z

❄️ Code freeze for `branch-22.04` and v22.04 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-22.04 until release (merging of this PR).

What is the purpose of this PR?

Update documentation
Allow testing for the new release
Enable a means to merge branch-22.04 into main for the release

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

Signed-off-by: Peixin Li <pxli@nyu.edu> Update cudfjni version to 22.04.0-SNAPSHOT Authors: - Peixin (https://github.com/pxLi) Approvers: - Jason Lowe (https://github.com/jlowe) - Nghia Truong (https://github.com/ttnghia) URL: #10084

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

This PR relates to #7960 and indicates that new benchmarks should be written using [NVBench](https://github.com/NVIDIA/nvbench). (This PR does not resolve that issue because it does not transition any existing benchmarks to nvbench, it is only an update in the developer guide.) Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Mark Harris (https://github.com/harrism) URL: #10093

Bug: This is currently throwing exceptions due to cython failing to encode the responses from the python callback automatically. Authors: - Jeremy Dyer (https://github.com/jdye64) Approvers: - https://github.com/brandon-b-miller - https://github.com/jsmaupin - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #10103

This PR removes docker-related files that are no longer necessary. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: #10069

The test `test_dataframe.py::test_memory_usage_multi` is currently flaky. In theory it can fail for any value of the `rows` parameter, but in practice we only observe failures for the smaller value of 10. The reason for this is that the data for the `MultiIndex` is being constructed by randomly sampling from an array of size 3, and for a sufficiently small sample (e.g. 10) the probability that selection will not actually include all three values (e.g. a sample of `[0, 1, 1, 1, 0, 1, 1, 0, 0, 1]`) is not vanishingly small and occurs with observable frequency. The resulting `MultiIndex` will encode the levels for that column as a column with only two values, and as a result the column will occupy 8 fewer bytes (one 64 bit integer or float) less of space than expected. This PR changes that by always sampling without replacement from an array of the same length as the number of rows. I could also have fixed this problem by fixing a random seed that ensures that all the values are always sampled, but I made this change instead because 1) it more clearly conveys the intent, and 2) fixing a seed is a change that we should discuss and apply globally across all our tests. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Ashwin Srinath (https://github.com/shwina) URL: #10114

This PR is a follow-up to #9917 and should be merged after that PR. This resolves #9695 and resolves #5401. The implementation here is only a first pass, but in the interest of prioritizing a working feature for the upcoming release I'm postponing making various additional changes (including some breaking ones). Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Jason Lowe (https://github.com/jlowe) - David Wendt (https://github.com/davidwendt) - Robert Maynard (https://github.com/robertmaynard) URL: #10037

To address part of #5773 This allows to run benchmarks for only specific iterations using environment variable `CUDF_BENCHMARK_ITERATIONS`. except when benchmark definition itself specifies iteration count. Also, makes pool as static to allocate pool memory resource only once per binary. Authors: - Karthikeyan (https://github.com/karthikeyann) Approvers: - Conor Hoekstra (https://github.com/codereport) - Vukasin Milovanovic (https://github.com/vuule) - Nghia Truong (https://github.com/ttnghia) URL: #10060

This depends on #9941 and #10037. Adds Java bindings for mixed left semi join and mixed left anti join. Authors: - Jason Lowe (https://github.com/jlowe) Approvers: - Alessandro Bellina (https://github.com/abellina) - Thomas Graves (https://github.com/tgravescs) - Jim Brennan (https://github.com/jbrennan333) URL: #10040

This PR avoid make `nan_as_null` a no-op if `nan_count` is 0, i.e., there are no nulls present. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - https://github.com/brandon-b-miller URL: #10082

@robertmaynard

This PR addresses suggestions from @robertmaynard, @karthikeyann, and @harrism in #10093 to remove the suffix `_benchmark` from the name of the benchmark files. The developer guide has been updated to reflect this change. I left a few shared files with the word "benchmark" in them, such as [cpp/benchmarks/fixture/benchmark_fixture.hpp](https://github.com/rapidsai/cudf/blob/cfb6cbe6a8e4b01d7ee28fbe3212b5f2eb275167/cpp/benchmarks/fixture/benchmark_fixture.hpp) and [cpp/benchmarks/fixture/templated_benchmark_fixture.hpp](https://github.com/rapidsai/cudf/blob/cfb6cbe6a8e4b01d7ee28fbe3212b5f2eb275167/cpp/benchmarks/fixture/templated_benchmark_fixture.hpp). I built the benchmarks locally to ensure that nothing was broken (I'm noting this because I recall that our CI does not build the benchmarks). Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Karthikeyan (https://github.com/karthikeyann) - Nghia Truong (https://github.com/ttnghia) URL: #10112

This commit cleans up some of the redundancy in CUDF JNI native code, and modifies some of the `unique_ptr` handling: ### Improvements 1. Added utility function to convert from `native_jpointerArray<column_view>` to `vector<column_view>`. Removed redundant code in `stringConcat` functions. 2. Added utility to convert from `unique_ptr<column>` to `jlong`. This eliminates variables from nearly all the JNI functions. 3. Added utility functions for extracting `std::vector` from `native_jArray` classes. This reduces raw loops in `TableJni.cpp`. 4. Eliminated unnecessary use of `unique_ptr` in locations where they are immediately `release()`d. (E.g. `ColumnVectorJni::getNativeColumnView()`.) 5. Removed unnecessary `std::move()` calls. 6. Removed unnecessary `reinterpret_cast` (E.g. in `copyColumnViewToCV()`.) 7. `typedef` -> `using`. Authors: - MithunR (https://github.com/mythrocks) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) - Jason Lowe (https://github.com/jlowe) URL: #10019

This PR removes the deprecated method `Series.set_index`. Resolves #9541, follows up on #9529. Authors: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Christopher Harris (https://github.com/cwharris) - Vyas Ramasubramani (https://github.com/vyasr) URL: #9945

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

Closes #10043 A metadata singleton was allocated from the current rmm memory-resource inside the nvtext normalizer functions. If the memory resource is later changed, the metadata pointer may become invalidated. This PR removes the singleton pattern. The normalizer is used by the subword-tokenizer which is passed a vocabulary structure that is built only once and is maintained by the caller. The metadata has been added to this structure so it's lifetime can share the same scope. The normalizer can also be called directly through the `nvtext::normalize_characters` API. Here the metadata table (size about 1MB) is now created on each call. This showed only significant performance impact on benchmarks testing a small number (<50K) of shorter strings. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Nghia Truong (https://github.com/ttnghia) - Robert Maynard (https://github.com/robertmaynard) URL: #10090

This PR fixes an issue in caching `apply` UDFs where we were incorrectly getting a cache miss on the same function/dataframe. This was happening because `dict_values` objects don't compare as equal. Authors: - https://github.com/brandon-b-miller Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Charles Blackmon-Luca (https://github.com/charlesbluca) - Ashwin Srinath (https://github.com/shwina) URL: #10133

@robertmaynard

Updates cmake-format script to branch 22.04. (@robertmaynard Is this supposed to be updated whenever we bump the release version?) Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Robert Maynard (https://github.com/robertmaynard) - AJ Schmidt (https://github.com/ajschmidt8) URL: #10132

This PR resolves #10076. It improves `gpu_utils.py` by removing code for handling CUDA < 11.0, which we no longer support. This is marked as "breaking" because of minor Python API changes. I changed the name of an error class from `UnSupportedCUDAError` to `UnsupportedCUDAError` and removed an unused error class named `UnSupportedGPUError`. It appears the `UnSupportedGPUError` class was introduced in #4692 but has never been used in the code. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Ashwin Srinath (https://github.com/shwina) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #10113

… reader (#10127) The option was put in place to prevent Python users of the read_orc API to read decimal columns as 128bit, since this type was not supported in Python at the time. Now that decimal128 in supported in Python, this option can be removed. The change in technically breaking, but it is very unlikely that anyone is using the API. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Conor Hoekstra (https://github.com/codereport) - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #10127

Fixes: #10117 This PR adds a duplicate column validation check in `ColumnAccessor.rename_levels` Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Charles Blackmon-Luca (https://github.com/charlesbluca) - Vyas Ramasubramani (https://github.com/vyasr) URL: #10120

This PR removes the `drop_nan` parameter from the internal API `IndexedFrame._drop_na_rows`. Its behavior was unused internally in cudf (always set to `True` in the public API `IndexedFrame.dropna`). The behavior of `drop_nan=False` was untested until 22.02 hotfix #10123, when an issue was found in gpu-bdb. However, that code can use the public API `df.dropna(axis=0)` instead. See rapidsai/gpu-bdb#228. This is marked as a non-breaking change because it only affects internal APIs. Resolves #10125, follows up on #10123. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Vyas Ramasubramani (https://github.com/vyasr) URL: #10140

Pariatlly resolves: #10129 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #10142

Currently, Java binding for `strings::split` calls the API with the default max split value. This PR adds a parameter for it, allowing to specify max split from Java. This is non-breaking because the Java code implicitly uses the default max split value. Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - Jason Lowe (https://github.com/jlowe) URL: #10137

Reference #9856 specifically #9856 (comment) Adds `cudf::strings::findall_record` which was initially implemented in nvstrings but not ported over since LIST column types did not exist at the time and returning a vector of small columns was very inefficient. This API should also allow using the current python function `cudf.str.findall()` with the `expand=False` parameter more effectively. A follow-on PR will address these python changes. This PR reorganizes the libcudf strings _find_ source files into the `cpp/src/strings/search` subdirectory as well. Also, `findall()` has only a regex version so the `_re` suffix is dropped from the name in the libcudf implementation. The python changes in this PR address only the name change and the addition of the new API in the cython interface. Depends on #9909 -- shares the `cudf::strings::detail::count_matches()` utility function. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Vyas Ramasubramani (https://github.com/vyasr) URL: #9911

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

benchmark fixture - fix sharing static object pointer, replace with static shared pointer. Authors: - Karthikeyan (https://github.com/karthikeyann) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Mike Wilson (https://github.com/hyperbolic2346) URL: #10145

`cudf::jni::convert_table_for_return()` is usually used on tables returned from a libcudf API call. It currently requires an l-value reference for its table argument. This necessitates parking the result of libcudf call in an avoidable temp variable. This commit adds the option to use an r-value reference. This allows table expressions to be used directly, reducing clutter. Note: 1. The previous signature is retained, because not all call sites can use the r-value interface cleanly. (E.g. when the libcudf call is complex.) 2. The third argument (vector<unique_ptr<column>>) has been converted from l-ref to r-ref, so that an empty default can be introduced. This commit also includes minor code cleanup in the periphery of calls to `convert_table_for_return()`. Authors: - MithunR (https://github.com/mythrocks) Approvers: - Jason Lowe (https://github.com/jlowe) URL: #10131

Closes #10006 Fixes a use case where the regex pattern creates a set of instructions that can cause the regex evaluation process to go into an infinite loop. For example, the pattern `(x?)+` creates the following instructions: ``` Instructions: 0: CHAR c='x', next=2 1: OR right=0, left=2, next=2 2: RBRA id=1, next=4 3: LBRA id=1, next=1 4: OR right=3, left=5, next=5 5: END startinst_id=3 startinst_ids: [ 3 -1] ``` This causes in an infinite loop at instruction 4 where the path may go like: 4->3->1->2->4 ... forever. Supporting this pattern does not look possible. The `+` quantifier is applied to capture group symbol `)` inside of which `x?` means 0 or more repeating the character `x`. This means it could match `x` or nothing and so applying the `+` to nothing would be invalid. That said, the pattern `x?+` currently already throws an error because of the invalid usage of `+` quantifier. Therefore, the fix here adds a checking step after the instruction set is created to check for a possible infinite-loop case. If one is detected, an exception is thrown indicating the pattern is not supported. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Devavret Makkar (https://github.com/devavret) - Vyas Ramasubramani (https://github.com/vyasr) URL: #10095

review-notebook-app · 2022-03-24T20:36:38Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Adds JNI bindings for t-digest reductions (in commit 756ed20) Authors: - https://github.com/nvdbaranec - Andy Grove (https://github.com/andygrove) Approvers: - Jason Lowe (https://github.com/jlowe)

This PR closes #10527, an issue with outdated code in our "10 minutes" notebook. The material changes are on lines 65 and 66. Authors: - Ashwin Srinath (https://github.com/shwina) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Bradley Dice (https://github.com/bdice)

This PR pins Click to a version supported by the version of black that we use. #10523 is too disruptive to backport to 22.04 in the middle of a release, so this is a minimum change to unblock linter runs. Authors: - Vyas Ramasubramani (https://github.com/vyasr) - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar)

@nvdbaranec

…seen in NDS q72 in Spark (#10534) The following change addresses a performance degradation we noticed in the `mixed_join` and `compute_mixed_join_output_size` that looks to be tied to the theoretical occupancy of these kernels, as limited by the number of registers used. The regression is triggered by this patch: #9727, which improves handling of unreachable code paths. That said, somehow, this change is altering the number of registers these kernels need. Both `mixed_join` and `compute_mixed_join_output_size` are very sensitive to the register count, per NSight compute. With the patch, the register required changed from 92 to 102, and 118 to 141 respectively. The fix here hints the compiler what our block size is (128 threads). This, from our testing, allows the compiler to reduce the number of registers required to 128 for `compute_mixed_join_output_size` and 96 for `mixed_join`. This lead to better occupancy (I think @nvdbaranec measured it going from 30% to 50%) and I saw the wall clock time of q72 (which started all this) to go from 133s to 121s, which is within the ballpark I'd expect. Authors: - Alessandro Bellina (https://github.com/abellina) Approvers: - Mike Wilson (https://github.com/hyperbolic2346)

CMake 3.23 appears to have a bug relating to CUDA support that breaks our builds. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - AJ Schmidt (https://github.com/ajschmidt8)

codecov · 2022-03-31T12:06:25Z

Codecov Report

Merging #10512 (4c84184) into main (53a31d1) will increase coverage by 75.73%.
The diff coverage is n/a.

❗ Current head 4c84184 differs from pull request most recent head b50ae82. Consider uploading reports for the commit b50ae82 to get more accurate results

@@             Coverage Diff             @@
##             main   #10512       +/-   ##
===========================================
+ Coverage   10.42%   86.15%   +75.73%     
===========================================
  Files         119      141       +22     
  Lines       20607    22510     +1903     
===========================================
+ Hits         2148    19394    +17246     
+ Misses      18459     3116    -15343

Impacted Files	Coverage Δ
...ython/custreamz/custreamz/tests/test_dataframes.py	`99.39% <0.00%> (-0.01%)`	⬇️
python/custreamz/custreamz/_version.py	`0.00% <0.00%> (ø)`
python/dask_cudf/dask_cudf/__init__.py	`82.35% <0.00%> (ø)`
python/dask_cudf/dask_cudf/_version.py	`0.00% <0.00%> (ø)`
python/cudf/cudf/_fuzz_testing/fuzzer.py	`0.00% <0.00%> (ø)`
python/dask_cudf/dask_cudf/io/tests/test_orc.py	`100.00% <0.00%> (ø)`
...ython/dask_cudf/dask_cudf/io/tests/test_parquet.py	`100.00% <0.00%> (ø)`
python/cudf/cudf/core/udf/pipeline.py
python/dask_cudf/dask_cudf/tests/test_struct.py	`100.00% <0.00%> (ø)`
python/cudf/cudf/core/udf/templates.py	`100.00% <0.00%> (ø)`
... and 111 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 774d859...b50ae82. Read the comment docs.

This commit introduces JNI bindings to retrieve the major and minor CUDA compute capability versions for the current CUDA device. This feature enables introspection from `spark-rapids` to detect the GPU architecture, for model-specific behaviour. This is required from NVIDIA/spark-rapids/pull/5122, to work around the erroneous behaviour of JNI `fixed_width_convert_to_rows()` on Pascal GPUs (#10569), (which in turn produces failures like NVIDIA/spark-rapids/issues/4980). Authors: - MithunR (https://github.com/mythrocks) Approvers: - https://github.com/nvdbaranec - Jason Lowe (https://github.com/jlowe) - Nghia Truong (https://github.com/ttnghia)

@bdice

@bdice spotted some copyright failures in #10512 This is because CI copyright checks were introduced in between of `22.04` release and some files that were modified prior to that didn't have their copyrights updated, so this PR updates those missing files. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - AJ Schmidt (https://github.com/ajschmidt8)

Replaces the deprecated use of `as_gpu_matrix` with `to_cupy`, and also updates some deprecated calls to `cupy.fromDlpack`. Closes #10582 Authors: - Charles Blackmon-Luca (https://github.com/charlesbluca) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - AJ Schmidt (https://github.com/ajschmidt8)

GPUtester and others added 30 commits January 20, 2022 18:07

Merge pull request #10099 from rapidsai/branch-22.02

57ff6f5

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

Merge pull request #10106 from rapidsai/branch-22.02

f1e0bb6

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

Merge pull request #10110 from rapidsai/branch-22.02

ea905d7

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

Remove unnecessary docker files. (#10069)

6a77acc

This PR removes docker-related files that are no longer necessary. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: #10069

Avoid nan_as_null op if nan_count is 0 (#10082)

127537e

This PR avoid make `nan_as_null` a no-op if `nan_count` is 0, i.e., there are no nulls present. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - https://github.com/brandon-b-miller URL: #10082

Merge pull request #10135 from rapidsai/branch-22.02

eacaea0

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

Deprecate decimal_cols_as_float in ORC reader (#10142)

b684f17

Pariatlly resolves: #10129 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #10142

Merge pull request #10148 from rapidsai/branch-22.02

1246116

[gpuCI] Forward-merge branch-22.02 to branch-22.04 [skip gpuci]

ajschmidt8 requested a review from a team as a code owner March 24, 2022 20:36

ajschmidt8 requested review from trxcllnt, isVoid and bdice March 24, 2022 20:36

github-actions bot added CMake CMake build issue conda Java Affects Java cuDF API. Python Affects Python cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels Mar 24, 2022

Add Java bindings for t-digest reduction (#10446)

c71fe1b

Adds JNI bindings for t-digest reductions (in commit 756ed20) Authors: - https://github.com/nvdbaranec - Andy Grove (https://github.com/andygrove) Approvers: - Jason Lowe (https://github.com/jlowe)

galipremsagar approved these changes Mar 28, 2022

View reviewed changes

shwina and others added 9 commits March 29, 2022 14:50

Pin click version to last support by black<22.3.0.

daea0dd

Fix formatting.

e8dc00c

Update build.sh

25e5e73

Update build.sh

d30b4f6

Pin CMake to prevent 3.23 bugs.

1cccc29

Merge pull request #10544 from vyasr/pin_cmake

a02b7c2

CMake 3.23 appears to have a bug relating to CUDA support that breaks our builds. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - AJ Schmidt (https://github.com/ajschmidt8)

bdice approved these changes Apr 5, 2022

View reviewed changes

galipremsagar mentioned this pull request Apr 5, 2022

[REVIEW] Update copyrights with respect to main #10595

Merged

galipremsagar and others added 3 commits April 5, 2022 11:15

update changelog

b50ae82

raydouglass merged commit 803c42a into main Apr 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RELEASE] cudf v22.04 #10512

[RELEASE] cudf v22.04 #10512

ajschmidt8 commented Mar 24, 2022

review-notebook-app bot commented Mar 24, 2022

codecov bot commented Mar 31, 2022 •

edited

Loading

[RELEASE] cudf v22.04 #10512

[RELEASE] cudf v22.04 #10512

Conversation

ajschmidt8 commented Mar 24, 2022

❄️ Code freeze for branch-22.04 and v22.04 release

What does this mean?

What is the purpose of this PR?

review-notebook-app bot commented Mar 24, 2022

codecov bot commented Mar 31, 2022 • edited Loading

Codecov Report

❄️ Code freeze for `branch-22.04` and v22.04 release

codecov bot commented Mar 31, 2022 •

edited

Loading