Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java bindings for approx_percentile #9094

Merged
merged 58 commits into from
Sep 24, 2021

Conversation

andygrove
Copy link
Contributor

@andygrove andygrove commented Aug 23, 2021

This PR builds on #8983 and adds Java bindings.

@github-actions github-actions bot added CMake CMake build issue Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels Aug 23, 2021
nvdbaranec and others added 4 commits August 25, 2021 11:59
…zes (still need to add code to ignore them during cluster building). Use groupby group_labels to

avoid use of lower_bound() during reduction of values.
Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it is looking good. I just am commenting on the java code though.

java/src/main/java/ai/rapids/cudf/ColumnView.java Outdated Show resolved Hide resolved
java/src/main/native/src/ColumnViewJni.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

* a t-digest aggregation.
*
* @param percentiles Required percentiles [0,1]
* @return the percentiles as doubles, in the same order passed in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is the output type a list of doubles or what?

* a t-digest aggregation.
*
* @param percentiles Column containing percentiles [0,1]
* @return the percentiles as doubles, in the same order passed in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: what is the output type?

nvdbaranec and others added 4 commits August 30, 2021 16:10
…roupby in lieu of lower_bound checks during merge aggregation. Proper handling of null

input values for regular tdigest aggregation.
…digests. Return null lists

when calling percentile_approx on empty tdigests.
@github-actions github-actions bot removed CMake CMake build issue conda libcudf Affects libcudf (C++/CUDA) code. labels Sep 24, 2021
@andygrove andygrove changed the title [WIP] Java bindings for approx percentile Java bindings for approx_percentile Sep 24, 2021
@andygrove andygrove marked this pull request as ready for review September 24, 2021 14:55
@andygrove andygrove requested a review from a team as a code owner September 24, 2021 14:55
@andygrove andygrove added the feature request New feature or request label Sep 24, 2021
@codecov
Copy link

codecov bot commented Sep 24, 2021

Codecov Report

Merging #9094 (6076788) into branch-21.10 (3ee3ecf) will decrease coverage by 0.05%.
The diff coverage is 11.44%.

❗ Current head 6076788 differs from pull request most recent head 1c1aad4. Consider uploading reports for the commit 1c1aad4 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.10    #9094      +/-   ##
================================================
- Coverage         10.85%   10.80%   -0.06%     
================================================
  Files               115      116       +1     
  Lines             19158    19321     +163     
================================================
+ Hits               2080     2087       +7     
- Misses            17078    17234     +156     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/_lib/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/dataframe.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/multiindex.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/csv.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/orc.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/text.py 0.00% <0.00%> (ø)
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ba76310...1c1aad4. Read the comment docs.

@revans2
Copy link
Contributor

revans2 commented Sep 24, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 0b89459 into rapidsai:branch-21.10 Sep 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request Java Affects Java cuDF API. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants