Implement cudf::reduce for decimal32 and decimal64 (part 1) #6814

Merged: 28 commits into rapidsai:branch-0.18 on Dec 8, 2020

@codereport commented Nov 20, 2020

This PR resolves part of #3556.

This is part 1 of 2. It implements the MIN, MAX, SUM, PRODUCT and NUNIQUE reductions.

Reduction Ops:

  enum Kind {
    SUM,             ///< sum reduction
    PRODUCT,         ///< product reduction
    MIN,             ///< min reduction
    MAX,             ///< max reduction
    COUNT_VALID,     ///< count number of valid elements
    COUNT_ALL,       ///< count number of elements
    ANY,             ///< any reduction
    ALL,             ///< all reduction
    SUM_OF_SQUARES,  ///< sum of squares reduction
    MEAN,            ///< arithmetic mean reduction
    VARIANCE,        ///< groupwise variance
    STD,             ///< groupwise standard deviation
    MEDIAN,          ///< median reduction
    QUANTILE,        ///< compute specified quantile(s)
    ARGMAX,          ///< Index of max element
    ARGMIN,          ///< Index of min element
    NUNIQUE,         ///< count number of unique elements
    NTH_ELEMENT,     ///< get the nth element
    ROW_NUMBER,      ///< get row-number of element
    COLLECT,         ///< collect values into a list
    LEAD,            ///< window function, accesses row at specified offset following current row
    LAG,             ///< window function, accesses row at specified offset preceding current row
    PTX,             ///< PTX UDF based reduction
    CUDA             ///< CUDA UDF based reduction
  };

To Do List:

  • SUM
    • Implementation
    • Basic unit tests
    • Comprehensive unit tests
  • PRODUCT
    • Implementation
    • Basic unit tests
    • Comprehensive unit tests
  • MAX
    • Implementation
    • Basic unit tests
    • Comprehensive unit tests
  • MIN
    • Implementation
    • Basic unit tests
    • Comprehensive unit tests
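
For readers unfamiliar with fixed-point arithmetic, here is a minimal host-side sketch of what the four reductions above mean when every element of a column shares one base-10 scale. It does not use libcudf: `FixedPoint`, `Reps` and the `reduce_*` helpers are hypothetical names made up for illustration, and nulls and integer overflow are ignored. The only point it illustrates is that SUM, MIN and MAX keep the input scale, while a PRODUCT over n elements of scale s has scale n * s.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a fixed-point scalar (not a libcudf type):
// the represented number is value * 10^scale.
struct FixedPoint {
  std::int64_t value;  // unscaled integer representation
  std::int32_t scale;  // base-10 exponent shared by the whole column
};

using Reps = std::vector<std::int64_t>;

// SUM: every element shares one scale, so the unscaled integers add directly.
FixedPoint reduce_sum(Reps const& reps, std::int32_t scale) {
  return {std::accumulate(reps.begin(), reps.end(), std::int64_t{0}), scale};
}

// PRODUCT: multiplying n values of scale s multiplies the integers and adds
// the exponents, so the result has scale n * s.
FixedPoint reduce_product(Reps const& reps, std::int32_t scale) {
  auto const n = static_cast<std::int32_t>(reps.size());
  auto const value = std::accumulate(reps.begin(), reps.end(), std::int64_t{1},
                                     std::multiplies<std::int64_t>{});
  return {value, scale * n};
}

// MIN / MAX: with a shared scale, ordering the integers orders the values.
FixedPoint reduce_min(Reps const& reps, std::int32_t scale) {
  assert(!reps.empty());  // min of an empty column is left undefined in this sketch
  return {*std::min_element(reps.begin(), reps.end()), scale};
}

FixedPoint reduce_max(Reps const& reps, std::int32_t scale) {
  assert(!reps.empty());  // max of an empty column is left undefined in this sketch
  return {*std::max_element(reps.begin(), reps.end()), scale};
}
```

For example, the values 1.50, 2.25 and 10.00 stored as `{150, 225, 1000}` with scale -2 sum to 1375 at scale -2 (13.75) and multiply to 33750000 at scale -6 (33.75).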

Operations that "fell out":

  • NUNIQUE
    • Implementation
    • Basic unit tests

@codereport added the "2 - In Progress" and "libcudf" labels Nov 20, 2020
@codereport requested a review from a team as a code owner November 20, 2020 00:52
@codereport added this to PR-WIP in v0.17 Release via automation Nov 20, 2020
@codereport self-assigned this Nov 20, 2020
@codereport requested review from trxcllnt, cwharris and davidwendt and removed request for cwharris November 20, 2020 00:52
@GPUtester

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

codecov bot commented Nov 20, 2020

Codecov Report

Merging #6814 (8c994f9) into branch-0.18 (917759b) will increase coverage by 0.43%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.18    #6814      +/-   ##
===============================================
+ Coverage        81.57%   82.01%   +0.43%     
===============================================
  Files               96       96              
  Lines            15912    16267     +355     
===============================================
+ Hits             12980    13341     +361     
+ Misses            2932     2926       -6     

Impacted Files                                         Coverage Δ
python/cudf/cudf/io/feather.py                         100.00% <0.00%> (ø)
python/cudf/cudf/comm/serialize.py                     0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/io.py                   0.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/_version.py                 0.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_csv.py        100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_orc.py        100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_json.py       100.00% <0.00%> (ø)
...ython/dask_cudf/dask_cudf/io/tests/test_parquet.py  100.00% <0.00%> (ø)
python/cudf/cudf/utils/applyutils.py                   98.74% <0.00%> (+0.02%) ⬆️
python/cudf/cudf/core/join/join.py                     92.44% <0.00%> (+0.03%) ⬆️
... and 35 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 917759b...8c994f9.

@codereport force-pushed the fp-reduce branch 2 times, most recently from 2b48661 to 442c8e1 November 26, 2020 03:29
@codereport added the "non-breaking" label Dec 2, 2020
@codereport changed the title from "[WIP] Implement cudf::reduce for decimal32 and decimal64" to "Implement cudf::reduce for decimal32 and decimal64 (part 1)" Dec 7, 2020
@codereport added the "3 - Ready for Review" and "4 - Needs Review" labels and removed the "2 - In Progress" label Dec 7, 2020
@rgsl888prabhu moved this from PR-WIP to PR-Needs review in v0.18 Release Dec 7, 2020
v0.18 Release automation moved this from PR-Needs review to PR-Reviewer approved Dec 7, 2020
@rgsl888prabhu left a comment

LGTM, small questions and changes.

cpp/src/reductions/simple.cuh (outdated, resolved)
cpp/src/reductions/simple.cuh (outdated, resolved)
v0.18 Release automation moved this from PR-Reviewer approved to PR-Needs review Dec 7, 2020
cpp/include/cudf/scalar/scalar.hpp (outdated, resolved)
@codereport added the "5 - Ready to Merge" label and removed the "3 - Ready for Review" and "4 - Needs Review" labels Dec 8, 2020
v0.18 Release automation moved this from PR-Needs review to PR-Reviewer approved Dec 8, 2020
@codereport merged commit 9120992 into rapidsai:branch-0.18 Dec 8, 2020
v0.18 Release automation moved this from PR-Reviewer approved to Done Dec 8, 2020
rapids-bot bot pushed a commit that referenced this pull request Dec 17, 2020
This PR resolves part of #3556.

Supporting `cudf::reduce`:
1. Part 1 (`MIN`, `MAX`, `SUM`, `PRODUCT` & `NUNIQUE`) #6814
2. Part 2 (the rest) ◀️

**Reduction Ops:**

**Done in Previous PR**
✔️ `SUM,             ///< sum reduction`
✔️ `PRODUCT,         ///< product reduction`
✔️ `MIN,             ///< min reduction`
✔️ `MAX,             ///< max reduction`
✔️ `NUNIQUE,         ///< count number of unique elements`

**Not supported by `cudf::reduce`:**
* [x] `COUNT_VALID,     ///< count number of valid elements`
* [x] `COUNT_ALL,       ///< count number of elements`
* [x] `COLLECT,         ///< collect values into a list`
* [x] `LEAD,            ///< window function, accesses row at specified offset following current row`
* [x] `LAG,             ///< window function, accesses row at specified offset preceding current row`
* [x] `PTX,             ///< PTX UDF based reduction`
* [x] `CUDA             ///< CUDA UDF based reduction`
* [x] `ARGMAX,          ///< Index of max element`
* [x] `ARGMIN,          ///< Index of min element`
* [x] `ROW_NUMBER,      ///< get row-number of element`

**Won't be supported:**
* [x] `ANY,             ///< any reduction`
* [x] `ALL,             ///< all reduction`

**To Do / Investigate:**
* [x] `SUM_OF_SQUARES,  ///< sum of squares reduction`
* [x] `MEDIAN,          ///< median reduction`
* [x] `QUANTILE,        ///< compute specified quantile(s)`
* [x] `NTH_ELEMENT,     ///< get the nth element`

**Deferred until requested**
* [x] `MEAN,            ///< arithmetic mean reduction`
* [x] `VARIANCE,        ///< groupwise variance`
* [x] `STD,             ///< groupwise standard deviation`

Authors:
  - Conor Hoekstra <codereport@outlook.com>

Approvers:
  - null
  - Karthikeyan
  - David

URL: #6980
Labels:
  - 5 - Ready to Merge (Testing and reviews complete, ready to merge)
  - feature request (New feature or request)
  - libcudf (Affects libcudf (C++/CUDA) code.)
  - non-breaking (Non-breaking change)
Projects: v0.18 Release (Done)
5 participants