Auto-merge branch-0.17 to branch-0.18 [skip ci] #6890

Merged
merged 7 commits into branch-0.18 from branch-0.17
Dec 3, 2020

Conversation

kkraus14
Collaborator

@kkraus14 kkraus14 commented Dec 3, 2020

Auto-merge triggered by push to branch-0.17 that creates a PR to keep branch-0.18 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge.

igormp and others added 7 commits December 2, 2020 22:37
Cleans up apt's cache after installing everything.

Also added the number of cores of the current machine, to avoid an unbounded number of threads being spawned and using far too much RAM.

Fixes rapidsai#881.

Authors:
  - Igor Moura <imphilippini@gmail.com>
  - Igor Moura <imp2@cin.ufpe.br>
  - Karthikeyan <6488848+karthikeyann@users.noreply.github.com>

Approvers:
  - AJ Schmidt
  - Karthikeyan
  - AJ Schmidt

URL: rapidsai#6619
Closes rapidsai#6478

`cudf::gather` no longer runs a pre-pass to check index validity.

For `out_of_bounds_policy`, `FAIL` is removed, and `NULLIFY` and `DONT_CHECK` are exposed to the user. `NULLIFY` checks for out-of-bounds indices and sets the corresponding rows to null, while `DONT_CHECK` skips all checks. `DONT_CHECK` should yield higher performance, provided that `gather_map` contains only valid indices.

Note that the negative index (wrap-around) policy is unchanged. When the gather map dtype is signed, wrap-around is applied.

A new Cython binding to `cudf::minmax`, used for bounds checking in the Cython `gather`, is added. This will also close rapidsai#6731.
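
The two policies and the wrap-around rule described above can be modelled in a few lines of plain Python. This is only an illustrative sketch operating on lists, not the libcudf or cuDF API: the names `OutOfBoundsPolicy`, `gather` and `map_in_bounds` are hypothetical, and the min/max check is an assumption about how a `minmax` result could be used to validate a gather map before choosing `DONT_CHECK`.

```python
from enum import Enum
from typing import List, Optional, Sequence


class OutOfBoundsPolicy(Enum):
    """Hypothetical stand-in for the two remaining policies."""
    NULLIFY = 0
    DONT_CHECK = 1


def map_in_bounds(gather_map: Sequence[int], num_rows: int) -> bool:
    """Validate a gather map from its min and max only (assumed usage of minmax)."""
    if not gather_map:
        return True
    lo, hi = min(gather_map), max(gather_map)
    # Negative indices down to -num_rows are valid because of wrap-around.
    return -num_rows <= lo and hi < num_rows


def gather(
    source: Sequence[int],
    gather_map: Sequence[int],
    policy: OutOfBoundsPolicy = OutOfBoundsPolicy.DONT_CHECK,
) -> List[Optional[int]]:
    """Gather rows of `source`; None models a null row."""
    n = len(source)
    out: List[Optional[int]] = []
    for idx in gather_map:
        if idx < 0:
            idx += n  # wrap-around for signed gather maps
        if policy is OutOfBoundsPolicy.NULLIFY and not 0 <= idx < n:
            out.append(None)  # out-of-bounds index becomes a null row
        else:
            # DONT_CHECK: no validation here, the caller must guarantee
            # that every index is valid.
            out.append(source[idx])
    return out


source = [10, 20, 30]
print(gather(source, [0, 2, -1]))
print(gather(source, [0, 5, -4], OutOfBoundsPolicy.NULLIFY))
```

In this model an invalid index under `DONT_CHECK` is simply not checked, which is why a cheap min/max validation before the call is useful.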

Authors:
  - Michael Wang <michelwang0905@icloud.com>
  - Michael Wang <isVoid@users.noreply.github.com>

Approvers:
  - null
  - Devavret Makkar
  - Ashwin Srinath
  - Keith Kraus
  - Jake Hemstad

URL: rapidsai#6875
…#6726)

This PR intends to:
- Allow `hash_partition` to select a different hash function (e.g. the identity hash function) in addition to `MurmurHash3_32`. (Close rapidsai#6307)
- Remove the redundant, identical `hash_partition` implementation in `src/hash/hashing.cu`.

Restrictions:
- MD5 is not supported.
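
To illustrate what a selectable hash function means for partitioning, here is a minimal pure-Python sketch. It is not the libcudf implementation: the function names are hypothetical, keys are plain integers, and `zlib.crc32` is used only as a convenient stand-in for a mixing hash (it is not `MurmurHash3_32`). The return shape (reordered keys plus partition start offsets) is chosen for illustration.

```python
import zlib
from typing import Callable, List, Sequence, Tuple


def identity_hash(key: int) -> int:
    """Identity hash: the key itself, reduced to 32 bits."""
    return key & 0xFFFFFFFF


def crc32_hash(key: int) -> int:
    """Stand-in mixing hash (CRC32 of the key's 8-byte encoding)."""
    return zlib.crc32(key.to_bytes(8, "little", signed=True))


def hash_partition(
    keys: Sequence[int],
    num_partitions: int,
    hash_fn: Callable[[int], int] = crc32_hash,
) -> Tuple[List[int], List[int]]:
    """Reorder keys so that each partition is contiguous.

    Returns the reordered keys and the start offset of each partition.
    """
    buckets: List[List[int]] = [[] for _ in range(num_partitions)]
    for key in keys:
        buckets[hash_fn(key) % num_partitions].append(key)

    reordered: List[int] = []
    offsets: List[int] = []
    for bucket in buckets:
        offsets.append(len(reordered))
        reordered.extend(bucket)
    return reordered, offsets


reordered, offsets = hash_partition([0, 1, 2, 3, 4, 5], 3, identity_hash)
print(reordered, offsets)
```

With the identity hash, the partition of a key is just `key % num_partitions`, which is useful when the keys are already well distributed.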

Authors:
  - Hao Gao <haog@nvidia.com>

Approvers:
  - Nikolay Sakharnykh
  - Mark Harris
  - Ram (Ramakrishna Prabhu)
  - Mark Harris

URL: rapidsai#6726
…6887)

Fixes a typo and 0-d NumPy array handling. When a NumPy scalar is used on the left-hand side of a binary operation, NumPy's `__eq__` returns a 0-d array rather than a scalar.
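
One generic way to cope with a 0-d array where a scalar is expected is to unwrap it before further use. The helper below is a hypothetical sketch in plain NumPy, not the code from this PR:

```python
import numpy as np


def unwrap_zero_dim(value):
    """Return a Python scalar for a 0-d NumPy array, else the value unchanged."""
    if isinstance(value, np.ndarray) and value.ndim == 0:
        return value.item()
    return value


print(unwrap_zero_dim(np.array(True)))    # a 0-d array is unwrapped
print(unwrap_zero_dim(np.array([1, 2])))  # a 1-d array is left alone
```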

closes rapidsai#6778

Authors:
  - Ramakrishna Prabhu <ramakrishnap@nvidia.com>
  - Ram (Ramakrishna Prabhu) <42624703+rgsl888prabhu@users.noreply.github.com>

Approvers:
  - Keith Kraus

URL: rapidsai#6887
…to cupy for cudf.Series(rapidsai#6839)

This PR adds index handling when dispatching to cupy functions with `__array_ufunc__` and `__array_function__` for cudf.Series.

This PR does the following:

- [x] Adds index handling for `__array_ufunc__` and `__array_function__` (when being dispatched to `cupy`)
- [x] Adds tests to ensure the same results as pandas with an aligned index
- [x] Adds tests for appropriate errors with a non-aligned index
- [x] Removes support for `list` inputs (these should not have been supported in the first place)


Please note that I am unsure how to handle `list` inputs here.
The problem this PR solves is shown below.

With this **PR rapidsai#6839**, we get the correct index when we do the following:
```python
>>> cudf_s1 = cudf.Series(data=[-1, 2, 3, 0], index=[2, 3, 1, 0])
>>> cudf_s2 = cudf.Series(data=[-1, -2, 3, 0], index=[2, 3, 1, 0])
>>> o = np.logaddexp(cudf_s1, cudf_s2)
>>> o.index
Int64Index([2, 3, 1, 0], dtype='int64')
>>> print(o)
2   -0.306853
3    2.018150
1    3.693147
0    0.693147
dtype: float64
```
On **Master** we get:
```python
>>> cudf_s1 = cudf.Series(data=[-1, 2, 3, 0], index=[2, 3, 1, 0])
>>> cudf_s2 = cudf.Series(data=[-1, -2, 3, 0], index=[2, 3, 1, 0])
>>> o = np.logaddexp(cudf_s1, cudf_s2)
>>> o.index
RangeIndex(start=0, stop=4, step=1)
>>> print(o)
0   -0.306853
1    2.018150
2    3.693147
3    0.693147
dtype: float64
```
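
The mechanism behind the two outputs above can be sketched with NumPy's `__array_ufunc__` protocol on a toy class. `MiniSeries` is hypothetical and uses NumPy arrays instead of cupy; it only shows the idea of checking that the inputs share an index and re-attaching that index to the result, and it raises a `ValueError` for non-aligned inputs as an assumed error type.

```python
import numpy as np


class MiniSeries:
    """Toy series: a data array plus an index array (illustration only)."""

    def __init__(self, data, index):
        self.data = np.asarray(data)
        self.index = np.asarray(index)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            return NotImplemented
        series = [x for x in inputs if isinstance(x, MiniSeries)]
        index = series[0].index
        # Refuse inputs whose indexes differ instead of silently
        # combining rows by position.
        for other in series[1:]:
            if not np.array_equal(other.index, index):
                raise ValueError("inputs must share the same index")
        # Run the ufunc on the raw arrays, then re-attach the index.
        raw = [x.data if isinstance(x, MiniSeries) else x for x in inputs]
        return MiniSeries(ufunc(*raw, **kwargs), index)


s1 = MiniSeries([-1, 2, 3, 0], index=[2, 3, 1, 0])
s2 = MiniSeries([-1, -2, 3, 0], index=[2, 3, 1, 0])
out = np.logaddexp(s1, s2)
print(out.index)  # the index of the inputs is preserved
print(out.data)
```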

Authors:
  - Vibhu Jawa <vjawa@nvidia.com>
  - Vibhu Jawa <vibhujawa@gmail.com>

Approvers:
  - null
  - Michael Wang
  - GALI PREM SAGAR

URL: rapidsai#6839
Expand existing murmur3 hashing functionality to hash the row elements serially rather than using a merge function. Also enables configuring the hash seed and null hash value.
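
As a rough illustration of serial row hashing, the sketch below chains a 32-bit MurmurHash3 over the elements of a row, feeding each element's hash in as the seed for the next. It is not the libcudf code: the element encoding (8-byte little-endian integers, UTF-8 strings), the treatment of nulls (hashing the bytes of a configurable `null_hash` value) and all function names are assumptions made for this example.

```python
from typing import Optional, Sequence, Union

MASK32 = 0xFFFFFFFF
Element = Optional[Union[int, str]]


def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 (x86 variant) of a byte string."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    nblocks = len(data) // 4
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
    # Finalization mix.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def _encode(value: Union[int, str]) -> bytes:
    # Assumed encoding, chosen only for this sketch.
    if isinstance(value, str):
        return value.encode("utf-8")
    return value.to_bytes(8, "little", signed=True)


def hash_row_serial(row: Sequence[Element], seed: int = 0, null_hash: int = 0) -> int:
    """Hash a row element by element; each result seeds the next hash."""
    h = seed & MASK32
    for element in row:
        if element is None:
            data = (null_hash & MASK32).to_bytes(4, "little")
        else:
            data = _encode(element)
        h = murmur3_32(data, seed=h)
    return h


print(hex(hash_row_serial([1, "a", None], seed=42)))
```

Because each element's hash depends on the running value, the result depends on element order, which a symmetric merge function would not guarantee.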

Authors:
  - Ryan Lee <ryanlee@nvidia.com>
  - rwlee <rwlee@users.noreply.github.com>

Approvers:
  - null
  - Mark Harris
  - GALI PREM SAGAR
  - Robert (Bobby) Evans

URL: rapidsai#6781
@kkraus14 kkraus14 added the gpuCI and 5 - Ready to Merge labels Dec 3, 2020
@kkraus14 kkraus14 requested review from a team as code owners December 3, 2020 20:15
@GPUtester
Collaborator

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@kkraus14 kkraus14 merged commit 737e715 into rapidsai:branch-0.18 Dec 3, 2020
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge
8 participants