add `cupy.setxor1d` api #6582

pri1311 · 2022-03-23T13:24:00Z

Linked Issue: #6078

Hey! I have implememeted numpy.setxor1d but seems like when you provide a multi-dim array with assume_unique as True, numpy raises an error, since their code concatenates array as - np.concatenate((ar1,ar2)) and not np.concatenate((ar1,ar2), axis=None).
However, when assume_unique is false, the array is flattened as it uses the numpy.unique function which returns a flattened array. Do we replicate the same behaviour in CuPy or allow multi-dim arrays even when assume_unique is set to true?

Attaching a screenshot below to explain the above:

Open discussion in numpy - numpy/numpy#14670
Currently, the code I have written does not throw errors for multi-dim arrays, regardless of where assume_unique is True or False. However, tests currently don't cover that, since I can't compare it to numpy results. Should I add some manual-comparison tests? Please let me know your opinion.

emcastillo · 2022-03-24T02:28:09Z

We align to the latest supported numpy version in the master branch. If NumPy errors, CuPy should error too unless there is a good reason that prevents this (device synchronization, checks using divergent threads, etc). In such case it should be clearly stated in the docstring.

pri1311 · 2022-03-24T04:04:30Z

We align to the latest supported numpy version in the master branch. If NumPy errors, CuPy should error too

Alright, thanks. Will make the required changes.

asi1024 · 2022-03-24T04:57:02Z

An algorithm using in1d may be faster. Could you compare the performance?

pri1311 · 2022-03-28T14:14:47Z

Hey @asi1024! I wrote the function using in1d and I believe the following is code is what was expected

aux = cupy.concatenate((ar1[in1d(ar1, ar2, assume_unique=True, invert=True)], ar2[in1d(ar2, ar1, assume_unique=True, invert=True)]))
aux.sort()
return aux

However, this seems to run a little slower compared to the code already existing. (As of now I am comparing the time taken by each implementation to run the tests, please let me know if there is a more definite way to compare time).

asi1024 · 2022-04-04T10:18:49Z

@pri1311 Could you try cupyx.profiler.benchmark?

cupy/_logic/truth.py

pri1311 · 2022-04-04T10:54:49Z

@pri1311 Could you try cupyx.profiler.benchmark?

Aah yes, on it.

pri1311 · 2022-04-04T12:40:34Z

These are the current stats, for array sizes x: (3,2) y: (2,5) x: (30,20) y: (20,50) x: (300,200) y: (200,500)

pri1311 · 2022-04-27T09:05:57Z

I replaced the following two lines

flag = cupy.concatenate((cupy.asanyarray([True]), cupy.asanyarray(aux[1:] != aux[:-1]), cupy.asanyarray([True])))
return aux[flag[1:] & flag[:-1]]

with

flag = cupy.concatenate((cupy.asanyarray([True]), boolcompare(aux[1:], aux[:-1]), cupy.asanyarray([True])))
return aux[xorkernel(flag[1:],flag[:-1])]

boolcompare = cupy.ElementwiseKernel(
    'T x, T y',
    'bool z',
    'z = x != y',
    'boolcompare' 
)

and this is the change in performance.

I'll try to make further changes so as to avoid using the variable flag altogether by tonight. But would like to know if this is better than before or I could've done better?

Array sizes used for comparison:

x = cupy.random.rand(300,200)
y = cupy.random.rand(200,500)

Apologies for the delay.
Thank you!

asi1024 · 2022-04-27T09:32:30Z

@pri1311 I think that merging cupy.concatenate and bitwise-and into one ElementwiseKernel will reduce additional memory allocation!

pri1311 · 2022-04-27T09:33:10Z

Yes working on that

pri1311 · 2022-04-27T10:13:20Z

def setxor1dusingkernel(ar1, ar2, assume_unique=False):
    if not assume_unique:
        ar1 = cupy.unique(ar1)
        ar2 = cupy.unique(ar2)

    aux = cupy.concatenate((ar1, ar2), axis=None)
    if aux.size == 0:
        return aux

    aux.sort()

    setxorkernel = cupy.ElementwiseKernel(
        'raw T X, int32 len',
        'T z',
        '''
            if (i == 0) {
                z = (X[0] != X[1]);
            }
            else if (i == len - 1) {
                z = (X[i] != X[i-1]); 
            }
            else {
                z = X[i] != X[i-1] && X[i] != X[i+1];
            }
        ''',
        'setxorkernel' 
    )
    return setxorkernel(aux, aux.size, aux)

How about this?
Results:::

For array sizes

x = cupy.random.rand(300,200)
y = cupy.random.rand(200,500)

x = cupy.random.rand(3000,2000)
y = cupy.random.rand(2000,5000)

asi1024 · 2022-04-27T14:10:15Z

@pri1311 Great! That implementation looks mostly OK except for a few minor comments. Could you apply the fix to this PR?

pri1311 · 2022-04-28T17:34:51Z

Hey, a few tests are failing (and I understand why) let me just make that correction, and will commit the changes. For some reason, the bug didn't show up when I was manually running the function from a separate file.

pri1311 · 2022-05-22T23:21:40Z

Hey @asi1024, apologies for the delay, had my final exams going on. Have added the final changes from my side. Can you review them once?

cupy/_logic/truth.py

Co-authored-by: Akifumi Imanishi <akifumi.imanishi@gmail.com>

asi1024 · 2022-05-31T02:38:15Z

/test mini

asi1024 · 2022-05-31T06:13:45Z

LGTM! Thank you for your PR!

add cupy.setxor1d api

17db871

emcastillo assigned asi1024 Mar 24, 2022

emcastillo added cat:feature New features/APIs prio:medium labels Mar 24, 2022

kmaehashi mentioned this pull request Mar 29, 2022

[Tracker] Implement all numpy.* APIs in CuPy #6078

Open

asi1024 reviewed Apr 4, 2022

View reviewed changes

cupy/_logic/truth.py Outdated Show resolved Hide resolved

asi1024 added the st:awaiting-author Awaiting response from author label Apr 21, 2022

pri1311 added 2 commits May 22, 2022 23:17

fix setxor1d

e24eb05

fix coding style

0d5d14b

asi1024 reviewed May 28, 2022

View reviewed changes

cupy/_logic/truth.py Outdated Show resolved Hide resolved

cupy/_logic/truth.py Outdated Show resolved Hide resolved

cupy/_logic/truth.py Outdated Show resolved Hide resolved

cupy/_logic/truth.py Outdated Show resolved Hide resolved

code refactoring

c08e134

pri1311 requested a review from asi1024 May 30, 2022 02:10

asi1024 reviewed May 30, 2022

View reviewed changes

cupy/_logic/truth.py Outdated Show resolved Hide resolved

Update cupy/_logic/truth.py

bfa89ea

Co-authored-by: Akifumi Imanishi <akifumi.imanishi@gmail.com>

pri1311 requested a review from asi1024 May 30, 2022 15:38

code style changes

e5408ad

asi1024 approved these changes May 31, 2022

View reviewed changes

asi1024 removed the st:awaiting-author Awaiting response from author label May 31, 2022

asi1024 added this to the v11.0.0rc1 milestone May 31, 2022

asi1024 merged commit b5221a4 into cupy:master May 31, 2022

pri1311 deleted the setxor1d branch July 28, 2022 14:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add `cupy.setxor1d` api #6582

add `cupy.setxor1d` api #6582

pri1311 commented Mar 23, 2022

emcastillo commented Mar 24, 2022

pri1311 commented Mar 24, 2022

asi1024 commented Mar 24, 2022

pri1311 commented Mar 28, 2022

asi1024 commented Apr 4, 2022

pri1311 commented Apr 4, 2022

pri1311 commented Apr 4, 2022 •

edited

pri1311 commented Apr 27, 2022

asi1024 commented Apr 27, 2022

pri1311 commented Apr 27, 2022

pri1311 commented Apr 27, 2022

asi1024 commented Apr 27, 2022

pri1311 commented Apr 28, 2022

pri1311 commented May 22, 2022

asi1024 commented May 31, 2022

asi1024 commented May 31, 2022

add cupy.setxor1d api #6582

add cupy.setxor1d api #6582

Conversation

pri1311 commented Mar 23, 2022

emcastillo commented Mar 24, 2022

pri1311 commented Mar 24, 2022

asi1024 commented Mar 24, 2022

pri1311 commented Mar 28, 2022

asi1024 commented Apr 4, 2022

pri1311 commented Apr 4, 2022

pri1311 commented Apr 4, 2022 • edited

pri1311 commented Apr 27, 2022

asi1024 commented Apr 27, 2022

pri1311 commented Apr 27, 2022

pri1311 commented Apr 27, 2022

asi1024 commented Apr 27, 2022

pri1311 commented Apr 28, 2022

pri1311 commented May 22, 2022

asi1024 commented May 31, 2022

asi1024 commented May 31, 2022

add `cupy.setxor1d` api #6582

add `cupy.setxor1d` api #6582

pri1311 commented Apr 4, 2022 •

edited