Add `return_index` option to set operations (`setxor1d`, etc) #458

Closed
@bfroehle wants to merge 1 commit

@bfroehle

Not ready for merge, just asking for comments.

From numpy.lib.arraysetops:

To do: Optionally return indices analogously to unique for all functions.

I've gone ahead and implemented a return_index parameter for one function -- setxor1d -- and I'd like some feedback before I plunge ahead to write the corresponding options for the rest of the functions (setdiff1d, union1d, etc).

Questions to be addressed:

  1. Is the name return_index alright, even though it returns two sets of indices? It is the same name as used in np.unique.
  2. Will you accept this without a corresponding return_inverse? I would think yes, as MATLAB doesn't allow for such an option and it's not necessarily obvious what it would even mean.
  3. Code style... is it alright? Or would you prefer more of a global block format:
    def setxor1d(ar1, ar2, assume_unique=False, return_index=False):

        if return_index:
            < new code >
        else:
            < old code >
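
For context on questions 1 and 2, here is a small sketch of how the existing `np.unique` flags behave on a single array. The sample array `a` and the variable names are illustrative only and are not part of the patch.

```python
import numpy as np

a = np.array([2, 1, 2, 3, 1])

# return_index: positions of the first occurrence of each unique value.
# return_inverse: positions in `values` that rebuild the original array.
values, first_idx, inverse = np.unique(a, return_index=True, return_inverse=True)

# Indexing the input with first_idx gives back the sorted unique values.
picked = a[first_idx]

# Indexing the unique values with inverse gives back the input.
rebuilt = values[inverse]
```

With two input arrays there is no single input to rebuild, which is one way to read the doubt about `return_inverse` raised in question 2.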
@teoliphant
Owner

This seems OK to me.

But, I would prefer that you create a new function for the case of return_index = True. This has always been Guido's suggestion, though we don't follow it very well in NumPy.

For example, you could use the suffix _idxs to mark the return_index = True variants:

setxor1d_idxs, etc.
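
A minimal sketch of what such a separate function might look like, assuming the `_idxs` name suggested above. This is not code from the patch: it uses `np.isin` (available in later NumPy releases) rather than the sort-based approach in the diff, and it always returns three arrays so the return type never depends on a flag.

```python
import numpy as np

def setxor1d_idxs(ar1, ar2):
    """Hypothetical companion to np.setxor1d that always returns indices.

    Returns (values, idx1, idx2), where idx1 and idx2 index the flattened
    inputs and select the values that occur in only one of them.
    """
    ar1 = np.asarray(ar1).ravel()
    ar2 = np.asarray(ar2).ravel()

    # Unique values of each input and the index of their first occurrence.
    u1, first1 = np.unique(ar1, return_index=True)
    u2, first2 = np.unique(ar2, return_index=True)

    # Keep only the values that are absent from the other input.
    keep1 = ~np.isin(u1, u2)
    keep2 = ~np.isin(u2, u1)

    values = np.sort(np.concatenate((u1[keep1], u2[keep2])))
    return values, first1[keep1], first2[keep2]

# Inputs chosen to match the output shown in the docstring example of the diff.
a = np.array([1, 2, 3, 2, 4])
b = np.array([2, 3, 5, 7, 5])
values, ia, ib = setxor1d_idxs(a, b)
```

The plain `np.setxor1d` keeps its single-array return value, and callers who need indices opt in by name rather than by keyword.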

@bfroehle

While we are discussing options, another would be something like argsetxor1d which could be to setxor1d as argsort is to sort.

That is,

>>> a = np.array( [5, 7, 1, 2] )
>>> b = np.array( [2, 4, 3, 1, 5] )
>>> ia, ib = argsetxor1d(a, b)
>>> ia
array([1])
>>> a[ia]
array([7])
>>> ib
array([2, 1])
>>> b[ib]
array([3, 4])
>>> np.alltrue(setxor1d(a, b) == np.sort(np.concatenate((a[ia], b[ib]))))
True
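
One way the proposed `argsetxor1d` could be written, as a sketch only. The function does not exist in NumPy; the body below adapts the sort-and-flag approach from the diff in this pull request and returns just the two index arrays.

```python
import numpy as np

def argsetxor1d(ar1, ar2):
    """Hypothetical index-only counterpart of np.setxor1d."""
    ar1 = np.asarray(ar1).ravel()
    ar2 = np.asarray(ar2).ravel()

    # Deduplicate each input, remembering where each value first occurred.
    u1, first1 = np.unique(ar1, return_index=True)
    u2, first2 = np.unique(ar2, return_index=True)

    aux = np.concatenate((u1, u2))
    if aux.size == 0:
        return first1, first2  # both are empty index arrays

    # Sort the combined values but keep the permutation for later lookup.
    perm = aux.argsort(kind="stable")
    aux = aux[perm]

    # A value belongs to the xor if it differs from both sorted neighbours.
    flag = np.concatenate(([True], aux[1:] != aux[:-1], [True]))
    keep = flag[1:] & flag[:-1]

    # Positions below u1.size came from ar1, the rest from ar2.
    idx = perm[keep]
    n1 = u1.size
    return first1[idx[idx < n1]], first2[idx[idx >= n1] - n1]

a = np.array([5, 7, 1, 2])
b = np.array([2, 4, 3, 1, 5])
ia, ib = argsetxor1d(a, b)
```

As in the example above, sorting the concatenation of `a[ia]` and `b[ib]` should reproduce `np.setxor1d(a, b)`.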
@bfroehle closed this
Commits on Sep 24, 2012
  1. @bfroehle
Showing with 82 additions and 6 deletions.
  1. +41 −6 numpy/lib/arraysetops.py
  2. +41 −0 numpy/lib/tests/test_arraysetops.py
47 numpy/lib/arraysetops.py
@@ -234,7 +234,7 @@ def intersect1d(ar1, ar2, assume_unique=False):
     aux.sort()
     return aux[:-1][aux[1:] == aux[:-1]]

-def setxor1d(ar1, ar2, assume_unique=False):
+def setxor1d(ar1, ar2, assume_unique=False, return_index=False):
     """
     Find the set exclusive-or of two arrays.
@@ -248,12 +248,21 @@ def setxor1d(ar1, ar2, assume_unique=False):
     assume_unique : bool
         If True, the input arrays are both assumed to be unique, which
         can speed up the calculation. Default is False.
+    return_index : bool, optional
+        If True, return the indices of `ar1` and `ar2` that result in the
+        unique array.

     Returns
     -------
     setxor1d : ndarray
         Sorted 1D array of unique values that are in only one of the input
         arrays.
+    setxor1d_indices1 : ndarray, optional
+        The indices of the unique values in the (flattened) original ar1 array.
+        Only provided if `return_index` is True.
+    setxor1d_indices2 : ndarray, optional
+        The indices of the unique values in the (flattened) original ar2 array.
+        Only provided if `return_index` is True.

     Examples
     --------
@@ -262,21 +271,47 @@ def setxor1d(ar1, ar2, assume_unique=False):
     >>> np.setxor1d(a,b)
     array([1, 4, 5, 7])
+    >>> np.setxor1d(a, b, return_index=True)
+    (array([1, 4, 5, 7]), array([0, 4]), array([2, 3]))
+
     """
     if not assume_unique:
-        ar1 = unique(ar1)
-        ar2 = unique(ar2)
+        if return_index:
+            ar1, idx1 = unique(ar1, True)
+            ar2, idx2 = unique(ar2, True)
+        else:
+            ar1 = unique(ar1)
+            ar2 = unique(ar2)

     aux = np.concatenate( (ar1, ar2) )
     if aux.size == 0:
-        return aux
+        if return_index:
+            return aux, np.empty(0, dtype=int), np.empty(0, dtype=int)
+        else:
+            return aux
+
+    if return_index:
+        perm = aux.argsort()
+        aux = aux[perm]
+    else:
+        aux.sort()

-    aux.sort()
     # flag = ediff1d( aux, to_end = 1, to_begin = 1 ) == 0
     flag = np.concatenate( ([True], aux[1:] != aux[:-1], [True] ) )
     # flag2 = ediff1d( flag ) == 0
     flag2 = flag[1:] == flag[:-1]
-    return aux[flag2]
+
+    if return_index:
+        idx = perm[flag2]
+        n1 = ar1.size
+        i1 = idx[idx < n1]
+        i2 = idx[idx >= n1] - n1
+        if assume_unique:
+            return aux[flag2], i1, i2
+        else:
+            return aux[flag2], idx1[i1], idx2[i2]
+    else:
+        return aux[flag2]

 def in1d(ar1, ar2, assume_unique=False):
     """
41 numpy/lib/tests/test_arraysetops.py
@@ -108,6 +108,47 @@ def test_setxor1d( self ):
         assert_array_equal([], setxor1d([],[]))

+    def test_setxor1d_return_index( self ):
+        a = np.array( [5, 7, 1, 2] )
+        b = np.array( [2, 4, 3, 1, 5] )
+
+        ec = np.array( [3, 4, 7] )
+        eia = np.array( [1] )
+        eib = np.array( [2, 1] )
+
+        c, ia, ib = setxor1d( a, b, return_index=True )
+        assert_array_equal( c, ec )
+        assert_array_equal( ia, eia )
+        assert_array_equal( ib, eib )
+
+        a = np.array( [1, 2, 3] )
+        b = np.array( [6, 5, 4] )
+
+        ec = np.array( [1, 2, 3, 4, 5, 6] )
+        eia = np.array( [0, 1, 2] )
+        eib = np.array( [2, 1, 0] )
+
+        c, ia, ib = setxor1d( a, b, return_index=True )
+        assert_array_equal( c, ec )
+        assert_array_equal( ia, eia )
+        assert_array_equal( ib, eib )
+
+        a = np.array( [1, 8, 2, 3] )
+        b = np.array( [6, 5, 4, 8] )
+
+        ec = np.array( [1, 2, 3, 4, 5, 6] )
+        eia = np.array( [0, 2, 3] )
+        eib = np.array( [2, 1, 0] )
+        c, ia, ib = setxor1d( a, b, return_index=True )
+        assert_array_equal( c, ec )
+        assert_array_equal( ia, eia )
+        assert_array_equal( ib, eib )
+
+        c, ia, ib = setxor1d( [], [], return_index=True )
+        assert_array_equal( c, [] )
+        assert_array_equal( ia, [] )
+        assert_array_equal( ib, [] )
+
     def test_ediff1d(self):
         zero_elem = np.array([])
         one_elem = np.array([1])