# NumPy Set Operations 

In [9]:
import numpy as np

## What is a Set

A set in mathematics is a collection of unique elements.

Sets are useful when you frequently need to compute intersections, unions, and differences.

## Create Sets in NumPy

NumPy's unique() function returns the sorted unique elements of any array. We can use it to create a set array, but remember that set arrays should only be 1-D arrays.

In [10]:
# Convert the following array with repeated elements to a set

arr1 = np.array([1,2,1,1,4,3,2,1,3,5,3,2,3,4,5,2,1,6,4,7,6,7])
s1 = np.unique(arr1)
print(s1)

[1 2 3 4 5 6 7]
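
As a small illustration of the 1-D point above, the sketch below passes a 2-D array to unique(). The array name `grid` and its values are made up for this example. Called without an axis argument, unique() flattens the input, so the result is still a 1-D set array; the optional return_counts argument also reports how often each value occurred.

```python
import numpy as np

# Hypothetical 2-D array with repeated values (illustration only)
grid = np.array([[4, 1, 2],
                 [2, 4, 4]])

# Without an axis argument, np.unique flattens the input and returns
# the sorted unique values as a 1-D array
values, counts = np.unique(grid, return_counts=True)

print(values)  # [1 2 4]
print(counts)  # [1 2 3]
```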


## Finding Union

To find the sorted, unique values that appear in either of two arrays, use the union1d() function.

In [11]:
# Find the union of the following two set arrays

a1 = np.array([1,2,3,4])
a2 = np.array([3,4,5,6])
u1 = np.union1d(a1, a2)
print(u1)

[1 2 3 4 5 6]
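
union1d() accepts exactly two arrays. One way to take the union of more than two is to fold the function over a sequence with functools.reduce, as sketched below. The array names `c1`, `c2`, and `c3` are placeholders for this example, and the inputs do not need to be unique.

```python
from functools import reduce

import numpy as np

# Hypothetical input arrays (they may contain repeated values)
c1 = np.array([1, 3, 4, 3])
c2 = np.array([3, 1, 2, 1])
c3 = np.array([6, 3, 4, 2])

# Apply union1d pairwise from left to right: union1d(union1d(c1, c2), c3)
u_all = reduce(np.union1d, (c1, c2, c3))

print(u_all)  # [1 2 3 4 6]
```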


## Finding Intersection

To find only the values that are present in both arrays, use the intersect1d() function.

In [12]:
b1 = np.array([1,2,3,4,5])
b2 = np.array([4,5,6,7,8])

i1 = np.intersect1d(b1, b2, assume_unique=True)
print(i1)

[4 5]


Note: the intersect1d() function takes an optional argument assume_unique, which can speed up computation when set to True. Set it to True only when both input arrays are known to contain no duplicates; otherwise the result may be incorrect.
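
When the inputs might contain repeated values, the default call (without assume_unique) is the safe choice. The sketch below uses made-up arrays `x` and `y` to show this, together with the optional return_indices argument, which also reports the index of the first occurrence of each common value in both inputs.

```python
import numpy as np

# Hypothetical inputs; x contains a repeated value, so assume_unique is left at its default
x = np.array([1, 1, 2, 3, 4])
y = np.array([2, 1, 4, 6])

# return_indices=True additionally returns, for each common value,
# the index of its first occurrence in x and in y
common, x_idx, y_idx = np.intersect1d(x, y, return_indices=True)

print(common)  # [1 2 4]
print(x_idx)   # [0 2 4]
print(y_idx)   # [1 0 2]
```

Indexing the inputs with the returned positions, `x[x_idx]` and `y[y_idx]`, gives back the common values.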

## Finding Difference

To find only the values in the first set that are NOT present in the second set, use the setdiff1d() function.

In [13]:
# Find the values in set1 that are not in set2

set1 = np.array([1,2,3,4,5])
set2 = np.array([4,5,6,7,8])

dif = np.setdiff1d(set1, set2, assume_unique=True)
print(dif)

[1 2 3]


Note: the setdiff1d() function takes an optional argument assume_unique, which can speed up computation when set to True. Set it to True only when both input arrays are known to contain no duplicates; otherwise the result may be incorrect.
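
Besides speed, assume_unique also affects the order of the result of setdiff1d(): by default the output is sorted, while with assume_unique=True it keeps the order of the first array. The arrays `p` and `q` below are made up for this sketch, which also shows that the difference depends on the order of the arguments.

```python
import numpy as np

# Hypothetical unsorted set arrays (no duplicates)
p = np.array([5, 1, 3])
q = np.array([1, 7])

sorted_diff = np.setdiff1d(p, q)                      # default: result is sorted
ordered_diff = np.setdiff1d(p, q, assume_unique=True) # keeps the order of p
reverse_diff = np.setdiff1d(q, p)                     # values of q not in p

print(sorted_diff)   # [3 5]
print(ordered_diff)  # [5 3]
print(reverse_diff)  # [7]
```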

## Finding Symmetric Difference

To find only the values that are present in exactly one of the two sets (that is, NOT in BOTH), use the setxor1d() function.

In [14]:
# Find the symmetric difference of set1 and set2

set1 = np.array([1,2,3,4,5])
set2 = np.array([3,4,5,6,7])

x = np.setxor1d(set1, set2, assume_unique=True)
print(x)

[1 2 6 7]


Note: the setxor1d() function takes an optional argument assume_unique, which can speed up computation when set to True. Set it to True only when both input arrays are known to contain no duplicates; otherwise the result may be incorrect.
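
The symmetric difference can also be read as the union of the two one-sided differences. The sketch below checks that relationship using the functions covered above; the names `m`, `n`, `only_m`, and `only_n` are chosen for this example only.

```python
import numpy as np

# Hypothetical set arrays, same values as the example above
m = np.array([1, 2, 3, 4, 5])
n = np.array([3, 4, 5, 6, 7])

xor = np.setxor1d(m, n)

# Build the same result from the two one-sided differences
only_m = np.setdiff1d(m, n)   # values in m but not in n
only_n = np.setdiff1d(n, m)   # values in n but not in m
combined = np.union1d(only_m, only_n)

print(xor)                            # [1 2 6 7]
print(combined)                       # [1 2 6 7]
print(np.array_equal(xor, combined))  # True
```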