### np.where()

#### Applying Conditions Using np.where()

In [37]:
death_count = np.array([2131,534,3453,57,12432,4542,1241,5457])

In [38]:
# Let's find the positions where the number of deaths was below 3000
few_deaths = np.where(death_count<3000)

In [44]:
few_deaths

(array([0, 1, 3, 6], dtype=int64),)

When called with only a condition, np.where() returns a tuple of index arrays (one array per dimension) for the elements that match the condition.
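As a small sketch of what can be done with that tuple, the indices can be passed straight back into the array to pull out the matching values. The names `low_values` and `mask_values` are illustrative and not part of the original notebook.

```python
import numpy as np

death_count = np.array([2131, 534, 3453, 57, 12432, 4542, 1241, 5457])

# With a single argument, np.where(cond) gives the same indices as np.nonzero(cond)
few_deaths = np.where(death_count < 3000)

# The tuple of index arrays can be used directly to select the matching values
low_values = death_count[few_deaths]   # values at positions 0, 1, 3 and 6

# A boolean mask selects the same values without computing indices first
mask_values = death_count[death_count < 3000]
```

If only the values are needed, the boolean mask is usually the shorter route; the index form is handy when the positions themselves matter.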

When called with three arguments, the numpy where function is a vectorized version of the ternary expression <b>x if condition else y</b>.

In [2]:
import numpy as np

In [3]:
xarr = np.array([1.2,2.3,1.5,5.6,3.6])

In [4]:
yarr = np.array([2.4,5.8,3.5,1.6,8.4])

In [5]:
cond = np.array([True,False,True,False,True])

Let's see how we can combine xarr and yarr based on the states (True/False) in the cond array: take the value from xarr where cond is True, otherwise from yarr.

In [6]:
# Traditional Way
result = [(x if c else y)
         for x,y,c in zip(xarr,yarr,cond)]

In [7]:
result

[1.2, 5.7999999999999998, 1.5, 1.6000000000000001, 3.6000000000000001]

In [8]:
# The same can be achieved using "np.where()"

np.where(cond, xarr, yarr)

array([ 1.2,  5.8,  1.5,  1.6,  3.6])

In [12]:
### The second and third arguments to np.where don't necessarily need to be arrays.
### One or both of them can be scalars.
### Let's look at an example.

data  = np.random.randn(4,4)

In [13]:
data

array([[-0.28493669,  1.4424988 , -0.88979428,  0.10438424],
       [-1.32028956, -1.99189442, -0.3002915 , -1.27869392],
       [-0.64660648, -0.37212835, -0.65319629, -1.26543881],
       [-0.06156641,  0.1550349 , -0.10034801, -0.09341672]])

In [14]:
np.where(data>0, 1, -1) # When greater than 0, make it 1 or else make it -1

array([[-1,  1, -1,  1],
       [-1, -1, -1, -1],
       [-1, -1, -1, -1],
       [-1,  1, -1, -1]])

In [15]:
data

array([[-0.28493669,  1.4424988 , -0.88979428,  0.10438424],
       [-1.32028956, -1.99189442, -0.3002915 , -1.27869392],
       [-0.64660648, -0.37212835, -0.65319629, -1.26543881],
       [-0.06156641,  0.1550349 , -0.10034801, -0.09341672]])

In [16]:
### Let's try something else. When a value is greater than 0, make it 1; otherwise keep it unchanged.

np.where(data>0, 1, data)

array([[-0.28493669,  1.        , -0.88979428,  1.        ],
       [-1.32028956, -1.99189442, -0.3002915 , -1.27869392],
       [-0.64660648, -0.37212835, -0.65319629, -1.26543881],
       [-0.06156641,  1.        , -0.10034801, -0.09341672]])

In the example above, only the positive values are replaced with 1; the negative values are left unchanged.
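One point worth checking here is that np.where() builds a new array rather than modifying its input. The sketch below uses a small fixed array (`sample`, with made-up values, not the random data above) so the result is reproducible.

```python
import numpy as np

# Small fixed array with illustrative values (not the random data above)
sample = np.array([[-0.5, 1.5],
                   [2.0, -3.0]])

# The scalar 1 is broadcast against sample; non-positive entries are copied over
capped = np.where(sample > 0, 1, sample)

# sample itself is untouched; capped is a separate float array
print(capped)
print(sample)
```

To change the original array in place instead, boolean-mask assignment such as `sample[sample > 0] = 1` is the usual alternative.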

In [14]:
### Another, more complicated example

In [20]:
cond1 = np.array([True, False, True, True, True, False])
cond2 = np.array([False, False, True, False, True, True])
result = []
for i in range(len(cond1)):
    if cond1[i] and cond2[i]: # add 0 when both are True
        result.append(0)
    elif cond1[i]: # add 1 when cond1 alone is True
        result.append(1)
    elif cond2[i]: # add 2 when cond2 alone is True
        result.append(2)
    else: # add 3 when both are false
        result.append(3)

In [21]:
result

[1, 3, 0, 1, 0, 2]

The same can be converted into a "nested where expression" as follows

In [23]:
np.where(cond1 & cond2, 0, np.where(cond1, 1, np.where(cond2, 2, 3)))

array([1, 3, 0, 1, 0, 2])

The code above follows the same syntax as "np.where(cond, x, y)" which is to be read as "x if cond else y".
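Deeply nested where expressions can become hard to read. As an alternative (a different function, not what the notebook uses), np.select() takes a list of conditions and a list of choices and picks the first condition that is True for each element. A minimal sketch with the same cond1 and cond2:

```python
import numpy as np

cond1 = np.array([True, False, True, True, True, False])
cond2 = np.array([False, False, True, False, True, True])

# Conditions are checked in order; the first one that is True decides the value.
# default is used where none of the conditions hold.
selected = np.select([cond1 & cond2, cond1, cond2], [0, 1, 2], default=3)

# The nested np.where form from above, for comparison
nested = np.where(cond1 & cond2, 0, np.where(cond1, 1, np.where(cond2, 2, 3)))

print(selected)
print(np.array_equal(selected, nested))
```

The order of the condition list matters in the same way as the order of the if/elif branches in the loop version.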

In [26]:
### Note that the & operator (element-wise AND) doesn't accept Python lists as operands, as you can see below.

cond3 = [True, False, True, True, True, False]
cond4 = [False, False, True, False, True, True]

In [27]:
type(cond3)

list

In [28]:
np.where(cond3 & cond4, 0,
        np.where(cond3, 1,
                np.where(cond4, 2, 3)))

TypeError: unsupported operand type(s) for &: 'list' and 'list'

In [29]:
### It raises a TypeError: unsupported operand type(s) for &: 'list' and 'list'.
### Hence it's better to use numpy arrays in these scenarios.
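If the conditions do arrive as plain lists, two simple workarounds are to convert them with np.asarray() first, or to use np.logical_and(), which accepts array-like inputs. The names `a3`, `a4`, `both` and `both_alt` below are illustrative.

```python
import numpy as np

cond3 = [True, False, True, True, True, False]
cond4 = [False, False, True, False, True, True]

# Option 1: convert the lists to boolean arrays so that & works element-wise
a3, a4 = np.asarray(cond3), np.asarray(cond4)
both = a3 & a4

# Option 2: np.logical_and accepts lists directly
both_alt = np.logical_and(cond3, cond4)

result = np.where(both, 0, np.where(a3, 1, np.where(a4, 2, 3)))
print(result)
```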

In [30]:
### Refer to alternate-cyptic-syntax to see how this could be done without the where function, and for the rest of the topics.