### Advanced NumPy
Advanced indexing methods,
broadcasting,
other operations

In [1]:
import numpy as np

### Fancy Indexing

In [3]:
# Fancy Indexing
a = np.arange(12).reshape(4,3)
a

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

- Use fancy indexing when the rows or columns you need do not follow a regular pattern that a slice can express: pass a list (or array) of indices instead.

In [4]:
a[[0,1,2]]

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [5]:
a = np.arange(24).reshape(6,4)
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [8]:
a[:,[0,2,3]]

array([[ 0,  2,  3],
       [ 4,  6,  7],
       [ 8, 10, 11],
       [12, 14, 15],
       [16, 18, 19],
       [20, 22, 23]])

In [12]:
a[[0,2,3],0]

array([ 0,  8, 12])

In [13]:
a[[0,3,4],[0,2,1]]

array([ 0, 14, 17])
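
The last cell pairs the two index lists element by element, which is easy to confuse with selecting a block of rows and columns. The sketch below contrasts the two, using the same 6x4 array; `np.ix_` is a NumPy helper not used in the original cells, and the names `paired` and `block` are only illustrative.

```python
import numpy as np

a = np.arange(24).reshape(6, 4)

# Two index lists are paired element by element:
# this picks a[0, 0], a[3, 2] and a[4, 1].
paired = a[[0, 3, 4], [0, 2, 1]]
print(paired)        # [ 0 14 17]

# np.ix_ builds an open mesh so that every listed row is combined
# with every listed column, giving a 3x3 sub-array.
block = a[np.ix_([0, 3, 4], [0, 2, 1])]
print(block)
# [[ 0  2  1]
#  [12 14 13]
#  [16 18 17]]
```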

### Boolean Indexing

In [14]:
a = np.random.randint(1,100,24).reshape(6,4)
a

array([[ 3, 85, 84, 60],
       [61, 83, 45, 54],
       [76, 62, 61, 80],
       [92, 39, 95, 42],
       [47, 24, 76, 73],
       [73, 46, 41, 62]], dtype=int32)

In [15]:
a > 50

array([[False,  True,  True,  True],
       [ True,  True, False,  True],
       [ True,  True,  True,  True],
       [ True, False,  True, False],
       [False, False,  True,  True],
       [ True, False, False,  True]])

In [16]:
a[a>50]

array([85, 84, 60, 61, 83, 54, 76, 62, 61, 80, 92, 95, 76, 73, 73, 62],
      dtype=int32)

- The boolean array acts as a mask on the original array: only the elements where the mask is **True** are kept, and the result is always returned as a 1D array.

In [17]:
a[a%2==0]

array([84, 60, 54, 76, 62, 80, 92, 42, 24, 76, 46, 62], dtype=int32)

In [21]:
a[(a%2==0)&(a>50)] # Combine element-wise conditions with the bitwise operators (&, |, ~), not the logical keywords (and, or, not).

array([84, 60, 54, 76, 62, 80, 92, 76, 62], dtype=int32)
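
To illustrate why the bitwise operators are needed, here is a minimal sketch on a small fixed array (chosen here for reproducibility, since the array above is random). The variable names are only illustrative.

```python
import numpy as np

a = np.array([[3, 85, 84],
              [61, 42, 76]])

# `and` asks for the truth value of a whole array, which is ambiguous.
try:
    a[(a % 2 == 0) and (a > 50)]
except ValueError as exc:
    print("ValueError:", exc)

# `&` combines the two masks element by element.
# The parentheses are required because & binds tighter than == and >.
both = a[(a % 2 == 0) & (a > 50)]
print(both)          # [84 76]

# | is element-wise OR, ~ is element-wise NOT.
either = a[(a % 2 == 0) | (a > 50)]
print(either)        # [85 84 61 42 76]
not_even = a[~(a % 2 == 0)]
print(not_even)      # [ 3 85 61]
```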

### Broadcasting

In [23]:
# Same shape
a1 = np.arange(6).reshape(2,3)
a2 = np.arange(6,12).reshape(2,3)

print(a1,'\n',a2)
print(a1+a2)

[[0 1 2]
 [3 4 5]] 
 [[ 6  7  8]
 [ 9 10 11]]
[[ 6  8 10]
 [12 14 16]]


In [26]:
# Different shape
a1 = np.arange(6).reshape(2,3)
a2 = np.arange(3).reshape(1,3)

print(a1,'\n',a2)
print(a1+a2)

[[0 1 2]
 [3 4 5]] 
 [[0 1 2]]
[[0 2 4]
 [3 5 7]]


- NumPy performs broadcasting automatically.
- Arithmetic operations still work when the two arrays have different shapes, as long as the shapes are compatible. This is called **Broadcasting**.
- The smaller array is broadcast to the shape of the larger array.
    - *Meaning:* Here **a2** has shape **(1,3)**. In the operation between **a1** and **a2**, **a2 is treated as if it had the same shape as a1**: its single row is repeated to fill the missing row, as if *a2 = [[0,1,2],[0,1,2]]*. NumPy does not actually copy the data to do this.
- The same applies whenever a smaller shape is combined with a compatible larger shape: the smaller array is broadcast to the larger shape and the result has that shape.
---
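The repetition described above can be made visible with `np.broadcast_to`, which returns the expanded view that NumPy uses implicitly. This is a small sketch reusing the `a1` and `a2` from the cell above; `a2_expanded` is an illustrative name.

```python
import numpy as np

a1 = np.arange(6).reshape(2, 3)
a2 = np.arange(3).reshape(1, 3)

# View a2 as if it had the shape of a1; the single row is repeated.
a2_expanded = np.broadcast_to(a2, a1.shape)
print(a2_expanded)
# [[0 1 2]
#  [0 1 2]]

# The repeated row is not copied: the stride along axis 0 is 0 bytes.
print(a2_expanded.strides[0])    # 0

# Adding the expanded view gives the same result as a1 + a2.
print(np.array_equal(a1 + a2_expanded, a1 + a2))   # True
```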

#### Broadcasting Rules
1. Make the two arrays have the same number of dimensions.
    - If the arrays have different numbers of dimensions, prepend dimensions of size 1 to the shape of the array with fewer dimensions.

*Eg: a1 has shape (3,3) and a2 has shape (3,), so a 1 is added at the head of a2's shape, which becomes (1,3).*

2. Make each dimension of the two arrays the same size.
    - If the sizes in a dimension do not match and one of them is 1, the 1 is stretched to the size of the other. In the example above a2 is (1,3); the 1 becomes 3 to match a1's (3,3), so a2 is treated as (3,3) and both arrays have the same shape.
    - If the sizes in a dimension do not match and neither of them is 1, the arrays cannot be broadcast and an error is raised.

*Eg: a1 with shape (3,2) and a2 with shape (3,) can't be broadcast: a2 becomes (1,3), and the last dimensions, 2 and 3, differ with neither equal to 1.*
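
The two rules can be sketched as a small function that works on shape tuples only. `broadcast_shape` is a hypothetical helper written for this illustration, not a NumPy function; the result is cross-checked against `np.broadcast_shapes`, which assumes NumPy 1.20 or newer.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the two rules above to two shape tuples."""
    # Rule 1: prepend 1s to the shorter shape.
    ndim = max(len(shape_a), len(shape_b))
    shape_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    shape_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    # Rule 2: per dimension, sizes must match or one of them must be 1.
    result = []
    for size_a, size_b in zip(shape_a, shape_b):
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

print(broadcast_shape((3, 3), (3,)))    # (3, 3)
print(broadcast_shape((1, 3), (4, 1)))  # (4, 3)

# Cross-check against NumPy (np.broadcast_shapes needs NumPy >= 1.20).
print(np.broadcast_shapes((1, 3), (4, 1)))  # (4, 3)

try:
    broadcast_shape((3, 2), (3,))
except ValueError as exc:
    print("ValueError:", exc)
```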


In [29]:
a1 = np.arange(6).reshape(3,2)
a2 = np.arange(3)

print(a1,'\n',a2)
print(a1+a2) # Error

[[0 1]
 [2 3]
 [4 5]] 
 [0 1 2]


ValueError: operands could not be broadcast together with shapes (3,2) (3,) 
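
If the intent was to add one value of a2 to each row of a1 (an assumption, since the original cell only demonstrates the error), a2 can be given a trailing axis of size 1 so that the rules are satisfied. A minimal sketch, with `column` and `result` as illustrative names:

```python
import numpy as np

a1 = np.arange(6).reshape(3, 2)
a2 = np.arange(3)

# Shape (3,) -> (3, 1): the new trailing axis of size 1 can stretch to 2.
column = a2[:, np.newaxis]
print(column.shape)      # (3, 1)

result = a1 + column
print(result)
# [[0 1]
#  [3 4]
#  [6 7]]
```

`a2.reshape(3, 1)` would produce the same (3, 1) shape.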

In [None]:
a = np.arange(3).reshape(1,3)
b = np.arange(4).reshape(4,1)

print(a,'\n',b)
print('\n',a+b)

[[0 1 2]] 
 [[0]
 [1]
 [2]
 [3]]

 [[0 1 2]
 [1 2 3]
 [2 3 4]
 [3 4 5]]


In [31]:
a = np.arange(16).reshape(4,4)
b = np.arange(4).reshape(2,2)

print(a,'\n',b)
print('\n',a+b) # Error

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]] 
 [[0 1]
 [2 3]]


ValueError: operands could not be broadcast together with shapes (4,4) (2,2) 
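
Broadcasting only stretches dimensions of size 1, so a (2,2) array is never repeated automatically to fill (4,4). If repeating the block is what is wanted (an assumption made for this sketch), the repetition has to be explicit, for example with `np.tile`:

```python
import numpy as np

a = np.arange(16).reshape(4, 4)
b = np.arange(4).reshape(2, 2)

# Repeat b twice along each axis to build a (4, 4) array explicitly.
b_tiled = np.tile(b, (2, 2))
print(b_tiled)
# [[0 1 0 1]
#  [2 3 2 3]
#  [0 1 0 1]
#  [2 3 2 3]]

print(a + b_tiled)
# [[ 0  2  2  4]
#  [ 6  8  8 10]
#  [ 8 10 10 12]
#  [14 16 16 18]]
```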

### Mathematical Formulas

In [33]:
# Sigmoid
def sigmoid(array):
    return 1/(1+np.exp(-(array)))

a = np.arange(6)
sigmoid(a)

array([0.5       , 0.73105858, 0.88079708, 0.95257413, 0.98201379,
       0.99330715])
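
For large negative inputs, `np.exp(-array)` can overflow and emit a RuntimeWarning, even though the final value is still correct. One common way around this, sketched here as an optional variant rather than a replacement for the function above, is to only ever exponentiate a non-positive number. `stable_sigmoid` is an illustrative name.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never calls exp on a positive number."""
    x = np.asarray(x, dtype=float)
    e = np.exp(-np.abs(x))          # always in [0, 1], cannot overflow
    # x >= 0: 1 / (1 + exp(-x));  x < 0: exp(x) / (1 + exp(x))
    return np.where(x >= 0, 1 / (1 + e), e / (1 + e))

a = np.arange(6)
# Same values as the plain formula on ordinary inputs.
print(np.allclose(stable_sigmoid(a), 1 / (1 + np.exp(-a))))   # True

# Extreme inputs are handled without an overflow warning.
print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])).tolist())
# [0.0, 0.5, 1.0]
```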

In [34]:
# Mean Squared Error
actual = np.random.randint(1,50,25)
predicted = np.random.randint(1,50,25)

In [35]:
print(actual)
print(predicted)

[47  9 45 40 30 23 34 42 20 35 36 25 12 46 17 38 31 29 13 49 32 24 17 38
 15]
[42 12 40 35 19 26 35 47 32 44 37 44 45 47 43  3 20 21 43 43 34 32 22 30
  4]


In [37]:
def mse(actual, predicted):
    return np.mean((actual-predicted)**2)
mse(actual, predicted)

np.float64(208.68)
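
Because the arrays above are random, the value 208.68 cannot be checked by hand. The sketch below repeats the calculation on a small fixed example (values chosen only for illustration) and shows each intermediate step.

```python
import numpy as np

def mse(actual, predicted):
    return np.mean((actual - predicted) ** 2)

actual = np.array([1, 2, 3, 4])
predicted = np.array([1, 3, 5, 4])

errors = actual - predicted      # [ 0 -1 -2  0]
squared = errors ** 2            # [0 1 4 0]
print(squared.sum() / len(squared))   # 1.25
print(mse(actual, predicted))         # 1.25
```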

### Working with missing values (np.nan)

In [38]:
a = np.array([1,2,3,4,np.nan,5])
a

array([ 1.,  2.,  3.,  4., nan,  5.])

In [43]:
a[~np.isnan(a)].astype(int)

array([1, 2, 3, 4, 5])
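
Two related points are worth a short sketch: NaN cannot be found with `==`, which is why `np.isnan` is used above, and ordinary reductions propagate NaN unless a NaN-aware function is used. `np.nanmean` and `np.nansum` are NumPy functions not used in the original cells.

```python
import numpy as np

# The array is float because np.nan is a float value.
a = np.array([1, 2, 3, 4, np.nan, 5])

# NaN is not equal to anything, including itself, so == cannot find it.
print(np.nan == np.nan)          # False
print(np.isnan(a).tolist())      # [False, False, False, False, True, False]

# A plain reduction propagates NaN; the nan-aware versions skip it.
print(np.mean(a))                # nan
print(np.nanmean(a))             # 3.0
print(np.nansum(a))              # 15.0
```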