
### np.sort(iterable)

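- The default sorting algorithm (kind) for the sort function is quicksort. Textbook quicksort is O(n**2) in the worst case, but NumPy implements it as an introsort that falls back to heapsort, so its worst case is O(n log n).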

Return a sorted copy of an array.

https://numpy.org/doc/stable/reference/generated/numpy.sort.html

In [1]:
# code
import numpy as np
a = np.random.randint(1,100,15)
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [2]:
b = np.random.randint(1,100,24).reshape(6,4)
b

array([[94, 43, 94, 90],
       [85, 31, 83, 13],
       [48,  1, 43, 58],
       [90, 44, 48, 76],
       [78, 15, 68, 14],
       [26, 21, 97, 43]])

In [3]:
np.sort(a)[::-1]

array([97, 77, 68, 68, 60, 55, 48, 44, 42, 41, 23, 18,  6,  4,  4])

In [4]:
print(np.sort(b,axis=0))    # Column wise sorting
print()
print(np.sort(b))           # Row wise sorting (default)

[[26  1 43 13]
 [48 15 48 14]
 [78 21 68 43]
 [85 31 83 58]
 [90 43 94 76]
 [94 44 97 90]]

[[43 90 94 94]
 [13 31 83 85]
 [ 1 43 48 58]
 [44 48 76 90]
 [14 15 68 78]
 [21 26 43 97]]
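
np.sort always returns a copy, while the ndarray.sort method sorts in place. The sketch below shows the difference and also passes kind explicitly. The array values are copied from the output of `a` above rather than regenerated, and the names sorted_copy, stable_copy and in_place are only illustrative.

```python
import numpy as np

# Values copied from the example array `a` shown earlier
a = np.array([44, 60, 4, 23, 42, 68, 97, 4, 6, 77, 68, 48, 41, 55, 18])

sorted_copy = np.sort(a)                  # new array; `a` is untouched
stable_copy = np.sort(a, kind="stable")   # request a stable algorithm explicitly

in_place = a.copy()
in_place.sort()                           # ndarray.sort sorts in place and returns None

print(a[:3])            # original order is preserved
print(sorted_copy[:3])  # smallest values first
```

For plain integers the stable and default sorts give the same result; stability only matters when equal keys must keep their relative order.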


### np.append(iterable, value)

numpy.append() returns a new array with the values appended to the end of the input along the given axis. The original array is not modified.

https://numpy.org/doc/stable/reference/generated/numpy.append.html

In [5]:
# code
np.append(a,200)

array([ 44,  60,   4,  23,  42,  68,  97,   4,   6,  77,  68,  48,  41,
        55,  18, 200])

In [6]:
b

array([[94, 43, 94, 90],
       [85, 31, 83, 13],
       [48,  1, 43, 58],
       [90, 44, 48, 76],
       [78, 15, 68, 14],
       [26, 21, 97, 43]])

In [7]:
np.append(b,np.random.random((b.shape[0],1)),axis=1)
# If no axis is provided, both inputs are flattened and the result is 1D

array([[94.        , 43.        , 94.        , 90.        ,  0.23760913],
       [85.        , 31.        , 83.        , 13.        ,  0.16105028],
       [48.        ,  1.        , 43.        , 58.        ,  0.2386133 ],
       [90.        , 44.        , 48.        , 76.        ,  0.54872041],
       [78.        , 15.        , 68.        , 14.        ,  0.30795366],
       [26.        , 21.        , 97.        , 43.        ,  0.52850162]])
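
The effect of the axis argument is easier to see on a small fixed array. This is a minimal sketch with a made-up 2x3 array (not the random `b` above): without an axis everything is flattened, and with an axis the appended values must match the other dimensions.

```python
import numpy as np

b = np.arange(6).reshape(2, 3)

flat = np.append(b, [100, 200])                      # no axis: both inputs are flattened
new_row = np.append(b, [[100, 200, 300]], axis=0)    # must have 3 columns to match b

print(flat.shape)      # (8,)
print(new_row.shape)   # (3, 3)
print(b.shape)         # (2, 3), the original is never modified
```

Because every call copies the data, calling np.append repeatedly in a loop is slow; collecting values in a Python list and converting once is usually the better choice.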

### np.concatenate(iterable(s), axis)

The numpy.concatenate() function joins a sequence of arrays along an existing axis.

https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html

In [8]:
# code
c = np.arange(6).reshape(2,3)
d = np.arange(6,12).reshape(2,3)

print(c)
print(d)

[[0 1 2]
 [3 4 5]]
[[ 6  7  8]
 [ 9 10 11]]


In [9]:
print(np.concatenate((c,d)))
print(np.concatenate((c,d),axis=1))        # Similar to hstack

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
[[ 0  1  2  6  7  8]
 [ 3  4  5  9 10 11]]
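
As a quick check of the "similar to hstack" remark, the sketch below compares concatenate with vstack and hstack on the same `c` and `d`, and also shows axis=None, which flattens the inputs before joining them.

```python
import numpy as np

c = np.arange(6).reshape(2, 3)
d = np.arange(6, 12).reshape(2, 3)

rows = np.concatenate((c, d), axis=0)
cols = np.concatenate((c, d), axis=1)
flat = np.concatenate((c, d), axis=None)   # inputs are flattened first

print(np.array_equal(rows, np.vstack((c, d))))   # same as vstack for 2D inputs
print(np.array_equal(cols, np.hstack((c, d))))   # same as hstack for 2D inputs
print(flat)
```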


### np.unique(iterable)

np.unique() returns the sorted unique values of the array passed to it.

https://numpy.org/doc/stable/reference/generated/numpy.unique.html

In [10]:
# code
e = np.array([1,1,2,2,7,7,7,3,3,4,4,5,5,6,6])

In [11]:
np.unique(e)

array([1, 2, 3, 4, 5, 6, 7])
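
np.unique can also report how often each value occurs. A small sketch using the same array `e` and the return_counts parameter:

```python
import numpy as np

e = np.array([1, 1, 2, 2, 7, 7, 7, 3, 3, 4, 4, 5, 5, 6, 6])

values, counts = np.unique(e, return_counts=True)

print(values)   # sorted unique values
print(counts)   # number of occurrences of each value, in the same order
```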

### np.expand_dims()

numpy.expand_dims() inserts a new axis of length 1 at the given position, so the array gains one dimension.

https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html


- ML/Deep Learning: turns a 1D array into a row vector or a column vector. This matters because ML/DL code works in batches and often expects an explicit batch axis.

In [12]:
# code
a.shape

(15,)

In [13]:
np.expand_dims(a,axis=0).shape

(1, 15)

In [14]:
np.expand_dims(a,axis=1)

array([[44],
       [60],
       [ 4],
       [23],
       [42],
       [68],
       [97],
       [ 4],
       [ 6],
       [77],
       [68],
       [48],
       [41],
       [55],
       [18]])
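
The batch use case can be sketched as follows. The 28x28 "image" is a hypothetical placeholder chosen only for illustration; the comparison with np.newaxis shows that indexing gives the same shapes.

```python
import numpy as np

x = np.arange(15)

row = np.expand_dims(x, axis=0)   # shape (1, 15)
col = np.expand_dims(x, axis=1)   # shape (15, 1)

# Indexing with np.newaxis produces the same shapes
print(row.shape == x[np.newaxis, :].shape)
print(col.shape == x[:, np.newaxis].shape)

# Hypothetical single image turned into a batch of one
image = np.zeros((28, 28))
batch = np.expand_dims(image, axis=0)
print(batch.shape)   # (1, 28, 28)
```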

### np.where(condition, True, False)

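The numpy.where() function has two forms. Called with only a condition, it returns the <u>indices</u> of the elements that satisfy the condition. Called as np.where(condition, x, y), it returns an array that takes values from x where the condition is True and from y elsewhere.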

https://numpy.org/doc/stable/reference/generated/numpy.where.html

In [15]:
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [16]:
# find all indices with value greater than 50
np.where(a>50)

(array([ 1,  5,  6,  9, 10, 13]),)

In [17]:
# replace all values > 50 with 0
np.where(a>50,0,a)

array([44,  0,  4, 23, 42,  0,  0,  4,  6,  0,  0, 48, 41,  0, 18])

In [18]:
np.where(a%2 == 0,0,a)

array([ 0,  0,  0, 23,  0,  0, 97,  0,  0, 77,  0,  0, 41, 55,  0])
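
The one-argument form returns a tuple with one index array per dimension, which is why the 1D result above is wrapped in a tuple. A minimal sketch, using the values of `a` copied from the output above and a small made-up 2D array `m`:

```python
import numpy as np

a = np.array([44, 60, 4, 23, 42, 68, 97, 4, 6, 77, 68, 48, 41, 55, 18])

idx = np.where(a > 50)          # tuple holding one index array
print(idx[0])                   # positions of the matching elements
print(a[idx])                   # the matching values themselves

m = np.array([[1, 60], [70, 2]])
rows, cols = np.where(m > 50)   # one index array per dimension
print(list(zip(rows.tolist(), cols.tolist())))   # [(0, 1), (1, 0)]
```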

### np.argmax()

The numpy.argmax() function returns the <u>indices of the max element of the array</u> along a given axis. Without an axis, it returns a single index into the flattened array.

https://numpy.org/doc/stable/reference/generated/numpy.argmax.html

In [19]:
# code
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [20]:
np.argmax(a)

np.int64(6)

In [21]:
b

array([[94, 43, 94, 90],
       [85, 31, 83, 13],
       [48,  1, 43, 58],
       [90, 44, 48, 76],
       [78, 15, 68, 14],
       [26, 21, 97, 43]])

In [22]:
np.argmax(b,axis=0)

array([0, 3, 5, 0])

In [23]:
np.argmax(b,axis=1)

array([0, 0, 3, 0, 0, 2])
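
When no axis is given on a 2D array, the flat index can be turned back into a (row, column) pair with np.unravel_index. A sketch with the values of `b` copied from the output above:

```python
import numpy as np

b = np.array([[94, 43, 94, 90],
              [85, 31, 83, 13],
              [48,  1, 43, 58],
              [90, 44, 48, 76],
              [78, 15, 68, 14],
              [26, 21, 97, 43]])

flat_idx = np.argmax(b)                          # index into the flattened array
row, col = np.unravel_index(flat_idx, b.shape)   # convert back to 2D coordinates

print(flat_idx, row, col)
print(b[row, col])                               # the maximum value itself
```

If the maximum occurs more than once, argmax reports the first occurrence, which is why row 0 of `b` (two 94s) gives index 0 with axis=1.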

### np.argmin()

In [24]:
# np.argmin
np.argmin(a)

np.int64(2)

### np.cumsum()

numpy.cumsum() function is used when we want to compute the <u>cumulative sum of array elements</u> over a given axis.

https://numpy.org/doc/stable/reference/generated/numpy.cumsum.html

In [25]:
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [26]:
np.cumsum(a)

array([ 44, 104, 108, 131, 173, 241, 338, 342, 348, 425, 493, 541, 582,
       637, 655])

In [27]:
b

array([[94, 43, 94, 90],
       [85, 31, 83, 13],
       [48,  1, 43, 58],
       [90, 44, 48, 76],
       [78, 15, 68, 14],
       [26, 21, 97, 43]])

In [28]:
print(np.cumsum(b,axis=1))         # axis=1: running sum along each row
print(np.cumsum(b,axis=0))         # axis=0: running sum down each column

[[ 94 137 231 321]
 [ 85 116 199 212]
 [ 48  49  92 150]
 [ 90 134 182 258]
 [ 78  93 161 175]
 [ 26  47 144 187]]
[[ 94  43  94  90]
 [179  74 177 103]
 [227  75 220 161]
 [317 119 268 237]
 [395 134 336 251]
 [421 155 433 294]]


In [29]:
np.cumsum(b)

array([  94,  137,  231,  321,  406,  437,  520,  533,  581,  582,  625,
        683,  773,  817,  865,  941, 1019, 1034, 1102, 1116, 1142, 1163,
       1260, 1303])

### np.cumprod()

In [30]:
# np.cumprod
np.cumprod(a)

array([                  44,                 2640,                10560,
                     242880,             10200960,            693665280,
                67285532160,         269142128640,        1614852771840,
            124343663431680,     8455369113354240,   405857717441003520,
       -1806577658628407296, -7128050856014643200,   822293107703283712])
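
The negative entries in the output above are not real products: the running product exceeds the int64 range and wraps around silently. One possible workaround, sketched here under the assumption that either approximate floats or slower exact Python integers are acceptable, is to change the dtype:

```python
import math
import numpy as np

# Values copied from the example array `a`; int64 is set explicitly
a = np.array([44, 60, 4, 23, 42, 68, 97, 4, 6, 77, 68, 48, 41, 55, 18],
             dtype=np.int64)

wrapped = np.cumprod(a)                      # int64: wraps around on overflow
as_float = np.cumprod(a, dtype=np.float64)   # approximate, but no wraparound
as_object = np.cumprod(a.astype(object))     # exact Python ints, slower

print(as_object[-1] == math.prod(a.tolist()))   # exact final product matches
print(int(wrapped[-1]) == as_object[-1])        # the int64 result does not
```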

In [31]:
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

### np.percentile()

The numpy.percentile() function computes the q-th percentile of the given data (array elements) along the specified axis.

https://numpy.org/doc/stable/reference/generated/numpy.percentile.html

In [32]:
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [33]:
print(np.percentile(a,100))         # Max
print(np.percentile(a,50))          # Median
print(np.percentile(a,0))           # Min

97.0
44.0
4.0


In [34]:
np.median(a)

np.float64(44.0)
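
Several percentiles can be requested in one call, and values that fall between two data points are interpolated linearly by default. A minimal sketch on small made-up arrays (not the random `a`):

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

q1, q2, q3 = np.percentile(data, [25, 50, 75])   # quartiles in one call
iqr = q3 - q1                                    # interquartile range

print(q1, q2, q3)   # 2.0 3.0 4.0
print(iqr)          # 2.0

# With an even number of points the median lies between two values
print(np.percentile([1, 2, 3, 4], 50))   # 2.5
```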

### np.histogram()

The numpy.histogram() function computes the <u>frequency distribution of the data</u>. It returns the count in each bin together with the bin edges and does not draw a plot.

https://numpy.org/doc/stable/reference/generated/numpy.histogram.html

In [35]:
# code
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [36]:
print(np.histogram(a,bins=[0,50,100]))
print(np.histogram(a,bins=[0,10, 20, 30, 40, 50, 60, 70, 80, 90, 100]))

(array([9, 6]), array([  0,  50, 100]))
(array([3, 1, 1, 0, 4, 1, 3, 1, 0, 1]), array([  0,  10,  20,  30,  40,  50,  60,  70,  80,  90, 100]))
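
Instead of listing the edges by hand, bins can be an integer combined with a range. The sketch below uses the values of `a` copied from the output above and unpacks the two returned arrays:

```python
import numpy as np

a = np.array([44, 60, 4, 23, 42, 68, 97, 4, 6, 77, 68, 48, 41, 55, 18])

# 5 equal-width bins covering 0 to 100
counts, edges = np.histogram(a, bins=5, range=(0, 100))

print(counts)   # number of values in each bin
print(edges)    # bin boundaries, always one more entry than counts
```

Every bin is half-open except the last one, which also includes its right edge.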


### np.corrcoef()

Return Pearson product-moment correlation coefficients.

https://numpy.org/doc/stable/reference/generated/numpy.corrcoef.html

In [37]:
salary = np.array([20000,40000,25000,35000,60000])
experience = np.array([1,3,2,4,2])

print(np.corrcoef(salary,experience))

[[1.         0.25344572]
 [0.25344572 1.        ]]
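
The off-diagonal entry of the matrix is the Pearson coefficient between the two inputs. As a sanity check, it can be recomputed from the definition (covariance divided by the product of the standard deviations); the names r_manual and r_numpy are only illustrative.

```python
import numpy as np

salary = np.array([20000, 40000, 25000, 35000, 60000])
experience = np.array([1, 3, 2, 4, 2])

# Deviations from the mean
ds = salary - salary.mean()
de = experience - experience.mean()

r_manual = (ds * de).sum() / np.sqrt((ds ** 2).sum() * (de ** 2).sum())
r_numpy = np.corrcoef(salary, experience)[0, 1]

print(r_manual, r_numpy)   # both values agree
```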


### np.isin

numpy.isin() checks, for each element of one array, whether that <u>value</u> is present in a second array or list. The two can have different sizes, and the result is a boolean array with the shape of the first.

https://numpy.org/doc/stable/reference/generated/numpy.isin.html

In [38]:
# code
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [39]:
items = [10,20,30,40,50,60,70,80,90,100]

a[np.isin(a,items)]

array([60])
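
The boolean mask itself is often useful, and invert=True selects the elements that are not in the list. A short sketch with the values of `a` copied from the output above:

```python
import numpy as np

a = np.array([44, 60, 4, 23, 42, 68, 97, 4, 6, 77, 68, 48, 41, 55, 18])
items = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

mask = np.isin(a, items)                     # boolean array, same shape as a
not_in = a[np.isin(a, items, invert=True)]   # elements absent from items

print(mask.shape)
print(a[mask])
print(not_in.size)
```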

### np.flip

The numpy.flip() function reverses the order of array elements along the specified axis, preserving the shape of the array.

https://numpy.org/doc/stable/reference/generated/numpy.flip.html

In [40]:
# code
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [41]:
np.flip(a)

array([18, 55, 41, 48, 68, 77,  6,  4, 97, 68, 42, 23,  4, 60, 44])

In [42]:
b

array([[94, 43, 94, 90],
       [85, 31, 83, 13],
       [48,  1, 43, 58],
       [90, 44, 48, 76],
       [78, 15, 68, 14],
       [26, 21, 97, 43]])

In [43]:
print(np.flip(b,axis=1))
print(np.flip(b,axis=0))
print(np.flip(b))

[[90 94 43 94]
 [13 83 31 85]
 [58 43  1 48]
 [76 48 44 90]
 [14 68 15 78]
 [43 97 21 26]]
[[26 21 97 43]
 [78 15 68 14]
 [90 44 48 76]
 [48  1 43 58]
 [85 31 83 13]
 [94 43 94 90]]
[[43 97 21 26]
 [14 68 15 78]
 [76 48 44 90]
 [58 43  1 48]
 [13 83 31 85]
 [90 94 43 94]]
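
np.flip matches reversed slicing on the chosen axes, and it returns a view rather than a copy. A minimal sketch on a small made-up array:

```python
import numpy as np

b = np.arange(6).reshape(2, 3)

print(np.array_equal(np.flip(b, axis=0), b[::-1]))        # rows reversed
print(np.array_equal(np.flip(b, axis=1), b[:, ::-1]))     # columns reversed
print(np.array_equal(np.flip(b), b[::-1, ::-1]))          # both axes reversed

# The result is a view, so it shares memory with the original
print(np.shares_memory(np.flip(b), b))
```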


### np.put

The numpy.put() function <u>replaces specific elements</u> of an array with the given values. It modifies the array in place, and the indices refer to positions in the flattened array.

https://numpy.org/doc/stable/reference/generated/numpy.put.html

In [44]:
# code
a

array([44, 60,  4, 23, 42, 68, 97,  4,  6, 77, 68, 48, 41, 55, 18])

In [45]:
np.put(a,[0,1],[110,530])
a

array([110, 530,   4,  23,  42,  68,  97,   4,   6,  77,  68,  48,  41,
        55,  18])
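
The flattened indexing is clearer on a 2D array. In this sketch, grid is a made-up 2x3 array of zeros, so flat index 4 lands on row 1, column 1:

```python
import numpy as np

grid = np.zeros((2, 3), dtype=int)

result = np.put(grid, [0, 4], [7, 9])   # flat index 0 -> (0, 0), 4 -> (1, 1)

print(grid)     # grid itself has changed
print(result)   # np.put returns None
```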

### np.delete

The numpy.delete() function returns a new array with the sub-arrays at the given indices removed along the specified axis. Without an axis, it works on the flattened array.

https://numpy.org/doc/stable/reference/generated/numpy.delete.html

In [46]:
# code
a

array([110, 530,   4,  23,  42,  68,  97,   4,   6,  77,  68,  48,  41,
        55,  18])

In [47]:
np.delete(a,[0,2,4])

array([530,  23,  68,  97,   4,   6,  77,  68,  48,  41,  55,  18])
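
On a 2D array the axis decides whether rows or columns are removed. A minimal sketch on a small made-up array:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

no_row = np.delete(b, 1, axis=0)         # drop row 1
no_cols = np.delete(b, [0, 2], axis=1)   # drop columns 0 and 2
flat = np.delete(b, 0)                   # no axis: flatten, then drop element 0

print(no_row.shape)    # (2, 4)
print(no_cols.shape)   # (3, 2)
print(flat.shape)      # (11,)
```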

### Set functions

- np.union1d
- np.intersect1d
- np.setdiff1d
- np.setxor1d
- np.in1d (deprecated since NumPy 2.0; use np.isin)

In [48]:
m = np.array([1,2,3,4,5])
n = np.array([3,4,5,6,7])

np.union1d(m,n)

array([1, 2, 3, 4, 5, 6, 7])

In [49]:
np.intersect1d(m,n)

array([3, 4, 5])

In [50]:
print(np.setdiff1d(n,m))
print(np.setdiff1d(m,n))

[6 7]
[1 2]


In [51]:
np.setxor1d(m,n)

array([1, 2, 6, 7])

In [52]:
m[np.isin(m,1)]    # np.isin replaces np.in1d, which is deprecated since NumPy 2.0


array([1])
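
The set functions always return sorted, de-duplicated results, even when the inputs are unsorted or contain repeats. A short sketch with made-up inputs that differ from the `m` and `n` above:

```python
import numpy as np

m = np.array([5, 1, 1, 3])
n = np.array([3, 3, 7])

print(np.union1d(m, n))       # [1 3 5 7]
print(np.intersect1d(m, n))   # [3]
print(np.setdiff1d(m, n))     # [1 5]
print(np.setxor1d(m, n))      # [1 5 7]
```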

### np.clip

The numpy.clip() function limits the values in an array to a range: values below the minimum are set to the minimum, and values above the maximum are set to the maximum.

https://numpy.org/doc/stable/reference/generated/numpy.clip.html

In [53]:
# code
a

array([110, 530,   4,  23,  42,  68,  97,   4,   6,  77,  68,  48,  41,
        55,  18])

In [54]:
np.clip(a,a_min=25,a_max=75)

array([75, 75, 25, 25, 42, 68, 75, 25, 25, 75, 68, 48, 41, 55, 25])
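
Clipping is equivalent to taking an element-wise maximum with the lower bound and then a minimum with the upper bound, and either bound can be None to clip on one side only. A sketch with the values of `a` copied from the output above:

```python
import numpy as np

a = np.array([110, 530, 4, 23, 42, 68, 97, 4, 6, 77, 68, 48, 41, 55, 18])

clipped = np.clip(a, 25, 75)
manual = np.minimum(np.maximum(a, 25), 75)   # same result, written out by hand
lower_only = np.clip(a, 25, None)            # only a lower bound

print(np.array_equal(clipped, manual))
print(lower_only.min(), lower_only.max())
```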

In [55]:
# 17. np.swapaxes

In [56]:
# 18. np.random.uniform

In [57]:
# 19. np.count_nonzero

In [58]:
# 21. np.tile
# https://www.kaggle.com/code/abhayparashar31/best-numpy-functions-for-data-science-50?scriptVersionId=98816580

In [59]:
# 22. np.repeat
# https://towardsdatascience.com/10-numpy-functions-you-should-know-1dc4863764c5

In [60]:

# 25. np.allclose and equals
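
The remaining functions listed above have no worked cells yet. The following is only a rough sketch of what each one does, on made-up inputs. It assumes that "np.uniform" refers to uniform random sampling (shown here with a seeded Generator, seed chosen arbitrarily) and that "equals" refers to np.array_equal.

```python
import numpy as np

# np.swapaxes: exchange two axes of an array
x = np.arange(24).reshape(2, 3, 4)
print(np.swapaxes(x, 0, 2).shape)        # (4, 3, 2)

# Uniform random samples (assumed meaning of "np.uniform")
rng = np.random.default_rng(0)
u = rng.uniform(low=0.0, high=1.0, size=5)
print(u.shape)

# np.count_nonzero: how many entries are not zero
print(np.count_nonzero(np.array([0, 1, 2, 0, 3])))   # 3

# np.tile repeats the whole array, np.repeat repeats each element
v = np.array([1, 2, 3])
print(np.tile(v, 2))      # [1 2 3 1 2 3]
print(np.repeat(v, 2))    # [1 1 2 2 3 3]

# np.allclose tolerates rounding error, np.array_equal does not
print(np.allclose([0.1 + 0.2], [0.3]))
print(np.array_equal([0.1 + 0.2], [0.3]))
```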