- **These are some NumPy functions that are very useful but rarely discussed.**

### 1. `np.sort`

- Returns a sorted copy of an array; the original array is left unchanged.
- Unlike the built-in `sorted()`, which returns a list, `np.sort` returns a NumPy array.

https://numpy.org/doc/stable/reference/generated/numpy.sort.html

In [2]:
import numpy as np

In [2]:
# Getting 15 random numbers in 1d array

a = np.random.randint(1,100,15)
a

array([20, 27, 88, 51, 42, 73, 80, 40, 14, 52, 78, 48, 73, 61, 84])

In [3]:
# 2d array

b = np.random.randint(1,100,24).reshape(6,4)
b

array([[84, 83, 82, 21],
       [98, 70, 46, 37],
       [29, 83, 72,  6],
       [20, 44, 30, 73],
       [35,  6, 20, 21],
       [77, 94, 98,  3]])

In [4]:
# Sorted copy of a
# The default order is ascending

np.sort(a)

array([14, 20, 27, 40, 42, 48, 51, 52, 61, 73, 73, 78, 80, 84, 88])

In [5]:
# By default np.sort works along the last axis (axis=-1), so each row is sorted independently

np.sort(b)

array([[21, 82, 83, 84],
       [37, 46, 70, 98],
       [ 6, 29, 72, 83],
       [20, 30, 44, 73],
       [ 6, 20, 21, 35],
       [ 3, 77, 94, 98]])

In [6]:
# To sort each column independently, pass axis=0

np.sort(b,axis=0)

array([[20,  6, 20,  3],
       [29, 44, 30,  6],
       [35, 70, 46, 21],
       [77, 83, 72, 21],
       [84, 83, 82, 37],
       [98, 94, 98, 73]])

In [7]:
# np.sort has no descending option, so reverse the ascending result with [::-1]

np.sort(a)[::-1]

array([88, 84, 80, 78, 73, 73, 61, 52, 51, 48, 42, 40, 27, 20, 14])
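For a 2d array the reversing slice has to target the axis that was sorted. The following is a minimal sketch; `demo` is a made-up array for illustration, not one of the arrays above.

```python
import numpy as np

# Hypothetical 2d array used only for this illustration
demo = np.array([[3, 1, 2],
                 [9, 7, 8]])

# Sort each row ascending, then reverse the column order to get descending rows
rows_desc = np.sort(demo, axis=1)[:, ::-1]

# Sort each column ascending, then reverse the row order to get descending columns
cols_desc = np.sort(demo, axis=0)[::-1, :]

print(rows_desc)
print(cols_desc)
```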

### 2. `np.append`

- **numpy.append()** returns a new array with the values appended at the end along the given axis. If no axis is given, both inputs are flattened first.

https://numpy.org/doc/stable/reference/generated/numpy.append.html

In [8]:
# Appending 200 to the array a

np.append(a, 200)

array([ 20,  27,  88,  51,  42,  73,  80,  40,  14,  52,  78,  48,  73,
        61,  84, 200])

In [9]:
b

array([[84, 83, 82, 21],
       [98, 70, 46, 37],
       [29, 83, 72,  6],
       [20, 44, 30, 73],
       [35,  6, 20, 21],
       [77, 94, 98,  3]])

In [10]:
# Adding a column of ones at the end
# np.ones returns floats, so the result is upcast to float

np.append(b,np.ones((b.shape[0],1)), axis=1)

array([[84., 83., 82., 21.,  1.],
       [98., 70., 46., 37.,  1.],
       [29., 83., 72.,  6.,  1.],
       [20., 44., 30., 73.,  1.],
       [35.,  6., 20., 21.,  1.],
       [77., 94., 98.,  3.,  1.]])

In [11]:
# If we want to add a column at the end with random values between 0 and 1

np.append(b,np.random.random((b.shape[0],1)), axis=1)

array([[84.        , 83.        , 82.        , 21.        ,  0.80967253],
       [98.        , 70.        , 46.        , 37.        ,  0.32096948],
       [29.        , 83.        , 72.        ,  6.        ,  0.77742789],
       [20.        , 44.        , 30.        , 73.        ,  0.46492067],
       [35.        ,  6.        , 20.        , 21.        ,  0.99623787],
       [77.        , 94.        , 98.        ,  3.        ,  0.47447311]])
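Two details of `np.append` are easy to miss: without `axis` the inputs are flattened, and the original array is never modified. The sketch below illustrates both, and also shows one possible way to keep an integer dtype when adding a column of ones. The array `grid` is hypothetical.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # hypothetical integer array

# Without axis, both inputs are flattened before appending
flat = np.append(grid, [10, 20])

# Passing an integer column keeps the integer dtype (np.ones defaults to float)
ones_col = np.ones((grid.shape[0], 1), dtype=grid.dtype)
with_col = np.append(grid, ones_col, axis=1)

print(flat)            # 1d result
print(with_col.dtype)  # same dtype as grid
print(grid.shape)      # grid itself is unchanged
```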

### 3. `np.concatenate`

- **numpy.concatenate()** joins a sequence of arrays along an existing axis.
- It is mainly used with `2d` arrays, i.e. tabular data.

https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html

In [12]:
c = np.arange(6).reshape(2,3)
d = np.arange(6,12).reshape(2,3)

print(c)
print("\n")
print(d)

[[0 1 2]
 [3 4 5]]


[[ 6  7  8]
 [ 9 10 11]]


In [13]:
# Concatenating row-wise (the rows of d are stacked below c)

np.concatenate((c,d), axis=0)

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [14]:
# Concatenating column-wise (the columns of d are placed to the right of c)

np.concatenate((c,d), axis=1)

array([[ 0,  1,  2,  6,  7,  8],
       [ 3,  4,  5,  9, 10, 11]])

### 4. `np.unique`

- **np.unique()** returns the sorted unique values of the array passed to it.

https://numpy.org/doc/stable/reference/generated/numpy.unique.html

In [15]:
e = np.array([1,1,2,2,3,3,4,4,5,5,6,6])
e

array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6])

In [16]:
# Finding the unique items

np.unique(e)

array([1, 2, 3, 4, 5, 6])
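`np.unique` can also report how often each value occurs through its `return_counts` parameter. A short sketch with made-up data:

```python
import numpy as np

votes = np.array([3, 1, 3, 2, 1, 3])  # hypothetical data

# With return_counts=True a second array with the frequencies is returned
values, counts = np.unique(votes, return_counts=True)

print(values)  # sorted unique values
print(counts)  # occurrences of each value, in the same order
```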

### 5. `np.expand_dims`

- **numpy.expand_dims()** inserts a new axis of length 1 at the given position, so the array gains one dimension.
- It is used less often than the functions above.
- A common use is turning a 1d array into a **row vector** or a **column vector**.

https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html

In [20]:
print("The shape of array a is: ", a.shape)

The shape of array a is:  (15,)


In [21]:
print("The dimension of array a is: ", a.ndim)

The dimension of array a is:  1


In [23]:
# Now converting it into 2d array

np.expand_dims(a, axis=0).shape

(1, 15)

In [24]:
# So now the array will look like
# row vector

np.expand_dims(a, axis=0)

array([[20, 27, 88, 51, 42, 73, 80, 40, 14, 52, 78, 48, 73, 61, 84]])

In [25]:
# And if we want it to be 15X1

np.expand_dims(a, axis=1).shape

(15, 1)

In [26]:
# Now it looks like
# column vector

np.expand_dims(a, axis=1)

array([[20],
       [27],
       [88],
       [51],
       [42],
       [73],
       [80],
       [40],
       [14],
       [52],
       [78],
       [48],
       [73],
       [61],
       [84]])
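The same row and column vectors can be produced with `np.newaxis` indexing or `reshape`. The sketch below checks that these spellings agree and shows one reason to build such vectors: a column and a row broadcast against each other. The array `v` is a shortened, hypothetical stand-in for `a`.

```python
import numpy as np

v = np.array([20, 27, 88])  # short hypothetical 1d array

row = np.expand_dims(v, axis=0)  # shape (1, 3)
col = np.expand_dims(v, axis=1)  # shape (3, 1)

# Equivalent spellings with np.newaxis and reshape
same_row = np.array_equal(row, v[np.newaxis, :])
same_col = np.array_equal(col, v[:, np.newaxis]) and np.array_equal(col, v.reshape(-1, 1))

# A column plus a row broadcasts to a (3, 3) table of pairwise sums
pairwise = col + row

print(same_row, same_col)
print(pairwise.shape)
```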

### 6. `np.where`

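- With only a condition, **numpy.where()** returns the indices of the elements where the condition is satisfied. With three arguments, `np.where(condition, x, y)`, it returns an array that takes values from `x` where the condition is `True` and from `y` elsewhere.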

https://numpy.org/doc/stable/reference/generated/numpy.where.html

In [27]:
a

array([20, 27, 88, 51, 42, 73, 80, 40, 14, 52, 78, 48, 73, 61, 84])

In [29]:
# Find all indices with value greater than 50
# The result is a tuple holding one array of index positions per dimension

np.where(a>50)

(array([ 2,  3,  5,  6,  9, 10, 12, 13, 14], dtype=int64),)

In [30]:
# Now replace all values > 50 with 0
# Here the syntax is "(condition, value_if_true, value_if_false)"
# Positions where the condition is True get the number 0
# The remaining positions keep their value from a

np.where(a>50, 0, a)

array([20, 27,  0,  0, 42,  0,  0, 40, 14,  0,  0, 48,  0,  0,  0])

In [31]:
# replacing all the even numbers with 0

np.where(a%2 == 0, 0, a)

array([ 0, 27,  0, 51,  0, 73,  0,  0,  0,  0,  0,  0, 73, 61,  0])
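On a 2d array the one-argument form returns one index array per dimension, which explains the tuple in the output above. A minimal sketch, assuming a small made-up array `grid`:

```python
import numpy as np

grid = np.array([[10, 60],
                 [70, 20]])  # hypothetical 2d array

# One index array per dimension: row positions and column positions
rows, cols = np.where(grid > 50)
print(rows, cols)

# The pair can be used directly to index the matching elements
picked = grid[rows, cols]
print(picked)

# Boolean indexing gives the same values without the intermediate indices
print(grid[grid > 50])
```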

### 7. `np.argmax`

- **numpy.argmax()** returns the index of the maximum element along a given axis. Without an axis, it returns the index into the flattened array.

https://numpy.org/doc/stable/reference/generated/numpy.argmax.html

In [32]:
a = np.random.randint(1,100,15)
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [33]:
# Getting the index of the biggest number

np.argmax(a)

7

In [35]:
b = np.random.randint(1,100,24).reshape(6,4)
b

array([[55, 67, 89, 70],
       [89, 80, 58, 20],
       [30, 21, 70, 50],
       [77, 10, 38, 45],
       [43, 67, 58, 39],
       [62,  8, 49, 65]])

In [36]:
# Row index of the largest value in each column

np.argmax(b, axis=0)

array([1, 1, 0, 0], dtype=int64)

In [37]:
# Column index of the largest value in each row

np.argmax(b,axis=1)

array([2, 0, 2, 0, 1, 3], dtype=int64)

In [38]:
# np.argmin gives the index of the smallest value

np.argmin(a)

9
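When `np.argmax` is called on a 2d array without an axis, the result is an index into the flattened array. One way to turn it back into a (row, column) position is `np.unravel_index`, sketched here on a made-up array `grid`:

```python
import numpy as np

grid = np.array([[55, 67, 89],
                 [91, 80, 58]])  # hypothetical 2d array

# Index of the maximum in the flattened array
flat_index = np.argmax(grid)

# Convert the flat index into a (row, column) pair for this shape
row, col = np.unravel_index(flat_index, grid.shape)

print(flat_index, row, col, grid[row, col])
```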

### 8. `np.cumsum`

- **numpy.cumsum()** function is used when we want to compute the cumulative sum of array elements over a given axis.

https://numpy.org/doc/stable/reference/generated/numpy.cumsum.html

In [39]:
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [40]:
# Doing the cumulative sum

np.cumsum(a)

array([ 60, 123, 147, 194, 245, 289, 326, 424, 487, 492, 574, 590, 641,
       654, 731])

In [41]:
b

array([[55, 67, 89, 70],
       [89, 80, 58, 20],
       [30, 21, 70, 50],
       [77, 10, 38, 45],
       [43, 67, 58, 39],
       [62,  8, 49, 65]])

In [44]:
# With axis=1 the running sum goes along each row

np.cumsum(b, axis=1)

array([[ 55, 122, 211, 281],
       [ 89, 169, 227, 247],
       [ 30,  51, 121, 171],
       [ 77,  87, 125, 170],
       [ 43, 110, 168, 207],
       [ 62,  70, 119, 184]])

In [45]:
# With axis=0 the running sum goes down each column

np.cumsum(b, axis=0)

array([[ 55,  67,  89,  70],
       [144, 147, 147,  90],
       [174, 168, 217, 140],
       [251, 178, 255, 185],
       [294, 245, 313, 224],
       [356, 253, 362, 289]])

In [46]:
# Without an axis, the array is flattened first and the result is 1d

np.cumsum(b)

array([  55,  122,  211,  281,  370,  450,  508,  528,  558,  579,  649,
        699,  776,  786,  824,  869,  912,  979, 1037, 1076, 1138, 1146,
       1195, 1260])

In [48]:
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [49]:
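# np.cumprod gives the cumulative product
# The negative values below are integer overflow: from the sixth element on,
# the products exceed the 32-bit integer range used here and wrap around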

np.cumprod(a)

array([         60,        3780,       90720,     4263840,   217455840,
         978122368,  1830789248,  -971280128, -1061105920, -1010562304,
       -1261730304,  1287151616,  1220222976, -1316970496,  1672486912])
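The sign changes in the output above come from integer wraparound, which NumPy does not report for array operations. The sketch below reproduces the effect on purpose with a 32-bit accumulator and shows two possible workarounds through the `dtype` parameter. The `factors` list is hypothetical, chosen so that the true product is easy to verify.

```python
import numpy as np

factors = [100000, 100000, 100000]  # hypothetical values; the true product is 10**15

# Forcing a 32-bit accumulator reproduces the silent wraparound
wrapped = np.cumprod(factors, dtype=np.int32)

# A 64-bit accumulator is wide enough for this example
exact = np.cumprod(factors, dtype=np.int64)

# Floats avoid wraparound for even larger products, at the cost of rounding
approx = np.cumprod(factors, dtype=np.float64)

print(wrapped)
print(exact)
print(approx)
```

For the 15-element array `a` above, even 64-bit integers would be too narrow, so a float dtype is the more realistic choice there.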

### 9. `np.percentile`

- **numpy.percentile()** computes the `nth` percentile of the given data (array elements) along the specified axis.

https://numpy.org/doc/stable/reference/generated/numpy.percentile.html

In [50]:
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [51]:
# Getting the 50th percentile
# This is the value with half of the data below it and half above it

np.percentile(a, 50)

51.0

In [52]:
# Getting the same with median()

np.median(a)

51.0

In [53]:
# The 100th percentile is the largest value

np.percentile(a, 100)

98.0

In [54]:
# The 0th percentile is the smallest value

np.percentile(a, 0)

5.0
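Several percentiles can be requested in one call by passing a list. When a percentile falls between two data points, the default behaviour is linear interpolation between them, which is why the first quartile below is not a value from the array. This sketch reuses the values of `a` shown above.

```python
import numpy as np

# Same values as the array a used in this section
a = np.array([60, 63, 24, 47, 51, 44, 37, 98, 63, 5, 82, 16, 51, 13, 77])

# Quartiles in a single call
q1, q2, q3 = np.percentile(a, [25, 50, 75])

print(q1, q2, q3)  # 30.5 51.0 63.0
```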

### 10. `np.histogram`

- **numpy.histogram()** computes the histogram of a dataset. It returns the count in each bin and the bin edges; it does not draw a plot.

https://numpy.org/doc/stable/reference/generated/numpy.histogram.html

In [55]:
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [57]:
# Counting the values in bins of width 10
# The result is a pair: (counts per bin, bin edges)

np.histogram(a, bins=[0,10,20,30,40, 50,60,70,80,90, 100])

(array([1, 2, 1, 1, 2, 2, 3, 1, 1, 1], dtype=int64),
 array([  0,  10,  20,  30,  40,  50,  60,  70,  80,  90, 100]))

In [58]:
# The same with two bins: [0, 50) and [50, 100]
# Every bin is half-open except the last one, which includes its right edge

np.histogram(a, bins=[0, 50, 100])

(array([7, 8], dtype=int64), array([  0,  50, 100]))
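The way bin edges are handled matters for values that sit exactly on an edge. A small sketch with made-up values on and near the edges:

```python
import numpy as np

edge_values = np.array([0, 49, 50, 100])  # hypothetical values on or near the bin edges

counts, edges = np.histogram(edge_values, bins=[0, 50, 100])

# 0 and 49 land in [0, 50); 50 and 100 land in [50, 100]
print(counts)
print(edges)
```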

### 11. `np.corrcoef`

- Returns the matrix of Pearson product-moment correlation coefficients.

https://numpy.org/doc/stable/reference/generated/numpy.corrcoef.html

In [59]:
salary = np.array([20000,40000,25000,35000,60000])
experience = np.array([1,3,2,4,2])


# Correlation between salary and experience
# The off-diagonal entries give the correlation, about 0.25
np.corrcoef(salary, experience)

array([[1.        , 0.25344572],
       [0.25344572, 1.        ]])
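To see where the off-diagonal value comes from, the Pearson coefficient can be computed by hand from the deviations around each mean. This is only a sketch of the textbook formula, not a description of how NumPy computes it internally.

```python
import numpy as np

salary = np.array([20000, 40000, 25000, 35000, 60000])
experience = np.array([1, 3, 2, 4, 2])

# Deviations from the mean
ds = salary - salary.mean()
de = experience - experience.mean()

# Pearson r = sum(ds * de) / sqrt(sum(ds**2) * sum(de**2))
r_manual = (ds * de).sum() / np.sqrt((ds ** 2).sum() * (de ** 2).sum())

# Off-diagonal entry of the correlation matrix
r_numpy = np.corrcoef(salary, experience)[0, 1]

print(r_manual, r_numpy)
```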

### 12. `np.isin`

- **numpy.isin()** checks, element by element, whether each value of one array is present in a second array or list, which may have a different size. It returns a boolean array with the shape of the first array.
- It lets us search for multiple items in an array in one go.

https://numpy.org/doc/stable/reference/generated/numpy.isin.html

In [60]:
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [61]:
items = [10,20,30,40,50,60,70,80,90,100]

a[np.isin(a,items)]

array([60])
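The boolean mask itself is often as useful as the filtered values, and the `invert` parameter selects the elements that are not in the list. A minimal sketch with shortened, hypothetical inputs:

```python
import numpy as np

values = np.array([60, 63, 24, 47])  # hypothetical array
wanted = [60, 24]                    # hypothetical lookup list

# True where the element of values appears in wanted
mask = np.isin(values, wanted)

# invert=True flips the test: keep the elements that are NOT in wanted
others = values[np.isin(values, wanted, invert=True)]

print(mask)
print(others)
```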

### 13. `np.flip`

- The **numpy.flip()** function reverses the order of array elements along the specified axis, preserving the shape of the array.

https://numpy.org/doc/stable/reference/generated/numpy.flip.html

In [62]:
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [63]:
# Flipping the array

np.flip(a)

array([77, 13, 51, 16, 82,  5, 63, 98, 37, 44, 51, 47, 24, 63, 60])

In [64]:
b

array([[55, 67, 89, 70],
       [89, 80, 58, 20],
       [30, 21, 70, 50],
       [77, 10, 38, 45],
       [43, 67, 58, 39],
       [62,  8, 49, 65]])

In [65]:
# Flipping along axis 0 reverses the order of the rows

np.flip(b, axis=0)

array([[62,  8, 49, 65],
       [43, 67, 58, 39],
       [77, 10, 38, 45],
       [30, 21, 70, 50],
       [89, 80, 58, 20],
       [55, 67, 89, 70]])

In [66]:
# Flipping along axis 1 reverses the order of the columns

np.flip(b, axis=1)

array([[70, 89, 67, 55],
       [20, 58, 80, 89],
       [50, 70, 21, 30],
       [45, 38, 10, 77],
       [39, 58, 67, 43],
       [65, 49,  8, 62]])

In [68]:
# If no axis is given, the array is flipped along every axis

np.flip(b)

array([[65, 49,  8, 62],
       [39, 58, 67, 43],
       [45, 38, 10, 77],
       [50, 70, 21, 30],
       [20, 58, 80, 89],
       [70, 89, 67, 55]])

### 14. `np.put`

- The **numpy.put()** function replaces the elements at the given indices of an array with the given values. It modifies the array in place and returns `None`.
- The indices refer to positions in the flattened array.

https://numpy.org/doc/stable/reference/generated/numpy.put.html

In [67]:
a

array([60, 63, 24, 47, 51, 44, 37, 98, 63,  5, 82, 16, 51, 13, 77])

In [69]:
# Replacing the values at index 0 and 1 with 110 and 530
# Syntax is: (array_name, [list of index positions], [list of values])
# The change is made in place, so a itself is modified

np.put(a, [0,1], [110,530])

In [70]:
# Now if we see the array

a

array([110, 530,  24,  47,  51,  44,  37,  98,  63,   5,  82,  16,  51,
        13,  77])
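Because `np.put` indexes the flattened array, a single integer addresses one cell even in a 2d array. The sketch below uses a made-up array `grid` and also shows that the function returns `None`.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # hypothetical 2d array: [[0, 1, 2], [3, 4, 5]]

# Flat index 4 is row 1, column 1 in a (2, 3) array
result = np.put(grid, [4], [99])

print(result)  # np.put returns None
print(grid)    # grid was modified in place
```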

### 15. `np.delete`

- The **numpy.delete()** function returns a new array with the sub-arrays at the given indices removed along the specified axis. The original array is not modified.

https://numpy.org/doc/stable/reference/generated/numpy.delete.html

In [71]:
a

array([110, 530,  24,  47,  51,  44,  37,  98,  63,   5,  82,  16,  51,
        13,  77])

In [72]:
# Deleting the elements at index 0, 2 and 4
# Syntax is: (array_name, [list of index positions to delete])
# a itself is unchanged; assign the result to a variable to keep it

np.delete(a, [0,2,4])

array([530,  47,  44,  37,  98,  63,   5,  82,  16,  51,  13,  77])
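With a 2d array the `axis` argument decides whether rows or columns are removed; without it, the array is flattened first. A short sketch on a hypothetical array `grid`:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # hypothetical 2d array

no_first_row = np.delete(grid, 0, axis=0)      # drop row 0
no_mid_cols = np.delete(grid, [1, 2], axis=1)  # drop columns 1 and 2
flattened = np.delete(grid, 0)                 # no axis: works on the flattened array

print(no_first_row.shape, no_mid_cols.shape, flattened.shape)
print(grid.shape)  # the original is untouched
```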

### 16. Set functions

- These functions return their results as NumPy arrays, not as Python sets.
  - **np.union1d**
  - **np.intersect1d**
  - **np.setdiff1d**
  - **np.setxor1d**
  - **np.in1d** (deprecated since NumPy 2.0 in favour of **np.isin**)

In [73]:
m = np.array([1,2,3,4,5])
n = np.array([3,4,5,6,7])

print(m)
print(n)

[1 2 3 4 5]
[3 4 5 6 7]


In [74]:
# union

np.union1d(m,n)

array([1, 2, 3, 4, 5, 6, 7])

In [75]:
# intersection

np.intersect1d(m,n)

array([3, 4, 5])

In [76]:
# difference: items in n that are not in m

np.setdiff1d(n,m)

array([6, 7])

In [77]:
# symmetric difference: items that are in exactly one of the two arrays

np.setxor1d(m,n)

array([1, 2, 6, 7])

In [78]:
# Checking, for each element of m, whether it is present in the second argument
# Here: which elements of m are equal to 1

np.in1d(m, 1)

array([ True, False, False, False, False])

In [79]:
# Getting the item 1 using the above logic as filter

m[np.in1d(m, 1)]

array([1])

### 17. `np.clip`

- **numpy.clip()** clips (limits) the values in an array to a given interval.

https://numpy.org/doc/stable/reference/generated/numpy.clip.html

In [80]:
a

array([110, 530,  24,  47,  51,  44,  37,  98,  63,   5,  82,  16,  51,
        13,  77])

In [82]:
# Clipping the values to the range 25 to 75
# Values below the minimum are replaced by the minimum
# Values above the maximum are replaced by the maximum

np.clip(a, a_min=25, a_max=75)

array([75, 75, 25, 47, 51, 44, 37, 75, 63, 25, 75, 25, 51, 25, 75])
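Clipping on one side only is possible by passing `None` for the other bound. The sketch below passes the bounds positionally and uses a made-up array `values`.

```python
import numpy as np

values = np.array([110, 530, 24, 47, 5])  # hypothetical values

capped = np.clip(values, None, 75)   # only an upper bound
floored = np.clip(values, 25, None)  # only a lower bound

print(capped)
print(floored)
```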

### 18. `np.swapaxes`

- It interchanges two axes of an array.

https://numpy.org/doc/stable/reference/generated/numpy.swapaxes.html

In [3]:
x = np.array([[1,2,3]])
x

array([[1, 2, 3]])

In [4]:
# Here the syntax is "(array_name, axis1, axis2)"

np.swapaxes(x, 0, 1)

array([[1],
       [2],
       [3]])

In [5]:
# Another example

x = np.array([[[0,1],[2,3]],[[4,5],[6,7]]])
x

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [6]:
np.swapaxes(x, 0, 2)

array([[[0, 4],
        [2, 6]],

       [[1, 5],
        [3, 7]]])

### 19. `np.random.uniform`

- It draws samples from a uniform distribution over the half-open interval `[low, high)`.
- It belongs to NumPy's `random` module; there is no top-level `np.uniform`.

https://numpy.org/doc/stable/reference/random/generated/numpy.random.uniform.html

In [7]:
# Syntax : numpy.random.uniform(low=0.0, high=1.0, size=None)

np.random.uniform(-5, 5, 100)

array([-1.9635658 ,  3.16145569, -4.21733583, -2.86265512,  3.52017265,
        4.81627106,  0.25792731,  3.74349264, -0.64903851, -2.72207614,
       -0.91719735, -2.59729104,  1.8075472 ,  1.9549762 , -1.39062413,
        2.54245382,  3.70792803,  0.44764339,  2.17887749, -1.50869446,
       -0.190727  , -1.62141717, -3.00887454, -0.31848572,  1.92718225,
        0.47014419,  1.81400861,  2.74089592,  2.99734316,  0.25013516,
        1.6762288 ,  0.35625612,  2.14408079, -3.13967813,  0.941644  ,
        1.37039114,  3.52072017,  0.27229069,  1.94957195, -1.280962  ,
        0.39116906,  4.87252026, -1.68803444,  2.10243881,  0.75554701,
       -1.0056304 ,  4.47831652, -1.6047053 ,  2.4591861 , -0.48502428,
       -4.85061949, -2.34439495,  3.73338302, -1.42835918, -3.32538073,
       -1.77637587, -3.73064729,  2.1352968 ,  0.38832276, -2.74951265,
       -0.88938952, -3.87081215,  2.49647514, -3.26302722,  2.81946983,
        3.62967207,  0.50318911,  1.35705296,  2.80045569,  1.17

### 20. `np.count_nonzero`

- It counts the number of non-zero values in the array.

https://numpy.org/doc/stable/reference/generated/numpy.count_nonzero.html

In [8]:
a = np.array([[0, 1, 7, 0],[3, 0, 2, 19]])
a

array([[ 0,  1,  7,  0],
       [ 3,  0,  2, 19]])

In [9]:
np.count_nonzero(a)

5
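`np.count_nonzero` also accepts an `axis`, and because `True` counts as non-zero it can count how many elements satisfy a condition. A short sketch reusing the array from this section:

```python
import numpy as np

a = np.array([[0, 1, 7, 0], [3, 0, 2, 19]])

per_column = np.count_nonzero(a, axis=0)  # non-zero entries in each column
per_row = np.count_nonzero(a, axis=1)     # non-zero entries in each row
above_two = np.count_nonzero(a > 2)       # number of elements greater than 2

print(per_column, per_row, above_two)
```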

### 21. `np.tile`

- Constructs an array by repeating `A` the number of times given by `reps`.
- Syntax is:
> `np.tile(A, reps)`

https://numpy.org/doc/stable/reference/generated/numpy.tile.html

https://www.kaggle.com/code/abhayparashar31/best-numpy-functions-for-data-science-50?scriptVersionId=98816580

In [10]:
np.tile("Arunava", 5)

array(['Arunava', 'Arunava', 'Arunava', 'Arunava', 'Arunava'], dtype='<U7')

### 22. `np.repeat`

- Repeats each element of an array. With the default `axis=None` the input is flattened first. Syntax is:
> `np.repeat(a, repeats, axis=None)`

https://numpy.org/doc/stable/reference/generated/numpy.repeat.html

https://towardsdatascience.com/10-numpy-functions-you-should-know-1dc4863764c5

In [11]:
x = np.array([[1,2],[3,4]])
x

array([[1, 2],
       [3, 4]])

In [12]:
np.repeat(x, 2)

array([1, 1, 2, 2, 3, 3, 4, 4])
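Passing an `axis` keeps the array 2d and repeats whole rows or columns instead of flattening. A minimal sketch with the same `x`:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

rows_repeated = np.repeat(x, 2, axis=0)  # each row appears twice
cols_repeated = np.repeat(x, 2, axis=1)  # each column appears twice

print(rows_repeated)
print(cols_repeated)
```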

### 23. `np.allclose` and `equals`

- **np.allclose()** returns `True` if two arrays are element-wise equal within a tolerance.
- Two values count as close when `|a - b| <= atol + rtol * |b|`, where `rtol` is the relative and `atol` the absolute tolerance. The arrays must have the same shape or be broadcastable to a common shape.

https://numpy.org/doc/stable/reference/generated/numpy.allclose.html



In [13]:
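a = np.array([0.25,0.4,0.6,0.32])
b = np.array([0.26,0.3,0.7,0.32])
tolerance = 0.1

# The third positional argument is rtol, the relative tolerance (scaled by |b|),
# not an absolute difference, which is why the result is False here
np.allclose(a, b, rtol=tolerance)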

False
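`np.isclose` applies the same test element by element, which shows which pairs fail. The sketch below also tries a purely absolute tolerance through `atol`; the value 0.11 is an arbitrary choice for this illustration, picked slightly above 0.1 to leave room for floating-point rounding.

```python
import numpy as np

a = np.array([0.25, 0.4, 0.6, 0.32])
b = np.array([0.26, 0.3, 0.7, 0.32])

# Element-wise view of the relative test: |a - b| <= atol + rtol * |b|
per_element = np.isclose(a, b, rtol=0.1)
print(per_element)

# Purely absolute test; 0.11 rather than 0.1 leaves room for float rounding
# in differences such as 0.4 - 0.3
abs_close = np.allclose(a, b, rtol=0, atol=0.11)
print(abs_close)
```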