## NumPy Tricks
- A collection of NumPy functions that are useful for data analysis.

In [1]:
import numpy as np 

In [9]:
# Two arrays are used throughout the notebook.

# 1D array. 
a = np.random.randint(1, 100, 20, dtype = np.int64)

# 2D Array. 
b = np.random.randint(1, 100, 20).reshape(4, 5)

### np.sort

Return a sorted copy of a numpy array.

https://numpy.org/doc/stable/reference/generated/numpy.sort.html

In [15]:
# sort one dimension array in ascending order. 
np.sort(a)

array([10, 11, 14, 20, 22, 22, 26, 35, 40, 51, 52, 59, 66, 72, 75, 77, 81,
       94, 94, 94])

In [16]:
# sort a one-dimensional array in descending order.
np.sort(a)[ : : -1]

array([94, 94, 94, 81, 77, 75, 72, 66, 59, 52, 51, 40, 35, 26, 22, 22, 20,
       14, 11, 10])

In [24]:
# sort a two-dimensional array in ascending order.
# by default np.sort sorts along the last axis (each row independently); pass axis = 0 to sort each column instead.
np.sort(b)
# np.sort(b, axis = 1)

array([[ 4, 12, 42, 74, 94],
       [12, 42, 45, 55, 56],
       [ 5,  9, 10, 78, 86],
       [15, 24, 36, 47, 82]], dtype=int32)

In [25]:
# sort each column independently (axis = 0).
np.sort(b, axis = 0)

array([[47,  5, 10,  4, 12],
       [55, 42, 12,  9, 36],
       [74, 82, 15, 24, 42],
       [86, 94, 45, 56, 78]], dtype=int32)
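
np.sort returns a copy, while the ndarray.sort method sorts in place. The sketch below uses a small hypothetical array `demo` (not one of the notebook arrays) and also shows one way to get each row in descending order.

```python
import numpy as np

demo = np.array([[3, 1, 2],
                 [9, 7, 8]])          # hypothetical data for illustration

sorted_copy = np.sort(demo, axis=1)   # new array; demo itself is unchanged
row_desc = sorted_copy[:, ::-1]       # reverse each sorted row -> descending order

in_place = demo.copy()
in_place.sort(axis=1)                 # ndarray.sort sorts in place and returns None

print(row_desc)
```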

### np.argsort 
- This function returns the indices that would sort the array.

In [5]:
import numpy as np 
arr = np.random.randint(1, 20, 10)
print(arr)

np.argsort(arr)

[10 13  9 15 16 13 17 16 10 15]


array([2, 0, 8, 1, 5, 3, 9, 4, 7, 6])
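
A common use of these indices is to reorder the array itself or to pick the largest few elements. A minimal sketch with hypothetical values in `scores`:

```python
import numpy as np

scores = np.array([10, 13, 9, 15, 16])   # hypothetical values
order = np.argsort(scores)               # indices that would sort the array
sorted_scores = scores[order]            # same result as np.sort(scores)
top2 = order[-2:][::-1]                  # indices of the two largest, largest first

print(order, sorted_scores, top2)
```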

### np.append

numpy.append() returns a new array with the values appended at the end along the given axis. If no axis is given, both inputs are flattened first.

https://numpy.org/doc/stable/reference/generated/numpy.append.html

In [26]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [29]:
# append a value to the end of a 1D array.
np.append(a, 200)

array([ 35,  59,  51,  94,  40,  10,  26,  75,  52,  72,  66,  20,  81,
        77,  22,  14,  11,  94,  94,  22, 200])

In [30]:
b

array([[74, 94, 12,  4, 42],
       [55, 42, 45, 56, 12],
       [86,  5, 10,  9, 78],
       [47, 82, 15, 24, 36]], dtype=int32)

In [58]:
temp = np.random.random((b.shape[0], 1))
# print(temp)

x = np.append(b, temp, axis = 1)
x

array([[74.        , 94.        , 12.        ,  4.        , 42.        ,
         0.14669871],
       [55.        , 42.        , 45.        , 56.        , 12.        ,
         0.49814273],
       [86.        ,  5.        , 10.        ,  9.        , 78.        ,
         0.28503553],
       [47.        , 82.        , 15.        , 24.        , 36.        ,
         0.89380935]])
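
In the cell above the integer columns became floats because the result must have a single dtype. The sketch below, using a hypothetical array `grid`, illustrates two other points: np.append never modifies its input, and the `axis` argument decides whether the inputs are flattened.

```python
import numpy as np

grid = np.array([[1, 2],
                 [3, 4]])                      # hypothetical data

flat = np.append(grid, [5, 6])                 # no axis: both inputs are flattened first
new_row = np.append(grid, [[5, 6]], axis=0)    # with axis: values must match grid's other dimensions

print(flat)
print(new_row)
print(grid)                                    # unchanged
```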

### np.concatenate

- numpy.concatenate() joins a sequence of arrays along an existing axis.
- It is a general alternative to the stacking functions such as np.vstack and np.hstack.

https://numpy.org/doc/stable/reference/generated/numpy.concatenate.html

In [82]:
# one dim array.

arr = np.random.randint(1, 50, 5)
np.concatenate((a, arr))

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22, 34, 15, 43, 17, 47])

In [84]:
# for 2D array. 

arr1 = np.arange(0, 12).reshape(3, 4)
arr2 = np.arange(20, 32).reshape(3, 4)

In [85]:
# axis = 0 joins along the rows: arr2 is placed below arr1.
np.concatenate((arr1, arr2), axis = 0)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [86]:
# axis = 1 joins along the columns: arr2 is placed to the right of arr1.
np.concatenate((arr1, arr2), axis = 1)

array([[ 0,  1,  2,  3, 20, 21, 22, 23],
       [ 4,  5,  6,  7, 24, 25, 26, 27],
       [ 8,  9, 10, 11, 28, 29, 30, 31]])
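
To relate this to stacking: for 2D inputs, np.vstack and np.hstack give the same result as concatenating along axis 0 and axis 1, while np.stack creates a new axis instead. A short sketch that rebuilds the same `arr1` and `arr2`:

```python
import numpy as np

arr1 = np.arange(0, 12).reshape(3, 4)
arr2 = np.arange(20, 32).reshape(3, 4)

rows = np.concatenate((arr1, arr2), axis=0)
cols = np.concatenate((arr1, arr2), axis=1)

same_rows = np.array_equal(rows, np.vstack((arr1, arr2)))
same_cols = np.array_equal(cols, np.hstack((arr1, arr2)))
stacked = np.stack((arr1, arr2))   # new leading axis, shape (2, 3, 4)

print(same_rows, same_cols, stacked.shape)
```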

### np.unique

np.unique() returns the sorted unique values of the array passed to it. A multi-dimensional array is flattened unless an axis is given.

https://numpy.org/doc/stable/reference/generated/numpy.unique.html

In [94]:
# from 1D array
arr = np.array([1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 45, 45, 6, 6, 6, 7, 7, 7, 9, 9, 0])
np.unique(arr)

array([ 0,  1,  2,  3,  4,  6,  7,  9, 45])

In [95]:
# from 2D array. 
arr = np.array([[1, 1, 1], [2, 2, 2], [1, 2, 3]])
np.unique(arr)

array([1, 2, 3])
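
Two options are often useful in analysis: `return_counts=True` gives the frequency of each unique value, and `axis=0` treats whole rows as the items to deduplicate. A minimal sketch with hypothetical arrays:

```python
import numpy as np

arr = np.array([1, 2, 2, 3, 3, 3])                    # hypothetical data
values, counts = np.unique(arr, return_counts=True)   # unique values and their frequencies

grid = np.array([[1, 1, 1], [2, 2, 2], [1, 1, 1]])
unique_rows = np.unique(grid, axis=0)                 # deduplicate whole rows

print(values, counts)
print(unique_rows)
```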

### np.expand_dims

numpy.expand_dims() inserts a new axis of length 1 at the given position, so a 1D array of shape (n,) becomes (1, n) for axis = 0 or (n, 1) for axis = 1.

https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html

In [99]:
a.shape

(20,)

In [106]:
row_wise_expand = np.expand_dims(a, axis = 0)
column_wise_expand = np.expand_dims(a, axis = 1)


row_wise_expand.shape     # (1, 20); not displayed, since only the last expression in a cell is echoed
column_wise_expand.shape

(20, 1)
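
The same result can be obtained by indexing with np.newaxis. The sketch below prints both shapes for a hypothetical 1D array `v` and checks the equivalence.

```python
import numpy as np

v = np.arange(5)                    # hypothetical 1D array, shape (5,)

row = np.expand_dims(v, axis=0)     # shape (1, 5)
col = np.expand_dims(v, axis=1)     # shape (5, 1)

same_row = np.array_equal(row, v[np.newaxis, :])
same_col = np.array_equal(col, v[:, np.newaxis])

print(row.shape, col.shape, same_row, same_col)
```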

### np.where

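- With only a condition, numpy.where() returns the indices of the elements where the condition is satisfied.
- With three arguments, it picks values from the second argument where the condition is True and from the third where it is False.
- It is closely related to boolean indexing with a mask array.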

https://numpy.org/doc/stable/reference/generated/numpy.where.html

In [110]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [115]:
# find the indices of the even numbers that are greater than 50.

# first build the boolean mask.
(a > 50) & (a % 2 == 0)

array([False, False, False,  True, False, False, False, False,  True,
        True,  True, False, False, False, False, False, False,  True,
        True, False])

In [116]:
np.where((a > 50) & (a % 2 == 0))

(array([ 3,  8,  9, 10, 17, 18]),)

In [118]:
# replace all values > 50 with 0. 

# syntax ---> np.where(condition, value_if_true, value_if_false)
np.where(a > 50 , 0, a)

array([35,  0,  0,  0, 40, 10, 26,  0,  0,  0,  0, 20,  0,  0, 22, 14, 11,
        0,  0, 22])
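
The three uses side by side, as a sketch on a hypothetical array `vals`: the mask gives values through boolean indexing, the one-argument np.where gives positions, and the three-argument form builds a new array.

```python
import numpy as np

vals = np.array([35, 59, 94, 40, 52])        # hypothetical data
mask = (vals > 50) & (vals % 2 == 0)

idx = np.where(mask)[0]                 # one-argument form returns a tuple; [0] is the index array
picked = vals[mask]                     # boolean indexing returns the values directly
capped = np.where(vals > 50, 0, vals)   # 0 where the condition holds, original value elsewhere

print(idx, picked, capped)
```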

### np.argmax

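The numpy.argmax() function returns the index of the maximum element along a given axis. Without an axis it returns the index into the flattened array, and ties are resolved in favor of the first occurrence.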

https://numpy.org/doc/stable/reference/generated/numpy.argmax.html

In [122]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [123]:
b

array([[74, 94, 12,  4, 42],
       [55, 42, 45, 56, 12],
       [86,  5, 10,  9, 78],
       [47, 82, 15, 24, 36]], dtype=int32)

In [124]:
np.argmax(a)

np.int64(3)

In [126]:
np.argmax(b, axis=0)

array([2, 0, 1, 1, 2])

In [127]:
np.argmax(b, axis = 1)

array([1, 3, 0, 1])

### np.argmin() works the same way as np.argmax() but returns the index of the minimum.
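
A short sketch of np.argmin, mirroring the np.argmax cells. The array `b_demo` is a copy of the values displayed for `b` earlier in the notebook, since `b` itself is random.

```python
import numpy as np

# Values copied from the output of `b` shown earlier in the notebook.
b_demo = np.array([[74, 94, 12,  4, 42],
                   [55, 42, 45, 56, 12],
                   [86,  5, 10,  9, 78],
                   [47, 82, 15, 24, 36]])

flat_min = np.argmin(b_demo)          # index into the flattened array
per_col = np.argmin(b_demo, axis=0)   # row index of the minimum in each column
per_row = np.argmin(b_demo, axis=1)   # column index of the minimum in each row

print(flat_min, per_col, per_row)
```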

### np.cumsum

numpy.cumsum() computes the cumulative sum of array elements along a given axis.

https://numpy.org/doc/stable/reference/generated/numpy.cumsum.html

In [135]:
# 1D array.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
np.cumsum(arr)

array([ 1,  3,  6, 10, 15, 21, 28, 36])

In [138]:
# 2D array: axis = 0 accumulates down the rows, giving a running sum for each column.
np.cumsum(b, axis=0)

array([[ 74,  94,  12,   4,  42],
       [129, 136,  57,  60,  54],
       [215, 141,  67,  69, 132],
       [262, 223,  82,  93, 168]])

In [139]:
# 2D array: axis = 1 accumulates across the columns, giving a running sum for each row.
np.cumsum(b, axis=1)

array([[ 74, 168, 180, 184, 226],
       [ 55,  97, 142, 198, 210],
       [ 86,  91, 101, 110, 188],
       [ 47, 129, 144, 168, 204]])

### np.cumprod() works the same way as np.cumsum() but computes the cumulative product.
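
A minimal sketch of np.cumprod on hypothetical arrays, using the same axis convention as the np.cumsum cells:

```python
import numpy as np

arr = np.array([1, 2, 3, 4])              # hypothetical data
running = np.cumprod(arr)                 # 1, 1*2, 1*2*3, 1*2*3*4

grid = np.array([[1, 2],
                 [3, 4]])
down_cols = np.cumprod(grid, axis=0)      # running product down each column
across_rows = np.cumprod(grid, axis=1)    # running product across each row

print(running)
print(down_cols)
print(across_rows)
```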

### np.percentile

numpy.percentile() computes the q-th percentile of the data (array elements) along the specified axis. The 50th percentile is the median.

https://numpy.org/doc/stable/reference/generated/numpy.percentile.html

In [141]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [147]:
# 1D array
np.percentile(a, 50)

np.float64(51.5)

In [148]:
# 2D array: axis = 0 gives one percentile per column.
np.percentile(b, 50, axis=0)

array([64.5, 62. , 13.5, 16.5, 39. ])

In [149]:
# 2D array: axis = 1 gives one percentile per row.
np.percentile(b, 50, axis=1)

array([42., 45., 10., 36.])
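
Several percentiles can be requested in one call by passing a list. The sketch below uses a hypothetical array `data` and the default linear interpolation, and checks that the 50th percentile matches np.median.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])               # hypothetical data
q25, q50, q75 = np.percentile(data, [25, 50, 75])   # quartiles in a single call

print(q25, q50, q75)
print(q50 == np.median(data))
```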

### np.histogram

numpy.histogram() computes the frequency distribution of the data. It does not draw anything: it returns a tuple of two arrays, the count in each bin and the bin edges.

https://numpy.org/doc/stable/reference/generated/numpy.histogram.html

In [150]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [152]:
np.histogram(a, bins=[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

(array([0, 3, 4, 1, 1, 3, 1, 3, 1, 3]),
 array([  0,  10,  20,  30,  40,  50,  60,  70,  80,  90, 100]))

In [154]:
np.histogram(a, bins=[0, 50, 100])

(array([ 9, 11]), array([  0,  50, 100]))
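
Two details worth checking on a small case: every bin is half-open `[left, right)` except the last one, which also includes its right edge, and `bins` can be an integer, in which case the edges are spread evenly between the minimum and maximum. A sketch on a hypothetical array `data`:

```python
import numpy as np

data = np.array([10, 20, 50, 94, 100])     # hypothetical data

# Explicit edges: 50 falls in the second bin, and 100 is kept because the last bin is closed.
counts, edges = np.histogram(data, bins=[0, 50, 100])

# Integer bins: two equal-width bins between data.min() and data.max().
counts2, edges2 = np.histogram(data, bins=2)

print(counts, edges)
print(counts2, edges2)
```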

### np.corrcoef

Return Pearson product-moment correlation coefficients.

https://numpy.org/doc/stable/reference/generated/numpy.corrcoef.html

In [155]:
salary = np.array([20000,40000,25000,35000,60000])
experience = np.array([1,3,2,4,2])

np.corrcoef(salary,experience)

array([[1.        , 0.25344572],
       [0.25344572, 1.        ]])
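
The off-diagonal entry is the Pearson coefficient between the two inputs. As a sanity check, the sketch below recomputes it from the definition (sum of the products of deviations, divided by the square root of the product of the summed squared deviations) and compares it with np.corrcoef. The names `dx`, `dy` and `r_manual` are introduced here for illustration.

```python
import numpy as np

salary = np.array([20000, 40000, 25000, 35000, 60000])
experience = np.array([1, 3, 2, 4, 2])

dx = salary - salary.mean()            # deviations from the mean
dy = experience - experience.mean()

r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
r_numpy = np.corrcoef(salary, experience)[0, 1]

print(r_manual, r_numpy)
```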

### np.isin

numpy.isin() tests, element by element, whether each value of one array is present in a second collection of values. The result is a boolean array with the same shape as the first array; the second collection can have any size.

https://numpy.org/doc/stable/reference/generated/numpy.isin.html

In [156]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [157]:
items = [10,20,30,40,50,60,70,80,90,100]

np.isin(a, items)

array([False, False, False, False,  True,  True, False, False, False,
       False, False,  True, False, False, False, False, False, False,
       False, False])

In [158]:
a[np.isin(a, items)]

array([40, 10, 20])
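
The `invert=True` argument selects the complement, which is handy for filtering values out. A small sketch with hypothetical data:

```python
import numpy as np

vals = np.array([35, 40, 10, 26, 20])     # hypothetical data
items = [10, 20, 30, 40]

mask = np.isin(vals, items)                          # True where the value is in items
kept = vals[mask]
dropped = vals[np.isin(vals, items, invert=True)]    # values NOT in items

print(mask, kept, dropped)
```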

### np.flip

The numpy.flip() function reverses the order of array elements along the specified axis, preserving the shape of the array.

https://numpy.org/doc/stable/reference/generated/numpy.flip.html

In [159]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [161]:
# reverse the array. 
np.flip(a)

array([22, 94, 94, 11, 14, 22, 77, 81, 20, 66, 72, 52, 75, 26, 10, 40, 94,
       51, 59, 35])

In [162]:
# 2D array: axis = 1 reverses the order of the columns.
np.flip(b,axis=1)

array([[42,  4, 12, 94, 74],
       [12, 56, 45, 42, 55],
       [78,  9, 10,  5, 86],
       [36, 24, 15, 82, 47]], dtype=int32)

In [163]:
# 2D array: axis = 0 reverses the order of the rows.
np.flip(b,axis=0)

array([[47, 82, 15, 24, 36],
       [86,  5, 10,  9, 78],
       [55, 42, 45, 56, 12],
       [74, 94, 12,  4, 42]], dtype=int32)
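
np.flip is equivalent to slicing with a step of -1 along the chosen axis, and with no axis it reverses every axis. A quick check on a hypothetical array `grid`:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]

rows_reversed = np.array_equal(np.flip(grid, axis=0), grid[::-1])
cols_reversed = np.array_equal(np.flip(grid, axis=1), grid[:, ::-1])
both_reversed = np.array_equal(np.flip(grid), grid[::-1, ::-1])

print(np.flip(grid))
print(rows_reversed, cols_reversed, both_reversed)
```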

### np.put

The numpy.put() function replaces the elements at the given indices with the given values. The indices refer to the flattened array, and the array is modified in place (the function returns None).

https://numpy.org/doc/stable/reference/generated/numpy.put.html

In [164]:
a

array([35, 59, 51, 94, 40, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11,
       94, 94, 22])

In [165]:
np.put(a, [0,1], [110,530])

In [166]:
a

array([110, 530,  51,  94,  40,  10,  26,  75,  52,  72,  66,  20,  81,
        77,  22,  14,  11,  94,  94,  22])
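
Because the indices address the flattened array, a 2D array can be updated with a single list of positions. A minimal sketch on a hypothetical array `grid`:

```python
import numpy as np

grid = np.zeros((2, 3), dtype=int)       # hypothetical 2x3 array of zeros
result = np.put(grid, [0, 4], [7, 9])    # flat index 0 is (0, 0); flat index 4 is (1, 1)

print(result)    # None: grid was modified in place
print(grid)
```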

### np.delete

The numpy.delete() function returns a new array with the sub-arrays at the given indices removed along the specified axis. The input array is not modified, and without an axis the input is flattened first.

https://numpy.org/doc/stable/reference/generated/numpy.delete.html

In [167]:
np.delete(a, [0, 1, 4])

array([51, 94, 10, 26, 75, 52, 72, 66, 20, 81, 77, 22, 14, 11, 94, 94, 22])
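
For 2D arrays the `axis` argument decides whether rows or columns are removed. A sketch on a hypothetical array `grid`:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)             # hypothetical 3x4 array

no_row = np.delete(grid, 1, axis=0)            # drop the second row
no_cols = np.delete(grid, [0, 2], axis=1)      # drop the first and third columns
flat = np.delete(grid, [0, 1])                 # no axis: result is a flattened copy

print(no_row)
print(no_cols)
print(flat)
```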

### Set functions

- np.union1d
- np.intersect1d
- np.setdiff1d
- np.setxor1d
- np.in1d (deprecated since NumPy 2.0 in favor of np.isin)

In [168]:
m = np.array([1,2,3,4,5])
n = np.array([3,4,5,6,7])

np.union1d(m,n)

array([1, 2, 3, 4, 5, 6, 7])

In [169]:
np.intersect1d(m,n)

array([3, 4, 5])

In [171]:
np.setdiff1d(m, n)

array([1, 2])
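
The remaining set functions from the list, sketched on the same `m` and `n`. np.isin is used here for the membership test in place of np.in1d.

```python
import numpy as np

m = np.array([1, 2, 3, 4, 5])
n = np.array([3, 4, 5, 6, 7])

xor = np.setxor1d(m, n)        # values that are in exactly one of the two arrays
only_n = np.setdiff1d(n, m)    # setdiff1d is not symmetric: values in n but not in m
member = np.isin(m, n)         # element-wise membership test of m in n

print(xor, only_n, member)
```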

### np.clip

numpy.clip() limits the values in an array to a given interval: values below the minimum are set to the minimum, and values above the maximum are set to the maximum.

https://numpy.org/doc/stable/reference/generated/numpy.clip.html

In [174]:
a

array([110, 530,  51,  94,  40,  10,  26,  75,  52,  72,  66,  20,  81,
        77,  22,  14,  11,  94,  94,  22])

In [176]:
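# values greater than 75 are replaced with 75, and values less than 25 are replaced with 25.
# note: the min= / max= keyword names need NumPy 2.1 or later; np.clip(a, 25, 75) also works on older versions.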

np.clip(a, min=25, max=75)

array([75, 75, 51, 75, 40, 25, 26, 75, 52, 72, 66, 25, 75, 75, 25, 25, 25,
       75, 75, 25])
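
A version-independent sketch that passes the bounds positionally, on a hypothetical array `vals`. Passing None for one bound clips on one side only, and the result matches a combination of np.maximum and np.minimum.

```python
import numpy as np

vals = np.array([110, 51, 10, 75, 22])     # hypothetical data

clipped = np.clip(vals, 25, 75)            # limit to the interval [25, 75]
lower_only = np.clip(vals, 25, None)       # only a lower bound
manual = np.minimum(np.maximum(vals, 25), 75)

print(clipped, lower_only)
```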

### np.swapaxes
- swapaxes interchanges two axes of an array. For a 2D array, swapping axes 0 and 1 is the same as taking the transpose.

In [195]:
b

array([[74, 94, 12,  4, 42],
       [55, 42, 45, 56, 12],
       [86,  5, 10,  9, 78],
       [47, 82, 15, 24, 36]], dtype=int32)

In [201]:
np.swapaxes(b, 0, 1)

array([[74, 55, 86, 47],
       [94, 42,  5, 82],
       [12, 45, 10, 15],
       [ 4, 56,  9, 24],
       [42, 12, 78, 36]], dtype=int32)
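
The function becomes more interesting with three or more dimensions, where only the two named axes trade places. A sketch on a hypothetical array `cube`:

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)     # hypothetical 3D array
swapped = np.swapaxes(cube, 0, 2)         # axes 0 and 2 trade places, axis 1 stays

grid = np.arange(6).reshape(2, 3)
is_transpose = np.array_equal(np.swapaxes(grid, 0, 1), grid.T)

print(swapped.shape, is_transpose)
print(swapped[3, 1, 0] == cube[0, 1, 3])  # the same element, addressed with swapped indices
```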

### np.random.uniform

✅ numpy.random.uniform generates random numbers uniformly distributed over a specified interval.

👉 This means:
- All numbers in the interval are equally likely to appear.
- The probability density is constant across the interval.

numpy.random.uniform(low=0.0, high=1.0, size=None)

Where:
- low → The lower bound of the interval (inclusive).
- high → The upper bound of the interval (exclusive).
- size → The shape of the output array (e.g. integer, tuple).

Defaults: if no arguments are given, np.random.uniform() returns a single random float in [0.0, 1.0).

In [202]:
import numpy as np

# A single random float in [0.0, 1.0)
print(np.random.uniform())

# A single random float in [5.0, 10.0)
print(np.random.uniform(5.0, 10.0))

# An array of 5 random floats in [0.0, 1.0)
print(np.random.uniform(size=5))

# A 2x3 array of random floats in [-1.0, 1.0)
print(np.random.uniform(-1.0, 1.0, size=(2,3)))


0.3151959986743641
7.973462925551418
[0.8492251  0.95731277 0.24979732 0.31032581 0.94189906]
[[-0.65826097 -0.04713201 -0.36829009]
 [-0.03371724  0.57937095  0.67929288]]
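
The outputs above change on every run. For reproducible draws, one option is the Generator API, which is a different entry point from the np.random.uniform call used in this notebook but takes the same `low`, `high` and `size` arguments. A sketch, where the seed 42 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(42)                    # seeded generator; 42 is arbitrary
samples = rng.uniform(-1.0, 1.0, size=(2, 3))      # same arguments as np.random.uniform

# A second generator with the same seed reproduces the same numbers.
again = np.random.default_rng(42).uniform(-1.0, 1.0, size=(2, 3))

print(samples)
print(np.array_equal(samples, again))
```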


### np.count_nonzero

In [205]:
import numpy as np

a = np.array([[1, 0, 2],
              [0, 0, 3],
              [4, 5, 0]])

# Method 1: count_nonzero on the boolean mask (a == 0) counts its True entries, i.e. the zeros
num_zeros = np.count_nonzero(a == 0)
print("Number of zeros:", num_zeros)

# Method 2: Using sum of boolean mask
num_zeros2 = (a == 0).sum()
print("Number of zeros (method 2):", num_zeros2)


Number of zeros: 4
Number of zeros (method 2): 4
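
np.count_nonzero also accepts an `axis` argument, which gives per-row or per-column counts. A sketch on a copy of the same values, named `grid` here to keep it separate from `a`:

```python
import numpy as np

grid = np.array([[1, 0, 2],
                 [0, 0, 3],
                 [4, 5, 0]])

nonzero_total = np.count_nonzero(grid)                 # nonzero entries in the whole array
zeros_per_row = np.count_nonzero(grid == 0, axis=1)    # zeros in each row
zeros_per_col = np.count_nonzero(grid == 0, axis=0)    # zeros in each column

print(nonzero_total, zeros_per_row, zeros_per_col)
```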


### np.tile
 - https://www.kaggle.com/code/abhayparashar31/best-numpy-functions-for-data-science-50?scriptVersionId=98816580

In [191]:
arr = np.arange(0, 3)
print(arr)

np.tile(arr, 3)

[0 1 2]


array([0, 1, 2, 0, 1, 2, 0, 1, 2])
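
np.tile repeats the whole array as a block. When the repetition count is a tuple, the block is repeated along each axis in turn, as in this small sketch:

```python
import numpy as np

arr = np.arange(0, 3)
tiled_2d = np.tile(arr, (2, 2))    # 2 copies down, 2 copies across

print(tiled_2d)
```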

### np.repeat
-  https://towardsdatascience.com/10-numpy-functions-you-should-know-1dc4863764c5

In [192]:
arr

array([0, 1, 2])

In [194]:
np.repeat(arr, 5)

array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
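
Unlike np.tile, np.repeat repeats each element in turn. It also accepts an `axis` and a separate count for every element. A sketch with a hypothetical 2D array `grid`:

```python
import numpy as np

arr = np.arange(0, 3)
grid = np.array([[1, 2],
                 [3, 4]])                   # hypothetical data

varying = np.repeat(arr, [1, 2, 3])         # a different count for each element
rows_twice = np.repeat(grid, 2, axis=0)     # each row repeated twice
cols_twice = np.repeat(grid, 2, axis=1)     # each column repeated twice

print(varying)
print(rows_twice)
print(cols_twice)
```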

### np.allclose and equals