# NumPy

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. At the core of the NumPy package is the `ndarray` object, which encapsulates n-dimensional arrays of homogeneous data types.

## NumPy Arrays vs Python Sequences

- NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.
- The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory.
- NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.
- A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays.
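
The differences listed above can be seen in a minimal sketch. The variable names here are illustrative and not part of the original notebook.

```python
import numpy as np

# A Python list can hold mixed types and grow in place.
py_list = [1, 2.5, 3]
py_list.append(4)

# np.array coerces every element to one common dtype (float64 here).
arr = np.array([1, 2.5, 3])
print(arr.dtype)               # float64

# np.append does not grow `arr`; it returns a new, larger array.
bigger = np.append(arr, 4)
print(arr.size, bigger.size)   # 3 4

# Arithmetic is applied element-wise without an explicit loop.
doubled_list = [x * 2 for x in py_list]   # a list needs a comprehension
doubled_arr = bigger * 2                  # an array needs one expression
print(doubled_arr)
```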

---

In [1]:
!pip install numpy



In [2]:
import numpy as np
np.__version__

'1.24.3'

# NumPy for Data Science and Machine Learning

0. Scalars
1. `np.random` module
2. Vectors
3. Matrices
4. 3-D Arrays and above

## Scalars, the 0-D Arrays

In [3]:
# creating a scalar
a = np.array(69)

In [31]:
# attributes
a.ndim # returns the number of dimensions (0 for a scalar)
a.shape # returns the shape tuple (empty for a scalar)
a.size # returns the total number of elements
a.dtype # returns the data type of the elements

a.nbytes # returns the total bytes used by the elements
a.itemsize # returns the byte size of one element

4

## `np.random` Module

In [3]:
# setting up of random number generator
rng = np.random.default_rng(seed=7)

**Note:** `np.random.default_rng()` returns a `Generator`, the newer interface for random numbers that NumPy recommends over the legacy `np.random.*` functions.
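
A short sketch of why the seed matters: two generators created with the same seed should produce the same stream of numbers. The names `rng_a` and `rng_b` are illustrative only.

```python
import numpy as np

# Two generators built from the same seed yield identical streams.
rng_a = np.random.default_rng(seed=7)
rng_b = np.random.default_rng(seed=7)

draws_a = rng_a.integers(0, 100, 5)
draws_b = rng_b.integers(0, 100, 5)
print(np.array_equal(draws_a, draws_b))   # True

# default_rng returns a Generator instance.
print(type(rng_a).__name__)               # Generator
```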

In [10]:
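# discrete uniform distribution (integers from 1940 to 2012, inclusive)
rng.integers(1940, 2012, 7, endpoint=True)

# continuous uniform distribution on [0, 1)
rng.random(5)

# continuous uniform distribution on [12, 24)
rng.uniform(12,24,8)

# standard normal (Gaussian) distribution
rng.standard_normal(3)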

array([ 1.35882342, -1.54714468,  0.85938269])

In [24]:
# chooses n elements from array without replacement
rng.choice(np.array([15,36,98,12,56,74,55]), 
           5, replace=False)

# shuffles the array in place (the call itself returns None)
arr = np.array([15,36,98,12,56,74,55])
rng.shuffle(arr)
arr

array([36, 74, 15, 56, 55, 98, 12])

## Vectors, the 1-D Arrays

### Creation

In [43]:
# vector creation 1
arr = np.array([4,9,6,3,58,5,9,12,0,96,15,33])
arr

array([ 4,  9,  6,  3, 58,  5,  9, 12,  0, 96, 15, 33])

In [44]:
# vector creation 2
np.zeros(6)
np.ones(10)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [45]:
# vector creation 3
np.full(5,3.141528)
np.empty(4)

array([1.25197752e-312, 1.37929726e-312, 1.50661701e-312, 1.63393676e-312])

In [42]:
# vector creation 4
np.arange(56,79,3)
np.linspace(15,50,50)

array([15.        , 15.71428571, 16.42857143, 17.14285714, 17.85714286,
       18.57142857, 19.28571429, 20.        , 20.71428571, 21.42857143,
       22.14285714, 22.85714286, 23.57142857, 24.28571429, 25.        ,
       25.71428571, 26.42857143, 27.14285714, 27.85714286, 28.57142857,
       29.28571429, 30.        , 30.71428571, 31.42857143, 32.14285714,
       32.85714286, 33.57142857, 34.28571429, 35.        , 35.71428571,
       36.42857143, 37.14285714, 37.85714286, 38.57142857, 39.28571429,
       40.        , 40.71428571, 41.42857143, 42.14285714, 42.85714286,
       43.57142857, 44.28571429, 45.        , 45.71428571, 46.42857143,
       47.14285714, 47.85714286, 48.57142857, 49.28571429, 50.        ])

In [49]:
# vector creation 5
rng.normal(100,10,9) # as mentioned in the previous section, the random methods can also fill arrays

array([ 99.48809587,  92.06703597,  93.739269  ,  87.22274848,
       112.57069314,  98.45912427, 109.65921619, 100.13324597,
        93.05596472])

**Note:** All of the above methods create a 1-dimensional `ndarray` whose length follows the arguments passed.

**Note:** Many of the functions above have a `_like` counterpart (e.g. `np.zeros_like()`) that creates an `ndarray` with the shape and data type of another array.
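
As a small illustration of the `_like` functions, the sketch below builds new arrays from a hypothetical `template` vector.

```python
import numpy as np

template = np.array([4, 9, 6, 3])

# Each result copies the shape and dtype of `template`.
zeros = np.zeros_like(template)
ones = np.ones_like(template)
sevens = np.full_like(template, 7)

print(zeros.shape == template.shape)   # True
print(zeros.dtype == template.dtype)   # True
print(sevens)                          # [7 7 7 7]
```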

### Indexing, Slicing and Masking

In [22]:
arr1 = np.arange(10,100,7)
arr1

array([10, 17, 24, 31, 38, 45, 52, 59, 66, 73, 80, 87, 94])

In [71]:
# indexing
arr1[5]
arr1[-6]

# fancy indexing
arr1[[1,5,3,9,12,-1]]

array([17, 45, 31, 73, 94, 94])

In [79]:
# slicing
arr1[2:]
arr1[:-3]
arr1[::-3]

array([94, 73, 52, 31, 10])

In [84]:
# boolean indexing 1
arr1[arr1%3==0]

np.any(arr1>51)
np.all(arr1<100)

True

In [5]:
# boolean indexing 2
mask = (arr1>80) | (arr1<20)
arr1[mask] = 0
arr1

array([ 0,  0, 24, 31, 38, 45, 52, 59, 66, 73, 80,  0,  0])

In [32]:
np.nonzero(arr1) # returns the indices of non-zero values
np.where(arr1>50) # returns the indices where the condition is True

(array([ 6,  7,  8,  9, 10, 11, 12], dtype=int64),)

In [24]:
# returns an array with 1 where the condition is True and 0 elsewhere
np.where(arr1%3==0, 1, 0)

array([0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0])

In [31]:
# clips values: anything below 30 becomes 30, anything above 60 becomes 60
np.clip(arr1,30,60)

array([30, 30, 30, 31, 38, 45, 52, 59, 60, 60, 60, 60, 60])

**Note:** **NumPy** also supports one-sided clipping through `np.maximum()` (lower bound only) and `np.minimum()` (upper bound only).
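
A minimal sketch of one-sided clipping, using an illustrative `values` array rather than the notebook's `arr1`:

```python
import numpy as np

values = np.array([10, 17, 24, 31, 38, 45, 52, 59, 66])

# Lower bound only: anything below 30 is raised to 30.
floor_only = np.maximum(values, 30)
print(floor_only)   # [30 30 30 31 38 45 52 59 66]

# Upper bound only: anything above 60 is lowered to 60.
cap_only = np.minimum(values, 60)
print(cap_only)     # [10 17 24 31 38 45 52 59 60]

# Applying both bounds matches np.clip.
both = np.minimum(np.maximum(values, 30), 60)
print(np.array_equal(both, np.clip(values, 30, 60)))   # True
```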

### Operations

In [38]:
arr1 = rng.integers(-30,30,15)
arr2 = rng.integers(-100,100,15)
arr3 = rng.integers(50,150,10)

In [41]:
print(arr1, arr2, arr3, sep='\n')

[-15  29  -4  -2   0   4   3   0  29  18  17  12   7 -10  29]
[ -7 -57  69 -68  71  22 -78 -92 -12 -93 -72   2  94  -7  61]
[141 132 112  94 101  76  99  87  74 149]


In [47]:
# arithmetic operations 1
arr1+arr2
arr1-arr2
arr1*arr2
arr1/arr2

array([ 2.14285714, -0.50877193, -0.05797101,  0.02941176,  0.        ,
        0.18181818, -0.03846154, -0.        , -2.41666667, -0.19354839,
       -0.23611111,  6.        ,  0.07446809,  1.42857143,  0.47540984])

In [52]:
# arithmetic operations 2
# arr1**arr2 # raises ValueError: integer arrays cannot be raised to negative integer powers
arr1%arr2
arr1//arr2

array([ 2, -1, -1,  0,  0,  0, -1,  0, -3, -1, -1,  6,  0,  1,  0],
      dtype=int64)

In [57]:
# scalar arithmetic
arr3+10
arr3*20
arr3%2
arr3**2

array([19881, 17424, 12544,  8836, 10201,  5776,  9801,  7569,  5476,
       22201], dtype=int64)

**Note:** **NumPy** uses **broadcasting** here: the scalar is stretched to the shape of the array, so the operation is applied element-wise.
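
To make the scalar case concrete, the sketch below compares the broadcast result with an explicitly expanded array. The names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# The scalar 10 is treated as if it had the same shape as `arr`.
broadcast = arr + 10

# Writing the expansion out by hand gives the same values.
explicit = arr + np.full(arr.shape, 10)

print(broadcast)                            # [11 12 13 14]
print(np.array_equal(broadcast, explicit))  # True
```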

In [59]:
# make use of numpy functions
np.log10(arr3)
np.sqrt(arr3)

array([11.87434209, 11.48912529, 10.58300524,  9.69535971, 10.04987562,
        8.71779789,  9.94987437,  9.32737905,  8.60232527, 12.20655562])

In [66]:
# both are methods to calculate dot product
np.dot(arr1, arr2)
arr1@arr2

-2559

**Note:** `np.sum(arr1*arr2)` is another way to calculate the dot product.
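
A quick check, on two small illustrative vectors, that all three forms agree:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

via_dot = np.dot(u, v)
via_operator = u @ v
via_sum = np.sum(u * v)   # multiply element-wise, then add up

print(via_dot, via_operator, via_sum)   # 32 32 32
```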

In [78]:
# cross product
np.cross(np.array([2,0,0]), np.array([0,3,0]))

array([0, 0, 6])

**Note:** `np.cross()` raises an error unless the vectors have 2 or 3 components; 3-component vectors are the standard case.
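
The sketch below shows a valid 3-component cross product and, as an assumed failure case, two 4-component vectors, for which `np.cross()` is expected to raise a `ValueError`.

```python
import numpy as np

x_axis = np.array([1, 0, 0])
y_axis = np.array([0, 1, 0])

# The cross product of the x and y unit vectors is the z unit vector.
z_axis = np.cross(x_axis, y_axis)
print(z_axis)   # [0 0 1]

# Vectors with 4 components are rejected.
try:
    np.cross(np.array([1, 2, 3, 4]), np.array([5, 6, 7, 8]))
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```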

In [83]:
# index and value of max value
np.argmax(arr3)
np.max(arr3)

149

**Note:** Similar functions exist for minimum as well.

In [88]:
# statistical functions
arr1.mean()
arr1.var()
arr1.std(ddof=1)

14.107748630755067

**Note:** NumPy's `std()` defaults to `ddof=0` (population standard deviation); `ddof=1` applies Bessel’s correction, which **Pandas** `std()` uses by default.
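
The effect of `ddof` can be seen on a small illustrative data set whose mean is 5 and whose squared deviations sum to 32:

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

# ddof=0 (NumPy default): divide by n.
population_std = data.std()

# ddof=1 (Bessel's correction): divide by n - 1.
sample_std = data.std(ddof=1)

print(population_std)   # 2.0
print(sample_std)       # slightly larger than the population value
```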

**Note:** Each of these functions has a **nan-resistant variant**, e.g. `np.nansum()`, `np.nanmax()`, etc.
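
A brief sketch of the difference, using an illustrative array with one missing value:

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0])

# The regular reductions propagate nan.
plain_sum = np.sum(data)
print(np.isnan(plain_sum))   # True

# The nan-resistant variants skip nan entries.
safe_sum = np.nansum(data)
safe_max = np.nanmax(data)
safe_mean = np.nanmean(data)
print(safe_sum, safe_max, safe_mean)   # 4.0 3.0 2.0
```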

In [92]:
arr1 = rng.uniform(10,15,6)
arr1

array([11.8063203 , 12.99092034, 10.29625821, 11.93815901, 11.61518173,
       10.75099865])

In [96]:
# rounds toward -inf
np.floor(arr1)

array([11., 12., 10., 11., 11., 10.])

In [97]:
# rounds toward +inf
np.ceil(arr1)

array([12., 13., 11., 12., 12., 11.])

In [100]:
# rounds to the nearest integer (.5 to even) 
np.round(arr1)

array([12., 13., 10., 12., 12., 11.])
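
The values drawn above contain no exact halves, so the ".5 to even" rule is not visible in that output. A small sketch with illustrative values shows it next to `np.floor()` and `np.ceil()`:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

rounded = np.round(halves)   # ties go to the nearest even integer
floored = np.floor(halves)
ceiled = np.ceil(halves)

print(rounded)   # [0. 2. 2. 4.]
print(floored)   # [0. 1. 2. 3.]
print(ceiled)    # [1. 2. 3. 4.]
```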

**Note:** [NumPy mathematical functions documentation](https://numpy.org/doc/stable/reference/routines.math.html)

## Matrices, the 2-D Arrays

### Creation

In [101]:
mat1 = np.array([[1,2,3],
                 [4,5,6]])

In [111]:
# array attributes
mat1.shape
mat1.ndim

2

**Note:** Most of the creation functions shown for 1-D arrays also create 2-D arrays when given a shape tuple such as `(rows, columns)`.
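
For example, passing a shape tuple to the familiar creation functions gives matrices directly. This is a minimal sketch with illustrative names.

```python
import numpy as np

zeros_2d = np.zeros((2, 3))            # 2 rows, 3 columns of 0.0
sevens_2d = np.full((2, 2), 7)         # 2x2 matrix filled with 7
grid = np.arange(6).reshape(2, 3)      # 0..5 laid out row by row

print(zeros_2d.shape, zeros_2d.ndim)   # (2, 3) 2
print(sevens_2d)
print(grid)
```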

In [114]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [126]:
rng.integers(9,99,size=(3,4))

array([[76, 75, 81, 66],
       [21, 85, 46, 85],
       [82, 96, 10, 86]], dtype=int64)

### Indexing, Slicing and Masking

In [96]:
mat1 = rng.integers(0,200,(5,5))
mat1

array([[172,  71, 180, 103, 106],
       [153, 157, 181, 199,  30],
       [ 82, 186,  61,   1, 115],
       [150, 148, 162, 128,  27],
       [169,  83, 169, 163, 194]], dtype=int64)

In [138]:
# indexing
mat1[2] # selects the row at index 2
mat1[:,3] # selects the column at index 3
mat1[2,3] # selects the element at row 2, column 3

95

In [151]:
# slicing 1
mat1[1:-1,1:-1]

array([[162, 168, 118],
       [120, 181,  95],
       [ 91, 131,  75]], dtype=int64)

In [153]:
# slicing 2
mat1[::2,::2]

array([[ 33,  93,  59],
       [149, 181,  34],
       [ 25, 151,  46]], dtype=int64)

In [164]:
# masking: selects the elements that are not greater than 100
mask = ~(mat1>100)
mat1[mask]

array([33, 10, 93, 42, 59, 22, 95, 34, 91, 75, 61, 25, 93, 46],
      dtype=int64)

### Operations

In [121]:
mat1 = rng.integers(-50,50,(4,4))
mat1

array([[-49,  36,  12,  -2],
       [ 29,  45,   1, -42],
       [ 22, -27, -28, -31],
       [-31,  40, -14, -40]], dtype=int64)

In [11]:
mat2 = rng.integers(-100,100,(4,4))
mat2

array([[-24,  63, -20, -25],
       [ 15,  95, -22,  17],
       [-13,  21, -11,  27],
       [  9,  35,  91, -70]], dtype=int64)

In [12]:
mat3 = rng.integers(-50,50,(5,5))
mat3

array([[  6,  -6, -15, -27, -45],
       [-10,  32, -41, -10,  46],
       [ 45, -29, -50,  17, -49],
       [-20,  -1,  37, -41,  16],
       [  3, -37,  34,  34,  -2]], dtype=int64)

In [18]:
# arithmetic operations 
mat1+mat2
mat1/mat2
mat1%mat2

array([[ -4,  12,  -2, -11],
       [  7,  27, -11,   6],
       [ -6,   1,   0,  10],
       [  5,  20,  90, -38]], dtype=int64)

In [40]:
# scalar arithmetic
mat3*2
mat3**2

array([[  36,   36,  225,  729, 2025],
       [ 100, 1024, 1681,  100, 2116],
       [2025,  841, 2500,  289, 2401],
       [ 400,    1, 1369, 1681,  256],
       [   9, 1369, 1156, 1156,    4]], dtype=int64)

In [45]:
# broadcasting
mat4 = rng.integers(50,150,(3,3))
mat4 / 150
mat4 * np.array([1,2,3])

array([[ 82, 188, 375],
       [ 55, 104, 366],
       [ 87, 260, 159]], dtype=int64)

In [35]:
# numpy functions (the log of a negative entry is nan, with a RuntimeWarning)
np.log(mat3)

array([[1.79175947,        nan,        nan,        nan,        nan],
       [       nan, 3.4657359 ,        nan,        nan, 3.8286414 ],
       [3.80666249,        nan,        nan, 2.83321334,        nan],
       [       nan,        nan, 3.61091791,        nan, 2.77258872],
       [1.09861229,        nan, 3.52636052, 3.52636052,        nan]])

In [39]:
mat3.sum() # sum of all elements
mat3.sum(axis=0) # sum down each column (one value per column)
mat3.sum(axis=1) # sum across each row (one value per row)

array([-87,  17, -66,  -9,  32], dtype=int64)

In [73]:
# statistical functions
mat3.mean(axis=0)
mat3.var(axis=1)

array([ 307.44,  992.64, 1436.96,  738.16,  697.84])

In [72]:
mat3.min(axis=0)
mat3.argmin(axis=0)

array([3, 4, 2, 3, 2], dtype=int64)

In [47]:
# matrix multiplication
mat1 * mat2 # element wise multiplication
mat1 @ mat2
np.dot(mat1, mat2)

array([[ -759,  5655,  2207, -3140],
       [ -444,  2719, -3645,  3135],
       [ 1399, -3902,  4949, -2399],
       [-1433, -1068,  3203, -4142]], dtype=int64)

**Note:** `@` and `np.dot()` compute matrix multiplication as defined in linear algebra, whereas the `*` operator performs element-wise multiplication.
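
The difference is easiest to see on two small matrices. The sketch below uses illustrative 2x2 inputs.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

elementwise = a * b   # multiplies matching positions
matmul = a @ b        # rows of `a` times columns of `b`

print(elementwise)    # [[ 5 12]
                      #  [21 32]]
print(matmul)         # [[19 22]
                      #  [43 50]]
print(np.array_equal(matmul, np.dot(a, b)))   # True
```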

In [61]:
np.array([[1],[2],[3]]) * np.array([[1,2,3]]) # broadcasting occurs here
np.array([[1],[2],[3]]) @ np.array([[1,2,3]]) # matrix multiplication 3x1 * 1x3
np.array([[1,2,3]]) @ np.array([[1],[2],[3]]) # matrix multiplication 1x3 * 3x1

array([[14]])

In [86]:
mat1

array([[ 44,  12,  18,  39],
       [  7,  27,  33, -28],
       [-45, -20, -22,  37],
       [ 41, -50,  -1,  32]], dtype=int64)

In [89]:
np.argsort(mat1,axis=0) # returns the indices that would sort each column
np.sort(mat1,axis=0)

array([[-45, -50, -22, -28],
       [  7, -20,  -1,  32],
       [ 41,  12,  18,  37],
       [ 44,  27,  33,  39]], dtype=int64)

### Shaping and Reshaping

In [98]:
# visits every element of the matrix one at a time
for i in np.nditer(mat1):
    print(i)

172
71
180
103
106
153
157
181
199
30
82
186
61
1
115
150
148
162
128
27
169
83
169
163
194


In [100]:
# converts any n-dim array to 1d array
mat1.ravel()

array([172,  71, 180, 103, 106, 153, 157, 181, 199,  30,  82, 186,  61,
         1, 115, 150, 148, 162, 128,  27, 169,  83, 169, 163, 194],
      dtype=int64)

In [64]:
# transpose
rng.integers(5,15,(3,2)).T

array([[11,  5,  7],
       [ 8, 10, 11]], dtype=int64)

In [110]:
# reshapes the array to a new shape with the same number of elements
mat2.reshape(2,8)

array([[-24,  63, -20, -25,  15,  95, -22,  17],
       [-13,  21, -11,  27,   9,  35,  91, -70]], dtype=int64)

In [117]:
# converts matrix to either column or row matrix
mat2.reshape(-1,1)
mat2.reshape(1,-1)

array([[-24,  63, -20, -25,  15,  95, -22,  17, -13,  21, -11,  27,   9,
         35,  91, -70]], dtype=int64)

**Note:** Here, $-1$ tells NumPy to infer that dimension from the array's size. `reshape(-1,1)` fixes the number of columns at 1 and produces a column matrix, while `reshape(1,-1)` fixes the number of rows at 1 and produces a row matrix.
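
A small sketch of how the $-1$ placeholder resolves, using an illustrative 12-element array:

```python
import numpy as np

flat = np.arange(12)

column = flat.reshape(-1, 1)   # 1 column, rows inferred as 12
row = flat.reshape(1, -1)      # 1 row, columns inferred as 12
grid = flat.reshape(3, -1)     # 3 rows, columns inferred as 4

print(column.shape)   # (12, 1)
print(row.shape)      # (1, 12)
print(grid.shape)     # (3, 4)
```

The inferred dimension has to divide the total size evenly; a request such as `flat.reshape(5, -1)` cannot be satisfied for 12 elements.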

In [127]:
# joining two matrices horizontally
matx = np.hstack((mat1, mat2))
matx

array([[-49,  36,  12,  -2, -24,  63, -20, -25],
       [ 29,  45,   1, -42,  15,  95, -22,  17],
       [ 22, -27, -28, -31, -13,  21, -11,  27],
       [-31,  40, -14, -40,   9,  35,  91, -70]], dtype=int64)

In [128]:
# joining two matrices vertically
maty = np.vstack((mat1, mat2))
maty

array([[-49,  36,  12,  -2],
       [ 29,  45,   1, -42],
       [ 22, -27, -28, -31],
       [-31,  40, -14, -40],
       [-24,  63, -20, -25],
       [ 15,  95, -22,  17],
       [-13,  21, -11,  27],
       [  9,  35,  91, -70]], dtype=int64)

In [132]:
# splitting a matrix horizontally before columns 3 and 5
np.hsplit(matx,[3,5])

[array([[-49,  36,  12],
        [ 29,  45,   1],
        [ 22, -27, -28],
        [-31,  40, -14]], dtype=int64),
 array([[ -2, -24],
        [-42,  15],
        [-31, -13],
        [-40,   9]], dtype=int64),
 array([[ 63, -20, -25],
        [ 95, -22,  17],
        [ 21, -11,  27],
        [ 35,  91, -70]], dtype=int64)]

In [138]:
# splitting a matrix vertically before rows 2 and 5
np.vsplit(maty,[2,5])

[array([[-49,  36,  12,  -2],
        [ 29,  45,   1, -42]], dtype=int64),
 array([[ 22, -27, -28, -31],
        [-31,  40, -14, -40],
        [-24,  63, -20, -25]], dtype=int64),
 array([[ 15,  95, -22,  17],
        [-13,  21, -11,  27],
        [  9,  35,  91, -70]], dtype=int64)]

# SymPy

[Sympy playlist](https://www.youtube.com/watch?v=VKOYjemQRqw&list=PLSE7WKf_qqo1T5VV1nqXTj2iNiSpFk72T)  
[Sympy video](https://www.youtube.com/watch?v=1yBPEPhq54M&pp=ygUFc3ltcHk%3D)

# SciPy

[Scipy video](https://www.youtube.com/watch?v=jmX4FOUEfgU&pp=ygUFc2NpcHk%3D)