### Motivation

- Must-have for **AI/ML coding rounds** in product-based companies  
- Essential for working on **machine learning**, **data science**, and **deep learning** projects  
- Required when writing **custom loss functions**, building ML models **from scratch** (without libraries like Scikit-learn or TensorFlow)


### Introduction to NumPy

- NumPy (Numerical Python) is the core Python library for scientific computing.
- It provides tools for solving mathematical models of problems in science and engineering.
- Its central tool is a high-performance multidimensional array object (`ndarray`), a data structure for efficient computation on arrays and matrices.
- NumPy also ships a large collection of high-level mathematical functions that operate on these arrays and matrices.



### Why Learn NumPy?

#### 1. Model Inputs Need NumPy Arrays
- ML libraries like **Scikit-learn** accept NumPy arrays directly, while **TensorFlow** and **PyTorch** use their own tensor types but convert to and from NumPy arrays.
- Even Pandas and OpenCV internally rely on NumPy arrays.

#### 2. NumPy is Much Faster for Large Numerical Computation
- NumPy is written in **optimized C code** under the hood.
- It supports:
  - Matrix multiplication
  - Dot products
  - Broadcasting (auto-expanding arrays in operations)
- Much faster than native Python loops or lists.
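
The speed claim above can be checked with a small, informal timing comparison. This is only a sketch: the helper names (`python_sum_of_squares`, `numpy_sum_of_squares`) and the array size are illustrative choices, and the measured times depend on the machine.

```python
import timeit

import numpy as np

n = 100_000
values = list(range(n))             # plain Python list
arr = np.arange(n, dtype=np.int64)  # the same numbers as a NumPy array


def python_sum_of_squares(xs):
    """Sum of squares with an explicit Python loop."""
    total = 0
    for x in xs:
        total += x * x
    return total


def numpy_sum_of_squares(a):
    """Sum of squares as a single vectorized dot product."""
    return int(np.dot(a, a))


loop_result = python_sum_of_squares(values)
vector_result = numpy_sum_of_squares(arr)

# Time five runs of each version; absolute numbers vary by machine.
loop_time = timeit.timeit(lambda: python_sum_of_squares(values), number=5)
vector_time = timeit.timeit(lambda: numpy_sum_of_squares(arr), number=5)

print("same result:", loop_result == vector_result)
print(f"loop: {loop_time:.4f}s, vectorized: {vector_time:.4f}s")
```

Both versions compute the same value; the vectorized one hands the whole loop to compiled code instead of executing it in the Python interpreter.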

#### 3. NumPy Handles N-Dimensional Arrays
- NumPy can create and operate on arrays of any dimension (1D, 2D, 3D, ..., nD).
- Useful for:
  - Scientific computing
  - Deep learning tensors
  - Image processing

#### 4. You Need NumPy to Build Custom Algorithms
- For hands-on ML or from-scratch models, NumPy is essential.
- You'll use it for:
  - Linear regression
  - Gradient descent
  - Backpropagation
  - Activation functions
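
As a rough illustration of the "from scratch" use case, here is a minimal sketch of linear regression fitted with gradient descent. The synthetic data (`y = 2x + 1`), the learning rate, and the iteration count are assumptions chosen for the example, not values from this notebook.

```python
import numpy as np

# Synthetic, noise-free data generated from y = 2x + 1 (assumed for illustration).
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0

w, b = 0.0, 0.0        # parameters to learn
learning_rate = 0.5    # assumed step size

for _ in range(2000):
    error = (w * x + b) - y              # prediction error for every sample at once
    grad_w = 2.0 * np.mean(error * x)    # d(MSE)/dw
    grad_b = 2.0 * np.mean(error)        # d(MSE)/db
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"w = {w:.4f}, b = {b:.4f}")
```

Every step works on the whole array at once, so there is no explicit loop over samples.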



### NumPy vs Pandas
| Feature     | NumPy                         | Pandas                          |
|-------------|-------------------------------|----------------------------------|
| Focus       | Numerical computing           | Tabular data analysis            |
| Structure   | N-dimensional arrays          | Labeled 1D/2D tables (Series/DataFrame) |
| Performance | Very fast, low-level          | Built on top of NumPy            |
| Use case    | Math behind ML                | EDA, data wrangling, preprocessing |

- Use **Pandas** to prepare data, and **NumPy** to power the math behind ML.



#### Why NumPy is More Memory Efficient than Python Lists

**Python List Memory Model**

- Stores **references (pointers)** to separate Python `int` objects.
- Each `int` object has metadata:
  - Type information
  - Reference count
  - Value
- ❌ High overhead, especially for large arrays.
- ❌ Not cache-friendly — elements are scattered in memory.



**NumPy Array Memory Model**

- Stores data in a **contiguous C-style memory block**.
- Only stores **raw values** (e.g., `int32`, `float64`) — no per-element metadata.
- ✅ Very compact: no need to store Python object info for each element.
- ✅ Cache-friendly and CPU-efficient.
- ✅ Enables **vectorized operations**, **SIMD**, and **broadcasting**.



**Result**

- NumPy arrays use significantly **less memory** than Python lists.
- Ideal for **large-scale numerical data**, especially in **machine learning** and **scientific computing**.
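
The memory difference can be estimated with `sys.getsizeof` and `ndarray.nbytes`. This is an approximate sketch: `sys.getsizeof` reports CPython-specific sizes, and the list total below counts every `int` object separately even though CPython caches small integers.

```python
import sys

import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# List: the pointer table plus one Python int object per element.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# Array: raw values only, 8 bytes per int64 element.
array_bytes = arr.nbytes

print("list  :", list_bytes, "bytes (approximate)")
print("array :", array_bytes, "bytes of data")
print("contiguous:", arr.flags["C_CONTIGUOUS"])
```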
  


**Installing Numpy**

In [18]:
!pip install numpy



**Importing numpy**

In [2]:
# np is just an alias (a nickname) for the numpy module.
import numpy as np

**Checking Numpy Version**

In [3]:
np.__version__

'2.3.0'

**Numpy 1D, 2D and 3D Array**

In [4]:
# a list of numbers will create a 1D array => Think of this as a single row or a single column of values.
a1D = np.array([1, 2, 3, 4])
a1D

array([1, 2, 3, 4])

In [5]:
print(type(a1D))

<class 'numpy.ndarray'>


In [6]:
help(np.array)

Help on built-in function array in module numpy:

array(...)
    array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
          like=None)

    Create an array.

    Parameters
    ----------
    object : array_like
        An array, any object exposing the array interface, an object whose
        ``__array__`` method returns an array, or any (nested) sequence.
        If object is a scalar, a 0-dimensional array containing object is
        returned.
    dtype : data-type, optional
        The desired data-type for the array. If not given, NumPy will try to use
        a default ``dtype`` that can represent the values (by applying promotion
        rules when necessary.)
    copy : bool, optional
        If ``True`` (default), then the array data is copied. If ``None``,
        a copy will only be made if ``__array__`` returns a copy, if obj is
        a nested sequence, or if a copy is needed to satisfy any of the other
        requirements (``dtype``, ``order``, 

In [7]:
print(f"shape is {a1D.shape} and size is {a1D.size}")

shape is (4,) and size is 4


In [195]:
# a list of lists will create a 2D array => A matrix: rows and columns.
a2D = np.array([[1, 2], [3, 4]])
a2D

array([[1, 2],
       [3, 4]])

In [196]:
print(f"shape is {a2D.shape} and size is {a2D.size}")

shape is (2, 2) and size is 4


In [9]:
# An array of matrices.
# further nested lists will create higher-dimensional arrays. In general, any array object is called an ndarray in NumPy
a3D_list = [
    [
        [1,2],[3,4]
    ], [
        [5,6],[7,8]
    ]
] 
type(a3D_list)

list

In [10]:
a3D = np.array(a3D_list)
a3D

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

In [197]:
print(f"shape is {a3D.shape} and size is {a3D.size}")

shape is (2, 2, 2) and size is 8


**Supported data types in numpy**

In [None]:
# NumPy arrays (ndarray) are homogeneous, meaning all elements must be of the same data type.

In [198]:
np.array([1, 2, 3], dtype=np.int32)

array([1, 2, 3], dtype=int32)

In [199]:
np.array([1.5, 2.7], dtype=np.float64)

array([1.5, 2.7])

In [206]:
np.array([1, 2.7]) # NumPy automatically upcasts them to the most compatible common type.

array([1. , 2.7])

In [208]:
np.array([True, False, True], dtype=bool)

array([ True, False,  True])

In [213]:
np.array(['apple', 'banana'], dtype='U10') # Little endian byte order - Unicode strings, each up to 10 characters long

array(['apple', 'banana'], dtype='<U10')

In [166]:
np.array(['2023-01-01', '2023-02-01'], dtype='datetime64[D]')

array(['2023-01-01', '2023-02-01'], dtype='datetime64[D]')

In [11]:
np.array([1, 'b', 3])

array(['1', 'b', '3'], dtype='<U21')

In [12]:
np.array([1, '2', 3], dtype=np.int32)

array([1, 2, 3], dtype=int32)

In [13]:
np.array([1, 'b', 3], dtype=np.int32)

ValueError: invalid literal for int() with base 10: 'b'

**record array or structured array**

- Structured arrays in NumPy are homogeneous at the array level (each element has the same layout).
- We can specify a type and, optionally, a name on a per-column basis. This makes sorting and filtering even more powerful, and it can feel similar to working with data in Excel, CSVs, or relational databases.

In [None]:
# Structured arrays are laid out in memory much like an array of C structs.
# Record arrays (np.recarray) are a thin wrapper around them that adds attribute-style field access.

In [15]:
data = np.array([ ("aman", 22, 90),
                 ("ankit", 15, 60), 
                 ("akash", 18, 90), 
                 ("aditya", 22, 80)], 
                dtype=[("name", 'U15' ), ("age", np.int32), ("marks", int),]
               )

In [16]:
data

array([('aman', 22, 90), ('ankit', 15, 60), ('akash', 18, 90),
       ('aditya', 22, 80)],
      dtype=[('name', '<U15'), ('age', '<i4'), ('marks', '<i8')])

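**You can specify the dtype (data type) of each field in several ways.**
1. Using NumPy type strings (compact format) - `[("name", "U15"), ("age", "i4"), ("marks", "i8")]`
2. Using NumPy type objects - `[("name", np.dtype("U15")), ("age", np.int32), ("marks", np.int64)]`
3. Mixing Python built-in types and NumPy types - `[("name", "U15"), ("age", np.int32), ("marks", int)]`

Give string fields an explicit length: a bare `str` or `np.str_` field carries no length and typically ends up storing empty strings. The older alias `np.unicode_` was removed in NumPy 2.0. Python's `int` maps to the platform default integer (`<i8` in the output above), so it is not always the same as `np.int32`.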

In [17]:
print(data[0])

('aman', 22, 90)


In [18]:
data["name"]

array(['aman', 'ankit', 'akash', 'aditya'], dtype='<U15')
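
Filtering and sorting by field, mentioned above, can be sketched as follows. The array is rebuilt under the assumed name `students` so the block runs on its own; the field names and values match the example above.

```python
import numpy as np

students = np.array(
    [("aman", 22, 90), ("ankit", 15, 60), ("akash", 18, 90), ("aditya", 22, 80)],
    dtype=[("name", "U15"), ("age", np.int32), ("marks", np.int64)],
)

# Filter: a boolean mask built from one field selects whole records.
adults = students[students["age"] > 18]

# Sort: order by marks, breaking ties by age.
ranked = np.sort(students, order=["marks", "age"])

print(adults["name"])
print(ranked["name"])
```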

### By default, NumPy arrays (ndarray) are homogeneous, meaning all elements must be of the same data type.

- **Homogeneous arrays allow:**
  1. Vectorized operations  
  2. Efficient memory layout (contiguous blocks)  
  3. Fast computations using C under the hood  

- **NumPy** is designed for arrays of fixed-size elements — typically numerical data types (integers, floats), but also supports strings, booleans, dates, and custom structured types.

- All elements in a **NumPy array** share the same data type (`dtype`).

- The data is stored in a **contiguous block of memory** for efficiency.


It can support heterogeneous types using object arrays or structured dtypes, but these lose most performance benefits.

In [26]:
het = np.array([1, 'b', 3, True, 6.5])
het

array(['1', 'b', '3', 'True', '6.5'], dtype='<U32')

In [27]:
het.dtype # Little endian byte order - Unicode strings, each up to 32 characters long

dtype('<U32')

In [28]:
print(type(het[0]))

<class 'numpy.str_'>


In [29]:
# NumPy treats this as an array of generic Python objects.
# You lose all vectorized speed benefits of NumPy.
# Operations like +, broadcasting, etc., don't work efficiently.
het = np.array([1, 'b', 3, True, 6.5], dtype=object)
het

array([1, 'b', 3, True, 6.5], dtype=object)

In [30]:
het.dtype

dtype('O')

In [31]:
print(type(het[0]))

<class 'int'>


---

In [2]:
import numpy as np

**From built-in python containers (Lists, Tuples etc)**

In [3]:
# using list
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [4]:
# using tuples
np.array((1, 2, 3, 4))

array([1, 2, 3, 4])

**Upcasting in Numpy**
- Upcasting in NumPy refers to the automatic conversion of data types when performing operations involving arrays of different types, so that no data is lost and the result has a compatible type that can hold all input values.

In [5]:
np.array([1, 2, 3.05])

array([1.  , 2.  , 3.05])
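
Upcasting also applies when arrays of different dtypes are combined in an operation. The sketch below uses explicitly typed arrays (the names `ints`, `floats`, and `flags` are illustrative) and `np.result_type` to ask NumPy which dtype a combination would produce.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.array([0.5, 1.5, 2.5], dtype=np.float32)
flags = np.array([True, False, True])

mixed = ints + floats    # int32 + float32 needs a wider float
counted = flags + ints   # bool is promoted to the integer type

print(mixed.dtype)
print(counted.dtype)
print(np.result_type(np.int8, np.uint8))  # smallest type that holds both ranges
```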

**Two Dimensional numpy array**

In [6]:
np.array([[1,2,3],[4,5,6]])

array([[1, 2, 3],
       [4, 5, 6]])

**Minimum dimensions 2**

In [7]:
np.array([1, 2, 3], ndmin=2)

array([[1, 2, 3]])

**dtype**

In [8]:
 np.array([1, 2, 3], dtype=complex)

array([1.+0.j, 2.+0.j, 3.+0.j])

**data consisting of more than one Data-type**

In [12]:
x = np.array([(1,2),(3,4.2)],dtype=[('a','i4'),('b',np.float32)])

In [13]:
x

array([(1, 2. ), (3, 4.2)], dtype=[('a', '<i4'), ('b', '<f4')])

In [14]:
x['a']

array([1, 3], dtype=int32)

In [15]:
x['b']

array([2. , 4.2], dtype=float32)

**np.asarray**

In [None]:
# np.array()   => copies the input by default (copy=True), even if it is already a NumPy array
# np.asarray() => only ensures the object is a NumPy array; an existing array with a matching dtype is returned as-is

In [16]:
 a = [1, 2]

In [17]:
np.asarray(a)

array([1, 2])

In [18]:
a

[1, 2]

In [19]:
a = np.array([1, 2]) #Existing arrays are not copied

In [20]:
np.asarray(a) is a

True

**If dtype is set, array is copied only if dtype does not match:**

In [29]:
a = np.array([1, 2], dtype=np.float32)

In [30]:
np.asarray(a, dtype=np.float32) is a

True

In [31]:
a

array([1., 2.], dtype=float32)

In [21]:
np.asarray(a, dtype=np.float64) is a

False

**numpy.copy**

In [22]:
a

array([1, 2])

In [24]:
np.array(a, copy=True)  is a

False

In [25]:
np.array(a, copy=False)  is a

True

**Create an array x, with a reference y and a copy z**

In [26]:
x = np.array([1, 2, 3])

In [27]:
y = x

In [28]:
z = np.copy(x)

**when we modify x, y changes, but not z**

In [29]:
x[0] = 10

In [30]:
x

array([10,  2,  3])

In [31]:
y

array([10,  2,  3])

In [32]:
z

array([1, 2, 3])
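
Besides plain references and full copies, basic slicing produces a third case: a view that shares memory with the original. The following sketch (variable names are illustrative) uses `np.shares_memory` to tell the cases apart.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
alias = x            # same object, just another name
view = x[1:3]        # new array object, but it shares x's data buffer
copy = x.copy()      # independent data

x[1] = 20            # modify the original

print(alias[1], view[0], copy[1])
print(np.shares_memory(x, view), np.shares_memory(x, copy))
```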

**Read and write data from a file**

**Text Files (.txt, .csv)**

In [34]:
data = np.loadtxt("data.txt", delimiter=",")
print(data)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]


In [35]:
data = data ** 2
data

array([[ 1.,  4.,  9.],
       [16., 25., 36.],
       [49., 64., 81.]])

In [38]:
np.savetxt("output_data.txt", data,fmt="%d")

**Binary Files (.npy, .npz)**

In [39]:
arr = np.array([1, 2, 3, 4])
np.save("array.npy", arr)

In [40]:
arr = np.load("array.npy")
print(arr)

[1 2 3 4]


In [41]:
# Save multiple arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.savez("arrays.npz", arr1=a, arr2=b)

In [42]:
# Load them back
data = np.load("arrays.npz")
print(data["arr1"])  # [1 2 3]
print(data["arr2"])  # [4 5 6]

[1 2 3]
[4 5 6]


| File Format     | Read Function                | Write Function                   | Notes                               |
| --------------- | ---------------------------- | -------------------------------- | ----------------------------------- |
| `.txt` / `.csv` | `loadtxt()` / `genfromtxt()` | `savetxt()`                      | Use delimiter for custom separators |
| `.npy`          | `load()`                     | `save()`                         | For single array (binary)           |
| `.npz`          | `load()`                     | `savez()` / `savez_compressed()` | For multiple arrays                 |


**Advantages of .npy and .npz over Text Files**
| Feature                  | `.npy` / `.npz` (Binary)                            | `.txt` / `.csv` (Text)                          |
| ------------------------ | --------------------------------------------------- | ----------------------------------------------- |
| **Efficiency**        | Much faster to read/write                           | Slower I/O                                      |
| **Storage Size**      | More compact                                        | Larger file size                                |
| **Multiple Arrays**   | `.npz` can store many arrays in one file            | Not supported; need multiple files              |
| **No Parsing Needed** | Directly loads as NumPy array                       | Needs delimiter, dtype parsing                  |
| **More Reliable**     | No risk of formatting issues                        | Sensitive to extra spaces, missing values, etc. |
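
A round trip through each format can be sketched as below. The file names and the temporary directory are assumptions made so the example cleans up after itself.

```python
import os
import tempfile

import numpy as np

matrix = np.arange(6, dtype=np.float64).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "matrix.csv")
    npy_path = os.path.join(tmp, "matrix.npy")
    npz_path = os.path.join(tmp, "bundle.npz")

    # Text: human-readable, but the values are formatted and parsed again.
    np.savetxt(txt_path, matrix, delimiter=",", fmt="%.3f")
    from_txt = np.loadtxt(txt_path, delimiter=",")

    # Binary .npy: stores dtype and shape together with the raw data.
    np.save(npy_path, matrix)
    from_npy = np.load(npy_path)

    # Compressed .npz: several named arrays in one file.
    np.savez_compressed(npz_path, first=matrix, second=matrix * 2)
    with np.load(npz_path) as bundle:
        second = bundle["second"]

print(from_txt)
print(from_npy.dtype, from_npy.shape)
print(second)
```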


**numpy.fromfunction**

In [None]:
# It constructs an array by calling the function once with arrays of coordinates (one array per axis), so the function must work on whole arrays.

In [69]:
np.fromfunction(lambda i, j: i == j, (3, 3), dtype=int)

array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]])

In [70]:
np.fromfunction(lambda i, j: i + j, (3, 3), dtype=int)

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
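
To see what `np.fromfunction` passes to the function, the sketch below builds the coordinate arrays by hand with `np.indices` and applies the same function to them. The helper name `add_indices` is illustrative.

```python
import numpy as np


def add_indices(i, j):
    # i and j arrive as whole coordinate arrays, not as single integers.
    return i + j


rows, cols = np.indices((3, 3))  # rows[r, c] == r and cols[r, c] == c
via_fromfunction = np.fromfunction(add_indices, (3, 3), dtype=int)
via_indices = add_indices(rows, cols)

print(rows)
print(np.array_equal(via_fromfunction, via_indices))
```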

In [None]:
#Create a new 1-dimensional array from an iterable object.

In [71]:
iterable = (x*x for x in range(5)) #generator

In [72]:
iterable

<generator object <genexpr> at 0x00000218D20552F0>

In [73]:
np.fromiter(iterable, float)

array([ 0.,  1.,  4.,  9., 16.])

In [74]:
#A new 1-D array initialized from text data in a string

In [75]:
np.fromstring('1 2', dtype=int, sep=' ')

array([1, 2])

In [76]:
np.fromstring('1, 2', dtype=int, sep=',')

array([1, 2])

**create a record array from a (flat) list of arrays**

In [44]:
x1 = [1, 2, 3]

In [45]:
x2 = [4.0, 5.0, 6.0] 

In [46]:
x3 = ['a', 'b', 'c']

In [49]:
list(zip(x1, x2, x3))

[(1, 4.0, 'a'), (2, 5.0, 'b'), (3, 6.0, 'c')]

In [86]:
r = np.array(list(zip(x1, x2, x3)), dtype=[('a', 'i4'), ('b', 'f4'), ('c', 'U1')])
print(r)
print(r['a'])


[(1, 4., 'a') (2, 5., 'b') (3, 6., 'c')]
[1 2 3]


**data types**

In [89]:
my_list = [1,2,3]
import numpy as np
arr = np.array(my_list)
print("Type/Class of this object:",type(arr))
print("Here is the vector\n--------------------\n",arr)

Type/Class of this object: <class 'numpy.ndarray'>
Here is the vector
--------------------
 [1 2 3]


In [52]:
my_mat = [[1,2,3],[4,5,6],[7,8,9]]
mat = np.array(my_mat)

print("Type/Class of this object:",type(mat))
print("Here is the matrix\n----------\n",mat,"\n----------")

#ndim gives the number of dimensions: 2 for a matrix, 1 for a vector
print("Dimension of this matrix: ",mat.ndim,sep='')

Type/Class of this object: <class 'numpy.ndarray'>
Here is the matrix
----------
 [[1 2 3]
 [4 5 6]
 [7 8 9]] 
----------
Dimension of this matrix: 2


In [51]:
#size gives the total number of elements
print("Size of this matrix: ", mat.size,sep='')

#shape gives the number of elements along each axis (dimension)
print("Shape of this matrix: ", mat.shape,sep='') 

#dtype gives the data type contained in the array
print("Data type of this matrix: ", mat.dtype,sep='') 

Size of this matrix: 9
Shape of this matrix: (3, 3)
Data type of this matrix: int64


In [53]:
my_mat = [[1.1,2,3],[4,5.2,6],[7,8.3,9]]
mat = np.array(my_mat)

print("Data type of the modified matrix: ", mat.dtype,sep='') 
#dtype gives the data type contained in the array

Data type of the modified matrix: float64


In [54]:
b = np.array([(1.5,2,3), (4,5,6)])

print("Matrix made from tuples, not lists\n---------------------------------------")
print(b)

Matrix made from tuples, not lists
---------------------------------------
[[1.5 2.  3. ]
 [4.  5.  6. ]]


## Numpy-5 : Creating NumPy Arrays and Matrix Using Built-in Functions

**arange and linspace**

In [6]:
import numpy as np

`range(start, stop, step)`

`np.arange(start, stop, step, dtype)`

In [None]:
help(np.arange)

In [4]:
list(range(5))

[0, 1, 2, 3, 4]

In [9]:
print("A series of numbers:",np.arange(5,16)) 
# A series of numbers from low to high

A series of numbers: [ 5  6  7  8  9 10 11 12 13 14 15]


In [10]:
print("Numbers spaced apart by 2:",np.arange(0,12,2)) 
# Numbers spaced apart by 2

Numbers spaced apart by 2: [ 0  2  4  6  8 10]


In [11]:
print("Numbers spaced apart by float:",np.arange(0,11,2.5)) 
# Numbers spaced apart by 2.5

Numbers spaced apart by float: [ 0.   2.5  5.   7.5 10. ]


In [12]:
print("Every 5th number from 50 in reverse order\n",np.arange(50,-1,-5))

Every 5th number from 50 in reverse order
 [50 45 40 35 30 25 20 15 10  5  0]
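
With a non-integer step, the number of elements `np.arange` returns depends on floating-point rounding, so the result can be one element longer than expected. When the element count matters, `np.linspace` is usually the safer choice. A small sketch with illustrative values:

```python
import numpy as np

# The length is computed from (stop - start) / step in floating point,
# so the stop value may or may not sneak in.
with_arange = np.arange(1.0, 1.3, 0.1)

# linspace takes the number of points explicitly.
with_linspace = np.linspace(1.0, 1.3, num=4)                      # includes 1.3
without_endpoint = np.linspace(1.0, 1.3, num=3, endpoint=False)   # stops before 1.3

print(len(with_arange), with_arange)
print(len(with_linspace), with_linspace)
print(len(without_endpoint), without_endpoint)
```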


**np.linspace**

`np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)`

In [102]:
print("10 linearly spaced numbers between 1 and 5\n--------------------------------------------")
print(np.linspace(1,5,10))

10 linearly spaced numbers between 1 and 5
--------------------------------------------
[1.         1.44444444 1.88888889 2.33333333 2.77777778 3.22222222
 3.66666667 4.11111111 4.55555556 5.        ]


In [13]:
np.linspace(1,5,10, dtype=np.int32)

array([1, 1, 1, 2, 2, 3, 3, 4, 4, 5], dtype=int32)

**np.logspace()**

`np.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None)`

In [None]:
# np.linspace(start, stop) gives values evenly spaced in linear space.

# np.logspace(start, stop) gives values evenly spaced in logarithmic space (from base**start to base**stop).

In [61]:
np.logspace(1, 2, 5)

array([ 10.        ,  17.7827941 ,  31.6227766 ,  56.23413252,
       100.        ])

In [56]:
np.linspace(1, 1024, num=5, dtype = np.int32)

array([   1,  256,  512,  768, 1024], dtype=int32)

In [58]:
np.logspace(0, 10, num=5, base=2, dtype = np.int32) # from 2**0 to 2**10, i.e. [1, 1024] with the endpoint included

array([   1,    5,   32,  181, 1024], dtype=int32)

In [None]:
# use case with learning rate => hyperparameter 
[0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1.0]

In [None]:
# So on a log scale, equal distances represent equal ratios (e.g., x10), not equal differences.
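
The learning-rate grid listed above can be generated with `np.logspace`. In this sketch the exponent range `-6` to `0` is read off that list, and the name `learning_rates` is illustrative.

```python
import numpy as np

# Seven values from 10**-6 to 10**0, each ten times the previous one.
learning_rates = np.logspace(-6, 0, num=7)

# On a log scale, neighbouring values differ by a constant ratio.
ratios = learning_rates[1:] / learning_rates[:-1]

print(learning_rates)
print(ratios)
```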

In [118]:
np.logspace(2.0, 3.0, num=4, endpoint=False)

array([100.        , 177.827941  , 316.22776602, 562.34132519])

In [119]:
np.logspace(2.0, 3.0, num=4, base=2.0)

array([4.        , 5.0396842 , 6.34960421, 8.        ])

**numpy array and matrix creation**

In [16]:
print("Vector of zeroes\n---------------------")
print(np.zeros(5))

Vector of zeroes
---------------------
[0. 0. 0. 0. 0.]


In [17]:
print("Matrix of zeroes\n--------------------")
print(np.zeros((3,4))) # Notice Tuples

Matrix of zeroes
--------------------
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [105]:
print("Vector of ones\n---------------------")
print(np.ones(5))

Vector of ones
---------------------
[1. 1. 1. 1. 1.]


In [18]:
print("Matrix of ones\n---------------------")
print(np.ones((5,2))) # Note matrix dimension specified by Tuples

Matrix of ones
---------------------
[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]


In [19]:
print("Matrix of 5's\n---------------------")
print(5*np.ones((3,5)))

Matrix of 5's
---------------------
[[5. 5. 5. 5. 5.]
 [5. 5. 5. 5. 5.]
 [5. 5. 5. 5. 5.]]
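
A constant matrix can also be built directly with `np.full`, which avoids the extra multiplication. The `*_like` helpers shown here copy the shape and dtype of an existing array; the variable names are illustrative.

```python
import numpy as np

fives = np.full((3, 5), 5.0)           # same values as 5 * np.ones((3, 5))
zeros_same_shape = np.zeros_like(fives)
sevens_same_shape = np.full_like(fives, 7)

print(fives)
print(zeros_same_shape.shape, sevens_same_shape.dtype)
```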


In [None]:
# The np.empty() function creates a new array of the given shape and type without initializing its entries.
# The values are arbitrary (whatever happens to be in that memory at the time), so fill the array before reading from it.

In [23]:
print("Empty matrix\n-------------")
np.empty((4,5))

Empty matrix
-------------


array([[6.23042070e-307, 4.67296746e-307, 1.69121096e-306,
        6.23054293e-307, 8.45593934e-307],
       [7.56593017e-307, 1.33511290e-306, 8.90092016e-307,
        8.90104918e-307, 9.34602321e-307],
       [3.11525958e-307, 1.69118108e-306, 8.06632139e-308,
        1.20160711e-306, 1.69119330e-306],
       [1.29062229e-306, 1.24610383e-306, 1.33510679e-306,
        1.37961709e-306, 2.56765117e-312]])

**np.reshape()**
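
A minimal sketch of reshaping, using illustrative names: the total number of elements must stay the same, one dimension may be given as `-1` so that NumPy infers it, and for a contiguous array like this one the result is a view onto the same data.

```python
import numpy as np

flat = np.arange(12)          # 12 elements in one dimension
grid = flat.reshape(3, 4)     # 3 rows x 4 columns
auto = flat.reshape(2, -1)    # -1 lets NumPy work out the 6 columns

grid[0, 0] = 99               # writes through to flat because grid is a view here

print(grid)
print(auto.shape)
print(flat[0])
```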

**Diagonal, Triangular, Lower Triangular and Upper Triangular Matrix**

In [None]:
# A triangular matrix is a square matrix where either:
#
# - all elements below the main diagonal are zero → Upper Triangular Matrix
#
# - all elements above the main diagonal are zero → Lower Triangular Matrix
#
# A matrix must be square to be classified as triangular.

| Matrix Type      | Description              | Zeros are where? | NumPy function |
| ---------------- | ------------------------ | ---------------- | -------------- |
| Diagonal         | Only diagonal non-zero   | Off-diagonal     | `np.diag()`    |
| Lower Triangular | Zero above main diagonal | Above diagonal   | `np.tril()`    |
| Upper Triangular | Zero below main diagonal | Below diagonal   | `np.triu()`    |


In [28]:
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

In [29]:
print("Diagonal matrix:")
print(np.diag(np.diag(A)))

Diagonal matrix:
[[1 0 0]
 [0 5 0]
 [0 0 9]]


In [30]:
print("Lower triangular matrix:")
print(np.tril(A))

Lower triangular matrix:
[[1 0 0]
 [4 5 0]
 [7 8 9]]


In [31]:
print("Upper triangular matrix:")
print(np.triu(A))

Upper triangular matrix:
[[1 2 3]
 [0 5 6]
 [0 0 9]]


In [40]:
A = np.array([[1, 2, 3, 4],
              [4, 5, 6, 7],
              [7, 8, 9, 10]])

In [41]:
print(A.ndim, A.shape, A.dtype)

2 (3, 4) int64


In [42]:
print("Diagonal matrix:")
print(np.diag(np.diag(A)))

Diagonal matrix:
[[1 0 0]
 [0 5 0]
 [0 0 9]]


In [43]:
print("Lower triangular matrix:")
print(np.tril(A))

Lower triangular matrix:
[[1 0 0 0]
 [4 5 0 0]
 [7 8 9 0]]


In [44]:
print("Upper triangular matrix:")
print(np.triu(A))

Upper triangular matrix:
[[ 1  2  3  4]
 [ 0  5  6  7]
 [ 0  0  9 10]]


**np.eye()**

In [None]:
# An identity matrix is a square matrix (same number of rows and columns) with:
# - 1s on the main diagonal (top-left to bottom-right)
# - 0s everywhere else

`np.eye(N, M=None, k=0, dtype=float, order='C')`

| Parameter | Description                                                                                 |
| --------- | ------------------------------------------------------------------------------------------- |
| `N`       | Number of rows (required)                                                                   |
| `M`       | Number of columns (optional, defaults to `N`)                                               |
| `k`       | Index of the diagonal:<br> `0` = main diagonal,<br> `1` = above main,<br> `-1` = below main |
| `dtype`   | Data type of output (default = `float`)                                                     |
| `order`   | `'C'` for row-major (default), `'F'` for column-major                                       |


In [46]:
mat1 = np.eye(4) 
print("Identity matrix of dimension", mat1.shape)
print(mat1)

Identity matrix of dimension (4, 4)
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


In [48]:
mat1 = np.eye(3,4) 
print(mat1)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]]
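
The `k` parameter from the table above shifts the line of ones away from the main diagonal. The sketch below also checks the defining property of the identity matrix; the matrix `A` and the other names are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
I = np.eye(2)

# Multiplying by the identity leaves a matrix unchanged.
unchanged = A @ I

super_diag = np.eye(3, k=1, dtype=int)    # ones just above the main diagonal
sub_diag = np.eye(3, k=-1, dtype=int)     # ones just below the main diagonal

print(unchanged)
print(super_diag)
print(sub_diag)
```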


**Extract a diagonal or construct a diagonal array.**

#### Diagonals in a Rectangular Matrix

Given a matrix **A** of size \( n \times m \) (n rows, m columns):


#### What is a diagonal?

- The **main diagonal** runs from the top-left element \((0,0)\) to \((\min(n, m)-1, \min(n, m)-1)\).
- Other diagonals run parallel to the main diagonal, either above or below it.


#### Number of diagonals

- **Total diagonals** (including main diagonal):

\[
\text{total diagonals} = n + m - 1
\]

- **Upper diagonals** (above main diagonal):

\[
m - 1
\]

- **Lower diagonals** (below main diagonal):

\[
n - 1
\]


#### Example

For a \(4 \times 3\) matrix \((n=4, m=3)\):

- Total diagonals = \(4 + 3 - 1 = 6\)
- Upper diagonals = \(3 - 1 = 2\)
- Lower diagonals = \(4 - 1 = 3\)


#### Accessing diagonals in NumPy

Use `np.diag(A, k)` where:

- `k = 0`: main diagonal
- `k > 0`: k-th diagonal **above** main diagonal
- `k < 0`: k-th diagonal **below** main diagonal
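
The diagonal count can be verified by walking over every valid offset `k`. This sketch uses the 4 x 3 case from the example above; the names `A`, `offsets`, and `diagonals` are illustrative.

```python
import numpy as np

n, m = 4, 3
A = np.arange(n * m).reshape(n, m)

# Valid offsets run from -(n - 1) (lowest diagonal) to m - 1 (highest diagonal).
offsets = range(-(n - 1), m)
diagonals = {k: np.diag(A, k) for k in offsets}

for k, values in diagonals.items():
    print(f"k = {k:2d}: {values}")

print("total diagonals:", len(diagonals))
```

Every element of `A` lies on exactly one of these diagonals, so their lengths add up to `n * m`.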



In [62]:
print(np.arange(9))
print(100*"-")
print(np.arange(9).reshape((3,3)))

[0 1 2 3 4 5 6 7 8]
----------------------------------------------------------------------------------------------------
[[0 1 2]
 [3 4 5]
 [6 7 8]]


In [63]:
x = np.arange(9).reshape((3,3))

In [64]:
x

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [65]:
np.diag(x)

array([0, 4, 8])

In [125]:
np.diag(x, k=1)

array([1, 5])

In [126]:
np.diag(x, k=-1)

array([3, 7])

In [127]:
np.diag(np.diag(x))

array([[0, 0, 0],
       [0, 4, 0],
       [0, 0, 8]])

**NumPy allows triangular functions like `np.tril()` and `np.triu()` to operate on rectangular matrices for flexibility and general-purpose use — even though the strict mathematical definition of triangular matrices only applies to square matrices.**



In [128]:
#An array with ones at and below the given diagonal and zeros elsewhere.

In [68]:
np.tri(3, 5, 0, dtype=int)

array([[1, 0, 0, 0, 0],
       [1, 1, 0, 0, 0],
       [1, 1, 1, 0, 0]])

In [130]:
np.tri(3, 5, -1)

array([[0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0.],
       [1., 1., 0., 0., 0.]])

In [None]:
#return a Lower triangle of an array.

In [69]:
arr = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(arr)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [70]:
np.tril(arr, -1)

array([[ 0,  0,  0],
       [ 4,  0,  0],
       [ 7,  8,  0],
       [10, 11, 12]])

In [132]:
#return Upper triangle of an array.

In [71]:
np.triu(arr, -1)

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 0,  8,  9],
       [ 0,  0, 12]])

**Random number generation**

1. Uniform Distribution : Occurrence of each value from 1 to 6 when a die is rolled 10,000 times
2. Normal Distribution : Heights of people in India for a sample of 1 million people

In [72]:
print("Random number generation (from Uniform distribution)")
print(np.random.rand(2,3)) 
# 2 by 3 matrix with random numbers in the half-open range [0, 1). Note that no tuple is necessary.

Random number generation (from Uniform distribution)
[[0.07555202 0.09988798 0.84163818]
 [0.81425217 0.20967887 0.12520277]]


In [73]:
print("Numbers from Normal distribution with zero mean and standard deviation 1 i.e. standard normal")
print(np.random.randn(4,3))

Numbers from Normal distribution with zero mean and standard deviation 1 i.e. standard normal
[[-0.22912484  1.71437712 -1.0000993 ]
 [-0.8236714   0.24857661 -0.26972985]
 [ 1.68099442 -0.77396406  0.37860004]
 [ 0.59336717 -0.31427903  0.4722494 ]]


In [74]:
print("Random integer vector:",np.random.randint(1,100,10)) #randint (low, high, # of samples to be drawn); high is exclusive
print ("\nRandom integer matrix")

Random integer vector: [74 16 54 89 14 68  8 17 26 19]

Random integer matrix


In [75]:
print(np.random.randint(1,100,(4,4))) #randint (low, high, # of samples to be drawn in a tuple to form a matrix)
print("\n20 samples drawn from a dice throw:",np.random.randint(1,7,20)) # 20 samples drawn from a dice throw

[[96  8 16 60]
 [44 22 35 13]
 [ 2 29 73 94]
 [19 83 12 44]]

20 samples drawn from a dice throw: [3 4 6 3 5 4 3 5 3 1 5 4 5 6 5 1 1 5 5 6]
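
The same draws can be made with NumPy's newer `Generator` interface, created by `np.random.default_rng`. The sketch below assumes a seed of 42 purely so that reruns produce the same numbers.

```python
import numpy as np

rng = np.random.default_rng(seed=42)    # the seed value is an arbitrary choice

uniform = rng.random((2, 3))            # uniform on [0, 1); shape is passed as a tuple
normal = rng.standard_normal((4, 3))    # mean 0, standard deviation 1
dice = rng.integers(1, 7, size=20)      # integers 1..6; the upper bound is exclusive

print(uniform)
print(normal)
print(dice)
```

Creating a second generator with the same seed reproduces the same sequence, which is useful for repeatable experiments.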


---