
#**NumPy**
NumPy, which stands for Numerical Python, is a library consisting of multidimensional array objects and a collection of routines for processing those arrays. With NumPy, you can perform mathematical and logical operations on arrays.

In [None]:
import numpy as np

In [None]:
a = np.array([1, 2, 3])
b = np.array([[1,2,3],
              [4,5,6]])
print(a)
print("\n")
print(b)

[1 2 3]


[[1 2 3]
 [4 5 6]]


In [None]:
print("shape of a: ",a.shape)
print("shape of b: ",b.shape)

shape of a:  (3,)
shape of b:  (2, 3)


**NOTE:** If the rows of an array have an inconsistent number of elements, NumPy raises an error.

Example:

In [None]:
b = np.array([[1, 3],
              [4, 5, 6]])
b

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

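We can pass elements of different data types to `np.array()`, but the resulting array has a single dtype: NumPy upcasts all values to a common type (here `True` becomes `1.` and the integers become floats).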

In [None]:
c = np.array([[1.5, 2, 3],
            [True, 5.9, 8]])
c

array([[1.5, 2. , 3. ],
       [1. , 5.9, 8. ]])
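
The upcasting can be checked through the `dtype` attribute. The following is a small sketch (the variable names `mixed`, `ints`, and `as_float` are illustrative and not part of the original notebook) showing that mixed inputs end up with one common type:

```python
import numpy as np

# Mixed bool/int/float inputs are upcast to a single common dtype.
mixed = np.array([[1.5, 2, 3],
                  [True, 5.9, 8]])
print(mixed.dtype)          # float64

# Integers only: the array keeps an integer dtype (exact width is platform-dependent).
ints = np.array([1, 2, 3])
print(ints.dtype)

# A dtype can also be requested explicitly.
as_float = np.array([1, 2, 3], dtype=float)
print(as_float.dtype)       # float64
```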

**reshape()** returns an array with a different shape without changing the data. The result is a view of the original array when possible, and a copy otherwise.

In [None]:
a = np.array([1, 2, 3, 4, 5, 6])
b = a.reshape((2, 3))
print(b)

print("\n")
c = b.reshape((6,))
print(c)

[[1 2 3]
 [4 5 6]]


[1 2 3 4 5 6]
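
Whether `reshape()` produced a view can be checked with `np.shares_memory()`. The sketch below assumes a freshly created, contiguous array; it also uses `-1` to let NumPy infer one dimension:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

# -1 lets NumPy infer that dimension from the array's size.
b = a.reshape(2, -1)
print(b.shape)                      # (2, 3)

# For a contiguous array, reshape returns a view that shares memory with a.
print(np.shares_memory(a, b))       # True

# Writing through the view therefore changes the original array too.
b[0, 0] = 100
print(a)
```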


In [None]:
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [None]:
np.ones((2,3), dtype=int)

array([[1, 1, 1],
       [1, 1, 1]])

`arange()` is used to generate values in a specified interval with a specific step size, similar to Python’s built-in `range()` but returns a NumPy array.

**Syntax:**
`numpy.arange([start,] stop[, step], dtype=None)`

In [None]:
a = np.arange(0, 10, 2)
print(a)

[0 2 4 6 8]


In [None]:
np.arange(1,11)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

`linspace()` is used to generate evenly spaced values over a specified range.

**Syntax:**
`numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)`

**Parameters:**
* `start`: The starting value of the sequence.

* `stop`: The end value of the sequence.

* `num`: (default=50) Number of samples to generate.

* `endpoint`: (default=True) If True, include stop in the array.

* `retstep`: (default=False) If True, return the step between values.

* `dtype`: The data type of the output array.

In [None]:
np.linspace(1, 11, num=11) #evenly spaced eleven numbers

array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

In [None]:
np.linspace(1, 12, num=11)

array([ 1. ,  2.1,  3.2,  4.3,  5.4,  6.5,  7.6,  8.7,  9.8, 10.9, 12. ])

In [None]:
np.linspace(0, 5, num=5, endpoint=False) #Excluding endpoint

array([0., 1., 2., 3., 4.])

In [None]:
#Example with step returned
arr, step = np.linspace(0, 10, num=5, retstep=True)
print(arr)
print("Step size:", step)

[ 0.   2.5  5.   7.5 10. ]
Step size: 2.5


**Difference between `arange()` and `linspace()`**
<div style="font-size:25px">

<table>
  <tr>
    <th>Feature</th>
    <th><code>arange()</code></th>
    <th><code>linspace()</code></th>
  </tr>
  <tr>
    <td>Input</td>
    <td>start, stop, step</td>
    <td>start, stop, num(Number of elements)</td>
  </tr>
  <tr>
    <td>Step control</td>
    <td>You specify step</td>
    <td>NumPy calculates step automatically</td>
  </tr>
  <tr>
    <td>Stop value</td>
    <td><b>Excluded</b></td>
    <td><b>Included</b> by default (<code>endpoint=True</code>)</td>
  </tr>
  <tr>
    <td>Precision</td>
    <td>May suffer from float issues</td>
    <td>More accurate with floats</td>
  </tr>
  <tr>
    <td>Use case</td>
    <td>Known step</td>
    <td>Known number of samples</td>
  </tr>
</table>

</div>
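
The precision row of the table can be illustrated with a float step. This is only a sketch; the exact `arange()` result depends on floating-point rounding, so no output is asserted for it here:

```python
import numpy as np

# With a float step, the number of elements depends on floating-point
# rounding, so the result may contain more values than expected.
stepped = np.arange(1, 1.3, 0.1)
print(stepped, len(stepped))

# linspace takes the number of samples directly, so the length is fixed
# and both endpoints are exact.
sampled = np.linspace(1, 1.3, num=4)
print(sampled, len(sampled))
```

When the number of samples matters more than the exact step, `linspace()` is usually the safer choice.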


In [None]:
#random numbers
np.random.randint(1,30,15)

array([ 6, 24, 15, 18,  6,  8,  9, 15,  5, 10, 12, 11,  2,  4, 27])

In [None]:
# 2-D array
np.random.randint(20,50,(3,4))

array([[40, 20, 34, 20],
       [39, 43, 43, 45],
       [23, 22, 48, 37]])

In [None]:
#random.rand
np.random.rand(3,4) #random float numbers

#generates random numbers from a uniform distribution over the half-open interval [0, 1).

array([[0.2885344 , 0.77999785, 0.37544616, 0.58003458],
       [0.97877334, 0.47333344, 0.71510201, 0.2141226 ],
       [0.67679787, 0.00518503, 0.08324744, 0.09340425]])
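
The calls above give different numbers on every run. As an alternative (not used in the original notebook), NumPy's `Generator` API accepts a seed so results can be reproduced; the seed value `42` below is an arbitrary choice:

```python
import numpy as np

# Seeded generator: the same seed reproduces the same sequence.
rng = np.random.default_rng(seed=42)

ints = rng.integers(1, 30, size=15)        # integers in [1, 30)
grid = rng.integers(20, 50, size=(3, 4))   # 2-D array of integers in [20, 50)
floats = rng.random((3, 4))                # uniform floats in [0, 1)

print(ints)
print(grid)
print(floats)
```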

#Accessing an element in numpy array

In [None]:
b

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
b[0] #First row of the array

array([1, 2, 3])

In [None]:
b[1] #Second row of the array

array([4, 5, 6])

In [None]:
b[:] # All rows (same as b)

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
print(b[:, 0]) #All rows, first column (i.e., 1st element of each row)

[1 4]


In [None]:
print(b[:, 2]) #All rows, third column (i.e., 3rd element of each row)

[3 6]


In [None]:
b[0, 0] # b[0, 0] and b[0][0] both give the element at the first row and first column

np.int64(1)

The output `np.int64(1)` means the element is a NumPy scalar of type `numpy.int64`, not a plain Python `int`.

**If you just want the plain Python integer:**

Use `.item()` or `int()` to convert it:

In [None]:
print(b[0, 0].item())
# or
print(int(b[0, 0]))

1


In [None]:
int(b[1,2])

# This means access the element at:
# Row Index 1(i.e., the second row → [4, 5, 6])
# Column Index 2(i.e., the third element in that row → 6)

6

In [None]:
b

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
b > 3

array([[False, False, False],
       [ True,  True,  True]])

In [None]:
#To get those numbers which are greater than 3
b[b > 3]

array([4, 5, 6])

#Numpy axis
An **axis** refers to a **dimension** along which operations like `sum()`, `mean()`, or `concatenate()` are performed.

* Think of it as a direction in the array:

* **Axis 0** → down the rows (**column-wise**)

* **Axis 1** → across the columns(**row-wise**)

In [None]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

###**Operations using axis**

1. `np.sum(a, axis=0)` → Sum column-wise

In [None]:
np.sum(a, axis=0)  # [1+4, 2+5, 3+6] = [5, 7, 9]

array([5, 7, 9])

2. `np.sum(a, axis=1)` → Sum row-wise

In [None]:
np.sum(a, axis=1)  # [1+2+3, 4+5+6] = [6, 15]

3. `np.mean(a, axis=0)` → Mean down columns (column-wise)

In [None]:
np.mean(a, axis=0)
# [ (1+4)/2, (2+5)/2, (3+6)/2 ] → [2.5, 3.5, 4.5]

array([2.5, 3.5, 4.5])

4. `np.mean(a, axis=1)` → Mean of each row (row-wise)

In [None]:
np.mean(a, axis=1)
# [ (1+2+3)/3, (4+5+6)/3 ] → [2.0, 5.0]

array([2., 5.])

#### **np.concatenate()**

In [None]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

b = np.array([[7, 8, 9],
              [10, 11, 12]])

`np.concatenate((a, b), axis=0)` → Add rows (stack vertically)

In [None]:
# Shape becomes (4, 3)
np.concatenate((a, b), axis=0)

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

`np.concatenate((a, b), axis=1)` → Add columns (stack horizontally)

In [None]:
# Shape becomes (2, 6)
np.concatenate((a, b), axis=1)

array([[ 1,  2,  3,  7,  8,  9],
       [ 4,  5,  6, 10, 11, 12]])

**✅ Rule of Thumb:**

* **Axis 0** = operate down the rows (column-wise)

* **Axis 1** = operate across the columns (row-wise)

* Higher dimensions: axes are numbered in the order of `shape`, so axis 0 is the outermost dimension and the last axis is the innermost.
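
The rule of thumb extends to arrays with more than two dimensions. A minimal sketch with an illustrative 3-D array `x` shows that reducing along an axis removes that axis from the shape:

```python
import numpy as np

# A 3-D array with shape (2, 3, 4): 2 blocks, each with 3 rows and 4 columns.
x = np.arange(24).reshape(2, 3, 4)

# Reducing along an axis removes that axis from the shape.
print(np.sum(x, axis=0).shape)   # (3, 4)
print(np.sum(x, axis=1).shape)   # (2, 4)
print(np.sum(x, axis=2).shape)   # (2, 3)

# axis=-1 always refers to the last (innermost) axis.
print(np.array_equal(np.sum(x, axis=-1), np.sum(x, axis=2)))  # True
```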

###NumPy Stacking

In NumPy, stacking means joining multiple arrays along a new or existing axis. There are several stacking functions available:

**1. np.stack()**

Stacks arrays along a new axis.
All input arrays must have the same shape.

In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Stack along a new axis (default axis=0)
result = np.stack((a, b))
print(result)

[[1 2 3]
 [4 5 6]]


You can change the axis:

In [None]:
np.stack((a, b), axis=1)
# Output: [[1 4]
#          [2 5]
#          [3 6]]

**2. np.hstack()**

Horizontal stacking: stacks arrays column-wise (along axis 1 for 2-D arrays; 1-D arrays are joined end to end along axis 0).

In [None]:
a = np.array([[1], [2], [3]])
b = np.array([[4], [5], [6]])

np.hstack((a, b))

array([[1, 4],
       [2, 5],
       [3, 6]])

**3. np.vstack()**

Vertical stacking — stacks arrays on top of each other (axis=0).

In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

np.vstack((a, b))

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
arr1 = np.array([8, 9, 0])
arr2 = np.array([5, 6, 7])

vertical_stack = np.vstack((arr1,arr2))
horizontal_stack = np.hstack((arr1,arr2))

print("Vertical Stack: ")
print(vertical_stack)
print("Horizontal Stack: ")
print(horizontal_stack)
print("\n")
print(np.shape(vertical_stack))
print(np.shape(horizontal_stack))

Vertical Stack: 
[[8 9 0]
 [5 6 7]]
Horizontal Stack: 
[8 9 0 5 6 7]


(2, 3)
(6,)


**4. np.dstack()**

Stacks along depth (third axis).

In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

np.dstack((a, b))

array([[[1, 4],
        [2, 5],
        [3, 6]]])

In [None]:
b = np.array([1,2,3,4,5,6,7,8])
c = b.reshape((2,4))
print(c)
a = np.transpose(c)
a

[[1 2 3 4]
 [5 6 7 8]]


array([[1, 5],
       [2, 6],
       [3, 7],
       [4, 8]])

###NumPy Broadcasting

Broadcasting automatically expands the smaller array(s) to match the shape of the larger one during arithmetic operations, following specific rules.

In [None]:
a = np.array([1, 2, 3])       # shape (3,)
b = np.array([[10], [20]])    # shape (2,1)

# Broadcasting b to shape (2,3)
result = a + b
print(result)

[[11 12 13]
 [21 22 23]]
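
The broadcasting rules can be summarised as follows: shapes are compared from the trailing dimension backwards, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below checks a few shape pairs with `np.broadcast_shapes()`, which assumes NumPy 1.20 or newer; the `compatible` flag is an illustrative name:

```python
import numpy as np

# Shapes are aligned from the right; two dimensions are compatible when
# they are equal or when one of them is 1.
print(np.broadcast_shapes((3,), (2, 1)))      # (2, 3)
print(np.broadcast_shapes((5,), (5, 1)))      # (5, 5)
print(np.broadcast_shapes((2, 3), (3,)))      # (2, 3)

# Incompatible shapes raise a ValueError instead of being broadcast.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    compatible = True
except ValueError as exc:
    compatible = False
    print("Cannot broadcast:", exc)
```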


In [None]:
arr1 = np.array([1,2,3,4,5])
print(np.shape(arr1))
arr2 = np.array([[6],[7],[8],[9],[10]])
print(np.shape(arr2))
print("\n")
result = arr1 + arr2  # avoid shadowing the built-in sum()
print(result)

(5,)
(5, 1)


[[ 7  8  9 10 11]
 [ 8  9 10 11 12]
 [ 9 10 11 12 13]
 [10 11 12 13 14]
 [11 12 13 14 15]]


###**Common Use cases of Broadcasting:**

Adding a scalar to an array:

In [None]:
arr = np.array([1, 2, 3]) + 5
print(arr)

[6 7 8]


Row-wise or column-wise operations:

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([10, 20, 30])
print(a + b)

[[11 22 33]
 [14 25 36]]


Element-wise operations with reshaped arrays:

In [None]:
a = np.array([1, 2, 3])
b = a.reshape(3, 1)  # shape (3,1)
print(a + b)         # shapes: (3,) and (3,1)

[[2 3 4]
 [3 4 5]
 [4 5 6]]


In [None]:
arr1 = np.random.randint(1, 10, size= 5)
print(arr1)
print(np.shape(arr1))
arr2 = np.array([[6]])
print(np.shape(arr2))
print("\n")
result = arr1 + arr2  # avoid shadowing the built-in sum()
print(result)

[3 8 7 3 6]
(5,)
(1, 1)


[[ 9 14 13  9 12]]


#**NumPy - Practice Exercises, Questions, and Solutions**

1. Create an empty and a full NumPy array

In [None]:
# Create an uninitialized array of shape (3, 4); np.empty does not set
# the values, so the contents are arbitrary
empty_array = np.empty((3, 4))
print("Empty Array:\n", empty_array)

# Create a full array of shape (3, 3) filled with the value 5
full_array = np.full((3, 3), 5)
print("Full Array:\n", full_array)

Empty Array:
 [[3.47941232e-315 0.00000000e+000 2.10077583e-312 6.79038654e-313]
 [2.22809558e-312 2.14321575e-312 2.35541533e-312 6.79038654e-313]
 [2.22809558e-312 2.14321575e-312 2.46151512e-312 2.41907520e-312]]
Full Array:
 [[5 5 5]
 [5 5 5]
 [5 5 5]]


2. Check whether a Numpy array contains a specified row

In [None]:
arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [11, 12, 13, 14, 15],
                [16, 17, 18, 19, 20]
               ])

print(arr)

# check for some lists
print([1, 2, 3, 4, 5] in arr.tolist())
print([16, 17, 20, 19, 18] in arr.tolist())

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]]
True
False


3. Flatten a 2D NumPy array into 1D array

In [None]:
#M-1: Using ravel():
#Returns a flattened 1D array — similar to flatten(), but returns a view (if possible) instead of a copy:
arr = np.array([[1, 2, 3], [2, 4, 5], [1, 2, 3]])
flattened = arr.ravel()

print("Flattened array using ravel():", flattened)

#M-2: Using reshape(): You can reshape it into a 1D array using -1:
flattened = arr.reshape(-1)
print("Flattened array using reshape():", flattened)

#M-3:  Using np.hstack()
flattened = np.hstack(arr)
print("Flattened using hstack():", flattened)

#M-4: Using np.concatenate() with list comprehension
#Concatenates all rows into a single 1D array:
flattened = np.concatenate([row for row in arr])
print("Flattened using concatenate + list comprehension:", flattened)

Flattened array using ravel(): [1 2 3 2 4 5 1 2 3]
Flattened array using reshape(): [1 2 3 2 4 5 1 2 3]
Flattened using hstack(): [1 2 3 2 4 5 1 2 3]
Flattened using concatenate + list comprehension: [1 2 3 2 4 5 1 2 3]


4. Replace NumPy array elements that don't satisfy the given condition

**Method-1: Using Boolean Masking and relational operators**

Example 1: In a 1-D NumPy array

In [None]:
# Creating a 1-D Numpy array
arr = np.array([75, 42, 60, 30])
print("Given array:")
print(arr)

print("\nReplace all elements of array which are greater than 50 to 15")
arr[arr > 50] = 15

print("New array :\n")
print(arr)

Given array:
[75 42 60 30]

Replace all elements of array which are greater than 50 to 15
New array :

[15 42 15 30]


Example 2: In 2-D Numpy array

In [None]:
# Creating a 2-D Numpy array
n_arr = np.array([[45, 52, 10],
                  [5, 50, 25]])
print("Given array:")
print(n_arr)

print("\nReplace all elements of array which are greater than 30 to 5")
n_arr[n_arr > 30] = 5

print("New array :\n")
print(n_arr)

Given array:
[[45 52 10]
 [ 5 50 25]]

Replace all elements of array which are greater than 30 to 5
New array :

[[ 5  5 10]
 [ 5  5 25]]


Example 3: In 3-D Numpy array

In [None]:
n_arr = np.array([[[11, 25, 70], [30, 45, 55], [20, 45.8, 7.1]],
                  [[50, 65, 8], [70, 85, 10], [11, 22.2, 33.6]],
                  [[19, 69, 36], [1, 5, 24], [4.9, 20.8, 96.7]]])

print("Given array:")
print(n_arr)

print("\nReplace all elements of array which are less than 10 to Nan")
n_arr[n_arr < 10.] = np.nan

print("New array :\n")
print(n_arr)

Given array:
[[[11.  25.  70. ]
  [30.  45.  55. ]
  [20.  45.8  7.1]]

 [[50.  65.   8. ]
  [70.  85.  10. ]
  [11.  22.2 33.6]]

 [[19.  69.  36. ]
  [ 1.   5.  24. ]
  [ 4.9 20.8 96.7]]]

Replace all elements of array which are less than 10 to Nan
New array :

[[[11.  25.  70. ]
  [30.  45.  55. ]
  [20.  45.8  nan]]

 [[50.  65.   nan]
  [70.  85.  10. ]
  [11.  22.2 33.6]]

 [[19.  69.  36. ]
  [ nan  nan 24. ]
  [ nan 20.8 96.7]]]


**Method 2: Using numpy.where()**

Example 1:

In [None]:
# Creating a 2-D Numpy array
n_arr = np.array([[45, 52, 10],
                  [1, 5, 25]])

print("Given array:")
print(n_arr)

print("\nReplace all elements of array which are \
greater than or equal to 25 to 0")

print("else remains the same ")
print(np.where(n_arr >= 25, 0, n_arr))

Given array:
[[45 52 10]
 [ 1  5 25]]

Replace all elements of array which are greater than or equal to 25 to 0
else remains the same 
[[ 0  0 10]
 [ 1  5  0]]


Example 2:

In [None]:
# Creating a 2-D Numpy array
n_arr = np.array([[45, 52, 10],
                  [1, 5, 25],
                  [50, 40, 81]])

print("Given array:")
print(n_arr)

print("\nReplace all elements of array which are \
less than or equal to 25 with Nan")

print("else with 1 ")
print(np.where(n_arr <= 25, np.nan, 1))

Given array:
[[45 52 10]
 [ 1  5 25]
 [50 40 81]]

Replace all elements of array which are less than or equal to 25 with Nan
else with 1 
[[ 1.  1. nan]
 [nan nan nan]
 [ 1.  1.  1.]]


5. Return the indices of elements where the given condition is satisfied

###**np.where()** in Python

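Syntax:

`numpy.where(condition[, x, y])`

Where `condition` is True, yield `x`; otherwise yield `y`. When only `condition` is given, `np.where` returns the indices of the elements for which it is True.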

In [None]:
arr = np.array([10, 15, 20, 25, 30])

# Use np.where() to replace values based on condition
# If the value is greater than 20, return 1, otherwise return 0
result = np.where(arr > 20, 1, 0)

print(result)

[0 0 0 1 1]


Example 2: In this example, for elements where the condition arr1 > 20 is true, the corresponding element from arr1 is chosen. Otherwise, the element from arr2 is selected.

In [None]:
arr1 = np.array([10, 15, 20, 25, 30])
arr2 = np.array([100, 150, 200, 250, 300])

# Use np.where() to select elements from arr1 where the condition is true, and arr2 otherwise
result = np.where(arr1 > 20, arr1, arr2)

print(result)

[100 150 200  25  30]
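
The examples above use the three-argument form. To get the indices where a condition holds, `np.where()` can be called with the condition alone. A minimal sketch (the array `grid` is illustrative):

```python
import numpy as np

arr = np.array([10, 15, 20, 25, 30])

# With only a condition, np.where returns a tuple of index arrays,
# one array per dimension.
idx = np.where(arr > 20)
print(idx)
print(arr[idx])       # [25 30]

# For a 2-D array, the tuple holds row indices and column indices.
grid = np.array([[1, 8], [9, 2]])
rows, cols = np.where(grid > 5)
print(rows, cols)     # [0 1] [1 0]

# np.argwhere returns the same positions as (row, col) pairs.
print(np.argwhere(grid > 5))
```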


6. How to access different rows of a multidimensional NumPy array?

example: Accessing the First and Last row of a 2-D NumPy array

In [None]:
arr = np.array([[10, 20, 30],
                [40, 5, 66],
                [70, 88, 94]])

print("Given Array :")
print(arr)

# Access the First and Last rows of array
res_arr = arr[[0,2]]
print("\nAccessed Rows :")
print(res_arr)

Given Array :
[[10 20 30]
 [40  5 66]
 [70 88 94]]

Accessed Rows :
[[10 20 30]
 [70 88 94]]


## Mathematical Operations

###**1. Arithmetic Operations**
(Element-wise, broadcasting supported)

| Operation      | Function            | Example                         |
| -------------- | ------------------- | ------------------------------- |
| Addition       | `np.add(a, b)`      | `a + b`                         |
| Subtraction    | `np.subtract(a, b)` | `a - b`                         |
| Multiplication | `np.multiply(a, b)` | `a * b`                         |
| Division       | `np.divide(a, b)`   | `a / b`                         |
| Floor Division | `np.floor_divide(a, b)` | `a // b`                    |
| Modulus        | `np.mod(a, b)`      | `a % b` or `np.remainder(a, b)` |
| Power          | `np.power(a, b)`    | `a ** b`                        |


###**2. Unary Operations**
| Operation  | Function         | Description              |
| ---------- | ---------------- | ------------------------ |
| Negation   | `np.negative(a)` | `-a`                     |
| Absolute   | `np.abs(a)`      | Absolute value           |
| Rounding   | `np.round(a)`    | Round to nearest integer (halves go to the nearest even value) |
| Floor      | `np.floor(a)`    | Round down               |
| Ceil       | `np.ceil(a)`     | Round up                 |
| Truncation | `np.trunc(a)`    | Drop decimal part        |


###**3. Exponential & Logarithmic**

| Operation   | Function      |
| ----------- | ------------- |
| Exponential | `np.exp(a)`   |
| Natural log | `np.log(a)`   |
| Log base 10 | `np.log10(a)` |
| Log base 2  | `np.log2(a)`  |
| Log(1 + x)  | `np.log1p(a)` |
| exp(x) - 1  | `np.expm1(a)` |


###**4. Trigonometric Functions**

| Function         | Inverse/Other                   |
| ---------------- | ------------------------------- |
| `np.sin(x)`      | `np.arcsin(x)`                  |
| `np.cos(x)`      | `np.arccos(x)`                  |
| `np.tan(x)`      | `np.arctan(x)`                  |
| `np.sinh(x)`     | `np.arcsinh(x)`                 |
| `np.cosh(x)`     | `np.arccosh(x)`                 |
| `np.tanh(x)`     | `np.arctanh(x)`                 |
| Angle Conversion | `np.deg2rad()` / `np.rad2deg()` |


###**5. Statistical Operations**

| Operation     | Function                       |
| ------------- | ------------------------------ |
| Minimum       | `np.min(a)` or `a.min()`       |
| Maximum       | `np.max(a)`                    |
| Sum           | `np.sum(a)`                    |
| Product       | `np.prod(a)`                   |
| Mean          | `np.mean(a)`                   |
| Median        | `np.median(a)`                 |
| Standard Dev  | `np.std(a)`                    |
| Variance      | `np.var(a)`                    |
| Percentile    | `np.percentile(a, q)`          |
| Argmin/Argmax | `np.argmin(a)`, `np.argmax(a)` |


###**6. Linear Algebra Operations**

| Operation       | Function                     |
| --------------- | ---------------------------- |
| Dot Product     | `np.dot(a, b)`               |
| Matrix Multiply | `np.matmul(a, b)` or `a @ b` |
| Transpose       | `a.T` or `np.transpose(a)`   |
| Inverse         | `np.linalg.inv(a)`           |
| Determinant     | `np.linalg.det(a)`           |
| Eigenvalues and eigenvectors | `np.linalg.eig(a)` |
| Norm            | `np.linalg.norm(a)`          |
| Solve Ax = B    | `np.linalg.solve(a, b)`      |


###**7. Special Functions**

| Function               | Use Case                           |
| ---------------------- | ---------------------------------- |
| `np.clip(a, min, max)` | Limit values between bounds        |
| `np.cumsum(a)`         | Cumulative sum                     |
| `np.cumprod(a)`        | Cumulative product                 |
| `np.diff(a)`           | First difference (a\[i+1] - a\[i]) |
| `np.unique(a)`         | Unique elements                    |


##Mathematical operations

In [None]:
#Element-wise operations

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A)
print(B)
print(A + B)  # Addition
print(A * B)  # Element-wise multiplication


[[1 2]
 [3 4]]
[[5 6]
 [7 8]]
[[ 6  8]
 [10 12]]
[[ 5 12]
 [21 32]]


Matrix operations

In [None]:
# dot product (for 2-D arrays, np.dot performs matrix multiplication)
np.dot(A,B)

array([[19, 22],
       [43, 50]])

In [None]:
#cross product
a= np.array([[1,0,2],[0,2,3]])
b= np.array([[0,0,1],[2,1,0]])
print(a)
print(b)
print('Cross Product:\n',np.cross(a,b))

[[1 0 2]
 [0 2 3]]
[[0 0 1]
 [2 1 0]]
Cross Product:
 [[ 0 -1  0]
 [-3  6 -4]]


In [None]:
#determinant
np.linalg.det(A)

-2.0000000000000004

In [None]:
np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [None]:
#matrix multiplication

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A)
print(B)

np.matmul(A,B)

[[1 2]
 [3 4]]
[[5 6]
 [7 8]]


array([[19, 22],
       [43, 50]])

In [None]:
#Transpose
A = np.array([[1, 2], [3, 4]])
np.transpose(A)


array([[1, 3],
       [2, 4]])

##Array Manipulation

In [None]:
np.append(A,[[7],[8]],axis=1)

array([[1, 2, 7],
       [3, 4, 8]])

In [None]:
np.append(A,[[7,8]])

array([1, 2, 3, 4, 7, 8])

In [None]:
np.delete(A,1,axis=0) #axis=0 for rows, axis=1 for cols

array([[1, 2]])

In [None]:
import numpy as np
np.eye(4,k=2) #4x4 matrix with ones on the 2nd diagonal above the main one (k=2), zeros elsewhere

array([[0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [None]:
#Inverse of a matrix
np.linalg.inv(np.array([[2,3],[4,5]]))

array([[-2.5,  1.5],
       [ 2. , -1. ]])

##Matrix Decomposition

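 **LU decomposition** factors a matrix ```A``` into the product of a lower triangular matrix ```L``` and an upper triangular matrix ```U```. With row pivoting, a square matrix ```A``` can be expressed as:


$PA=LU$

where ```P``` is a permutation matrix. Note that `scipy.linalg.lu` returns ```P``` under the convention $A = PLU$ (equivalently $P^TA = LU$); in the 2 × 2 example here ```P``` is symmetric, so $PA = LU$ holds as well.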

In [None]:
#LU decomposition
print(A)
from scipy.linalg import lu
P,L,U= lu(A)
print(P,L,U,sep="\n")

[[1 2]
 [3 4]]
[[0. 1.]
 [1. 0.]]
[[1.         0.        ]
 [0.33333333 1.        ]]
[[3.         4.        ]
 [0.         0.66666667]]


In [None]:
np.matmul(P,A)

array([[3., 4.],
       [1., 2.]])

In [None]:
np.matmul(L,U)

array([[3., 4.],
       [1., 2.]])
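
For larger matrices the permutation matrix is generally not symmetric, so it helps to check the factorisation in the form SciPy documents, $A = PLU$. A sketch with a hypothetical 3 × 3 matrix `M`:

```python
import numpy as np
from scipy.linalg import lu

# A 3x3 example matrix (hypothetical values) that needs row pivoting.
M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

P, L, U = lu(M)

# SciPy's convention is M = P @ L @ U, which is the same as P.T @ M = L @ U.
print(np.allclose(M, P @ L @ U))      # True
print(np.allclose(P.T @ M, L @ U))    # True

# L is lower triangular and U is upper triangular.
print(np.allclose(L, np.tril(L)), np.allclose(U, np.triu(U)))
```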

**QR decomposition** expresses a matrix ```A``` as the product of an orthogonal matrix ```Q``` ($Q^TQ=I$) and an upper triangular matrix ```R```:

$A= QR$

In [None]:
q,r= np.linalg.qr(A)
print(q,r,sep="\n")

[[-0.31622777 -0.9486833 ]
 [-0.9486833   0.31622777]]
[[-3.16227766 -4.42718872]
 [ 0.         -0.63245553]]


In [None]:
np.matmul(q,r)

array([[1., 2.],
       [3., 4.]])
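
The defining properties of the factors can be verified numerically. A minimal sketch, using the same 2 × 2 matrix as in the QR cell:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
q, r = np.linalg.qr(A)

# Q is orthogonal: its transpose is its inverse.
print(np.allclose(q.T @ q, np.eye(2)))   # True
# R is upper triangular.
print(np.allclose(r, np.triu(r)))        # True
# The product recovers A.
print(np.allclose(q @ r, A))             # True
```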

**Singular Value Decomposition (SVD)** decomposes an m × n matrix ```A``` into three matrices: ```U``` (orthogonal), ```S``` (diagonal, holding the singular values), and $V^T$ (the transpose of an orthogonal matrix):

$A= USV^T$

$U$ = m × m matrix of left singular vectors (orthonormal columns).

$S$ = m × n diagonal matrix of singular values (non-negative real numbers), sorted in descending order.

$V^T$ = transpose of an n × n matrix $V$ of right singular vectors (orthonormal columns).

`np.linalg.svd` returns the singular values as a 1-D array, so the diagonal matrix has to be rebuilt before reconstructing ```A```.

For a 2 × 2 matrix:

$$U = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \end{bmatrix}, \quad S = \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix}, \quad V = \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix}$$


In [None]:
U,S,Vt= np.linalg.svd(A)
print(U,S,Vt,sep="\n")

[[-0.40455358 -0.9145143 ]
 [-0.9145143   0.40455358]]
[5.4649857  0.36596619]
[[-0.57604844 -0.81741556]
 [ 0.81741556 -0.57604844]]


In [None]:
sigma=np.zeros((U.shape[0],Vt.shape[0]))
np.fill_diagonal(sigma,S)
print(sigma)
recon_A= np.matmul(U,np.matmul(sigma,Vt))
print(f'Reconstructed array \n{recon_A}')
#np.allclose(recon_A,A)

[[5.4649857  0.        ]
 [0.         0.36596619]]
Reconstructed array 
[[1. 2.]
 [3. 4.]]


In [None]:
U@(sigma@Vt)

array([[1., 2.],
       [3., 4.]])
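
For non-square matrices the reduced SVD is often more convenient. The sketch below uses a hypothetical 3 × 2 matrix `M` and `full_matrices=False`, and reconstructs the matrix without building the diagonal matrix explicitly:

```python
import numpy as np

# A non-square (3x2) example matrix with hypothetical values.
M = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# full_matrices=False returns the reduced SVD: U is 3x2, Vt is 2x2.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
print(U.shape, S.shape, Vt.shape)     # (3, 2) (2,) (2, 2)

# Scaling the columns of U by S is equivalent to U @ np.diag(S).
recon = (U * S) @ Vt
print(np.allclose(recon, M))          # True

# Singular values are returned in descending order.
print(S[0] >= S[1])                   # True
```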

**Cholesky decomposition** applies to symmetric (Hermitian) positive definite matrices, whose eigenvalues are all positive, and factors such a matrix into a lower triangular matrix ```L``` and its conjugate transpose (which is simply $L^T$ for real matrices):

$A= LL^T$


In [None]:
A=np.diag([1,2,3])
L=np.linalg.cholesky(A)
print(L)

[[1.         0.         0.        ]
 [0.         1.41421356 0.        ]
 [0.         0.         1.73205081]]


In [None]:
np.matmul(L,np.transpose(L))

array([[1., 0., 0.],
       [0., 2., 0.],
       [0., 0., 3.]])

**Eigenvalues and Eigenvectors**

For a given square matrix $A$, an eigenvector $v$ is a non-zero vector that, when multiplied by $A$, results in a new vector that points along the same line as $v$ (it might be stretched or flipped but keeps the same line of action). Mathematically:


$Av=\lambda v$

$v$ is the eigenvector.

$\lambda$ is the eigenvalue associated with $v$.

In [None]:
A = np.array([[2, 1], [1, 2]])

# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
print(eigenvectors)

[3. 1.]
[[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]


In [None]:
A@eigenvectors

array([[ 2.12132034, -0.70710678],
       [ 2.12132034,  0.70710678]])

In [None]:
eigenvalues*eigenvectors

array([[ 2.12132034, -0.70710678],
       [ 2.12132034,  0.70710678]])
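
The two results match because each column of `eigenvectors` is scaled by its own eigenvalue. The same check can be written per column, as in this sketch; it also shows `np.linalg.eigh`, which is intended for symmetric (Hermitian) matrices:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)

# Each column of `eigenvectors` pairs with the eigenvalue at the same index.
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    print(np.allclose(A @ v, eigenvalues[i] * v))   # True

# For symmetric matrices, np.linalg.eigh is the specialised routine; it
# returns real eigenvalues in ascending order.
w, V = np.linalg.eigh(A)
print(w)
```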

2. Given a 2D NumPy array A of shape (3, 3) and a 1D array b of shape (3,), add the 1D array to each row of the 2D array using broadcasting.

In [None]:
import numpy as np
arr1 = np.array([1,2,3])
print(np.shape(arr1))
arr2 = np.array([[6,7,8], [1,2,3], [4,5,6]])
print(np.shape(arr2))
print("\n")
result = arr1 + arr2  # avoid shadowing the built-in sum()
print(result)

(3,)
(3, 3)


[[ 7  9 11]
 [ 2  4  6]
 [ 5  7  9]]


3. Given a 1D NumPy array arr, replace all negative values in the array with 0.

In [None]:
arr = np.array([1,2,3,-4,-5,-1,2,-1])
arr[arr < 0] = 0
print(arr)

[1 2 3 0 0 0 2 0]


4. Given a 2D NumPy array, compute the sum of elements that are greater than a given value and less than another value. Implement this using loops and conditions.

In [None]:
import numpy as np

# Example 2D array
arr = np.array([[5, 12, 7],
                [3, 18, 10],
                [20, 6, 1]])

# Set your range limits
lower = 5
upper = 15

# Initialize sum
total = 0

# Loop through the array using indices
for i in range(len(arr)):
    for j in range(len(arr[i])):
        if lower < arr[i][j] < upper:
            total += arr[i][j]

print("Sum of elements between", lower, "and", upper, ":", total)

Sum of elements between 5 and 15 : 35


Method 2:

In [None]:
arr = np.array([[5, 12, 7],
                [3, 18, 10],
                [20, 6, 1]])
lower = 7
upper = 20
total = 0  # named total to avoid shadowing the built-in sum()
for row in arr:
  for value in row:
    if lower < value < upper:
      total += value

print("Sum of elements between", lower, "and", upper, ":", total)

Sum of elements between 7 and 20 : 40
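
For comparison, the same sum can be computed without loops by combining two boolean masks. This is a sketch of the vectorized alternative, not the loop-based solution the exercise asks for:

```python
import numpy as np

arr = np.array([[5, 12, 7],
                [3, 18, 10],
                [20, 6, 1]])
lower, upper = 5, 15

# Combine the two conditions with & (the parentheses are required).
mask = (arr > lower) & (arr < upper)
total = arr[mask].sum()
print("Sum of elements between", lower, "and", upper, ":", total)  # 35
```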


##NumPy array slicing

| Feature     | View                              | Copy                                |
| ----------- | --------------------------------- | ----------------------------------- |
| Memory      | Shared with original              | Separate from original              |
| Performance | Faster, but changes affect source | Safer, but uses more memory         |
| When to Use | For fast slicing without changes  | When original data shouldn't change |


In [None]:
arr = np.array([1,2,3,4,5,6])
view = arr[:4]
view[0] = 100
print(view)
print(arr)

copy = arr[:5].copy()
copy[0] = 200
print(copy)

[100   2   3   4]
[100   2   3   4   5   6]
[200   2   3   4   5]


In [None]:
matrix = np.array([[1,2,3],[4,5,6]])
slice_view = matrix[:, :2] #View
slice_view[0, 0] = 999
print("Changed Matrix: \n",matrix)
print("\n")
matrix_copy = matrix[:, :2].copy()
matrix_copy[1, 0] = 888
print("Unchanged Matrix: \n", matrix)

Changed Matrix: 
 [[999   2   3]
 [  4   5   6]]


Unchanged Matrix: 
 [[999   2   3]
 [  4   5   6]]
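
Whether two arrays share memory can be checked directly rather than inferred from side effects. A minimal sketch using `np.shares_memory()` and the `.base` attribute (variable names are illustrative):

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

view = matrix[:, :2]           # basic slicing -> view
copy = matrix[:, :2].copy()    # explicit copy
fancy = matrix[[0, 1]]         # indexing with a list -> copy

print(np.shares_memory(matrix, view))    # True
print(np.shares_memory(matrix, copy))    # False
print(np.shares_memory(matrix, fancy))   # False

# A view's .base attribute points at the array that owns the data.
print(view.base is matrix)               # True
```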


#**Fancy Indexing**

Fancy indexing means using arrays (or lists) of indices to access or modify elements in a NumPy array. It is more flexible than basic slicing, but it returns a copy of the selected data rather than a view.

###**1. Using List/Array of Indices**

In [None]:
arr = np.array([10, 20, 30, 40, 50])
indices = [1, 3, 4]

# Select elements at positions 1, 3, and 4
print(arr[indices])

[20 40 50]


###**2. Fancy Indexing in 2D Arrays**

In [None]:
a = np.array([[10, 11, 12],
              [20, 21, 22],
              [30, 31, 32]])

# Select [10, 21, 32] using row and column index pairs
rows = [0, 1, 2]
cols = [0, 1, 2]

print(a[rows, cols])

[10 21 32]


###**3. Selecting with Boolean Arrays (Boolean Masking)**

In [None]:
arr = np.array([5, 10, 15, 20, 25])

# Select elements greater than 15
mask = arr > 15  #[False, False, False, True, True]
print(arr[mask])

[20 25]


###**4. Using np.ix_() for Grid Indexing**

In [None]:
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

rows = [0, 2]
cols = [1, 2]

# Get the cross-product elements using np.ix_
print(a[np.ix_(rows, cols)])

[[2 3]
 [8 9]]


###**5. Modifying Elements with Fancy Indexing**

In [None]:
arr = np.array([10, 20, 30, 40, 50])
arr[[0, 2, 4]] = -1
print(arr)

[-1 20 -1 40 -1]


##Aggregate Functions in NumPy

In NumPy, aggregate functions (also called reduction operations) perform a summary operation on an array’s elements and reduce them to a single value or along a specified axis.

Axis Reference:
* `axis=0`: operates down the rows, giving one result per column (column-wise)
* `axis=1`: operates across the columns, giving one result per row (row-wise)

####**1. np.sum() – Sum of elements**

Syntax: `np.sum(array, axis=None)`

In [None]:
import numpy as np

a = np.array([[1, 2], [3, 4]])
np.sum(a)           # Output: 10
np.sum(a, axis=0)   # Output: [4 6] → column-wise
np.sum(a, axis=1)   # Output: [3 7] → row-wise

array([3, 7])

####**2. np.mean() – Mean (Average)**

Syntax: `np.mean(array, axis=None)`

In [None]:
np.mean(a)          # Output: 2.5
np.mean(a, axis=0)  # Output: [2. 3.]

####**3. np.median() – Median value**

Syntax: `np.median(array, axis=None)`

In [None]:
np.median(a)

np.float64(2.5)

####**4. np.std() – Standard deviation**

Syntax: `np.std(array, axis=None)`

In [None]:
np.std(a)           # Output: 1.118...

####**5. np.var() – Variance**

Syntax: `np.var(array, axis=None)`


In [None]:
np.var(a)           # Output: 1.25

####**6. np.min() – Minimum value**
####**7. np.max() – Maximum value**
Syntax: `np.min(array, axis=None)` and `np.max(array, axis=None)`

In [None]:
np.min(a)           # Output: 1
np.max(a)           # Output: 4

####**8. np.argmin() / np.argmax() – Index of min/max value**

In [None]:
np.argmin(a)        # Output: 0
np.argmax(a)        # Output: 3

####**9. np.prod() – Product of all elements**

In [None]:
np.prod(a)          # Output: 24 (1*2*3*4)

####**10. np.cumsum() / np.cumprod() – Cumulative sum/product**

In [None]:
np.cumsum(a)        # Output: [ 1  3  6 10]
np.cumprod(a)       # Output: [ 1  2  6 24]

Given a 4×4 matrix of integers, extract all elements greater than 50.

In [None]:
arr = np.arange(45, 61)
newarr = arr.reshape(4,4)
print(newarr)
print("\n")
mask = newarr > 50
print(newarr[mask])

[[45 46 47 48]
 [49 50 51 52]
 [53 54 55 56]
 [57 58 59 60]]


[51 52 53 54 55 56 57 58 59 60]


Given a dataset of people [age, salary, score], keep only rows where salary (column 1) > 50,000.

In [None]:
import numpy as np

# Sample dataset: [age, salary, score]
data = np.array([
    [25, 48000, 85],
    [30, 52000, 90],
    [22, 60000, 70],
    [28, 45000, 88],
    [35, 75000, 95]
])

# Create a mask for salary > 50000 (column 1)
mask = data[:, 1] > 50000  # shape: (5,)

# Apply mask to rows
filtered_rows = data[mask]

print("Filtered Data (salary > 50000):\n", filtered_rows)

Filtered Data (salary > 50000):
 [[   30 52000    90]
 [   22 60000    70]
 [   35 75000    95]]


##**Custom Vectorized Function**

In NumPy, you can create custom vectorized functions using the `np.vectorize()` function. This allows you to apply a scalar Python function element-wise to arrays without writing explicit loops.

Basic Syntax: `np.vectorize(func)`

`func`: a Python function that takes scalar inputs

💡 Example: Custom Vectorized Function to Label Even/Odd Numbers

In [None]:
# Define a scalar function
def even_or_odd(x):
    return "Even" if x % 2 == 0 else "Odd"

# Vectorize it
vec_even_or_odd = np.vectorize(even_or_odd)

# Create a NumPy array
arr = np.array([1, 2, 3, 4, 5])

# Apply vectorized function
result = vec_even_or_odd(arr)

print(result)

['Odd' 'Even' 'Odd' 'Even' 'Odd']


###**Important Notes:**
* `np.vectorize()` does not give performance boosts like true NumPy broadcasting—it is essentially a convenience wrapper around a loop.

* Use `np.where`, broadcasting, or ufuncs for performance-critical operations.
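
As a sketch of the second note, the even/odd labelling from the example above can be written with `np.where`, which evaluates the condition on the whole array at once:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Condition is evaluated element-wise; no Python-level loop is involved
labels = np.where(arr % 2 == 0, "Even", "Odd")

print(labels)
```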

Example with Multiple Arguments:

In [None]:
def add_and_double(x, y):
    return (x + y) * 2

vec_add_double = np.vectorize(add_and_double)

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(vec_add_double(a, b))  # Output: [10 14 18]

[10 14 18]


You are given a NumPy array of salaries. Classify each salary into one of the following categories based on its value:

* "Low" if the salary is less than 40,000

* "Medium" if the salary is between 40,000 (inclusive) and 60,000 (exclusive)

* "High" if the salary is 60,000 or more

* Use "Unknown" or "None" as a fallback for any salary not fitting the above conditions (optional)




In [None]:
#Method 1
salary = np.array([20000, 30000, 40000, 50000, 120000])

conditions = [
    salary < 40000,
    (salary >= 40000) & (salary < 60000),
    salary >= 60000
]

choices = ['Low','Medium','High']
levels = np.select(conditions, choices, default = 'Unknown')
print("Levels: \n",levels)

Levels: 
 ['Low' 'Low' 'Medium' 'Medium' 'High']


In [None]:
#Method 2
salary = np.array([20000, 30000, 40000, 50000, 120000])
levels = []

for s in salary:
    if s < 40000:
        levels.append('Low')
    elif s >= 40000 and s < 60000:
        levels.append('Medium')
    elif s >= 60000:
        levels.append('High')
    else:
        levels.append('None')

print("Levels: \n", np.array(levels))

Levels: 
 ['Low' 'Low' 'Medium' 'Medium' 'High']
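
When the categories are contiguous ranges, binning is another option. The sketch below uses `np.digitize` to turn each salary into a bin index and then looks up the label. The extra value 60000 is added here only to show the boundary behaviour; the names `edges`, `labels`, and `bin_index` are illustrative.

```python
import numpy as np

# Sample salaries; 60000 is added to exercise the "High" boundary
salary = np.array([20000, 30000, 40000, 50000, 60000, 120000])

edges = [40000, 60000]                     # lower bounds of "Medium" and "High"
labels = np.array(["Low", "Medium", "High"])

# With the default right=False, index i means edges[i-1] <= x < edges[i]
bin_index = np.digitize(salary, edges)
levels = labels[bin_index]

print("Levels:\n", levels)
```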


###**Mini Project**

Create a matrix of test scores for several candidates. Each row represents a candidate, and each column represents a score in a particular subject.

**Task:**

* Replace all test scores below 50 with 50 (treating scores under 50 as "invalid").

* Find candidates who scored more than 70 in all subjects and shortlist them.

* Classify the candidates:
  * "Excellent" if all scores are greater than 80.
  * "Good" if the average score is greater than or equal to 70.
  * "Rejected" otherwise.

In [None]:
import numpy as np

# Step 1: Creating the test score matrix
scores = np.arange(20, 110, 5).reshape(6, 3)
print("Original Scores:\n", scores)

# Step 2: Replacing all test scores below 50 with 50
newscores = np.where(scores < 50, 50, scores)
print("\nCorrected Scores (below 50 replaced with 50):\n", newscores)

# Step 3: Shortlisting candidates who scored >70 in all subjects
shortlist_mask = np.all(newscores > 70, axis=1)
shortlisted = newscores[shortlist_mask]

print("\nShortlisted Candidates (All scores > 70):\n", shortlisted)

# Step 4: Classifying shortlisted candidates
print("\nClassification of Shortlisted Candidates:")
for i, candidate in enumerate(shortlisted):
    if np.all(candidate > 80):
        label = "Excellent"
    elif np.mean(candidate) >= 70:
        label = "Good"
    else:
        label = "Rejected"
    print(f"Candidate {i+1}: {label}")

Original Scores:
 [[ 20  25  30]
 [ 35  40  45]
 [ 50  55  60]
 [ 65  70  75]
 [ 80  85  90]
 [ 95 100 105]]

Corrected Scores (below 50 replaced with 50):
 [[ 50  50  50]
 [ 50  50  50]
 [ 50  55  60]
 [ 65  70  75]
 [ 80  85  90]
 [ 95 100 105]]

Shortlisted Candidates (All scores > 70):
 [[ 80  85  90]
 [ 95 100 105]]

Classification of Shortlisted Candidates:
Candidate 1: Good
Candidate 2: Excellent
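
The classification loop can also be written without an explicit loop. This is one possible sketch using `np.select`, applied here to every candidate rather than only the shortlisted ones (that scope is a choice made for illustration):

```python
import numpy as np

scores = np.arange(20, 110, 5).reshape(6, 3)
newscores = np.where(scores < 50, 50, scores)

# Conditions are checked in order; the first match wins
conditions = [
    np.all(newscores > 80, axis=1),   # every subject above 80
    newscores.mean(axis=1) >= 70,     # row average of at least 70
]
labels = np.select(conditions, ["Excellent", "Good"], default="Rejected")

for i, label in enumerate(labels, start=1):
    print(f"Candidate {i}: {label}")
```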


You have a 2D array. Extract all even numbers from the array using Boolean masking.

In [None]:
arr = np.array([[1,2,3,4,5,6,7,67,8,64]])
newarr = arr.reshape(2,5)
print(newarr)
print(newarr.shape)
mask = newarr % 2 == 0
even_numbers = newarr[mask]
print("\n",even_numbers)

[[ 1  2  3  4  5]
 [ 6  7 67  8 64]]
(2, 5)

 [ 2  4  6  8 64]


You have a 2D array of student scores. Replace all scores greater than 80 with 100, and all scores less than 50 with 0. Then, calculate the average score per row and column, and return the row with the highest average. Don't use IF conditions.

In [None]:
scores = np.arange(20, 110, 5).reshape(6, 3)
print(scores)

[[ 20  25  30]
 [ 35  40  45]
 [ 50  55  60]
 [ 65  70  75]
 [ 80  85  90]
 [ 95 100 105]]


In [None]:
newscores = np.where(scores > 80, 100, np.where(scores < 50, 0, scores))
newscores

array([[  0,   0,   0],
       [  0,   0,   0],
       [ 50,  55,  60],
       [ 65,  70,  75],
       [ 80, 100, 100],
       [100, 100, 100]])

In [None]:
row_avg = newscores.mean(axis=1)
col_avg = newscores.mean(axis=0)
max_row = newscores[np.argmax(row_avg)]

print("Row-wise average:", row_avg)
print("Column-wise average:", col_avg)
print("Row with highest average:", max_row)

Row-wise average: [  0.           0.          55.          70.          93.33333333
 100.        ]
Column-wise average: [49.16666667 54.16666667 55.83333333]
Row with highest average: [100 100 100]


You have a matrix of sales data. For each column (representing a different product), apply a custom discount function to simulate a sales discount.

The discount is as follows:

* 15% discount if sales are above 1000.

* 10% discount if sales are between 500 and 1000.

* No discount for sales below 500.

Then, calculate the total sales after discount for each product.

In [None]:
sales = np.arange(300, 1500, 100).reshape(4, 3)

discounted_sales = sales.copy()
discounted_sales = np.where(sales > 1000, sales * (1 - 0.15), discounted_sales)
discounted_sales = np.where((sales >= 500) & (sales <= 1000), sales *(1 - 0.10), discounted_sales)
print("Discounted Sales Matrix:\n", discounted_sales)

total = discounted_sales.sum(axis=0)
print("Total Sales After Discount per Product:", total)

Discounted Sales Matrix:
 [[ 300.  400.  450.]
 [ 540.  630.  720.]
 [ 810.  900.  935.]
 [1020. 1105. 1190.]]
Total Sales After Discount per Product: [2670. 3035. 3295.]


In [None]:
sales = np.arange(300, 1500, 100).reshape(4, 3)

def discount(amount):
    # Return the sale amount after applying the tiered discount
    if amount > 1000:
        return amount * (1 - 0.15)
    elif amount >= 500:
        return amount * (1 - 0.10)
    return amount

# otypes=[float] keeps fractional results; without it, np.vectorize infers an
# integer output type from the first element (300) and truncates later floats
discount_func = np.vectorize(discount, otypes=[float])
discount_cost = discount_func(sales)
total = discount_cost.sum(axis=0)

print("Discounted Sales Matrix:\n", discount_cost)
print("Total Sales After Discount per Product:", total)

Discounted Sales Matrix:
 [[ 300.  400.  450.]
 [ 540.  630.  720.]
 [ 810.  900.  935.]
 [1020. 1105. 1190.]]
Total Sales After Discount per Product: [2670. 3035. 3295.]


#SciPy - Scientific Computing with Python

### Introduction
* SciPy is a Python library used for scientific and technical computing.
* It builds on NumPy and includes modules for:
  - Optimization
  - Integration
  - Interpolation
  - Linear algebra
  - Statistics
  - Signal and image processing
  - Differential equations



### SciPy Sub-packages Overview

| Sub-package       | Description                                |
|-------------------|--------------------------------------------|
| scipy.optimize    | Optimization (min/max problems)            |
| scipy.integrate   | Numerical integration                      |
| scipy.interpolate | Interpolating functions                    |
| scipy.linalg      | Linear algebra operations                  |
| scipy.stats       | Statistics and probability                 |
| scipy.fft         | Fast Fourier Transforms                    |
| scipy.signal      | Signal processing                          |
| scipy.sparse      | Sparse matrix support                      |
| scipy.spatial     | Spatial algorithms and distances           |
| scipy.cluster     | Clustering algorithms                      |
| scipy.ndimage     | Image processing                           |
| scipy.io          | File input/output (e.g., .mat files)       |


###Examples:

**1. Integration**

In [1]:
from scipy import integrate

# Define a simple function to integrate
result, error = integrate.quad(lambda x: x**2, 0, 3)
print(result)

9.000000000000002
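
`integrate.quad` returns two values: the integral estimate and an estimate of the absolute error. A small sketch that inspects both and compares the result with the analytic value of the integral, which is 3³/3 = 9 (the names `f`, `abs_error`, and `exact` are illustrative):

```python
from scipy import integrate

def f(x):
    return x**2

result, abs_error = integrate.quad(f, 0, 3)
exact = 3**3 / 3   # antiderivative x**3 / 3 evaluated from 0 to 3

print("estimate:      ", result)
print("error estimate:", abs_error)
print("actual error:  ", abs(result - exact))
```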


**2. Optimization**

In [3]:
from scipy.optimize import minimize

# Minimize the function f(x) = x^2 + 5
res = minimize(lambda x: x**2 + 5, x0=3)
print(res.x)  # Output close to 0

[-2.83269601e-08]


**3. Linear Algebra**

In [4]:
from scipy import linalg
import numpy as np

A = np.array([[3, 2], [1, 4]])
b = np.array([6, 5])

x = linalg.solve(A, b)
print(x)  # Solution to Ax = b

[1.4 0.9]


**4. Statistics**

In [5]:
from scipy import stats

data = [1, 2, 3, 4, 5, 6, 7]
mean = stats.tmean(data)
std = stats.tstd(data)
print(mean, std)

4.0 2.160246899469287
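
`stats.tmean` and `stats.tstd` are the *trimmed* mean and standard deviation. Without a `limits` argument they match the ordinary mean and the sample standard deviation (`ddof=1`), which the sketch below checks. The limits `(1, 5)` are an arbitrary choice for illustration.

```python
import numpy as np
from scipy import stats

data = [1, 2, 3, 4, 5, 6, 7]

plain_mean = stats.tmean(data)
plain_std = stats.tstd(data)

# Only values inside the inclusive range [1, 5] are used
trimmed_mean = stats.tmean(data, limits=(1, 5))

print(plain_mean, np.mean(data))
print(plain_std, np.std(data, ddof=1))
print(trimmed_mean)
```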


###**When to Use SciPy?**

* When you need algorithms that NumPy does not provide, such as optimization, numerical integration, interpolation, or sparse matrices.

* When you need precise control and advanced mathematical methods.

* In fields like physics, engineering, finance, AI, and data science.

#Sparse Matrices

Sparse matrices are matrices in which most of the elements are zero. They are common in various fields, such as scientific computing, machine learning, and graph theory, where data sets often contain many zero entries. Using efficient data structures to represent sparse matrices can save memory and computational resources.

### Characteristics of Sparse Matrices
- **Storage Efficiency**: Sparse matrices can be very large but contain only a few non-zero elements, making it inefficient to store them as regular dense matrices.
- **Computational Efficiency**: Many algorithms can be optimized to take advantage of the sparsity, reducing computation time.

### Common Data Structures for Sparse Matrices

1. **Coordinate List (COO)**
   - Stores a list of (row, column, value) tuples for each non-zero element.
   - Simple and easy to construct, but not the most efficient for arithmetic operations.

   Attributes: `data`, `row`, `col`

   

2. **Compressed Sparse Row (CSR)**
   - Efficient for arithmetic operations, row slicing, and matrix-vector products.
   - Stores data in three arrays: `data`, `indices`, and `indptr`.
     - `data`: Non-zero values.
     - `indices`: Column indices of the corresponding values in `data`.
     - `indptr`: Row pointers holding the cumulative count of non-zero elements; the entries of row `i` are `data[indptr[i]:indptr[i+1]]`.




3. **Compressed Sparse Column (CSC)**
   - Similar to CSR but optimized for column operations.
   - Uses `data`, `indices`, and `indptr`, but stores data column-wise.


4. **Dictionary of Keys (DOK)**
   - A flexible data structure for constructing sparse matrices incrementally.
   - Uses a dictionary to map (row, column) pairs to values.

**Use DOK when:**

- Dynamic Construction: You need to build a sparse matrix iteratively, or the matrix structure is not known in advance.

- Sparse Matrix Modifications: You expect to frequently add or modify individual elements.

Methods: `items()`


5. **List of Lists (LIL)**
   - Stores each row as a pair of lists: the column indices of its non-zero entries (`rows`) and the matching values (`data`).
   - Good for incrementally constructing a sparse matrix.

   Attributes: `data`, `rows`

   

### Choosing the Right Format
- **COO**: Good for constructing sparse matrices.
- **CSR**: Best for fast arithmetic operations and matrix-vector products.
- **CSC**: Optimal for column slicing and matrix-vector products.
- **DOK**: Useful for incremental construction.
- **LIL**: Effective for building sparse matrices row-by-row.
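
A common pattern that follows from this list is to build the matrix in a format suited to incremental construction and then convert it before doing arithmetic. A minimal sketch, assuming a small 3×3 example (the names `builder`, `A`, and `v` are illustrative):

```python
import numpy as np
from scipy.sparse import lil_matrix

# Build incrementally in LIL, which supports cheap element assignment
builder = lil_matrix((3, 3))
builder[0, 0] = 4
builder[0, 1] = 9
builder[1, 2] = 5
builder[2, 2] = 7

# Convert once to CSR before arithmetic
A = builder.tocsr()
v = np.array([1, 1, 1])
product = A @ v   # matrix-vector product

print(A.format)
print(product)
```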

### Applications
- **Graph Representation**: Sparse matrices are often used to represent adjacency matrices in graph theory.
- **Machine Learning**: They are common in feature matrices for text data (e.g., bag-of-words models).
- **Finite Element Analysis**: Sparse matrices represent large systems of equations arising from discretized differential equations.



In [6]:
import numpy as np
from scipy.sparse import coo_matrix

# Coordinates (row, column, value)
rows = np.array([0, 1, 2, 0])
cols = np.array([0, 2, 2, 1])
values = np.array([4, 5, 7, 9])

# Create COO sparse matrix
sparse_matrix = coo_matrix((values, (rows, cols)), shape=(3, 3))
print(f"Sparse output\n{sparse_matrix}\n")
print(sparse_matrix.toarray())

Sparse output
<COOrdinate sparse matrix of dtype 'int64'
	with 4 stored elements and shape (3, 3)>
  Coords	Values
  (0, 0)	4
  (1, 2)	5
  (2, 2)	7
  (0, 1)	9

[[4 9 0]
 [0 0 5]
 [0 0 7]]


In [7]:
sparse_matrix.col

array([0, 2, 2, 1], dtype=int32)

In [8]:
from scipy.sparse import csr_matrix

# Create a CSR sparse matrix from the COO matrix built above
sparse_csr = csr_matrix(sparse_matrix)
print(sparse_csr)
print()
print(sparse_csr.toarray())

<Compressed Sparse Row sparse matrix of dtype 'int64'
	with 4 stored elements and shape (3, 3)>
  Coords	Values
  (0, 0)	4
  (0, 1)	9
  (1, 2)	5
  (2, 2)	7

[[4 9 0]
 [0 0 5]
 [0 0 7]]


In [None]:
sparse_csr.indptr

array([0, 2, 3, 4], dtype=int32)

In [9]:
dense_matrix = np.array([[0, 0, 3],
                          [4, 0, 5],
                          [0, 6, 0]])

# Convert to CSR format
sparse_matrix_csr = csr_matrix(dense_matrix)

print("CSR matrix:\n", sparse_matrix_csr)
print("Dense representation:\n", sparse_matrix_csr.toarray())

CSR matrix:
 <Compressed Sparse Row sparse matrix of dtype 'int64'
	with 4 stored elements and shape (3, 3)>
  Coords	Values
  (0, 2)	3
  (1, 0)	4
  (1, 2)	5
  (2, 1)	6
Dense representation:
 [[0 0 3]
 [4 0 5]
 [0 6 0]]


For the CSR matrix above, the three underlying arrays are:

* `data`: [3, 4, 5, 6]

* `indices` (column indices): [2, 0, 2, 1]

* `indptr` (row pointer): [0, 1, 3, 4]
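
The row pointer can be read as slice boundaries: row `i` owns the entries from `indptr[i]` up to, but not including, `indptr[i + 1]`. A short sketch that walks the same matrix row by row (the names `m` and `rows_seen` are illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[0, 0, 3],
                  [4, 0, 5],
                  [0, 6, 0]])
m = csr_matrix(dense)

rows_seen = []
for i in range(m.shape[0]):
    start, end = m.indptr[i], m.indptr[i + 1]
    cols = m.indices[start:end]   # column indices stored for row i
    vals = m.data[start:end]      # matching values for row i
    rows_seen.append(list(zip(cols.tolist(), vals.tolist())))
    print(f"row {i}: columns {cols.tolist()}, values {vals.tolist()}")
```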

In [None]:
from scipy.sparse import csc_matrix
csc_mat = csc_matrix(np.array([[0,0,3],[4,0,5],[0,6,0]]))
print(csc_mat.toarray())
print(csc_mat.data)
print(csc_mat.indices)
csc_mat.indptr

[[0 0 3]
 [4 0 5]
 [0 6 0]]
[4 6 3 5]
[1 2 0 1]


array([0, 1, 2, 4], dtype=int32)

In [None]:
from scipy.sparse import csc_matrix

# Create a CSC sparse matrix
sparse_csc = csc_matrix(sparse_matrix)
print(sparse_csc)
print()
print(sparse_csc.toarray())

  (0, 0)	4
  (0, 1)	9
  (1, 2)	5
  (2, 2)	7

[[4 9 0]
 [0 0 5]
 [0 0 7]]


In [None]:
from scipy.sparse import dok_matrix

# Create a DOK sparse matrix
sparse_dok = dok_matrix((3, 3))

sparse_dok[0, 0] = 4
sparse_dok[1, 2] = 5
sparse_dok[2, 2] = 7
sparse_dok[0, 1] = 9
print(f'after\n{sparse_dok}')

print(sparse_dok.toarray())

after
  (0, 0)	4.0
  (1, 2)	5.0
  (2, 2)	7.0
  (0, 1)	9.0
[[4. 9. 0.]
 [0. 0. 5.]
 [0. 0. 7.]]


In [None]:
sparse_dok.items()

dict_items([((0, 0), 4.0), ((1, 2), 5.0), ((2, 2), 7.0), ((0, 1), 9.0)])

In [None]:
from scipy.sparse import lil_matrix

# Create a LIL sparse matrix
sparse_lil = lil_matrix((3, 3))
sparse_lil[0, 0] = 4
sparse_lil[1, 2] = 5
sparse_lil[2, 2] = 7
sparse_lil[0, 1] = 9
print(sparse_lil.toarray())

[[4. 9. 0.]
 [0. 0. 5.]
 [0. 0. 7.]]


In [None]:
sparse_lil.rows #col indices in each row for non-zero values

array([list([0, 1]), list([2]), list([2])], dtype=object)

####Common methods and functions

* `.toarray()` / `.todense()`: Convert the sparse matrix to dense form; `.toarray()` returns a NumPy `ndarray`, while `.todense()` returns a `numpy.matrix`.

* `.transpose()`: Transpose of the matrix.

* `.tocsr()`, `.tocsc()`, `.tocoo()`, etc.: Convert between sparse formats.

* `.multiply()`: Element-wise multiplication.

* `.dot()`: Matrix-vector or matrix-matrix multiplication.

* `.sum(axis)`: Sum over the specified axis.

* `.nonzero()`: Row and column indices of the non-zero elements.

* `.eliminate_zeros()`: Remove explicitly stored zero entries from the sparse matrix.

* `.setdiag()`: Set values along a diagonal of the matrix.

* `.shape`, `.nnz`: The shape and the number of stored elements (normally the non-zero entries) in the matrix.
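
`.eliminate_zeros()` is not demonstrated in the cells that follow, so here is a minimal sketch on a made-up 2×2 matrix. It also shows that `.nnz` counts *stored* entries, which can include explicit zeros, whereas `.count_nonzero()` counts only true non-zero values.

```python
from scipy.sparse import csr_matrix

m = csr_matrix([[1, 0], [0, 2]])

# Overwrite a stored value with zero; the slot stays allocated
m.data[0] = 0
stored_before = m.nnz               # stored entries, explicit zero included
true_nonzeros = m.count_nonzero()   # entries whose value is not zero

m.eliminate_zeros()                 # drops the explicit zero in place
stored_after = m.nnz

print(stored_before, true_nonzeros, stored_after)
```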


In [None]:
sparse_csr.toarray()

array([[4, 9, 0],
       [0, 0, 5],
       [0, 0, 7]])

In [None]:
sparse_csc.toarray()

array([[4, 9, 0],
       [0, 0, 5],
       [0, 0, 7]])

In [None]:
sparse_csr.transpose().toarray()

array([[4, 0, 0],
       [9, 0, 0],
       [0, 5, 7]])

In [None]:
sparse_csr.multiply(sparse_csc).toarray()

array([[16, 81,  0],
       [ 0,  0, 25],
       [ 0,  0, 49]])

In [None]:
sparse_csr.dot(sparse_csc).toarray()

array([[16, 36, 45],
       [ 0,  0, 35],
       [ 0,  0, 49]])

In [None]:
sparse_csr.toarray()

array([[4, 9, 0],
       [0, 0, 5],
       [0, 0, 7]])

In [None]:
sparse_csr.sum(axis=0)

matrix([[ 4,  9, 12]])

In [None]:
sparse_csr.nonzero()

(array([0, 0, 1, 2], dtype=int32), array([0, 1, 2, 2], dtype=int32))

In [None]:
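# k=-2 selects the second sub-diagonal, which in a 3x3 matrix holds only
# element (2, 0), so only the first value (2) is used
sparse_csr.setdiag(values=[2,3,1],k=-2)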
sparse_csr.toarray()

array([[4, 9, 0],
       [0, 0, 5],
       [2, 0, 7]])

In [None]:
sparse_csr.nnz

5

Problem:

Create a sparse matrix using SciPy’s csr_matrix format with at least 100 rows and 100 columns, where only 5% of the elements are non-zero.

Convert the matrix to COO and CSC formats.

Print the number of non-zero elements in each format.

Perform matrix addition with another sparse matrix of the same dimensions.

In [None]:
from scipy.sparse import csr_matrix, coo_matrix, csc_matrix
import numpy as np

# Creating a sparse matrix with 100 rows and 100 columns, with 5% non-zero elements
rows, cols = 100, 100
density = 0.05
num_nonzeros = int(rows * cols * density)

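# Random indices and data; duplicate (row, col) pairs are summed by csr_matrix,
# so nnz can end up slightly below num_nonzeros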
row_indices = np.random.randint(0, rows, size=num_nonzeros)
col_indices = np.random.randint(0, cols, size=num_nonzeros)
data = np.random.rand(num_nonzeros)

# Creating the sparse matrix in CSR format
csr100 = csr_matrix((data, (row_indices, col_indices)), shape=(rows, cols))

# Converting to COO and CSC formats
coo100 = csr100.tocoo()
csc100 = csr100.tocsc()

# Print the number of non-zero elements in each format
print("Non-zero elements in CSR format:", csr100.nnz)
print("Non-zero elements in COO format:", coo100.nnz)
print("Non-zero elements in CSC format:", csc100.nnz)

# Matrix addition with a second sparse matrix (built here from the same data,
# so the result equals 2 * csr100)
sparse_csr2 = csr_matrix((data, (row_indices, col_indices)), shape=(rows, cols))
added_matrix = csr100 + sparse_csr2
print("Result of sparse matrix addition:\n", added_matrix)


Non-zero elements in CSR format: 489
Non-zero elements in COO format: 489
Non-zero elements in CSC format: 489
Result of sparse matrix addition:
   (0, 49)	0.8396363364201294
  (0, 56)	1.8241471496881876
  (0, 66)	1.343394503538519
  (0, 73)	0.25324246310442367
  (0, 74)	1.2894278128161885
  (1, 0)	0.200976985545253
  (1, 27)	1.3308285156365123
  (1, 42)	0.6059742932551588
  (1, 68)	1.9588083441937185
  (1, 98)	1.7221234998102692
  (2, 46)	1.891178131357571
  (2, 50)	1.7538014987441575
  (2, 55)	0.535150219713431
  (2, 57)	1.02402436201118
  (2, 91)	1.924367314276664
  (3, 21)	0.35989910431702055
  (3, 41)	1.7929614930711262
  (3, 65)	0.6821225657560044
  (3, 73)	0.14788785064886434
  (4, 22)	0.5951829953401278
  (4, 75)	0.787565234653864
  (5, 17)	1.7738481779708548
  (5, 25)	1.9131176196784034
  (5, 55)	0.8258392618855259
  (5, 58)	1.9604079467693913
  :	:
  (92, 81)	0.06240287166284286
  (92, 83)	0.29422080164801345
  (93, 31)	1.6842326586479681
  (93, 37)	0.7147432170608714
  (93, 
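
Because random index pairs can repeat, the approach above yields slightly fewer than 500 stored elements. One alternative sketch uses `scipy.sparse.random`, which places the requested number of entries at distinct positions. The seeds `0` and `1` are arbitrary choices made only for reproducibility.

```python
from scipy import sparse

# 100 x 100 matrices with 5% density, generated directly in CSR format
A = sparse.random(100, 100, density=0.05, format="csr", random_state=0)
B = sparse.random(100, 100, density=0.05, format="csr", random_state=1)

# The stored-element count is the same in every format
print(A.nnz, A.tocoo().nnz, A.tocsc().nnz)

# Addition of two independent sparse matrices
total = A + B
print(total.shape, total.nnz)
```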

Create a 5x5 sparse matrix where only the diagonal elements are 1 and the rest are zeros. Print both the sparse and dense representations.

In [None]:
diag_csr=csr_matrix(np.eye(5))
print(diag_csr)
print(diag_csr.toarray())

  (0, 0)	1.0
  (1, 1)	1.0
  (2, 2)	1.0
  (3, 3)	1.0
  (4, 4)	1.0
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
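
Building the matrix from `np.eye(5)` first allocates the full dense array. As a sketch of an alternative, `scipy.sparse.identity` creates the sparse identity matrix directly, which matters more for large sizes:

```python
import numpy as np
from scipy.sparse import identity

# Sparse 5 x 5 identity, created without a dense intermediate
eye5 = identity(5, format="csr")

print(eye5.nnz)
print(eye5.toarray())
```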


For a given sparse matrix, print the indices and values of all non-zero elements.

In [None]:
csr1 = csr_matrix(np.array([[1, 0, 0], [3, 4, 0], [0, 0, 2]]))
# COO format exposes aligned row, column, and value arrays
coo1 = csr1.tocoo()
for r, c, v in zip(coo1.row, coo1.col, coo1.data):
    print((int(r), int(c)), v)

(0, 0) 1
(1, 0) 3
(1, 1) 4
(2, 2) 2


Given a sparse matrix, count the number of non-zero elements in each row and print the results.

In [None]:
non_zero_counts = csr1.getnnz(axis=1)

print("Non-zero elements per row:", non_zero_counts)

Non-zero elements per row: [1 2 1]


Given a sparse matrix, extract a 2x2 submatrix from the top left corner and print the result.

In [None]:
print(csr1.toarray())
csr1[:2,:2].toarray()

[[1 0 0]
 [3 4 0]
 [0 0 2]]


array([[1, 0],
       [3, 4]])

Write Python code to perform the following operations using the NumPy library:
1.	Calculate and display the sum of all the diagonal elements of a NumPy array.
2.	Compute and display the covariance matrix of a given NumPy array.

Use a 3×3 matrix.


In [None]:
import numpy as np

li = list(map(int, input("Enter 9 elements of a 3x3 matrix separated by spaces: ").strip().split()))
# reshape raises a ValueError unless exactly 9 values were entered
matrix = np.array(li).reshape(3, 3)
print(matrix)

# Calculate and display the sum of all diagonal elements
diagonal_sum = np.trace(matrix)
print("Sum of all diagonal elements:", diagonal_sum)

# Compute and display the covariance matrix (np.cov treats each row as a variable)
covariance_matrix = np.cov(matrix)
print("Covariance matrix:\n", covariance_matrix)
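
The cell above needs keyboard input. For a run that does not, here is a sketch with a made-up 3×3 matrix that also shows the `rowvar` parameter of `np.cov`: by default each row is a variable, while `rowvar=False` treats each column as one.

```python
import numpy as np

# Made-up example values
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 10]])

diagonal_sum = np.trace(matrix)            # 1 + 5 + 10
cov_rows = np.cov(matrix)                  # rows are the variables
cov_cols = np.cov(matrix, rowvar=False)    # columns are the variables

print("Sum of diagonal elements:", diagonal_sum)
print("Covariance (rows as variables):\n", cov_rows)
print("Covariance (columns as variables):\n", cov_cols)
```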


Write Python code to solve a system of linear equations by inverting its coefficient matrix.
- Enter coefficients
- Enter constants
- Create matrix for both
- Solve and find the answer


In [None]:
import numpy as np

# Input the coefficients of the equations
print("Enter the coefficients for a 3x3 matrix (system of 3 equations with 3 variables):")
coefficients = []
for i in range(3):
    row = list(map(float, input(f"Enter coefficients for equation {i+1} separated by spaces: ").split()))
    coefficients.append(row)
print(coefficients)
# Input the constants on the right side of the equations
print("Enter the constants for each equation:")
constants = list(map(float, input("Enter constants separated by spaces: ").split()))

# Convert lists to NumPy arrays
A = np.array(coefficients)
B = np.array(constants)

# Calculate the inverse of the coefficient matrix A
try:
    A_inv = np.linalg.inv(A)
    # Solve the system of equations using matrix inversion  X = inverse(A).B
    solution = np.dot(A_inv, B)
    print("The solution for the system of equations is:", solution)
except np.linalg.LinAlgError:
    print("The matrix is singular and cannot be inverted. The system may not have a unique solution.")
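
Explicit inversion follows the task statement, but in practice `np.linalg.solve` is usually preferred because it solves the system without forming the inverse. A minimal sketch, reusing the 2×2 system from the SciPy linear algebra example earlier instead of user input:

```python
import numpy as np

# Coefficients and constants taken from the earlier 2x2 example
A = np.array([[3.0, 2.0],
              [1.0, 4.0]])
B = np.array([6.0, 5.0])

try:
    solution = np.linalg.solve(A, B)
    print("Solution:", solution)
    # Verify by substituting back into A x = B
    print("Check passes:", np.allclose(A @ solution, B))
except np.linalg.LinAlgError:
    print("The matrix is singular; the system has no unique solution.")
```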


Write code to convert a sparse matrix in CSR format back to a dense matrix without using SciPy. Follow the steps below:
- The user inputs the row and column indices and the corresponding values for the non-zero elements of the matrix.
- A CSR-like structure is created from these inputs using three lists: values, row-start pointers, and column indices.
- The convert_to_dense function converts this structure into a dense matrix.
- The dense matrix is printed out.

In [None]:
import numpy as np

def create_csr_structure(row_indices, col_indices, values, num_rows, num_cols):
    # Sort entries by (row, column) so each row's values are contiguous,
    # which CSR requires; the input triplets may arrive in any order
    order = sorted(range(len(values)), key=lambda k: (row_indices[k], col_indices[k]))
    csr_values = [values[k] for k in order]
    csr_col_indices = [col_indices[k] for k in order]
    csr_row_start = [0] * (num_rows + 1)

    # Count the non-zero elements in each row
    for row in row_indices:
        csr_row_start[row + 1] += 1

    # Cumulative sum turns the counts into row start pointers
    for i in range(1, len(csr_row_start)):
        csr_row_start[i] += csr_row_start[i - 1]

    return csr_values, csr_row_start, csr_col_indices

def convert_to_dense(values, row_start, col_indices, num_rows, num_cols):
    # Initialize a dense matrix with zeros
    dense_matrix = np.zeros((num_rows, num_cols))

    # Populate the dense matrix
    for row in range(num_rows):
        start = row_start[row]
        end = row_start[row + 1]
        for idx in range(start, end):
            col = col_indices[idx]
            dense_matrix[row][col] = values[idx]

    return dense_matrix

# User inputs for the sparse matrix in CSR format
num_rows = int(input("Enter the number of rows: "))
num_cols = int(input("Enter the number of columns: "))
num_nonzero = int(input("Enter the number of non-zero elements: "))

# Initialize lists for CSR format inputs
row_indices = []
col_indices = []
values = []

print("Enter the row index, column index, and value for each non-zero element:")

for _ in range(num_nonzero):
    row, col, val = map(int, input("Row Col Value: ").split())
    row_indices.append(row)
    col_indices.append(col)
    values.append(val)

# Create CSR-like structure
csr_values, csr_row_start, csr_col_indices = create_csr_structure(row_indices, col_indices, values, num_rows, num_cols)

# Convert to dense matrix
dense_matrix = convert_to_dense(csr_values, csr_row_start, csr_col_indices, num_rows, num_cols)

# Print the dense matrix
print("Dense matrix:")
print(dense_matrix)
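
The nested loops in `convert_to_dense` can also be replaced by NumPy indexing. The sketch below is one way to do it without SciPy, using the CSR arrays of the 3×3 example shown earlier as fixed input instead of user input. The name `row_of_entry` is illustrative.

```python
import numpy as np

# CSR arrays of the 3x3 example matrix from earlier in this section
values = np.array([3, 4, 5, 6])
col_indices = np.array([2, 0, 2, 1])
row_start = np.array([0, 1, 3, 4])
num_rows, num_cols = 3, 3

# np.diff gives the entry count per row; np.repeat expands that
# into one row index per stored value
row_of_entry = np.repeat(np.arange(num_rows), np.diff(row_start))

dense = np.zeros((num_rows, num_cols), dtype=values.dtype)
dense[row_of_entry, col_indices] = values

print(dense)
```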



In [None]:
# seed function

import numpy as np
np.random.seed(0)
print(np.random.rand(3,3))
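
`np.random.seed` sets NumPy's global random state. NumPy also provides the `Generator` interface via `np.random.default_rng`, where the seed belongs to a local object instead. A small sketch showing that two generators created with the same seed produce identical numbers:

```python
import numpy as np

# Two independent generators created with the same seed
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)

first = rng_a.random((3, 3))
second = rng_b.random((3, 3))

print(first)
print("Identical draws:", np.array_equal(first, second))
```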