### **Introduction to NumPy for Data Science**
**NumPy** (short for Numerical Python) is one of the most fundamental and widely used Python libraries for numerical computing and data manipulation. It provides fast N-dimensional arrays together with the numerical operations that act on them, which makes it essential for scientific computing and data science tasks.
In this note, we’ll cover the basics of NumPy, including its key features, how to work with arrays, important functions, and practical use cases.


#### Key Features of NumPy

1. **Multidimensional Arrays**: NumPy’s core is the `ndarray` (N-dimensional array) object, which is a fast, flexible container for large datasets.

2. **Mathematical Functions**: It includes efficient implementations of mathematical operations, such as element-wise operations, matrix transformations, and linear algebra functions.

3. **Broadcasting**: NumPy allows operations on arrays of different shapes, applying the operation element-wise through broadcasting.

4. **Integration with Other Libraries**: NumPy is the foundation of libraries like Pandas, SciPy, TensorFlow, and others in Python’s data science ecosystem.


#### NumPy Basics

##### 1. Importing NumPy
To use NumPy, you first need to import the library:

`import numpy as np`

It’s common to import NumPy with the alias **np**, as it saves typing in the code.

##### 2. NumPy Arrays

NumPy’s main object is the **array**. You can create arrays in different ways:


##### 2.1 Creating Arrays from Lists



In [2]:
import numpy as np

# 1D array
arr = np.array([1, 2, 3, 4, 5])
print(arr)


[1 2 3 4 5]


In [4]:
# 2D array (Matrix)
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix)


[[1 2 3]
 [4 5 6]]


##### 2.2 Creating Arrays with Zeros, Ones, and Custom Values
NumPy provides convenience functions to create arrays with specific values:

- **`np.zeros()`**: Creates an array filled with zeros.

- **`np.ones()`**: Creates an array filled with ones.

- **`np.full()`**: Creates an array filled with a specified value.


In [6]:
# Array of zeros
zero_array = np.zeros((2, 3))
print(zero_array)

# Array of ones
ones_array = np.ones((3, 4))
print(ones_array)

# Array filled with a specific value
custom_array = np.full((2, 2), 8)  # fill a 2x2 matrix with the value 8
print(custom_array)


[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[[8 8]
 [8 8]]


##### 2.3 Creating Arrays with a Sequence of Numbers
- **`np.arange()`**: Generates a sequence of numbers with a specified step.

- **`np.linspace()`**: Generates a sequence of numbers with a specified number of points between a start and stop value.

In [8]:
# Array with a range of numbers
range_array = np.arange(0, 12, 2)  # From 0 up to (but not including) 12, with a step of 2
print(range_array)

# Array with a sequence of evenly spaced numbers
linspace_array = np.linspace(0, 1, 10)  # From 0 to 1 inclusive, 10 points
print(linspace_array)


[ 0  2  4  6  8 10]
[0.         0.11111111 0.22222222 0.33333333 0.44444444 0.55555556
 0.66666667 0.77777778 0.88888889 1.        ]
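The difference between the two functions is easiest to see side by side. The following is a minimal sketch, not part of the original notebook; the variable names are illustrative, and `retstep` and `endpoint` are standard keyword arguments of `np.linspace`.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded
stepped = np.arange(0, 12, 2)
print(stepped)        # [ 0  2  4  6  8 10]

# linspace: you choose the number of points; the stop value is included by default
points, step = np.linspace(0, 1, 5, retstep=True)
print(points)         # [0.   0.25 0.5  0.75 1.  ]
print(step)           # 0.25

# endpoint=False leaves the stop value out, similar to arange
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)      # [0.  0.2 0.4 0.6 0.8]
```

As a rule of thumb, `np.arange()` fits integer ranges, while `np.linspace()` is usually the safer choice when a fixed number of floating-point samples is needed.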


In [11]:
import numpy as np
# NumPy matrices
A = np.arange(15,24).reshape(3,3)
B = np.arange(20,29).reshape(3,3)
print("A: ",A)
print("B: ",B)
# Multiply A and B
#result = A.dot(B)
#print("Result: ", result)

A:  [[15 16 17]
 [18 19 20]
 [21 22 23]]
B:  [[20 21 22]
 [23 24 25]
 [26 27 28]]


#### Array Operations
NumPy provides various operations for manipulating arrays efficiently. Because these operations run in compiled code over whole arrays, they are usually much faster than the equivalent loops over Python lists.

##### 1. Element-wise Operations
NumPy allows you to perform operations on each element of an array.

In [12]:
# Element-wise addition
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = arr1 + arr2  # Adds corresponding elements
print(result)

# Element-wise multiplication
result = arr1*arr2  # Multiplies corresponding elements
print(result)


[5 7 9]
[ 4 10 18]
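For comparison, here is a small sketch of the same addition done with plain Python lists. The names `values` and `weights` are illustrative only. Note that `+` on two lists concatenates them rather than adding element by element.

```python
import numpy as np

values = [1, 2, 3]
weights = [4, 5, 6]

# With lists, element-wise addition needs an explicit loop or comprehension
list_sum = [v + w for v, w in zip(values, weights)]

# With arrays, the + operator already works element by element
array_sum = np.array(values) + np.array(weights)

print(list_sum)            # [5, 7, 9]
print(array_sum)           # [5 7 9]
print(values + weights)    # [1, 2, 3, 4, 5, 6]  (list concatenation)
```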


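##### 2. Broadcasting
*Broadcasting* is a feature that allows NumPy to perform operations between arrays of different shapes. When the shapes are compatible, the smaller operand (a scalar or a lower-dimensional array) is treated as if it were repeated to match the larger one, without actually copying the data.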


In [14]:
# Add scalar to each element of the array
arr = np.array([1, 2, 3])
result = arr + 40
print(result)


[41 42 43]
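Broadcasting also works between arrays, not only between an array and a scalar. The sketch below is an illustration with made-up names (`grid`, `row`, `col`); it adds a 1D row and a column vector to a 2D array, and shows that incompatible shapes raise a `ValueError`.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])       # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)
col = np.array([[100], [200]])     # shape (2, 1)

# The row is reused for every row of grid
print(grid + row)
# [[11 22 33]
#  [14 25 36]]

# The column is reused for every column of grid
print(grid + col)
# [[101 102 103]
#  [204 205 206]]

# Shapes (2, 3) and (2,) do not line up on the last axis, so this fails
try:
    grid + np.array([1, 2])
    shapes_compatible = True
except ValueError as err:
    shapes_compatible = False
    print("broadcast failed:", err)
```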


##### 3. Mathematical Functions
NumPy provides a wide range of mathematical functions that can be applied to arrays.


In [22]:
arr = np.array([1, 2, 3, 4])

# Exponential function (e**x)
print(np.exp(arr))

# Square root
print(np.sqrt(arr))

# Square
print(np.square(arr))

# Trigonometric functions (input in radians)
print(np.sin(arr))
print(np.cos(arr))

# Powers
print(arr**2) # power of 2
print(arr**3) # power of 3
print(arr**4) # power of 4
print(arr**5) # power of 5

[ 2.71828183  7.3890561  20.08553692 54.59815003]
[1.         1.41421356 1.73205081 2.        ]
[ 1  4  9 16]
[ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
[ 0.54030231 -0.41614684 -0.9899925  -0.65364362]
[ 1  4  9 16]
[ 1  8 27 64]
[  1  16  81 256]
[   1   32  243 1024]


In [17]:
arr = np.array([1, 2, 3, 4])
arr_square = arr**2
arr_square

array([ 1,  4,  9, 16])

#### Array Indexing and Slicing
Just like Python lists, you can access and modify individual elements or slices of a NumPy array.
##### 1. Indexing


In [25]:
arr = np.array([10, 20, 30, 40, 50])
print(arr[3])  # Access element at index 3

40


##### 2. Slicing
You can extract specific parts (slices) of an array using slicing notation.


In [26]:
arr = np.array([10, 20, 30, 40, 50])

# Slice from index 1 to 4 (exclusive)
print(arr[1:4])

# Slice with a step
print(arr[::2])  # first element followed by every second element


[20 30 40]
[10 30 50]


For 2D arrays (matrices), you can access rows and columns with indexing and slicing.

In [None]:
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(matrix)

# Access the first row
print(matrix[0])

# Access the element at row 1, column 2
print(matrix[1, 2])

# Slice a submatrix (rows 0 and 1, columns 0 and 1)
print(matrix[0:2, 0:2])

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[1 2 3]
6
[[1 2]
 [4 5]]


This code demonstrates various ways of accessing and slicing elements in a NumPy 2D array (matrix). 

#### The matrix:

```python
matrix = np.array([[1, 2, 3], 
                   [4, 5, 6], 
                   [7, 8, 9]])
```

This creates a 3x3 matrix (2D array):
$$
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$

#### 1. **Access the first row (`matrix[0]`)**

```python
print(matrix[0])
```

- `matrix[0]` refers to the first row of the matrix.
- Indexing in Python and NumPy starts from `0`, so `matrix[0]` selects the entire first row (index `0`).
  
Output:
```plaintext
[1 2 3]
```

This prints the first row of the matrix, which is `[1, 2, 3]`.

#### 2. **Access the element at row 1, column 2 (`matrix[1, 2]`)**

```python
print(matrix[1, 2])
```

- `matrix[1, 2]` refers to the element located in the second row (index `1`) and the third column (index `2`).
  
In the matrix:
$$
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
The element in row 1, column 2 is `6`.

Output:
```plaintext
6
```

#### 3. **Slice a submatrix (`matrix[0:2, 0:2]`)**

```python
print(matrix[0:2, 0:2])
```

- `matrix[0:2, 0:2]` slices a submatrix from the original matrix.
- **Rows `0:2`**: This specifies that you want rows from index `0` up to (but not including) index `2`, which means you are selecting the first two rows.
- **Columns `0:2`**: Similarly, this specifies columns from index `0` up to (but not including) index `2`, meaning you are selecting the first two columns.

So, from the original matrix:
$$
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$

The submatrix formed by rows `0:2` and columns `0:2` is:
$$
\begin{bmatrix}
1 & 2 \\
4 & 5
\end{bmatrix}
$$

Output:
```plaintext
[[1 2]
 [4 5]]
```

#### Summary:

- `matrix[0]` returns the first row `[1, 2, 3]`.
- `matrix[1, 2]` accesses the element at row `1` and column `2`, which is `6`.
- `matrix[0:2, 0:2]` slices a 2x2 submatrix from the top-left corner, resulting in `[[1, 2], [4, 5]]`.
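The same notation extends to whole columns and to negative indices. This is a short sketch on the same 3x3 matrix; the variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

second_column = matrix[:, 1]   # every row, column index 1
last_row = matrix[-1]          # negative indices count from the end
corner = matrix[1:, 1:]        # bottom-right 2x2 block

print(second_column)   # [2 5 8]
print(last_row)        # [7 8 9]
print(corner)
# [[5 6]
#  [8 9]]
```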

#### Shape and Reshaping
##### 1. Array Shape
The shape of an array is its size along each dimension. You can access the shape of an array with `.shape` and its total number of elements with `.size`.


In [31]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)  # Output: (2, 3)
print(arr.size)  # Output: 6

(2, 3)
6


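##### 2. Reshaping Arrays
You can change the shape of an array with the `reshape()` method (or the equivalent `np.reshape()` function), as long as the total number of elements stays the same.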


In [35]:
arr = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = arr.reshape((3, 2))  # Reshape to a 3x2 array
print(reshaped_arr)


[[1 2]
 [3 4]
 [5 6]]
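Two details of reshaping are worth a quick sketch: one dimension can be given as `-1` so that NumPy infers it, and a target shape with a different number of elements raises a `ValueError`. The variable names below are illustrative.

```python
import numpy as np

arr = np.arange(1, 7)            # [1 2 3 4 5 6]

two_rows = arr.reshape(2, -1)    # -1 lets NumPy infer the 3 columns
print(two_rows.shape)            # (2, 3)

flat = two_rows.reshape(-1)      # back to one dimension
print(flat)                      # [1 2 3 4 5 6]

# 4 x 2 needs 8 elements, but arr only has 6
try:
    arr.reshape(4, 2)
    reshape_ok = True
except ValueError as err:
    reshape_ok = False
    print("reshape failed:", err)
```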


#### Aggregations and Statistics
NumPy offers a wide range of functions for calculating aggregates and statistics.


In [36]:
# Sum of all elements
print(np.sum(arr))

# Mean (average)
print(np.mean(arr))

# Standard deviation
print(np.std(arr))

# Maximum and minimum
print(np.max(arr))
print(np.min(arr))

21
3.5
1.707825127659933
6
1


You can apply these functions along specific axes in a multidimensional array:

In [26]:
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# Sum of each row (axis=1 collapses the columns)
print(np.sum(matrix, axis=1))

# Sum of each column (axis=0 collapses the rows)
print(np.sum(matrix, axis=0))
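The following is a minimal sketch of how the `axis` argument behaves on the same 2x3 matrix, with the results written as comments. The variable names are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

row_sums = np.sum(matrix, axis=1)     # one value per row
col_sums = np.sum(matrix, axis=0)     # one value per column
col_means = np.mean(matrix, axis=0)   # other aggregates accept axis too

print(row_sums)    # [ 6 15]
print(col_sums)    # [5 7 9]
print(col_means)   # [2.5 3.5 4.5]
```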


#### Linear Algebra with NumPy
NumPy has several functions for performing matrix and vector operations. These include:


##### 1. Dot Product:

In [3]:
# Dot product of two arrays
import numpy as np
a = np.array([1, 2])
b = np.array([3, 4])
dot_product = np.dot(a, b)
print(dot_product)


11


#### Code Breakdown:

```python
import numpy as np

# Two 1D arrays
a = np.array([1, 2])  # Array 'a' with elements [1, 2]
b = np.array([3, 4])  # Array 'b' with elements [3, 4]

# Compute the dot product
dot_product = np.dot(a, b)

# Print the result
print(dot_product)  # Output will be 11
```

#### What is the Dot Product?

For two 1D arrays (vectors) `a = [1, 2]` and `b = [3, 4]`, the **dot product** is calculated as:


$$ \text{dot product} = (a_1 \times b_1) + (a_2 \times b_2)$$


For the arrays in this example:

$$ \text{dot product} = (1 \times 3) + (2 \times 4) = 3 + 8 = 11 $$

#### Explanation of the Code:

1. **Arrays Creation**:
   - `a = np.array([1, 2])` creates a NumPy array `a` with elements `[1, 2]`.
   - `b = np.array([3, 4])` creates a NumPy array `b` with elements `[3, 4]`.

2. **Dot Product Calculation**:
   - `np.dot(a, b)` computes the dot product between the two arrays `a` and `b`.
   - For 1D arrays, this is equivalent to multiplying corresponding elements and then summing the results.

3. **Print Output**:
   - `print(dot_product)` outputs `11`, which is the result of the dot product of `[1, 2]` and `[3, 4]`.

#### General Formula for Dot Product of Two Vectors:
If you have two vectors $ a = [a_1, a_2, ..., a_n]$ and $ b = [b_1, b_2, ..., b_n] $, the dot product is calculated as:

$$\text{dot product} = a_1 \times b_1 + a_2 \times b_2 + ... + a_n \times b_n $$

In this case, since the vectors have only 2 elements, it's a simple summation of products of the corresponding elements.

#### Practical Use of Dot Product:
- **In Geometry**: The dot product is used to find the angle between two vectors.
- **In Machine Learning**: It is often used in calculations related to linear algebra, such as in the computation of weighted sums in neural networks and projections in data transformations.
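As an illustration of the geometric use, the angle between two vectors can be recovered from the dot product via $\cos\theta = \frac{a \cdot b}{\|a\|\,\|b\|}$. The sketch below applies this to the same `a` and `b`. `np.linalg.norm` computes the vector length, and the `np.clip` call is a defensive assumption against rounding error, not something the original notebook uses.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([3, 4])

# cos(theta) = (a . b) / (|a| * |b|)
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# clip guards against rounding pushing the value just outside [-1, 1]
angle_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
print(angle_deg)   # roughly 10.3 degrees
```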

##### 2. Matrix Multiplication:

In [12]:
# Matrix multiplication
matrix1 = np.array([[1, 2, 5], [3, 4, 7], [3, 4, 7]])
matrix2 = np.array([[5, 6, 4], [7, 8, 4], [2, 6, 9]])
result = np.dot(matrix1, matrix2)
print(result)

[[29 52 57]
 [57 92 91]
 [57 92 91]]


In [11]:
matrix1 = np.array([[1, 2, 5], [3, 4, 1], [3, 9, 7]])
matrix1

array([[1, 2, 5],
       [3, 4, 1],
       [3, 9, 7]])

In [6]:
matrix2 = np.array([[5, 6], [7, 8]])
matrix2 

array([[5, 6],
       [7, 8]])

In [5]:
matrix1 = np.array([[1, 2], [3, 4]])
matrix1

array([[1, 2],
       [3, 4]])

The walkthrough below uses the two smaller 2x2 arrays from the last two cells (`matrix1` and `matrix2`) and performs **matrix multiplication** on them with NumPy's `np.dot()` function. Let's break it down:

#### The matrices:

```python
matrix1 = np.array([[1, 2], 
                    [3, 4]])

matrix2 = np.array([[5, 6], 
                    [7, 8]])
```

These create two 2x2 matrices:

Matrix 1:
$$
\begin{bmatrix}
1 & 2 \\
3 & 4
\end{bmatrix}
$$

Matrix 2:
$$
\begin{bmatrix}
5 & 6 \\
7 & 8
\end{bmatrix}
$$

#### Matrix Multiplication:

Matrix multiplication is not the same as element-wise multiplication. Instead, each element of the resulting matrix is the dot product of one row of the first matrix with one column of the second matrix.

In NumPy, this operation is performed by:
```python
result = np.dot(matrix1, matrix2)
```

- The dot product of two matrices is calculated by multiplying the rows of the first matrix by the columns of the second matrix and summing the products.

For two 2x2 matrices $A$ and $B$, the element in row $i$ and column $j$ of the result is:

$$
\text{Result}_{i,j} = \sum_{k} A_{i,k} \times B_{k,j}
$$

#### Step-by-step Calculation:

For the first element of the resulting matrix (row 1, column 1):


$$(1 \times 5) + (2 \times 7) = 5 + 14 = 19$$


For the first row, second column: (row 1, column 2)


$$(1 \times 6) + (2 \times 8) = 6 + 16 = 22$$

For the second row, first column: (row 2, column 1)


$$(3 \times 5) + (4 \times 7) = 15 + 28 = 43$$

For the second row, second column:  (row 2, column 2)


$$(3 \times 6) + (4 \times 8) = 18 + 32 = 50$$

#### The result:

So, the resulting matrix is:
$$
\begin{bmatrix}
19 & 22 \\
43 & 50
\end{bmatrix}
$$

#### Output:

```python
print(result)
```

This will output:
```plaintext
[[19 22]
 [43 50]]
```

#### Summary:

- Matrix multiplication of `matrix1` and `matrix2` is performed using the `np.dot()` function.
- The resulting matrix is computed by summing the products of corresponding elements from rows of the first matrix and columns of the second matrix.
- The final result is `[[19, 22], [43, 50]]`.
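For 2D arrays, the `@` operator gives the same matrix product as `np.dot()`, while `*` stays element-wise. The sketch below contrasts the two on the same 2x2 matrices; `product` and `elementwise` are illustrative names.

```python
import numpy as np

matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])

product = matrix1 @ matrix2        # matrix product (rows times columns)
elementwise = matrix1 * matrix2    # element-wise product

print(product)
# [[19 22]
#  [43 50]]
print(elementwise)
# [[ 5 12]
#  [21 32]]

# For 2D inputs, @ and np.dot agree
print(np.array_equal(product, np.dot(matrix1, matrix2)))   # True
```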

##### 3. Determinant, Inverse, and Eigenvalues:

In [15]:
from numpy.linalg import inv, det, eig

matrix = np.array([[1, 2], [3, 4]])

# Determinant
print(det(matrix))

# Inverse
print(inv(matrix))

# Eigenvalues
print(eig(matrix))


-2.0000000000000004
[[-2.   1. ]
 [ 1.5 -0.5]]
EigResult(eigenvalues=array([-0.37228132,  5.37228132]), eigenvectors=array([[-0.82456484, -0.41597356],
       [ 0.56576746, -0.90937671]]))


In [None]:
# numpy.linalg is part of NumPy itself, so installing NumPy is enough
!pip install numpy

This code imports three essential linear algebra functions from the `numpy.linalg` module: `inv` (for inverse), `det` (for determinant), and `eig` (for eigenvalues and eigenvectors). It operates on a 2x2 matrix defined as `matrix = np.array([[1, 2], [3, 4]])`.

#### 1. **Determinant of the Matrix (`det`)**

The determinant is a scalar value that can be computed from a square matrix. It provides important properties of the matrix, such as whether it is invertible (a non-zero determinant implies that the matrix is invertible).

In the code:
```python
print(det(matrix))
```

The determinant of the matrix:

$$\text{matrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

is calculated as:

$$\text{det(matrix)} = (1 \times 4) - (2 \times 3) = 4 - 6 = -2$$


So, the determinant of the matrix is `-2`.

#### 2. **Inverse of the Matrix (`inv`)**

The inverse of a matrix $A$ is a matrix $A^{-1}$ such that:

$$A \times A^{-1} = I$$

where $I$ is the identity matrix.

In the code:
```python
print(inv(matrix))
```

For the matrix:
$$\text{matrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

its inverse (if the matrix is invertible, i.e., has a non-zero determinant) is calculated using the formula:

$$A^{-1} = \frac{1}{\text{det}(A)} \times \text{adj}(A)$$

where, for a 2x2 matrix, the adjugate $\text{adj}(A)$ (also called the classical adjoint) is found by swapping the two diagonal elements and negating the two off-diagonal elements.

So, for this matrix, the inverse is:

$$A^{-1} = \frac{1}{-2} \times \begin{bmatrix} 4 & -2 \\ -3 & 1 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 1.5 & -0.5 \end{bmatrix}$$


Output:
```
[[-2.   1. ]
 [ 1.5 -0.5]]
```

#### 3. **Eigenvalues and Eigenvectors (`eig`)**

Eigenvalues and eigenvectors are properties of a square matrix that help to understand the linear transformation it represents. An eigenvector is a non-zero vector whose direction is unchanged (or exactly reversed) when the matrix is applied to it. The matrix only scales it, and the scale factor is the corresponding eigenvalue.

In the code:
```python
print(eig(matrix))
```

The `eig(matrix)` function returns two arrays:
- The first array contains the **eigenvalues**.
- The second array contains the **eigenvectors**, one per column, in the same order as the eigenvalues.

For the matrix:

$$\text{matrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$


The eigenvalues and eigenvectors are calculated as:
```python
eigenvalues, eigenvectors = eig(matrix)
```

For this matrix, the two arrays hold the same values as the `EigResult` printed earlier:
```
Eigenvalues: [-0.37228132  5.37228132]
Eigenvectors: [[-0.82456484 -0.41597356]
               [ 0.56576746 -0.90937671]]
```

- **Eigenvalues**: `[-0.37, 5.37]`
- **Eigenvectors** (one per column):
    
   $$ \begin{bmatrix} -0.825 & -0.416 \\ 0.566 & -0.909 \end{bmatrix}$$
   

#### Summary:

- **`det(matrix)`** computes the determinant of the matrix. If the determinant is non-zero, the matrix is invertible.

- **`inv(matrix)`** computes the inverse of the matrix. The inverse exists only if the determinant is non-zero.

- **`eig(matrix)`** computes the eigenvalues and eigenvectors of the matrix, which describe its transformation properties in linear algebra.
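These three results can be checked numerically. The sketch below is an illustration rather than part of the original notebook: it verifies that $A A^{-1} \approx I$, that each eigenpair satisfies $A v = \lambda v$, and that the determinant equals the product of the eigenvalues. `np.allclose` and `np.isclose` are used because floating-point results are only approximately exact.

```python
import numpy as np
from numpy.linalg import inv, det, eig

matrix = np.array([[1, 2], [3, 4]])

# A times its inverse should be the identity, up to rounding
inverse_ok = np.allclose(matrix @ inv(matrix), np.eye(2))
print(inverse_ok)   # True

# Each column of eigenvectors pairs with the eigenvalue at the same index
eigenvalues, eigenvectors = eig(matrix)
eigen_ok = all(
    np.allclose(matrix @ eigenvectors[:, i], eigenvalues[i] * eigenvectors[:, i])
    for i in range(len(eigenvalues))
)
print(eigen_ok)     # True

# The determinant equals the product of the eigenvalues
det_ok = np.isclose(det(matrix), np.prod(eigenvalues))
print(det_ok)       # True
```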



#### Random Numbers with NumPy
NumPy has a submodule, `np.random`, for generating random numbers, which is useful for simulations and for testing machine learning models.


In [24]:
# Generate an array of random numbers
random_numbers = np.random.rand(5)
print(random_numbers)

# Generate a 3x3 array of random integers from 1 to 9 (the upper bound 10 is excluded)
random_integers = np.random.randint(1, 10, (3, 3))
print(random_integers)

# Set a seed for reproducibility
np.random.seed(42)
print(np.random.rand(3))  # Same result each time


[0.19639042 0.35444646 0.81336595 0.24784955 0.45861133]
[[2 9 3]
 [1 6 7]
 [2 6 6]]
[0.37454012 0.95071431 0.73199394]
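Newer NumPy code often uses the `Generator` interface created by `np.random.default_rng()` instead of the global `np.random.seed()`. The sketch below assumes NumPy 1.17 or later. The names `rng_a`, `rng_b`, and `dice` are illustrative, and the values produced differ from those of the legacy functions shown above even with the same seed.

```python
import numpy as np

# Two generators created with the same seed produce the same stream
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.random(3)
second = rng_b.random(3)
print(np.array_equal(first, second))   # True

# integers() excludes the upper bound by default, like randint
dice = rng_a.integers(1, 7, size=(3, 3))
print(dice.min() >= 1 and dice.max() <= 6)   # True
```

Keeping the generator in a variable, rather than seeding global state, makes it easier to see which part of a program depends on which random stream.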


#### Conclusion
NumPy is the backbone of numerical computation in Python and plays a crucial role in data science and machine learning. Its ability to handle arrays, perform mathematical operations, and interact with other libraries makes it indispensable for data scientists. Understanding NumPy is essential for mastering data manipulation and performing efficient numerical computations in your data science journey.
