# Session 13: NumPy Fundamentals

# NumPy: The Fundamental Package for Scientific Computing

## What is NumPy?
NumPy is the core library for scientific computing in Python. It provides:
- A powerful **n-dimensional array object (ndarray)**
- Derived objects such as masked arrays and matrices
- A collection of functions for fast numerical operations, including:
  - Mathematical computations
  - Logical operations
  - Shape manipulations
  - Sorting and selection
  - I/O operations
  - Discrete Fourier transforms
  - Basic linear algebra
  - Statistical operations
  - Random simulations

At the heart of NumPy is the `ndarray` object, which enables efficient storage and processing of large amounts of homogeneous data.

### Why Use NumPy Instead of Python Lists?
- **Fixed Size**: NumPy arrays have a fixed size, unlike Python lists that can grow dynamically.
- **Data Type Homogeneity**: All elements in a NumPy array must have the same data type, making operations faster.
- **Optimized Performance**: NumPy operations are executed efficiently due to optimized C and Fortran implementations.
- **Broad Adoption**: Many scientific libraries such as Pandas, SciPy, and Scikit-learn use NumPy arrays as their backbone.
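
The first two points can be seen in a few lines. This is a minimal sketch; the variable names are illustrative and not part of the original notes.

```python
import numpy as np

# Mixed input is upcast to one common dtype (homogeneity).
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Arrays do not grow in place: np.append returns a new array.
small = np.array([1, 2, 3])
bigger = np.append(small, 4)
print(small.size, bigger.size)  # 3 4
```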

---
## NumPy Arrays vs Python Sequences
| Feature          | NumPy Array | Python List |
|----------------|------------|-------------|
| **Size**       | Fixed       | Dynamic     |
| **Data Type**  | Homogeneous | Heterogeneous |
| **Performance**| Faster      | Slower due to dynamic typing |
| **Functionality** | Supports vectorized operations | Requires explicit looping |

---
## Creating NumPy Arrays
### Using `np.array()`
```python
import numpy as np

# 1D array
a = np.array([1,2,3])

# 2D array
b = np.array([[1,2,3],[4,5,6]])

# 3D array (Tensor)
c = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
```

### Creating an Array with a Specific Data Type
```python
np.array([1,2,3], dtype=float)    # Floating-point array
np.array([1,2,3], dtype=bool)     # Boolean array
np.array([1,2,3], dtype=complex)  # Complex number array
```

### Using `np.arange()`
- Similar to Python's `range()`, but returns a NumPy array.
- **Syntax:** `np.arange(start, stop, step, dtype=None)`
```python
np.arange(1, 11, 2)  # Output: array([1, 3, 5, 7, 9])
```

### Using `np.reshape()`
```python
np.arange(16).reshape(2, 2, 2, 2)  # 4D array
```

### Creating Arrays with Default Values
#### `np.ones()` and `np.zeros()`
- **Syntax:**
  - `np.ones(shape, dtype=float)`
  - `np.zeros(shape, dtype=float)`
```python
np.ones((3,4))  # 3x4 matrix of ones
np.zeros((3,4))  # 3x4 matrix of zeros
```

#### `np.random.random()`
- **Syntax:** `np.random.random(shape)`
```python
np.random.random((3,4))  # 3x4 matrix of random floats in the half-open interval [0, 1)
```

#### `np.linspace()`
- Generates evenly spaced numbers over a specified range.
- **Syntax:** `np.linspace(start, stop, num, dtype=None)`
```python
np.linspace(-10, 10, 10, dtype=int)  # 10 values from -10 to 10; the int cast makes the spacing only approximately even
```
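
Without an integer `dtype`, the spacing is exact. The sketch below also uses the `endpoint` and `retstep` keyword arguments of `np.linspace`, which the notes above do not cover.

```python
import numpy as np

# Five points from 0 to 1, both endpoints included (the default).
pts = np.linspace(0, 1, 5)
print(pts)  # values 0, 0.25, 0.5, 0.75, 1

# retstep=True also returns the spacing between neighbouring points.
pts_again, step = np.linspace(0, 1, 5, retstep=True)
print(step)  # 0.25

# endpoint=False leaves out the stop value.
half_open = np.linspace(0, 1, 4, endpoint=False)
print(half_open)  # values 0, 0.25, 0.5, 0.75
```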

#### `np.identity()`
- Creates an identity matrix.
- **Syntax:** `np.identity(n, dtype=float)`
```python
np.identity(3)  # 3x3 identity matrix
```

---
## NumPy Array Attributes
```python
a1 = np.arange(10, dtype=np.int32)  # 1D array
a2 = np.arange(12, dtype=float).reshape(3,4)  # 3x4 matrix
a3 = np.arange(8).reshape(2,2,2)  # 3D array
```

### Number of Dimensions (`ndim`)
```python
a3.ndim  # Output: 3 (3D array)
```

### Shape of the Array (`shape`)
```python
print(a3.shape)  # Output: (2, 2, 2)
```

### Total Number of Elements (`size`)
```python
print(a2.size)  # Output: 12
```

### Size of Each Element (`itemsize`)
```python
a3.itemsize  # Output: Size in bytes of each element
```
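
`itemsize` and `size` together determine the memory used by the array's data, which is available as `nbytes`. A small sketch with an explicitly chosen dtype, so the numbers do not depend on the platform:

```python
import numpy as np

arr = np.arange(12, dtype=np.int32).reshape(3, 4)

print(arr.itemsize)  # 4  -> bytes per int32 element
print(arr.size)      # 12 -> number of elements
print(arr.nbytes)    # 48 -> size * itemsize
```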

### Data Type (`dtype`)
```python
print(a1.dtype)  # Output: int32
print(a2.dtype)  # Output: float64
print(a3.dtype)  # Output: the platform's default integer (int64 on most systems; int32 on Windows with NumPy < 2.0)
```

---
## Changing Data Type
#### `astype()`
- Converts the data type of an array.
- **Syntax:** `array.astype(dtype, order='K', casting='unsafe', subok=True, copy=True)`
```python
a3.astype(np.int32)  # Convert array elements to int32
```
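
Two details of `astype()` are worth seeing in action: it returns a new array, and converting floats to integers drops the fractional part. The array below is an illustrative example, not taken from the notes.

```python
import numpy as np

floats = np.array([1.7, -2.3, 3.9])

# astype returns a new array; float -> int truncates toward zero.
ints = floats.astype(np.int32)

print(ints)          # [ 1 -2  3]
print(floats.dtype)  # float64 (the original array is unchanged)
```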

---

## Array Operations in NumPy

### Creating Arrays

```python
import numpy as np

# Creating a 3x4 array with values from 0 to 11
a1 = np.arange(12).reshape(3, 4)

# Creating another 3x4 array with values from 12 to 23
a2 = np.arange(12, 24).reshape(3, 4)

a2  # Display array
```

### Scalar Operations
#### Arithmetic Operations with Scalars
Scalar operations apply an operation element-wise to each element of the array.
```python
# Squaring each element in a1
a1 ** 2
```

#### Relational Operations with Scalars
```python
# Checking which elements in a2 are equal to 15
a2 == 15
```
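
A relational operation returns a boolean array of the same shape. One common use, sketched here with an illustrative array, is counting or selecting the matching elements.

```python
import numpy as np

a = np.arange(12, 24).reshape(3, 4)
mask = a > 20      # boolean array with the same shape as a

print(mask.dtype)  # bool
print(mask.sum())  # 3 -> True counts as 1, so this counts the matches
print(a[mask])     # [21 22 23] -> the matching elements
```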

### Vectorized Operations
Vectorized operations perform element-wise operations between two arrays of the same shape.

#### Arithmetic Operations Between Arrays
```python
# Element-wise exponentiation (large results can silently overflow the integer dtype)
a1 ** a2
```
This means that each element in `a1` is raised to the power of the corresponding element in `a2`.

Similarly, other arithmetic operations can be performed:
```python
# Element-wise addition
a1 + a2

# Element-wise subtraction
a1 - a2

# Element-wise multiplication
a1 * a2

# Element-wise division
a1 / a2  # Division by zero yields inf or nan with a RuntimeWarning, not an exception
```
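
When a denominator may contain zeros, two options are sketched below: suppressing the warning locally with `np.errstate`, or dividing only where it is safe using the `where` and `out` arguments of `np.divide`. The arrays are illustrative.

```python
import numpy as np

num = np.array([1.0, 2.0, 3.0])
den = np.array([2.0, 0.0, 4.0])

# Silence the warning locally; the result still contains inf.
with np.errstate(divide="ignore"):
    raw = num / den
print(raw)  # 0.5, inf, 0.75

# Divide only where the denominator is non-zero;
# the other positions keep the 0.0 already stored in `out`.
safe = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
print(safe)  # 0.5, 0.0, 0.75
```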

---

## Array Functions in NumPy

### Generating Random Arrays
```python
# Creating a 3x3 matrix with random values scaled to 100
a1 = np.random.random((3,3))
a1 = np.round(a1 * 100)
a1
```

### Aggregate Functions
These functions compute values across the entire array or along a specified axis.

#### Maximum, Minimum, Sum, Product
```python
# Product of elements along columns (axis=0)
np.prod(a1, axis=0)
```
Similarly:
```python
np.max(a1, axis=0)  # Maximum values along columns
np.min(a1, axis=1)  # Minimum values along rows
np.sum(a1, axis=0)  # Sum along columns
```
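
Because `a1` is random, the effect of `axis` is easier to follow on a fixed array. A small sketch:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.sum(m))          # 21      -> whole array
print(np.sum(m, axis=0))  # [5 7 9] -> one value per column
print(np.sum(m, axis=1))  # [ 6 15] -> one value per row
```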

#### Mean, Median, Standard Deviation, Variance
```python
np.var(a1, axis=1)  # Variance along rows
```
Similarly:
```python
np.mean(a1, axis=0)  # Mean along columns
np.median(a1, axis=1)  # Median along rows
np.std(a1, axis=0)  # Standard deviation along columns
```
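
The relationship between these statistics can be checked on a small fixed dataset. The `ddof` argument used at the end is not covered in the notes; it switches from the population to the sample formula.

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

print(np.mean(data))  # 5.0
print(np.var(data))   # 4.0 -> population variance (ddof=0)
print(np.std(data))   # 2.0 -> square root of the variance

# ddof=1 divides by n - 1 instead of n (sample variance).
sample_var = np.var(data, ddof=1)
print(sample_var)
```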

### Trigonometric Functions
```python
np.sin(a1)
```
NumPy's trigonometric functions such as `np.sin` interpret their input in radians. They come up less often in typical data science work.

### Matrix Operations
#### Dot Product
```python
# Creating two matrices for dot product
a2 = np.arange(12).reshape(3,4)
a3 = np.arange(12,24).reshape(4,3)

# Matrix multiplication (dot product)
np.dot(a2, a3)
```
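
For 2D arrays, `np.dot` requires the number of columns of the first array to equal the number of rows of the second. A small worked sketch with illustrative arrays:

```python
import numpy as np

left = np.arange(6).reshape(2, 3)   # shape (2, 3)
right = np.arange(6).reshape(3, 2)  # shape (3, 2)

product = np.dot(left, right)       # inner dimensions (3 and 3) match
print(product.shape)  # (2, 2)
print(product)
# [[10 13]
#  [28 40]]

# For 2D arrays the @ operator gives the same result.
print(np.array_equal(product, left @ right))  # True
```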

### Logarithmic and Exponential Functions
```python
np.exp(a1)  # Exponential function
np.log(a1)  # Natural logarithm
np.log10(a1)  # Base-10 logarithm
```

### Rounding Functions
```python
np.round(a1)  # Rounds to the nearest integer
np.floor(a1)  # Rounds down to the nearest integer
np.ceil(a1)   # Rounds up to the nearest integer
```
These functions are useful when floating-point data needs to be approximated by whole numbers. Note that the results are still stored as floats.
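
The three functions differ on negative numbers and on exact halves. In the sketch below, note that `np.round` sends ties to the nearest even value rather than always rounding up.

```python
import numpy as np

vals = np.array([-1.5, 0.5, 1.5, 2.5, 2.7])

rounded = np.round(vals)  # -2, 0, 2, 2, 3 -> ties go to the nearest even value
floored = np.floor(vals)  # -2, 0, 1, 2, 2 -> always toward negative infinity
ceiled = np.ceil(vals)    # -1, 1, 2, 3, 3 -> always toward positive infinity

print(rounded, floored, ceiled)
```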

---




```python
import numpy as np

a1 = np.arange(12).reshape(3, 4)
print(a1)

print(a1.max(axis=1))

# Just another way of doing the same thing as above
print(np.max(a1, axis=1))

a2 = np.linspace(1, 100, 12, dtype=int)
print(f"a2: {a2}\n")

a3 = a2.reshape(3, 4)
print(f"a3:\n {a3}\n")

a4 = a1 ** 2
print(f"a4:\n {a4}\n")

a5 = a3 + a4
print(f"a5:\n {a5}\n")

print(f"a5.var: {a5.var(axis=1)}\n")
print(f"a5.mean: {a5.mean(axis=1)}\n")
print(f"a5.std: {a5.std(axis=1)}\n")

arr = np.arange(1, 11)
```

Output:

```
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ 3  7 11]
[ 3  7 11]
a2: [  1  10  19  28  37  46  55  64  73  82  91 100]

a3:
 [[  1  10  19  28]
 [ 37  46  55  64]
 [ 73  82  91 100]]

a4:
 [[  0   1   4   9]
 [ 16  25  36  49]
 [ 64  81 100 121]]

a5:
 [[  1  11  23  37]
 [ 53  71  91 113]
 [137 163 191 221]]

a5.var: [181. 501. 981.]

a5.mean: [ 18.  82. 178.]

a5.std: [13.45362405 22.38302929 31.32091953]
```




---

## **Indexing and Slicing in NumPy**

NumPy allows for flexible indexing and slicing, making it easy to extract subsets of data from arrays.

### **Creating Sample Arrays**
```python
import numpy as np

a1 = np.arange(10)  # 1D array from 0 to 9
a2 = np.arange(12).reshape(3, 4)  # 2D array (3x4)
a3 = np.arange(8).reshape(2, 2, 2)  # 3D array (2x2x2)

print(a3)
print(a2)
```

---

### **Indexing Examples**
Indexing is used to access specific elements from an array.

```python
print(a3[1, 0, 1])  # Access element from a 3D array
print(a2[1, 0])  # Access element from a 2D array
```

---

### **Slicing in NumPy**
Slicing helps extract subarrays based on specified indices.

#### **Slicing in 1D Array**
```python
print(a1[2:5:2])  # Elements from index 2 up to (but not including) 5, step 2 -> [2 4]
```

#### **Slicing in 2D Arrays**
```python
print(a2[::2, 1::2])  # Rows 0 and 2; columns 1 and 3
print(a2[1:, 1:3])    # Rows 1 onwards; columns 1 and 2
print(a2[::2, 1:4:2]) # Rows 0 and 2; columns 1 and 3 (stop index 4 is excluded)
print(a2[0:2, 1:4])   # Rows 0 and 1; columns 1 to 3
```
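
One point the examples above do not show: a basic slice is a view onto the original array, not a copy. The sketch below uses illustrative names and `.copy()` to get independent data.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
block = grid[0:2, 1:3]   # basic slice -> view on grid

block[0, 0] = 99         # writes through to grid[0, 1]
print(grid[0, 1])        # 99

independent = grid[0:2, 1:3].copy()
independent[0, 0] = -1   # does not touch grid
print(grid[0, 1])        # still 99
```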

---

### **Advanced Slicing in 3D Arrays**
```python
a3 = np.arange(27).reshape(3, 3, 3)  # Creating a 3x3x3 array
print(a3)
```

#### **Extracting elements using complex slicing**
```python
print(a3[::2, 0, ::2])  # Select alternate slices, row 0, alternate columns
print(a3[2, 1:, 1:])    # Select last depth slice, last two rows, last two columns
print(a3[0, 1, :])      # Select row 1 of first depth slice
```

#### **Tricky Question: Extracting Specific Values (0, 2, 18, 20)**
```python
print(a3[0::2, 0, 0::2])
```
- `0::2` → Selects alternate slices (0, 2)
- `0` → First row of each selected slice
- `0::2` → Alternate columns (0, 2)
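
The three parts of the index can also be applied one axis at a time, which makes the intermediate shapes visible. A sketch using a hypothetical `cube` array with the same contents as `a3`:

```python
import numpy as np

cube = np.arange(27).reshape(3, 3, 3)

step1 = cube[0::2]      # slices 0 and 2       -> shape (2, 3, 3)
step2 = step1[:, 0]     # row 0 of each slice  -> shape (2, 3)
step3 = step2[:, 0::2]  # columns 0 and 2      -> shape (2, 2)

print(step3)
# [[ 0  2]
#  [18 20]]
print(np.array_equal(step3, cube[0::2, 0, 0::2]))  # True
```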

---

## **Iterating Over NumPy Arrays**
Iteration is used to traverse through elements of an array.

### **Iterating Over a 1D Array**
```python
for i in a1:
    print(i)
```
- Prints each element in `a1`.

### **Iterating Over a 2D Array**
```python
for i in a2:
    print(i)
```
- Prints each row in `a2`.

### **Iterating Over a 3D Array**
```python
for i in a3:
    print(i)
```
- Prints each **2D slice** in `a3`.

### **Using `np.nditer()` for Iterating**
```python
for i in np.nditer(a3):
    print(i)
```
#### **Explanation of `np.nditer()`**
- `np.nditer()` is used to iterate over elements of any **n-dimensional array** as if it were a **flattened 1D array**.
- It ensures that elements are accessed one by one without requiring manual flattening.
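
As a quick check of this behaviour, the sketch below collects the values produced by `np.nditer()` and compares them with the array's `flat` iterator, an alternative not mentioned in the notes.

```python
import numpy as np

arr3d = np.arange(8).reshape(2, 2, 2)

# nditer yields zero-dimensional arrays, so convert each to a Python int.
via_nditer = [int(x) for x in np.nditer(arr3d)]

# arr.flat is another way to visit every element in order.
via_flat = [int(x) for x in arr3d.flat]

print(via_nditer)              # [0, 1, 2, 3, 4, 5, 6, 7]
print(via_nditer == via_flat)  # True
```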

---


```python
import numpy as np

a1 = np.arange(10)  # 1D array from 0 to 9
a2 = np.arange(12).reshape(3, 4)  # 2D array (3x4)
a3 = np.arange(8).reshape(2, 2, 2)  # 3D array (2x2x2)

print(a3[1, 0, 1])  # Access element from a 3D array
print(a2[1, 0])  # Access element from a 2D array

print(a2[::2, 1::2])  # Rows 0 and 2; columns 1 and 3
print(a2[1:, 1:3])    # Rows 1 onwards; columns 1 and 2
print(a2[::2, 1:4:2]) # Rows 0 and 2; columns 1 and 3 (stop index 4 is excluded)
print(a2[0:2, 1:4])   # Rows 0 and 1; columns 1 to 3

a3 = np.arange(27).reshape(3, 3, 3)  # Creating a 3x3x3 array
print(a3)

print(a3[::2, 0, ::2])  # Select alternate slices, row 0, alternate columns
print(a3[2, 1:, 1:])    # Select last depth slice, last two rows, last two columns
print(a3[0, 1, :])      # Select row 1 of first depth slice

print(a3[0::2, 0, 0::2])
```

Output:

```
5
4
[[ 1  3]
 [ 9 11]]
[[ 5  6]
 [ 9 10]]
[[ 1  3]
 [ 9 11]]
[[1 2 3]
 [5 6 7]]
[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]

 [[18 19 20]
  [21 22 23]
  [24 25 26]]]
[[ 0  2]
 [18 20]]
[[22 23]
 [25 26]]
[3 4 5]
[[ 0  2]
 [18 20]]
```



---

## **Reshaping in NumPy**
### **Understanding `reshape()`**
The `reshape()` function in NumPy allows changing the dimensions of an array without altering its data. It is useful when we need to organize data into a specific shape.

```python
import numpy as np

a1 = np.arange(12)  # 1D array from 0 to 11
a2 = np.arange(12).reshape(3, 4)  # Reshaped into 3 rows and 4 columns

print(a2)
```
- The total number of elements must remain the same before and after reshaping.
- The `-1` parameter can be used to infer one dimension automatically.

```python
a2.reshape(4, -1)  # Reshape into 4 rows and let NumPy determine columns
```
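
A short sketch of both rules: `-1` is filled in from the element count, and a shape that cannot hold all elements raises `ValueError`. The array name is illustrative.

```python
import numpy as np

flat = np.arange(12)

print(flat.reshape(4, -1).shape)     # (4, 3)
print(flat.reshape(-1, 6).shape)     # (2, 6)
print(flat.reshape(2, 2, -1).shape)  # (2, 2, 3)

# 12 elements cannot be arranged in 5 rows, so NumPy raises ValueError.
try:
    flat.reshape(5, -1)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```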

---

### **Transpose in NumPy**
The `transpose()` function swaps an array's rows and columns. It is particularly useful in:
1. **Linear Algebra** (Matrix operations)
2. **Data Manipulation** (Rearranging datasets)
3. **Image Processing** (Swapping an image's height and width axes)

```python
print(a2.T)  # Transpose of a2
```
- Converts a 3×4 matrix into a 4×3 matrix.
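
The sketch below checks the element mapping for a 2D array and shows what happens with more than two dimensions, where `.T` reverses the axis order and `np.transpose` accepts an explicit order. The arrays are illustrative.

```python
import numpy as np

mat = np.arange(12).reshape(3, 4)
print(mat.T.shape)               # (4, 3)
print(mat.T[1, 2] == mat[2, 1])  # True -> element (i, j) moves to (j, i)

# With more than two dimensions, .T reverses the order of the axes.
cube = np.arange(24).reshape(2, 3, 4)
print(cube.T.shape)                         # (4, 3, 2)

# np.transpose takes an explicit axis order when a different one is needed.
print(np.transpose(cube, (1, 0, 2)).shape)  # (3, 2, 4)
```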

---

### **Flattening with `ravel()`**
The `ravel()` function converts an N-dimensional array into a 1D array while maintaining the original order of elements.

```python
a3 = np.arange(8).reshape(2, 2, 2)  # Creating a 3D array
print(a3.ravel())  # Flattening a3
```
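
`ravel()` is often compared with `flatten()`, which the notes do not cover. As a hedged sketch: `ravel()` returns a view when the memory layout allows it, while `flatten()` always returns a copy.

```python
import numpy as np

arr = np.arange(8).reshape(2, 2, 2)

raveled = arr.ravel()      # view when the memory layout allows it
flattened = arr.flatten()  # always a copy

print(np.shares_memory(arr, raveled))    # True for this contiguous array
print(np.shares_memory(arr, flattened))  # False
```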

---

## **Stacking in NumPy**
### **Understanding Stacking**
Stacking refers to joining multiple arrays together. This method is useful when we need to combine datasets coming from different sources.

There are two main types of stacking:
1. **Horizontal Stacking (`hstack`)** → Joins arrays along columns.
2. **Vertical Stacking (`vstack`)** → Joins arrays along rows.

#### **Example: Horizontal and Vertical Stacking**
```python
a4 = np.arange(12).reshape(3, 4)
a5 = np.arange(12, 24).reshape(3, 4)

print(a4)
print(a5)
```

##### **Horizontal Stacking (`hstack`)**
Joins arrays **side by side** (along columns).
```python
np.hstack((a4, a5))
```
- Combines `a4` and `a5` column-wise.
- Resulting shape: `(3, 8)`

##### **Vertical Stacking (`vstack`)**
Joins arrays **on top of each other** (along rows).
```python
np.vstack((a4, a5))
```
- Combines `a4` and `a5` row-wise.
- Resulting shape: `(6, 4)`
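
For 2D arrays, both functions match `np.concatenate` along a particular axis. The sketch below checks this with illustrative arrays; `np.concatenate` itself is not covered in the notes.

```python
import numpy as np

top = np.arange(12).reshape(3, 4)
bottom = np.arange(12, 24).reshape(3, 4)

# hstack joins along axis 1 (columns); vstack joins along axis 0 (rows).
same_h = np.array_equal(np.hstack((top, bottom)),
                        np.concatenate((top, bottom), axis=1))
same_v = np.array_equal(np.vstack((top, bottom)),
                        np.concatenate((top, bottom), axis=0))

print(same_h, same_v)  # True True
```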

---

## **Splitting in NumPy**
### **Understanding Splitting**
Splitting is the opposite of stacking. It divides an array into multiple smaller arrays.

Splitting is useful for:
1. **Preprocessing datasets** (Splitting training and test sets)
2. **Data Manipulation** (Extracting specific portions of data)

Two types of splitting:
1. **Horizontal Splitting (`hsplit`)** → Splits an array column-wise.
2. **Vertical Splitting (`vsplit`)** → Splits an array row-wise.

#### **Example: Splitting a 2D Array**
```python
print(a4)
np.hsplit(a4, 2)  # Splitting into 2 equal parts along columns
```
- Divides `a4` into two sub-arrays with **equal column count**.

```python
print(a5)
np.vsplit(a5, 3)  # Splitting into 3 equal parts along rows
```
- Divides `a5` into three sub-arrays with **equal row count**.
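
The "equal" requirement matters: asking for a number of parts that does not divide the axis evenly raises `ValueError`. The sketch below shows this, along with `np.array_split`, an alternative not covered in the notes that allows unequal parts.

```python
import numpy as np

table = np.arange(12).reshape(3, 4)

left, right = np.hsplit(table, 2)
print(left.shape, right.shape)  # (3, 2) (3, 2)

# 4 columns cannot be divided into 3 equal parts.
try:
    np.hsplit(table, 3)
    split_failed = False
except ValueError as err:
    split_failed = True
    print("hsplit failed:", err)

# np.array_split accepts the same request and returns unequal parts.
parts = np.array_split(table, 3, axis=1)
print([p.shape for p in parts])  # [(3, 2), (3, 1), (3, 1)]
```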

---

## **Key Takeaways**
- **Reshaping** → Changes array shape without altering data.
- **Transpose** → Swaps rows and columns (useful in matrix operations).
- **Ravel** → Flattens an array into 1D.
- **Stacking** (`hstack`, `vstack`) → Joins arrays either horizontally or vertically.
- **Splitting** (`hsplit`, `vsplit`) → Splits arrays into smaller arrays.
