## **1. Array Shape Manipulation**
Often, you'll need to change the shape of your data to make it compatible with other data or algorithms.
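- **reshape(new_shape):** Returns the array with a new shape without changing its data. The new shape must contain the same number of elements as the original; one dimension may be given as -1, in which case NumPy infers it from the total size.
- **T attribute or transpose():** Transposes an array by reversing the order of its axes. For a 2D array, rows become columns and columns become rows.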
- **ravel() or flatten():** Converts a multi-dimensional array into a 1D array.
    - **ravel()** creates a view (faster, memory-efficient) if possible.
    - **flatten()** always creates a copy.

In [2]:
import numpy as np

# reshape
arr = np.arange(12) # [0, 1, ..., 11]
print(f"Original array: {arr}")
reshaped_arr = arr.reshape(3, 4)
print(f"\nReshaped array (3x4):\n{reshaped_arr}")

# transpose
print(f"\nTransposed array (4x3):\n{reshaped_arr.T}")

# ravel
raveled_arr = reshaped_arr.ravel()
print(f"\nRaveled array: {raveled_arr}")
raveled_arr[0] = 100 # Modifying the raveled array
print(f"Original reshaped array is also modified:\n{reshaped_arr}")

# flatten
arr2 = np.arange(12).reshape(3, 4)
flattened_arr = arr2.flatten()
print(f"\nFlattened array: {flattened_arr}")
flattened_arr[0] = 999 # Modifying the flattened array
print(f"Original array is NOT modified:\n{arr2}")

Original array: [ 0  1  2  3  4  5  6  7  8  9 10 11]

Reshaped array (3x4):
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Transposed array (4x3):
[[ 0  4  8]
 [ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]]

Raveled array: [ 0  1  2  3  4  5  6  7  8  9 10 11]
Original reshaped array is also modified:
[[100   1   2   3]
 [  4   5   6   7]
 [  8   9  10  11]]

Flattened array: [ 0  1  2  3  4  5  6  7  8  9 10 11]
Original array is NOT modified:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
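
The phrase "if possible" for ravel() is worth checking directly. The sketch below is an illustration rather than part of the original cells: it uses np.shares_memory to test whether a result shares memory with its source, and it assumes the default row-major (C) order. The names base, inferred, view_flat, and copy_flat are made up for this example.

```python
import numpy as np

base = np.arange(6).reshape(2, 3)  # contiguous 2x3 array

# -1 asks NumPy to infer that dimension from the total size (6 / 3 = 2).
inferred = np.arange(6).reshape(-1, 3)
print(inferred.shape)  # (2, 3)

# On a contiguous array, ravel() can return a view that shares memory.
view_flat = base.ravel()
print(np.shares_memory(base, view_flat))  # True

# The transpose is not contiguous in row-major order, so ravel() has to copy.
copy_flat = base.T.ravel()
print(np.shares_memory(base, copy_flat))  # False

# flatten() copies regardless of the memory layout.
print(np.shares_memory(base, base.flatten()))  # False
```

Because ravel() may or may not return a view, code that needs an independent array is safer calling flatten() or .copy() explicitly.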


## **2. Combining and Splitting Arrays**
- **Combining:**
    - np.concatenate((arr1, arr2, ...), axis=...): Joins a sequence of arrays along an existing axis.
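    - np.vstack((arr1, arr2)): Stacks arrays vertically (row-wise). For 2D arrays this is equivalent to concatenate with axis=0; 1D inputs are first treated as single rows.
    - np.hstack((arr1, arr2)): Stacks arrays horizontally (column-wise). For 2D arrays this is equivalent to concatenate with axis=1; 1D inputs are simply joined end to end.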
- **Splitting:**
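    - np.split(arr, indices_or_sections, axis=...): Splits an array into multiple sub-arrays. An integer requests that many equal sections (a ValueError is raised if the axis cannot be divided equally); a list of indices gives the positions at which to cut.
    - np.hsplit(arr, indices_or_sections) and np.vsplit(arr, indices_or_sections) are convenient shortcuts for splitting a 2D array column-wise (axis=1) and row-wise (axis=0).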

In [2]:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Vertical stacking
vstacked = np.vstack((arr1, arr2))
print(f"Vertically stacked:\n{vstacked}")

# Horizontal stacking
hstacked = np.hstack((arr1, arr2))
print(f"\nHorizontally stacked:\n{hstacked}")

# Splitting
arr_to_split = np.arange(16).reshape(4, 4)
print(f"\nArray to split:\n{arr_to_split}")

# Vertical split into 2 equal parts
top, bottom = np.vsplit(arr_to_split, 2)
print(f"\nTop part:\n{top}")
print(f"\nBottom part:\n{bottom}")

Vertically stacked:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]

Horizontally stacked:
[[1 2 5 6]
 [3 4 7 8]]

Array to split:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]

Top part:
[[0 1 2 3]
 [4 5 6 7]]

Bottom part:
[[ 8  9 10 11]
 [12 13 14 15]]
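
To connect the shortcuts back to np.concatenate and np.split, here is a small sketch. It is an illustration only; the variable names (a, b, grid, and so on) are made up for this example, and np.array_split is shown as one option for uneven splits, which the cells above do not cover.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# For 2D inputs, each shortcut matches concatenate along the stated axis.
same_v = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))
same_h = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1))
print(same_v, same_h)  # True True

grid = np.arange(16).reshape(4, 4)

# An integer asks for that many equal sections along the axis.
left, right = np.split(grid, 2, axis=1)
print(left.shape, right.shape)  # (4, 2) (4, 2)

# A list gives the cut positions: rows [0:1], [1:3], and [3:4].
first, middle, last = np.split(grid, [1, 3], axis=0)
print(first.shape, middle.shape, last.shape)  # (1, 4) (2, 4) (1, 4)

# np.split raises ValueError when the axis cannot be divided equally;
# np.array_split accepts uneven sections instead.
uneven = np.array_split(grid, 3, axis=0)
print([part.shape for part in uneven])  # [(2, 4), (1, 4), (1, 4)]
```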


## **3. Aggregations and Statistical Functions Along Axes**
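We've seen .sum(), .min(), and .max(). By default, they reduce the entire array to a single value. You can instead pass an axis argument, which names the axis that gets collapsed by the operation.
- **axis=0:** Operates "down" the columns, collapsing the rows and producing one result per column.
- **axis=1:** Operates "across" the rows, collapsing the columns and producing one result per row.

This is a critical concept for data analysis.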

In [4]:
data = np.arange(1, 10).reshape(3, 3)
print(f"Original data:\n{data}")

# Overall sum (no axis)
print(f"\nOverall sum: {data.sum()}")

# Sum of each column (axis=0)
print(f"Sum of each column (axis=0): {data.sum(axis=0)}")

# Sum of each row (axis=1)
print(f"Sum of each row (axis=1): {data.sum(axis=1)}")

# Other common functions: .mean(), .std() (standard deviation), .var() (variance)
# Note: .std() and .var() use the population formulas by default (ddof=0)
print(f"\nMean of each column (axis=0): {data.mean(axis=0)}")
print(f"Standard deviation of each row (axis=1): {data.std(axis=1)}")

Original data:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Overall sum: 45
Sum of each column (axis=0): [12 15 18]
Sum of each row (axis=1): [ 6 15 24]

Mean of each column (axis=0): [4. 5. 6.]
Standard deviation of each row (axis=1): [0.81649658 0.81649658 0.81649658]
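
A non-square array makes it easier to see which axis is collapsed, because the result shapes differ. The sketch below is an illustration with made-up names (scores, col_sums, and so on). It also shows keepdims=True, which keeps the reduced axis with length 1, and ddof=1, which switches .std() to the sample formula.

```python
import numpy as np

scores = np.arange(1, 7).reshape(2, 3)  # [[1, 2, 3], [4, 5, 6]]

col_sums = scores.sum(axis=0)  # rows collapsed: one value per column
row_sums = scores.sum(axis=1)  # columns collapsed: one value per row
print(col_sums, col_sums.shape)  # [5 7 9] (3,)
print(row_sums, row_sums.shape)  # [ 6 15] (2,)

# keepdims=True keeps the collapsed axis with length 1, giving shape (2, 1),
# so the result lines up with the original array for further arithmetic.
row_means = scores.mean(axis=1, keepdims=True)
centered = scores - row_means
print(row_means.shape)  # (2, 1)
print(centered)

# ddof=1 divides by (n - 1) instead of n (sample standard deviation).
sample_std = scores.std(axis=1, ddof=1)
print(sample_std)  # [1. 1.]
```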


## **4. Broadcasting**
Broadcasting is the mechanism that lets NumPy perform arithmetic on arrays of different shapes. Conceptually, NumPy "stretches" the smaller array across the larger one until their shapes are compatible, without actually copying the data. Compatibility is decided by three rules:
- **Rule 1:** If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
- **Rule 2:** If the two shapes differ in some dimension, the array whose size is 1 in that dimension is stretched to match the other array's size.
- **Rule 3:** If in any dimension the sizes disagree and neither is equal to 1, an error is raised.

In [5]:
# Simple example: adding a scalar to an array
arr = np.array([1, 2, 3])
scalar_add = arr + 5 # The scalar 5 is "broadcast" to each element
print(f"Adding a scalar (broadcasting):\n{scalar_add}")

# More complex example: adding a 1D array to a 2D array
matrix = np.ones((3, 3)) # A 3x3 matrix of ones
vector = np.arange(3)    # A 1D array [0, 1, 2]

print(f"\nMatrix:\n{matrix}")
print(f"Vector: {vector}")

# Add the vector to each row of the matrix
# The vector's shape (3,) is broadcast across the matrix's shape (3, 3)
result = matrix + vector
print(f"\nMatrix + Vector (broadcasting):\n{result}")

Adding a scalar (broadcasting):
[6 7 8]

Matrix:
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
Vector: [0 1 2]

Matrix + Vector (broadcasting):
[[1. 2. 3.]
 [1. 2. 3.]
 [1. 2. 3.]]
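
The three rules can be checked without doing any arithmetic. The sketch below is an illustration with made-up names (column, row, table, error_message); it assumes NumPy 1.20 or newer, where np.broadcast_shapes is available, and ends with a deliberate Rule 3 failure.

```python
import numpy as np

# Rule 1 + Rule 2: (3,) is padded to (1, 3), then stretched to (3, 3).
print(np.broadcast_shapes((3, 3), (3,)))  # (3, 3)

# Rule 2 applied to both operands: each size-1 axis is stretched.
print(np.broadcast_shapes((3, 1), (1, 4)))  # (3, 4)

column = np.arange(3).reshape(3, 1)  # shape (3, 1)
row = np.arange(4)                   # shape (4,)
table = column + row                 # shape (3, 4); table[i, j] == i + j
print(table)

# Rule 3: trailing sizes 3 and 4 disagree and neither is 1, so this fails.
error_message = None
try:
    np.ones((3, 3)) + np.arange(4)
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```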


## **Exercises**

**1. Reshaping and Stacking:**
- Create two 1D NumPy arrays: arr_a = np.arange(10) and arr_b = np.arange(10, 20).
- Reshape both arrays into a (2, 5) shape.
- Stack them vertically (vstack) to create a single (4, 5) array.
- Stack them horizontally (hstack) to create a single (2, 10) array.
- Print all resulting arrays with descriptive labels.

In [8]:
arr_a = np.arange(10)
arr_b = np.arange(10, 20)

reshape_arr_a = arr_a.reshape(2,5)
print(f"Reshaped arr_a:\n{reshape_arr_a}")

reshape_arr_b = arr_b.reshape(2,5)
print(f"\nReshaped arr_b:\n{reshape_arr_b}")

vstacked = np.vstack((reshape_arr_a, reshape_arr_b))
print(f"\nVstacked array:\n{vstacked}")

hstacked = np.hstack((reshape_arr_a, reshape_arr_b))
print(f"\nHstacked array:\n{hstacked}")

Reshaped arr_a:
[[0 1 2 3 4]
 [5 6 7 8 9]]

Reshaped arr_b:
[[10 11 12 13 14]
 [15 16 17 18 19]]

Vstacked array:
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Hstacked array:
[[ 0  1  2  3  4 10 11 12 13 14]
 [ 5  6  7  8  9 15 16 17 18 19]]


**2. Axis-wise Operations:**
- Create a 4x3 array of random integers between 0 and 100.
- Calculate and print the mean of each column.
- Calculate and print the minimum value in each row.
- Find the index of the maximum value in the entire flattened array (Hint: use .argmax()).

In [22]:
# randint's upper bound is exclusive, so 101 allows values from 0 to 100
arr = np.random.randint(0, 101, size=(4, 3))

print(f"4x3 Array:\n{arr}")
print(f"\nMean of each column: {arr.mean(axis=0)}")
print(f"\nMinimum value of each row: {arr.min(axis=1)}")

flattened_arr = arr.flatten()
print(f"\nFlattened Array: {flattened_arr}")
print(f"\nIndex of the maximum value of the flattened array: {flattened_arr.argmax()}")

4x3 Array:
[[87 88 90]
 [85 84 65]
 [14 93 47]
 [26 16 51]]

Mean of each column: [53.   70.25 63.25]

Minimum value of each row: [87 65 14 16]

Flattened Array: [87 88 90 85 84 65 14 93 47 26 16 51]

Index of the maximum value of the flattened array: 7
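
A flat index such as 7 is often more useful as a (row, column) position. The sketch below is an optional extension of the exercise, not part of its stated solution. It hard-codes the array printed above so the result is reproducible, and uses np.unravel_index for the conversion; the name arr_fixed is made up for this example.

```python
import numpy as np

# The random array from the output above, fixed here for reproducibility.
arr_fixed = np.array([[87, 88, 90],
                      [85, 84, 65],
                      [14, 93, 47],
                      [26, 16, 51]])

# Without an axis argument, argmax already works on the flattened array.
flat_index = arr_fixed.argmax()
print(flat_index)  # 7

# Convert the flat index back to a (row, column) position.
row, col = np.unravel_index(flat_index, arr_fixed.shape)
print(row, col, arr_fixed[row, col])  # 2 1 93

# With an axis, argmax returns one index per column (or per row).
print(arr_fixed.argmax(axis=0))  # [0 2 0]
```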


**3. Broadcasting Challenge:**
- Create a 1D NumPy array representing 3 data points: data_points = np.array([10, 20, 30]).
- Create a 2D NumPy array representing 3 weights for each data point: weights = np.array([[0.2, 0.3, 0.5], [0.1, 0.8, 0.1], [0.4, 0.4, 0.2]]). The shape should be (3, 3).
- Your goal is to center the data_points by subtracting their mean from each point, and then multiply these centered points element-wise by the weights matrix so that each row of weights is scaled by its own data point.
- Step 1: Calculate the mean of data_points.
- Step 2: Use broadcasting to subtract the mean from data_points to create a new centered_points array.
- Step 3: This is tricky. The shapes are (3, 3) for weights and (3,) for centered_points. Multiplying them directly would broadcast centered_points along each row, scaling the columns of weights instead. To scale each row of weights by its own data point, reshape centered_points to (3, 1) so that it broadcasts across the columns. Use .reshape(3, 1).
- Step 4: Multiply the weights matrix by the reshaped centered_points array.
- Print the mean, centered_points, and the final result.

In [35]:
data_points = np.array([10, 20, 30])
weights = np.array([[0.2, 0.3, 0.5], [0.1, 0.8, 0.1], [0.4, 0.4, 0.2]])

mean_data_points = data_points.mean()
print(f"Mean of the data points: {mean_data_points}")

centered_points = data_points - mean_data_points
print(f"\nCentered points:\n{centered_points}")
reshaped_centered_points = centered_points.reshape(3,1)
print(f"\nReshaped centered points:\n{reshaped_centered_points}")
result = reshaped_centered_points * weights
print(f"\nResult after multiplying reshaped_centered_points with weights:\n{result}")

Mean of the data points: 20.0

Centered points:
[-10.   0.  10.]

Reshaped centered points:
[[-10.]
 [  0.]
 [ 10.]]

Result after multiplying reshaped_centered_points with weights:
[[-2. -3. -5.]
 [ 0.  0.  0.]
 [ 4.  4.  2.]]
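
To see why the reshape in Step 3 matters, the sketch below compares the (3, 1) version with the unreshaped one. It is an illustration only: np.newaxis is used as an alternative spelling of .reshape(3, 1), and the names centered, by_row, and by_col are made up for this example.

```python
import numpy as np

data_points = np.array([10, 20, 30])
weights = np.array([[0.2, 0.3, 0.5],
                    [0.1, 0.8, 0.1],
                    [0.4, 0.4, 0.2]])

centered = data_points - data_points.mean()  # [-10., 0., 10.]

# np.newaxis inserts a length-1 axis, giving shape (3, 1) like reshape(3, 1).
# Row i of weights is scaled by centered[i].
by_row = centered[:, np.newaxis] * weights
print(by_row)

# Without the reshape, (3,) is padded to (1, 3), so column j of weights
# is scaled by centered[j] instead. This runs without error but answers
# a different question.
by_col = centered * weights
print(by_col)
```

Both products have shape (3, 3), so the mistake is easy to miss; checking one row by hand is a cheap safeguard.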
