# 📦 Stage 1.5: Stacking & Splitting Arrays

## 🎯 Objective
Learn how to combine and split arrays along different axes.

---

## Table of Contents
1. Stacking Arrays (Combining)
2. Splitting Arrays (Breaking)
3. Practical Examples
4. Practice Exercises

In [2]:
import numpy as np

---

## 1. Stacking Arrays (Combining)

### 📚 Theory

**Stacking** means joining arrays together along an axis.

| Function | Description | Direction |
|----------|-------------|----------|
| `np.vstack()` | Vertical stack | Row-wise (↓) |
| `np.hstack()` | Horizontal stack | Column-wise (→) |
| `np.stack()` | Stack along new axis | Creates new dimension |
| `np.concatenate()` | General concatenation | Along specified axis |

### ✅ Daily Use:
- Merging datasets
- Combining features
- Batch processing
- Building data matrices

### Vertical Stack (vstack) - Stack Rows

In [3]:
# Stack arrays vertically (row-wise)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Array a:", a)
print("Array b:", b)
print()

# Vertical stack
vstacked = np.vstack((a, b))
print("Vertically stacked:\n", vstacked)
print(f"Shape: {vstacked.shape}")

# Visualize:
# [1, 2, 3]  ← array a
# [4, 5, 6]  ← array b

Array a: [1 2 3]
Array b: [4 5 6]

Vertically stacked:
 [[1 2 3]
 [4 5 6]]
Shape: (2, 3)


In [4]:
# Vertical stack with 2D arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

print("Array 1:\n", arr1)
print("\nArray 2:\n", arr2)
print()

vstacked_2d = np.vstack((arr1, arr2))
print("Vertically stacked 2D:\n", vstacked_2d)
print(f"Shape: {vstacked_2d.shape}")

Array 1:
 [[1 2]
 [3 4]]

Array 2:
 [[5 6]
 [7 8]]

Vertically stacked 2D:
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]
Shape: (4, 2)
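
`np.vstack` expects the inputs to agree on every axis except the first one. The sketch below uses made-up arrays (`wide`, `narrow`, `extra_row` are illustrative names, not part of the lesson) to show what happens when the column counts differ and when they match.

```python
import numpy as np

wide = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
narrow = np.array([[7, 8]])               # shape (1, 2): column count differs

try:
    np.vstack((wide, narrow))
    stacked_ok = True
except ValueError as err:
    # Column counts must match for a vertical stack
    stacked_ok = False
    print("vstack failed:", type(err).__name__)

# Row counts may differ as long as the column counts match
extra_row = np.array([[7, 8, 9]])         # shape (1, 3)
result = np.vstack((wide, extra_row))
print(result.shape)  # (3, 3)
```

Checking `.shape` on each input before stacking is usually the quickest way to diagnose this error.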


### Horizontal Stack (hstack) - Stack Columns

In [5]:
# Stack arrays horizontally (column-wise)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Array a:", a)
print("Array b:", b)
print()

# Horizontal stack
hstacked = np.hstack((a, b))
print("Horizontally stacked:", hstacked)
print(f"Shape: {hstacked.shape}")

# Result: [1, 2, 3, 4, 5, 6]

Array a: [1 2 3]
Array b: [4 5 6]

Horizontally stacked: [1 2 3 4 5 6]
Shape: (6,)
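
For 1-D inputs, `np.hstack` joins the arrays end to end rather than placing them side by side as columns. If columns are what you want, one option is `np.column_stack`; the comparison below is a small sketch using the same `a` and `b` as above.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

flat = np.hstack((a, b))         # 1-D inputs are joined end to end
cols = np.column_stack((a, b))   # each 1-D input becomes one column

print(flat.shape)  # (6,)
print(cols.shape)  # (3, 2)
print(cols)
```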


In [6]:
# Horizontal stack with 2D arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

print("Array 1:\n", arr1)
print("\nArray 2:\n", arr2)
print()

hstacked_2d = np.hstack((arr1, arr2))
print("Horizontally stacked 2D:\n", hstacked_2d)
print(f"Shape: {hstacked_2d.shape}")

# Visualize:
# [1, 2, 5, 6]
# [3, 4, 7, 8]

Array 1:
 [[1 2]
 [3 4]]

Array 2:
 [[5 6]
 [7 8]]

Horizontally stacked 2D:
 [[1 2 5 6]
 [3 4 7 8]]
Shape: (2, 4)


### Stack Along New Axis

In [7]:
# Stack arrays along a new axis
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Array a:", a, "- Shape:", a.shape)
print("Array b:", b, "- Shape:", b.shape)
print()

# Stack along axis 0 (creates new first dimension)
stacked_axis0 = np.stack((a, b), axis=0)
print("Stacked along axis 0:\n", stacked_axis0)
print(f"Shape: {stacked_axis0.shape}")
print()

# Stack along axis 1 (creates new second dimension)
stacked_axis1 = np.stack((a, b), axis=1)
print("Stacked along axis 1:\n", stacked_axis1)
print(f"Shape: {stacked_axis1.shape}")

Array a: [1 2 3] - Shape: (3,)
Array b: [4 5 6] - Shape: (3,)

Stacked along axis 0:
 [[1 2 3]
 [4 5 6]]
Shape: (2, 3)

Stacked along axis 1:
 [[1 4]
 [2 5]
 [3 6]]
Shape: (3, 2)
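
The difference between `np.stack` and `np.concatenate` is easiest to see in the number of dimensions of the result. The following sketch assumes two hypothetical 2x3 arrays (`img1`, `img2`) standing in for small images.

```python
import numpy as np

img1 = np.zeros((2, 3))
img2 = np.ones((2, 3))

batch = np.stack((img1, img2), axis=0)         # new leading axis
last = np.stack((img1, img2), axis=-1)         # new trailing axis
joined = np.concatenate((img1, img2), axis=0)  # existing axis grows, no new axis

print(batch.shape)   # (2, 2, 3)
print(last.shape)    # (2, 3, 2)
print(joined.shape)  # (4, 3)
```

`np.stack` requires all inputs to have exactly the same shape, because each input becomes one slice along the new axis.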


### Concatenate - General Purpose Stacking

In [8]:
# Concatenate gives more control
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

print("Array 1:\n", arr1)
print("\nArray 2:\n", arr2)
print()

# Concatenate along axis 0 (rows)
concat_axis0 = np.concatenate((arr1, arr2), axis=0)
print("Concatenate along axis 0 (rows):\n", concat_axis0)
print()

# Concatenate along axis 1 (columns)
concat_axis1 = np.concatenate((arr1, arr2), axis=1)
print("Concatenate along axis 1 (columns):\n", concat_axis1)

Array 1:
 [[1 2]
 [3 4]]

Array 2:
 [[5 6]
 [7 8]]

Concatenate along axis 0 (rows):
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]

Concatenate along axis 1 (columns):
 [[1 2 5 6]
 [3 4 7 8]]


---

## 2. Splitting Arrays (Breaking)

### 📚 Theory

**Splitting** breaks arrays into smaller chunks.

| Function | Description | Direction |
|----------|-------------|----------|
| `np.split()` | Split into N equal parts, or at given indices | Along axis |
| `np.vsplit()` | Vertical split | Row-wise (↓) |
| `np.hsplit()` | Horizontal split | Column-wise (→) |
| `np.array_split()` | Split unequally if needed | Along axis |

### ✅ Daily Use:
- Train-test split
- Batch processing
- Data partitioning
- Cross-validation

### Horizontal Split (hsplit) - Split Columns

In [9]:
# Horizontal split - split by columns
arr = np.array([[1, 2, 3, 4, 5, 6],
                [7, 8, 9, 10, 11, 12]])

print("Original array:\n", arr)
print(f"Shape: {arr.shape}")
print()

# Split into 3 equal parts
split_result = np.hsplit(arr, 3)

print("After horizontal split into 3 parts:")
for i, sub_arr in enumerate(split_result):
    print(f"\nPart {i+1}:\n", sub_arr)

Original array:
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
Shape: (2, 6)

After horizontal split into 3 parts:

Part 1:
 [[1 2]
 [7 8]]

Part 2:
 [[ 3  4]
 [ 9 10]]

Part 3:
 [[ 5  6]
 [11 12]]


In [10]:
# Split at specific column indices
arr = np.array([[1, 2, 3, 4, 5, 6],
                [7, 8, 9, 10, 11, 12]])

print("Original array:\n", arr)
print()

# Split at column indices 2 and 4
split_result = np.hsplit(arr, [2, 4])

print("Split at indices [2, 4]:")
print("Part 1 (cols 0-1):\n", split_result[0])
print("\nPart 2 (cols 2-3):\n", split_result[1])
print("\nPart 3 (cols 4-5):\n", split_result[2])

Original array:
 [[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]

Split at indices [2, 4]:
Part 1 (cols 0-1):
 [[1 2]
 [7 8]]

Part 2 (cols 2-3):
 [[ 3  4]
 [ 9 10]]

Part 3 (cols 4-5):
 [[ 5  6]
 [11 12]]
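
Each index passed to `np.hsplit` marks where a new piece starts, so `[2, 4]` corresponds to the slices `[:2]`, `[2:4]`, and `[4:]` along the column axis. The sketch below checks that equivalence on the same array and also shows what an out-of-range index produces.

```python
import numpy as np

arr = np.arange(1, 13).reshape(2, 6)

left, middle, right = np.hsplit(arr, [2, 4])

# The pieces match plain column slices
matches = (np.array_equal(left, arr[:, :2])
           and np.array_equal(middle, arr[:, 2:4])
           and np.array_equal(right, arr[:, 4:]))
print(matches)  # True

# An index past the last column yields an empty final piece
parts = np.hsplit(arr, [2, 10])
print([p.shape for p in parts])  # [(2, 2), (2, 4), (2, 0)]
```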


### Vertical Split (vsplit) - Split Rows

In [11]:
# Vertical split - split by rows
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9],
                [10, 11, 12]])

print("Original array:\n", arr)
print(f"Shape: {arr.shape}")
print()

# Split into 2 equal parts
split_result = np.vsplit(arr, 2)

print("After vertical split into 2 parts:")
print("\nPart 1:\n", split_result[0])
print("\nPart 2:\n", split_result[1])

Original array:
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
Shape: (4, 3)

After vertical split into 2 parts:

Part 1:
 [[1 2 3]
 [4 5 6]]

Part 2:
 [[ 7  8  9]
 [10 11 12]]
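
One detail worth knowing: in current NumPy versions the pieces returned by the split functions are views into the original array, while the stacking functions build a new array. The sketch below demonstrates this with `np.shares_memory`; use `.copy()` on a piece if you need it to be independent.

```python
import numpy as np

arr = np.arange(12).reshape(4, 3)
top, bottom = np.vsplit(arr, 2)

# The piece shares memory with the original array
is_view = np.shares_memory(top, arr)
print(is_view)  # True

# Writing to the piece therefore changes the original
top[0, 0] = 99
print(arr[0, 0])  # 99

# Stacking allocates a new array
restacked = np.vstack((top, bottom))
is_copy = not np.shares_memory(restacked, arr)
print(is_copy)  # True
```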


### General Split

In [12]:
# General split with axis parameter
arr = np.arange(12).reshape(3, 4)

print("Original array:\n", arr)
print()

# Split along axis 0 (rows)
split_axis0 = np.split(arr, 3, axis=0)
print("Split along axis 0 (rows):")
for i, sub in enumerate(split_axis0):
    print(f"Part {i+1}:\n", sub)
    print()

# Split along axis 1 (columns)
split_axis1 = np.split(arr, 2, axis=1)
print("Split along axis 1 (columns):")
for i, sub in enumerate(split_axis1):
    print(f"Part {i+1}:\n", sub)
    print()

Original array:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Split along axis 0 (rows):
Part 1:
 [[0 1 2 3]]

Part 2:
 [[4 5 6 7]]

Part 3:
 [[ 8  9 10 11]]

Split along axis 1 (columns):
Part 1:
 [[0 1]
 [4 5]
 [8 9]]

Part 2:
 [[ 2  3]
 [ 6  7]
 [10 11]]



### Array Split (for unequal divisions)

In [13]:
# array_split handles arrays that can't be divided evenly
arr = np.arange(10)

print("Original array:", arr)
print()

# Split 10 elements into 3 parts (10 is not evenly divisible by 3)
# np.split would raise a ValueError here; array_split allows unequal parts
split_result = np.array_split(arr, 3)

print("Split into 3 parts (unequal):")
for i, sub in enumerate(split_result):
    print(f"Part {i+1}: {sub}")

Original array: [0 1 2 3 4 5 6 7 8 9]

Split into 3 parts (unequal):
Part 1: [0 1 2 3]
Part 2: [4 5 6]
Part 3: [7 8 9]
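
The sizes above follow a simple rule: with `n_items` elements and `n_parts` parts, the first `n_items % n_parts` parts get one extra element. The sketch below checks that rule and confirms that `np.split` rejects the same request (the names `n_items`, `n_parts`, `base`, and `extra` are illustrative).

```python
import numpy as np

arr = np.arange(10)

# np.split insists on an equal division
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True
print("np.split failed:", split_failed)  # True

# array_split spreads the remainder over the leading parts
parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]

n_items, n_parts = len(arr), 3
base, extra = divmod(n_items, n_parts)
expected = [base + 1] * extra + [base] * (n_parts - extra)

print(sizes, expected)  # [4, 3, 3] [4, 3, 3]
```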


---

## 3. Practical Examples

### Example 1: Combining Features for ML

In [14]:
# Simulate different features
age = np.array([[25], [30], [35], [40]])
salary = np.array([[50000], [60000], [70000], [80000]])
experience = np.array([[2], [5], [8], [10]])

print("Age:\n", age)
print("\nSalary:\n", salary)
print("\nExperience:\n", experience)
print()

# Combine all features horizontally
features = np.hstack((age, salary, experience))
print("Combined features (Age, Salary, Experience):\n", features)
print(f"Shape: {features.shape}")

Age:
 [[25]
 [30]
 [35]
 [40]]

Salary:
 [[50000]
 [60000]
 [70000]
 [80000]]

Experience:
 [[ 2]
 [ 5]
 [ 8]
 [10]]

Combined features (Age, Salary, Experience):
 [[   25 50000     2]
 [   30 60000     5]
 [   35 70000     8]
 [   40 80000    10]]
Shape: (4, 3)
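
The reverse step is also common: pulling one column back out of a combined matrix. As a sketch, the code below treats the last column as a target purely for illustration (the names `X`, `y`, and `y_flat` are assumptions, not part of the example above).

```python
import numpy as np

features = np.array([[25, 50000, 2],
                     [30, 60000, 5],
                     [35, 70000, 8],
                     [40, 80000, 10]])

# Split before column index 2: first two columns vs. the last one
X, y = np.hsplit(features, [2])
y_flat = y.ravel()  # flatten the (4, 1) column into shape (4,)

print(X.shape)  # (4, 2)
print(y.shape)  # (4, 1)
print(y_flat)   # [ 2  5  8 10]
```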


### Example 2: Train-Test Split

In [15]:
# Create sample dataset
data = np.arange(100).reshape(10, 10)
print("Full dataset (10 samples):\n", data)
print()

# Split into train (80%) and test (20%)
train_data, test_data = np.vsplit(data, [8])  # Split at index 8

print("Training data (8 samples):\n", train_data)
print(f"Shape: {train_data.shape}")
print()

print("Test data (2 samples):\n", test_data)
print(f"Shape: {test_data.shape}")

Full dataset (10 samples):
 [[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]

Training data (8 samples):
 [[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]]
Shape: (8, 10)

Test data (2 samples):
 [[80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]
Shape: (2, 10)
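
The split above keeps the rows in their original order, so the test set is always the last two rows. If the rows should be shuffled first, one possible approach is to index with a random permutation before splitting. This is a minimal sketch; the seed value is arbitrary and only makes the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed for repeatability
data = np.arange(100).reshape(10, 10)

# Reorder the rows randomly, then split 80/20
shuffled = data[rng.permutation(len(data))]
train_data, test_data = np.vsplit(shuffled, [8])

print(train_data.shape)  # (8, 10)
print(test_data.shape)   # (2, 10)
```

Every original row still appears exactly once across the two pieces; only the order has changed.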


### Example 3: Batch Processing

In [16]:
# Large dataset to process in batches
data = np.arange(1000)
print(f"Total data points: {len(data)}")
print()

# Split into 10 batches (100 items each)
batches = np.array_split(data, 10)

print(f"Number of batches: {len(batches)}")
print(f"Batch size: {len(batches[0])}")
print()

# Process each batch
for i, batch in enumerate(batches):
    print(f"Batch {i+1}: {len(batch)} items, Range: {batch[0]}-{batch[-1]}")

Total data points: 1000

Number of batches: 10
Batch size: 100

Batch 1: 100 items, Range: 0-99
Batch 2: 100 items, Range: 100-199
Batch 3: 100 items, Range: 200-299
Batch 4: 100 items, Range: 300-399
Batch 5: 100 items, Range: 400-499
Batch 6: 100 items, Range: 500-599
Batch 7: 100 items, Range: 600-699
Batch 8: 100 items, Range: 700-799
Batch 9: 100 items, Range: 800-899
Batch 10: 100 items, Range: 900-999
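
`np.array_split(data, 10)` fixes the number of batches. When the batch size is what matters and the data length is not a multiple of it, one option is to pass explicit cut points so that only the last batch is shorter. The sketch below assumes a hypothetical 250-item dataset and a batch size of 100.

```python
import numpy as np

data = np.arange(250)
batch_size = 100

# Cut points at every multiple of batch_size inside the data
cut_points = np.arange(batch_size, len(data), batch_size)
batches = np.split(data, cut_points)

sizes = [len(b) for b in batches]
print(cut_points)  # [100 200]
print(sizes)       # [100, 100, 50]
```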


---

## 4. 🧪 Practice Exercises

### Exercise 1: Stack arrays vertically

In [17]:
# Create two 1D arrays and stack them vertically
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Your code here
result = np.vstack((arr1, arr2))
print("Vertically stacked:\n", result)

Vertically stacked:
 [[1 2 3]
 [4 5 6]]


### Exercise 2: Split a 10x2 array into 2 parts

In [18]:
# Create a 10x2 array
arr = np.arange(20).reshape(10, 2)
print("Original array:\n", arr)
print()

# Split into 2 equal parts
part1, part2 = np.vsplit(arr, 2)

print("Part 1:\n", part1)
print("\nPart 2:\n", part2)

Original array:
 [[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]
 [12 13]
 [14 15]
 [16 17]
 [18 19]]

Part 1:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]

Part 2:
 [[10 11]
 [12 13]
 [14 15]
 [16 17]
 [18 19]]


### Exercise 3: Count occurrences

In [19]:
# Create an array with repeated values
arr = np.array([1, 2, 3, 3, 3, 4, 5, 3, 6, 3])
print("Array:", arr)

# Count how many times 3 appears
count = np.sum(arr == 3)
print(f"\nNumber 3 appears {count} times")

# Alternative method
count_alt = len(arr[arr == 3])
print(f"Count (alternative method): {count_alt}")

Array: [1 2 3 3 3 4 5 3 6 3]

Number 3 appears 5 times
Count (alternative method): 5


### Exercise 4: Combine and split
1. Create 3 arrays of shape (3, 2)
2. Stack them horizontally
3. Split the result back into 3 parts

In [20]:
# Step 1: Create 3 arrays
arr1 = np.arange(6).reshape(3, 2)
arr2 = np.arange(6, 12).reshape(3, 2)
arr3 = np.arange(12, 18).reshape(3, 2)

print("Array 1:\n", arr1)
print("\nArray 2:\n", arr2)
print("\nArray 3:\n", arr3)
print()

# Step 2: Stack horizontally
combined = np.hstack((arr1, arr2, arr3))
print("Combined (horizontal stack):\n", combined)
print()

# Step 3: Split back into 3 parts
split1, split2, split3 = np.hsplit(combined, 3)
print("After splitting back:")
print("Part 1:\n", split1)
print("\nPart 2:\n", split2)
print("\nPart 3:\n", split3)

Array 1:
 [[0 1]
 [2 3]
 [4 5]]

Array 2:
 [[ 6  7]
 [ 8  9]
 [10 11]]

Array 3:
 [[12 13]
 [14 15]
 [16 17]]

Combined (horizontal stack):
 [[ 0  1  6  7 12 13]
 [ 2  3  8  9 14 15]
 [ 4  5 10 11 16 17]]

After splitting back:
Part 1:
 [[0 1]
 [2 3]
 [4 5]]

Part 2:
 [[ 6  7]
 [ 8  9]
 [10 11]]

Part 3:
 [[12 13]
 [14 15]
 [16 17]]
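
Rather than comparing the printed output by eye, the round trip can be verified programmatically. The check below is a small sketch using `np.array_equal` on the same three arrays.

```python
import numpy as np

arr1 = np.arange(6).reshape(3, 2)
arr2 = np.arange(6, 12).reshape(3, 2)
arr3 = np.arange(12, 18).reshape(3, 2)

combined = np.hstack((arr1, arr2, arr3))
parts = np.hsplit(combined, 3)

# Each split piece should equal the array it came from
round_trip_ok = all(np.array_equal(original, piece)
                    for original, piece in zip((arr1, arr2, arr3), parts))
print(round_trip_ok)  # True
```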


---

## 📝 Key Takeaways

### Stacking (Combining):
1. ✅ `np.vstack()` - Stack rows (vertical)
2. ✅ `np.hstack()` - Stack columns (horizontal)
3. ✅ `np.stack()` - Stack along new axis
4. ✅ `np.concatenate()` - General purpose with axis control

### Splitting (Breaking):
1. ✅ `np.vsplit()` - Split rows (vertical)
2. ✅ `np.hsplit()` - Split columns (horizontal)
3. ✅ `np.split()` - General split with axis
4. ✅ `np.array_split()` - Handles uneven divisions

### Real-World Uses:
- ✅ Merging datasets
- ✅ Train-test splitting
- ✅ Batch processing
- ✅ Feature engineering

