# 🧱 Section 4: Stacking, Splitting, and Combining Arrays

In the previous section, we explored **vectorized computation** and **broadcasting**, powerful tools for fast, element-wise math.  
Now we'll look at how to **organize and restructure data** in NumPy: stacking, splitting, and concatenating arrays.

These operations are essential for building datasets, merging results, or preparing data for machine learning models.

## 🎯 Learning Objectives
After this section, you'll be able to:
- Concatenate and stack arrays vertically, horizontally, and along custom axes.
- Split arrays into equal or specified parts.
- Use helper functions like `np.vstack`, `np.hstack`, and `np.concatenate`.
- Handle dimension mismatches gracefully.
- Apply these techniques in realistic data manipulation tasks.

In [None]:
import numpy as np

# Example: two small 2D arrays representing weekly sales data
week1 = np.array([[120, 135, 150], [100, 110, 120]])  # store A, B
week2 = np.array([[155, 160, 165], [130, 140, 150]])

print("Week 1:\n", week1)
print("Week 2:\n", week2)

## 🔗 Concatenation with `np.concatenate`

The most general method for joining arrays is **`np.concatenate`**.  
It takes a sequence of arrays and joins them along an existing axis.

Syntax:
```python
np.concatenate((a1, a2, ...), axis=0)
```
- `axis=0`: stack **vertically** (rows increase)
- `axis=1`: stack **horizontally** (columns increase)

The arrays must have the same shape except along the concatenation axis.
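
To make the shape rule concrete, here is a small sketch. The arrays `a`, `b`, and `c` are made up for illustration and are not part of the sales data above.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((4, 3))   # same number of columns as a, different row count
c = np.ones((2, 5))   # same number of rows as a, different column count

# Joining along axis 0 only requires the column counts to match
rows_joined = np.concatenate((a, b), axis=0)
print(rows_joined.shape)  # (6, 3)

# Joining along axis 1 only requires the row counts to match
cols_joined = np.concatenate((a, c), axis=1)
print(cols_joined.shape)  # (2, 8)

# Mismatched shapes on a non-joining axis raise a ValueError
try:
    np.concatenate((a, c), axis=0)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```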

In [None]:
# Vertical concatenation (row-wise)
vertical = np.concatenate((week1, week2), axis=0)
print("Vertical concat (axis=0):\n", vertical)

# Horizontal concatenation (column-wise)
horizontal = np.concatenate((week1, week2), axis=1)
print("\nHorizontal concat (axis=1):\n", horizontal)

## 🧩 Convenience Wrappers: `vstack`, `hstack`, and `dstack`

NumPy provides convenience functions that wrap `concatenate` with predefined axes (shown here for 2D inputs):
- `np.vstack` → vertical (axis=0)
- `np.hstack` → horizontal (axis=1; 1D inputs are joined end to end along axis 0 instead)
- `np.dstack` → depth (axis=2; 2D inputs gain a third dimension first)

These make your intent clearer when reading code.

In [None]:
v = np.vstack((week1, week2))
h = np.hstack((week1, week2))
d = np.dstack((week1, week2))

print("vstack shape:", v.shape)
print("hstack shape:", h.shape)
print("dstack shape:", d.shape)

### 🔍 Quick Comparison

| Function | Axis | Resulting Shape | Example Use |
|-----------|------|----------------|--------------|
| `concatenate` | Custom | Flexible | General-purpose joining |
| `vstack` | 0 | Rows added | Combine same-width data vertically |
| `hstack` | 1 | Columns added | Combine same-height data horizontally |
| `dstack` | 2 | New depth layer | Combine multiple 2D arrays as layers |

The three wrappers make your intent clear without manually specifying `axis`; `concatenate` remains the general-purpose option when you need a specific axis.
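
As a rough check of how the wrappers relate to `np.concatenate`, the sketch below reuses the `week1` and `week2` arrays from earlier in this section. The 1D arrays `x` and `y` are made up to show that the wrappers behave differently on 1D inputs.

```python
import numpy as np

week1 = np.array([[120, 135, 150], [100, 110, 120]])
week2 = np.array([[155, 160, 165], [130, 140, 150]])

# For 2D inputs, vstack and hstack match concatenate along axis 0 and 1
same_v = np.array_equal(np.vstack((week1, week2)),
                        np.concatenate((week1, week2), axis=0))
same_h = np.array_equal(np.hstack((week1, week2)),
                        np.concatenate((week1, week2), axis=1))

# dstack gives each 2D input a trailing axis, then joins along axis 2
same_d = np.array_equal(
    np.dstack((week1, week2)),
    np.concatenate((week1[:, :, np.newaxis], week2[:, :, np.newaxis]), axis=2),
)
print(same_v, same_h, same_d)  # True True True

# With 1D inputs, hstack joins end to end and vstack makes one row per input
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(np.hstack((x, y)).shape)  # (6,)
print(np.vstack((x, y)).shape)  # (2, 3)
```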

## 🧱 Adding New Dimensions with `np.newaxis` or `np.expand_dims`

Sometimes you want to **add a new dimension** before stacking, for example when you have multiple 1D arrays but need to create a 2D or 3D dataset.

You can do this using either:
- `array[:, np.newaxis]` (or `array[np.newaxis, :]`)
- `np.expand_dims(array, axis)`
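
As a quick sketch of how the two spellings relate (the array `sales` is made up for this illustration), both produce the same shapes and values:

```python
import numpy as np

sales = np.array([100, 110, 120])

col_a = sales[:, np.newaxis]           # shape (3, 1)
col_b = np.expand_dims(sales, axis=1)  # shape (3, 1)
row_a = sales[np.newaxis, :]           # shape (1, 3)
row_b = np.expand_dims(sales, axis=0)  # shape (1, 3)

print(col_a.shape, row_a.shape)
print(np.array_equal(col_a, col_b), np.array_equal(row_a, row_b))  # True True
```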

In [None]:
sales_a = np.array([100, 110, 120])
sales_b = np.array([130, 140, 150])

# Add new axis to make them 2D columns
A_col = sales_a[:, np.newaxis]
B_col = sales_b[:, np.newaxis]

print("A_col shape:", A_col.shape)

# Stack side-by-side
combined = np.hstack((A_col, B_col))
print("Combined array:\n", combined)

## ✂️ Splitting Arrays

The opposite of stacking is **splitting**: dividing arrays into multiple subarrays.

The main functions are:
- `np.split(array, indices_or_sections, axis=0)`
- `np.vsplit` → vertical split (by rows)
- `np.hsplit` → horizontal split (by columns)

Each returns a list of subarrays, which are views into the original array rather than copies.
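
The following sketch, using a made-up `grid` array, shows how the `axis` argument of `np.split` lines up with `np.vsplit` and `np.hsplit`:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# vsplit and hsplit correspond to split along axis 0 and axis 1
rows = np.split(grid, 3, axis=0)   # three blocks of shape (1, 4)
cols = np.split(grid, 2, axis=1)   # two blocks of shape (3, 2)

print(type(rows).__name__, len(rows), rows[0].shape)
print(len(cols), cols[0].shape)

matches_vsplit = all(np.array_equal(p, q) for p, q in zip(rows, np.vsplit(grid, 3)))
matches_hsplit = all(np.array_equal(p, q) for p, q in zip(cols, np.hsplit(grid, 2)))
print(matches_vsplit, matches_hsplit)
```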

In [None]:
# Example: splitting vertically (by rows)
data = np.arange(16).reshape(4, 4)
print("Original data:\n", data)

top, bottom = np.vsplit(data, 2)
print("Top half:\n", top)
print("Bottom half:\n", bottom)

# Horizontal split (by columns)
left, right = np.hsplit(data, 2)
print("Left half:\n", left)
print("Right half:\n", right)

### 📏 Custom Split Points
You can pass a list of indices to specify where to cut, e.g. `np.split(data, [1, 3], axis=1)`.
Because `axis=1`, this returns the column slices `data[:, :1]`, `data[:, 1:3]`, and `data[:, 3:]`.
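
One way to convince yourself of the cut points is to compare each piece against a plain slice and then join the pieces back together. This sketch rebuilds the same 4×4 `data` array so it runs on its own:

```python
import numpy as np

data = np.arange(16).reshape(4, 4)
a, b, c = np.split(data, [1, 3], axis=1)

# Each piece matches a plain column slice of the original
print(np.array_equal(a, data[:, :1]))   # True
print(np.array_equal(b, data[:, 1:3]))  # True
print(np.array_equal(c, data[:, 3:]))   # True

# Joining the pieces along the same axis rebuilds the original
rebuilt = np.concatenate((a, b, c), axis=1)
print(np.array_equal(rebuilt, data))    # True
```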

In [None]:
a, b, c = np.split(data, [1, 3], axis=1)
print("a:\n", a)
print("b:\n", b)
print("c:\n", c)

## ⚙️ Practical Example: Combining Sales Reports

Let's simulate combining and splitting real data: imagine monthly revenue from 3 stores, with each month's data stored separately.

We'll:
1. Stack them vertically to create a master dataset.
2. Compute summaries.
3. Split the array back into monthly chunks.

In [None]:
jan = np.array([[100, 110, 120], [90, 95, 100], [130, 125, 140]])
feb = np.array([[105, 115, 125], [92, 98, 104], [135, 130, 145]])
mar = np.array([[110, 118, 128], [96, 101, 108], [140, 136, 150]])

# 1️⃣ Combine vertically (9 rows: 3 stores for each of 3 months)
quarter = np.vstack((jan, feb, mar))
print("Quarterly Data (shape):", quarter.shape)
print(quarter)

# 2️⃣ Compute the mean of each row (one row per store per month)
mean_revenue = quarter.mean(axis=1)
print("\nMean revenue per store-month row:", mean_revenue)

# 3️⃣ Split back into months (3 rows per month)
jan_out, feb_out, mar_out = np.vsplit(quarter, 3)
print("\nFebruary block:\n", feb_out)
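
Note that `quarter.mean(axis=1)` yields nine values, one per row of the stacked array. To get a single figure per store across all three months, one option is `np.stack`, which joins arrays along a *new* axis instead of an existing one. The sketch below assumes that rows are stores, so the resulting layout is (month, store, column).

```python
import numpy as np

jan = np.array([[100, 110, 120], [90, 95, 100], [130, 125, 140]])
feb = np.array([[105, 115, 125], [92, 98, 104], [135, 130, 145]])
mar = np.array([[110, 118, 128], [96, 101, 108], [140, 136, 150]])

# Stack along a new leading axis: shape (month, store, column) = (3, 3, 3)
cube = np.stack((jan, feb, mar), axis=0)
print(cube.shape)

# Average over months (axis 0) and columns (axis 2): one value per store
per_store = cube.mean(axis=(0, 2))
print(per_store)

# Same result from the vstack layout by reshaping (9, 3) -> (3, 3, 3)
quarter = np.vstack((jan, feb, mar))
same = np.allclose(per_store, quarter.reshape(3, 3, 3).mean(axis=(0, 2)))
print(same)
```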

## 🧮 Combining 1D Arrays into 2D Data

If you have multiple 1D arrays (like time series), use `np.column_stack` or `np.vstack` for clean alignment.

- `np.column_stack` → each input becomes a column
- `np.vstack` → each input becomes a row (`np.row_stack` is an older alias for it, deprecated in NumPy 2.0)
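
A small sketch of the two layouts, using made-up three-day values. The row case uses `np.vstack`, of which `np.row_stack` is an alias:

```python
import numpy as np

store_a = np.array([100, 110, 105])
store_b = np.array([90, 95, 100])

as_columns = np.column_stack((store_a, store_b))  # shape (3, 2)
as_rows = np.vstack((store_a, store_b))           # shape (2, 3)

print(as_columns.shape, as_rows.shape)
# For 1D inputs, the column layout is the transpose of the row layout
print(np.array_equal(as_columns, as_rows.T))  # True
```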

In [None]:
days = np.arange(1, 8)
store_a = np.array([100, 110, 105, 115, 120, 130, 128])
store_b = np.array([90, 95, 100, 105, 110, 115, 120])

sales_data = np.column_stack((days, store_a, store_b))
print(sales_data)

## ⚠️ Common Pitfalls

🚫 **Shape mismatches**: Arrays must align in all dimensions except the one you're joining.

✅ **Fix**: Use `reshape()` or `np.newaxis` to make dimensions compatible.
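
For instance, appending a 1D array to a 2D table fails until the 1D array is given an explicit row or column orientation. The names `table`, `new_row`, and `new_col` below are illustrative:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])      # shape (2, 3)
new_row = np.array([7, 8, 9])      # shape (3,)
new_col = np.array([10, 20])       # shape (2,)

# A 1D array has no row/column orientation for concatenate, so give it one
with_row = np.concatenate((table, new_row[np.newaxis, :]), axis=0)   # (3, 3)
with_col = np.concatenate((table, new_col.reshape(-1, 1)), axis=1)   # (2, 4)

print(with_row)
print(with_col)
```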

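🚫 **Copy vs View Confusion**: `concatenate` returns a new array, so modifying the result doesn't affect the originals.

✅ **Fix**: If memory is a concern, avoid joining repeatedly inside a loop: collect the pieces and concatenate once, or preallocate the output and fill it by slice assignment. Note that `np.r_` and `np.c_` also build new arrays rather than views.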
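
The copy behaviour can be checked directly with `np.shares_memory`. In this sketch (arrays made up for illustration), the joined array is independent of its inputs, while the pieces returned by `np.split` are views of their source:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

joined = np.concatenate((a, b))
joined[0] = 99
print(a[0])                          # 1: the original is untouched
print(np.shares_memory(joined, a))   # False

# By contrast, the pieces returned by np.split are views of their source
first, second = np.split(joined, 2)
first[1] = -1
print(joined[1])                        # -1: the write went through the view
print(np.shares_memory(first, joined))  # True
```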

🚫 **Unequal Splits**: `np.split` with an integer section count requires the axis length to divide evenly; otherwise it raises a `ValueError`.

✅ **Fix**: Use `np.array_split` if the sections aren't equal.
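
A minimal sketch of the difference, using a made-up 10-element array that cannot be divided into three equal parts:

```python
import numpy as np

values = np.arange(10)

try:
    np.split(values, 3)          # 10 is not divisible by 3
    split_failed = False
except ValueError as err:
    split_failed = True
    print("np.split failed:", err)

# array_split allows uneven sections; the earlier pieces take the extras
parts = np.array_split(values, 3)
print([len(p) for p in parts])   # [4, 3, 3]
```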

## 🧠 Review & Recap

**Key functions summary:**

| Purpose | Function | Description |
|----------|-----------|--------------|
| Join arrays | `np.concatenate`, `vstack`, `hstack`, `dstack` | Merge arrays along chosen axes |
| Add dimensions | `np.newaxis`, `expand_dims` | Create singleton dimensions for alignment |
| Split arrays | `split`, `vsplit`, `hsplit`, `array_split` | Divide arrays into sub-blocks |

**Key takeaways:**
- Stacking = combining arrays together.
- Splitting = breaking arrays apart.
- Always check array shapes before combining.
- `np.newaxis` is your friend for matching dimensions.

## 🧩 Challenge Exercise: "Regional Sales Merge"

**Scenario:**  
You have 3 regions' weekly sales stored separately as 2D arrays (rows = stores, columns = days):

```python
north = np.array([[12, 15, 14], [10, 11, 13]])
south = np.array([[8, 9, 10], [11, 12, 13]])
west  = np.array([[14, 16, 15], [13, 14, 15]])
```

**Tasks:**
1. Stack all regions vertically into one combined dataset.
2. Compute total daily sales across all regions (hint: `axis=0`).
3. Split the result into individual region blocks again.
4. Add one new column for weekly averages (use `hstack`).
5. Try the same using Python lists and compare speed.

💡 *Bonus:* Plot the daily totals using Matplotlib.

✅ **Next Up:**  
In **Section 5**, we'll dive into **Indexing, Slicing, and Masking**, where you'll learn how to access, modify, and filter data efficiently using NumPy's indexing power.

# --- End of Section 4. Continue to Section 5 ---