# 🎯 Section 5: Indexing, Slicing, and Masking in NumPy

In this section, we dive into **how to access, modify, and filter** NumPy arrays efficiently.

Indexing and slicing are fundamental operations in data manipulation. They allow you to:
- Extract subsets of arrays efficiently (no loops needed)
- Filter data conditionally using **boolean masks**
- Modify elements in place
- Combine multiple selection techniques for powerful workflows

Understanding this section deeply is key to becoming *fluent* in NumPy and working effectively with large datasets.

In [None]:
import numpy as np

# Sample 2D dataset: weekly sales (rows = stores, cols = products)
sales = np.array([
    [120, 135, 150],
    [100, 110, 120],
    [155, 160, 165],
    [130, 140, 150]
])

print("Sales Data:\n", sales)
print("Shape:", sales.shape)

## 🎯 1. Basic Indexing

Indexing in NumPy works much like indexing Python lists, but extends naturally to multiple dimensions.

- 1D arrays use integer indices (e.g. `arr[0]`)
- 2D arrays use a comma-separated pair `(row, col)`, e.g. `arr[1, 2]`

⚡ Important: NumPy uses **zero-based indexing**, and the `stop` bound of a slice is **exclusive** (as in Python).
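
As a quick illustration of zero-based indices and the exclusive `stop` bound, here is a minimal sketch. The arrays `arr` and `grid` are hypothetical demo data, not part of the sales dataset.

```python
import numpy as np

# Hypothetical 1D demo array
arr = np.array([10, 20, 30, 40, 50])

first = arr[0]       # zero-based: index 0 is the first element
last = arr[-1]       # negative indices count from the end
middle = arr[1:4]    # stop is exclusive: indices 1, 2, 3

print(first, last, middle)

# 2D: a (row, col) pair selects a single element
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
corner = grid[1, 2]  # row index 1, column index 2
print(corner)
```

Note that `arr[1:4]` returns three elements, not four, because index 4 itself is excluded.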

In [None]:
# Access a single element: row index 1, column index 0 (second row, first column)
print("Element at row index 1, col index 0:", sales[1, 0])

# Modify an element
sales[1, 0] = 105
print("Updated Sales Data:\n", sales)

# Access an entire row
print("Row 0:", sales[0])

# Access an entire column
print("Column 1:", sales[:, 1])

## 🧩 2. Slicing Syntax Refresher

The slicing syntax is:
```python
array[start:stop:step]
```
- `start` (inclusive)
- `stop` (exclusive)
- `step` (default 1)

Basic slicing in NumPy returns a **view**, not a copy, so modifying the slice also modifies the original array unless you copy it explicitly with `.copy()`.
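
The `start:stop:step` form can be applied per axis. The sketch below uses a hypothetical 3×4 array named `grid` to show a step, a reversed axis, and the exclusive `stop`, and uses `np.shares_memory` to confirm that a basic slice is a view.

```python
import numpy as np

# Hypothetical demo array: values 0..11 arranged in 3 rows, 4 columns
grid = np.arange(12).reshape(3, 4)

every_other_row = grid[::2]      # step of 2 along rows: rows 0 and 2
reversed_cols = grid[:, ::-1]    # negative step reverses the column order
inner_cols = grid[:, 1:3]        # stop=3 is exclusive: columns 1 and 2

print(every_other_row)
print(reversed_cols)
print(inner_cols)

# Basic slices are views, so they share memory with the source array
is_view = np.shares_memory(grid, every_other_row)
print("Shares memory:", is_view)
```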

In [None]:
# Slice a 2x2 block (rows 0-1, cols 0-1)
subset = sales[0:2, 0:2]
print("Subset (view):\n", subset)

# Modify a slice → original changes too!
subset[0, 0] = 999
print("\nModified subset:\n", subset)
print("Original sales updated:\n", sales)

# To avoid modifying the original array, use .copy()
safe_copy = sales[0:2, 0:2].copy()
safe_copy[0, 0] = 0
print("\nSafe copy doesn't affect original:\n", sales)

## ⚙️ 3. Fancy Indexing

Fancy indexing uses **integer arrays** or lists to extract arbitrary elements or rows.

Unlike slices, fancy indexing **always returns a copy**, not a view.
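
The copy behavior can be checked directly. This is a minimal sketch on a hypothetical array `data`. It also contrasts paired index lists (element-by-element selection) with `np.ix_`, which selects the grid formed by the given rows and columns, and shows that *assigning* through a fancy index still writes to the original.

```python
import numpy as np

# Hypothetical demo array: values 10..21 in 3 rows, 4 columns
data = np.arange(12).reshape(3, 4) + 10

picked = data[[0, 2]]        # rows 0 and 2, returned as a copy
picked[0, 0] = -1            # changes only the copy
print(data[0, 0])            # original element is unchanged
shares = np.shares_memory(data, picked)
print("Shares memory:", shares)

# Paired index lists pick individual elements: (0, 1) and (2, 3)
pairs = data[[0, 2], [1, 3]]

# np.ix_ crosses rows {0, 2} with columns {1, 3} to select a 2x2 block
block = data[np.ix_([0, 2], [1, 3])]
print(pairs)
print(block)

# Assignment through a fancy index does modify the original array
data[[0, 2], [1, 3]] = 0
print(data)
```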

In [None]:
# Select specific rows (e.g., 0 and 2)
selected_rows = sales[[0, 2]]
print("Rows 0 and 2:\n", selected_rows)

# Select specific elements
elements = sales[[0, 1, 2], [1, 1, 2]]  # (0,1), (1,1), (2,2)
print("Selected elements:", elements)

## 🎯 4. Boolean Masking (Conditional Selection)

Masking is one of NumPy’s most powerful tools: you can filter data **based on conditions**.

A mask is a boolean array (`True`/`False`), usually of the same shape as your data. Indexing with it extracts the elements where the mask is `True` (as a flattened 1D array), and assigning through it modifies only those elements.
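
To make the shape of masks and their results concrete, here is a small sketch on a hypothetical array `scores`. It also shows one variation worth knowing: a 1D mask with one entry per row selects whole rows.

```python
import numpy as np

# Hypothetical demo data
scores = np.array([[5, 12, 7],
                   [20, 3, 15]])

mask = scores > 10                 # boolean array with the same shape as scores
print(mask.dtype, mask.shape)

selected = scores[mask]            # result is 1D, in row-major order
print(selected)

# A 1D mask over the first axis keeps entire rows
row_mask = scores[:, 0] > 10       # one True/False per row
kept_rows = scores[row_mask]
print(kept_rows)
```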

In [None]:
# Create a boolean mask for high sales (>130)
mask = sales > 130
print("Mask:\n", mask)

# Apply the mask
high_sales = sales[mask]
print("High sales (>130):", high_sales)

# Modify all high sales in place
sales[mask] = 999
print("Updated sales after masking:\n", sales)

## 🧮 5. Combining Conditions (Logical Operations)

You can combine multiple boolean expressions using:
- `&` (AND)
- `|` (OR)
- `~` (NOT)

⚠️ Don’t use `and`/`or`: they do not operate elementwise, and applying them to arrays with more than one element raises a `ValueError`.
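
The sketch below, on a hypothetical array `values`, shows `|` and `~` in use and what happens when `and` is applied to arrays. The flag `and_failed` exists only to record the outcome.

```python
import numpy as np

# Hypothetical demo data
values = np.array([5, 12, 18, 25])

either = (values < 10) | (values > 20)   # OR: below 10 or above 20
not_small = ~(values < 10)               # NOT: everything that is not below 10
print(either)
print(not_small)

# `and` asks for a single truth value of the whole array, which is ambiguous
and_failed = False
try:
    (values > 10) and (values < 20)
except ValueError as exc:
    and_failed = True
    print("ValueError:", exc)
```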

In [None]:
sales = np.array([
    [120, 135, 150],
    [100, 110, 120],
    [155, 160, 165],
    [130, 140, 150]
])

mask_combo = (sales > 120) & (sales < 160)
print("Combined mask:\n", mask_combo)
print("Filtered values:\n", sales[mask_combo])

## 🧠 6. Under the Hood: Views vs Copies

NumPy often tries to return **views** (shared memory) for performance reasons, meaning modifications affect the original array.

| Operation | Returns View? | Notes |
|------------|----------------|-------|
| Basic slicing (`arr[1:3]`) | ✅ Yes | Shares memory |
| Boolean / Fancy indexing | ❌ No | Always copies data |
| `reshape` (compatible shapes) | ✅ Often | Depends on memory layout |

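Use `arr.base is None` to check whether an array owns its data; a view’s `base` refers to the array whose memory it shares. Because `base` is only a hint, `np.shares_memory(a, b)` is the more direct test of whether two arrays overlap in memory.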
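
As a complementary check, `np.shares_memory` can be applied to each row of the table. This is a minimal sketch; the names `source`, `sliced`, `picked`, and `reshaped` are illustrative only.

```python
import numpy as np

source = np.arange(6)              # freshly created, owns its data
sliced = source[1:4]               # basic slice
picked = source[[1, 2, 3]]         # fancy indexing
reshaped = source.reshape(2, 3)    # contiguous input, so reshape can return a view

owns_data = source.base is None
print("source owns its data:", owns_data)

slice_shares = np.shares_memory(source, sliced)
fancy_shares = np.shares_memory(source, picked)
reshape_shares = np.shares_memory(source, reshaped)
print("slice shares memory:", slice_shares)
print("fancy result shares memory:", fancy_shares)
print("reshape shares memory:", reshape_shares)
```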

In [None]:
arr = np.arange(10)
view = arr[2:5]             # basic slice: view into arr
copied = arr[[2, 3, 4]]     # fancy indexing: independent copy

print("View shares memory?", view.base is arr)
print("Copy shares memory?", copied.base is arr)

## ⚠️ Best Practices and Pitfalls

✅ **Best Practices:**
- Use slicing whenever possible for performance (views).
- Always use `copy()` when you need an independent array.
- Leverage boolean masking for expressive, readable filtering.
- Combine masks with `&`, `|`, `~` instead of nested loops.

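🚫 **Pitfalls:**
- Forgetting parentheses in masks, e.g. `a > 3 & a < 5` → **wrong precedence!** `&` binds more tightly than the comparisons, so this is evaluated as `a > (3 & a) < 5`.
  Use `(a > 3) & (a < 5)`.
- Boolean and fancy indexing copy the selected data, which can use **lots of memory** on large datasets.
- Avoid recomputing the same mask, as in `arr[arr > 10] = arr[arr > 10] * 2`. Create the mask once (`mask = arr > 10`) and reuse it: `arr[mask] *= 2`.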
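
The first and last pitfalls can be reproduced in a few lines. This is a sketch on a hypothetical array `a`. The flag `precedence_failed` only records that the unparenthesized expression raised an error.

```python
import numpy as np

# Hypothetical demo data
a = np.array([2, 4, 6, 12, 20])

# Without parentheses this is the chained comparison a > (3 & a) < 5,
# which needs the truth value of an array and raises ValueError
precedence_failed = False
try:
    a > 3 & a < 5
except ValueError:
    precedence_failed = True

in_range = (a > 3) & (a < 5)   # parenthesized: elementwise AND
print(in_range)

# Build the mask once, then reuse it for the in-place update
big = a > 10
a[big] *= 2
print(a)
```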

## 🧩 Challenge Exercise: “Selective Discounts”

**Scenario:**  
You manage a 4×3 sales matrix for 4 stores and 3 products:
```python
sales = np.array([
    [120, 135, 150],
    [100, 110, 120],
    [155, 160, 165],
    [130, 140, 150]
])
```

**Tasks:**
1. Extract all sales above 140 using a boolean mask.
2. Increase all sales below 110 by 10% (in place).
3. Retrieve the last column (product 3) for stores 1 and 3 only.
4. Create a new array of just the middle two rows (indices 1 and 2) and the middle column (using slicing).
5. Verify which of your arrays are views vs. copies.

💡 *Bonus:* Chain slicing and masking to find all values between 120 and 150.

✅ **Next Up:**  
In **Section 6**, we’ll move from static data to **vectorized computations and aggregation**, learning how to summarize and transform data efficiently with universal functions (ufuncs) and reductions.
