# NumPy Array Subsetting

## What is Subsetting?
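Subsetting means selecting specific elements from a NumPy array. Because the selection runs inside NumPy's compiled code rather than in a Python loop, it is usually much faster than looping and is essential for working with large datasets.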

## 4 Main Methods

| Method | Description | Example |
|--------|-------------|---------|
| **Indexing** | Get elements by position | `arr[0]` gets first element |
| **Boolean Indexing** | Get elements meeting conditions | `arr[arr > 5]` gets values > 5 |
| **Slicing** | Get a range of elements | `arr[1:4]` gets the elements at indices 1 to 3 (the stop index is exclusive) |
| **Multiple Indexing** (integer array indexing) | Get elements at a list of positions | `arr[[0, 2, 4]]` gets 1st, 3rd, 5th |

### Why Use Subsetting?
- ✅ **Fast**: No need for loops
- ✅ **Efficient**: Works with large datasets
- ✅ **Flexible**: Multiple ways to select data

In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
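
The four methods from the table can be tried on this array. The following is a minimal sketch; the variable names (`first`, `middle`, `picked`, `large`) are only illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

first = arr[0]            # indexing: the single element at position 0
middle = arr[1:4]         # slicing: positions 1, 2, 3 (stop is exclusive)
picked = arr[[0, 2, 4]]   # multiple indexing: positions 0, 2 and 4
large = arr[arr > 5]      # boolean indexing: values greater than 5

print(first)   # 1
print(middle)  # [2 3 4]
print(picked)  # [1 3 5]
print(large)   # [6 7 8 9]
```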



## 2D Array Slicing Explanation

Your array looks like this:
```
[[10, 20, 30],   <- Row 0
 [40, 50, 60]]   <- Row 1
```

**`arr[0:1, 0:2]`** means:
- **Rows**: From 0 to 1 (exclusive) = Row 0 only
- **Columns**: From 0 to 2 (exclusive) = Columns 0,1
- **Result**: `[[10, 20]]`

**`arr[0:1, 0:3:2]`** means:
- **Rows**: From 0 to 1 (exclusive) = Row 0 only  
- **Columns**: From 0 to 3 (exclusive) with step 2 = Columns 0,2 (skip column 1)
- **Result**: `[[10, 30]]`

The **step value** (the final `2` in `0:3:2`) means "take every 2nd element", so the slice takes columns 0 and 2 and skips column 1.

In [1]:
import numpy as np
arr = np.array([[10, 20, 30], 
                [40, 50, 60]])

# Slicing with Default Step Value
print(arr[0:1, 0:2])

# Slicing with Step Value of 2
print(arr[0:1, 0:3:2])

[[10 20]]
[[10 30]]
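
As a small sketch of the same slices, the start and stop can be omitted when they cover the whole axis, and replacing a slice with a plain integer drops that axis from the result. The names `explicit` and `short` are illustrative only.

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

explicit = arr[0:1, 0:3:2]   # start, stop and step written out
short = arr[:1, ::2]         # omitted start/stop default to the ends of the axis

print(np.array_equal(explicit, short))  # True
print(explicit.shape)                   # (1, 2): a slice keeps the row axis
print(arr[0, 0:3:2])                    # [10 30]: an integer index drops it
```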


In [9]:
# 3D arrays
import numpy as np
arr = np.array([[[19, 28, 37], 
                 [91, 82, 73]], 
                 
                [[46, 55, 10],
                 [64, 90, 1]]])
                 
# Plane 1 only, both rows, columns 0 and 2 (step 2 skips column 1)
print(arr[1:2, 0:2, 0:3:2])

[[[46 10]
  [64  1]]]


In [2]:
# Indexing

import numpy as np

arr = np.array([1,2,3,4,5,6,7,8,9])
# A list inside the brackets selects several positions at once.
# A plain arr[1] would return only the single value at index 1.
print(arr[[2, 4, 7]])




[3 5 8]


In [3]:
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [10, 20, 30, 40, 50],
                [100, 200, 300, 400, 500]])

print(arr[[0, 0, 1],[1, 4, 3]].tolist())

# arr[[0, 0, 1],[1, 4, 3]] means:
# - Rows:    [0, 0, 1]
# - Columns: [1, 4, 3]
# It picks:
#   arr[0,1] -> 2
#   arr[0,4] -> 5
#   arr[1,3] -> 40

# So the output is [2, 5, 40] (tolist() converts the result to a plain Python list)

[2, 5, 40]
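
The pairing of row and column lists can be checked against an explicit loop. This is a minimal sketch; `rows`, `cols`, `paired` and `by_hand` are illustrative names, not part of the original cell.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [10, 20, 30, 40, 50],
                [100, 200, 300, 400, 500]])

rows = [0, 0, 1]
cols = [1, 4, 3]

# NumPy pairs the lists element by element: (0,1), (0,4), (1,3)
paired = arr[rows, cols]

# The same selection written as a Python loop over the pairs
by_hand = [int(arr[r, c]) for r, c in zip(rows, cols)]

print(paired.tolist())  # [2, 5, 40]
print(by_hand)          # [2, 5, 40]
```

Both lists must have the same length (or be broadcastable), since each position in `rows` is matched with the same position in `cols`.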


In [22]:
import numpy as np
arr = np.array([[[1, 2, 3, 4],
                 [10, 20, 30, 40]], 
                
                [[11, 22, 33, 44],
                 [100, 200, 300, 400]], 
                 
                [[12, 23, 34, 45],
                 [56, 67, 78, 89]]])
                 
print(arr[[0,1,2,2],[0,0,1,1],[1,3,3,2]])

# Explanation
# The three lists are paired up as (plane, row, column), so the values picked are:
#   arr[0,0,1] -> 2, arr[1,0,3] -> 44, arr[2,1,3] -> 89, arr[2,1,2] -> 78

[ 2 44 89 78]


## Boolean Indexing Example and Explanation

Suppose you have the following array:
```
array([[[ 1,  2,  3,  4],
        [10, 20, 30, 40]],

       [[12, 23, 34, 45],
        [56, 67, 78, 89]]])
```

### Understanding the Syntax: `arr[arr <= 60]`

**Square Brackets `[ ]`**: Used for **indexing/selecting** elements from the array
- `arr <= 60` compares every element with 60 and returns a boolean array of the same shape, with `True` wherever the value is <= 60
- `arr[arr <= 60]` returns the actual values where the condition is `True` (flattened into a 1D array), not their indices
- `np.where(arr <= 60)` returns the indices of those values instead

**Result:**
```
[ 1  2  3  4 10 20 30 40 12 23 34 45 56]
```

Boolean indexing is a powerful way to filter data based on conditions!
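
The three expressions above can be compared side by side. This is a minimal sketch; `mask`, `values` and `indices` are illustrative names.

```python
import numpy as np

arr = np.array([[[1, 2, 3, 4],
                 [10, 20, 30, 40]],

                [[12, 23, 34, 45],
                 [56, 67, 78, 89]]])

mask = arr <= 60            # boolean array with the same shape as arr
print(mask.shape)           # (2, 2, 4)

values = arr[mask]          # the matching values, flattened to 1D
print(values)               # [ 1  2  3  4 10 20 30 40 12 23 34 45 56]

indices = np.where(mask)    # tuple with one index array per dimension
print(len(indices))         # 3

# Indexing with those index arrays recovers the same values
print(np.array_equal(arr[indices], values))  # True
```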

In [29]:
import numpy as np

arr = np.array([[[1, 2, 3, 4],
                 [10, 20, 30, 40]], 
                
                [[12, 23, 34, 45],
                 [56, 67, 78, 89]]])

print(arr[arr <= 60])

[ 1  2  3  4 10 20 30 40 12 23 34 45 56]


In [30]:
# Keep only the values divisible by two (remainder 0 when divided by 2)
import numpy as np

arr = np.array([[[1, 2, 3, 4],
                 [5, 6, 7, 8]], 
                 
                [[9, 10, 11, 12],
                 [13, 14, 15, 16]]])

print(arr[arr%2==0])

[ 2  4  6  8 10 12 14 16]


In [2]:
# Reshaping arrays
import numpy as np

arr = np.array([[2,4,6,8,3,6,9,12],[1,2,3,4,4,8,12,16],[5,10,15,20,6,12,18,24]])

print(arr.reshape(3,2,4))

# 3 is the number of planes, 2 the rows per plane, 4 the columns per row.
# 3 * 2 * 4 = 24, which matches the 24 elements of the original 3 x 8 array.

[[[ 2  4  6  8]
  [ 3  6  9 12]]

 [[ 1  2  3  4]
  [ 4  8 12 16]]

 [[ 5 10 15 20]
  [ 6 12 18 24]]]
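
A reshape only works when the new shape holds exactly the same number of elements. The sketch below illustrates this, and also uses `-1`, which asks NumPy to infer one dimension; the names `reshaped` and `inferred` are illustrative only.

```python
import numpy as np

arr = np.array([[2, 4, 6, 8, 3, 6, 9, 12],
                [1, 2, 3, 4, 4, 8, 12, 16],
                [5, 10, 15, 20, 6, 12, 18, 24]])

print(arr.size)                    # 24

reshaped = arr.reshape(3, 2, 4)    # 3 * 2 * 4 = 24, so this is valid
inferred = arr.reshape(3, -1, 4)   # -1 lets NumPy work out the middle dimension
print(inferred.shape)              # (3, 2, 4)

# A shape with a different element count raises ValueError
try:
    arr.reshape(5, 5)
except ValueError as err:
    print("reshape failed:", err)
```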


In [3]:
import numpy as np

arr = np.array([[10, 20, 30, 40, 50],
                [60, 70, 80, 90, 100]])

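# (10,) flattens the 2 x 5 array into a 1D array of 10 elements
arr_reshaped = arr.reshape((10,))
# Start at index 3, stop before index 10, step 3: indices 3, 6 and 9
arr_filter = arr_reshaped[3:10:3]
arr_filter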

array([ 40,  70, 100])
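
To see which positions a stepped slice visits, the same start, stop and step can be passed to Python's `range`. This is a small sketch with illustrative names (`flat`, `positions`).

```python
import numpy as np

arr = np.array([[10, 20, 30, 40, 50],
                [60, 70, 80, 90, 100]])

flat = arr.reshape(-1)               # -1 flattens to 1D without spelling out the length

positions = list(range(3, 10, 3))    # the indices visited by the slice 3:10:3
print(positions)                     # [3, 6, 9]

print(flat[3:10:3])                  # [ 40  70 100]
print(flat[positions])               # [ 40  70 100]
```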