# 03 - Array Operations and Indexing

## Introduction

Indexing and slicing are the basic operations for reading and modifying array elements. Most NumPy data manipulation, from filtering values to updating them in place, builds on these operations.

## What You'll Learn

- Array indexing (single element, slicing)
- Boolean indexing
- Fancy indexing
- Array slicing and views vs copies
- Modifying arrays
- Array comparison operations


In [2]:
import numpy as np


## 1D Array Indexing

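Indexing a 1D NumPy array works like indexing a Python list: positions are zero-based, and negative indices count from the end. NumPy adds boolean and fancy indexing on top of this, covered later in this notebook.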


In [2]:
# Create a 1D array
arr = np.array([10, 20, 30, 40, 50])
print("Array:", arr)

# Access single element (0-indexed)
print(f"First element: {arr[0]}")
print(f"Last element: {arr[-1]}")
print(f"Third element: {arr[2]}")


Array: [10 20 30 40 50]
First element: 10
Last element: 50
Third element: 30


## Array Slicing

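Slicing extracts a portion of an array. The syntax is `start:stop:step`, where `start` is inclusive and `stop` is exclusive. Any of the three parts can be omitted: `start` defaults to the beginning, `stop` to the end, and `step` to 1.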


In [3]:
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])
print("Array:", arr)

# Basic slicing: arr[start:stop]
print(f"First 3 elements: {arr[0:3]}")
print(f"Elements 2 to 5: {arr[2:6]}")
print(f"From index 3 to end: {arr[3:]}")
print(f"Up to index 4: {arr[:4]}")
print(f"All elements: {arr[:]}")

# Slicing with step: arr[start:stop:step]
print(f"Every 2nd element: {arr[::2]}")
print(f"Reverse array: {arr[::-1]}")


Array: [10 20 30 40 50 60 70 80]
First 3 elements: [10 20 30]
Elements 2 to 5: [30 40 50 60]
From index 3 to end: [40 50 60 70 80]
Up to index 4: [10 20 30 40]
All elements: [10 20 30 40 50 60 70 80]
Every 2nd element: [10 30 50 70]
Reverse array: [80 70 60 50 40 30 20 10]
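
Two details of `start:stop:step` are easy to miss: `stop` is exclusive, and out-of-range slice bounds are clamped instead of raising an error, unlike single-element indexing. The sketch below reuses the array values from the cell above; the variable names `middle`, `clamped`, and `backwards` are illustrative only.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])

# stop is exclusive: indices 2, 3, 4, 5 are returned, index 6 is not
middle = arr[2:6]
print(middle)      # [30 40 50 60]

# Out-of-range slice bounds are clamped to the array length
clamped = arr[5:100]
print(clamped)     # [60 70 80]

# A single out-of-range index raises IndexError instead
index_error_raised = False
try:
    arr[100]
except IndexError as exc:
    index_error_raised = True
    print("IndexError:", exc)

# Negative step walks backwards: start at index 6, stop before index 1
backwards = arr[6:1:-2]
print(backwards)   # [70 50 30]
```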


## 2D Array Indexing

For 2D arrays, use `[row, column]` syntax.


In [4]:
# Create a 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("2D Array:")
print(arr2d)

# Access single element: [row, column]
print(f"Element at [0, 0]: {arr2d[0, 0]}")
print(f"Element at [1, 2]: {arr2d[1, 2]}")
print(f"Element at [2, 1]: {arr2d[2, 1]}")

# Access entire row
print(f"First row: {arr2d[0, :]}")
print(f"Second row: {arr2d[1, :]}")

# Access entire column
print(f"First column: {arr2d[:, 0]}")
print(f"Second column: {arr2d[:, 1]}")


2D Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Element at [0, 0]: 1
Element at [1, 2]: 6
Element at [2, 1]: 8
First row: [1 2 3]
Second row: [4 5 6]
First column: [1 4 7]
Second column: [2 5 8]


In [5]:
# 2D slicing
print("Original array:")
print(arr2d)

# Slice rows and columns
print("\nFirst 2 rows, first 2 columns:")
print(arr2d[0:2, 0:2])

print("\nLast row:")
print(arr2d[-1, :])

print("\nMiddle column:")
print(arr2d[:, 1])


Original array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

First 2 rows, first 2 columns:
[[1 2]
 [4 5]]

Last row:
[7 8 9]

Middle column:
[2 5 8]
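
When mixing integers and slices in `[row, column]`, the choice affects the number of dimensions of the result: an integer index drops its axis, while a slice keeps it. A minimal sketch using the same 3x3 values as above (the names `row_1d`, `row_2d`, and `corners` are illustrative):

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An integer index drops that axis: the result is 1D
row_1d = arr2d[1, :]
print(row_1d.shape)   # (3,)

# A length-1 slice keeps the axis: the result is still 2D
row_2d = arr2d[1:2, :]
print(row_2d.shape)   # (1, 3)

# Steps apply per axis: every other row and every other column
corners = arr2d[::2, ::2]
print(corners)
```

Keeping the axis with a slice is useful when later code expects a 2D input.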


## Boolean Indexing

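Boolean indexing selects elements based on a condition. A comparison such as `arr > 40` produces a boolean mask with the same shape as the array, and indexing with that mask returns the elements where the mask is `True`.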


In [3]:
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])
print("Array:", arr)

# Create boolean mask
mask = arr > 40
print(f"Boolean mask (arr > 40): {mask}")

# Use mask to filter
filtered = arr[mask]
print(f"Elements > 40: {filtered}")

# Direct boolean indexing
print(f"Elements > 40: {arr[arr > 40]}")
print(f"Elements <= 30: {arr[arr <= 30]}")


Array: [10 20 30 40 50 60 70 80]
Boolean mask (arr > 40): [False False False False  True  True  True  True]
Elements > 40: [50 60 70 80]
Elements > 40: [50 60 70 80]
Elements <= 30: [10 20 30]


In [7]:
# Multiple conditions: use & for AND, | for OR, ~ for NOT.
# Parenthesize each comparison, because &, | and ~ bind tighter than comparison operators.
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])
print("Array:", arr)

# AND condition: elements between 30 and 60 (inclusive)
result = arr[(arr >= 30) & (arr <= 60)]
print(f"Elements between 30 and 60: {result}")

# OR condition: elements < 20 or > 70
result = arr[(arr < 20) | (arr > 70)]
print(f"Elements < 20 or > 70: {result}")

# NOT condition: elements not equal to 40
result = arr[~(arr == 40)]
print(f"Elements not equal to 40: {result}")


Array: [10 20 30 40 50 60 70 80]
Elements between 30 and 60: [30 40 50 60]
Elements < 20 or > 70: [10 80]
Elements not equal to 40: [10 20 30 50 60 70 80]
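
A common mistake when combining conditions is to write Python's `and`/`or` in place of `&`/`|`. The sketch below shows the failure mode and a way to count matches from a mask. The names `in_range` and `and_failed` are illustrative.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])

# Correct: parenthesize each comparison and combine with &
in_range = arr[(arr >= 30) & (arr <= 60)]
print(in_range)   # [30 40 50 60]

# Python's `and` needs a single truth value for a whole array,
# which is ambiguous, so NumPy raises ValueError
and_failed = False
try:
    arr[(arr >= 30) and (arr <= 60)]
except ValueError as exc:
    and_failed = True
    print("ValueError:", exc)

# A mask can be counted as well as used for selection
mask = (arr < 20) | (arr > 70)
print(np.count_nonzero(mask))   # 2
```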


## Fancy Indexing

Fancy indexing selects elements using a list or array of integer indices. Indices may repeat or appear in any order, and the result is a copy, not a view.


In [8]:
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80])
print("Array:", arr)

# Select specific indices
indices = [0, 2, 4, 6]
print(f"Elements at indices {indices}: {arr[indices]}")

# Using NumPy array for indices
indices_arr = np.array([1, 3, 5])
print(f"Elements at indices {indices_arr}: {arr[indices_arr]}")


Array: [10 20 30 40 50 60 70 80]
Elements at indices [0, 2, 4, 6]: [10 30 50 70]
Elements at indices [1 3 5]: [20 40 60]


In [9]:
# Fancy indexing for 2D arrays
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print("2D Array:")
print(arr2d)

# Select specific rows
print("\nRows 0 and 2:")
print(arr2d[[0, 2]])

# Select specific elements: row and column indices are paired position by position
print("\nElements at [0,1], [2,0], [3,2]:")
print(arr2d[[0, 2, 3], [1, 0, 2]])


2D Array:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

Rows 0 and 2:
[[1 2 3]
 [7 8 9]]

Elements at [0,1], [2,0], [3,2]:
[ 2  7 12]
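
Passing two index lists pairs them position by position, which is not the same as selecting every combination of the listed rows and columns. One way to get the full sub-grid is `np.ix_`. The sketch below contrasts the two; the names `rows`, `cols`, `paired`, and `grid` are illustrative.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

rows = [0, 2]
cols = [0, 2]

# Paired: picks the elements at (0, 0) and (2, 2), giving a 1D result
paired = arr2d[rows, cols]
print(paired)   # [1 9]

# Grid: np.ix_ builds index arrays that broadcast to every
# row/column combination, giving a 2x2 result
grid = arr2d[np.ix_(rows, cols)]
print(grid)
```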


## Views vs Copies

**Important concept**: Basic slicing creates a **view** of the array, not a copy, so modifying the view modifies the original array. Boolean and fancy indexing, by contrast, return copies. Use `.copy()` when you need an independent copy of a slice.


In [10]:
# View (slicing)
arr = np.array([1, 2, 3, 4, 5])
view = arr[1:4]  # This is a view
print(f"Original: {arr}")
print(f"View: {view}")

view[0] = 99  # Modify view
print("\nAfter modifying view:")
print(f"Original: {arr}")  # Original is also modified!
print(f"View: {view}")


Original: [1 2 3 4 5]
View: [2 3 4]

After modifying view:
Original: [ 1 99  3  4  5]
View: [99  3  4]


In [11]:
# Copy (independent)
arr = np.array([1, 2, 3, 4, 5])
copy = arr[1:4].copy()  # This is a copy
print(f"Original: {arr}")
print(f"Copy: {copy}")

copy[0] = 99  # Modify copy
print("\nAfter modifying copy:")
print(f"Original: {arr}")  # Original is NOT modified
print(f"Copy: {copy}")


Original: [1 2 3 4 5]
Copy: [2 3 4]

After modifying copy:
Original: [1 2 3 4 5]
Copy: [99  3  4]
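
To check whether an indexing result is a view or a copy, one option is `np.shares_memory`. The sketch below compares a basic slice with fancy and boolean indexing on the same data; the names `sliced`, `fancy`, and `masked` are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

sliced = arr[1:4]         # basic slice
fancy = arr[[1, 2, 3]]    # fancy indexing
masked = arr[arr > 1]     # boolean indexing

print(np.shares_memory(arr, sliced))   # True: a view onto arr
print(np.shares_memory(arr, fancy))    # False: a copy
print(np.shares_memory(arr, masked))   # False: a copy

# Writing to the fancy-indexed result leaves the original untouched
fancy[0] = 99
print(arr)   # [1 2 3 4 5]
```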


## Modifying Arrays

Assigning through an index, a slice, or a boolean mask modifies the array in place.


In [12]:
arr = np.array([1, 2, 3, 4, 5])
print("Original:", arr)

# Modify single element
arr[0] = 10
print("After arr[0] = 10:", arr)

# Modify multiple elements
arr[1:3] = [20, 30]
print("After arr[1:3] = [20, 30]:", arr)

# Modify with boolean indexing
arr[arr > 20] = 0
print("After arr[arr > 20] = 0:", arr)


Original: [1 2 3 4 5]
After arr[0] = 10: [10  2  3  4  5]
After arr[1:3] = [20, 30]: [10 20 30  4  5]
After arr[arr > 20] = 0: [10 20  0  4  5]
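
Assignment also accepts a scalar, which is broadcast across the selection, and fancy indices on the left-hand side. A sequence whose length does not match the selection is rejected. A minimal sketch, with `mismatch_rejected` as an illustrative flag:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# A scalar is broadcast to every position in the slice
arr[1:4] = 0
print(arr)   # [1 0 0 0 5]

# Fancy indexing on the left-hand side assigns in place
arr[[0, 4]] = [100, 500]
print(arr)   # [100   0   0   0 500]

# A sequence of the wrong length cannot be broadcast into the slice
mismatch_rejected = False
try:
    arr[1:4] = [7, 8]
except ValueError as exc:
    mismatch_rejected = True
    print("ValueError:", exc)
```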


## Array Comparison Operations

NumPy provides element-wise comparison operations that return boolean arrays.


In [4]:
arr = np.array([1, 2, 3, 4, 5])
print("Array:", arr)

# Comparison operations
print(f"arr > 3: {arr > 3}")
print(f"arr >= 3: {arr >= 3}")
print(f"arr < 3: {arr < 3}")
print(f"arr == 3: {arr == 3}")
print(f"arr != 3: {arr != 3}")

# Use with any() and all()
print(f"\nAny element > 3: {np.any(arr > 3)}")
print(f"All elements > 3: {np.all(arr > 3)}")
print(f"All elements > 0: {np.all(arr > 0)}")


Array: [1 2 3 4 5]
arr > 3: [False False False  True  True]
arr >= 3: [False False  True  True  True]
arr < 3: [ True  True False False False]
arr == 3: [False False  True False False]
arr != 3: [ True  True False  True  True]

Any element > 3: True
All elements > 3: False
All elements > 0: True
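
Comparisons also work element-wise between two arrays of the same shape. The sketch below uses a second, made-up array `b` to show this, plus `np.array_equal` for a single True/False answer and `np.count_nonzero` for counting matches.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 0, 3, 0, 5])   # illustrative values, not from the cells above

# Element-wise comparison between two arrays
equal_mask = a == b
print(equal_mask)                  # [ True False  True False  True]

# One boolean for the whole comparison
same = np.array_equal(a, b)
print(same)                        # False

# Count how many positions satisfy a condition
greater_count = np.count_nonzero(a > b)
print(greater_count)               # 2
```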


## Summary

In this notebook, you learned:

1. **1D indexing**: Accessing single elements and slices
2. **2D indexing**: Using `[row, column]` syntax
3. **Boolean indexing**: Filtering arrays based on conditions
4. **Fancy indexing**: Selecting elements using arrays of indices
5. **Views vs copies**: Understanding when modifications affect the original
6. **Modifying arrays**: Changing array elements
7. **Comparison operations**: Element-wise comparisons returning boolean arrays

**Key Takeaways**:
- Basic slicing creates views, so modifying a slice modifies the original; boolean and fancy indexing return copies
- Use `.copy()` when you need an independent copy
- Boolean indexing is powerful for filtering data
- Fancy indexing allows flexible element selection

**Next Steps**: In the next notebook, we'll explore mathematical operations and universal functions (ufuncs).
