# NumPy Complete Master Guide

A summary of core NumPy topics, covering:
- Why NumPy is superior
- Array basics and operations
- Arithmetic operations
- Rounding and constants
- Slicing and reshaping
- Statistical measures (mean, median, mode, std)
- Dot product
- Boolean indexing
- Data types (dtype)

---

## 1. Why NumPy is Superior to Python Lists

### Problem with Python Lists

Python lists have no element-wise arithmetic, so changing every element takes an explicit loop or a list comprehension.

In [1]:
# Python list - SLOW (needs loop)
my_list = [1, 2, 3, 4]
my_list = [x * 2 for x in my_list]  # List comprehension required
print(my_list)  # [2, 4, 6, 8]

# Also note:
my_list = [1, 2, 3, 4]
my_list = my_list * 2  # This just repeats the list!
print(my_list)  # [1, 2, 3, 4, 1, 2, 3, 4]

[2, 4, 6, 8]
[1, 2, 3, 4, 1, 2, 3, 4]


### Solution with NumPy

NumPy arrays support element-wise operations without explicit Python loops (vectorized operations).

In [2]:
import numpy as np

# NumPy array - FAST (no loop needed)
array = np.array([1, 2, 3, 4])
array = array * 2  # Multiplies every element in one vectorized step
print(array)  # [2 4 6 8]

# Add 5 to every element
array = np.array([10, 20, 30, 40])
array = array + 5
print(array)  # [15 25 35 45]

[2 4 6 8]
[15 25 35 45]


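**Key Benefit**: Compared with Python lists, NumPy arrays are:
- ✓ Faster (element-wise loops run in compiled C code)
- ✓ Cleaner to write (no explicit Python loops)
- ✓ More memory-efficient (values are stored in one contiguous, fixed-type buffer)
- ✓ Better suited to large datasets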
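
The memory claim above can be checked roughly with a short sketch. The names `py_list`, `np_array`, `list_bytes`, and `array_bytes` are illustrative, the list figure is only an estimate (it adds the list's pointer storage to the size of each integer object), and exact list sizes vary by Python version.

```python
import sys
import numpy as np

n = 100_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)  # dtype fixed so the size is predictable

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values back to back: 8 bytes per int64 element.
array_bytes = np_array.nbytes

print(f"List:  about {list_bytes:,} bytes")
print(f"Array: {array_bytes:,} bytes")
```

On a typical CPython build the list estimate comes out several times larger than the array's 800,000 bytes.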

---

## 2. Scalar Arithmetic Operations

Perform mathematical operations on every element in an array.

In [3]:
import numpy as np

array = np.array([1, 2, 3])

print("Addition:      ", array + 1)      # [2 3 4]
print("Subtraction:   ", array - 2)      # [-1  0  1]
print("Multiplication:", array * 3)      # [3 6 9]
print("Division:      ", array / 4)      # [0.25 0.5  0.75]
print("Power:         ", array ** 5)     # [1 32 243]

Addition:       [2 3 4]
Subtraction:    [-1  0  1]
Multiplication: [3 6 9]
Division:       [0.25 0.5  0.75]
Power:          [  1  32 243]
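
The same pattern applies to the other arithmetic operators. A small sketch, using illustrative variable names, of floor division, modulo, and the fact that `/` always produces floats:

```python
import numpy as np

array = np.array([1, 2, 3])

quotients = array // 2    # floor division -> [0 1 1]
remainders = array % 2    # modulo         -> [1 0 1]
true_div = array / 1      # true division yields floats even for integer input

print(quotients, remainders, true_div.dtype)
```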


---

## 3. Vectorized Operations

Apply built-in functions to entire arrays at once.

In [4]:
import numpy as np

array = np.array([1, 2, 3])

# Square root
print("Square root:", np.sqrt(array))     # [1.         1.41421356 1.73205081]

Square root: [1.         1.41421356 1.73205081]
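
As a further sketch of applying functions to whole arrays, the example below uses `np.square` and `np.sum` alongside `np.sqrt`, and compares the result with an explicit loop. The input values and variable names are chosen for illustration only.

```python
import numpy as np

array = np.array([1, 4, 9])

roots = np.sqrt(array)        # element-wise square root -> [1. 2. 3.]
squares = np.square(array)    # element-wise square      -> [ 1 16 81]
total = np.sum(array)         # reduction: collapses the array to one number -> 14

# The same square roots computed with an explicit Python loop
roots_loop = np.array([x ** 0.5 for x in array])

print(np.allclose(roots, roots_loop))  # True
```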


---

## 4. Rounding Operations

In [5]:
import numpy as np

array = np.array([1.01, 2.5, 3.99])

print("Original:  ", array)
print("Round:     ", np.round(array))    # [1. 2. 4.] (2.5 -> 2.: ties round to the nearest even value)
print("Floor:     ", np.floor(array))    # [1. 2. 3.] (round down)
print("Ceil:      ", np.ceil(array))     # [2. 3. 4.] (round up)

Original:   [1.01 2.5  3.99]
Round:      [1. 2. 4.]
Floor:      [1. 2. 3.]
Ceil:       [2. 3. 4.]
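
Two details of the rounding functions are easy to miss: `np.round` sends exact halves to the nearest even value, and it accepts a number of decimals as a second argument. A minimal sketch with illustrative inputs:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded_halves = np.round(halves)   # ties go to the nearest even value -> [0. 2. 2. 4.]

values = np.array([3.14159, 2.71828])
two_places = np.round(values, 2)    # second argument: number of decimals -> [3.14 2.72]

negatives = np.array([-1.5, -0.2])
floored = np.floor(negatives)       # toward negative infinity -> [-2. -1.]
ceiled = np.ceil(negatives)         # toward positive infinity -> [-1. -0.]

print(rounded_halves, two_places, floored, ceiled)
```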


---

## 5. Mathematical Constants

In [6]:
import numpy as np

print("Pi (π):", np.pi)  # 3.141592653589793

# Example: Calculate circle area
radii = np.array([1, 2, 3])
areas = np.pi * radii ** 2
print("Circle areas:", areas)  # [3.14159265 12.56637061 28.27433388]

Pi (π): 3.141592653589793
Circle areas: [ 3.14159265 12.56637061 28.27433388]
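
The constant combines with arrays in the same way for other formulas. The sketch below, with illustrative names, computes circumferences, converts degrees to radians with `np.deg2rad`, and reads Euler's number from `np.e`.

```python
import numpy as np

radii = np.array([1, 2, 3])
circumferences = 2 * np.pi * radii   # one circumference per radius

half_turn = np.deg2rad(180)          # 180 degrees in radians, equal to pi
euler = np.e                         # 2.718281828459045

print(circumferences, half_turn, euler)
```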


---

## 6. Element-Wise Operations (Two Arrays)

Operations between corresponding elements of two arrays.

In [7]:
import numpy as np

array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])

print("Addition:      ", array1 + array2)      # [5 7 9]
print("Subtraction:   ", array1 - array2)      # [-3 -3 -3]
print("Multiplication:", array1 * array2)      # [4 10 18]
print("Division:      ", array1 / array2)      # [0.25 0.4 0.5]
print("Power:         ", array1 ** array2)     # [1 32 729]

Addition:       [5 7 9]
Subtraction:    [-3 -3 -3]
Multiplication: [ 4 10 18]
Division:       [0.25 0.4  0.5 ]
Power:          [  1  32 729]
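
Element-wise operations pair up elements by position, so the two arrays need compatible shapes. A small sketch of what happens with two 1D arrays of different lengths (the `short` array and the flag name are illustrative):

```python
import numpy as np

array1 = np.array([1, 2, 3])
short = np.array([10, 20])

try:
    array1 + short            # lengths 3 and 2 cannot be paired up
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print("Shape mismatch raised ValueError:", mismatch_raised)  # True
```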


---

## 7. Boolean Indexing

Select or modify elements based on conditions.

In [8]:
import numpy as np

scores = np.array([91, 55, 100, 73, 82, 64])

# Check which scores equal 100
print(scores == 100)  # [False False  True False False False]

# Check which scores are less than 60
print(scores < 60)    # [False  True False False False False]

# Replace scores < 60 with "FAIL" (dtype=object lets one array hold both ints and strings)
scores_obj = np.array([91, 55, 100, 73, 82, 64], dtype=object)
scores_obj[scores_obj < 60] = "FAIL"
print(scores_obj)  # [91 'FAIL' 100 73 82 64]

[False False  True False False False]
[False  True False False False False]
[91 'FAIL' 100 73 82 64]
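
Boolean masks are also commonly used to filter, count, and overwrite values. The following sketch reuses the `scores` array from above; the other names, and the choice of 60 as a replacement value, are illustrative.

```python
import numpy as np

scores = np.array([91, 55, 100, 73, 82, 64])

passing = scores[scores >= 60]                       # keep only matching elements
mid_range = scores[(scores >= 60) & (scores < 90)]   # combine conditions with & and parentheses
num_failing = np.sum(scores < 60)                    # each True counts as 1

capped = scores.copy()       # copy so the original array stays unchanged
capped[capped < 60] = 60     # a numeric replacement keeps the integer dtype

print(passing)      # [ 91 100  73  82  64]
print(mid_range)    # [73 82 64]
print(num_failing)  # 1
print(capped)       # [ 91  60 100  73  82  64]
```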


---

## 8. Array Slicing (1D and 2D)

Extract subsets of arrays using `array[start:end:step]`, where `end` is exclusive. On a 2D array, a single slice selects rows.

In [9]:
import numpy as np

# 2D Array
array = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12],
                  [13, 14, 15, 16]])

print("First row:       ", array[0])      # [1 2 3 4]
print("Rows 0-2:        ", array[0:3])    # [[1 2 3 4] [5 6 7 8] [9 10 11 12]]
print("Every other row: ", array[0:4:2])  # [[1 2 3 4] [9 10 11 12]]

First row:        [1 2 3 4]
Rows 0-2:         [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
Every other row:  [[ 1  2  3  4]
 [ 9 10 11 12]]
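
The cell above slices rows only. As a sketch of the remaining cases, a comma separates the row slice from the column slice on a 2D array, and a 1D array takes a single slice. The variable names are illustrative.

```python
import numpy as np

array = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12],
                  [13, 14, 15, 16]])

second_column = array[:, 1]        # all rows, column 1        -> [ 2  6 10 14]
block = array[1:3, 1:3]            # rows 1-2 and columns 1-2  -> [[ 6  7] [10 11]]
last_row = array[-1]               # negative index counts from the end
reversed_first = array[0, ::-1]    # first row, reversed       -> [4 3 2 1]

vector = np.array([10, 20, 30, 40, 50])
middle = vector[1:4]               # 1D slice, end exclusive   -> [20 30 40]

print(second_column, block, last_row, reversed_first, middle, sep="\n")
```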


---

## 9. Reshape

Change array dimensions without changing the data. The total number of elements must stay the same.

In [10]:
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

# Reshape to 2D: 1 row, 12 columns
reshaped = array.reshape(1, -1)
print("Shape (1, -1):  ", reshaped.shape)  # (1, 12)

# Reshape to 2D: 3 rows, 4 columns
reshaped = array.reshape(3, 4)
print("Shape (3, 4):  ", reshaped.shape)  # (3, 4)
print(reshaped)

# Use -1 to auto-calculate one dimension
reshaped = array.reshape(-1, 3)  # 4 rows, 3 columns
print("Shape (-1, 3): ", reshaped.shape)  # (4, 3)

Shape (1, -1):   (1, 12)
Shape (3, 4):   (3, 4)
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
Shape (-1, 3):  (4, 3)


**Note**: `-1` means "infer this dimension from the array's size". It can be used for only one dimension per call.
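
A short sketch of the edge cases: a shape that does not divide the element count raises an error, and for a contiguous array like this one the reshaped result shares memory with the original rather than copying it. The names `grid`, `flat`, and the two flags are illustrative.

```python
import numpy as np

array = np.arange(1, 13)        # 12 elements: 1..12

grid = array.reshape(3, -1)     # -1 is inferred as 4
flat = grid.reshape(-1)         # back to 1D with 12 elements

try:
    array.reshape(5, -1)        # 12 is not divisible by 5
    bad_shape_raised = False
except ValueError:
    bad_shape_raised = True

# Check whether the reshaped array is a view onto the same data.
shares = np.shares_memory(array, grid)

print(grid.shape, flat.shape, bad_shape_raised, shares)
```

Because the data is shared here, writing to `grid` would also change `array`; call `.copy()` when an independent array is needed.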

---

## 10. Statistical Measures

### MEAN (Average)

In [11]:
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Manual calculation
mean_manual = np.sum(array) / len(array)
print(f"Manual: {mean_manual}")  # 5.0

# Using np.mean()
mean_func = np.mean(array)
print(f"np.mean(): {mean_func}")  # 5.0

Manual: 5.0
np.mean(): 5.0
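
On 2D data, `np.mean` takes an `axis` argument that selects the direction to average over. A minimal sketch with an illustrative 2×3 array:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

overall = np.mean(table)              # all six values     -> 3.5
per_column = np.mean(table, axis=0)   # average down rows  -> [2.5 3.5 4.5]
per_row = np.mean(table, axis=1)      # average across columns -> [2. 5.]

print(overall, per_column, per_row)
```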


### MEDIAN (Middle Value)

In [12]:
import numpy as np

array = np.array([9, 8, 6, 5, 7, 1, 3, 2, 4])

# Sorting is optional: np.median() handles unsorted input.
# It is done here only to make the middle value easy to see.
sorted_array = np.sort(array)
print(f"Sorted: {sorted_array}")  # [1 2 3 4 5 6 7 8 9]

median = np.median(sorted_array)
print(f"Median: {median}")  # 5.0

Sorted: [1 2 3 4 5 6 7 8 9]
Median: 5.0
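
The result of `np.median` does not depend on the order of the input. The sketch below checks that, shows the even-length case (the average of the two middle values), and compares median and mean on data with an outlier. All names and values are illustrative.

```python
import numpy as np

unsorted = np.array([9, 8, 6, 5, 7, 1, 3, 2, 4])
median_unsorted = np.median(unsorted)           # 5.0
median_sorted = np.median(np.sort(unsorted))    # 5.0, same result

even = np.array([1, 2, 3, 4])
median_even = np.median(even)                   # (2 + 3) / 2 = 2.5

with_outlier = np.array([1, 2, 3, 4, 100])
outlier_mean = np.mean(with_outlier)            # 22.0, pulled up by the outlier
outlier_median = np.median(with_outlier)        # 3.0, barely affected

print(median_unsorted, median_sorted, median_even, outlier_mean, outlier_median)
```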


### MODE (Most Frequent Value)

In [13]:
import numpy as np

array = np.array([1, 1, 2, 2, 2, 3, 3])

# Step 1: Count occurrences (counts[i] is how many times the value i appears)
counts = np.bincount(array)
print(f"Counts: {counts}")  # [0 2 3 2]

# Step 2: Find which index has the highest count
mode = np.argmax(counts)
print(f"Mode: {mode}")  # 2 (appears 3 times)

Counts: [0 2 3 2]
Mode: 2
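
`np.bincount` only accepts non-negative integers. For floats or negative values, one possible alternative (not the method used above) is `np.unique` with `return_counts=True`. A minimal sketch with illustrative data:

```python
import numpy as np

values = np.array([-1.5, 2.0, 2.0, -1.5, 2.0, 7.25])

# unique_values is sorted; counts[i] is the frequency of unique_values[i]
unique_values, counts = np.unique(values, return_counts=True)

mode = unique_values[np.argmax(counts)]   # value with the highest count

print(unique_values)  # [-1.5   2.    7.25]
print(counts)         # [2 3 1]
print(mode)           # 2.0
```

As with `np.argmax` on `np.bincount` output, a tie returns only the first (smallest) of the tied values.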


### MODE with Multiple Values (Ties)

In [14]:
import numpy as np

array = np.array([9, 9, 9, 9, 8, 3, 4, 5, 2, 8, 2, 7, 2, 6, 7, 7, 4, 5, 5, 3, 1, 1, 5, 8, 8, 4, 6, 9, 7, 6, 5])

counts = np.bincount(array)
max_count = np.max(counts)  # Find the highest count
modes = np.where(counts == max_count)[0]  # Find ALL indices with that count

print(f"Modes: {modes}")      # [5 9]
print(f"Frequency: {max_count}")  # 5 (both values appear 5 times)

Modes: [5 9]
Frequency: 5


### STANDARD DEVIATION (Spread of Data)

In [15]:
import numpy as np

# Low variation (tight cluster)
tight = np.array([20, 20, 20, 20])
print(f"Tight std: {np.std(tight)}")  # 0.0

# High variation (scattered)
scattered = np.array([17, 18, 19, 2.5])
print(f"Scattered std: {np.std(scattered)}")  # 6.75 (large)

# With ddof parameter
print(f"Population std (ddof=0): {np.std(scattered, ddof=0)}")  # 6.75 (divides by n, the default)
print(f"Sample std (ddof=1): {np.std(scattered, ddof=1)}")      # 7.79 (divides by n - 1)

Tight std: 0.0
Scattered std: 6.748842493346544
Population std (ddof=0): 6.748842493346544
Sample std (ddof=1): 7.792892060504025
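
To make the `ddof` parameter concrete, the sketch below recomputes both values by hand: the square root of the summed squared deviations from the mean, divided by `n` for the population version and by `n - 1` for the sample version. The intermediate names are illustrative.

```python
import numpy as np

scattered = np.array([17, 18, 19, 2.5])

mean = scattered.mean()                   # 14.125
squared_dev = (scattered - mean) ** 2     # squared distance of each value from the mean
n = len(scattered)

population_std = np.sqrt(squared_dev.sum() / n)        # matches np.std(scattered)
sample_std = np.sqrt(squared_dev.sum() / (n - 1))      # matches np.std(scattered, ddof=1)

print(population_std, sample_std)
```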


---

## 11. Dot Product

Multiply corresponding elements and sum the results.

In [16]:
import numpy as np

# Scalar dot product
result = np.dot(5, 3)
print(f"5 · 3 = {result}")  # 15

# Vector dot product
a = np.array([2, 4, 6])
b = np.array([-1, 2, 3])
result = np.dot(a, b)
# (2 × -1) + (4 × 2) + (6 × 3) = -2 + 8 + 18 = 24
print(f"Vector dot product: {result}")  # 24

5 · 3 = 15
Vector dot product: 24
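
The "multiply and sum" description can be written out directly. The sketch below compares that manual form with `np.dot` and with the `@` operator, which gives the same result for 1D arrays. The variable names are illustrative.

```python
import numpy as np

a = np.array([2, 4, 6])
b = np.array([-1, 2, 3])

manual = np.sum(a * b)     # element-wise products, then their sum
with_dot = np.dot(a, b)
with_operator = a @ b      # @ is the matrix-multiplication operator

print(manual, with_dot, with_operator)  # 24 24 24
```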


### Matrix Multiplication

In [17]:
import numpy as np

A = np.array([[4, 2, 3, 8],
              [2, 1, 3, 5],
              [1, 1, 2, 6]])

B = np.array([[1, 3],
              [2, 2],
              [8, 2],
              [5, 3]])

result = np.dot(A, B)  # (3×4) · (4×2) = (3×2)
print("Matrix multiplication result:")
print(result)

Matrix multiplication result:
[[72 46]
 [53 29]
 [49 27]]
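
For 2D arrays, `A @ B` gives the same result as `np.dot(A, B)`, and the inner dimensions must match. A short sketch reusing the matrices above (the flag name is illustrative):

```python
import numpy as np

A = np.array([[4, 2, 3, 8],
              [2, 1, 3, 5],
              [1, 1, 2, 6]])

B = np.array([[1, 3],
              [2, 2],
              [8, 2],
              [5, 3]])

product = A @ B                                   # (3×4) @ (4×2) -> (3×2)
same_as_dot = np.array_equal(product, np.dot(A, B))

try:
    B @ A                                         # (4×2) @ (3×4): inner sizes 2 and 3 differ
    order_error_raised = False
except ValueError:
    order_error_raised = True

print(product.shape, same_as_dot, order_error_raised)  # (3, 2) True True
```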


---

## 12. Data Types (dtype)

In [18]:
import numpy as np

# Integer input: NumPy picks its default integer type
int_array = np.array([1, 2, 3])
print(f"Default dtype: {int_array.dtype}")  # int64 on most platforms (int32 on Windows before NumPy 2.0)

# Mixed integers and strings
mixed = np.array([91, 55, 100], dtype=object)
mixed[mixed < 60] = "FAIL"
print(f"Mixed array: {mixed}")  # [91 'FAIL' 100]

Default dtype: int64
Mixed array: [91 'FAIL' 100]
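
A dtype can also be set explicitly or converted afterwards. The sketch below, with illustrative names, shows an explicit float array, automatic upcasting when integers and floats are mixed, and the truncation that happens when floats are converted to integers with `astype`.

```python
import numpy as np

floats = np.array([1, 2, 3], dtype=np.float64)   # request floats explicitly
upcast = np.array([1, 2.5, 3])                   # one float makes the whole array float64

# Converting to integers drops the fractional part (truncates toward zero)
truncated = np.array([1.7, 2.2, -1.7]).astype(np.int64)

print(floats.dtype)   # float64
print(upcast.dtype)   # float64
print(truncated)      # [ 1  2 -1]
```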


---

## Quick Reference Table

| Function | Purpose |
|----------|----------|
| `np.array()` | Create array |
| `np.mean()` | Average |
| `np.median()` | Middle value (no pre-sorting needed) |
| `np.std()` | Standard deviation |
| `np.sum()` | Total of all elements |
| `np.max()` | Largest value |
| `np.min()` | Smallest value |
| `np.round()` | Round to nearest integer (ties go to the even value) |
| `np.floor()` | Round down |
| `np.ceil()` | Round up |
| `np.sqrt()` | Square root |
| `np.dot()` | Dot product / Matrix multiply |
| `np.bincount()` | Count occurrences of non-negative integers |
| `np.argmax()` | Index of max value |
| `np.where()` | Find indices matching condition |
| `np.sort()` | Sort array |
| `np.reshape()` | Change dimensions |

## Machine Learning Applications

1. **Data Normalization**: Use mean and std for scaling
2. **Feature Engineering**: Element-wise operations for new features
3. **Statistical Analysis**: Mean, median, std for data exploration
4. **Linear Algebra**: Dot product for matrix operations
5. **Filtering**: Boolean indexing to clean data
6. **Reshaping**: Prepare data for ML models
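
As one example of the first item, the sketch below standardizes each column of a small feature matrix to mean 0 and standard deviation 1. The data and the names `features`, `col_mean`, `col_std`, and `standardized` are made up for illustration, and the code assumes no column has zero standard deviation.

```python
import numpy as np

# Rows are samples, columns are features on very different scales.
features = np.array([[1.0, 200.0],
                     [2.0, 400.0],
                     [3.0, 600.0]])

col_mean = features.mean(axis=0)   # one mean per column
col_std = features.std(axis=0)     # one std per column (population std, ddof=0)

# The per-column values are applied to every row automatically.
standardized = (features - col_mean) / col_std

print(standardized.mean(axis=0))   # approximately [0. 0.]
print(standardized.std(axis=0))    # approximately [1. 1.]
```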