# Notebook 6: NumPy Fundamentals

Welcome to your sixth Python notebook! This one introduces NumPy (Numerical Python), the array library that much of Python's data science stack is built on. NumPy provides fast multi-dimensional arrays and the mathematical operations to work with them.

**Learning Objectives:**
- Understand what NumPy is and why it's important
- Create and manipulate NumPy arrays
- Perform mathematical operations on arrays
- Use array indexing and slicing
- Apply NumPy to data science problems

In [1]:
# Import NumPy (standard convention)
import numpy as np

# Check NumPy version
print(f"NumPy version: {np.__version__}")

# Create a simple array
arr = np.array([1, 2, 3, 4, 5])
print("NumPy array:", arr)
print("Array type:", type(arr))

NumPy version: 2.3.1
NumPy array: [1 2 3 4 5]
Array type: <class 'numpy.ndarray'>
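
To make the "why NumPy" point concrete, here is a minimal sketch comparing how a plain list and an `ndarray` respond to the same arithmetic. The names `py_list`, `list_times_two`, and `arr_times_two` are illustrative and do not come from the cell above.

```python
import numpy as np

py_list = [1, 2, 3, 4, 5]
arr = np.array(py_list)

# Multiplying a list repeats it; multiplying an array scales each element.
list_times_two = py_list * 2
arr_times_two = arr * 2

print("List * 2: ", list_times_two)
print("Array * 2:", arr_times_two)

# Arrays also carry metadata describing their layout.
print("ndim:", arr.ndim)    # number of dimensions
print("shape:", arr.shape)  # size along each dimension
print("size:", arr.size)    # total number of elements
```

The elementwise behaviour is what lets the later cells convert units or compare values across a whole array without a loop.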


## Array Operations - The Power of NumPy

NumPy's real strength is vectorization: applying an operation to an entire array at once, without writing a Python loop:

In [2]:
# Real data science example: Temperature data
temperatures_celsius = np.array([20, 25, 30, 15, 35, 28, 22])
print("Temperatures (°C):", temperatures_celsius)

# Convert all to Fahrenheit in one operation!
temperatures_fahrenheit = temperatures_celsius * 9/5 + 32
print("Temperatures (°F):", temperatures_fahrenheit)

# Boolean operations - very common in ML!
hot_days = temperatures_celsius > 25
print("Hot days (>25°C):", hot_days)
print("Hot temperatures:", temperatures_celsius[hot_days])

# Statistical operations
print("\nStatistics:")
print(f"Mean temperature: {np.mean(temperatures_celsius):.1f}°C")
print(f"Max temperature: {np.max(temperatures_celsius)}°C")
print(f"Standard deviation: {np.std(temperatures_celsius):.1f}°C")  # population std (ddof=0)

# Array slicing - start index is included, stop index is excluded (very common in ML code)
print(f"\nFirst 3 temperatures: {temperatures_celsius[0:3]}")  # indices 0, 1, 2
print(f"Last 2 temperatures: {temperatures_celsius[-2:]}")

Temperatures (°C): [20 25 30 15 35 28 22]
Temperatures (°F): [68.  77.  86.  59.  95.  82.4 71.6]
Hot days (>25°C): [False False  True False  True  True False]
Hot temperatures: [30 35 28]

Statistics:
Mean temperature: 25.0°C
Max temperature: 35°C
Standard deviation: 6.2°C

First 3 temperatures: [20 25 30]
Last 2 temperatures: [28 22]
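
Two follow-ups to the cell above, as a minimal sketch: combining boolean conditions, and choosing between the population and sample standard deviation. The 20-28°C "mild" range and the names `mild_days`, `population_std`, and `sample_std` are assumptions made for illustration.

```python
import numpy as np

temperatures_celsius = np.array([20, 25, 30, 15, 35, 28, 22])

# Combine conditions with & (and) or | (or); each condition needs parentheses.
mild_days = (temperatures_celsius >= 20) & (temperatures_celsius <= 28)
print("Mild days (20-28°C):", temperatures_celsius[mild_days])

# Summing a boolean mask counts its True values.
hot_day_count = int((temperatures_celsius > 25).sum())
print("Number of hot days (>25°C):", hot_day_count)

# np.std defaults to ddof=0 (population formula, divides by n).
# ddof=1 divides by n - 1, the usual choice when the data is a sample.
population_std = np.std(temperatures_celsius)
sample_std = np.std(temperatures_celsius, ddof=1)
print(f"Population std: {population_std:.2f}°C")
print(f"Sample std: {sample_std:.2f}°C")
```

With only seven readings the two estimates differ noticeably, so it is worth knowing which one a given function reports.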


## Matrix Operations - Foundation of Machine Learning

These operations underpin many ML algorithms, from linear regression to neural networks:

In [3]:
# Create data matrix (like you'll see in ML notebooks)
# Each row = one data sample, each column = one feature
X = np.array([[1, 2, 3],      # Sample 1: features [1, 2, 3]
              [4, 5, 6],      # Sample 2: features [4, 5, 6]  
              [7, 8, 9]])     # Sample 3: features [7, 8, 9]

print("Data matrix X:")
print(X)
print(f"Shape: {X.shape} (3 samples, 3 features)")

# Transpose: swap rows and columns (one of the most common operations in ML code)
X_T = X.T  # .T is shorthand for np.transpose(X)
print("\nX transpose (.T):")
print(X_T)
print(f"Shape: {X_T.shape}")

# Matrix multiplication - core of neural networks!
result = X @ X_T  # equivalent to X.dot(X_T)
print("\nMatrix multiplication X @ X.T:")
print(result)

# Simulate the prediction step of a linear model
weights = np.array([0.5, 0.3, 0.2])  # Model weights, one per feature
predictions = X.dot(weights)  # One prediction per sample: a weighted sum of its features
print(f"\nWeights: {weights}")
print(f"Predictions: {predictions}")

# Generate random data (very common in ML)
np.random.seed(42)  # For reproducible results (seeds NumPy's legacy global generator)
random_data = np.random.randn(5, 3)  # 5 samples, 3 features, drawn from a standard normal
print(f"\nRandom data shape: {random_data.shape}")
print("First 3 samples:")
print(random_data[0:3])  # This slicing pattern again!

Data matrix X:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Shape: (3, 3) (3 samples, 3 features)

X transpose (.T):
[[1 4 7]
 [2 5 8]
 [3 6 9]]
Shape: (3, 3)

Matrix multiplication X @ X.T:
[[ 14  32  50]
 [ 32  77 122]
 [ 50 122 194]]

Weights: [0.5 0.3 0.2]
Predictions: [1.7 4.7 7.7]

Random data shape: (5, 3)
First 3 samples:
[[ 0.49671415 -0.1382643   0.64768854]
 [ 1.52302986 -0.23415337 -0.23413696]
 [ 1.57921282  0.76743473 -0.46947439]]
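
The shape rule behind these products, and the newer random-number interface, can be sketched as follows. The `bias` value is a hypothetical intercept added for illustration and is not part of the cell above. `np.random.default_rng` uses a different generator from `np.random.seed`, so the same seed will not reproduce the numbers printed above.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
weights = np.array([0.5, 0.3, 0.2])
bias = 1.0  # hypothetical intercept term (assumption, not from the notebook)

# (3, 3) @ (3,) -> (3,): the inner dimensions must match.
# The scalar bias is broadcast, i.e. added to every sample's prediction.
predictions = X @ weights + bias
print("Predictions with bias:", predictions)

# For 2-D arrays, the @ operator and .dot() give the same result.
same_result = np.array_equal(X @ X.T, X.dot(X.T))
print("X @ X.T equals X.dot(X.T):", same_result)

# Newer NumPy code often creates a local Generator instead of seeding globally.
rng = np.random.default_rng(42)
random_data = rng.standard_normal((5, 3))  # 5 samples, 3 features
print("Random data shape:", random_data.shape)
```

A local generator keeps reproducibility scoped to the code that owns `rng`, rather than relying on global state that any import could reseed.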
