# Comprehensive NumPy Tutorial

## Overview
NumPy (Numerical Python) is the fundamental package for scientific computing in Python. It provides:
- A powerful N-dimensional array object (`ndarray`)
- Sophisticated broadcasting functions
- Tools for integrating C/C++ and Fortran code
- Useful linear algebra, Fourier transform, and random number capabilities

### What You'll Learn
1. Creating NumPy arrays
2. Array properties and indexing
3. Array operations and mathematical functions
4. Array manipulation (reshape, transpose, concatenate)
5. Broadcasting
6. Statistical operations
7. Linear algebra
8. Random number generation

## 1. Setup and Import

First, let's import NumPy and check the version.

In [None]:
import numpy as np

# Check NumPy version
print(f"NumPy version: {np.__version__}")

# Set print options for better readability
np.set_printoptions(precision=3, suppress=True)

## 2. Creating NumPy Arrays

NumPy arrays can be built from Python lists, filled with placeholder values, or generated from numeric ranges.

In [None]:
# From Python lists
arr1 = np.array([1, 2, 3, 4, 5])
print("1D array from list:")
print(arr1)

# 2D array (matrix)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D array (matrix):")
print(arr2)

# 3D array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\n3D array:")
print(arr3)

### Arrays with Initial Placeholder Values

In [None]:
# Array of zeros
zeros = np.zeros((3, 4))
print("Array of zeros (3x4):")
print(zeros)

# Array of ones
ones = np.ones((2, 3, 4))
print("\nArray of ones (2x3x4):")
print(ones)

# Array with a constant value
full = np.full((3, 3), 7)
print("\nArray filled with 7:")
print(full)

# Identity matrix
identity = np.eye(4)
print("\nIdentity matrix (4x4):")
print(identity)

# Empty array (uninitialized)
empty = np.empty((2, 2))
print("\nEmpty array (values are uninitialized):")
print(empty)

### Arrays with Ranges and Sequences

In [None]:
# Using arange (similar to Python's range)
range_arr = np.arange(0, 10, 2)  # start, stop (exclusive), step
print("Array with arange (0 up to but not including 10, step 2):")
print(range_arr)

# Using linspace (linearly spaced values)
linspace_arr = np.linspace(0, 1, 11)  # 11 values from 0 to 1, both endpoints included
print("\nArray with linspace (11 values from 0 to 1):")
print(linspace_arr)

# Using logspace (logarithmically spaced values)
logspace_arr = np.logspace(0, 2, 5)  # 5 values from 10^0 to 10^2
print("\nArray with logspace (5 values from 10^0 to 10^2):")
print(logspace_arr)
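
The difference in endpoint handling between `np.arange` and `np.linspace` is easy to miss. The following is a minimal sketch of that difference; the variable names are illustrative and not part of the tutorial's earlier cells.

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default
half_open = np.arange(0, 10, 2)
closed = np.linspace(0, 10, 6)

# endpoint=False makes linspace half-open as well
no_endpoint = np.linspace(0, 10, 5, endpoint=False)

# retstep=True also returns the spacing between consecutive samples
samples, step = np.linspace(0, 1, 11, retstep=True)

print(half_open)
print(closed)
print(no_endpoint)
print(step)
```

For non-integer steps, `np.linspace` is usually the safer choice, because the number of elements is stated explicitly rather than derived from floating-point arithmetic.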

## 3. Array Properties and Data Types

Every NumPy array exposes attributes that describe its shape, element type, and memory footprint.

In [None]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

print("Array:")
print(arr)
print(f"\nShape: {arr.shape}")           # Dimensions (rows, columns)
print(f"Number of dimensions: {arr.ndim}")  # Number of axes
print(f"Size (total elements): {arr.size}") # Total number of elements
print(f"Data type: {arr.dtype}")           # Type of elements
print(f"Item size (bytes): {arr.itemsize}") # Size of each element
print(f"Total bytes: {arr.nbytes}")        # Total memory used

### Data Types

In [None]:
# Explicit data types
int_arr = np.array([1, 2, 3], dtype=np.int32)
float_arr = np.array([1, 2, 3], dtype=np.float64)
complex_arr = np.array([1+2j, 3+4j], dtype=np.complex128)
bool_arr = np.array([True, False, True], dtype=np.bool_)

print(f"Integer array dtype: {int_arr.dtype}")
print(f"Float array dtype: {float_arr.dtype}")
print(f"Complex array dtype: {complex_arr.dtype}")
print(f"Boolean array dtype: {bool_arr.dtype}")

# Type conversion
converted = int_arr.astype(np.float64)
print(f"\nConverted to float: {converted}, dtype: {converted.dtype}")
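
Type conversion can silently change values. The sketch below, using made-up sample data, shows that `astype` to an integer type truncates rather than rounds, and that mixing dtypes in an operation promotes the result to a common type.

```python
import numpy as np

floats = np.array([1.7, -1.7, 2.5])

# astype to an integer type drops the fractional part (truncates toward zero)
truncated = floats.astype(np.int32)

# Round first if rounding is what you want; np.round sends exact halves
# to the nearest even value
rounded = np.round(floats).astype(np.int32)

# int32 combined with float64 is promoted to float64
mixed = np.array([1, 2], dtype=np.int32) + np.array([0.5, 0.5])

print(truncated)
print(rounded)
print(mixed.dtype)
```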

## 4. Array Indexing and Slicing

Array elements are accessed with zero-based indices and slices, much like Python lists, using one index or slice per axis.

In [None]:
# 1D array indexing
arr = np.array([10, 20, 30, 40, 50])
print("Original array:", arr)
print(f"First element: {arr[0]}")
print(f"Last element: {arr[-1]}")
print(f"Slice [1:4]: {arr[1:4]}")
print(f"Every second element: {arr[::2]}")
print(f"Reversed: {arr[::-1]}")

In [None]:
# 2D array indexing
arr2d = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12]])

print("2D Array:")
print(arr2d)
print(f"\nElement at [1, 2]: {arr2d[1, 2]}")
print(f"First row: {arr2d[0, :]}")
print(f"Second column:\n{arr2d[:, 1]}")
print(f"\nSubarray [0:2, 1:3]:\n{arr2d[0:2, 1:3]}")

### Boolean Indexing and Fancy Indexing

In [None]:
arr = np.array([10, 20, 30, 40, 50, 60, 70])

# Boolean indexing
mask = arr > 35
print("Boolean mask (arr > 35):")
print(mask)
print("\nFiltered values (arr > 35):")
print(arr[mask])

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
print("\nValues between 25 and 55:")
print(arr[(arr > 25) & (arr < 55)])

# Fancy indexing (using arrays of indices)
indices = np.array([0, 2, 4])
print("\nElements at indices [0, 2, 4]:")
print(arr[indices])
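
Boolean and fancy indexing behave differently on the two sides of an assignment. The sketch below illustrates this with the same sample values; the names `selected` and `capped` are introduced only for this example.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

# Reading with a mask (or an index array) returns a copy
selected = arr[arr > 35]
selected[0] = -1  # changes only the copy, arr is untouched

# Assigning through a mask modifies the target array in place
capped = arr.copy()
capped[capped > 35] = 35

print(arr)
print(selected)
print(capped)
```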

## 5. Array Operations

NumPy supports vectorized operations (element-wise operations without loops).

In [None]:
# Arithmetic operations
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print("Array a:", a)
print("Array b:", b)
print(f"\na + b = {a + b}")
print(f"a - b = {a - b}")
print(f"a * b = {a * b}")
print(f"b / a = {b / a}")
print(f"a ** 2 = {a ** 2}")
print(f"b % 3 = {b % 3}")

In [None]:
# Scalar operations
arr = np.array([1, 2, 3, 4, 5])
print("Original array:", arr)
print(f"arr + 10 = {arr + 10}")
print(f"arr * 2 = {arr * 2}")
print(f"arr ** 2 = {arr ** 2}")
print(f"1 / arr = {1 / arr}")

### Mathematical Functions

In [None]:
arr = np.array([1, 4, 9, 16, 25])

print("Array:", arr)
print(f"\nSquare root: {np.sqrt(arr)}")
print(f"Exponential: {np.exp([1, 2, 3])}")
print(f"Natural log: {np.log(arr)}")
print(f"Log base 10: {np.log10(arr)}")
print(f"Absolute value: {np.abs([-1, -2, 3, -4])}")

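# Trigonometric functions (angles in radians)
# Note: tan(pi/2) prints a very large finite number, not infinity,
# because pi/2 cannot be represented exactly in floating point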
angles = np.array([0, np.pi/4, np.pi/2, np.pi])
print(f"\nSine: {np.sin(angles)}")
print(f"Cosine: {np.cos(angles)}")
print(f"Tangent: {np.tan(angles)}")

### Comparison Operations

In [None]:
a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 4, 3, 2, 1])

print("Array a:", a)
print("Array b:", b)
print(f"\na > 3: {a > 3}")
print(f"a == b: {a == b}")
print(f"a < b: {a < b}")

# Aggregate comparison
print(f"\nAny element > 4: {np.any(a > 4)}")
print(f"All elements > 0: {np.all(a > 0)}")
print(f"Arrays are equal: {np.array_equal(a, b)}")
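
The `==` operator and `np.array_equal` compare values exactly, which is often too strict for floating-point results. A minimal sketch of tolerance-based comparison, with sample values chosen for illustration:

```python
import numpy as np

x = np.array([0.1 + 0.2, 0.5])
y = np.array([0.3, 0.5])

exact = x == y                 # exact element-wise comparison
close = np.isclose(x, y)       # element-wise comparison within a tolerance
all_close = np.allclose(x, y)  # one True/False answer for the whole array

print(exact)
print(close)
print(all_close)
```

Both `np.isclose` and `np.allclose` accept `rtol` and `atol` arguments when the default tolerances do not suit the data.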

## 6. Array Manipulation

Reshaping, transposing, and combining arrays.

In [None]:
# Reshaping
arr = np.arange(12)
print("Original 1D array:")
print(arr)

reshaped = arr.reshape(3, 4)
print("\nReshaped to 3x4:")
print(reshaped)

# -1 means "infer this dimension"
auto_reshaped = arr.reshape(4, -1)
print("\nReshaped to 4x? (auto):")
print(auto_reshaped)

# Flatten back to 1D
flattened = reshaped.flatten()
print("\nFlattened:")
print(flattened)

# ravel (similar to flatten but returns a view when possible)
raveled = reshaped.ravel()
print("\nRaveled:")
print(raveled)
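
The practical difference between `flatten` and `ravel` shows up when the result is modified. The following sketch checks memory sharing with `np.shares_memory`:

```python
import numpy as np

arr = np.arange(12)
reshaped = arr.reshape(3, 4)

flattened = reshaped.flatten()  # always a copy
raveled = reshaped.ravel()      # a view here, because the data is contiguous

print(np.shares_memory(arr, flattened))
print(np.shares_memory(arr, raveled))

# Writing through the view also changes the original array
raveled[0] = 100
print(arr[0])
print(flattened[0])
```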

### Transpose and Swapping Axes

In [None]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print("Original array (2x3):")
print(arr)

# Transpose
transposed = arr.T
print("\nTransposed (3x2):")
print(transposed)

# For 3D arrays
arr3d = np.arange(24).reshape(2, 3, 4)
print("\n3D array shape:", arr3d.shape)
swapped = arr3d.swapaxes(0, 2)
print("After swapping axes 0 and 2:", swapped.shape)

### Concatenating and Splitting Arrays

In [None]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print("Array a:")
print(a)
print("\nArray b:")
print(b)

# Concatenate vertically (stack rows)
v_stack = np.vstack((a, b))
print("\nVertical stack (vstack):")
print(v_stack)

# Concatenate horizontally (stack columns)
h_stack = np.hstack((a, b))
print("\nHorizontal stack (hstack):")
print(h_stack)

# General concatenate
concat_axis0 = np.concatenate((a, b), axis=0)
print("\nConcatenate along axis 0:")
print(concat_axis0)

concat_axis1 = np.concatenate((a, b), axis=1)
print("\nConcatenate along axis 1:")
print(concat_axis1)

In [None]:
# Splitting arrays
arr = np.arange(16).reshape(4, 4)
print("Original array:")
print(arr)

# Split vertically
upper, lower = np.vsplit(arr, 2)
print("\nUpper half:")
print(upper)
print("\nLower half:")
print(lower)

# Split horizontally
left, right = np.hsplit(arr, 2)
print("\nLeft half:")
print(left)
print("\nRight half:")
print(right)
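
The split functions above require the array to divide evenly. As a sketch of the alternatives, `np.array_split` accepts uneven divisions, and `np.split` can also cut at explicit indices:

```python
import numpy as np

arr = np.arange(10)

# np.split needs equal-sized pieces and raises ValueError otherwise
try:
    np.split(arr, 3)
except ValueError as exc:
    print("np.split failed:", exc)

# np.array_split tolerates an uneven division
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])

# Passing a list splits at those indices instead of into N pieces
head, middle, tail = np.split(arr, [2, 7])
print(head, middle, tail)
```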

## 7. Broadcasting

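Broadcasting lets NumPy apply element-wise operations to arrays of different shapes without making explicit copies. Shapes are compared from the trailing dimension backward: two dimensions are compatible when they are equal or when one of them is 1, and a size-1 dimension is stretched to match the other.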

In [None]:
# Example 1: Scalar and array
arr = np.array([1, 2, 3, 4])
scalar = 10
print("Array:", arr)
print("Scalar:", scalar)
print("arr + scalar:", arr + scalar)

# Example 2: 1D array and 2D array
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
row = np.array([10, 20, 30])

print("\nMatrix:")
print(matrix)
print("\nRow to add:")
print(row)
print("\nMatrix + row:")
print(matrix + row)

In [None]:
# Example 3: Column broadcasting
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
column = np.array([[100],
                   [200],
                   [300]])

print("Matrix:")
print(matrix)
print("\nColumn to add:")
print(column)
print("\nMatrix + column:")
print(matrix + column)

# Example 4: Broadcasting with both dimensions
a = np.array([[1, 2, 3]])  # 1x3
b = np.array([[10], [20], [30]])  # 3x1

print("\nArray a (1x3):")
print(a)
print("\nArray b (3x1):")
print(b)
print("\na + b (broadcasts to 3x3):")
print(a + b)
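
To check whether two shapes are compatible before running an operation, one option is `np.broadcast_shapes`. This sketch assumes NumPy 1.20 or later, where that function is available; the centering example at the end is an illustration, not part of the cells above.

```python
import numpy as np

# Shapes are aligned from the right; size-1 axes are stretched
print(np.broadcast_shapes((3, 1), (1, 3)))
print(np.broadcast_shapes((2, 3, 4), (4,)))

# Trailing dimensions 3 and 4 are neither equal nor 1, so this fails
try:
    np.ones((2, 3)) + np.ones(4)
except ValueError as exc:
    print("Incompatible shapes:", exc)

# A typical use: subtract each column's mean from a matrix
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0],
                   [7.0, 8.0, 9.0]])
centered = matrix - matrix.mean(axis=0)  # shapes (3, 3) and (3,)
print(centered)
```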

## 8. Statistical Operations

NumPy provides many statistical functions.

In [None]:
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print("Data:", data)
print(f"\nSum: {np.sum(data)}")
print(f"Mean: {np.mean(data)}")
print(f"Median: {np.median(data)}")
print(f"Standard deviation: {np.std(data)}")  # population std (divides by N)
print(f"Variance: {np.var(data)}")            # population variance (divides by N)
print(f"Minimum: {np.min(data)}")
print(f"Maximum: {np.max(data)}")
print(f"Range (max - min): {np.ptp(data)}")

# Percentiles
print(f"\n25th percentile: {np.percentile(data, 25)}")
print(f"50th percentile (median): {np.percentile(data, 50)}")
print(f"75th percentile: {np.percentile(data, 75)}")
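
By default, `np.std` and `np.var` compute the population statistics. When the data is a sample, the `ddof` argument switches to the sample estimate, as in this short sketch:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

population_var = np.var(data)      # divides by N
sample_var = np.var(data, ddof=1)  # divides by N - 1
sample_std = np.std(data, ddof=1)

print(f"Population variance: {population_var}")
print(f"Sample variance: {sample_var:.4f}")
print(f"Sample standard deviation: {sample_std:.4f}")
```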

### Statistical Operations Along Axes

In [None]:
matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

print("Matrix:")
print(matrix)

print(f"\nSum of all elements: {np.sum(matrix)}")
print(f"Sum along axis 0 (one sum per column): {np.sum(matrix, axis=0)}")
print(f"Sum along axis 1 (one sum per row): {np.sum(matrix, axis=1)}")

print(f"\nMean along axis 0: {np.mean(matrix, axis=0)}")
print(f"Mean along axis 1: {np.mean(matrix, axis=1)}")

print(f"\nMax along axis 0: {np.max(matrix, axis=0)}")
print(f"Max along axis 1: {np.max(matrix, axis=1)}")

# Argmax and Argmin (indices)
print(f"\nIndex of max in flattened array: {np.argmax(matrix)}")
print(f"Index of max along axis 0: {np.argmax(matrix, axis=0)}")
print(f"Index of max along axis 1: {np.argmax(matrix, axis=1)}")
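
Two details often come up with axis-wise reductions: keeping the reduced axis so the result broadcasts back against the input, and turning a flattened `argmax` index into row and column coordinates. A minimal sketch, with illustrative variable names:

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# Without keepdims the result has shape (3,), which does not broadcast
# against the (3, 4) matrix; keepdims=True keeps it as (3, 1)
row_means = matrix.mean(axis=1, keepdims=True)
centered = matrix - row_means
print(row_means.shape)
print(centered)

# Convert the flat argmax index into (row, column) coordinates
flat_index = np.argmax(matrix)
row, col = np.unravel_index(flat_index, matrix.shape)
print(flat_index, row, col)
```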

### Cumulative Operations

In [None]:
arr = np.array([1, 2, 3, 4, 5])

print("Array:", arr)
print(f"Cumulative sum: {np.cumsum(arr)}")
print(f"Cumulative product: {np.cumprod(arr)}")

# Differences
print(f"\nDifferences (consecutive): {np.diff(arr)}")
print(f"Second differences: {np.diff(arr, n=2)}")

## 9. Linear Algebra

NumPy provides comprehensive linear algebra operations.

In [None]:
# Matrix multiplication
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print("Matrix A:")
print(A)
print("\nMatrix B:")
print(B)

# Element-wise multiplication
print("\nElement-wise multiplication (A * B):")
print(A * B)

# Matrix multiplication with the @ operator
print("\nMatrix multiplication (A @ B):")
print(A @ B)

# Alternative: np.dot
print("\nMatrix multiplication (np.dot(A, B)):")
print(np.dot(A, B))

# Matrix multiplication with np.matmul
print("\nMatrix multiplication (np.matmul(A, B)):")
print(np.matmul(A, B))
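
For 2D arrays, `@`, `np.matmul`, and `np.dot` give the same result, but they differ for arrays with more than two dimensions. The sketch below compares only the output shapes, using arrays of ones as placeholder data:

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((3, 4))
print((a @ b).shape)  # inner dimensions (3 and 3) must match

batch_a = np.ones((5, 2, 3))
batch_b = np.ones((5, 3, 4))

# @ and np.matmul treat the leading axis as a batch of matrices
print((batch_a @ batch_b).shape)

# np.dot instead pairs every matrix in batch_a with every matrix in batch_b
print(np.dot(batch_a, batch_b).shape)
```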

In [None]:
# More linear algebra operations
matrix = np.array([[1, 2],
                   [3, 4]])

print("Matrix:")
print(matrix)

# Determinant
det = np.linalg.det(matrix)
print(f"\nDeterminant: {det}")

# Inverse
inv = np.linalg.inv(matrix)
print("\nInverse:")
print(inv)

# Verify: matrix @ inverse = identity
print("\nMatrix @ Inverse (should be identity):")
print(matrix @ inv)

# Trace (sum of diagonal elements)
trace = np.trace(matrix)
print(f"\nTrace: {trace}")

# Diagonal elements
diag = np.diag(matrix)
print(f"Diagonal elements: {diag}")

In [None]:
# Eigenvalues and eigenvectors
matrix = np.array([[4, 2],
                   [1, 3]])

print("Matrix:")
print(matrix)

eigenvalues, eigenvectors = np.linalg.eig(matrix)
print("\nEigenvalues:")
print(eigenvalues)
print("\nEigenvectors:")
print(eigenvectors)

# Verify: A @ v = λ * v (each column of eigenvectors is one eigenvector)
print("\nVerification (A @ v1):")
print(matrix @ eigenvectors[:, 0])
print("\nλ1 * v1:")
print(eigenvalues[0] * eigenvectors[:, 0])
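
The check above covers a single eigenpair. As a sketch, all pairs can be verified at once with broadcasting and `np.allclose`, along with two standard identities relating eigenvalues to the trace and determinant:

```python
import numpy as np

matrix = np.array([[4, 2],
                   [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# Column j of eigenvectors pairs with eigenvalues[j];
# broadcasting scales each column by its own eigenvalue
print(np.allclose(matrix @ eigenvectors, eigenvectors * eigenvalues))

# Eigenvalues sum to the trace and multiply to the determinant
print(np.isclose(eigenvalues.sum(), np.trace(matrix)))
print(np.isclose(eigenvalues.prod(), np.linalg.det(matrix)))
```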

In [None]:
# Solving linear systems: Ax = b
A = np.array([[3, 1],
              [1, 2]])
b = np.array([9, 8])

print("Solving Ax = b")
print("A:")
print(A)
print("\nb:")
print(b)

x = np.linalg.solve(A, b)
print("\nSolution x:")
print(x)

# Verify
print("\nVerification (A @ x):")
print(A @ x)
print("Should equal b:")
print(b)
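
Printing `A @ x` next to `b` works for a visual check, but floating-point results are better compared with a tolerance. The sketch below also computes the solution through an explicit inverse for comparison; `np.linalg.solve` is generally the preferred route because it avoids forming the inverse.

```python
import numpy as np

A = np.array([[3, 1],
              [1, 2]])
b = np.array([9, 8])

x = np.linalg.solve(A, b)
x_via_inverse = np.linalg.inv(A) @ b  # works, but is usually slower and less accurate

print(x)
print(np.allclose(A @ x, b))
print(np.allclose(x, x_via_inverse))
```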

## 10. Random Number Generation

NumPy's random module provides various distributions and random operations.

In [None]:
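# Set seed for reproducibility (legacy global-state API; the Generator
# approach shown later in this section is preferred for new code)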
np.random.seed(42)

# Random floats between 0 and 1
random_floats = np.random.random(5)
print("Random floats [0, 1):")
print(random_floats)

# Random integers
random_ints = np.random.randint(1, 100, size=10)
print("\nRandom integers [1, 100):")
print(random_ints)

# Random choice from array
choices = np.random.choice(['A', 'B', 'C', 'D'], size=10)
print("\nRandom choices:")
print(choices)

# Shuffle an array
arr = np.arange(10)
np.random.shuffle(arr)
print("\nShuffled array:")
print(arr)

### Random Distributions

In [None]:
# Normal (Gaussian) distribution
normal = np.random.normal(loc=0, scale=1, size=1000)  # mean=0, std=1
print("Normal distribution (loc=0, scale=1):")
print(f"Mean: {np.mean(normal):.3f}")
print(f"Std: {np.std(normal):.3f}")

# Uniform distribution
uniform = np.random.uniform(low=0, high=10, size=1000)
print("\nUniform distribution [0, 10):")
print(f"Min: {np.min(uniform):.3f}")
print(f"Max: {np.max(uniform):.3f}")
print(f"Mean: {np.mean(uniform):.3f}")

# Binomial distribution
binomial = np.random.binomial(n=10, p=0.5, size=1000)  # 10 trials, 50% probability
print("\nBinomial distribution (n=10, p=0.5):")
print(f"Mean: {np.mean(binomial):.3f}")

# Poisson distribution
poisson = np.random.poisson(lam=3, size=1000)  # lambda=3
print("\nPoisson distribution (λ=3):")
print(f"Mean: {np.mean(poisson):.3f}")

# Exponential distribution
exponential = np.random.exponential(scale=2, size=1000)
print("\nExponential distribution (scale=2):")
print(f"Mean: {np.mean(exponential):.3f}")

### Modern Random Generator (Recommended)

In [None]:
# New recommended way using Generator
rng = np.random.default_rng(seed=42)

# Generate random numbers
rand_floats = rng.random(5)
print("Random floats with Generator:")
print(rand_floats)

rand_ints = rng.integers(1, 100, size=10)
print("\nRandom integers with Generator:")
print(rand_ints)

normal_dist = rng.normal(0, 1, size=5)
print("\nNormal distribution with Generator:")
print(normal_dist)
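
The `Generator` object also covers the choice and shuffle operations shown earlier with the legacy functions. A minimal sketch, with seeds and sizes picked arbitrarily for illustration:

```python
import numpy as np

# Two generators created with the same seed produce the same stream
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)
same_stream = np.array_equal(rng_a.random(3), rng_b.random(3))
print(same_stream)

rng = np.random.default_rng(seed=0)

# Sampling without replacement
sample = rng.choice(np.arange(10), size=4, replace=False)
print(sample)

# permutation returns a shuffled copy; shuffle works in place
shuffled_copy = rng.permutation(10)
arr = np.arange(10)
rng.shuffle(arr)
print(shuffled_copy)
print(arr)
```

Because each `Generator` carries its own state, separate parts of a program can use independent generators without interfering with one another.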

## 11. Performance Tips and Best Practices

Some tips for efficient NumPy usage.

In [None]:
import time

# Vectorization vs loops
size = 1000000

# Using a Python list comprehension
python_list = list(range(size))
start = time.perf_counter()  # perf_counter is the highest-resolution timer for short intervals
result_list = [x**2 for x in python_list]
python_time = time.perf_counter() - start

# Using NumPy vectorization
numpy_array = np.arange(size)
start = time.perf_counter()
result_array = numpy_array**2
numpy_time = time.perf_counter() - start

print(f"Python list comprehension: {python_time:.4f} seconds")
print(f"NumPy vectorization: {numpy_time:.4f} seconds")
print(f"NumPy is {python_time/numpy_time:.1f}x faster")
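
A single timed run can be noisy. One possible refinement is the standard-library `timeit` module, which repeats the measurement; the sketch below uses a smaller array and arbitrary repeat counts, and the measured ratio will vary by machine.

```python
import timeit

import numpy as np

size = 100_000
python_list = list(range(size))
numpy_array = np.arange(size)

# Each measurement runs the statement 5 times; keep the best of 3 repeats
list_time = min(timeit.repeat(lambda: [x**2 for x in python_list],
                              number=5, repeat=3))
array_time = min(timeit.repeat(lambda: numpy_array**2,
                               number=5, repeat=3))

print(f"List comprehension: {list_time:.4f} seconds for 5 runs")
print(f"NumPy vectorization: {array_time:.4f} seconds for 5 runs")
```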

In [None]:
# Memory efficiency: views vs copies
original = np.arange(10)
print("Original array:")
print(original)

# Slicing creates a view (shares memory)
view = original[2:5]
print("\nView (slice):")
print(view)

# Modifying the view affects the original
view[0] = 999
print("\nOriginal after modifying view:")
print(original)

# Create an explicit copy
original = np.arange(10)
copy = original[2:5].copy()
copy[0] = 999
print("\nOriginal after modifying copy:")
print(original)
print("Copy:")
print(copy)
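
When it is unclear whether an operation returned a view or a copy, the `base` attribute and `np.shares_memory` can answer the question directly. A short sketch:

```python
import numpy as np

original = np.arange(10)

view = original[2:5]             # basic slicing: a view
fancy = original[[2, 3, 4]]      # fancy indexing: a copy
explicit = original[2:5].copy()  # explicit copy

print(view.base is original)
print(np.shares_memory(original, view))
print(np.shares_memory(original, fancy))
print(explicit.base is None)
```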

## 12. Advanced Topics Preview

A quick look at some advanced NumPy features.

In [None]:
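# Universal functions (ufuncs) and np.vectorize
arr = np.array([1, 2, 3, 4, 5])

# A function built from arithmetic operators already works element-wise
# on arrays, so my_function(arr) would give the same result here.
def my_function(x):
    return x**2 + 2*x + 1

# np.vectorize wraps a function so that it accepts arrays. It is a
# convenience, not a true ufunc: it loops in Python and brings no speedup.
vectorized_func = np.vectorize(my_function)
result = vectorized_func(arr)
print("Function applied with np.vectorize:")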
print(result)

# Where (conditional selection)
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
result = np.where(arr > 5, 'high', 'low')
print("\nConditional selection with where:")
print(result)

# Clip values
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
clipped = np.clip(arr, 3, 7)  # Limit values between 3 and 7
print("\nClipped values [3, 7]:")
print(clipped)
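
`np.vectorize` is mostly useful for functions that cannot take an array directly, such as ones with a Python `if` on the input. The sketch below uses a hypothetical helper, `keep_if_large`, to show that case next to an equivalent `np.where` expression that avoids the Python-level loop.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

def my_function(x):
    return x**2 + 2*x + 1

# Pure arithmetic works on arrays directly; no wrapper is needed
direct = my_function(arr)
wrapped = np.vectorize(my_function)(arr)
print(np.array_equal(direct, wrapped))

# Hypothetical helper: the scalar `if` would fail on a whole array
def keep_if_large(x):
    return x if x > 3 else 0

looped = np.vectorize(keep_if_large)(arr)

# Same result computed with array operations only
selected = np.where(arr > 3, arr, 0)
print(looped)
print(selected)
```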

In [None]:
# Structured arrays (records)
# Define a structured dtype
dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

# Create structured array
data = np.array([('Alice', 25, 55.5),
                 ('Bob', 30, 75.0),
                 ('Charlie', 35, 80.2)], dtype=dt)

print("Structured array:")
print(data)
print("\nAccess by field name:")
print(data['name'])
print(data['age'])

# Sort by age
sorted_data = np.sort(data, order='age')
print("\nSorted by age:")
print(sorted_data)
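
Fields of a structured array can be used in boolean masks and updated in place, just like ordinary arrays. A minimal sketch that reuses the same dtype and records:

```python
import numpy as np

dt = np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
data = np.array([('Alice', 25, 55.5),
                 ('Bob', 30, 75.0),
                 ('Charlie', 35, 80.2)], dtype=dt)

# Filter records with a mask built from one field
older = data[data['age'] > 28]
print(older['name'])

# A field is a view, so updating it changes the records
data['age'] += 1
print(data['age'])

# Pick the record with the largest weight
heaviest = data[np.argmax(data['weight'])]
print(heaviest['name'])
```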

## Summary

In this comprehensive NumPy tutorial, we covered:

1. **Array Creation**: Various methods including from lists, zeros, ones, ranges, and random generation
2. **Array Properties**: Shape, dimensions, size, data types
3. **Indexing and Slicing**: Basic, boolean, and fancy indexing
4. **Operations**: Arithmetic, mathematical functions, comparisons
5. **Manipulation**: Reshaping, transposing, concatenating, splitting
6. **Broadcasting**: Understanding how NumPy handles arrays of different shapes
7. **Statistics**: Mean, median, std, percentiles, and axis-specific operations
8. **Linear Algebra**: Matrix operations, solving systems, eigenvalues
9. **Random Numbers**: Various distributions and random operations
10. **Performance**: Vectorization benefits and memory management
11. **Advanced Topics**: `np.vectorize`, conditional selection, clipping, structured arrays

### Next Steps
- Explore NumPy with real datasets
- Combine NumPy with Pandas for data analysis
- Use NumPy for machine learning preprocessing
- Learn about SciPy for advanced scientific computing

### Resources
- [Official NumPy Documentation](https://numpy.org/doc/)
- [NumPy User Guide](https://numpy.org/doc/stable/user/index.html)
- [NumPy for MATLAB Users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html)