# NumPy Tutorial: Numerical Computing with Python

NumPy (Numerical Python) is the fundamental package for scientific computing in Python. It provides:
- A powerful N-dimensional array object
- Sophisticated broadcasting functions
- Tools for integrating C/C++ and Fortran code
- Useful linear algebra, Fourier transform, and random number capabilities

## Environment Setup

First, let's check our Python version and ensure NumPy is properly installed.

In [2]:
!python --version

Python 3.12.10


### Install Required Packages

If NumPy is not installed, run the following command:

In [1]:
!pip install numpy






## 1. Importing NumPy

The standard convention is to import NumPy with the alias `np`:

In [3]:
# Import NumPy library
import numpy as np

# Check NumPy version
print(f"NumPy version: {np.__version__}")

# Verify installation
print("NumPy imported successfully!")

NumPy version: 2.3.4
NumPy imported successfully!


## 2. Creating NumPy Arrays

NumPy's main object is the homogeneous multidimensional array (`ndarray`). Let's explore different ways to create arrays:

In [17]:
# Creating arrays from lists
my_array = np.array([1, 2, 3, 4])
print("1D Array:", my_array)
print("Type:", type(my_array))
print("Data type:", my_array.dtype)
print("Shape:", my_array.shape)
print("Size:", my_array.size)


1D Array: [1 2 3 4]
Type: <class 'numpy.ndarray'>
Data type: int64
Shape: (4,)
Size: 4


### Understanding `dtype` vs `type` in NumPy

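This is a common source of confusion. `type(arr)` returns the Python class of the array object, which is always `numpy.ndarray`, while `arr.dtype` describes how each element inside the array is stored (for example `int64`). The cells below print both for several arrays: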
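As a minimal sketch of the `type` versus `dtype` distinction, the snippet below builds two small arrays and compares them. The names `ints`, `floats`, and `as_float` are illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.0, 3.0])

# type() reports the Python class of the container: numpy.ndarray for both
print(type(ints) is type(floats))

# .dtype reports how each element is stored, and differs between the two
print(ints.dtype, floats.dtype)

# astype() returns a new array with another dtype; the container class is unchanged
as_float = ints.astype(np.float64)
print(type(as_float), as_float.dtype)
```

The container class stays the same whatever the array holds; only the `dtype` changes with the element type.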

In [18]:
# Transpose of the 1d array is also a 1d array
transposed_array = my_array.T
print("\nTransposed Array:", transposed_array)
print("Shape of Transposed Array:", transposed_array.shape)
print("Type:", type(transposed_array))
print("Data type:", transposed_array.dtype)
print("Size:", transposed_array.size)


Transposed Array: [1 2 3 4]
Shape of Transposed Array: (4,)
Type: <class 'numpy.ndarray'>
Data type: int64
Size: 4


In [21]:
# 2D Array
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D Array:")
print(matrix)
print("Shape:", matrix.shape)
print("Type:", type(matrix))
print("Data type:", matrix.dtype)
print("Size:", matrix.size)


2D Array:
[[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Type: <class 'numpy.ndarray'>
Data type: int64
Size: 6


In [22]:
# Transpose of the matrix
transposed = matrix.T
print("\nTransposed Matrix:")
print(transposed)
print("Shape of Transposed Matrix:", transposed.shape)
print("Type:", type(transposed))
print("Data type:", transposed.dtype)
print("Size:", transposed.size)



Transposed Matrix:
[[1 4]
 [2 5]
 [3 6]]
Shape of Transposed Matrix: (3, 2)
Type: <class 'numpy.ndarray'>
Data type: int64
Size: 6


In [26]:
# Creating matrix using arrays as rows
row1 = np.array([1, 2, 3])
row2 = np.array([4, 5, 6])
matrix = np.array([row1, row2])
print("\nMatrix from rows:")
print(matrix)
print("Shape:", matrix.shape)
print("Type:", type(matrix))
print("Data type:", matrix.dtype)
print("Size:", matrix.size)


Matrix from rows:
[[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Type: <class 'numpy.ndarray'>
Data type: int64
Size: 6


In [27]:
# Creating matrix using arrays as columns
col1 = np.array([1, 2, 3])
col2 = np.array([4, 5, 6])
matrix = np.array([col1, col2]).T
print("\nMatrix from columns:")
print(matrix) 


Matrix from columns:
[[1 4]
 [2 5]
 [3 6]]


In [37]:
# Different ways to create arrays
print("=== Array Creation Methods ===")

# Using built-in functions
zeros_array = np.zeros(5)
print("Zeros array:", zeros_array)

zeros_matrix = np.zeros((3, 4))  # 3 rows, 4 columns
print("Zeros matrix:\n", zeros_matrix)

zeros_3d = np.zeros((2, 3, 4))  # 2 layers, 3 rows, 4 columns
print("Zeros 3D array:\n", zeros_3d)

zeros_int = np.zeros((2, 3), dtype=int) # 2 rows, 3 columns, integer type
print("Zeros integer array:\n", zeros_int)

existing_array = np.array([[1, 2], [3, 4]]) # 2x2 array
zeros_like = np.zeros_like(existing_array)
print("Zeros like array:\n", zeros_like)


=== Array Creation Methods ===
Zeros array: [0. 0. 0. 0. 0.]
Zeros matrix:
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
Zeros 3D array:
 [[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]
Zeros integer array:
 [[0 0 0]
 [0 0 0]]
Zeros like array:
 [[0 0]
 [0 0]]


In [45]:
ones_array = np.ones((1, 1))  # 1 row, 1 column
print("Ones array:\n", ones_array)

ones_array = np.ones((1, 2))  # 1 row, 2 columns
print("Ones array:\n", ones_array)

ones_array = np.ones((1, 3))  # 1 row, 3 columns
print("Ones array:\n", ones_array)

ones_array = np.ones((2, 3))  # 2 rows, 3 columns
print("Ones array:\n", ones_array)

ones_array = np.ones((3, 3))  # 3 rows, 3 columns
print("Ones array:\n", ones_array)

Ones array:
 [[1.]]
Ones array:
 [[1. 1.]]
Ones array:
 [[1. 1. 1.]]
Ones array:
 [[1. 1. 1.]
 [1. 1. 1.]]
Ones array:
 [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


In [49]:
# Identity Matrices
identity_matrix = np.eye(3)  # 3x3 identity matrix
print("Identity matrix:\n", identity_matrix)

identity_matrix = np.eye(4)  # 4x4 identity matrix  
print("Identity matrix:\n", identity_matrix)

identity_matrix = np.eye(5)  # 5x5 identity matrix
print("Identity matrix:\n", identity_matrix)

identity_matrix = np.eye(6)  # 6x6 identity matrix
print("Identity matrix:\n", identity_matrix)

Identity matrix:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Identity matrix:
 [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
Identity matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Identity matrix:
 [[1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 1.]]


In [51]:
# Range arrays
range_array = np.arange(0, 10, 2)
print("Range array:", range_array)

# Linear space
linspace_array = np.linspace(0, 1, 5)
print("Linspace array:", linspace_array)

# Random arrays
random_array = np.random.random((2, 2))
print("Random array:\n", random_array)

Range array: [0 2 4 6 8]
Linspace array: [0.   0.25 0.5  0.75 1.  ]
Random array:
 [[0.24609668 0.0888877 ]
 [0.39615611 0.49885936]]
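The random values above change on every run. As a hedged alternative to `np.random.random`, the sketch below uses NumPy's `Generator` interface (`np.random.default_rng`), which accepts a seed for repeatable results. The seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator: the same seed yields the same sequence of draws
rng = np.random.default_rng(seed=42)

uniform = rng.random((2, 2))                      # floats in [0.0, 1.0)
integers = rng.integers(0, 10, size=5)            # integers in [0, 10)
normal = rng.normal(loc=0.0, scale=1.0, size=3)   # standard normal draws

# Re-creating the generator with the same seed reproduces the first draw
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random((2, 2))))
```

Keeping the generator in a variable, rather than relying on global state, makes it easier to see which code consumes random numbers.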


## 3. Array Operations and Mathematical Functions

In [52]:
# Basic arithmetic operations
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

print("Array a:", a)
print("Array b:", b)
print("Addition:", a + b)
print("Subtraction:", a - b)
print("Multiplication:", a * b)
print("Division:", a / b)
print("Power:", a ** 2)

# Mathematical functions
print("\n=== Mathematical Functions ===")
print("Square root:", np.sqrt(a))
print("Exponential:", np.exp(a))
print("Logarithm:", np.log(a))
print("Sin:", np.sin(a))
print("Sum:", np.sum(a))
print("Mean:", np.mean(a))
print("Standard deviation:", np.std(a))

Array a: [1 2 3 4]
Array b: [5 6 7 8]
Addition: [ 6  8 10 12]
Subtraction: [-4 -4 -4 -4]
Multiplication: [ 5 12 21 32]
Division: [0.2        0.33333333 0.42857143 0.5       ]
Power: [ 1  4  9 16]

=== Mathematical Functions ===
Square root: [1.         1.41421356 1.73205081 2.        ]
Exponential: [ 2.71828183  7.3890561  20.08553692 54.59815003]
Logarithm: [0.         0.69314718 1.09861229 1.38629436]
Sin: [ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
Sum: 10
Mean: 2.5
Standard deviation: 1.118033988749895
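One detail worth a quick check: `np.std` computes the population standard deviation by default (it divides by N). The sketch below, using the same four values as above, shows how the `ddof` argument switches to the sample estimate (divides by N - 1). The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

population_std = np.std(a)        # divides by N (the default, ddof=0)
sample_std = np.std(a, ddof=1)    # divides by N - 1

print("Population std:", population_std)
print("Sample std:", sample_std)
```

The sample estimate comes out slightly larger (about 1.29 here), and the gap shrinks as the array grows.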


## 4. Array Indexing and Slicing

In [53]:
# Array indexing and slicing
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print("Original array:", arr)

# Basic indexing
print("First element:", arr[0])
print("Last element:", arr[-1])

# Slicing
print("First 3 elements:", arr[:3])
print("Elements 3 to 7:", arr[3:8])
print("Every second element:", arr[::2])

# 2D array indexing
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n2D Array:")
print(matrix)
print("Element at (1,2):", matrix[1, 2])
print("First row:", matrix[0, :])
print("Second column:", matrix[:, 1])

# Boolean indexing
bool_index = arr > 5
print("\nBoolean indexing (elements > 5):")
print(arr[bool_index])

Original array: [0 1 2 3 4 5 6 7 8 9]
First element: 0
Last element: 9
First 3 elements: [0 1 2]
Elements 3 to 7: [3 4 5 6 7]
Every second element: [0 2 4 6 8]

2D Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Element at (1,2): 6
First row: [1 2 3]
Second column: [2 5 8]

Boolean indexing (elements > 5):
[6 7 8 9]
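The indexing styles above differ in one way that the output does not show: a basic slice is a view onto the original data, while boolean indexing returns a copy. The following sketch illustrates this with hypothetical names `view` and `picked`.

```python
import numpy as np

arr = np.arange(10)

view = arr[2:5]          # basic slice: shares memory with arr
view[0] = 99             # writes through to arr[2]
print(arr[2])            # 99

picked = arr[arr > 5]    # boolean indexing: an independent copy
picked[:] = -1           # arr is not affected
print(arr)

print(np.shares_memory(arr, view), np.shares_memory(arr, picked))
```

When an independent slice is needed, calling `.copy()` on it avoids accidental writes to the original array.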


## 5. Array Reshaping and Manipulation

In [None]:
# Array reshaping
arr = np.arange(12)
print("Original array:", arr)
print("Shape:", arr.shape)

# Reshape to 2D
reshaped = arr.reshape(3, 4)
print("\nReshaped to 3x4:")
print(reshaped)

# Reshape to 3D
reshaped_3d = arr.reshape(2, 2, 3)
print("\nReshaped to 2x2x3:")
print(reshaped_3d)

# Flatten array
flattened = reshaped.flatten()
print("\nFlattened:", flattened)

# Transpose
print("\nTranspose:")
print(reshaped.T)

# Array stacking
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("\nVertical stack:")
print(np.vstack((a, b)))

print("\nHorizontal stack:")
print(np.hstack((a, b)))
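A few reshaping details are easy to miss, so here is a small sketch under the same setup as the cell above: `-1` lets NumPy infer one dimension, `flatten` always copies while `ravel` returns a view when it can, and `reshape` fails if the element count does not match. The names `auto`, `flat_copy`, and `flat_view` are illustrative.

```python
import numpy as np

arr = np.arange(12)

auto = arr.reshape(3, -1)    # -1 is inferred from the size: shape becomes (3, 4)
print(auto.shape)

flat_copy = auto.flatten()   # always a new copy
flat_view = auto.ravel()     # a view here, because the data is contiguous
print(np.shares_memory(auto, flat_copy), np.shares_memory(auto, flat_view))

# 12 elements cannot fill a 5x3 array, so reshape raises ValueError
try:
    arr.reshape(5, 3)
except ValueError as exc:
    print("reshape failed:", exc)
```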

## 6. Linear Algebra and Statistics

In [None]:
# Linear algebra operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print("Matrix A:")
print(A)
print("\nMatrix B:")
print(B)

# Matrix multiplication
print("\nMatrix multiplication (A @ B):")
print(A @ B)

# np.dot on two 2-D arrays is also matrix multiplication (same result as A @ B)
print("\nnp.dot(A, B):")
print(np.dot(A, B))

# Determinant
print("\nDeterminant of A:", np.linalg.det(A))

# Eigenvalues and eigenvectors
eigenvals, eigenvecs = np.linalg.eig(A)
print("\nEigenvalues:", eigenvals)
print("Eigenvectors:\n", eigenvecs)

# Statistical functions
data = np.random.normal(0, 1, 1000)  # Generate random data
print("\n=== Statistical Analysis ===")
print(f"Mean: {np.mean(data):.3f}")
print(f"Median: {np.median(data):.3f}")
print(f"Standard deviation: {np.std(data):.3f}")
print(f"Variance: {np.var(data):.3f}")
print(f"Min: {np.min(data):.3f}")
print(f"Max: {np.max(data):.3f}")
print(f"25th percentile: {np.percentile(data, 25):.3f}")
print(f"75th percentile: {np.percentile(data, 75):.3f}")
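To make the eigenvalue output easier to interpret, the sketch below checks the defining property `A @ v == lambda * v` for each column of the eigenvector matrix, using the same matrix `A` as the cell above. It also solves a small linear system with `np.linalg.solve`; the right-hand side `rhs` is a made-up example.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
eigenvals, eigenvecs = np.linalg.eig(A)

# Each COLUMN of eigenvecs is an eigenvector paired with eigenvals[k]
for k in range(len(eigenvals)):
    v = eigenvecs[:, k]
    print(np.allclose(A @ v, eigenvals[k] * v))

# Solve A @ x = rhs directly rather than computing the inverse of A
rhs = np.array([5, 6])
x = np.linalg.solve(A, rhs)
print("Solution x:", x)
print(np.allclose(A @ x, rhs))
```

Using `np.allclose` instead of `==` matters here because the results are floating-point values with small rounding errors.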

## 7. Practical Example: Data Analysis

Let's apply NumPy to a realistic scenario: analyzing simulated student grades.

In [None]:
# Student grades analysis
np.random.seed(42)  # For reproducible results

# Generate sample data: 50 students, 5 subjects
students = 50
subjects = 5
grades = np.random.normal(75, 15, (students, subjects))  # Mean=75, StdDev=15
grades = np.clip(grades, 0, 100)  # Ensure grades are between 0-100

subject_names = ['Math', 'Physics', 'Chemistry', 'Biology', 'English']

print("=== Student Grades Analysis ===")
print(f"Data shape: {grades.shape}")
print("Sample of first 5 students:")
print(grades[:5])

# Calculate statistics for each subject
print("\n=== Subject-wise Statistics ===")
for i, subject in enumerate(subject_names):
    print(f"\n{subject}:")
    print(f"  Mean: {np.mean(grades[:, i]):.2f}")
    print(f"  Std:  {np.std(grades[:, i]):.2f}")
    print(f"  Min:  {np.min(grades[:, i]):.2f}")
    print(f"  Max:  {np.max(grades[:, i]):.2f}")

# Student performance analysis
student_averages = np.mean(grades, axis=1)
print("\n=== Student Performance ===")
print(f"Class average: {np.mean(student_averages):.2f}")
print(f"Best student average: {np.max(student_averages):.2f}")
print(f"Lowest student average: {np.min(student_averages):.2f}")

# Find students above average
above_average = student_averages > np.mean(student_averages)
print(f"Students above class average: {np.sum(above_average)}")

# Subject correlations (simplified)
correlation_matrix = np.corrcoef(grades.T)
print("\n=== Subject Correlation Matrix ===")
for i, subj1 in enumerate(subject_names):
    print(f"{subj1:10}", end=" ")
    for j, subj2 in enumerate(subject_names):
        print(f"{correlation_matrix[i,j]:6.3f}", end=" ")
    print()
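The per-subject loop above can also be written with the `axis` argument, which computes a statistic along one dimension in a single call. This is a sketch on stand-in data of the same shape; it uses `np.random.default_rng` rather than the `np.random.seed` call in the cell above, so the numbers will differ from that cell.

```python
import numpy as np

rng = np.random.default_rng(42)   # stand-in data, not the notebook's exact values
grades = np.clip(rng.normal(75, 15, (50, 5)), 0, 100)

subject_means = grades.mean(axis=0)   # collapse the student axis -> shape (5,)
student_means = grades.mean(axis=1)   # collapse the subject axis -> shape (50,)
print(subject_means.shape, student_means.shape)

# Column index of each student's best subject, and row index of the top student
best_subject = grades.argmax(axis=1)
top_student = student_means.argmax()
print(best_subject[:5], top_student)
```

A useful way to remember the convention: the axis you pass is the one that disappears from the result's shape.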

## 8. Performance Tips and Best Practices

### Key NumPy Advantages:
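- **Vectorized Operations**: Element-wise work runs in compiled code, which is typically far faster than an equivalent Python loop on large arrays
- **Memory Efficient**: Elements are stored as fixed-size values in one contiguous buffer rather than as separate Python objects
- **Broadcasting**: Arrays with compatible but different shapes are combined without explicit loops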

### Best Practices:
1. Use vectorized operations instead of loops
2. Prefer built-in functions over manual calculations
3. Use appropriate data types to save memory
4. Take advantage of broadcasting for operations on different shaped arrays
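As a rough illustration of the third practice, the sketch below stores the same values as 64-bit and 32-bit integers and compares their sizes. The array contents and names are arbitrary; whether a narrower type is safe depends on the range of your own data.

```python
import numpy as np

values = np.arange(1_000_000, dtype=np.int64)
small = values.astype(np.int32)   # safe here: every value fits in 32 bits

print(values.nbytes, small.nbytes)   # 8000000 4000000

# Narrow integer types wrap around on overflow in array arithmetic,
# so check the representable range before downcasting
print(np.iinfo(np.int32).min, np.iinfo(np.int32).max)
```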

In [None]:
# Performance comparison: NumPy vs Pure Python
import time

# Create large arrays for testing
size = 1000000
a = np.random.random(size)
b = np.random.random(size)

# Convert to Python lists for comparison
a_list = a.tolist()
b_list = b.tolist()

print("=== Performance Comparison ===")

# NumPy vectorized operation
# perf_counter() is a high-resolution clock; time.time() can be too coarse
# to measure a step this short and may report 0.0 on some systems
start = time.perf_counter()
result_numpy = a + b
numpy_time = time.perf_counter() - start
print(f"NumPy addition: {numpy_time:.6f} seconds")

# Pure Python loop over the equivalent lists
start = time.perf_counter()
result_python = [x + y for x, y in zip(a_list, b_list)]
python_time = time.perf_counter() - start
print(f"Python loop: {python_time:.6f} seconds")

print(f"NumPy is {python_time/numpy_time:.1f}x faster!")

# Memory usage comparison
print("\nMemory usage:")
print(f"NumPy array: {a.nbytes} bytes")
# On 64-bit CPython each list element costs a 24-byte float object plus an 8-byte pointer
print(f"Python list: {len(a_list) * (24 + 8)} bytes (approximate)")

# Broadcasting example
matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])

print("\n=== Broadcasting Example ===")
print("Matrix:")
print(matrix)
print("Vector:", vector)
print("Matrix + Vector (broadcasted):")
print(matrix + vector)
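The broadcasting example above works because shapes are compared from right to left, and each pair of dimensions must either match or contain a 1. The sketch below extends it with a column vector and a shape that fails; `row` and `col` are illustrative names.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,), treated as (1, 3)
col = np.array([[100], [200]])              # shape (2, 1)

print(matrix + row)   # row is added to every row of matrix
print(matrix + col)   # col is added to every column of matrix

# broadcast_shapes reports the result shape without doing any arithmetic
print(np.broadcast_shapes((2, 3), (3,)), np.broadcast_shapes((2, 3), (2, 1)))

# (2, 3) with (2,): the trailing dimensions 3 and 2 do not match
try:
    matrix + np.array([1, 2])
except ValueError as exc:
    print("broadcast failed:", exc)
```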

## Summary

This notebook covered the essential concepts of NumPy:

1. **Array Creation**: Different methods to create arrays
2. **Mathematical Operations**: Arithmetic and advanced functions
3. **Indexing and Slicing**: Selecting elements, slices, and filtered subsets
4. **Reshaping**: Changing array dimensions and structure
5. **Linear Algebra and Statistics**: Matrix operations and descriptive statistics
6. **Practical Applications**: Real-world data analysis example
7. **Performance**: Why NumPy is faster than pure Python

### Next Steps:
- Explore more advanced NumPy functions
- Learn about pandas for data manipulation
- Study matplotlib for data visualization
- Practice with real datasets

### Resources:
- [NumPy Documentation](https://numpy.org/doc/)
- [NumPy User Guide](https://numpy.org/doc/stable/user/)
- [Scientific Python Ecosystem](https://scipy.org/)