# 🧑‍💻 NumPy Complete Guided Project
**Instructor / Student Colab Notebook** – covers *all* key concepts from `Numpy‑1` to `Numpy‑5`.

*Generated: 08 Aug 2025*


**Table of Contents**

1. [Setup](#setup)  
2. [Array Creation & Dtypes](#creation)  
3. [Array Attributes & Inspection](#attributes)  
4. [Indexing, Slicing, Fancy Indexing](#indexing)  
5. [Reshaping, Transpose & Copies vs Views](#reshape)  
6. [Joining, Splitting, Set & Sorting Ops](#join)  
7. [Arithmetic Ops, Universal Functions](#arithmetic)  
8. [Broadcasting (Rules + Examples)](#broadcast)  
9. [Statistics & Aggregations](#stats)  
10. [Random Numbers & Reproducibility](#random)  
11. [Structured / Recarrays](#structured)  
12. [Linear Algebra Essentials](#linalg)  
13. [File I/O (`npy`, `npz`, `txt`)](#io)  
14. [Datetime64 & Timedelta64](#datetime)  
15. [Masked Arrays & NaNs](#mask)  
16. [Mini‑Project — Fitness Data Analysis](#project)  
17. [Conclusion & Next Steps](#conclusion)  


## <a name='setup'></a>1️⃣ Setup

In [2]:
import numpy as np, math, os, pathlib, types, textwrap, random
print('NumPy version:', np.__version__)

NumPy version: 2.0.2


## <a name='creation'></a>2️⃣ Array Creation & Dtypes

Key functions: `np.array`, `np.arange`, `np.linspace`, `zeros`, `ones`, `full`, `eye`, `identity`, `diag`, `empty`

In [3]:
# EXAMPLE
arr1 = np.array([1, 2, 3], dtype=np.int32)
arr2 = np.linspace(0, 1, 6)
arr3 = np.full((2,3), 7.5)
print(arr1, arr2, arr3, sep="\n")
print("dtypes:", arr1.dtype, arr2.dtype)


[1 2 3]
[0.  0.2 0.4 0.6 0.8 1. ]
[[7.5 7.5 7.5]
 [7.5 7.5 7.5]]
dtypes: int32 float64
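
The example above exercises `np.array`, `np.linspace` and `np.full`. The sketch below touches the remaining constructors from the list. The variable names are illustrative only.

```python
import numpy as np

steps = np.arange(0, 10, 2)        # start, stop (exclusive), step
zeros = np.zeros((2, 3))           # float64 by default
ones = np.ones(4, dtype=np.int8)   # explicit dtype
ident = np.eye(3)                  # 3x3 identity; np.identity(3) gives the same values
diag = np.diag([1, 2, 3])          # 1-D input -> square matrix with that diagonal
scratch = np.empty((2, 2))         # uninitialised memory: contents are arbitrary

print(steps, zeros.dtype, ones.dtype, sep="\n")
print(np.array_equal(ident, np.identity(3)))
print(diag)
```

Note that `np.empty` only allocates memory, so its values should be overwritten before they are read.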


In [20]:
# 🖊️ TODO: create a 10×10 chessboard pattern using zeros & ones
a1=np.ones((10,10),dtype=np.int32)
a1[::2,0::2]=0
a1[1::2,1::2]=0
print(a1)


[[0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]]


## <a name='attributes'></a>3️⃣ Array Attributes & Inspection

`shape`, `ndim`, `size`, `dtype`, `itemsize`, `nbytes`

In [17]:
M = np.arange(12).reshape(3,4)
print('shape', M.shape, 'ndim', M.ndim, 'size', M.size, 'itemsize', M.itemsize, 'total bytes', M.nbytes)


shape (3, 4) ndim 2 size 12 itemsize 8 total bytes 96


In [22]:
# 🖊️ TODO: check memory footprint of a 1000×1000 float64 array
M = np.zeros((1000,1000), dtype=np.float64)  # float64 as the TODO asks; np.arange would give int64
print('shape', M.shape, 'ndim', M.ndim, 'size', M.size, 'itemsize', M.itemsize, 'total bytes', M.nbytes)

shape (1000, 1000) ndim 2 size 1000000 itemsize 8 total bytes 8000000


## <a name='indexing'></a>4️⃣ Indexing, Slicing & Fancy Indexing

In [21]:
a = np.arange(1,26).reshape(5,5)
print(a[:, 0])     # first column
print(a[::2, ::2]) # every 2nd row/col
mask = (a % 3 == 0)
print('multiples of 3:', a[mask])


[ 1  6 11 16 21]
[[ 1  3  5]
 [11 13 15]
 [21 23 25]]
multiples of 3: [ 3  6  9 12 15 18 21 24]


In [32]:
# 🖊️ TODO: use fancy indexing to swap first and last rows of `a`
a=np.arange(1,26).reshape(5,5)
a[[0,4]]=a[[4,0]]
print(a)

[[21 22 23 24 25]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [ 1  2  3  4  5]]


## <a name='reshape'></a>5️⃣ Reshaping, Transpose & Copies vs Views

In [33]:
b = np.arange(8)
B = b.reshape(2,4)
B[0,0] = 99
print('b is modified:', b)
C = b.reshape(2,4).copy()
C[0,0] = -1
print('b unchanged with copy:', b)


b is modified: [99  1  2  3  4  5  6  7]
b unchanged with copy: [99  1  2  3  4  5  6  7]


In [47]:
# 🖊️ TODO: Flatten a 3‑D array into 1‑D using both `ravel` and `flatten`; observe copy vs view.
b=np.arange(24).reshape(2,3,4)
print(b.ravel())
print(b.flatten())

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
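
Both calls print the same values, so the copy-vs-view difference is not visible from the output alone. A minimal sketch that makes it observable, assuming a contiguous array like the one above, uses `np.shares_memory`:

```python
import numpy as np

b = np.arange(24).reshape(2, 3, 4)
r = b.ravel()     # returns a view when the data is contiguous (as it is here)
f = b.flatten()   # always returns a copy

print("ravel shares memory:", np.shares_memory(b, r))
print("flatten shares memory:", np.shares_memory(b, f))

r[0] = 99   # writes through to b
f[1] = -1   # does not touch b
print(b[0, 0, :2])
```

For non-contiguous input (for example a transposed array), `ravel` may have to copy as well, so the check is worth running rather than assuming.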


## <a name='join'></a>6️⃣ Joining, Splitting, Set & Sorting Ops

In [None]:
x = np.array([1,3,5]); y = np.array([2,4,6])
xy = np.concatenate([x,y])
print('union', np.union1d(x,y))
print('intersect', np.intersect1d(xy,[1,2,10]))
print('sorted descending', np.sort(xy)[::-1])


In [42]:
# 🖊️ TODO: split `xy` back into two equal halves using `np.array_split`
xy=np.array([1,3,5,2,4,6])  # same values as `xy` from the cell above
print(np.array_split(xy,2))

[array([1, 3, 5]), array([2, 4, 6])]


## <a name='arithmetic'></a>7️⃣ Arithmetic Ops & Universal Functions

In [43]:
v = np.arange(5)
print('exp', np.exp(v))
print('sin', np.sin(v))
print('vectorised addition', v + 10)


exp [ 1.          2.71828183  7.3890561  20.08553692 54.59815003]
sin [ 0.          0.84147098  0.90929743  0.14112001 -0.7568025 ]
vectorised addition [10 11 12 13 14]


In [46]:
# 🖊️ TODO: given degrees [0,30,45,60,90], compute radians and sin values.
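deg=np.array([0,30,45,60,90])
rad=np.deg2rad(deg)  # np.sin expects radians, not degrees
print('radians', rad)
print('sin', np.sin(rad))

radians [0.         0.52359878 0.78539816 1.04719755 1.57079633]
sin [0.         0.5        0.70710678 0.8660254  1.        ]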


## <a name='broadcast'></a>8️⃣ Broadcasting Rules

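Rules: compare shapes dimension by dimension, from right to left. Two dimensions are compatible when they are equal or when one of them is 1; a size-1 dimension is stretched to match the other. Any other mismatch raises a `ValueError`.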
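
The rules can be checked without building large arrays. This is a small sketch using `np.broadcast_shapes` (available in recent NumPy versions); the shapes are arbitrary examples.

```python
import numpy as np

# Shapes are aligned on the right; a missing leading dimension counts as 1.
print(np.broadcast_shapes((3, 1), (5,)))       # (3, 5)
print(np.broadcast_shapes((4, 1, 6), (5, 1)))  # (4, 5, 6)

# 3 vs 4 in the last dimension: neither is 1, so broadcasting fails.
error_message = None
try:
    np.ones((2, 3)) + np.ones((4,))
except ValueError as exc:
    error_message = str(exc)
print("ValueError:", error_message)
```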

In [None]:
row = np.arange(5)
col = np.arange(3).reshape(3,1)
matrix = row + col  # broadcast to 3×5
print(matrix)


In [None]:
# 🖊️ TODO: use broadcasting to create a 10×10 multiplication table.
a=np.arange(1,11)
b=np.arange(1,11).reshape(10,1)
print(a*b)

## <a name='stats'></a>9️⃣ Statistics & Aggregations

In [None]:
data = np.random.default_rng(0).integers(1, 100, size=(5,4))
print('data\n', data)
print('row sums', data.sum(axis=1))
print('col means', data.mean(axis=0))


In [None]:
# 🖊️ TODO: compute `np.percentile` (25th, 50th, 75th) of flattened `data`.
q25,q50,q75=np.percentile(data,[25,50,75])  # no axis given, so `data` is flattened
print(q25,q50,q75)

## <a name='random'></a>🔟 Random Numbers & Reproducibility

In [None]:
rng = np.random.default_rng(42)
rand_floats = rng.random(5)
rand_ints = rng.integers(low=10, high=50, size=5)
print(rand_floats, rand_ints)
rng2 = np.random.default_rng(42)
assert np.allclose(rand_floats, rng2.random(5))


In [None]:
# 🖊️ TODO: simulate rolling a fair six‑sided die 100 times; estimate proportion of 6s.
a=np.random.default_rng(42).integers(1,7,size=100)
print(np.count_nonzero(a==6)/100)

## <a name='structured'></a>1️⃣1️⃣ Structured / Record Arrays

In [None]:
people = np.array([('Alice', 25, 55.0), ('Bob', 30, 85.5)],
                   dtype=[('name','U10'), ('age','i4'), ('weight','f4')])
print(people['name'], people['age'].mean())


In [66]:
# 🖊️ TODO: add a new field 'height' to the structured array using `np.lib.recfunctions.append_fields` (hint: no pip install needed; it ships with NumPy, but import `numpy.lib.recfunctions` explicitly).
a=np.array([('Alice', 25, 55.0), ('Bob', 30, 85.5)],
                   dtype=[('name','U10'), ('age','i4'), ('weight','f4')])
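
One possible way to finish this TODO is sketched below. The height values are hypothetical, made up for the example, and `people_h` is an illustrative name.

```python
import numpy as np
from numpy.lib import recfunctions as rfn  # bundled with NumPy

people = np.array([('Alice', 25, 55.0), ('Bob', 30, 85.5)],
                  dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

# Hypothetical heights in cm, invented for this sketch.
heights = np.array([165.0, 180.0], dtype='f4')

# usemask=False returns a plain structured array instead of a masked one.
people_h = rfn.append_fields(people, 'height', heights, usemask=False)
print(people_h.dtype.names)
print(people_h['height'])
```

`append_fields` builds a new array; the original `people` array keeps its three fields.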

## <a name='linalg'></a>1️⃣2️⃣ Linear Algebra Essentials

In [None]:
A = np.random.random((3,3))
b = np.random.random(3)
x = np.linalg.solve(A, b)
print('A·x ≈ b?', np.allclose(A.dot(x), b))


In [65]:
# 🖊️ TODO: compute eigenvalues of `A` using `np.linalg.eig`.
a=np.random.random((3,3))
print(np.linalg.eig(a))

EigResult(eigenvalues=array([ 1.53780157, -0.32216775, -0.55249817]), eigenvectors=array([[-0.47069545, -0.60707724,  0.0707326 ],
       [-0.55323183,  0.78484461, -0.77160626],
       [-0.68729931, -0.12440322,  0.63215559]]))
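
To sanity-check an eigendecomposition, each column of the eigenvector matrix should satisfy A·v = λ·v. A minimal sketch, assuming a seeded random matrix (the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for repeatability
A = rng.random((3, 3))

vals, vecs = np.linalg.eig(A)    # column vecs[:, i] pairs with vals[i]

# A @ v_i should equal lambda_i * v_i for every column;
# broadcasting `vecs * vals` scales column i by vals[i].
print(np.allclose(A @ vecs, vecs * vals))
```

For a general real matrix the eigenvalues can be complex; `np.allclose` handles that case too.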


## <a name='io'></a>1️⃣3️⃣ File I/O (`npy`, `npz`, `txt`)

In [None]:
np.save('array.npy', A)
loaded = np.load('array.npy')
print('loaded equals A?', np.allclose(loaded, A))
np.savez('multi_arrays.npz', A=A, b=b)
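
Reading the `.npz` archive back is not shown above. A short sketch follows, using stand-in arrays and a temporary directory so that it is self-contained; the keyword names passed to `np.savez` become the keys of the archive.

```python
import os
import tempfile
import numpy as np

A = np.arange(9.0).reshape(3, 3)   # stand-in arrays for this sketch
b = np.arange(3.0)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'multi_arrays.npz')
    np.savez(path, A=A, b=b)
    with np.load(path) as archive:      # NpzFile is a lazy, dict-like reader
        names = sorted(archive.files)
        A_loaded = archive['A']
        b_loaded = archive['b']

print(names)
print(np.array_equal(A_loaded, A), np.array_equal(b_loaded, b))
```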


In [64]:
# 🖊️ TODO: Use `np.savetxt` to write `data` (from stats section) to CSV then reload with `np.loadtxt`.
a=np.random.default_rng(0).integers(1, 100, size=(5,4))
np.savetxt('array.csv',a,delimiter=',',fmt='%d')  # comma-separated; the default delimiter is a space
print(np.loadtxt('array.csv',delimiter=','))

[[85. 64. 51. 27.]
 [31.  5.  8.  2.]
 [18. 81. 65. 91.]
 [50. 61. 97. 73.]
 [63. 54. 56. 93.]]


## <a name='datetime'></a>1️⃣4️⃣ Datetime64 & Timedelta64

In [61]:
dates = np.arange('2023-01', '2023-04', dtype='datetime64[D]')
delta = dates[1:] - dates[:-1]
print(dates[:5], delta[0])


['2023-01-01' '2023-01-02' '2023-01-03' '2023-01-04' '2023-01-05'] 1 days


In [67]:
# 🖊️ TODO: find how many Mondays appear in `dates` array.
a=np.arange('2023-01', '2023-04', dtype='datetime64[D]')
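# NumPy arrays have no `.weekday`; a weekmask of 'Mon' makes is_busday flag Mondays only
print(np.count_nonzero(np.is_busday(a,weekmask='Mon')))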

## <a name='mask'></a>1️⃣5️⃣ Masked Arrays & NaNs

In [59]:
arr = np.array([1, 2, np.nan, 4, np.nan])
masked = np.ma.masked_invalid(arr)
print(masked.mean())


2.3333333333333335


In [58]:
# 🖊️ TODO: replace NaNs with column means in a 2‑D array containing NaNs.
a=np.array([[1, 2, np.nan, 4, np.nan],[1, 2, 3, 4, 5]])
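
One way to complete this TODO is sketched here with `np.nanmean` and `np.where`; `col_means` and `filled` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2, np.nan, 4, np.nan],
              [1, 2, 3, 4, 5]])

col_means = np.nanmean(a, axis=0)             # per-column mean, skipping NaNs
filled = np.where(np.isnan(a), col_means, a)  # col_means broadcasts across rows
print(filled)
```

A column made up entirely of NaNs has no defined mean, so `np.nanmean` would warn and leave NaN in that column.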

## <a name='project'></a>1️⃣6️⃣ Mini‑Project: Fitness Data Analysis

Load `fitness.txt` (tab‑separated) then follow prompts.

In [53]:

fitness = np.genfromtxt('fitness.txt', delimiter='\t', dtype=None, encoding=None, names=True)
print('columns:', fitness.dtype.names, 'rows:', len(fitness))

columns: ('date', 'step_count', 'mood', 'calories_burned', 'hours_of_sleep', 'bool_of_active', 'weight_kg') rows: 96


In [56]:
# 🖊️ TODO: Monthly step count, sleep vs mood correlation, weekly summary, etc.
a=np.genfromtxt('fitness.txt', delimiter='\t', dtype=None, encoding=None, names=True)
print(a)

[('06-10-2017', 5464, 200, 181, 5,   0, 66)
 ('07-10-2017', 6041, 100, 197, 8,   0, 66)
 ('08-10-2017',   25, 100,   0, 5,   0, 66)
 ('09-10-2017', 5461, 100, 174, 4,   0, 66)
 ('10-10-2017', 6915, 200, 223, 5, 500, 66)
 ('11-10-2017', 4545, 100, 149, 6,   0, 66)
 ('12-10-2017', 4340, 100, 140, 6,   0, 66)
 ('13-10-2017', 1230, 100,  38, 7,   0, 66)
 ('14-10-2017',   61, 100,   1, 5,   0, 66)
 ('15-10-2017', 1258, 100,  40, 6,   0, 65)
 ('16-10-2017', 3148, 100, 101, 8,   0, 65)
 ('17-10-2017', 4687, 100, 152, 5,   0, 65)
 ('18-10-2017', 4732, 300, 150, 6, 500, 65)
 ('19-10-2017', 3519, 100, 113, 7,   0, 65)
 ('20-10-2017', 1580, 100,  49, 5,   0, 65)
 ('21-10-2017', 2822, 100,  86, 6,   0, 65)
 ('22-10-2017',  181, 100,   6, 8,   0, 65)
 ('23-10-2017', 3158, 200,  99, 5,   0, 65)
 ('24-10-2017', 4383, 200, 143, 4,   0, 64)
 ('25-10-2017', 3881, 200, 125, 5,   0, 64)
 ('26-10-2017', 4037, 200, 129, 6,   0, 64)
 ('27-10-2017',  202, 200,   6, 8,   0, 64)
 ('28-10-2017',  292, 200,   9, 
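
A possible starting point for two of the prompts (monthly step count, sleep vs mood correlation) is sketched below. It assumes the `date` column holds `DD-MM-YYYY` strings, as the printed rows suggest, and uses the first five printed rows as a stand-in for the full file. The helper names `monthly_steps` and `sleep_mood_corr` are hypothetical.

```python
import numpy as np

# First five rows of the printed output, standing in for the full file.
sample = np.array(
    [('06-10-2017', 5464, 200, 181, 5, 0, 66),
     ('07-10-2017', 6041, 100, 197, 8, 0, 66),
     ('08-10-2017', 25, 100, 0, 5, 0, 66),
     ('09-10-2017', 5461, 100, 174, 4, 0, 66),
     ('10-10-2017', 6915, 200, 223, 5, 500, 66)],
    dtype=[('date', 'U10'), ('step_count', 'i8'), ('mood', 'i8'),
           ('calories_burned', 'i8'), ('hours_of_sleep', 'i8'),
           ('bool_of_active', 'i8'), ('weight_kg', 'i8')])

def monthly_steps(table):
    """Total step_count per 'YYYY-MM', assuming 'DD-MM-YYYY' date strings."""
    months = np.array([f"{d[6:10]}-{d[3:5]}" for d in table['date']])
    labels, inverse = np.unique(months, return_inverse=True)
    totals = np.bincount(inverse, weights=table['step_count'])
    return dict(zip(labels.tolist(), totals.tolist()))

def sleep_mood_corr(table):
    """Pearson correlation between hours_of_sleep and mood."""
    return np.corrcoef(table['hours_of_sleep'], table['mood'])[0, 1]

print(monthly_steps(sample))
print(sleep_mood_corr(sample))
```

The same functions should accept the `fitness` array loaded by `np.genfromtxt`, provided its field names match those printed above.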

## <a name='conclusion'></a>1️⃣7️⃣ Conclusion & Further Practice
Congrats on covering **all core NumPy topics** from your five lecture notebooks!

*Keep experimenting, read the official docs, and try converting your NumPy pipelines into Pandas or JAX for more fun.*