# 🧑‍💻 NumPy Complete Guided Project
**Instructor / Student Colab Notebook** – covers *all* key concepts from `Numpy‑1` to `Numpy‑5`.

*Generated: 08 Aug 2025*


**Table of Contents**

1. [Setup](#setup)  
2. [Array Creation & Dtypes](#creation)  
3. [Array Attributes & Inspection](#attributes)  
4. [Indexing, Slicing & Fancy Indexing](#indexing)  
5. [Reshaping, Transpose & Copies vs Views](#reshape)  
6. [Joining, Splitting, Set & Sorting Ops](#join)  
7. [Arithmetic Ops & Universal Functions](#arithmetic)  
8. [Broadcasting Rules](#broadcast)  
9. [Statistics & Aggregations](#stats)  
10. [Random Numbers & Reproducibility](#random)  
11. [Structured / Record Arrays](#structured)  
12. [Linear Algebra Essentials](#linalg)  
13. [File I/O (`npy`, `npz`, `txt`)](#io)  
14. [Datetime64 & Timedelta64](#datetime)  
15. [Masked Arrays & NaNs](#mask)  
16. [Mini‑Project: Fitness Data Analysis](#project)  
17. [Conclusion & Further Practice](#conclusion)  


## <a name='setup'></a>1️⃣ Setup

In [None]:
import numpy as np  # the only import the cells below rely on
print('NumPy version:', np.__version__)

NumPy version: 2.0.2


## <a name='creation'></a>2️⃣ Array Creation & Dtypes

Key functions: `np.array`, `np.arange`, `np.linspace`, `np.zeros`, `np.ones`, `np.full`, `np.eye`, `np.identity`, `np.diag`, `np.empty`

In [None]:
# EXAMPLE
arr1 = np.array([1, 2, 3], dtype=np.int32)
arr2 = np.linspace(0, 1, 6)
arr3 = np.full((2,3), 7.5)
print(arr1, arr2, arr3, sep="\n")
print("dtypes:", arr1.dtype, arr2.dtype)


[1 2 3]
[0.  0.2 0.4 0.6 0.8 1. ]
[[7.5 7.5 7.5]
 [7.5 7.5 7.5]]
dtypes: int32 float64
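
The example above uses only three of the listed constructors. As a minimal sketch, the remaining ones can be tried like this (the variable names are illustrative, not part of the original notebook):

```python
import numpy as np

evens = np.arange(0, 10, 2)        # start, stop (exclusive), step
zeros = np.zeros((2, 3))           # float64 unless a dtype is given
ones = np.ones(4, dtype=np.int8)   # explicit dtype
ident = np.eye(3)                  # 3x3 identity matrix
diag = np.diag([1, 2, 3])          # 1-D input -> square matrix with that diagonal
scratch = np.empty((2, 2))         # uninitialised: contents are arbitrary

print(evens)
print(zeros.dtype, ones.dtype)
print(np.array_equal(ident, np.identity(3)))
print(diag)
```

Because `np.empty` skips initialisation, its values should always be overwritten before they are read.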


In [None]:
# 🖊️ TODO: create a 10×10 chessboard pattern using zeros & ones
import numpy as np
chessboard = np.zeros((10, 10), dtype=int)
chessboard[1::2, ::2] = 1
chessboard[::2, 1::2] = 1
print(chessboard)

[[0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]
 [0 1 0 1 0 1 0 1 0 1]
 [1 0 1 0 1 0 1 0 1 0]]


## <a name='attributes'></a>3️⃣ Array Attributes & Inspection

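Key attributes: `shape` (length of each axis), `ndim` (number of axes), `size` (total number of elements), `dtype` (element type), `itemsize` (bytes per element), `nbytes` (`size * itemsize`)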

In [None]:
M = np.arange(12).reshape(3,4)
print('shape', M.shape, 'ndim', M.ndim, 'size', M.size, 'itemsize', M.itemsize, 'total bytes', M.nbytes)


shape (3, 4) ndim 2 size 12 itemsize 8 total bytes 96
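
The `itemsize 8` above reflects a 64-bit default integer on the machine that produced the output; the default can differ on other platforms. A small sketch of how `nbytes` follows from `size` and `itemsize` for a few explicit dtypes (the `footprint` dictionary is just an illustrative helper):

```python
import numpy as np

footprint = {}
for dtype in (np.int8, np.int32, np.float64):
    arr = np.zeros((3, 4), dtype=dtype)
    # nbytes counts only the element buffer: size * itemsize
    assert arr.nbytes == arr.size * arr.itemsize
    footprint[arr.dtype.name] = arr.nbytes

print(footprint)
```

Picking a smaller dtype is often the simplest way to cut memory use, provided the value range still fits.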


In [None]:
# 🖊️ TODO: check memory footprint of a 1000×1000 float64 array
import numpy as np
arr = np.zeros((1000, 1000), dtype=np.float64)
print(arr.nbytes, "bytes")

8000000 bytes


## <a name='indexing'></a>4️⃣ Indexing, Slicing & Fancy Indexing

In [None]:
a = np.arange(1,26).reshape(5,5)
print(a[:, 0])     # first column
print(a[::2, ::2]) # every 2nd row/col
mask = (a % 3 == 0)
print('multiples of 3:', a[mask])


[ 1  6 11 16 21]
[[ 1  3  5]
 [11 13 15]
 [21 23 25]]
multiples of 3: [ 3  6  9 12 15 18 21 24]


In [None]:
# 🖊️ TODO: use fancy indexing to swap first and last rows of `a`
import numpy as np
a = np.arange(1, 13).reshape(4, 3)  # note: rebinds `a` to a fresh 4×3 array
a[[0, -1]] = a[[-1, 0]]             # the right-hand side is a copy, so the swap is safe
print(a)

[[10 11 12]
 [ 4  5  6]
 [ 7  8  9]
 [ 1  2  3]]


## <a name='reshape'></a>5️⃣ Reshaping, Transpose & Copies vs Views

In [None]:
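b = np.arange(8)
B = b.reshape(2,4)          # reshape returns a view here: B shares b's memory
B[0,0] = 99                 # so this write is visible through b
print('b is modified:', b)
C = b.reshape(2,4).copy()   # .copy() allocates independent memory
C[0,0] = -1                 # b is not affected by this write
print('b unchanged with copy:', b)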


b is modified: [99  1  2  3  4  5  6  7]
b unchanged with copy: [99  1  2  3  4  5  6  7]
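
Rather than writing to an array and watching what changes, `np.shares_memory` can answer the view-or-copy question directly. A minimal sketch (array names are illustrative):

```python
import numpy as np

base = np.arange(8)
view = base.reshape(2, 4)          # reshape of a contiguous array: view
copy = base.reshape(2, 4).copy()   # explicit copy
sliced = base[::2]                 # basic slicing: view
fancy = base[[0, 2, 4]]            # fancy (integer-array) indexing: copy

for name, other in [('view', view), ('copy', copy),
                    ('sliced', sliced), ('fancy', fancy)]:
    print(name, np.shares_memory(base, other))
```

Basic slices and most reshapes alias the original buffer, while fancy and boolean indexing always produce new arrays.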


In [None]:
# 🖊️ TODO: Flatten a 3‑D array into 1‑D using both `ravel` and `flatten`; observe copy vs view.
import numpy as np
arr = np.arange(24).reshape(2, 3, 4)
r = arr.ravel()
f = arr.flatten()
r[0] = 999
f[1] = 888
print("Original array after ravel change:\n", arr)
print("Flatten result:\n", f)

Original array after ravel change:
 [[[999   1   2   3]
  [  4   5   6   7]
  [  8   9  10  11]]

 [[ 12  13  14  15]
  [ 16  17  18  19]
  [ 20  21  22  23]]]
Flatten result:
 [  0 888   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  18  19  20  21  22  23]
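
One caveat worth checking: `ravel` returns a view only when the memory layout allows it. The sketch below (illustrative names) shows it falling back to a copy for a transposed array:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

flat_view = grid.ravel()           # contiguous input: ravel can return a view
flat_copy = grid.flatten()         # flatten always copies
transposed_flat = grid.T.ravel()   # transposed data is not C-contiguous: ravel copies

print(np.shares_memory(grid, flat_view))
print(np.shares_memory(grid, flat_copy))
print(np.shares_memory(grid, transposed_flat))
print(transposed_flat)
```

When a write must never reach the original, `flatten` (or an explicit `.copy()`) is the safer choice.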


## <a name='join'></a>6️⃣ Joining, Splitting, Set & Sorting Ops

In [None]:
x = np.array([1,3,5]); y = np.array([2,4,6])
xy = np.concatenate([x,y])
print('union', np.union1d(x,y))
print('intersect', np.intersect1d(xy,[1,2,10]))
print('sorted descending', np.sort(xy)[::-1])


union [1 2 3 4 5 6]
intersect [1 2]
sorted descending [6 5 4 3 2 1]


In [None]:
# 🖊️ TODO: split `xy` back into two equal halves using `np.array_split`
splitarr=np.array_split(xy,2)
print(splitarr)

[array([1, 3, 5]), array([2, 4, 6])]
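
`np.array_split` matters most when the length does not divide evenly, which is where `np.split` raises an error. A short sketch with a hypothetical 7-element array:

```python
import numpy as np

seven = np.arange(7)

# array_split tolerates uneven division: earlier pieces get the extra elements
parts = np.array_split(seven, 3)
print([p.tolist() for p in parts])

# np.split insists on equal-sized pieces
try:
    np.split(seven, 3)
    split_raised = False
except ValueError:
    split_raised = True
print('np.split raised ValueError:', split_raised)
```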


## <a name='arithmetic'></a>7️⃣ Arithmetic Ops & Universal Functions

In [None]:
v = np.arange(5)
print('exp', np.exp(v))
print('sin', np.sin(v))
print('vectorised addition', v + 10)


exp [ 1.          2.71828183  7.3890561  20.08553692 54.59815003]
sin [ 0.          0.84147098  0.90929743  0.14112001 -0.7568025 ]
vectorised addition [10 11 12 13 14]


In [None]:
# 🖊️ TODO: given degrees [0,30,45,60,90], compute radians and sin values.
import numpy as np
deg = np.array([0, 30, 45, 60, 90])
rad = np.deg2rad(deg)
sin_vals = np.sin(rad)
print("Radians:", rad)
print("Sin values:", sin_vals)

Radians: [0.         0.52359878 0.78539816 1.04719755 1.57079633]
Sin values: [0.         0.5        0.70710678 0.8660254  1.        ]


## <a name='broadcast'></a>8️⃣ Broadcasting Rules

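Rules: align the shapes from the right (trailing dimension first); two dimensions are compatible when they are equal or one of them is 1; a size‑1 (or missing) dimension is stretched to match the other; any other mismatch raises a `ValueError`.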

In [None]:
row = np.arange(5)
col = np.arange(3).reshape(3,1)
matrix = row + col  # broadcast to 3×5
print(matrix)


[[0 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 6]]
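
The rules can be checked without allocating any data by using `np.broadcast_shapes` (available in NumPy 1.20 and later). The sketch below also shows a mismatch and one common fix; `grid` and `per_row` are illustrative names:

```python
import numpy as np

# Shapes are aligned from the right; missing or size-1 axes are stretched
print(np.broadcast_shapes((3, 1), (5,)))        # (3, 5)
print(np.broadcast_shapes((4, 1, 2), (3, 1)))   # (4, 3, 2)

grid = np.ones((3, 4))
per_row = np.array([10, 20, 30])   # one value per row, shape (3,)

try:
    grid + per_row                 # trailing axes are 4 vs 3: incompatible
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print('mismatch raised ValueError:', mismatch_raised)

# Adding a trailing axis turns (3,) into (3, 1), which broadcasts against (3, 4)
shifted = grid + per_row[:, np.newaxis]
print(shifted)
```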


In [None]:
# 🖊️ TODO: use broadcasting to create a 10×10 multiplication table.
import numpy as np
x = np.arange(1, 11).reshape(-1, 1)
y = np.arange(1, 11)
table = x * y
print(table)

[[  1   2   3   4   5   6   7   8   9  10]
 [  2   4   6   8  10  12  14  16  18  20]
 [  3   6   9  12  15  18  21  24  27  30]
 [  4   8  12  16  20  24  28  32  36  40]
 [  5  10  15  20  25  30  35  40  45  50]
 [  6  12  18  24  30  36  42  48  54  60]
 [  7  14  21  28  35  42  49  56  63  70]
 [  8  16  24  32  40  48  56  64  72  80]
 [  9  18  27  36  45  54  63  72  81  90]
 [ 10  20  30  40  50  60  70  80  90 100]]


## <a name='stats'></a>9️⃣ Statistics & Aggregations

In [None]:
data = np.random.default_rng(0).integers(1, 100, size=(5,4))
print('data\n', data)
print('row sums', data.sum(axis=1))
print('col means', data.mean(axis=0))


data
 [[85 64 51 27]
 [31  5  8  2]
 [18 81 65 91]
 [50 61 97 73]
 [63 54 56 93]]
row sums [227  46 255 281 266]
col means [49.4 53.  55.4 57.2]
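
The `axis` argument names the axis that gets collapsed: `axis=0` aggregates down the rows (one result per column), `axis=1` across the columns (one result per row). A minimal sketch on a hypothetical 2×3 array, with `keepdims=True` so the result broadcasts back against the input:

```python
import numpy as np

scores = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

col_mean = scores.mean(axis=0)                  # shape (3,): one mean per column
row_mean = scores.mean(axis=1, keepdims=True)   # shape (2, 1): one mean per row

centered = scores - row_mean                    # (2, 3) - (2, 1) broadcasts row-wise
print(col_mean)
print(row_mean.shape)
print(centered)
```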


In [None]:
# 🖊️ TODO: compute `np.percentile` (25th, 50th, 75th) of flattened `data`.
import numpy as np
data = np.arange(1, 21).reshape(4, 5)  # note: rebinds `data` to the values 1..20
p = np.percentile(data, [25, 50, 75])  # axis=None (the default) works on the flattened array
print(p)

[ 5.75 10.5  15.25]


## <a name='random'></a>🔟 Random Numbers & Reproducibility

In [None]:
rng = np.random.default_rng(42)
rand_floats = rng.random(5)
rand_ints = rng.integers(low=10, high=50, size=5)
print(rand_floats, rand_ints)
rng2 = np.random.default_rng(42)
assert np.allclose(rand_floats, rng2.random(5))


[0.77395605 0.43887844 0.85859792 0.69736803 0.09417735] [31 49 39 40 38]


In [None]:
# 🖊️ TODO: simulate rolling a fair six‑sided die 100 times; estimate proportion of 6s.
import numpy as np
rolls = np.random.randint(1, 7, size=100)  # legacy API, unseeded: result varies per run; `high` is exclusive
prop_6 = np.mean(rolls == 6)
print(prop_6)

0.17
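
For a repeatable version of the die simulation, the `Generator` API from the example cell can be seeded. This is a sketch; the seed value `123` is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(seed=123)
rolls = rng.integers(1, 7, size=100)   # high is exclusive, so faces are 1..6
prop_6 = np.mean(rolls == 6)           # fraction of rolls equal to 6

# Re-creating the generator with the same seed reproduces the same rolls
rolls_again = np.random.default_rng(seed=123).integers(1, 7, size=100)
print(prop_6, np.array_equal(rolls, rolls_again))
```

With only 100 rolls the estimate can sit noticeably away from 1/6; increasing `size` tightens it.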


## <a name='structured'></a>1️⃣1️⃣ Structured / Record Arrays

In [None]:
people = np.array([('Alice', 25, 55.0), ('Bob', 30, 85.5)],
                   dtype=[('name','U10'), ('age','i4'), ('weight','f4')])
print(people['name'], people['age'].mean())


['Alice' 'Bob'] 27.5
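
Structured arrays also support boolean filtering on a field and sorting by field name. A minimal sketch, where the third record (`'Cara'`) is a made-up row added only so the sort has something to reorder:

```python
import numpy as np

people = np.array([('Alice', 25, 55.0), ('Bob', 30, 85.5), ('Cara', 22, 61.0)],
                  dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

over_24 = people[people['age'] > 24]    # boolean mask built from one field
by_age = np.sort(people, order='age')   # sort whole records by the 'age' field

print(over_24['name'])
print(by_age['name'])
```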


In [None]:
# 🖊️ TODO: add a new field 'height' to the structured array using `np.lib.recfunctions.append_fields` (hint: it ships with NumPy, so no `pip install` is needed).
import numpy as np
from numpy.lib import recfunctions as rfn
data = np.array([(1, 'Alice'), (2, 'Bob')], dtype=[('id', 'i4'), ('name', 'U10')])
heights = np.array([5.5, 6.0])
new_data = rfn.append_fields(data, 'height', heights, dtypes='f8', usemask=False)
print(new_data)

[(1, 'Alice', 5.5) (2, 'Bob', 6. )]


## <a name='linalg'></a>1️⃣2️⃣ Linear Algebra Essentials

In [None]:
A = np.random.random((3,3))   # unseeded: values differ on every run
b = np.random.random(3)
x = np.linalg.solve(A, b)     # solves A·x = b without forming the inverse of A
print('A·x ≈ b?', np.allclose(A @ x, b))


A·x ≈ b? True


In [None]:
# 🖊️ TODO: compute eigenvalues of `A` using `np.linalg.eig`.
import numpy as np
A = np.array([[4, 2],
              [1, 3]])
vals, vecs = np.linalg.eig(A)
print("Eigenvalues:", vals)

Eigenvalues: [5. 2.]
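
`np.linalg.eig` returns the eigenvectors as the columns of its second result. As a quick sanity check, each pair should satisfy A·v = λ·v, which the sketch below verifies for the same matrix:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
vals, vecs = np.linalg.eig(A)

# Column i of vecs pairs with vals[i]; vecs * vals scales each column by its eigenvalue
pairs_ok = np.allclose(A @ vecs, vecs * vals)
print(pairs_ok)
print(np.sort(vals))
```

The order in which eigenvalues come back is not guaranteed, so sort them before comparing against expected values.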


## <a name='io'></a>1️⃣3️⃣ File I/O (`npy`, `npz`, `txt`)

In [None]:
np.save('array.npy', A)
loaded = np.load('array.npy')
print('loaded equals A?', np.allclose(loaded, A))
np.savez('multi_arrays.npz', A=A, b=b)


loaded equals A? True
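
The cell above writes `multi_arrays.npz` but never reads it back. A sketch of the round trip follows; it writes into a temporary directory so it leaves no files behind, and the array values are illustrative:

```python
import pathlib
import tempfile

import numpy as np

A = np.array([[4, 2], [1, 3]])
b = np.array([1.0, 2.0, 3.0])

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / 'multi_arrays.npz'
    np.savez(path, A=A, b=b)              # keyword names become the archive keys

    with np.load(path) as archive:        # NpzFile behaves like a read-only dict
        names = sorted(archive.files)
        A_loaded = archive['A']
        b_loaded = archive['b']

print(names)
print(np.array_equal(A_loaded, A), np.array_equal(b_loaded, b))
```

`np.savez_compressed` takes the same arguments when file size matters more than write speed.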


In [None]:
# 🖊️ TODO: Use `np.savetxt` to write `data` (from stats section) to CSV then reload with `np.loadtxt`.
import numpy as np
data = np.random.rand(5, 3)  # fresh unseeded random data stands in for the stats-section `data`
np.savetxt("data.csv", data, delimiter=",")
loaded = np.loadtxt("data.csv", delimiter=",")
print(loaded)

[[0.54484797 0.36632893 0.63584566]
 [0.52915991 0.26141727 0.94513118]
 [0.42782734 0.70406508 0.49264805]
 [0.94176125 0.10075624 0.31462406]
 [0.84901994 0.31836431 0.36969893]]


## <a name='datetime'></a>1️⃣4️⃣ Datetime64 & Timedelta64

In [None]:
dates = np.arange('2023-01', '2023-04', dtype='datetime64[D]')
delta = dates[1:] - dates[:-1]
print(dates[:5], delta[0])


['2023-01-01' '2023-01-02' '2023-01-03' '2023-01-04' '2023-01-05'] 1 days


In [None]:
# 🖊️ TODO: find how many Mondays appear in `dates` array.
import numpy as np
dates = np.arange('2023-01', '2023-04', dtype='datetime64[D]')
# 1970-01-05 was a Monday, so a whole number of weeks since then means the date is a Monday
mondays_mask = ((dates - np.datetime64('1970-01-05')) % np.timedelta64(7, 'D')) == np.timedelta64(0, 'D')
mondays = np.sum(mondays_mask)
print(mondays)
print(dates[mondays_mask])

13
['2023-01-02' '2023-01-09' '2023-01-16' '2023-01-23' '2023-01-30'
 '2023-02-06' '2023-02-13' '2023-02-20' '2023-02-27' '2023-03-06'
 '2023-03-13' '2023-03-20' '2023-03-27']
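
An alternative to the modulo arithmetic is `np.is_busday` with a `weekmask` that treats Monday as the only valid day. The sketch below cross-checks it against the modulo approach on the same date range:

```python
import numpy as np

dates = np.arange('2023-01', '2023-04', dtype='datetime64[D]')

# With weekmask='Mon', is_busday is True exactly on Mondays
is_monday = np.is_busday(dates, weekmask='Mon')
monday_count = int(is_monday.sum())

# Cross-check against the modulo approach (1970-01-05 was a Monday)
offset = (dates - np.datetime64('1970-01-05')) % np.timedelta64(7, 'D')
same = np.array_equal(is_monday, offset == np.timedelta64(0, 'D'))

print(monday_count, same)
```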


## <a name='mask'></a>1️⃣5️⃣ Masked Arrays & NaNs

In [None]:
arr = np.array([1, 2, np.nan, 4, np.nan])
masked = np.ma.masked_invalid(arr)
print(masked.mean())


2.3333333333333335
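
For a simple mean, the NaN-aware functions (`np.nanmean`, `np.nansum`, and friends) give the same answer without building a masked array. A minimal sketch on the same values:

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan])

plain_mean = arr.mean()                            # NaN propagates through the plain mean
nan_aware = np.nanmean(arr)                        # ignores NaNs: (1 + 2 + 4) / 3
masked_mean = np.ma.masked_invalid(arr).mean()     # same result via masking
valid_count = np.count_nonzero(~np.isnan(arr))     # number of non-NaN entries

print(plain_mean, nan_aware, masked_mean, valid_count)
```

Masked arrays become more useful when the mask has to travel with the data through several operations.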


In [None]:
# 🖊️ TODO: replace NaNs with column means in a 2‑D array containing NaNs.
import numpy as np
a = np.array([[1, np.nan, 3],
              [4, 5, np.nan],
              [7, 8, 9]], dtype=float)
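col_means = np.nanmean(a, axis=0)        # per-column mean, ignoring NaNs
inds = np.where(np.isnan(a))             # (row_indices, col_indices) of the NaN cells
a[inds] = np.take(col_means, inds[1])    # fill each NaN with the mean of its column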
print(a)

[[1.  6.5 3. ]
 [4.  5.  6. ]
 [7.  8.  9. ]]


## <a name='project'></a>1️⃣6️⃣ Mini‑Project: Fitness Data Analysis

Load `fitness.txt` (tab‑separated, with a header row) into a structured array, then work through the prompts below.

In [None]:
fitness = np.genfromtxt('/content/sample_data/fitness.txt', delimiter='\t', dtype=None, encoding=None, names=True)
print('columns:', fitness.dtype.names, 'rows:', len(fitness))

columns: ('date', 'step_count', 'mood', 'calories_burned', 'hours_of_sleep', 'bool_of_active', 'weight_kg') rows: 96


In [None]:
# 🖊️ TODO: Monthly step count, sleep vs mood correlation, weekly summary, etc.
import numpy as np
import pandas as pd
f = np.genfromtxt('/content/sample_data/fitness.txt', delimiter='\t', dtype=None, encoding=None, names=True)
df = pd.DataFrame(f)  # structured array -> one DataFrame column per field
df['date'] = pd.to_datetime(df['date'], format='%d-%m-%Y')
# Grouping by month number alone merges the same month from different years
print("Monthly step count:\n", df.groupby(df['date'].dt.month)['step_count'].sum())
print("\nSleep vs mood correlation:", df['hours_of_sleep'].corr(df['mood']))
# ISO week numbers likewise ignore the year
print("\nWeekly summary:\n", df.groupby(df['date'].dt.isocalendar().week)[['step_count','hours_of_sleep','mood']].mean())

Monthly step count:
 date
1      10163
10     79051
11    103071
12     89565
Name: step_count, dtype: int64

Sleep vs mood correlation: 0.2104166644730015

Weekly summary:
        step_count  hours_of_sleep        mood
week                                         
1      833.285714        3.714286  171.428571
2     2165.000000        5.000000  250.000000
40    3843.333333        6.000000  133.333333
41    3401.428571        5.571429  114.285714
42    2952.714286        6.428571  128.571429
43    2326.142857        5.571429  214.285714
44    2942.571429        5.142857  285.714286
45    4459.571429        4.714286  285.714286
46    2530.571429        5.571429  285.714286
47    3979.857143        5.714286  300.000000
48    2671.571429        6.428571  257.142857
49    3025.428571        5.428571  214.285714
50    3232.571429        4.285714  185.714286
51    2954.571429        4.857143  142.857143
52    2688.285714        4.142857  185.714286
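
The solution above hands the work to pandas. As a rough sketch of how the same monthly total and correlation could be computed in NumPy alone, the block below uses a few synthetic stand-in values (they are NOT taken from `fitness.txt`, and the array names are illustrative):

```python
import numpy as np

# Synthetic stand-in values for illustration only
dates = np.array(['2023-01-30', '2023-01-31', '2023-02-01', '2023-02-02'],
                 dtype='datetime64[D]')
steps = np.array([1000, 2000, 3000, 4000])
sleep = np.array([6.0, 7.0, 5.0, 8.0])
mood = np.array([200, 300, 100, 300])

# Truncate each date to its month (year included), then sum steps per month
months = dates.astype('datetime64[M]')
labels, inverse = np.unique(months, return_inverse=True)
monthly_steps = np.bincount(inverse, weights=steps)

# Pearson correlation is the off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(sleep, mood)[0, 1]

print(labels, monthly_steps)
print(r)
```

Casting to `datetime64[M]` keeps the year, so the same month in different years stays in separate groups.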


## <a name='conclusion'></a>1️⃣7️⃣ Conclusion & Further Practice
Congrats on covering **all core NumPy topics** from your five lecture notebooks!

*Keep experimenting, read the official docs, and try converting your NumPy pipelines into Pandas or JAX for more fun.*