# ⏱️ 08 - Timing and Performance

In data science, performance matters: two pieces of code that produce the same result can differ in runtime by orders of magnitude.  
Jupyter/IPython provides magic commands that make measuring runtime easy.

In this notebook you will learn:
- `%time` and `%timeit` for single statements (`%time` runs the code once; `%timeit` repeats it and reports an average)
- `%%time` and `%%timeit` for entire cells
- Comparing loops, list comprehensions, and NumPy
- Why performance awareness is important


## 1. `%time`

In [11]:
%time sum(range(1_000_000))

CPU times: user 22.7 ms, sys: 135 µs, total: 22.8 ms
Wall time: 23.1 ms


499999500000
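
The two lines printed by `%time` measure different things: CPU time is the processor time the process used, while wall time is the real time that elapsed. Here is a minimal sketch of the same distinction in plain Python, assuming the standard-library clocks `time.process_time` (CPU) and `time.perf_counter` (wall). The helper name `measure` is hypothetical, not part of IPython.

```python
import time

def measure(func):
    """Run func once; return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time used by this process
    result = func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu

# CPU-bound work: wall time and CPU time are usually close.
total, wall, cpu = measure(lambda: sum(range(1_000_000)))
print(f"sum:   wall {wall * 1e3:.1f} ms, CPU {cpu * 1e3:.1f} ms")

# Sleeping: wall time passes, but almost no CPU time is used.
_, sleep_wall, sleep_cpu = measure(lambda: time.sleep(0.2))
print(f"sleep: wall {sleep_wall * 1e3:.1f} ms, CPU {sleep_cpu * 1e3:.1f} ms")
```

If wall time is much larger than CPU time, the code was probably waiting (for disk, network, or a sleep) rather than computing.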

✅ **Your Turn**: Use `%time` to measure how long it takes to sort a list of 1 million random numbers.

In [12]:
import random

values = [random.random() for _ in range(1_000_000)]  # 1 million random floats
%time sorted_values = sorted(values)

## 2. `%timeit`

In [13]:
numbers = list(range(1_000))
%timeit [x**2 for x in numbers]

80.8 µs ± 26.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
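
The line above reports a mean and standard deviation over several runs, each made up of many loops. A rough sketch of that scheme with the standard-library `timeit` module follows. The loop count and number of runs are fixed by hand here (an assumption; `%timeit` chooses the loop count automatically), and the exact statistics IPython computes may differ slightly.

```python
import statistics
import timeit

numbers = list(range(1_000))

# Each of the 5 runs executes the statement `loops` times and returns the
# total time for that run in seconds.
loops = 1_000
run_totals = timeit.repeat(
    "[x**2 for x in numbers]",
    globals={"numbers": numbers},
    repeat=5,
    number=loops,
)

per_loop = [run_total / loops for run_total in run_totals]  # seconds per loop
mean = statistics.mean(per_loop)
stdev = statistics.stdev(per_loop)
print(f"{mean * 1e6:.1f} µs ± {stdev * 1e6:.1f} µs per loop "
      f"(mean ± std. dev. of {len(per_loop)} runs, {loops} loops each)")
```

Repeating the statement many times averages out timer resolution and background noise, which is why `%timeit` is preferred over `%time` for short snippets.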


In [14]:
N = 1_000_000
data = list(range(N))
def f(x): return x * x

print("List comprehension:")
%timeit [f(x) for x in data]

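print("\nComprehension calling append:")
# Note: this is still a comprehension, used only for its append side effect;
# it is not a real for loop. `out` is also never reset, so it keeps growing
# across %timeit runs, which can distort the measurement.
out = []
%timeit [out.append(f(x)) for x in data]


List comprehension:
93.6 ms ± 781 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Comprehension calling append:
136 ms ± 35.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)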


✅ **Your Turn**: Compare `%timeit` results for a list comprehension vs. a `for` loop that builds the same list.
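
One possible way to set up this comparison fairly is to wrap each variant in a function, so that every timed call builds a fresh list. The sketch below uses the standard-library `timeit` module so it also runs outside a notebook. The function names `with_comprehension` and `with_for_loop` are hypothetical, and the smaller input size is chosen only to keep the run short.

```python
import timeit

data = list(range(100_000))

def f(x):
    return x * x

def with_comprehension():
    return [f(x) for x in data]

def with_for_loop():
    out = []              # fresh list on every call, so runs are independent
    for x in data:
        out.append(f(x))
    return out

# Both functions must build the same list before their timings are comparable.
assert with_comprehension() == with_for_loop()

for func in (with_comprehension, with_for_loop):
    # The minimum over several repeats is less sensitive to background noise.
    best = min(timeit.repeat(func, repeat=3, number=5)) / 5
    print(f"{func.__name__}: {best * 1e3:.2f} ms per call")
```

In a notebook, the same functions can be timed with `%timeit with_comprehension()` and `%timeit with_for_loop()`.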

## 3. `%%time` for a Whole Cell

In [15]:
%%time
total = 0
for i in range(1_000_000):
    total += i
total

CPU times: user 140 ms, sys: 1.95 ms, total: 142 ms
Wall time: 143 ms


499999500000

✅ **Your Turn**: Wrap a longer multi-line operation with `%%time` to measure its runtime.

In [16]:
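import time

N = 1_000_000
data = list(range(N))
def f(x): return x * x

# time.perf_counter() is a monotonic, high-resolution clock intended for timing.
# Single runs like these are noisy; %timeit gives steadier numbers.
start = time.perf_counter()
squares = [f(x) for x in data]
print("List comprehension:", time.perf_counter() - start, "seconds")

start = time.perf_counter()
out = []
for x in data:
    out.append(f(x))
print("For loop:", time.perf_counter() - start, "seconds")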


List comprehension: 0.1171419620513916 seconds
For loop: 1.1422491073608398 seconds
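
Outside a notebook, `%%time` is not available, and repeating the start/stop pattern by hand gets tedious. One option is a small context manager that times whatever runs inside a `with` block. This is a minimal sketch: the names `timed` and `timings` are hypothetical, and it reports wall time only, not the CPU times that `%%time` also prints.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results=None):
    """Print the wall-clock time spent inside the with-block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if results is not None:
            results[label] = elapsed   # optionally keep the number for later
        print(f"{label}: {elapsed:.4f} seconds")

timings = {}
with timed("For loop", timings):
    total = 0
    for i in range(1_000_000):
        total += i
```

Like `%%time`, this measures a single run, so the result can vary noticeably from one execution to the next.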


## 4. Comparing Loops vs. NumPy

In [17]:
import numpy as np

numbers = np.arange(1_000_000)

# Python-level loop: visits the array one element at a time
%timeit [x**2 for x in numbers]

# NumPy vectorized: the loop over elements runs in compiled code
%timeit numbers**2

169 ms ± 51.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
1.13 ms ± 204 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


✅ **Your Turn**: Try squaring numbers with a Python loop, list comprehension, and NumPy array. Compare times.

In [18]:
import numpy as np
N = 1_000_000
data = list(range(N))
arr = np.arange(N)
def f(x): return x * x


In [19]:
%%time
out = []
for x in data:
    out.append(f(x))

CPU times: user 155 ms, sys: 7.18 ms, total: 162 ms
Wall time: 164 ms


In [20]:
%time [f(x) for x in data]

CPU times: user 88.4 ms, sys: 11.9 ms, total: 100 ms
Wall time: 101 ms


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, ...]  (output truncated)

In [21]:
%time arr ** 2

CPU times: user 1.6 ms, sys: 2.97 ms, total: 4.57 ms
Wall time: 4.76 ms


array([           0,            1,            4, ..., 999994000009,
       999996000004, 999998000001])
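
Timings are only comparable if every variant computes the same thing. The sketch below checks that the loop, the list comprehension, and the NumPy expression agree before timing them. The function names are hypothetical, the input is smaller than in the cells above to keep the run short, and the explicit `dtype=np.int64` is an assumption made so the squares cannot overflow on platforms where the default integer type is 32-bit.

```python
import timeit
import numpy as np

N = 100_000
data = list(range(N))
arr = np.arange(N, dtype=np.int64)  # explicit 64-bit integers (assumption)

def loop_squares():
    out = []
    for x in data:
        out.append(x * x)
    return out

def comprehension_squares():
    return [x * x for x in data]

def numpy_squares():
    return arr ** 2

# Check that all three approaches agree before comparing their speed.
expected = loop_squares()
assert comprehension_squares() == expected
assert numpy_squares().tolist() == expected

for func in (loop_squares, comprehension_squares, numpy_squares):
    best = min(timeit.repeat(func, repeat=3, number=5)) / 5
    print(f"{func.__name__:>22}: {best * 1e3:.3f} ms per call")
```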

## 5. Why This Matters
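- Performance differences grow with dataset size: in the runs above, squaring 1 million integers took about 1 ms with NumPy versus roughly 170 ms with a list comprehension.
- Vectorized operations (NumPy, pandas) are usually faster than Python-level loops because the per-element work runs in compiled code.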
- `%timeit` is your friend when deciding how to implement something.
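
To see how the gap between two implementations changes with input size, the same pair can be timed at several sizes. The sketch below compares the manual summing loop with the built-in `sum`, as in sections 1 and 3. The function names and the chosen sizes are hypothetical, and the numbers it prints will vary by machine.

```python
import timeit

def loop_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total

def builtin_sum(n):
    return sum(range(n))

for n in (1_000, 10_000, 100_000, 1_000_000):
    assert loop_sum(n) == builtin_sum(n)   # same result, so timings are comparable
    t_loop = min(timeit.repeat(lambda: loop_sum(n), repeat=3, number=1))
    t_builtin = min(timeit.repeat(lambda: builtin_sum(n), repeat=3, number=1))
    print(f"n={n:>9,}: loop {t_loop * 1e3:8.3f} ms, "
          f"sum() {t_builtin * 1e3:8.3f} ms, "
          f"difference {(t_loop - t_builtin) * 1e3:8.3f} ms")
```

A difference that is negligible at a thousand elements can become the dominant cost at a million, which is why it helps to measure at a realistic data size.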


---
### Summary
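- `%time` and `%timeit` measure execution time: `%time` runs the code once, `%timeit` repeats it and reports the mean.
- `%%time` and `%%timeit` work on whole cells.
- In the measurements above, explicit loops were slower than list comprehensions, which were much slower than NumPy's vectorized operations.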
- Always measure performance before optimizing.
