# ⏱️ 08 - Timing and Performance

In data science, performance matters: two pieces of code that produce the same result can differ greatly in runtime.  
Jupyter/IPython provides magic commands that make measuring runtime easy.

In this notebook you will learn:
- `%time` and `%timeit` for single expressions
- `%%time` and `%%timeit` for entire cells
- Comparing loops, list comprehensions, and NumPy
- Why performance awareness is important


## 1. `%time`

In [None]:
%time sum(range(1_000_000))

CPU times: user 8.32 ms, sys: 247 μs, total: 8.56 ms
Wall time: 8.91 ms


499999500000
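
The two lines that `%time` prints measure different things: wall time is the elapsed real time, while CPU time counts only the time the processor spent working for this process. As a rough sketch of the same idea in plain Python, the standard library's `time.perf_counter` and `time.process_time` can be read before and after a call. The helper name `time_once` is hypothetical and not part of IPython.

```python
import time

def time_once(func):
    """Run func once and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # high-resolution elapsed-time clock
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds

result, wall, cpu = time_once(lambda: sum(range(1_000_000)))
print(f"result={result}, wall={wall * 1e3:.2f} ms, cpu={cpu * 1e3:.2f} ms")
```

Because this is a single run, the numbers will vary from one execution to the next, just as they do with `%time`.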

✅ **Your Turn**: Use `%time` to measure how long it takes to sort a list of 1 million random numbers.

In [4]:
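import random

# Note: range(1000) builds only 1,000 values; use range(1_000_000) for the
# full 1 million numbers that the exercise asks for.
num = [random.randint(1, 1_000_000) for _ in range(1000)]

%time num.sort()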

CPU times: user 154 µs, sys: 19 µs, total: 173 µs
Wall time: 178 µs
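
One caveat with this exercise: `list.sort()` sorts in place, so timing it a second time (for example with `%timeit`, which repeats the statement) would measure sorting an already sorted list, which Python's sort handles much faster than random input. A possible workaround, sketched here with the standard `timeit` module and a smaller list so that it runs quickly, is to time `sorted()` on the untouched original each time. The list size, seed, and loop count are arbitrary choices for illustration.

```python
import random
import timeit

random.seed(0)  # fixed seed so the example is reproducible
data = [random.randint(1, 1_000_000) for _ in range(100_000)]

# sorted() returns a new list and leaves `data` unsorted,
# so every repetition sorts the same random input.
loops = 5
seconds = timeit.timeit(lambda: sorted(data), number=loops) / loops
print(f"sorted(): {seconds * 1e3:.2f} ms per loop")
```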


## 2. `%timeit`

In [5]:
numbers = list(range(1_000))
%timeit [x**2 for x in numbers]

74.6 µs ± 23.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
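
`%timeit` executes the statement many times per run, repeats the run several times, and reports the mean and standard deviation per loop. A minimal sketch of that procedure with the standard `timeit` and `statistics` modules follows. The loop count of 1,000 is an assumed value for illustration, whereas `%timeit` picks one automatically.

```python
import statistics
import timeit

numbers = list(range(1_000))

runs = 7       # number of repeated measurements
loops = 1_000  # executions per measurement (assumed; %timeit chooses this itself)

# Each entry is the total time for `loops` executions of the statement.
totals = timeit.repeat(
    "[x**2 for x in numbers]",
    globals={"numbers": numbers},
    repeat=runs,
    number=loops,
)
per_loop = [t / loops for t in totals]

mean_us = statistics.mean(per_loop) * 1e6
std_us = statistics.pstdev(per_loop) * 1e6
print(f"{mean_us:.1f} µs ± {std_us:.1f} µs per loop "
      f"(mean ± std. dev. of {runs} runs, {loops} loops each)")
```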


In [6]:
%timeit nums = list(range(1_000))
%timeit nums2 = [n for n in range(1_000)]

14.6 µs ± 631 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
41.8 µs ± 9.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


✅ **Your Turn**: Compare `%timeit` results for a list comprehension vs. a `for` loop that builds the same list.
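
One possible approach to this exercise, sketched with the standard `timeit` module so that it also runs outside a notebook. The function names `build_with_loop` and `build_with_comprehension` and the sizes used are hypothetical choices for illustration.

```python
import timeit

def build_with_loop(n):
    """Build the list of squares with an explicit for loop and append()."""
    result = []
    for x in range(n):
        result.append(x ** 2)
    return result

def build_with_comprehension(n):
    """Build the same list with a list comprehension."""
    return [x ** 2 for x in range(n)]

n = 1_000
# Both versions must produce identical output before timing means anything.
assert build_with_loop(n) == build_with_comprehension(n)

loops = 1_000
loop_time = timeit.timeit(lambda: build_with_loop(n), number=loops) / loops
comp_time = timeit.timeit(lambda: build_with_comprehension(n), number=loops) / loops
print(f"for loop:           {loop_time * 1e6:.1f} µs per call")
print(f"list comprehension: {comp_time * 1e6:.1f} µs per call")
```

In a notebook, the equivalent would be `%timeit build_with_loop(1_000)` followed by `%timeit build_with_comprehension(1_000)`.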

## 3. `%%time` and `%%timeit` for a Whole Cell

In [None]:
%%timeit
total = 0
for i in range(1_000_000):
    total += i
total

20.2 ms ± 765 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)


✅ **Your Turn**: Wrap a longer multi-line operation with `%%time` to measure its runtime.

In [7]:
%%time

import numpy as np  # %%time covers the whole cell, so a first-time import is timed as well

# Multi-line operation
A = np.random.rand(1_000_000)
B = np.random.rand(1_000_000)

dot = np.dot(A, B)


CPU times: user 19.2 ms, sys: 6.98 ms, total: 26.2 ms
Wall time: 25.1 ms


## 4. Comparing Loops vs. NumPy

In [8]:
import numpy as np

numbers = np.arange(1_000_000)

# Python-level loop: a list comprehension that visits each array element in turn
%timeit [x**2 for x in numbers]

# NumPy vectorized
%timeit numbers**2

167 ms ± 29.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
951 µs ± 173 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


✅ **Your Turn**: Try squaring numbers with a Python loop, list comprehension, and NumPy array. Compare times.

In [9]:
%%time

N = 10_000_000
result_loop = []

for i in range(N):
    result_loop.append(i * i)


CPU times: user 1.11 s, sys: 251 ms, total: 1.36 s
Wall time: 1.36 s


In [10]:
%%time

N = 10_000_000
result_lc = [i * i for i in range(N)]


CPU times: user 479 ms, sys: 244 ms, total: 723 ms
Wall time: 724 ms


In [11]:
%%time

import numpy as np

N = 10_000_000
arr = np.arange(N)
result_np = arr * arr


CPU times: user 27 ms, sys: 80 ms, total: 107 ms
Wall time: 123 ms


## 5. Why This Matters
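- Performance differences grow with data size: for the 1 million elements above, the list comprehension took about 167 ms while the vectorized NumPy expression took about 951 µs, roughly 175 times faster.
- Vectorized operations (such as those in NumPy and pandas) are usually faster than element-by-element Python loops.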
- `%timeit` is your friend when deciding how to implement something.
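
To see how the gap behaves as the data grows, one option is to time both approaches at several sizes. The sketch below uses the standard `timeit` module rather than the `%timeit` magic so that it also runs outside a notebook. The sizes and the helper name `best_time` are arbitrary choices for illustration.

```python
import timeit

import numpy as np

def best_time(func, repeat=3, number=3):
    """Return the best per-call time in seconds over several runs."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

for n in (1_000, 10_000, 100_000):
    values = list(range(n))
    arr = np.arange(n, dtype=np.int64)  # explicit 64-bit ints to avoid overflow

    # Both approaches must produce the same squares.
    assert [x * x for x in values] == (arr * arr).tolist()

    t_list = best_time(lambda: [x * x for x in values])
    t_numpy = best_time(lambda: arr * arr)
    print(f"n={n:>7,}: list comprehension {t_list * 1e6:9.1f} µs, "
          f"NumPy {t_numpy * 1e6:9.1f} µs")
```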


---
### Summary
- `%time` times a single run of a statement; `%timeit` repeats it many times and reports the mean and standard deviation.
- `%%time` and `%%timeit` work on whole cells.
- In the 10-million-element test above, the `for` loop with `append` (1.36 s) was slower than the list comprehension (724 ms), which was slower than NumPy (123 ms).
- Always measure performance before optimizing.
