
# ⏱️ 08 - Timing and Performance

In data science, performance matters: two pieces of code that produce the same result can differ widely in runtime.  
Jupyter/IPython provides magic commands that make runtime easy to measure.

In this notebook you will learn:
- `%time` and `%timeit` for timing a single line of code
- `%%time` and `%%timeit` for entire cells
- Comparing loops, list comprehensions, and NumPy
- Why performance awareness is important


## 1. `%time`

`%time` runs a single statement once and reports the CPU time and wall-clock time it took.

In [None]:
%time sum(range(1_000_000))

✅ **Your Turn**: Use `%time` to measure how long it takes to sort a list of 1 million random numbers.

In [None]:
import random
nums = [random.random() for _ in range(1_000_000)]
%time sorted(nums)

CPU times: user 446 ms, sys: 4.05 ms, total: 450 ms
Wall time: 450 ms


[2.6229555638579427e-08,
 1.1159273486383015e-06,
 1.8219806866559551e-06,
 ...]
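
The "CPU times" and "Wall time" lines printed by `%time` come from two different kinds of clock. As a rough sketch of the same idea outside IPython, the standard-library `time` module exposes both. The helper name `time_once` is hypothetical, and this is an illustration of the concept, not how `%time` is implemented internally.

```python
import random
import time


def time_once(func, *args):
    """Run func(*args) once; return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock time, includes any waiting
    cpu_start = time.process_time()    # CPU time consumed by this process
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


nums = [random.random() for _ in range(100_000)]
sorted_nums, wall, cpu = time_once(sorted, nums)
print(f"Wall time: {wall * 1000:.1f} ms, CPU time: {cpu * 1000:.1f} ms")
```

In a notebook, writing `%time _ = sorted(nums)` times the call without echoing the entire sorted list as cell output.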

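## 2. `%timeit`

`%timeit` runs a statement many times and reports the mean and standard deviation per loop, which gives a more stable estimate for fast code.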

In [None]:
numbers = list(range(1_000))
%timeit [x**2 for x in numbers]

✅ **Your Turn**: Compare `%timeit` results for a list comprehension vs. a `for` loop that builds the same list.

In [None]:
numbers = list(range(1_000))

# List comprehension
%timeit [x**2 for x in numbers]

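# For loop, wrapped in a function so that %timeit times the whole loop
# (a bare `%timeit` line followed by a loop does not time the loop)
def square_with_loop(values):
    result = []
    for x in values:
        result.append(x**2)
    return result

%timeit square_with_loop(numbers)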

62.9 µs ± 6.88 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
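
The same comparison can be run outside a notebook with the standard-library `timeit` module. The sketch below is one possible setup: the function names are hypothetical, and it reports the best of several runs rather than the mean ± standard deviation that `%timeit` prints.

```python
import timeit

numbers = list(range(1_000))


def squares_with_comprehension(values):
    return [x**2 for x in values]


def squares_with_loop(values):
    result = []
    for x in values:
        result.append(x**2)
    return result


# 5 runs of 1_000 calls each; every entry is the total time for one run.
comp_runs = timeit.repeat(lambda: squares_with_comprehension(numbers), repeat=5, number=1_000)
loop_runs = timeit.repeat(lambda: squares_with_loop(numbers), repeat=5, number=1_000)

# Divide by the call count to get seconds per call, then report the best run.
print(f"comprehension: {min(comp_runs) / 1_000 * 1e6:.1f} µs per call")
print(f"for loop:      {min(loop_runs) / 1_000 * 1e6:.1f} µs per call")
```

Wrapping each approach in a function keeps the timed statement to a single call, which is also what lets the line magic `%timeit` handle a multi-line loop.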


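## 3. `%%time` for a Whole Cell

`%%time` must be the first line of the cell; it runs everything below it once and reports the total time.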

In [None]:
%%time
total = 0
for i in range(1_000_000):
    total += i
total

✅ **Your Turn**: Wrap a longer multi-line operation with `%%time` to measure its runtime.

In [None]:
%%time
squares = []
for i in range(10_000_000):
    squares.append(i ** 2)
len(squares)

CPU times: user 1.25 s, sys: 252 ms, total: 1.5 s
Wall time: 1.5 s


10000000
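
Outside a notebook there is no cell to wrap, but a small context manager can play a similar role for a multi-line block. This is a minimal sketch using only the standard library; the name `timed` is hypothetical, and it measures wall-clock time only.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Print the wall-clock time taken by the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.3f} s")


with timed("building squares"):
    squares = []
    for i in range(1_000_000):
        squares.append(i ** 2)

print(len(squares))
```

The `try`/`finally` ensures the elapsed time is printed even if the block raises an exception.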

## 4. Comparing Loops vs. NumPy

In [None]:
import numpy as np

numbers = np.arange(1_000_000)

# Python-level loop (list comprehension over the array)
%timeit [x**2 for x in numbers]

# NumPy vectorized
%timeit numbers**2

185 ms ± 36.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
1.18 ms ± 96.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


✅ **Your Turn**: Try squaring numbers with a Python loop, list comprehension, and NumPy array. Compare times.

In [None]:
import time

# perf_counter() is a high-resolution clock intended for measuring elapsed time
start = time.perf_counter()
results = []
for x in range(1_000_000):
    results.append(x**2)
end = time.perf_counter()

print("Elapsed time:", end - start, "seconds")

Elapsed time: 0.1456315517425537 seconds
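
To put all three approaches from the exercise side by side, one option is to time each with the `timeit` module. This is a sketch under a few assumptions: NumPy is installed, the function names are hypothetical, the array uses an explicit 64-bit integer dtype so the squares do not overflow, and the input is smaller than in the cell above to keep the run short.

```python
import timeit

import numpy as np

n = 100_000
values = list(range(n))
array = np.arange(n, dtype=np.int64)


def with_loop():
    result = []
    for x in values:
        result.append(x**2)
    return result


def with_comprehension():
    return [x**2 for x in values]


def with_numpy():
    return array**2


for name, func in [("for loop", with_loop),
                   ("list comprehension", with_comprehension),
                   ("NumPy", with_numpy)]:
    # Best of 5 runs, 10 calls per run, converted to seconds per call.
    best = min(timeit.repeat(func, repeat=5, number=10)) / 10
    print(f"{name:>18}: {best * 1000:.3f} ms per call")
```

Exact timings depend on the machine and the Python and NumPy versions, so compare the relative ordering rather than the absolute numbers.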


## 5. Why This Matters
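- Performance differences grow with data size: in the NumPy example above, the vectorized version was more than 100× faster on one million elements.
- Vectorized operations (as in NumPy and pandas) are usually faster than equivalent Python loops.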
- `%timeit` is your friend when deciding how to implement something.
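
One way to see how runtime grows with data size is to time the same operation at several input sizes. The sketch below uses only the standard library; the sizes and the name `seconds_per_call` are arbitrary choices for illustration.

```python
import timeit

sizes = [1_000, 10_000, 100_000]
seconds_per_call = {}

for n in sizes:
    values = list(range(n))
    # Best of 3 runs, 10 calls per run.
    runs = timeit.repeat(lambda: [x**2 for x in values], repeat=3, number=10)
    seconds_per_call[n] = min(runs) / 10

for n, seconds in seconds_per_call.items():
    print(f"n = {n:>7,}: {seconds * 1e3:.3f} ms per call")
```

For a list comprehension like this one, the time should grow roughly in proportion to `n`, so a cost that is negligible on a thousand elements can dominate on millions.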


---
### Summary
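- `%time` times a single run of a statement; `%timeit` repeats it many times for a more reliable estimate.
- `%%time` and `%%timeit` do the same for whole cells.
- Explicit `for` loops are usually slower than list comprehensions, which are in turn much slower than vectorized NumPy operations on large arrays.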
- Always measure performance before optimizing.
