# ⏱️ 08 - Timing and Performance

In data science, performance matters: two pieces of code that produce the same result can differ widely in runtime.  
Jupyter/IPython provides magic commands that make measuring runtime easy.

In this notebook you will learn:
- `%time` and `%timeit` for single-line statements
- `%%time` and `%%timeit` for entire cells
- Comparing loops, list comprehensions, and NumPy
- Why performance awareness is important


## 1. `%time`

In [1]:
%time sum(range(1_000_000))

CPU times: user 21.7 ms, sys: 68 µs, total: 21.8 ms
Wall time: 21.7 ms


499999500000
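Outside a notebook, the same kind of measurement can be approximated with the standard library. The sketch below is not how IPython implements `%time`; it only assumes that `time.process_time()` is a reasonable stand-in for the reported CPU time and `time.perf_counter()` for the wall time. The helper name `time_once` is hypothetical.

```python
import time

def time_once(func, *args):
    """Run func(*args) once; return (result, cpu_seconds, wall_seconds).

    Hypothetical helper for illustration, not part of IPython.
    """
    cpu_start = time.process_time()    # CPU time consumed by this process
    wall_start = time.perf_counter()   # high-resolution wall clock
    result = func(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, cpu, wall

result, cpu, wall = time_once(sum, range(1_000_000))
print(f"CPU time: {cpu * 1e3:.1f} ms, wall time: {wall * 1e3:.1f} ms")
print(result)
```

A single run like this is sensitive to whatever else the machine is doing, which is why repeated measurements are usually preferred for short snippets.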

✅ **Your Turn**: Use `%time` to measure how long it takes to sort a list of 1 million random numbers.

In [None]:
import random
random_numbers = [random.random() for _ in range(1_000_000)]
%time sorted_numbers = sorted(random_numbers)

## 2. `%timeit`

In [None]:
numbers = list(range(1_000))
%timeit [x**2 for x in numbers]
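`%timeit` reports results in the form "mean ± std. dev. of 7 runs, N loops each". A rough equivalent can be sketched with the standard `timeit` module: `Timer.autorange()` picks a loop count, and `Timer.repeat()` runs that many loops several times. This only approximates what the magic does; wrapping the statement in a `lambda` adds a little call overhead, and the choice of 7 repeats is copied from the output format rather than from IPython's internals.

```python
import statistics
import timeit

numbers = list(range(1_000))

timer = timeit.Timer(lambda: [x**2 for x in numbers])

# autorange() increases the loop count until one batch takes at least 0.2 s
loops, _ = timer.autorange()

# Each entry is the total time for `loops` executions of the statement
totals = timer.repeat(repeat=7, number=loops)
per_loop = [t / loops for t in totals]

mean = statistics.mean(per_loop)
std = statistics.pstdev(per_loop)
print(f"{mean * 1e6:.1f} µs ± {std * 1e6:.1f} µs per loop "
      f"(mean ± std. dev. of 7 runs, {loops} loops each)")
```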

✅ **Your Turn**: Compare `%timeit` results for a list comprehension vs. a `for` loop that builds the same list.

In [1]:
numbers = range(1_000_000)

print("List comprehension:")
%timeit [x**2 for x in numbers]

print("For loop:")
def for_loop_square(nums):
    result = []
    for x in nums:
        result.append(x**2)
    return result

%timeit for_loop_square(numbers)

List comprehension:
146 ms ± 36.8 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
For loop:
104 ms ± 13.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
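The spreads in this output are large (± 36.8 ms on a 146 ms mean), so the ordering of the two results may simply reflect noise from other activity on the machine. One common way to get a steadier comparison is to repeat the measurement and keep the fastest run. The sketch below does this with `timeit.repeat`; the function name `comprehension_square` and the smaller input size are choices made for this example only.

```python
import timeit

numbers = range(100_000)  # smaller than the cell above to keep the run short

def for_loop_square(nums):
    result = []
    for x in nums:
        result.append(x**2)
    return result

def comprehension_square(nums):
    return [x**2 for x in nums]

for func in (for_loop_square, comprehension_square):
    # 5 repeats of 10 calls each; each entry in `runs` is the total for 10 calls
    runs = timeit.repeat(lambda: func(numbers), repeat=5, number=10)
    best = min(runs) / 10
    print(f"{func.__name__}: best of 5 = {best * 1e3:.2f} ms per call")
```

The minimum is less affected by background load than the mean, at the cost of hiding how variable the runs were.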


## 3. `%%time` for a Whole Cell

In [2]:
%%time
total = 0
for i in range(1_000_000):
    total += i
total

CPU times: user 120 ms, sys: 1.79 ms, total: 122 ms
Wall time: 122 ms


499999500000
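This manual loop took 122 ms, while `sum(range(1_000_000))` in section 1 produced the same total in 21.7 ms. To time multi-statement blocks like this in a plain Python script, a small context manager can play a role similar to `%%time`. The sketch below is one possible approach; the name `timed` is hypothetical and it reports wall time only.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    """Print the wall-clock time of the enclosed block (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed * 1e3:.1f} ms")

with timed("manual loop"):
    total = 0
    for i in range(1_000_000):
        total += i

with timed("built-in sum"):
    builtin_total = sum(range(1_000_000))
```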

✅ **Your Turn**: Wrap a longer multi-line operation with `%%time` to measure its runtime.

In [3]:
%%time
total_sum = 0
for i in range(10_000_000):
    total_sum += i
total_sum

CPU times: user 1.17 s, sys: 1.36 ms, total: 1.17 s
Wall time: 1.18 s


49999995000000

## 4. Comparing Loops vs. NumPy

In [None]:
import numpy as np

numbers = np.arange(1_000_000)

# List comprehension: iterates over the NumPy array element by element in Python
%timeit [x**2 for x in numbers]

# NumPy vectorized
%timeit numbers**2

✅ **Your Turn**: Try squaring numbers with a Python loop, list comprehension, and NumPy array. Compare times.

In [4]:
import numpy as np

numbers_list = list(range(1_000_000)) # For Python list and comprehension
numbers_np = np.arange(1_000_000) # For NumPy array

print("--- Comparing squaring methods ---")

# Python for loop
def square_with_loop(nums):
    result = []
    for x in nums:
        result.append(x**2)
    return result

print("Python for loop:")
%timeit square_with_loop(numbers_list)

# List comprehension
print("List comprehension:")
%timeit [x**2 for x in numbers_list]

# NumPy vectorized operation
print("NumPy array:")
%timeit numbers_np**2

--- Comparing squaring methods ---
Python for loop:
99.1 ms ± 24.9 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
List comprehension:
77.5 ms ± 1.68 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
NumPy array:
1.04 ms ± 141 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


## 5. Why This Matters
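- Performance differences add up on large datasets: in the run above, NumPy squared 1 million numbers in about 1 ms, versus roughly 78 ms for the list comprehension and 99 ms for the `for` loop.
- Vectorized operations (like NumPy, Pandas) are usually faster because the loop runs in compiled code instead of the Python interpreter.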
- `%timeit` is your friend when deciding how to implement something.


---
### Summary
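- `%time` runs a statement once and reports CPU and wall time; `%timeit` runs it many times and reports the mean ± standard deviation.
- `%%time` and `%%timeit` are the cell-level versions of these magics.
- In the measurements above, NumPy was far faster than both pure-Python approaches. List comprehensions usually beat explicit `for` loops, but the gap is small, and one noisy run here showed the opposite.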
- Always measure performance before optimizing.
