Your code will not always run as fast as you want it to.  Often, this is something you can fix.  But you have to know *why* your code is slow, first.  There are two major ways to do this:

- Just eyeball it and guess.
- Actually measure it.

Obviously, the second is the better approach.  You can start to lean on the first approach after you get a *lot* of Python under your belt and start to get a good sense for what kinds of code are fast and slow.  But even then, there is absolutely no substitute for actually measuring what parts of your code are slowing it down.  This notebook shows two big tools for doing just that.

(Note that `tqdm`, a progress-bar library, can also be very useful when you have a big `for`/`while` loop in your code, but it isn't quite as robust as the two tools here.)

# `timeit`: Benchmark/speed test small code snippets

You've seen some examples already of me using the `timeit` module from the standard library, but let's dive into what it actually does.  It's pretty simple: you give it a piece of code, it runs it a bunch of times, and it tells you how long it took.  `timeit` is mostly intended for running a small piece of code a lot of times, but you can easily change this.

The basic recipe for timeit is:

In [1]:
from timeit import timeit
print(timeit("100 % 3"))

0.037076700000000073


Note that we pass `timeit.timeit()` a string, which gets interpreted as Python code and executed.  We can't pass a bare expression as an argument to a function--Python will always evaluate it first and pass the result--so the expression has to be wrapped in something, and a string is the simplest wrapper.  (`timeit.timeit()` will also accept a zero-argument callable, such as a function or a `lambda`.)  It's a bit clunky, especially compared to languages like Lisp or Julia, but it allows other elements of Python (mostly on the backend, in the language's implementation) to be much simpler and more secure.
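Strings aren't the only thing `timeit.timeit()` will take: the standard library version also accepts a callable that takes no arguments.  Here's a minimal sketch of that; the function name `modulo_example` is made up for illustration, and the `number=` argument (explained below) is only there to keep the run short.  Keep in mind that timing a callable also includes the cost of the function call itself, so the numbers aren't directly comparable to the string version.

```python
from timeit import timeit

def modulo_example():
    # Same work as the "100 % 3" snippet, wrapped in a function
    # (hypothetical name, just for this example).
    return 100 % 3

# Pass the function object itself: no quotes, no parentheses.
elapsed_callable = timeit(modulo_example, number=100_000)

# A lambda works too, which is handy when the call needs arguments.
elapsed_lambda = timeit(lambda: divmod(100, 3), number=100_000)

print(elapsed_callable, elapsed_lambda)
```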

By default, timeit will:
- Run the snippet 1,000,000 times.
- Assume that any variables referenced in the snippet are also defined in the snippet.

Both of these behaviors can be changed.
- Change the number of times the code runs with the `number=` argument.
- Tell `timeit.timeit()` to use variables defined elsewhere in your code with the `globals=` argument.

The `globals=` argument needs to be given a `dict`ionary containing `{"variable_name": value}` pairs.  E.g., `{"x": 10}` will tell `timeit.timeit()` "when you see the variable `x` in the snippet, it has the value 10 (unless it gets re-defined in the snippet)."  There's a built-in Python function, `globals()`, which will give you a dictionary of anything defined *in global scope* in your current program.  We're not going to worry too much about the details of what that means, but just know that if you're using `timeit.timeit()` on a line that isn't indented at all, you should be able to pass `globals=globals()` and have everything just work.
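To make the dictionary form concrete, here's a small sketch that hands `timeit.timeit()` an explicit namespace instead of `globals()`.  The names `namespace` and `name_error_seen` are just illustrative choices.

```python
from timeit import timeit

# An explicit namespace: the snippet can see `x`, and nothing else we defined.
namespace = {"x": 10}
elapsed = timeit("x * 2", globals=namespace, number=100_000)
print(elapsed)

# Without globals=, the same snippet fails because it doesn't know about `x`.
name_error_seen = False
try:
    timeit("x * 2", number=1)
except NameError as err:
    name_error_seen = True
    print("NameError:", err)
```

Passing a hand-built dictionary like this can be useful when you want to be sure the snippet only touches the variables you meant it to.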

In [2]:
from timeit import timeit

x = [1,2,3,4,5,6,7,8,9,10]
# print(timeit("5 in x")) # NameError: name 'x' is not defined
print(timeit("5 in x", globals=globals())) # 'x' is now the value just defined
print(timeit("5 in x", globals=globals(), number=1000)) # only run the snippet 1,000 times

# Multi-line strings are useful with timeit.timeit() for longer snippets.
# But be careful--the string needs to be all the way at the zero-indent level,
# or you'll get weird errors about unexpected indentation.  This makes your code
# look kind of ugly, unfortunately.
print(timeit(
"""
if 5 in x:
    found = True
else:
    found = False
""",
    globals=globals(),
    number=1000
))

0.19390820000000009
0.00022039999999989845
0.00022180000000027178
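As a sketch of the kind of quick head-to-head comparison `timeit` is good for, the cell below times integer division against regular division followed by `int()`.  The sample values for `a` and `b` are arbitrary, and the timings will vary by machine and Python version, so treat the result as something to measure yourself rather than a rule.

```python
from timeit import timeit

a, b = 12345, 67  # arbitrary sample values for illustration
names = {"a": a, "b": b}

# Time each approach the same number of times so the totals are comparable.
floor_div_time = timeit("a // b", globals=names, number=200_000)
true_div_time = timeit("int(a / b)", globals=names, number=200_000)

print(f"a // b     : {floor_div_time:.4f} s")
print(f"int(a / b) : {true_div_time:.4f} s")

# Speed only matters if both versions give the answer you want.
# They agree here, but not always: // rounds down, int() rounds toward zero.
print(a // b == int(a / b))
print(-7 // 2, int(-7 / 2))
```

The last two lines are a reminder to check that the two snippets are actually interchangeable before picking the faster one.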


# Profiling your code: Why is my *program* slow?

`timeit` is awesome for quick testing of small snippets.  If you're writing your code and you need to know something like "will it be faster to use integer division here, or do regular division and then convert to an integer?" then `timeit` is your friend.  But, if you already have a program, and it takes a long time to run, and you want to know what parts are slow (and thus, what parts you should focus on speeding up), `timeit` won't do anything for you: you'll need to turn to a *profiler*.

A profiler, in the programming world, is just a program that:

1. Runs your program.
2. Tracks how long your program spent doing each thing it does.
3. Tells you how long each part took to run.

They're a bit cumbersome to use at times, but they are the best way to actually figure out why your program is slow and how you can make it go faster.  Python has two profilers in the standard library: `cProfile`--the one you should generally use--and `profile`--the one you should only use if you're trying to customize how the profiler behaves (which you probably won't ever need to do).

In [3]:
import cProfile
import time

# a function we want to find the slow spots for.
# we'll add a time.sleep() command to artificially increase
# the amount of time it takes to run.
def my_function(x):
    time.sleep(5)
    return x >= 10
    

# like with timeit: give `cProfile.run()` a string to be executed.
# cProfile.run() prints its report itself and returns None,
# so there's no need to wrap it in print().
cProfile.run("my_function(10)")

         5 function calls in 5.004 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.004    5.004 1390719897.py:7(my_function)
        1    0.000    0.000    5.004    5.004 <string>:1(<module>)
        1    0.000    0.000    5.004    5.004 {built-in method builtins.exec}
        1    5.004    5.004    5.004    5.004 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


There is a *lot* you can customize about how the profiler runs.  You can save the results to a file and then process them with the `pstats` module in the standard library.  You can also run the profiler from the command line if you want to profile an entire program:

```bash
python -m cProfile [-o output_file] my_program.py
```
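Here's a rough sketch of the save-to-a-file-then-use-`pstats` workflow, done from inside Python rather than the command line.  The functions `slow_step`, `fast_step`, and `my_program` and the file name `profile_output.prof` are all made up for the example.  It uses a `cProfile.Profile` object directly instead of `cProfile.run()`, which is one of several ways to do this.

```python
import cProfile
import io
import os
import pstats
import tempfile
import time

# Hypothetical stand-ins for the parts of a real program.
def slow_step():
    time.sleep(0.05)  # artificially slow, like the example above

def fast_step():
    return sum(range(1000))

def my_program():
    slow_step()
    fast_step()

# Profile only the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
my_program()
profiler.disable()

# Save the raw results to a file (in a temporary directory here).
stats_path = os.path.join(tempfile.mkdtemp(), "profile_output.prof")
profiler.dump_stats(stats_path)

# Load the file with pstats, sort by cumulative time, and show the top 5 rows.
report = io.StringIO()
stats = pstats.Stats(stats_path, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Sorting by cumulative time tends to put the functions responsible for most of the run time near the top, which is usually where you want to start looking.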

There's a lot to the Python profilers; check the documentation for more details.