Profiling code #55

Open
dpshelio opened this issue Dec 8, 2022 · 1 comment
Labels
preparation Exercises to do before the class week10 Performance

Comments


dpshelio commented Dec 8, 2022

Even when we measure the total time that a function takes to run (#54), that doesn't help us with knowing which parts of the code are slow!

To look into that, we need a different tool called a profiler. Python comes with its own profiler, but we will use a more convenient tool.

Setup

This exercise will work with IPython or Jupyter notebooks, and will use two "magic" commands available there. You may need some steps to set them up first.

If you use Anaconda, you should already have access to Jupyter. If you don't, let us know on Moodle or use pip install ipython to install IPython.

The %prun magic should already be available with every installation of IPython/Jupyter. However, you may need to install the package that provides the second magic (%lprun).
If you use Anaconda, run conda install line_profiler from a terminal. Otherwise, use pip install line_profiler.

Using profiling tools in IPython/Jupyter notebook

The %prun magic gives us information about every function called.

  1. Open a Jupyter notebook or an IPython terminal.
  2. Add an interesting function (from Jake VanderPlas's book)
    def sum_of_lists(N):
        total = 0
        for i in range(5):
            L = [j ^ (j >> i) for j in range(N)]
            # j >> i == j // 2 ** i (shifts the bits of j by i places to the right)
            # ^ is bitwise exclusive or: each bit of j is kept where the other operand's bit is 0 and flipped where it is 1
            total += sum(L)
        return total
  3. Run %prun:
    %prun sum_of_lists(10_000_000)
  4. Look at the table of results. What information does it give you? Can you find which operation takes the most time? (You may find it useful to look at the last column first)
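If you want the same kind of table outside IPython, the profiler that ships with Python can be called directly. The following is a minimal sketch, not part of the exercise: it uses the standard-library cProfile and pstats modules, the helper name profile_call is made up for illustration, and N is reduced so that it runs quickly.

```python
import cProfile
import io
import pstats


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total


def profile_call(func, *args):
    """Hypothetical helper: run func(*args) under cProfile.

    Returns the function's result and the profiler report as text.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by time spent inside each function itself (excluding sub-calls)
    # and print at most the first 10 rows.
    stats.sort_stats("tottime").print_stats(10)
    return result, buffer.getvalue()


# A smaller N than in the exercise, so the sketch finishes quickly.
result, report = profile_call(sum_of_lists, 100_000)
print(report)
```

The columns are the same ones %prun reports: ncalls, tottime, percall, cumtime, percall, and filename:lineno(function).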

Using a line profiler in IPython/Jupyter

While prun presents its results by function, the lprun magic gives us line-by-line details.

  1. Load the extension on your IPython shell or Jupyter notebook
    %load_ext line_profiler
  2. Run %lprun
    %lprun -f sum_of_lists sum_of_lists(10_000_000)
  3. Can you interpret the results? On which line is most of the time spent?
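The line profiler can also be driven from a plain Python script. The sketch below assumes the line_profiler package is installed and uses its LineProfiler class; the helper name line_profile is made up for illustration, and if the package is missing the function simply runs without profiling.

```python
try:
    # Third-party package; see the Setup section for how to install it.
    from line_profiler import LineProfiler
except ImportError:
    LineProfiler = None


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total


def line_profile(func, *args):
    """Hypothetical helper: run func(*args) and print per-line timings."""
    if LineProfiler is None:
        print("line_profiler is not installed; running without profiling")
        return func(*args)

    profiler = LineProfiler(func)            # register func for line timing
    result = profiler.runcall(func, *args)   # run it with the profiler active
    profiler.print_stats()                   # one row per line of func
    return result


# A smaller N than in the exercise, so the sketch finishes quickly.
total = line_profile(sum_of_lists, 100_000)
```

Each row of the output corresponds to one line of sum_of_lists, which is the same view %lprun gives in the notebook.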

Finishing up

When you are done, react to this issue using one of the available emojis, and/or comment with your findings: Which function takes the most time? Which line of the code?

@dpshelio dpshelio added week10 Performance preparation Exercises to do before the class labels Dec 8, 2022
@anda-raluca

Line 4, L = [j ^ (j >> i) for j in range(N)]
