<font size="+3"><b>Chapter 3: Advanced Python</b></font>

This notebook contains advanced applications of Python and assorted Python package tutorials.

# Overview

Here, I will devote a section to each of the packages I use the most or think are important. 
All discussion related to machine learning is left for a later chapter.
Additionally, plotting tutorials (Matplotlib, Plotly, PyROOT, etc.) are saved for a later chapter.
In the last section of this chapter, I provide usage examples for packages I do not deem to be essential, but which I think are useful.

## Optimizing your Python code

When using Python for any large-scale project, it is essential to know which parts of your code need to be optimized.
Pure Python is typically much slower than compiled languages such as C++ and Julia, so unoptimized Python code will give you plenty of time to rethink your choice of career path in HEP (for better or for worse).
[This website][prof2] is a good resource for evaluating the performance of the Python code you've written.
I will summarize its contents here, along with other useful information.
The required packages for this are [line-profiler][lp] and [memory-profiler][mp].


[prof1]: https://pynash.org/2013/03/06/timing-and-profiling/
[prof2]: https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html
[lp]: https://pypi.org/project/line-profiler/
[mp]: https://pypi.org/project/memory-profiler/

### Run time

#### Jupyter notebook

The ``%time`` line magic in Jupyter reports the run time of a single statement executed once (``%%time`` does the same for an entire cell), while the ``%timeit`` line magic reports the execution time of a statement averaged over multiple runs.
The ``%lprun`` line magic shows how long it took each line in a function to run.
Let's see some examples of this.

First, let's define our test functions.

In [1]:
def testfunc_append(n):
  """Create a list of numbers using append.
  Args:
    n (int): list length
  Returns:
    l (list): list containing the integers from 0 to n-1 inclusive
  """
  l = []
  for i in range(n):
    l.append(i)
  return l
  
def testfunc_comprehension(n):
  """Create a list of numbers using list comprehension.
  Args:
    n (int): list length
  Returns:
    l (list): list containing the integers from 0 to n-1 inclusive
  """
  l = [ i for i in range(n) ]
  return l

Now let's evaluate their performance. First we will use the ``%time`` command.

In [None]:
%time testfunc_append(10000)
%time testfunc_comprehension(10000)
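
For readers working outside a notebook, the single-run measurement that ``%time`` reports can be approximated with the standard-library ``time`` module. The sketch below is only an illustration: the helper name ``time_once`` and the ``sorted`` workload are assumptions made for this example, not part of IPython.

```python
import time

def time_once(func, *args):
    """Run func(*args) once; return (result, wall_seconds, cpu_seconds).

    Hypothetical helper that mimics the two numbers %time reports.
    """
    wall_start = time.perf_counter()   # high-resolution wall clock
    cpu_start = time.process_time()    # CPU time used by this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu

# Example workload: sort a reversed range of 10000 integers.
result, wall, cpu = time_once(sorted, range(10_000, 0, -1))
print(f"wall time: {wall * 1e3:.3f} ms, CPU time: {cpu * 1e3:.3f} ms")
```

A single run like this is noisy, which is why averaging over many runs is usually preferable for short snippets.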

Now let's try ``%timeit``.

In [None]:
%timeit testfunc_append(10000)
%timeit testfunc_comprehension(10000)

You can also time an entire cell with the ``%%timeit`` cell magic.

In [None]:
%%timeit
total = 0
for i in range(1000):
  for j in range(1000):
    total += i * (-1) ** j
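
The same kind of repeated measurement is available in plain Python through the standard-library ``timeit`` module. The following is a minimal sketch, assuming two stand-in functions (``build_with_append`` and ``build_with_comprehension``) that mirror the test functions above and a hypothetical helper ``best_time``; the repeat counts are arbitrary choices.

```python
import timeit

def build_with_append(n):
    """Build [0, ..., n-1] with repeated append calls."""
    result = []
    for i in range(n):
        result.append(i)
    return result

def build_with_comprehension(n):
    """Build [0, ..., n-1] with a list comprehension."""
    return [i for i in range(n)]

def best_time(func, n=10_000, repeat=5, number=20):
    """Return the best per-call time in seconds over `repeat` rounds.

    Each round calls func(n) `number` times; taking the minimum
    suppresses noise from other processes on the machine.
    """
    timings = timeit.repeat(lambda: func(n), repeat=repeat, number=number)
    return min(timings) / number

t_append = best_time(build_with_append)
t_comp = best_time(build_with_comprehension)
print(f"append:        {t_append * 1e6:.1f} us per call")
print(f"comprehension: {t_comp * 1e6:.1f} us per call")
```

Taking the minimum rather than the mean is a design choice: slow outliers usually reflect interference from the rest of the system rather than the code being measured.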

And finally we will look at ``%lprun``. Note that the ``line_profiler`` extension must be loaded first.

In [2]:
%load_ext line_profiler

In [3]:
%lprun -u 1e-6 -f testfunc_append testfunc_append(1000)

Timer unit: 1e-06 s

Total time: 0.001351 s
File: <ipython-input-1-13b230037c66>
Function: testfunc_append at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
     1                                           def testfunc_append(n):
     2                                             """Create a list of numbers using append.
     3                                             Args:
     4                                               n (int): list length
     5                                             Returns:
     6                                               l (list): list containing the integers from 0 to n-1 inclusive
     7                                             """
     8         1          6.0      6.0      0.4    l = []
     9      1001        606.0      0.6     44.9    for i in range(n):
    10      1000        738.0      0.7     54.6      l.append(i)
    11         1          1.0      1.0      0.1    return

In [4]:
%lprun -u 1e-6 -f testfunc_comprehension testfunc_comprehension(1000)

Timer unit: 1e-06 s

Total time: 0.000238 s
File: <ipython-input-1-13b230037c66>
Function: testfunc_comprehension at line 13

Line #      Hits         Time  Per Hit   % Time  Line Contents
    13                                           def testfunc_comprehension(n):
    14                                             """Create a list of numbers using list comprehension.
    15                                             Args:
    16                                               n (int): list length
    17                                             Returns:
    18                                               l (list): list containing the integers from 0 to n-1 inclusive
    19                                             """
    20         1        237.0    237.0     99.6    l = [ i for i in range(n) ]
    21         1          1.0      1.0      0.4    return

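Even for this simple example, the list comprehension is a clear improvement over ``append``: under the profiler it takes 238 µs in total, compared with 1351 µs for the loop.
Keep in mind that the line profiler adds overhead to every traced line, which penalizes the multi-line loop more than the one-line comprehension, so ``%timeit`` gives a fairer estimate of the real speed-up.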

#### Bash

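To profile the run time of a script from a terminal, run it through ``cProfile``:
```
python -m cProfile -o output.pstats your_script_name.py
```
No ``-s`` sort option is needed here: it only affects output printed to the terminal and is ignored when ``-o`` writes the statistics to a file.
Once you have your ``output.pstats`` file, you can load, sort, and print it using the following code (adjust the path to wherever you saved the file).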

In [None]:
import pstats
p = pstats.Stats('data/output.pstats')
p.sort_stats('tottime').print_stats(50)
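
If you do not yet have a ``.pstats`` file to experiment with, the whole workflow can be reproduced from within Python. The sketch below is only illustrative: ``slow_sum`` is a hypothetical workload, and the statistics file is written to a temporary directory rather than to ``data/``.

```python
import cProfile
import io
import os
import pstats
import tempfile

def slow_sum(n):
    """Hypothetical workload: sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Collect profiling data only around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

with tempfile.TemporaryDirectory() as tmpdir:
    # Equivalent to the file produced by `python -m cProfile -o ...`.
    path = os.path.join(tmpdir, "output.pstats")
    profiler.dump_stats(path)

    # Load the file and print the 10 most expensive entries by
    # cumulative time into a string buffer instead of the terminal.
    stream = io.StringIO()
    stats = pstats.Stats(path, stream=stream)
    stats.strip_dirs().sort_stats("cumulative").print_stats(10)

report = stream.getvalue()
print(report)
```

Sorting by ``cumulative`` highlights functions whose total cost, including the calls they make, is largest, while ``tottime`` highlights time spent in the function body itself.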

### Memory usage

---

# numpy

# pandas

# numba

# multiprocess

# uproot and awkward

# scipy

# Other Useful Packages

## numericalunits

## argparse

## sympy

## tqdm

## ipywidgets