## Tips & techniques 

Profiling and optimizing code is a very broad subject, full of rabbit holes. The more you get into it, the more you find out how much more you can optimize and how much better your tests could be. It's also important to understand when to stop.

Be careful not to fall into the over-optimizing trap!

If you intend to open source your code, clarity might be much more relevant than squeezing every drop of performance out of your scripts.

In this notebook I share a few things I picked up while learning about scientific software performance. These were written with scientific development in Python in mind. If you're doing something different, like profiling a web environment, they might be completely irrelevant. They might even be irrelevant inside the scientific software ecosystem, as many variables play a role in which kind of treatment you should give to data, as we've seen in previous sessions.

#### #1 Generators are faster than list comprehensions

In [None]:
from sys import getsizeof

comp = [i for i in range(10000)]
gen = (i for i in range(10000))

# Size of the list object (its array of references, not the integers themselves)
x = getsizeof(comp)
print("x =", x)

# Size of the generator object, which holds no items at all
y = getsizeof(gen)
print("y =", y)

In [None]:
# List comprehension:
import timeit

print(timeit.timeit('''list_com = [i for i in range(100) if i % 2 == 0]''', number=1000000))

In [None]:
# Generator expression (only the generator object is created here; no items are produced):
import timeit

print(timeit.timeit('''gen_exp = (i for i in range(100) if i % 2 == 0)''', number=1000000))
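
To compare the cost of actually producing the values, both forms can be consumed, for example with `sum`. The following is a minimal sketch under that assumption; the size `n` and the repetition count are arbitrary choices, and the timings will vary between machines and Python versions.

```python
import timeit

n = 100_000  # arbitrary size chosen for illustration

# Both forms produce the same values once they are consumed.
list_total = sum([i for i in range(n)])
gen_total = sum(i for i in range(n))

# Time the full work: building the list (or generator) and summing it.
list_time = timeit.timeit("sum([i for i in range(100_000)])", number=20)
gen_time = timeit.timeit("sum(i for i in range(100_000))", number=20)

print(f"list comprehension + sum: {list_time:.4f}s")
print(f"generator expression + sum: {gen_time:.4f}s")
```

The generator's main advantage in this kind of test tends to be memory rather than speed, since it never holds all the items at once.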

#### #3 Tuples are faster than Lists

Use tuples as immutable lists.
The Python interpreter and standard library make extensive use of tuples as immutable lists, and so should you. This brings two key benefits:

- Clarity: When you see a tuple in code, you know its length will never change.
- Performance: A tuple uses less memory than a list of the same length, and it allows Python to do some optimizations.
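
The memory claim can be checked with `getsizeof`. This is a small sketch assuming CPython; the exact byte counts depend on the interpreter version and platform, so none are shown here.

```python
from sys import getsizeof

items = range(10)
as_list = list(items)
as_tuple = tuple(items)

# getsizeof reports the container itself, not the integers it refers to.
list_size = getsizeof(as_list)
tuple_size = getsizeof(as_tuple)

print("list: ", list_size, "bytes")
print("tuple:", tuple_size, "bytes")
```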

#### #2 Prefer dictionaries

If you can't use `set`s, the next best thing might be a `dict`, depending on your use case. The following table can be found on the [Fluent Python](https://www.fluentpython.com) website:

<br>

![Timing table comparing dict, set, and list](https://www.fluentpython.com/extra/internals-of-sets-and-dicts/images/table-dict-set-list-time.png)
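
A rough way to see the difference on your own machine is to time a membership test against each container. This is a sketch, not a reproduction of the table above: the container size, the needle, and the repetition count are arbitrary assumptions.

```python
import timeit

n = 10_000  # arbitrary size chosen for illustration
haystack_list = list(range(n))
haystack_set = set(haystack_list)
haystack_dict = dict.fromkeys(haystack_list)
needle = n - 1  # last element: the slowest case for a linear scan of the list

timings = {}
for name, container in [
    ("list", haystack_list),
    ("set", haystack_set),
    ("dict", haystack_dict),
]:
    # The list is scanned item by item; set and dict use hash lookups.
    timings[name] = timeit.timeit(lambda: needle in container, number=1000)
    print(f"{name:>4}: {timings[name]:.6f}s")
```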

#### #4 Chained comparisons are good

When comparing three values with each other, instead of writing `x < y and y < z`, you can write `x < y < z`.
This reads more naturally and evaluates the middle operand only once, which can also make it slightly faster to run.
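
The difference in evaluation can be made visible with a helper that counts its own calls. `middle` and `calls` are hypothetical names introduced only for this sketch.

```python
calls = 0


def middle():
    """Hypothetical helper: returns the middle value and counts its calls."""
    global calls
    calls += 1
    return 5


x, z = 1, 10

# Chained form: the middle operand is evaluated a single time.
calls = 0
chained = x < middle() < z
chained_calls = calls

# Explicit form: the middle operand appears, and is evaluated, twice.
calls = 0
explicit = x < middle() and middle() < z
explicit_calls = calls

print(chained, chained_calls)
print(explicit, explicit_calls)
```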

#### #5 When possible, sort by the key

When doing a custom sort on a list, try not to sort with a comparison function (in Python 3, one wrapped with `functools.cmp_to_key`). Instead, when possible, sort with a `key` function. The key function is called only once per item, whereas the comparison function is called several times per item during the sort.
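
The call counts can be observed directly. In this sketch the data, the seed, and the helper names `by_abs` and `compare_abs` are assumptions made for illustration; both sorts order integers by absolute value.

```python
import random
from functools import cmp_to_key

random.seed(0)  # fixed seed so the run is repeatable
values = [random.randint(-1000, 1000) for _ in range(1000)]

key_calls = 0
cmp_calls = 0


def by_abs(value):
    """Key function: called once per item."""
    global key_calls
    key_calls += 1
    return abs(value)


def compare_abs(a, b):
    """Comparison function: called once per pairwise comparison."""
    global cmp_calls
    cmp_calls += 1
    return abs(a) - abs(b)


sorted_by_key = sorted(values, key=by_abs)
sorted_by_cmp = sorted(values, key=cmp_to_key(compare_abs))

print("key function calls:       ", key_calls)
print("comparison function calls:", cmp_calls)
```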

#### #6 Sorting can be very costly, learn about the main algorithms

[Here](https://realpython.com/sorting-algorithms-python/)'s a recommended source to learn about Python implementations of some common sorting algorithms.
If you prefer a more in-depth approach, check the chapter on sorting in Introduction to Algorithms (Cormen et al.).

#### #7 Sampling

Even if you have a lot of data, there might not be much advantage from using all of it. By sampling intelligently you might be able to derive the same insight from a much more manageable subset.
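
As a small illustration, the mean of a random sample can land close to the mean of the full dataset. The synthetic population below (normally distributed values), the seed, and the sample size are all assumptions made for this sketch; real data may need stratified or otherwise more careful sampling.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Synthetic stand-in for a large dataset.
population = [random.gauss(50, 10) for _ in range(100_000)]

# Simple random sample without replacement: 1% of the data.
sample = random.sample(population, k=1_000)

full_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"full mean:   {full_mean:.2f}")
print(f"sample mean: {sample_mean:.2f}")
```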

#### #8 I/O is very costly

- For compression you'll probably find that you drop gzip and bz2, and embrace newer systems like lz4, snappy, and Z-Standard that provide better performance and random access.
- For storage formats you may find that you want self-describing formats that are optimized for random access, metadata storage, and binary encoding like Parquet, ORC, Zarr, HDF5, and GeoTIFF.
- When working on the cloud you may find that some older formats like HDF5 may not work as well.
- You may want to partition or chunk your data in ways that align well to common queries. In Dask DataFrame this might mean choosing a column to sort by for fast selection and joins. For Dask Array this might mean choosing chunk sizes that are aligned with your access patterns and algorithms.


## Techniques

### Memoization

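This is an optimization technique that saves the results of previous invocations of a function, so that repeated calls with the same arguments return the stored result instead of recomputing it.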
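
At its simplest, the idea can be sketched by hand with a dictionary. The names `slow_square`, `memoized_square`, and `_results` are hypothetical and exist only for this illustration.

```python
call_count = 0
_results = {}  # maps an argument to its previously computed result


def slow_square(n):
    """Stand-in for an expensive computation; counts how often it runs."""
    global call_count
    call_count += 1
    return n * n


def memoized_square(n):
    """Compute the result on the first call, then reuse the stored value."""
    if n not in _results:
        _results[n] = slow_square(n)
    return _results[n]


first = memoized_square(12)   # computed
second = memoized_square(12)  # served from the dictionary

print(first, second, call_count)
```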

In [None]:
# A simple timing decorator, adapted from Fluent Python
import time

def clock(func):
    def clocked(*args):
        # perf_counter is a high-resolution clock intended for measuring durations
        t0 = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - t0
        name = func.__name__
        arg_str = ', '.join(repr(arg) for arg in args)
        print('[%0.8fs] %s(%s) -> %r' % (elapsed, name, arg_str, result))
        return result
    return clocked

In [None]:
@clock
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 2) + fibonacci(n - 1)

print(fibonacci(4))

In [None]:
import functools 

@functools.cache
@clock
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 2) + fibonacci(n - 1)

print(fibonacci(4))

All the arguments taken by the decorated function must be hashable, because the underlying `lru_cache` uses a `dict` to store the results.

`@cache` won't be available to you if your Python is older than 3.9, but you can still use `@lru_cache(maxsize=None)`, which behaves the same way.

`@lru_cache` accepts two arguments: `maxsize`, an integer giving the maximum number of entries to store (128 by default; `None` removes the limit), and `typed`, a boolean which, when true, makes arguments of different types (such as `3` and `3.0`) be cached as separate entries.
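
Both arguments can be observed through `cache_info()`. The functions `square` and `double` below are hypothetical examples written for this sketch.

```python
from functools import lru_cache


@lru_cache(maxsize=2)
def square(n):
    return n * n


square(1)  # miss
square(2)  # miss
square(1)  # hit: 1 is now the most recently used entry
square(3)  # miss: the cache is full, so the least recently used entry (2) is evicted
square(2)  # miss again, because 2 was evicted
square_info = square.cache_info()
print(square_info)


@lru_cache(maxsize=None, typed=True)
def double(x):
    return x * 2


double(3)    # cached under the int argument
double(3.0)  # cached separately, because the argument is a float
double_info = double.cache_info()
print(double_info)
```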