“The real problem is that programmers have spent far too much time
worrying about efficiency in the wrong places and at the wrong times;
premature optimization is the root of all evil (or at least most of it) in programming.”
Donald Knuth

- Highlight any and all loops, especially nested loops: the higher the level of nesting, the slower they often are.
- Locate slow mathematical operations such as square roots, trigonometric functions, and so on, or anything that is not an addition or subtraction.
- Shrink the size of key data structures.
- Determine the precision (float, double, int, uint8, and so on) of frequently used variables.
- Avoid unnecessary calculations when possible.
- Attempt to simplify mathematical equations.
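
As an illustration of the points about slow mathematical operations and unnecessary calculations, here is a minimal sketch. The function names and sample points are hypothetical, and it assumes the distances only need to be compared, not reported: in that case the square root can be dropped, because it does not change the ordering.

```python
import math

def nearest_with_sqrt(points, target):
    """Find the nearest point using the true Euclidean distance (one sqrt per point)."""
    tx, ty = target
    return min(points, key=lambda p: math.sqrt((p[0] - tx) ** 2 + (p[1] - ty) ** 2))

def nearest_without_sqrt(points, target):
    """Same answer: sqrt is monotonic, so comparing squared distances is enough."""
    tx, ty = target
    return min(points, key=lambda p: (p[0] - tx) ** 2 + (p[1] - ty) ** 2)

# Hypothetical sample data
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (-2.0, 0.5)]
target = (1.2, 0.9)

print(nearest_with_sqrt(points, target))
print(nearest_without_sqrt(points, target))
```

Whether the saving matters depends on how often the function is called, so it is worth measuring before and after.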

https://wiki.python.org/moin/PythonSpeed/PerformanceTips
1. Get it right.
2. Test it's right.
3. Profile if slow.
4. Optimise.
5. Repeat from 2.
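
The "Profile if slow" step can be sketched with the standard-library `cProfile` and `pstats` modules. The function `slow_sum_of_squares` is a hypothetical stand-in for real code. This is one possible way to profile, not the only one.

```python
import cProfile
import io
import pstats

def slow_sum_of_squares(n):
    """Hypothetical hot spot: a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Collect profiling data only around the call of interest
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum_of_squares(100_000)
profiler.disable()

# Print the 5 entries with the highest cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

The report shows which functions account for most of the run time, which is where optimisation effort should go first.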

If there is a for loop over an array, there is a good chance it can be replaced with a built-in NumPy function.
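
A minimal sketch of that idea, assuming a small integer array (the variable names are illustrative): the explicit loop and the vectorised NumPy expression compute the same sum of squares.

```python
import numpy as np

values = np.arange(1_000, dtype=np.int64)

# Explicit Python loop over the array
loop_total = 0
for v in values:
    loop_total += int(v) * int(v)

# Same computation with built-in NumPy operations (no Python-level loop)
numpy_total = int(np.sum(values * values))

print(loop_total, numpy_total)
```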

## For loop vs. list comprehension

In [1]:
import random

`['5', 'w', ';', '3', 'O', 'A', 'j', 'X', 'E', 'F']`

r - number of runs (the `-r` option of `%%timeit`)

n - number of loops in one run (the `-n` option of `%%timeit`)
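
Outside a notebook, the `-r` and `-n` options of `%%timeit` correspond roughly to the `repeat` and `number` arguments of the standard-library `timeit.repeat`. The sketch below assumes a hypothetical helper, `build_letters`, and a smaller list than the cells in this section so that it runs quickly.

```python
import random
import timeit

def build_letters(count=10_000):
    """Hypothetical helper: build a list of random characters."""
    return [chr(random.randint(48, 122)) for _ in range(count)]

# repeat=2 plays the role of -r 2, number=3 the role of -n 3
totals = timeit.repeat(build_letters, repeat=2, number=3)

# Each entry in totals is the time for all 3 loops of one run,
# so divide by 3 to get the time per loop
per_loop = [t / 3 for t in totals]
print(per_loop)
```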

In [10]:
%%timeit -r 2 -n 3
letters = []
for n in range(0, 1_000_000):
    letters.append(chr(random.randint(48, 122)))

699 ms ± 4.36 ms per loop (mean ± std. dev. of 2 runs, 3 loops each)


In [12]:
%%timeit -r 2 -n 3
letters = [chr(random.randint(48, 122)) for n in range(0, 1_000_000)]

677 ms ± 9.22 ms per loop (mean ± std. dev. of 2 runs, 3 loops each)


## Concatenation

In [13]:
letters = [chr(random.randint(48, 122)) for n in range(0, 1_000_000)]

In [16]:
%%timeit -r 5 -n 5
my_str = ''
for letter in letters:
    my_str += letter  # strings are immutable, so += generally creates a new string in memory each time...

74.2 ms ± 3.56 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)


In [18]:
%%timeit -r 5 -n 5
my_str = ''.join(letters)

6.27 ms ± 1.19 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
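
One practical caveat when switching to `str.join`: every element must already be a string. A minimal sketch with hypothetical data, converting the elements with `map(str, ...)` first:

```python
# Hypothetical mixed list: strings and integers
parts = ['age: ', 18, ', id: ', 7]

# ''.join(parts) would raise TypeError because 18 and 7 are not strings,
# so convert each element to str before joining
joined = ''.join(map(str, parts))
print(joined)
```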


In [20]:
%%timeit -r 100 -n 100
first_name = 'Piotr'
last_name = 'GG'
age = 18
my_str = 'First name: ' + first_name + ' last name: ' + last_name + ' age: ' + str(age)

The slowest run took 11.46 times longer than the fastest. This could mean that an intermediate result is being cached.
1.62 µs ± 1.72 µs per loop (mean ± std. dev. of 100 runs, 100 loops each)


In [22]:
%%timeit -r 100 -n 100
first_name = 'Piotr'
last_name = 'GG'
age = 18
my_str = f'First name: {first_name} last name: {last_name} age: {age}'

The slowest run took 49.97 times longer than the fastest. This could mean that an intermediate result is being cached.
866 ns ± 2.35 µs per loop (mean ± std. dev. of 100 runs, 100 loops each)


## Testing conditions

In [28]:
%%timeit -r 1000 -n 1000
my_str = 'Hello world!'

if my_str != None:
    pass
else:
    pass

The slowest run took 73.86 times longer than the fastest. This could mean that an intermediate result is being cached.
47.4 ns ± 87.7 ns per loop (mean ± std. dev. of 1000 runs, 1000 loops each)


In [29]:
%%timeit -r 1000 -n 1000
my_str = 'Hello world!'

if my_str:
    pass
else:
    pass

37.8 ns ± 40.7 ns per loop (mean ± std. dev. of 1000 runs, 1000 loops each)
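
The two conditions timed above are not interchangeable: a comparison with `None` asks whether a value is present, while `if my_str:` asks whether it is truthy. A small sketch with hypothetical helper names, using the idiomatic `is not None` form of the first test:

```python
def is_set_explicit(value):
    """True for anything except None (idiomatic form of `!= None`)."""
    return value is not None

def is_set_truthy(value):
    """True only for truthy values; empty strings, 0 and [] count as False."""
    return bool(value)

samples = ['Hello world!', '', 0, [], None]
explicit = [is_set_explicit(v) for v in samples]
truthy = [is_set_truthy(v) for v in samples]

print(explicit)  # [True, True, True, True, False]
print(truthy)    # [True, False, False, False, False]
```

The faster check is only a safe replacement when empty or zero values should be treated the same way as `None`.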


## NumPy / Pandas

In [30]:
import numpy as np
import pandas as pd

In [32]:
%%timeit -r 5 -n 5
numbers = [i * 1000 for i in range(0, 1_000_000)]

61.8 ms ± 3.18 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)


In [35]:
%%timeit -r 5 -n 5
np.random.randn(1_000_000) * 1000

29.3 ms ± 1.56 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
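
Note that the two cells above produce different data: the list comprehension builds multiples of 1000, while `np.random.randn` draws random floats. For a like-for-like comparison, a NumPy equivalent of the list comprehension could look like the sketch below (the variable names are illustrative and no timing is claimed here).

```python
import numpy as np

# Pure-Python version, as in the list comprehension cell
numbers_list = [i * 1000 for i in range(0, 1_000_000)]

# Vectorised version producing the same values
numbers_array = np.arange(1_000_000, dtype=np.int64) * 1000

print(numbers_array[:5], numbers_array[-1])
```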


In [36]:
s = pd.Series(numbers)
s

0                 0
1              1000
2              2000
3              3000
4              4000
            ...    
999995    999995000
999996    999996000
999997    999997000
999998    999998000
999999    999999000
Length: 1000000, dtype: int64

In [38]:
%%timeit -r 5 -n 5
s.apply(lambda number: number / 1_000_000)

137 ms ± 2.76 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)


In [39]:
%%timeit -r 5 -n 5
s / 1_000_000

1.28 ms ± 810 µs per loop (mean ± std. dev. of 5 runs, 5 loops each)
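
Before replacing `apply` with a vectorised expression, it is worth confirming that both give the same result. A minimal check on a smaller, hypothetical Series (`s_small`), assuming the same kind of integer data as `s`:

```python
import numpy as np
import pandas as pd

# Smaller stand-in for the Series used in the cells above
s_small = pd.Series(np.arange(10_000, dtype=np.int64) * 1000)

# Element-by-element Python function call
via_apply = s_small.apply(lambda number: number / 1_000_000)

# Vectorised division over the whole Series
vectorised = s_small / 1_000_000

same = np.allclose(via_apply.to_numpy(), vectorised.to_numpy())
print(same)
```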
