# High Performance II

## scanl
- `itertools.accumulate(iterable[, func, *, initial=None])`
- `numpy.ufunc.accumulate(array, axis=0, dtype=None, out=None)`

In [1]:
import itertools as it
import numpy as np

In [2]:
N=100000
v=[x for x in range(N)]
vp=np.arange(N)

In [3]:
%%timeit
it.accumulate(v)

72 ns ± 0.272 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [4]:
tmp=it.accumulate(v)

In [5]:
%%time
foo=list(tmp)

CPU times: user 0 ns, sys: 2.44 ms, total: 2.44 ms
Wall time: 2.47 ms


In [6]:
%%timeit
list(it.accumulate(v))

2.18 ms ± 34.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [7]:
%%timeit
foo=[x+1 for x in v]

1.93 ms ± 117 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [8]:
%%time
np.cumsum(vp)

CPU times: user 325 µs, sys: 51 µs, total: 376 µs
Wall time: 280 µs


array([         0,          1,          3, ..., 4999750003, 4999850001,
       4999950000])

In [9]:
%%timeit
np.cumsum(vp)

42.5 µs ± 305 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [10]:
%%timeit
np.add.accumulate(vp)

40.2 µs ± 295 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


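Conclusions:
- numpy's accumulate is much faster than itertools's: about 40 µs versus 2.18 ms for N=100000, roughly 50 times faster.
- `it.accumulate(v)` on its own takes only 72 ns because it just creates a lazy iterator; the additions happen when the iterator is consumed with `list`.
- itertools's accumulate (2.18 ms) is on par with the list comprehension (1.93 ms), not faster. The comprehension `[x+1 for x in v]` is only a same-length baseline, since it does not compute a running sum.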
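
The gap between the 72 ns and 2.18 ms timings comes from laziness. A minimal sketch of that behaviour, using a small `N` so the values are easy to check; the hand-written loop and the names `running_itertools` and `running_loop` are illustrative additions, not part of the benchmark above:

```python
import itertools as it

N = 10  # small size so the results are easy to inspect
v = list(range(N))

# accumulate() only builds a lazy iterator; no sums are computed yet.
lazy = it.accumulate(v)

# Consuming the iterator is what performs the additions.
running_itertools = list(lazy)

# Hand-written running sum: a like-for-like pure-Python baseline.
running_loop = []
total = 0
for x in v:
    total += x
    running_loop.append(total)

print(running_itertools)  # [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
print(running_itertools == running_loop)  # True
```

Timing the explicit loop against `list(it.accumulate(v))` would give a fairer pure-Python comparison than `[x+1 for x in v]`; that measurement is not included here.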

## foldl
- `functools.reduce(function, iterable[, initializer])`
- `numpy.ufunc.reduce(array, axis=0, dtype=None, out=None, keepdims=False, initial=<no value>, where=True)`

In [11]:
from operator import add
from functools import reduce

v=[x for x in range(N)]
vp=np.arange(N)

In [12]:
%%timeit

reduce(add,v)

1.86 ms ± 16.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [13]:
%%timeit

s=0
for i in v:
    s+=i

2.31 ms ± 242 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [14]:
%%timeit
np.add.reduce(vp)

15 µs ± 680 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


Conclusions:
- functools's foldl (`reduce`) is slow: at 1.86 ms it is only slightly faster than a plain for-loop (2.31 ms), while `np.add.reduce` takes 15 µs, more than 100 times faster.
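
For reference, a short sketch of how the optional third argument of `functools.reduce` behaves, passed positionally here. The built-in `sum` is included as an assumption-level aside: it is not benchmarked above, and although it is implemented in C in CPython, its speed relative to `reduce(add, v)` has not been measured in this notebook.

```python
from functools import reduce
from operator import add

N = 100000
v = list(range(N))

# Left fold with no initial value: the first element seeds the accumulator.
total_reduce = reduce(add, v)

# Left fold with an explicit initial value, passed positionally.
total_seeded = reduce(add, v, 0)

# With an initial value, an empty input returns that value instead of raising.
empty_total = reduce(add, [], 0)

# Built-in sum: the same fold, specialised to addition.
total_builtin = sum(v)

print(total_reduce, total_seeded, empty_total, total_builtin)
```

Without an initial value, `reduce(add, [])` raises `TypeError`, so the seeded form is the safer choice when the input may be empty.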

## map

In [15]:
v=[x for x in range(N)]
vp=np.arange(N)

In [16]:
%%timeit

map(float,v)

65.7 ns ± 1.55 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


In [17]:
%%timeit

list(map(float,v))

2.04 ms ± 59.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [18]:
%%timeit
[float(x) for x in v]

2.56 ms ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [19]:
f=np.vectorize(float)

In [20]:
%%timeit

f(v)

9.58 ms ± 28.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [21]:
%%timeit
np.apply_along_axis(f, 0, vp)

7.2 ms ± 55.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


Conclusions:
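- Python's built-in map (2.04 ms) is slightly faster than the list comprehension (2.56 ms)
- `np.vectorize` (9.58 ms) and `np.apply_along_axis` (7.2 ms) are both slower than the pure-Python versions here. The NumPy documentation describes `np.vectorize` as a convenience wrapper that is essentially a for loop, which is consistent with this result.
- it is still unclear how to use `np.apply_along_axis` effectively for an element-wise map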
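
For this particular map (converting every element to `float`), one option that avoids a per-element Python call is `ndarray.astype`. This is a sketch of an alternative that is not timed in the cells above, so no speed claim is made; the names `floats_map` and `floats_np` are illustrative.

```python
import numpy as np

N = 100000
v = list(range(N))
vp = np.arange(N)

# Pure-Python map: one float() call per element.
floats_map = list(map(float, v))

# Vectorized conversion: a single typed copy of the whole array.
floats_np = vp.astype(float)

print(floats_np.dtype)  # float64
print(floats_np[:3])
```

Running `%%timeit` on `vp.astype(float)` would show how it compares with the `np.vectorize` and `np.apply_along_axis` cells. It only covers dtype conversion, not arbitrary Python functions.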

## mconcat

In [22]:
v=it.repeat([1,2,3,4],N)
vl=list(it.repeat([1,2,3,4],N))

In [23]:
%%timeit
foo=list(it.chain.from_iterable(vl))

4.13 ms ± 57.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [24]:
%%timeit
foo=[y for xs in vl for y in xs]

5.76 ms ± 22.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


Conclusions:
- itertools's mconcat (`chain.from_iterable`, 4.13 ms) is faster than the nested list comprehension (5.76 ms), taking roughly 30% less time
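
A small sketch checking that the two flattening approaches produce the same result, and that `chain.from_iterable` also accepts a lazy outer iterable such as the `it.repeat(...)` object assigned to `v` above. The reduced `N` and the names `flat_chain`, `flat_comp` and `flat_lazy` are illustrative.

```python
import itertools as it

N = 5  # small size so the output is easy to inspect
vl = list(it.repeat([1, 2, 3, 4], N))

# Flatten with itertools: iterates over the inner lists in C.
flat_chain = list(it.chain.from_iterable(vl))

# Flatten with a nested list comprehension.
flat_comp = [y for xs in vl for y in xs]

# The outer iterable may itself be lazy; it is consumed on demand.
flat_lazy = list(it.chain.from_iterable(it.repeat([1, 2, 3, 4], N)))

print(flat_chain == flat_comp == flat_lazy)  # True
print(len(flat_chain))  # 20
```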