
# Computation on NumPy Arrays: Universal Functions

Why is NumPy so important in the Python data science world?

Namely, it provides an easy and flexible interface to optimized computation with arrays of data.

Computation on NumPy arrays can be very fast, or it can be very slow.

The key to making it fast is to use `vectorized` operations, generally implemented through NumPy's `universal functions` (ufuncs).
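
As a minimal sketch of what a vectorized operation looks like (the array `x` is made up purely for illustration), arithmetic operators on arrays are thin wrappers around ufuncs such as `np.add`:

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
x = np.arange(5)  # array([0, 1, 2, 3, 4])

# The + operator on an array dispatches to the np.add ufunc,
# which applies the operation to every element in compiled code
via_operator = x + 10
via_ufunc = np.add(x, 10)

print(isinstance(np.add, np.ufunc))             # True
print(via_operator)                             # [10 11 12 13 14]
print(np.array_equal(via_operator, via_ufunc))  # True
```

No explicit Python loop appears anywhere: the iteration over elements happens inside NumPy.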

# The Slowness of Loops

Python's default implementation (known as CPython) does some operations very slowly.

This is in part due to the `dynamic`, `interpreted` nature of the language: because types are `flexible`, sequences of operations `cannot be compiled down` to `efficient machine code` as in languages `like C` and Fortran.

Various attempts have been made to address this weakness:
1. PyPy project: a just-in-time compiled implementation of Python; http://pypy.org/
2. Cython project: converts Python code to compilable C code; http://cython.org/
3. Numba project: converts snippets of Python code to fast LLVM bytecode; http://numba.pydata.org/




In [3]:
import numpy as np
np.random.seed(0)


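def compute_reciprocals(values):
    # Preallocate the output array (float64 by default)
    output = np.empty(len(values))
    # Explicit Python loop: one division per iteration
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output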

values = np.random.randint(1, 10, size=5)
print(values)
compute_reciprocals(values)

[6 1 4 4 8]


array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [None]:
big_array = np.random.randint(1, 100, size=1000000)
%timeit compute_reciprocals(big_array)

The reason the above code runs so slowly is the `type-checking` and `function dispatches` that CPython must do at each cycle of the loop.

Each time the reciprocal is computed, Python first examines the object's type and does a dynamic lookup of the correct function to use for that type.
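
To make the per-element type check concrete, here is a small sketch (the `mixed` list and `arr` array are hypothetical examples, not part of the notebook). A Python list may hold a different type in every slot, so the interpreter has to inspect each object before dividing, whereas a NumPy array records a single dtype for all of its elements:

```python
import numpy as np

# A list can mix types, so the right division routine
# has to be looked up again for every element at run time
mixed = [2, 4.0, True]
for item in mixed:
    print(type(item).__name__, 1.0 / item)
# int 0.5
# float 0.25
# bool 1.0

# An array stores one dtype for all elements, so the type
# is known once, up front, rather than once per element
arr = np.array([2, 4, 8])
print(np.issubdtype(arr.dtype, np.integer))  # True
print((1.0 / arr).dtype)                     # float64
```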

If we were working in compiled code instead, this type specification would be known before the code executes and the result could be computed much more efficiently.
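
One rough way to see the difference is to time the explicit loop against the equivalent vectorized expression `1.0 / sample`. This sketch uses the standard-library `timeit` module rather than the `%timeit` magic so that it also runs outside IPython. The array name `sample`, its size, and the repeat count are arbitrary choices, and the timings will vary from machine to machine:

```python
import timeit
import numpy as np

def compute_reciprocals(values):
    # Same explicit Python loop as above: one division per iteration
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

rng = np.random.RandomState(0)
# Smaller than the one-million-element array so the loop finishes quickly
sample = rng.randint(1, 100, size=10_000)

# Total time for 5 runs of each approach
loop_time = timeit.timeit(lambda: compute_reciprocals(sample), number=5)
vector_time = timeit.timeit(lambda: 1.0 / sample, number=5)

# Both approaches give the same values; only the speed differs
print(np.allclose(compute_reciprocals(sample), 1.0 / sample))  # True
print(f"loop: {loop_time:.4f} s, vectorized: {vector_time:.4f} s")
```

In the vectorized form the loop over elements runs in NumPy's compiled code, where the element type is fixed in advance.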
