# 1. Universal Functions

- NumPy is important in data science largely because of its vectorized operations, which are implemented through universal functions (ufuncs).
- Depending on how it is used, computation on NumPy arrays can be very fast or very slow.
- Ufuncs are functions that perform fast element-wise operations on arrays.
- They make repeated calculations on array elements efficient.

## (a) Slowness of loops
- Because Python is dynamic and interpreted (types are flexible), sequences of operations cannot be compiled down to efficient machine code the way they can in languages like C and Fortran.
- To overcome this slowness, several projects were developed, such as:
  1. PyPy project
  2. Cython project
  3. Numba project
- None of them has yet matched the reach and popularity of CPython (the default implementation of Python).

- Python's slowness is especially visible when many small operations are repeated, e.g. when looping over the elements of an array.

### Example
- For example, suppose we want to compute the reciprocal of each element of an array using a loop:

In [27]:
import numpy as np
from typing import Union
from numpy.typing import NDArray 
np.random.seed(0)

def compute_reciprocal(arr: NDArray[Union[np.int64, np.float64]]) -> NDArray[np.float64]:
    # Preallocate the output, then fill it one element at a time in a Python loop
    output = np.empty(len(arr))
    for i in range(len(arr)):
        output[i] = 1 / arr[i]

    return output


In [28]:
np.random.seed(0)
arr = np.random.randint(1, 10, size=5)
print(compute_reciprocal(arr))
arr


[0.16666667 1.         0.25       0.25       0.125     ]


array([6, 1, 4, 4, 8])

### Performance Check
- For a large array, this loop can take a surprisingly long time.
- Let's check:


In [29]:
big_arr = np.random.randint(1,100, size=1000000)
%timeit compute_reciprocal(big_arr)

89.2 ms ± 2.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


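- This is surprisingly slow, given that even today's cell phones perform billions of numerical operations per second.
- Why does this happen? The bottleneck is not the arithmetic itself but the type checking and function dispatch that CPython performs on every iteration of the loop: each time a reciprocal is computed, it first examines the object's type and looks up the correct function for that type.
- **Thus we use ufuncs, which are precompiled and work on arrays of a known type, so these per-element checks are avoided.**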

# 2. Introducing Ufuncs
- For many operations, NumPy provides a convenient interface to statically typed, compiled routines.
- An operation applied to a whole array pushes the loop into this compiled layer beneath NumPy, where it executes much faster.
- Compare the same reciprocal computation:

In [32]:
# Reciprocal in numpy
%timeit 1/big_arr 

689 μs ± 9.27 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
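
Outside IPython, where `%timeit` is not available, a similar comparison can be sketched with the standard-library `timeit` module. The names `reciprocal_loop` and `values` and the array size are illustrative assumptions, and timings depend on the machine, so no expected figures are given.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
values = rng.integers(1, 100, size=100_000)  # hypothetical test data


def reciprocal_loop(arr):
    # Element-by-element loop: Python handles every value individually
    out = np.empty(len(arr))
    for i in range(len(arr)):
        out[i] = 1 / arr[i]
    return out


loop_result = reciprocal_loop(values)
vector_result = 1 / values  # the loop runs inside NumPy's compiled code

# Both approaches produce the same numbers
assert np.allclose(loop_result, vector_result)

# Total time for 3 calls of each version
loop_time = timeit.timeit(lambda: reciprocal_loop(values), number=3)
vector_time = timeit.timeit(lambda: 1 / values, number=3)
print(f"loop: {loop_time:.4f} s, vectorized: {vector_time:.4f} s")
```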


- Ufuncs can operate between an array and a scalar, between two arrays, and on 1D or multidimensional arrays, in any combination.

In [38]:
np.arange(10)/np.arange(1,11)

array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ,
       0.83333333, 0.85714286, 0.875     , 0.88888889, 0.9       ])

In [46]:
x = np.arange(9).reshape(3,3)
print(f"{x=}")

print(f"{2**x=}")

x=array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
2**x=array([[  1,   2,   4],
       [  8,  16,  32],
       [ 64, 128, 256]])


- This is called a **vectorized operation**, i.e. the operation is applied to the whole array at once, with the element-by-element loop running in compiled code rather than in Python.
- **Wherever you see a loop in Python, check whether it can be replaced with a vectorized operation.**
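
As one illustration of this advice, the sketch below computes a sum of squared differences twice, once with a Python loop and once with array expressions. The arrays `a` and `b` are made-up data chosen only for the example.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical data
b = np.array([1.5, 2.5, 2.0, 5.0])  # hypothetical data

# Loop version: one subtraction and one square per Python iteration
total_loop = 0.0
for i in range(len(a)):
    total_loop += (a[i] - b[i]) ** 2

# Vectorized version: subtraction and squaring run over whole arrays
total_vec = np.sum((a - b) ** 2)

print(total_loop, total_vec)  # both are 2.5
```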

# 3. Exploring Ufuncs
- Universal functions (ufuncs) are NumPy's statically typed, compiled functions that operate element-wise on arrays.
- They come in two types:
  1. Unary - operates on a single input (e.g. reciprocal, exponential, etc.)
  2. Binary - operates on two inputs (e.g. addition)
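
The unary/binary distinction can be inspected directly: each ufunc object exposes `nin` and `nout` attributes giving its number of inputs and outputs. A minimal sketch, using `np.negative` and `np.add` as representative examples:

```python
import numpy as np

# Every ufunc records how many inputs and outputs it takes
print(np.negative.nin, np.negative.nout)  # unary: 1 input, 1 output
print(np.add.nin, np.add.nout)            # binary: 2 inputs, 1 output

x = np.arange(4)
negated = np.negative(x)  # same as -x
doubled = np.add(x, x)    # same as x + x
print(negated, doubled)
```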

In [55]:
x = np.arange(4)
print(f"{x=}")
print(f"{x+5=}")
print(f"{x-5=}")
print(f"{x/2=}")
print(f"{x//2=}")  # Floor division
print(f"{x**2=}")  # Square



x=array([0, 1, 2, 3])
x+5=array([5, 6, 7, 8])
x-5=array([-5, -4, -3, -2])
x/2=array([0. , 0.5, 1. , 1.5])
x//2=array([0, 0, 1, 1])
x**2=array([0, 1, 4, 9])


- The arithmetic operators are wrappers around ufuncs. For example:
  - `+` is a wrapper for `np.add()`

In [56]:
np.add(x,2)

array([2, 3, 4, 5])
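
The same correspondence holds for the other arithmetic operators. The sketch below pairs each operator with the NumPy ufunc it wraps and checks that both forms give identical results; the `pairs` dictionary is just a convenience for this example.

```python
import numpy as np

x = np.arange(4)

# Each entry holds (result via operator, result via the matching ufunc)
pairs = {
    "x + 5": (x + 5, np.add(x, 5)),
    "x - 5": (x - 5, np.subtract(x, 5)),
    "-x": (-x, np.negative(x)),
    "x * 2": (x * 2, np.multiply(x, 2)),
    "x / 2": (x / 2, np.divide(x, 2)),
    "x // 2": (x // 2, np.floor_divide(x, 2)),
    "x ** 2": (x ** 2, np.power(x, 2)),
    "x % 2": (x % 2, np.mod(x, 2)),
}

for expr, (via_operator, via_ufunc) in pairs.items():
    assert np.array_equal(via_operator, via_ufunc)
    print(f"{expr:7} -> {via_ufunc}")
```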

In [61]:
# Absolute value
x = np.array([-2,-1,0,1,2])
print(np.abs(x))

[2 1 0 1 2]


In [63]:
# Absolute of complex numbers
x = np.array([-2+3j,-1+1j,0,1,2])
np.abs(x)

array([3.60555128, 1.41421356, 0.        , 1.        , 2.        ])

In [70]:
# Trigonometric funcs
theta = np.linspace(0, np.pi, 4)
print(f"{theta=}")
np.sin(theta)

theta=array([0.        , 1.04719755, 2.0943951 , 3.14159265])


array([0.00000000e+00, 8.66025404e-01, 8.66025404e-01, 1.22464680e-16])

In [78]:
# Exponent
x = [1,2,3]
np.exp(x)  # e^x
np.exp2(x) # 2^x
np.power(4,x)  #4^x

array([ 4, 16, 64])
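
A notebook cell displays only its last expression, which is why the cell above shows just the result of `np.power(4, x)`. A small sketch that prints all three results explicitly:

```python
import numpy as np

x = [1, 2, 3]

e_to_x = np.exp(x)          # e^x
two_to_x = np.exp2(x)       # 2^x
four_to_x = np.power(4, x)  # 4^x

print("e^x =", e_to_x)
print("2^x =", two_to_x)
print("4^x =", four_to_x)
```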

## Specialized Funcs in Numpy

- NumPy also offers more specialized ufuncs, such as:
  - hyperbolic trig functions
  - bitwise arithmetic
  - comparison operators, etc.
- SciPy provides many more specialized functions (notably in `scipy.special`).
- So check the documentation of these packages whenever you need a particular mathematical operation.
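
A brief sketch of the three NumPy categories named above, using small made-up inputs. It touches only a few representative functions from each group and is not a complete list.

```python
import numpy as np

# Hyperbolic trig functions
x = np.array([0.0, 0.5, 1.0])
print(np.sinh(x), np.cosh(x), np.tanh(x))

# Bitwise arithmetic on integer arrays
a = np.array([0b1100, 0b1010])  # 12, 10
b = np.array([0b1010, 0b0110])  # 10, 6
both = np.bitwise_and(a, b)     # same as a & b
either = np.bitwise_or(a, b)    # same as a | b
print(both, either)

# Comparison operators return boolean arrays
larger = np.greater(a, b)       # same as a > b
same = np.equal(a, b)           # same as a == b
print(larger, same)
```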


## Advanced ufuncs Features

### (a) Output specification - 'out='

- We can store the output of an operation in a specific array using the `out` argument, which writes directly to that array's memory rather than creating a temporary array.

In [86]:
x = np.arange(5)
y = np.empty(5)
np.multiply(x,10, out=y)
y

array([ 0., 10., 20., 30., 40.])

In [92]:
x = [0,1,2,3,4]
y = np.zeros(10)
np.power(2,x, out=y[::2])   # write into every other element of y (slice is start:stop:step)
y

array([ 1.,  0.,  2.,  0.,  4.,  0.,  8.,  0., 16.,  0.])
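
To make the difference concrete, the sketch below fills one half of an array through `out=` and the other half through plain assignment. The final values look the same, but the assignment form first builds a temporary array and then copies it into place.

```python
import numpy as np

x = np.arange(5)
y = np.zeros(10)

# With out=, the ufunc writes straight into the given slice of y
result = np.multiply(x, 10, out=y[:5])
print(np.shares_memory(result, y))  # True: result is a view into y

# Without out=, the product is computed into a temporary array,
# which is then copied into the slice
y[5:] = np.multiply(x, 10)
print(y)
```

For small arrays the saving is negligible; it matters mainly when the arrays are very large.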

### (b) Aggregates

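- For a 1D array, ufunc methods can aggregate the elements, e.g. add them up:
  - `.reduce()` repeatedly applies the operation until a single result remains.
  - `.accumulate()` does the same but keeps the intermediate results.
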

In [98]:
x = np.arange(1,6)
np.add.reduce(x)  # Gives final output

np.int64(15)

In [100]:
np.add.accumulate(x)  # Stores intermediate results

array([ 1,  3,  6, 10, 15])

In [106]:
assert np.multiply.reduce(x) == 120
assert list(np.multiply.accumulate(x))== [1,2,6,24,120]
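
For these particular cases NumPy also has dedicated functions (`np.sum`, `np.prod`, `np.cumsum`, `np.cumprod`) that give the same results. The sketch below checks the equivalence and, as an extension beyond the 1D case described here, shows `reduce` applied along an axis of a small 2D array.

```python
import numpy as np

x = np.arange(1, 6)

# Dedicated functions match the ufunc methods
assert np.sum(x) == np.add.reduce(x)
assert np.prod(x) == np.multiply.reduce(x)
assert np.array_equal(np.cumsum(x), np.add.accumulate(x))
assert np.array_equal(np.cumprod(x), np.multiply.accumulate(x))

# On a 2D array, reduce works along one axis (axis=0 by default)
m = np.arange(6).reshape(2, 3)       # [[0, 1, 2], [3, 4, 5]]
col_sums = np.add.reduce(m)          # [3, 5, 7]
row_sums = np.add.reduce(m, axis=1)  # [3, 12]
print(col_sums, row_sums)
```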

### (c) Outer Products
- The `outer` method applies a ufunc to all pairs of elements from two inputs.
- For example, it can be used to create a multiplication table.

In [109]:
x = np.arange(1,6)
np.multiply.outer(x,x)

array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
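
`outer` is available on other binary ufuncs as well, and the two inputs need not have the same length. A short sketch with `np.add`, using made-up inputs `rows` and `cols`:

```python
import numpy as np

rows = np.array([1, 2, 3])  # hypothetical inputs
cols = np.array([10, 20])

# Entry [i, j] is rows[i] + cols[j], so the result has shape (3, 2)
sums = np.add.outer(rows, cols)
print(sums.shape)
print(sums)
```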

### (d) Broadcasting
- Ufuncs can operate on arrays of different sizes and shapes; this is called broadcasting.
- It is discussed separately.
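
As a brief preview only, the sketch below shows two simple cases: a scalar combined with an array, and a column combined with a row. It does not cover the full broadcasting rules.

```python
import numpy as np

row = np.arange(3)                # shape (3,)
col = np.arange(3).reshape(3, 1)  # shape (3, 1)

# The scalar is treated as if it were repeated for every element
shifted = row + 5

# A (3, 1) column and a (3,) row combine into a (3, 3) result
grid = col + row

print(shifted)
print(grid.shape)
```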