# Need for dedicated computational techniques in NumPy

Python’s default implementation (known as CPython) does some operations very
slowly. This is in part due to the dynamic, interpreted nature of the language:
because types are flexible, sequences of operations cannot be compiled down to
efficient machine code as in languages like C and Fortran.

The relative sluggishness of Python generally manifests itself in situations where
many small operations are being repeated, for instance when looping over an array to operate on each element. NumPy's UFuncs are designed to overcome this slowness.

## UFuncs
*Universal Functions* or *UFuncs* are NumPy's implementation of element-wise array computations for many kinds of operations. They provide a convenient interface to
statically typed, compiled routines. Applying such a routine to a whole array at once is known as a *vectorized* operation.
You can accomplish this by simply performing an operation on the array, which will
then be applied to each element. This vectorized approach is designed to push the
loop into the compiled layer that underlies NumPy, leading to much faster execution.


In [39]:
import numpy as np
np.__version__

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
np.random.seed(0)

'1.19.5'

In [40]:
%%timeit
a1=np.empty(1000000)
b1=np.random.randint(1,100,1000000)
for i in range(1000000):
    a1[i]=1.0/b1[i]

2.07 s ± 87.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [41]:
%timeit 1.0/np.random.randint(1,100,1000000)

11.3 ms ± 111 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


*<u>It is clear from the two timings above that an explicit Python loop is far slower than the vectorized operation when working with a large amount of data: about 2 s versus about 11 ms here.</u>*

### Arithmetic operations
Dedicated ufuncs exist for all the basic arithmetic operations, but you can also apply Python's native arithmetic operators directly to a NumPy array. Each operator is a wrapper around the corresponding unary or binary ufunc, which is applied to each element of the array.

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-0pky">Operator</th>
    <th class="tg-0pky">Equivalent ufunc</th>
    <th class="tg-0pky">Description</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0pky">+</td>
    <td class="tg-0pky">numpy.add</td>
    <td class="tg-0pky">Addition (e.g., 1 + 1 = 2)</td>
  </tr>
  <tr>
    <td class="tg-0pky">-</td>
    <td class="tg-0pky">numpy.subtract</td>
    <td class="tg-0pky">Subtraction (e.g., 3 - 2 = 1)</td>
  </tr>
  <tr>
    <td class="tg-0pky">-</td>
    <td class="tg-0pky">numpy.negative</td>
    <td class="tg-0pky">Unary negation (e.g., -2)</td>
  </tr>
  <tr>
    <td class="tg-0pky">*</td>
    <td class="tg-0pky">numpy.multiply</td>
    <td class="tg-0pky">Multiplication (e.g., 2 * 3 = 6)</td>
  </tr>
  <tr>
    <td class="tg-0pky">/</td>
    <td class="tg-0pky">numpy.divide</td>
    <td class="tg-0pky">Division (e.g., 3 / 2 = 1.5)</td>
  </tr>
  <tr>
    <td class="tg-0pky">//</td>
    <td class="tg-0pky">numpy.floor_divide</td>
    <td class="tg-0pky">Floor division (e.g., 3 // 2 = 1)</td>
  </tr>
  <tr>
    <td class="tg-0pky">**</td>
    <td class="tg-0pky">numpy.power</td>
    <td class="tg-0pky">Exponentiation (e.g., 2 ** 3 = 8)</td>
  </tr>
  <tr>
    <td class="tg-0pky">%</td>
    <td class="tg-0pky">numpy.mod</td>
    <td class="tg-0pky">Modulus/remainder (e.g., 9 % 4 = 1)</td>
  </tr>
</tbody>
</table>

In [42]:
arr=np.arange(1,6)
print(" arr            = ",arr)
print(" arr + 1        = ",arr+1)
print(" arr - 1        = ",arr-1)
print(" arr * 5        = ",arr*5)
print(" arr / 2        = ",arr/2)
print(" arr // 3       = ",arr//3)
print("-arr            = ",-arr)
print(" arr ** 2       = ",arr**2)
print(" arr % 2        = ",arr%2)
print("-(2.0*arr**4)/2 = ",-(2.0*arr**4)/2)

 arr            =  [1 2 3 4 5]
 arr + 1        =  [2 3 4 5 6]
 arr - 1        =  [0 1 2 3 4]
 arr * 5        =  [ 5 10 15 20 25]
 arr / 2        =  [0.5 1.  1.5 2.  2.5]
 arr // 3       =  [0 0 1 1 1]
-arr            =  [-1 -2 -3 -4 -5]
 arr ** 2       =  [ 1  4  9 16 25]
 arr % 2        =  [1 0 1 0 1]
-(2.0*arr**4)/2 =  [  -1.  -16.  -81. -256. -625.]
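
Each operator in the table above corresponds to a ufunc, so calling the ufunc explicitly should give the same result as using the operator. The short check below is only an illustrative sketch; the variable names are made up for this example.

```python
import numpy as np

arr = np.arange(1, 6)

# Compare each operator with the ufunc listed for it in the table
same_add = np.array_equal(arr + 1, np.add(arr, 1))
same_negative = np.array_equal(-arr, np.negative(arr))
same_floor_divide = np.array_equal(arr // 3, np.floor_divide(arr, 3))
same_power = np.array_equal(arr ** 2, np.power(arr, 2))
same_mod = np.array_equal(arr % 2, np.mod(arr, 2))

print("+  matches np.add          :", same_add)
print("-  matches np.negative     :", same_negative)
print("// matches np.floor_divide :", same_floor_divide)
print("** matches np.power        :", same_power)
print("%  matches np.mod          :", same_mod)
```
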


### Other Computations

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-c3ow{border-color:inherit;text-align:center;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-c3ow">Built-in Functions\Computations</th>
    <th class="tg-c3ow">Equivalent Numpy Functions</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-c3ow">abs()</td>
    <td class="tg-c3ow">numpy.abs()<br>&nbsp;&nbsp;&nbsp;&nbsp;or <br>numpy.absolute()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.sin()</td>
    <td class="tg-c3ow">numpy.sin()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.cos()</td>
    <td class="tg-c3ow">numpy.cos()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.tan()</td>
    <td class="tg-c3ow">numpy.tan()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.asin()</td>
    <td class="tg-c3ow">numpy.arcsin()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.acos()</td>
    <td class="tg-c3ow">numpy.arccos()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.atan()</td>
    <td class="tg-c3ow">numpy.arctan()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">e**x</td>
    <td class="tg-c3ow">numpy.exp(x)</td>
  </tr>
  <tr>
    <td class="tg-c3ow">2**x</td>
    <td class="tg-c3ow">numpy.exp2(x)</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.log()</td>
    <td class="tg-c3ow">numpy.log()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.log2()</td>
    <td class="tg-c3ow">numpy.log2()</td>
  </tr>
  <tr>
    <td class="tg-c3ow">math.log10()</td>
    <td class="tg-c3ow">numpy.log10()</td>
  </tr>
</tbody>
</table>
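
As a small sketch of how the functions in this table are used, the example below applies a few of them to whole arrays in a single call. The input values and variable names are chosen only for illustration.

```python
import math
import numpy as np

# Three angles in radians: 0, pi/2 and pi
theta = np.linspace(0, np.pi, 3)

# Each ufunc is applied to every element of the array in one call
sines = np.sin(theta)
cosines = np.cos(theta)

# Exponentials and logarithms work the same way
x = np.array([1.0, 2.0, 4.0])
natural_logs = np.log(x)      # natural logarithm, the counterpart of math.log
base2_logs = np.log2(x)
powers_of_two = np.exp2(x)    # 2 ** x, element by element

# Absolute value of every element
magnitudes = np.abs(np.array([-2, -1, 0, 1, 2]))

print("sin(theta) =", sines)
print("cos(theta) =", cosines)
print("log(x)     =", natural_logs)
print("log2(x)    =", base2_logs)
print("2 ** x     =", powers_of_two)
print("abs        =", magnitudes)
```
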

For large calculations, it is sometimes useful to be able to specify the array where the
result of the calculation will be stored. Rather than creating a temporary array, you
can use this to write computation results directly to the memory location where you’d like them to be. You can do this by using the optional <u>*out*</u> argument of any UFunc.

Consider the following code:


In [43]:
x = np.arange(5)
y = np.zeros(10) 
np.power(2, x, out=y[::2])
print(y)

array([ 1.,  2.,  4.,  8., 16.])

[ 1.  0.  2.  0.  4.  0.  8.  0. 16.  0.]


*If we had instead written `y[::2] = 2 ** x`, this would have resulted in the creation
of a temporary array to hold the results of 2 ** x, followed by a second operation
copying those values into the y array. This doesn’t make much of a difference for such
a small computation, but for very large arrays the memory savings from careful use of
the out argument can be significant.*

## Aggregates and their UFunc alternatives

Aggregates are operations in NumPy that reduce an array to summary values, such as its sum, product, minimum, or maximum.

### UFunc Aggregates

- <u>`.reduce()`</u>: This method repeatedly applies the binary ufunc to the elements of the array, in order, until a single result remains.<br>
    e.g. `numpy.add.reduce(numpy.arange(1,6))` will yield 15
- <u>`.accumulate()`</u>: This method works like `.reduce()` but returns an array holding every intermediate result of the computation.<br>
    e.g. `numpy.add.accumulate(numpy.arange(1,6))` will yield `[ 1,  3,  6, 10, 15]`
- <u>`.outer()`</u>: This method computes the output of the ufunc applied to all pairs of elements drawn from two inputs.<br>
    e.g. <br>
    `x=numpy.arange(1,6)`<br>
    `numpy.multiply.outer(x,x)`
<br><br>
    The above will yield the following:
    <br><br>

<pre>array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])</pre>

    



In [44]:
# Factorial of 10 can be computed as below:

np.multiply.reduce(np.arange(2,11))

3628800

In [45]:
# The factorials of 1 through 10 can be computed as below:

np.multiply.accumulate(np.arange(1,11))

array([      1,       2,       6,      24,     120,     720,    5040,
         40320,  362880, 3628800], dtype=int32)

In [46]:
x=np.arange(1,6)
np.mod.outer(x,x)

array([[0, 1, 1, 1, 1],
       [0, 0, 2, 2, 2],
       [0, 1, 0, 3, 3],
       [0, 0, 1, 0, 4],
       [0, 1, 2, 1, 0]], dtype=int32)

### NumPy Aggregates

NumPy also provides dedicated aggregation functions that cover the reductions shown above and many more. Most of them have a NaN-safe counterpart that ignores missing (NaN) values. Here are some of the NumPy aggregates.

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-7btt{border-color:inherit;font-weight:bold;text-align:center;vertical-align:top}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-7btt">Function Name</th>
    <th class="tg-7btt">NaN-safe Version</th>
    <th class="tg-7btt">Description</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0pky">numpy.sum</td>
    <td class="tg-0pky">numpy.nansum</td>
    <td class="tg-0pky">Compute sum of elements</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.prod</td>
    <td class="tg-0pky">numpy.nanprod</td>
    <td class="tg-0pky">Compute product of elements</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.mean</td>
    <td class="tg-0pky">numpy.nanmean</td>
    <td class="tg-0pky">Compute mean of elements</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.std</td>
    <td class="tg-0pky">numpy.nanstd</td>
    <td class="tg-0pky">Compute standard deviation</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.var</td>
    <td class="tg-0pky">numpy.nanvar</td>
    <td class="tg-0pky">Compute variance</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.min</td>
    <td class="tg-0pky">numpy.nanmin</td>
    <td class="tg-0pky">Find minimum value</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.max</td>
    <td class="tg-0pky">numpy.nanmax</td>
    <td class="tg-0pky">Find maximum value</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.argmin</td>
    <td class="tg-0pky">numpy.nanargmin</td>
    <td class="tg-0pky">Find index of minimum value</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.argmax</td>
    <td class="tg-0pky">numpy.nanargmax</td>
    <td class="tg-0pky">Find index of maximum value</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.median</td>
    <td class="tg-0pky">numpy.nanmedian</td>
    <td class="tg-0pky">Compute median of elements</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.percentile</td>
    <td class="tg-0pky">numpy.nanpercentile</td>
    <td class="tg-0pky">Compute rank-based statistics of elements</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.any</td>
    <td class="tg-0pky">N/A</td>
    <td class="tg-0pky">Evaluate whether any elements are true</td>
  </tr>
  <tr>
    <td class="tg-0pky">numpy.all</td>
    <td class="tg-0pky">N/A</td>
    <td class="tg-0pky">Evaluate whether all elements are true</td>
  </tr>
</tbody>
</table>

In [47]:
x=np.random.randint(1,100000,10)
print("Array ->",x)
print("Maximum integer ->",np.max(x))
print("Minimum integer ->",np.min(x))
print("Standard Deviation of the array ->",np.std(x))
print("Mean of the array ->",np.mean(x))
print("Median of the array ->",np.median(x))

Array -> [24087 63426 65628 95571  4979 58139  6362 31561 93853 74612]
Maximum integer -> 95571
Minimum integer -> 4979
Standard Deviation of the array -> 31631.85672324658
Mean of the array -> 51821.8
Median of the array -> 60782.5
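
The NaN-safe versions listed in the table matter when an array contains missing values. The sketch below uses a small made-up array with one NaN entry to contrast an ordinary aggregate with its NaN-safe counterparts.

```python
import numpy as np

# A small array with one missing value, represented as NaN
data = np.array([1.0, 2.0, np.nan, 4.0])

# The ordinary aggregate propagates the NaN into the result
plain_sum = np.sum(data)

# The NaN-safe versions skip the missing value
safe_sum = np.nansum(data)
safe_mean = np.nanmean(data)
safe_max = np.nanmax(data)

print("np.sum     ->", plain_sum)
print("np.nansum  ->", safe_sum)
print("np.nanmean ->", safe_mean)
print("np.nanmax  ->", safe_max)
```
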


## Broadcasting
Broadcasting is simply a
set of rules for applying binary ufuncs (addition, subtraction, multiplication, etc.) on
arrays of different sizes.

In [48]:
# Element by element operation

np.arange(3)+5 

array([5, 6, 7])

Think of the `[0, 1, 2] + 5` operation as: 
<pre>
[0, 1, 2]
    +       =  [5, 6, 7]
[5, 5, 5]
</pre>

*Note: This is only a mental model for broadcasting. NumPy does not actually duplicate the value in memory.*
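
One way to see that no copy is made is to inspect the strides of a broadcast view. The sketch below uses `np.broadcast_to` purely as an illustration; it is not a claim about what the `+` operator calls internally.

```python
import numpy as np

# A read-only view that "stretches" the scalar 5 to shape (3,)
stretched = np.broadcast_to(5, (3,))

print(stretched)          # behaves like an array of three fives
print(stretched.strides)  # a stride of 0 bytes: every element reads the same memory

# Adding the view gives the same result as adding the scalar directly
total = np.arange(3) + stretched
print(total)
```
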

### Rules of Broadcasting

- Rule 1: If the two arrays differ in their number of dimensions, the shape of the
one with fewer dimensions is padded with ones on its leading (left) side.
- Rule 2: If the shape of the two arrays does not match in any dimension, the array
with shape equal to 1 in that dimension is stretched to match the other shape.
- Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is
raised.
<p align="center">
<img src="Assets/broadcasting.jpg" align="center">
</p>

<u>Examples:</u><br>

- *Example 1:* <br>
**array1.shape -> (2,3)**<br>
**array2.shape -> (3,)**
<br>
So, according to rule 1, we pad array2.shape with a 1 on its left side.
<br>
**array1.shape -> (2,3)**<br>
**array2.shape -> (1,3)**
<br>
Now, according to rule 2, we stretch the dimension of size 1 (the first dimension of array2) to match the other array.
<br>
**array1.shape -> (2,3)**<br>
**array2.shape -> (2,3)**
<br>

Thus the resultant array will be of shape (2,3)

- *Example 2:* <br>
**array1.shape -> (3,1)**<br>
**array2.shape -> (3,)**
<br>
So, according to rule 1, we pad array2.shape with a 1 on its left side.
<br>
**array1.shape -> (3,1)**<br>
**array2.shape -> (1,3)**
<br>
Now, according to rule 2, we stretch every dimension of size 1 to match the other array. In this case, that happens in both arrays.
<br>
**array1.shape -> (3,3)**<br>
**array2.shape -> (3,3)**
<br>

Thus the resultant array will be of shape (3,3)

- *Example 3:* <br>
**array1.shape -> (3,2)**<br>
**array2.shape -> (3,)**
<br>
So, according to rule 1, we pad array2.shape with a 1 on its left side.
<br>
**array1.shape -> (3,2)**<br>
**array2.shape -> (1,3)**
<br>
Now, according to rule 2, we stretch the dimension of size 1 to match the other array. In this case, only the first dimension of array2 is stretched.
<br>
**array1.shape -> (3,2)**<br>
**array2.shape -> (3,3)**
<br>

<br>
Now, according to rule 3, the second dimension of array1.shape (2) disagrees with the corresponding dimension of array2.shape (3), and neither is equal to 1. Therefore the operation raises an error: the two arrays cannot be broadcast together.
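
The three rules and the worked examples above can be sketched as a small helper that computes the broadcast shape by hand. `broadcast_shape` is a hypothetical function written for this illustration, not part of NumPy; its result is compared with the shape reported by NumPy's own `np.broadcast` object.

```python
import numpy as np

def broadcast_shape(shape1, shape2):
    """Hypothetical helper: apply the three broadcasting rules to two shapes."""
    s1, s2 = list(shape1), list(shape2)

    # Rule 1: pad the shorter shape with ones on its left side
    ndim = max(len(s1), len(s2))
    s1 = [1] * (ndim - len(s1)) + s1
    s2 = [1] * (ndim - len(s2)) + s2

    result = []
    for d1, d2 in zip(s1, s2):
        if d1 == d2:
            result.append(d1)
        elif d1 == 1:
            # Rule 2: a dimension of size 1 is stretched to match the other
            result.append(d2)
        elif d2 == 1:
            result.append(d1)
        else:
            # Rule 3: sizes disagree and neither is 1
            raise ValueError(f"shapes {tuple(shape1)} and {tuple(shape2)} cannot be broadcast")
    return tuple(result)

# Examples 1 and 2 from the text
shape_ex1 = broadcast_shape((2, 3), (3,))
shape_ex2 = broadcast_shape((3, 1), (3,))
print(shape_ex1, np.broadcast(np.empty((2, 3)), np.empty((3,))).shape)
print(shape_ex2, np.broadcast(np.empty((3, 1)), np.empty((3,))).shape)

# Example 3 from the text is expected to fail under rule 3
try:
    broadcast_shape((3, 2), (3,))
    example3_failed = False
except ValueError as err:
    example3_failed = True
    print("Error:", err)
```
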

In [56]:
# Example 1
# Broadcasting an array with higher dimension, e.g. 2-D array + 1-D array

np.ones((2,3))+np.arange(3)

array([[1., 2., 3.],
       [1., 2., 3.]])

In [50]:
# Example 2
# Broadcasting in cases when both arrays need broadcasting

np.arange(3)[:,np.newaxis]+np.arange(3)

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

In [55]:
# Example 3
# Broadcasting with incompatible dimensions

np.ones((3,2))+np.arange(3)

ValueError: operands could not be broadcast together with shapes (3,2) (3,) 