***
## 3.5 Numpy Operations

### Element-wise array-array operations
When we add, subtract, multiply, or divide two arrays, NumPy applies the operation element by element by default.
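As a minimal sketch of this behaviour (the arrays `u` and `v` are illustrative names, not taken from the notebook), each operator combines the elements at matching positions:

```python
import numpy as np

# Illustrative arrays (hypothetical names, not from the notebook)
u = np.array([10, 20, 30])
v = np.array([1, 2, 5])

total = u + v        # element-wise sum
difference = u - v   # element-wise difference
product = u * v      # element-wise product (not a matrix product)
ratio = u / v        # element-wise true division, returns floats

print(total, difference, product, ratio)
```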

***
### Python3.1 Numpy Introduction
### Python3.2 Numpy DataTypes, Functions, and Random Module
### Python3.3 Numpy Iterating Over Arrays
### Python3.4 Numpy Manipulating Arrays
### Python3.5 Numpy Operations
### Python3.6 Numpy File Input and Output and Data Processing
### Python3.7 Numpy-Sort, Argsort, Nonzero, and Extract Functions
### Python3.8 Numpy BreakoutGroupExercises
### Python3.8 Numpy BreakoutGroupExercises - Solutions
***

***
## Table of Contents
### 1. Arithmetic Functions
![image.png](attachment:image.png)
### 2. Statistics Functions
![image-3.png](attachment:image-3.png)
### 3. Comparison
***
Some small sample data used in this notebook:
```python
import numpy as np

a = np.array([1,2,3])
b = np.array([(1.5,2,3), (4,5,6)], dtype = float)
c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]],  dtype = float)
```

## 1. Arithmetic

You can easily perform arithmetic between two arrays, or between an array and a scalar. Let's see some examples:

In [58]:
import numpy as np
arr = np.arange(0,10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

### 1) Array with array

In [59]:
arr + arr

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [60]:
arr * arr

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [61]:
arr - arr

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [62]:
# 0/0 triggers a warning, not an error;
# the result at that position is nan
arr/arr

array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [63]:
# Python gives error
1/0

ZeroDivisionError: division by zero

In [64]:
# NumPy gives a warning rather than an error; the result is infinity
1/arr

array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ,
       0.2       , 0.16666667, 0.14285714, 0.125     , 0.11111111])
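If these warnings are expected, one option is to silence them locally with `np.errstate`; another is to skip the zero entries using the `where` argument of `np.divide`. The following is a sketch of both approaches rather than the only way to handle this, and `safe` is an illustrative name:

```python
import numpy as np

arr = np.arange(0, 10)

# Suppress the divide-by-zero and invalid-value warnings inside this block only
with np.errstate(divide="ignore", invalid="ignore"):
    inv = 1 / arr        # inf at index 0
    ratio = arr / arr    # nan at index 0 (0/0)

# Alternatively, divide only where the denominator is non-zero;
# skipped positions keep the value already stored in `out` (0.0 here)
safe = np.divide(1.0, arr, out=np.zeros(arr.shape), where=arr != 0)

print(inv[0], ratio[0], safe[0])
```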

### 2) Array with scalars

In [65]:
arr + 100

array([100, 101, 102, 103, 104, 105, 106, 107, 108, 109])

In [66]:
arr - 100

array([-100,  -99,  -98,  -97,  -96,  -95,  -94,  -93,  -92,  -91])

In [67]:
arr**3

array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729], dtype=int32)
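In the scalar examples above, the scalar is combined with every element of the array, a mechanism NumPy calls broadcasting. The sketch below shows that adding a scalar matches adding an array filled with that scalar, and that the same rule stretches a column against a row; the names `explicit` and `grid` are illustrative:

```python
import numpy as np

arr = np.arange(0, 10)

shifted = arr + 100                      # scalar applied to every element
explicit = arr + np.full_like(arr, 100)  # same result with an explicit array

# Shapes (2, 1) and (3,) broadcast to a (2, 3) result
grid = np.array([[0], [10]]) + np.array([1, 2, 3])

print(np.array_equal(shifted, explicit))
print(grid)
```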

Exercise: Create `x = np.random.random((5,3))` and `y = np.random.random((5,3))`, then apply the element-wise operations to them.

In [68]:
np.random.seed(345)
x = np.random.random((5,3))
np.random.seed(567)
y = np.random.random((5,3))

In [69]:
x

array([[0.37092674, 0.66488423, 0.496658  ],
       [0.60207905, 0.24418494, 0.49474806],
       [0.71054041, 0.45139238, 0.04703766],
       [0.53097772, 0.5530579 , 0.11443496],
       [0.24547424, 0.8367019 , 0.90579506]])

In [70]:
y

array([[0.30478164, 0.95303297, 0.96470869],
       [0.34376214, 0.99388576, 0.30207403],
       [0.87623109, 0.70564267, 0.68115012],
       [0.54826595, 0.57399478, 0.930455  ],
       [0.60177031, 0.54688225, 0.92238086]])

In [71]:
x[0, :]

array([0.37092674, 0.66488423, 0.496658  ])

In [72]:
# Select the second column of x
x[:, 1]

array([0.66488423, 0.24418494, 0.45139238, 0.5530579 , 0.8367019 ])

In [73]:
x + y

array([[0.67570839, 1.6179172 , 1.4613667 ],
       [0.94584118, 1.23807071, 0.79682209],
       [1.58677151, 1.15703505, 0.72818778],
       [1.07924367, 1.12705269, 1.04488996],
       [0.84724456, 1.38358415, 1.82817592]])

In [74]:
x - y

array([[ 0.0661451 , -0.28814874, -0.46805069],
       [ 0.25831691, -0.74970082,  0.19267403],
       [-0.16569068, -0.25425029, -0.63411246],
       [-0.01728824, -0.02093688, -0.81602004],
       [-0.35629607,  0.28981965, -0.0165858 ]])

In [75]:
x * y

array([[0.11305166, 0.63365659, 0.4791303 ],
       [0.20697198, 0.24269194, 0.14945054],
       [0.6225976 , 0.31852172, 0.03203971],
       [0.291117  , 0.31745235, 0.10647658],
       [0.14771911, 0.45757742, 0.83548803]])

In [76]:
x / y

array([[1.21702455, 0.69765081, 0.51482692],
       [1.75144085, 0.24568713, 1.63783713],
       [0.81090527, 0.63968975, 0.06905623],
       [0.96846742, 0.96352427, 0.12298818],
       [0.40792016, 1.52994891, 0.98201849]])

In [77]:
from numpy.random import randint

In [78]:
randint(3,6)

5

### 3) Universal array functions

NumPy comes with many [universal array functions](http://docs.scipy.org/doc/numpy/reference/ufuncs.html) (ufuncs), which are mathematical operations applied element-wise across an array. Let's show some common ones:

In [79]:
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [80]:
#Taking Square Roots
np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

In [81]:
#Calculating the exponential (e^x)
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

In [82]:
np.max(arr) #same as arr.max()

9

In [83]:
arr.max()

9

In [84]:
np.sin(arr)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

In [85]:
np.log(arr)

array([      -inf, 0.        , 0.69314718, 1.09861229, 1.38629436,
       1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458])
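To illustrate what "element-wise" means for the functions above, the sketch below compares `np.sqrt` with an equivalent pure-Python loop. The loop is shown only for comparison; `roots_loop` is an illustrative name.

```python
import math
import numpy as np

arr = np.arange(0, 10)

# A ufunc applies the operation to every element in a single call
roots = np.sqrt(arr)

# Equivalent (but slower) pure-Python loop, for comparison only
roots_loop = np.array([math.sqrt(v) for v in arr])

print(np.allclose(roots, roots_loop))
print(isinstance(np.sqrt, np.ufunc))  # np.sqrt is a ufunc object
```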

### reciprocal()
This function returns the reciprocal of its argument, element-wise. For integer arrays, elements with absolute value larger than 1 always give 0, because the result is truncated to an integer. For an integer 0, a warning is issued and the result is not meaningful.

In [86]:
import numpy as np 
a = np.array([0.25, 1.33, 1, 0, 100])  
np.reciprocal(a) 

array([4.       , 0.7518797, 1.       ,       inf, 0.01     ])

In [87]:
b = np.array([100], dtype = int) 
np.reciprocal(b) 

array([0], dtype=int32)
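A short sketch of the integer behaviour and one way around it: converting to float before taking the reciprocal. The array `ints` is illustrative.

```python
import numpy as np

ints = np.array([1, 2, 4, 100])

# Integer input: the result stays integer, so 1/2, 1/4 and 1/100 become 0
int_result = np.reciprocal(ints)

# Converting to float first gives the mathematically expected values
float_result = np.reciprocal(ints.astype(float))

print(int_result)
print(float_result)
```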

### power()
This function treats elements in the first input array as base and returns it raised to the power of the corresponding element in the second input array.

In [88]:
a = np.array([10,100,1000]) 
np.power(a,2) 

array([    100,   10000, 1000000], dtype=int32)

In [89]:
b = np.array([1,2,3]) 
np.power(a,b)

array([        10,      10000, 1000000000], dtype=int32)
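As a small sketch, `np.power` gives the same result as the `**` operator, and a scalar exponent is applied to every base. The names below are illustrative.

```python
import numpy as np

base = np.array([10, 100, 1000])
exponent = np.array([1, 2, 3])

by_function = np.power(base, exponent)  # element-wise base ** exponent
by_operator = base ** exponent          # operator form of the same ufunc

# A float exponent returns floats; 0.5 is the square root
roots = np.power(base, 0.5)

print(by_function, by_operator, roots)
```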

### mod()
This function returns the element-wise remainder of dividing the first input array by the second. The function numpy.remainder() produces the same result.

In [90]:
a = np.array([10,20,30]) 
b = np.array([3,5,7]) 
np.mod(a,b) 
np.remainder(a,b) 

array([1, 0, 2], dtype=int32)
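One detail worth knowing with negative numbers: the result of `np.mod` takes the sign of the divisor, like Python's `%` operator. The sketch below contrasts it with `np.fmod`, which is not covered in this notebook and takes the sign of the dividend instead.

```python
import numpy as np

values = np.array([-7, 7])

mod_result = np.mod(values, 3)          # sign follows the divisor
rem_result = np.remainder(values, 3)    # same function under another name
fmod_result = np.fmod(values, 3)        # sign follows the dividend

print(mod_result, rem_result, fmod_result)
```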

#### The following functions are used to perform operations on arrays with complex numbers.

numpy.real() − returns the real part of the complex data type argument.

numpy.imag() − returns the imaginary part of the complex data type argument.

numpy.conj() − returns the complex conjugate, which is obtained by changing the sign of the imaginary part.

numpy.angle() − returns the angle of the complex argument. The function has a `deg` parameter: if True, the angle is returned in degrees; otherwise it is returned in radians.

In [91]:
# Examples:
a = np.array([-5.6j, 0.2j, 11. , 1+1j]) 
np.real(a) 
np.imag(a) 
np.conj(a) 
np.angle(a) 
np.angle(a, deg = True)

array([-90.,  90.,   0.,  45.])
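Because a notebook cell only displays the value of its last expression, the cell above shows just the final result. The sketch below prints each result separately; `z` is an illustrative name for the same array.

```python
import numpy as np

z = np.array([-5.6j, 0.2j, 11.0, 1 + 1j])

real_part = np.real(z)             # real parts
imag_part = np.imag(z)             # imaginary parts
conjugate = np.conj(z)             # sign of each imaginary part flipped
radians = np.angle(z)              # angles in radians (default)
degrees = np.angle(z, deg=True)    # angles in degrees

for name, value in [("real", real_part), ("imag", imag_part),
                    ("conj", conjugate), ("rad", radians), ("deg", degrees)]:
    print(name, value)
```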

In [92]:
import numpy as np
e = np.full((2,2),7) # Create a constant array 
f = np.eye(2) # Create a 2X2 identity matrix
e.dot(f) # Dot product


array([[7., 7.],
       [7., 7.]])
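The dot product above is a matrix product, which differs from the element-wise `*` used earlier in this notebook. A minimal sketch of the contrast, using the same `e` and `f` (the `@` operator is an equivalent spelling of the matrix product for 2-D arrays):

```python
import numpy as np

e = np.full((2, 2), 7)   # constant 2x2 array
f = np.eye(2)            # 2x2 identity matrix

elementwise = e * f      # multiplies matching positions: off-diagonals become 0
matrix_prod = e.dot(f)   # matrix product: multiplying by the identity leaves e unchanged
operator_prod = e @ f    # same matrix product written with the @ operator

print(elementwise)
print(matrix_prod)
```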

## 2. Statistics Aggregate Functions:
NumPy has quite a few useful statistical functions for finding the minimum, maximum, percentile, standard deviation, variance, etc. of the elements in an array. Some common ones, using the sample arrays `a` and `b` from the start of this notebook:

- a.sum(): Array-wise sum
- a.min(): Array-wise minimum value
- b.max(axis=0): Maximum value of each column
- b.cumsum(axis=1): Cumulative sum of the elements along each row
- a.mean(): Mean
- np.median(b): Median (a function, not an array method)
- np.corrcoef(b): Correlation coefficients between the rows (a function, not an array method)
- np.std(b): Standard deviation
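The calls listed above can be tried on the sample arrays from the start of this notebook. This is a sketch for orientation; note that `median` and `corrcoef` are called as `np.` functions because arrays do not provide them as methods.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

total = a.sum()              # sum of all elements
smallest = a.min()           # smallest element
col_max = b.max(axis=0)      # maximum of each column
row_cumsum = b.cumsum(axis=1)  # running total along each row
mean_a = a.mean()            # arithmetic mean
median_b = np.median(b)      # median of the flattened array
corr = np.corrcoef(b)        # 2x2 matrix: correlation between the two rows
std_b = np.std(b)            # population standard deviation

print(total, smallest, col_max, mean_a, median_b)
print(row_cumsum)
print(corr)
print(std_b)
```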

#### amin() and amax()
These functions return the minimum and the maximum from the elements in the given array along the specified axis.

In [93]:
a

array([-0.-5.6j,  0.+0.2j, 11.+0.j ,  1.+1.j ])

In [94]:
import numpy as np 
a = np.array([[3,7,5],[8,4,3],[2,4,9]]) 
np.amin(a,axis=1)

array([3, 3, 2])

In [95]:
np.amin(a,axis=0)

array([2, 4, 3])

In [96]:
np.amax(a)

9

In [97]:
np.amax(a, axis = 0)

array([8, 7, 9])

#### `percentile()` function
A percentile (or centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group fall.

`numpy.percentile(a, q, axis)`: Compute the q-th percentile of the data along the specified axis. Returns the q-th percentile(s) of the array elements.

Main Parameters:

- `a`: array_like. Input array or object that can be converted to an array.
- `q`: array_like of float. Percentile or sequence of percentiles to compute, which must be between 0 and 100 inclusive.
- `axis`: {int, tuple of int, None}, optional. Axis or axes along which the percentiles are computed. The default is to compute the percentile(s) along a flattened version of the array.

Reference: https://docs.scipy.org/doc/numpy/reference/generated/numpy.percentile.html


In [98]:
a

array([[3, 7, 5],
       [8, 4, 3],
       [2, 4, 9]])

In [99]:
np.percentile(a,50)

4.0

In [100]:
np.percentile(a,25)

3.0
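Since `q` may be a sequence and `axis` may be set, a sketch of both options on the same array follows. The variable names are illustrative.

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

# Several percentiles of the flattened array in one call
quartiles = np.percentile(a, [25, 50, 75])

# 50th percentile computed separately for each row
row_medians = np.percentile(a, 50, axis=1)

print(quartiles)
print(row_medians)
```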

#### `quantile()` function 

`numpy.quantile(a, q, axis=None)`: computes the q-th quantile of the data along the specified axis.

Main Parameters:

- `a`: array_like. Input array or object that can be converted to an array.
- `q`: array_like of float. Quantile or sequence of quantiles to compute, which must be between 0 and 1 inclusive.
- `axis`: {int, tuple of int, None}, optional. Axis or axes along which the quantiles are computed. The default is to compute the quantile(s) along a flattened version of the array.

Reference: https://docs.scipy.org/doc/numpy/reference/generated/numpy.quantile.html

In [101]:
np.quantile(a,.5, axis = 0)

array([3., 4., 5.])
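The only difference from `percentile()` is the scale of `q`: a quantile of 0.25 corresponds to the 25th percentile. A quick sketch to confirm this on the same array:

```python
import numpy as np

a = np.array([[3, 7, 5], [8, 4, 3], [2, 4, 9]])

q25 = np.quantile(a, 0.25)    # q on a 0..1 scale
p25 = np.percentile(a, 25)    # q on a 0..100 scale

# Column-wise medians, computed both ways
col_q = np.quantile(a, 0.5, axis=0)
col_p = np.percentile(a, 50, axis=0)

print(q25, p25)
print(col_q, col_p)
```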

#### `numpy.median()` function
The median is defined as the value separating the higher half of a data sample from the lower half.


In [102]:
a = np.array([[30,65,70],[80,95,10],[50,90,60]]) 
np.median(a) 

65.0

In [103]:
# Applying median() function along axis 0:
np.median(a, axis = 0) 

array([50., 90., 60.])

In [104]:
# Applying median() function along axis 1:
np.median(a, axis = 1)

array([65., 80., 60.])
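The examples above all have an odd number of values along the chosen axis. With an even count there is no single middle value, and `np.median` returns the mean of the two middle values. A minimal sketch with illustrative arrays:

```python
import numpy as np

odd = np.array([10, 30, 20])
even = np.array([4, 1, 3, 2])

median_odd = np.median(odd)     # middle value of the sorted data: 20
median_even = np.median(even)   # mean of the two middle values 2 and 3

print(median_odd, median_even)
```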

**Difference Between Average and Mean**: In everyday use the two terms usually refer to the same quantity. In NumPy the difference is a practical one:
- `np.mean()` computes the arithmetic mean: the sum of the values divided by the number of values.
- `np.average()` returns the same result by default, but also accepts a `weights` argument to compute a weighted average.
- Use `np.mean()` when every value counts equally, and `np.average()` with weights when some values should count more than others.

### mean()
The arithmetic mean is the sum of the elements along an axis divided by the number of elements. The numpy.mean() function returns the arithmetic mean of the elements in the array. If an axis is specified, the mean is calculated along that axis.

In [105]:
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]]) 
np.mean(a) 
np.mean(a, axis = 0) 
np.mean(a, axis = 1)

array([2., 4., 5.])

### average()
Weighted average is an average resulting from the multiplication of each component by a factor reflecting its importance. The numpy.average() function computes the weighted average of elements in an array according to their respective weight given in another array. The function can have an axis parameter. If the axis is not specified, the array is flattened.

Considering an array [1,2,3,4] and corresponding weights [4,3,2,1], the weighted average is calculated by adding the product of the corresponding elements and dividing the sum by the sum of weights.

Weighted average = `(1*4 + 2*3 + 3*2 + 4*1) / (4 + 3 + 2 + 1)` = 20/10 = 2.0
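The arithmetic above can be checked by writing the formula out with array operations and comparing it with `np.average`. This is a sketch; `values`, `weights`, and `manual` are illustrative names.

```python
import numpy as np

values = np.array([1, 2, 3, 4])
weights = np.array([4, 3, 2, 1])

# Sum of value*weight products divided by the sum of the weights
manual = (values * weights).sum() / weights.sum()

# The same quantity computed by NumPy
builtin = np.average(values, weights=weights)

print(manual, builtin)
```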

In [106]:
a = np.array([1,2,3,4])
np.average(a), np.mean(a)

(2.5, 2.5)

In [107]:
# Weighted average; without weights, np.average() is the same as np.mean()
wts = np.array([4,3,2,1]) 
np.average(a,weights = wts) 

2.0

In [108]:
# If returned=True, the sum of the weights is returned as well
np.average([1,2,3,4],weights = [4,3,2,1], returned = True)

(2.0, 10.0)

#### Standard Deviation
Standard deviation is the square root of the average of squared deviations from the mean. The formula for standard deviation is as follows:

`std = sqrt(mean(abs(x - x.mean())**2))`

In [109]:
import numpy as np 
np.std([1,2,3,4])

1.118033988749895
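The formula can be written out directly and compared with `np.std`. The sketch also shows the `ddof` parameter, which is not used elsewhere in this notebook: `ddof=1` divides by `n - 1` instead of `n`, giving the sample standard deviation.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# The formula written out step by step
manual_std = np.sqrt(np.mean(np.abs(x - x.mean()) ** 2))

population_std = np.std(x)       # divides by n (the default, ddof=0)
sample_std = np.std(x, ddof=1)   # divides by n - 1

print(manual_std, population_std, sample_std)
```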

#### Variance
Variance is the average of squared deviations from the mean, i.e., `mean(abs(x - x.mean())**2)`.


In [110]:
np.var([1,2,3,4])

1.25
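As a quick consistency check, the variance equals the square of the standard deviation. The following sketch confirms this for the same data, using illustrative names:

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# Average of squared deviations from the mean
manual_var = np.mean(np.abs(x - x.mean()) ** 2)

variance = np.var(x)
std_squared = np.std(x) ** 2

print(manual_var, variance, std_squared)
```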

In [111]:
np.log([1,2,3,4])

array([0.        , 0.69314718, 1.09861229, 1.38629436])

## 3. Comparison
- Element-wise comparison

In [112]:
a = np.array([1,2,3])
b = np.array([(1.5,2,3), (4,5,6)], dtype = float)
a == b 

array([[False,  True,  True],
       [False, False, False]])

In [113]:
a < 2

array([ True, False, False])

- Array-wise comparison

In [114]:
 np.array_equal(a, b)

False
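An element-wise comparison returns a boolean array, which is often reduced with `.any()` or `.all()` or used as a mask. The sketch below shows these along with `np.allclose`, which is not covered in this notebook and compares floats within a tolerance. Note that `np.array_equal(a, b)` above is False because the shapes differ.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)

equal = a == b                 # a is broadcast against each row of b
any_equal = equal.any()        # True if at least one position matches
all_equal = equal.all()        # True only if every position matches

small = a[a < 2]               # boolean mask selects matching elements

same = np.array_equal(a, np.array([1, 2, 3]))  # same shape and same values

# Floating-point rounding: exact equality fails, tolerance-based check passes
exact = (0.1 + 0.2) == 0.3
close = np.allclose(np.array([0.1 + 0.2]), np.array([0.3]))

print(any_equal, all_equal, small, same, exact, close)
```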

## Further reading

- http://numpy.scipy.org
- http://scipy.org/Tentative_NumPy_Tutorial
- http://scipy.org/NumPy_for_Matlab_Users - A Numpy guide for MATLAB users.

#### Note: The course materials are developed mainly based on personal experience and contributions from the Python learning community
Referred Books: 
- Learning Python, 5th Edition by Mark Lutz
- Python Data Science Handbook by Jake VanderPlas
- Python for Data Analysis by Wes McKinney

Copyright ©2023 Mei Najim. All rights reserved. 