# NumPy basic operations

NumPy and SciPy are open-source add-on modules for Python that provide common mathematical and numerical routines as fast, pre-compiled functions. They have grown into mature packages whose functionality meets, or perhaps exceeds, that of common commercial software such as MATLAB. The NumPy (Numeric Python) package provides basic routines for manipulating large arrays and matrices of numeric data. The SciPy (Scientific Python) package extends NumPy with a substantial collection of useful algorithms for minimization, Fourier transformation, regression, and other applied mathematical techniques.

## Installation

Install the numpy and scipy libraries from __File__ > __Settings__ > __Project: project name__ > __Project Interpreter__. Locate the __+__ sign on the right to open the _Available Packages_ window. Type numpy or scipy and search, then click _Install Package_.

### Importing Numpy module

The standard approach is to use a simple import statement.

In [1]:
import numpy

When making many calls to NumPy functions, it can become tedious to write `numpy.X` over and over again. Instead, it's common to import the module under the brief name `np`.

In [2]:
import numpy as np

This statement will allow us to access NumPy objects using `np.X` instead of `numpy.X`. It is also possible to import NumPy directly into the current namespace so that we don't have to use dot notation at all, but rather simply call the functions as if they were built-in:

In [3]:
#from numpy import *

However, this strategy is usually frowned upon in Python programming because it starts to remove some of the nice organization that modules provide. For the remainder of this tutorial, we will assume that the `import numpy as np` has been used.

## Arrays

The central feature of NumPy is the `array` object class. Arrays are similar to `list` in Python, except that every element of an array must be of the same type, typically a numeric type like `float` or `int`. Arrays make operations with large amounts of numeric data very fast and are generally much more efficient than lists.

In [4]:
a = np.array([1, 4, 5, 8], dtype=float)
#a[-3],a[1]
# a[-3:-1]

In [5]:
type(a)

numpy.ndarray

Array elements are accessed, sliced and manipulated just like lists.

In [6]:
a[:2]

array([1., 4.])

In [7]:
a[3]

8.0

In [8]:
a[0] = 5

In [9]:
a

array([5., 4., 5., 8.])

Arrays can be multi-dimensional. Unlike lists, different axes are accessed using commas inside bracket notation. 

In [10]:
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)

In [11]:
a

array([[1., 2., 3.],
       [4., 5., 6.]])

In [12]:
a[0,1]

2.0

### Slicing np.array

Array slicing works with multiple dimensions in the same way as usual, applying each slice specification as a filter to a specified dimension. Use of a single __":"__ in a dimension indicates the use of everything along that dimension:

In [13]:
a[1,:]

array([4., 5., 6.])

In [14]:
a[:,2]

array([3., 6.])

In [15]:
a[-2:,-2:]

array([[2., 3.],
       [5., 6.]])

### shape

The `shape` property of an array returns a tuple with the size of each array dimension.

In [16]:
a.shape

(2, 3)

### dtype

The `dtype` property tells you what type of data values are stored by the array.

In [17]:
a.dtype

dtype('float64')

### len

When used with an array, the `len` function returns length of the first axis.

In [18]:
len(a)

2

The `in` operator can be used to test whether a value is present in an array.

In [19]:
2 not in a

False

In [20]:
0 in a

False

### reshape

Arrays can be reshaped using tuples that specify new dimensions. 

In [21]:
aa = np.arange(10, dtype=float)
print('Before reshape')
aa.shape

Before reshape


(10,)

In [22]:
aa = aa.reshape((5,2))
print('After reshape')
aa

After reshape


array([[0., 1.],
       [2., 3.],
       [4., 5.],
       [6., 7.],
       [8., 9.]])

In [23]:
aa.shape

(5, 2)
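
As a small sketch of the same idea, one entry of the shape tuple may be given as `-1`, in which case NumPy infers that dimension from the total number of elements. The variable names `flat` and `grid` are illustrative only.

```python
import numpy as np

flat = np.arange(12, dtype=float)

# -1 asks NumPy to infer this dimension from the total size (12 / 3 = 4).
grid = flat.reshape((3, -1))
print(grid.shape)  # (3, 4)

# The total number of elements must be preserved; otherwise a ValueError is raised.
try:
    flat.reshape((5, 2))
except ValueError as err:
    print("cannot reshape:", err)
```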

### copy

The `copy` function can be used to create a **new**, **separate copy** of an array in memory.

In [24]:
a = np.array([1, 2, 3], dtype=float)
b = a
c = a.copy()

In [25]:
# Edit a
a[0] = 0
a

array([0., 2., 3.])

In [26]:
b

array([0., 2., 3.])

In [27]:
c

array([1., 2., 3.])

<div class="alert alert-block alert-success">

<b> EXERCISE: </b>

  <ul>
    <li> Can you show that b refers to the same array object as a, while c is a separate copy? </li>
  </ul>

</div>

In [28]:
# List from np.array
a = np.array([1, 2, 3], dtype=float)
a.tolist()  # list(a) also works, but its elements stay NumPy scalars rather than Python floats.

[1.0, 2.0, 3.0]

One can fill an array with a single value.

In [29]:
a = np.array([1, 2, 3], float)
a.fill(0)
a

array([0., 0., 0.])

<div class="alert alert-block alert-success">

<b> EXERCISE: </b>

<ul>
    <li> Can you find the mean, std, var and median of <tt>a</tt> analytically? </li>
    <li> Now, write Python code and match your outputs. </li>
</ul>

</div>

### transpose

In [30]:
# Transpose of an array
a = np.array(range(9), float).reshape((3,3))
a
# matrix decomposition (QR, Cholesky, SVD)

array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])

In [31]:
a.transpose()

array([[0., 3., 6.],
       [1., 4., 7.],
       [2., 5., 8.]])

In [32]:
a.trace()

12.0

### flatten
One-dimensional versions of multi-dimensional arrays can be generated with `flatten()`.

In [33]:
a = np.array([[1, 2, 3], [4, 5, 6]], float)
a.flatten()

array([1., 2., 3., 4., 5., 6.])

### np.concatenate 
Two or more arrays can be concatenated together using the `concatenate` function with a tuple of the arrays to be joined.

In [34]:
a = np.array([1,2], float)
b = np.array([3,4,5,6], float)
c = np.array([7,8,9], float)
np.concatenate([a, b, c], axis=0)

array([1., 2., 3., 4., 5., 6., 7., 8., 9.])

If an array has more than one dimension, it is possible to specify the axis along which multiple arrays are concatenated. By default (without specifying an axis), NumPy concatenates along the first dimension.

In [35]:
a = np.array([[1, 2], [3, 4]], float)
b = np.array([[5, 6], [7,8]], float)
np.concatenate((a,b))

array([[1., 2.],
       [3., 4.],
       [5., 6.],
       [7., 8.]])

In [36]:
np.concatenate((a,b), axis=0)

array([[1., 2.],
       [3., 4.],
       [5., 6.],
       [7., 8.]])

In [37]:
np.concatenate((a,b), axis=1)

array([[1., 2., 5., 6.],
       [3., 4., 7., 8.]])

Finally, the dimensionality of an array can be increased using the `newaxis` constant in bracket notation.

In [38]:
a = np.array([1, 2, 3], float)
a.shape

(3,)

In [39]:
a[:,np.newaxis]

array([[1.],
       [2.],
       [3.]])

In [40]:
a[:,np.newaxis].shape

(3, 1)

### `np.stack()`

Stacking is similar to concatenation, except that the arrays are joined along a new axis.

In [41]:
a = np.array([1,2,3])
b = np.array([4,5,6])
np.stack([a,b])

array([[1, 2, 3],
       [4, 5, 6]])

In [42]:
# Stack horizontally: 1-D arrays are joined end to end.
np.hstack([a,b])

array([1, 2, 3, 4, 5, 6])

In [43]:
# Stack vertically: each 1-D array becomes a row.
np.vstack([a,b])

array([[1, 2, 3],
       [4, 5, 6]])
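
The difference between stacking and concatenating can be seen from the resulting shapes. The following is a minimal sketch; the names `p`, `q`, `rows`, `cols` and `joined` are chosen here for illustration.

```python
import numpy as np

p = np.array([1, 2, 3])
q = np.array([4, 5, 6])

# np.stack always creates a new axis; `axis` selects where it is inserted.
rows = np.stack([p, q], axis=0)   # shape (2, 3): each input becomes a row
cols = np.stack([p, q], axis=1)   # shape (3, 2): each input becomes a column

# np.concatenate joins along an existing axis, so the result stays 1-D.
joined = np.concatenate([p, q])   # shape (6,)

print(rows.shape, cols.shape, joined.shape)
```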

### arange

The `arange` function is similar to the `range` function but returns an array.

In [44]:
np.arange(10, dtype=float)

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [45]:
np.arange(1, 7, 2, dtype=float)

array([1., 3., 5.])

### np.ones(), np.zeros()

The functions `zeros` and `ones` create new arrays of specified dimensions filled with these values.

In [46]:
np.ones((2,3), dtype=float)

array([[1., 1., 1.],
       [1., 1., 1.]])

In [47]:
np.zeros(7, dtype=int)

array([0, 0, 0, 0, 0, 0, 0])

### np.zeros_like(), np.ones_like()

The `zeros_like` and `ones_like` functions create a new array with the same dimensions and type as an existing one.

In [48]:
a = np.array([[1, 2, 3], [4, 5, 6]], int)
np.zeros_like(a)

array([[0, 0, 0],
       [0, 0, 0]])

In [49]:
np.ones_like(a)

array([[1, 1, 1],
       [1, 1, 1]])

### np.identity

To create an identity matrix of a given size:

In [50]:
np.identity(4, dtype=float)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

The `np.eye` function returns matrices with ones along the $k^{th}$ diagonal.

In [51]:
np.eye(4, k=-1, dtype=float)

array([[0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.]])

## Array mathematics

Standard mathematical operations can be applied on `np.array` objects.

In [52]:
a = np.array([1,2,3], float)
b = np.array([4,5,6], float)
a + b

array([5., 7., 9.])

In [53]:
a - b

array([-3., -3., -3.])

In [54]:
a * b

array([ 4., 10., 18.])

In [55]:
a / b

array([0.25, 0.4 , 0.5 ])

In [56]:
a % b

array([1., 2., 3.])

In [57]:
b**a

array([  4.,  25., 216.])

For two-dimensional arrays, multiplication remains elementwise and does not correspond to matrix multiplication.

In [58]:
a = np.array([[1,2], [3,4]], float)
b = np.array([[2,0], [1,3]], float)
a * b

array([[ 2.,  0.],
       [ 3., 12.]])

In [59]:
# For matrix multiplication, use np.matmul.
np.matmul(a,b)

array([[ 4.,  6.],
       [10., 12.]])
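
Matrix multiplication can also be written with the `@` operator, which is equivalent to `np.matmul` for arrays like these. A short sketch, with illustrative names `x`, `y` and `product`:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], float)
y = np.array([[2, 0], [1, 3]], float)

# The @ operator performs matrix multiplication, unlike * which is elementwise.
product = x @ y
print(product)
print(np.array_equal(product, np.matmul(x, y)))  # True
```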

A `ValueError` is raised if the arrays do not match in size and cannot be broadcast to a common shape.

In [60]:
a = np.array([1,2,3], float)
b = np.array([4,5], float)
#a + b
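
The error can be observed safely by wrapping the operation in a `try` block. This is only a sketch; the names `u`, `v` and `raised` are illustrative.

```python
import numpy as np

u = np.array([1, 2, 3], float)
v = np.array([4, 5], float)

# Shapes (3,) and (2,) are incompatible, so the addition raises ValueError.
try:
    u + v
    raised = False
except ValueError as err:
    raised = True
    print("shape mismatch:", err)
```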

Arrays that do not match in the number of dimensions will be *broadcast* by NumPy to perform mathematical operations. This often means that the smaller array will be repeated as necessary to perform the operation indicated.

The following diagram illustrates the concept of *broadcasting*. [Scipy](http://www.scipy-lectures.org)

<img src="../data/numpy_broadcasting.png"> </img>

In [61]:
a = np.array([[1,2], [3,4], [5,6]], float)
b = np.array([-1, 3], float)
a + b

array([[0., 5.],
       [2., 7.],
       [4., 9.]])

Here, the 1-dimensional array `b` was broadcast to a 2-dimensional array that matched the size of `a`. In effect, `b` was repeated for each row in `a`.

NumPy automatically broadcasts arrays in this manner. However, broadcasting can be ambiguous. In such cases, we can use the `np.newaxis` constant to specify how we want to broadcast.

In [62]:
a = np.zeros((2,2), float)
b = np.array([-1, 3], float)
a + b

array([[-1.,  3.],
       [-1.,  3.]])

In [63]:
a + b[np.newaxis,:]

array([[-1.,  3.],
       [-1.,  3.]])

In [64]:
a + b[:,np.newaxis]

array([[-1., -1.],
       [ 3.,  3.]])
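
To see what each choice of `np.newaxis` does, the expanded arrays can be materialised with `np.broadcast_to`. This is a sketch for inspection only; the names `row`, `as_rows` and `as_cols` are illustrative.

```python
import numpy as np

row = np.array([-1, 3], float)

# Shape (1, 2) expands by repeating the values down the rows.
as_rows = np.broadcast_to(row[np.newaxis, :], (2, 2))

# Shape (2, 1) expands by repeating the values across the columns.
as_cols = np.broadcast_to(row[:, np.newaxis], (2, 2))

print(as_rows)
print(as_cols)
```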

## Array Iteration

In [65]:
a = np.array([1, 4, 5], np.int32)
for x in a:
    print(x, end=' ')

1 4 5 

In [66]:
a = np.array([[1,2], [3,4], [5,6]], float)
for (x,y) in a:
    print(x*y, end=' ')

2.0 12.0 30.0 

## Basic array operations

In [67]:
a = np.array([2,4,3], float)
a.sum() # or np.sum(a)

9.0

In [68]:
a.prod() # or np.prod(a)

24.0

In [69]:
a.mean() # Mean

3.0

In [70]:
a.var() # Variance

0.6666666666666666

In [71]:
a.std() # Standard deviation

0.816496580927726

In [72]:
a.min()

2.0

In [73]:
a.max()

4.0

In [74]:
# Index of (min,max) values.
a.argmin()

0

In [75]:
a.argmax()

1

For multi-dimensional arrays, each of these functions can take an optional argument `axis` that will perform the operation along only the specified axis.

In [76]:
a = np.array([[0,2], [3, -1], [3,5]], float, order='C')
a

array([[ 0.,  2.],
       [ 3., -1.],
       [ 3.,  5.]])

In [77]:
a[1,0]

3.0

In [78]:
a.mean(axis=0)

array([2., 2.])

In [79]:
a.mean(axis=1)

array([1., 1., 4.])

In [80]:
a.max(axis=0)

array([3., 5.])
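
A reduction along an axis normally drops that axis. As a sketch, passing `keepdims=True` keeps it with length one, so the result broadcasts back against the original array. The names `data`, `row_means` and `centered` are illustrative.

```python
import numpy as np

data = np.array([[0, 2], [3, -1], [3, 5]], float)

# keepdims=True gives shape (3, 1) instead of (3,).
row_means = data.mean(axis=1, keepdims=True)

# The (3, 1) result broadcasts across the columns of the (3, 2) array.
centered = data - row_means
print(row_means.shape)
print(centered.mean(axis=1))  # every row now averages to zero
```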

In [81]:
a = np.array([6,2,5,-1,0],float)
sorted(a, reverse=True)

[6.0, 5.0, 2.0, 0.0, -1.0]
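
The built-in `sorted` returns a Python list. To stay with arrays, `np.sort` and `np.argsort` can be used instead, as in this sketch (the names `vals`, `ascending`, `descending` and `order` are illustrative):

```python
import numpy as np

vals = np.array([6, 2, 5, -1, 0], float)

ascending = np.sort(vals)      # returns a sorted copy as an array
descending = ascending[::-1]   # reverse with a slice
order = np.argsort(vals)       # indices that would sort the array

print(ascending)
print(descending)
print(order)
```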

Values in an array can be *clipped* to be within a specified range.

In [82]:
a = np.array([6, 2, 5, -1, 0], float)
a.clip(0,5)

array([5., 2., 5., 0., 0.])

In [83]:
a = np.array([1, 1, 4, 5, 5, 5, 7],float)
np.unique(a)

array([1., 4., 5., 7.])

In [84]:
a = np.array([[1,2], [3,4]], float)
a.diagonal()

array([1., 4.])

### Logical operations

Compound Boolean expressions can be applied to arrays on an element-by-element basis using special functions `logical_and`, `logical_or`, and `logical_not`.

In [85]:
a = np.array([1,3,0], float)
np.logical_and(a > 0, a < 3)

array([ True, False, False])

In [86]:
b = np.array([True, False, True], bool)
np.logical_not(b)

array([False,  True, False])

In [87]:
c = np.array([False, True, False], bool)
np.logical_or(b,c)

array([ True,  True,  True])
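
For Boolean arrays, the operators `&`, `|` and `~` give the same elementwise results as the functions above. The parentheses around each comparison are required because these operators bind more tightly than `>` and `<`. A short sketch with illustrative names:

```python
import numpy as np

vals = np.array([1, 3, 0], float)

mask_func = np.logical_and(vals > 0, vals < 3)
mask_op = (vals > 0) & (vals < 3)   # same result using the & operator

print(mask_op)
print(np.array_equal(mask_func, mask_op))  # True
```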

The `np.where` function forms a new array from two arrays of equivalent size using a Boolean filter to choose between elements of the two.

In [88]:
a = np.array([2, 3, 0], float)
# 1 / a divides by zero for the last element, which triggers a RuntimeWarning.
# To suppress warnings, use the command: import warnings; warnings.simplefilter('ignore')
np.where(a != 1, 1 / a, a)

array([0.5       , 0.33333333,        inf])

In [None]:
np.where(a > 0, 3, 2)

array([3, 3, 2])
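
Note that `np.where` evaluates both candidate arrays in full before selecting, so an expression like `1 / a` is still computed for zero entries. The sketch below shows two possible ways to take a reciprocal while leaving zeros at `0.0`; the names `vals`, `recip` and `recip_masked` are illustrative.

```python
import numpy as np

vals = np.array([2, 3, 0], float)

# Option 1: silence the divide-by-zero warning locally, then discard the inf.
with np.errstate(divide='ignore'):
    recip = np.where(vals != 0, 1 / vals, 0.0)

# Option 2: only divide where the condition holds; other entries keep the
# value already stored in `out`.
recip_masked = np.divide(1.0, vals, out=np.zeros_like(vals), where=vals != 0)

print(recip)
print(recip_masked)
```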

In [None]:
a = np.array([1, np.nan, np.inf], float)  # the np.NaN and np.Inf aliases were removed in NumPy 2.0
np.isnan(a)

array([False,  True, False])

In [None]:
np.isinf(a)

array([False, False,  True])

### tile

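`np.tile` repeats a whole array a specified number of times along each axis. This is similar to `repmat` in MATLAB, whereas MATLAB's [repelem](https://www.mathworks.com/help/matlab/ref/repelem.html), which repeats individual elements, corresponds to `np.repeat`.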

In [None]:
a = np.array([1,2,3], float)
np.tile(a,2)

array([1., 2., 3., 1., 2., 3.])

In [None]:
a = np.array([[1,2], [3,4]], float)
np.tile(a,2)

array([[1., 2., 1., 2.],
       [3., 4., 3., 4.]])

In [None]:
np.tile(a,[2,1])

array([[1., 2.],
       [3., 4.],
       [1., 2.],
       [3., 4.]])
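
For comparison, here is a minimal sketch contrasting `np.tile`, which repeats the array as a block, with `np.repeat`, which repeats each element in place. The names `base`, `tiled` and `repeated` are illustrative.

```python
import numpy as np

base = np.array([1, 2, 3], float)

tiled = np.tile(base, 2)       # whole array repeated: 1 2 3 1 2 3
repeated = np.repeat(base, 2)  # each element repeated: 1 1 2 2 3 3

print(tiled)
print(repeated)
```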

## Array item selection and manipulation

In [None]:
a = np.array([[6, 4], [5, 9]], float)
a >= 6

array([[ True, False],
       [False,  True]])

In [None]:
a[a >= 6]

array([6., 9.])

In [None]:
a[np.logical_and(a > 5, a < 9)]

array([6.])
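
The same Boolean masks can be used on the left-hand side of an assignment to modify only the selected elements. A small sketch, using an illustrative array named `grid`:

```python
import numpy as np

grid = np.array([[6, 4], [5, 9]], float)

# Only the entries where the mask is True are overwritten.
grid[grid >= 6] = 0
print(grid)
```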

A special function `take` is also available to perform selection with integer arrays. This works in an identical manner as bracket selection.

In [None]:
a = np.array([2, 4, 6, 8], float)
b = np.array([0, 0, 1, 3, 2, 1], int)
a.take(b)

array([2., 2., 4., 8., 6., 4.])

The opposite of the `take` function is the `put` function, which will take values from a source array and place them at specified indices in the array calling `put`. 

In [None]:
a = np.array([0, 1, 2, 3, 4, 5], float)
b = np.array([9, 8, 7], float)
a.put([0, 3], b)
a

array([9., 1., 2., 8., 4., 5.])

In [None]:
a = np.arange(5, dtype=float)
a.put([0,3], 5)
a

## Vector and matrix mathematics

In [None]:
a = np.array([1, 2, 3], float)
b = np.array([0, 1, 1], float)
np.dot(a, b)

In [None]:
a = np.array([[0, 1], [2, 3]], float)
c = np.array([[1, 1], [4, 0]], float)
np.cross(c,a)

### Inner and Outer products

Given two vectors $\vec{u} = (u_1, u_2, \dots, u_m)$ and $\vec{v} = (v_1, v_2, \dots, v_n)$, the outer product $\vec{u} \otimes \vec{v}$ is the $m \times n$ matrix __A__ obtained by multiplying each element of $\vec{u}$ by each element of $\vec{v}$, so that $A_{ij} = u_i v_j$.

$$
  \vec{u} \otimes \vec{v} = \begin{bmatrix}
  u_1 \\
  u_2 \\
  \vdots \\
  u_m
  \end{bmatrix} \begin{bmatrix}
  v_1 & v_2 & \cdots & v_n \end{bmatrix}
  = \begin{bmatrix}
      u_1v_1 & u_1v_2 & \cdots & u_1v_n \\
      u_2v_1 & u_2v_2 & \cdots & u_2v_n \\
      \vdots & \vdots & \ddots & \vdots \\
      u_mv_1 & u_mv_2 & \cdots & u_mv_n \\
  \end{bmatrix}
$$

In [None]:
a = np.array([1, 4, 0], float)
b = np.array([2, 2, 1], float)
np.outer(a,b)
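
The definition of the outer product (element $i, j$ is $u_i v_j$) can be checked against broadcasting, as in this sketch. The names `u`, `v`, `outer_builtin` and `outer_broadcast` are illustrative.

```python
import numpy as np

u = np.array([1, 4, 0], float)
v = np.array([2, 2, 1], float)

outer_builtin = np.outer(u, v)

# A column (3, 1) times a row (1, 3) broadcasts to the same 3 x 3 matrix.
outer_broadcast = u[:, np.newaxis] * v[np.newaxis, :]

print(outer_builtin)
print(np.array_equal(outer_builtin, outer_broadcast))  # True
```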

In [None]:
a = np.array([[1,2], [3,4]]) 
b = np.array([[11, 12], [13, 14]])

# [[1 2]]   [[11 12]]  = [[1*11+2*12 1*13+2*14]]
# [[3 4]] x [[13 14]]    [[3*11+4*12 3*13+4*14]]
np.inner(a,b)
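
For 2-D inputs, `np.inner` sums over the last axis of both arrays, so entry `[i, j]` is the dot product of row `i` of the first array with row `j` of the second. The sketch below checks that this matches multiplying by the transpose; `m1`, `m2` and `inner_result` are illustrative names.

```python
import numpy as np

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[11, 12], [13, 14]])

inner_result = np.inner(m1, m2)

# Row-by-row dot products are the same as m1 times the transpose of m2.
print(inner_result)
print(np.array_equal(inner_result, m1 @ m2.T))  # True
```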

NumPy also comes with a number of built-in routines for linear algebra calculations. These can be found in the sub-module `np.linalg`.

In [None]:
a = np.array([[4, 2, 0], [9, 3, 7], [1, 2, 1]], float)
np.linalg.det(a)

You can also find eigenvalues and eigenvectors of a matrix.

In [None]:
evals, evecs = np.linalg.eig(a)
evals, evecs

In [None]:
np.linalg.inv(a)
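
These results can be sanity-checked numerically. The following sketch verifies that the inverse multiplies back to the identity and that each eigenpair satisfies $A\vec{x} = \lambda\vec{x}$, up to floating-point tolerance. The names `m`, `m_inv`, `inverse_ok` and `eig_ok` are illustrative.

```python
import numpy as np

m = np.array([[4, 2, 0], [9, 3, 7], [1, 2, 1]], float)

# A matrix times its inverse should be (close to) the identity.
m_inv = np.linalg.inv(m)
inverse_ok = np.allclose(m @ m_inv, np.eye(3))

# Column j of evecs is the eigenvector for evals[j], so m @ evecs should
# equal evecs with each column scaled by its eigenvalue.
evals, evecs = np.linalg.eig(m)
eig_ok = np.allclose(m @ evecs, evecs * evals)

print(inverse_ok, eig_ok)
```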

You can also find singular values.

In [None]:
a = np.array([[1, 3, 4], [5, 2, 3]], float)
U, s, Vh = np.linalg.svd(a)
U, s, Vh
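
As a check on the decomposition, the original matrix can be rebuilt from its factors. This sketch assumes `full_matrices=False`, which returns the reduced factors so that the shapes line up directly; the names `mat` and `rebuilt` are illustrative.

```python
import numpy as np

mat = np.array([[1, 3, 4], [5, 2, 3]], float)

# Reduced SVD: U is (2, 2), s is (2,), Vh is (2, 3).
U, s, Vh = np.linalg.svd(mat, full_matrices=False)

# U * s scales each column of U by the matching singular value.
rebuilt = (U * s) @ Vh
print(np.allclose(rebuilt, mat))  # True
```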

### Statistics

In addition to the `mean`, `var` and `std` functions, NumPy provides several other methods for returning statistical measures of an array.

In [None]:
# Median
a =  np.array([1, 4, 3, 8, 9, 2, 3], float)
np.median(a)

In [None]:
# Correlation coefficient
a = np.array([[1, 2, 1, 3], [5, 3, 1, 8]], float)
c = np.corrcoef(a)
c

The relationship between the correlation coefficient matrix, R, and the covariance matrix, C, is 

$$ R_{ij} = \frac{C_{ij}}{\sqrt{C_{ii} \times C_{jj}}} $$

In [None]:
np.cov(a)
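
The relationship $R_{ij} = C_{ij} / \sqrt{C_{ii} C_{jj}}$ can be verified numerically. In this sketch the names `obs`, `C`, `R`, `d` and `R_from_C` are illustrative.

```python
import numpy as np

obs = np.array([[1, 2, 1, 3], [5, 3, 1, 8]], float)

C = np.cov(obs)        # covariance matrix
R = np.corrcoef(obs)   # correlation coefficient matrix

# sqrt(C_ii * C_jj) for every pair (i, j) is the outer product of the
# standard deviations taken from the diagonal of C.
d = np.sqrt(np.diag(C))
R_from_C = C / np.outer(d, d)

print(np.allclose(R, R_from_C))  # True
```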

## Random numbers

To generate random numbers using NumPy's built-in pseudo-random number generator, use the `np.random` module.

In [None]:
# Set seed for reproducibility. Any program that starts with the same seed will generate exactly the same sequence of 
# random numbers each time it's run.
np.random.seed(42)

In [None]:
np.random.rand(5)

In [None]:
np.random.rand(2,3)

In [None]:
np.random.rand()

In [None]:
np.random.randint(5, 10)

In [None]:
# To draw from the discrete Poisson distribution.
np.random.poisson(6.0)
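
Newer NumPy versions (1.17 and later) also offer a `Generator` interface, created with `np.random.default_rng`, which keeps the seed local to one object instead of using global state. The sketch below mirrors the calls above; the variable names are illustrative, and the drawn values are not shown because they depend on the generator.

```python
import numpy as np

rng = np.random.default_rng(42)            # seeded generator object

uniform = rng.random(5)                    # 5 floats in [0, 1)
ints = rng.integers(5, 10, size=3)         # 3 integers in [5, 10)
counts = rng.poisson(6.0, size=4)          # 4 Poisson draws with mean 6

# A second generator with the same seed reproduces the same sequence.
rng_again = np.random.default_rng(42)
same = np.array_equal(uniform, rng_again.random(5))
print(same)  # True
```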

### View

`np.ndarray.view()` returns a new view (a shallow copy) of an array: a separate array object that shares the same underlying data.

In [None]:
a = np.ones((2,5))
b = a.view()
b.shape = 5,2
b

In [None]:
a

In [None]:
id(b), id(a)
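
Although a view is a distinct Python object, it shares memory with the original array, so writing through one is visible through the other. A short sketch with illustrative names `orig` and `v`:

```python
import numpy as np

orig = np.ones((2, 5))
v = orig.view()

# Writing through the view changes the original array as well.
v[0, 0] = 7.0
print(orig[0, 0])                 # 7.0

print(v is orig)                  # False: two different array objects
print(v.base is orig)             # True: the view is built on orig's data
print(np.shares_memory(orig, v))  # True
```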

<div class="alert alert-block alert-success">
    <b> EXERCISE (BASICS): </b>
    <ol>
        <li> Write a program to test whether none of the elements in a given array are zero. </li>
        <li> Write a program to test whether two arrays are element-wise equal within a given tolerance.</li>
        <li> Write a program to create a 2-D array, where rows are as: 10 zeros, 10 ones, and 10 fives. </li>
    </ol>
</div>

<div class="alert alert-block alert-success">
    <b> EXERCISE (LINEAR ALGEBRA): </b>
    <ol>
        <li> Write a program to compute (A<sup>T</sup> * A)<sup>-1</sup> * A<sup>T</sup>, where A is a random NumPy matrix of size=(5,3). </li>
        <li> Write a program to compute determinant of a given square matrix. </li>
        <li> Write a program to compute the <a href="https://www.w3resource.com/w3r_images/linear-algebra-image-exercise-8.png">Kronecker product</a> of two multi-dimensional arrays. <font color="red">(Try both using loops and in-built functions)</font> </li>
        <li> Write a program to compute the sum of diagonal elements of a given array. </li>
        <li> Write a program to compute <a href="https://user-images.githubusercontent.com/1092464/32795502-268c2e8c-c96c-11e7-9dd2-127546a21756.png">Frobenius norm</a> of a given array.</li>
    </ol>
</div>

<div class="alert alert-block alert-success">
    <b> EXERCISE (STATISTICS): </b>
    <ol>
        <li> Write a program to compute the minimum and maximum value of a given array along second axis. For example, Original array: [[0 1],[2 3]], max: [1 3], min: [0 2].</li>
        <li> Write a NumPy program to compute the weighted average of a given array. Pythonic way is: <font color="red"> x = np.arange(5), wts = np.arange(1,6), res = (x * (wts / wts.sum())).sum()</font>. Bonus: Can you do this along a specified axis? </li>
        <li> Write a program to compute covariance of two given arrays. For example, a: [0 1 3], b: [2 4 5], output: [[2.333 2.16667], [2.16667 2.333]]</li>
    </ol>
</div>