# NumPy Basics

NumPy is the standard numerical library in the Python ecosystem. It provides fast computations on numerical array-like structures. The central object of the `numpy` library is the `ndarray`: a *homogeneous* $n$-dimensional array, meaning that all its elements have the same type.

The efficiency of `ndarray` objects comes from the fact that element-wise operations are implemented in C, which avoids the overhead of the Python interpreter. Loop-like operations on array-like structures can often be rewritten using the corresponding `ndarray` operations. This process is called *vectorisation*; it improves efficiency and should be kept in mind in scientific programming.
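
As a minimal sketch of vectorisation, the following compares an explicit Python loop with the equivalent element-wise `numpy` expression. The function names `squares_loop` and `squares_vectorised` are made up for this illustration.

```python
import numpy as np

def squares_loop(values):
    """Square each element with an explicit Python loop."""
    result = []
    for v in values:
        result.append(v * v)
    return result

def squares_vectorised(values):
    """Square each element with a single element-wise array operation."""
    arr = np.asarray(values, dtype=float)
    return arr * arr

data = [1.0, 2.0, 3.0, 4.0]
loop_result = squares_loop(data)        # plain Python list
vec_result = squares_vectorised(data)   # ndarray holding the same values
print(loop_result)
print(vec_result)
```

Both functions compute the same values; the vectorised version delegates the loop to compiled code.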

## Defining an `ndarray` object

In [1]:
import numpy as np

In [2]:
import math as m

In [3]:
np_matrix = np.array([1, 2, 4., 5])


In [4]:
np_matrix.shape

(4,)

A $2$-dimensional `ndarray` can be built from a list of lists, the matrix being given row by row. An `ndarray` object comes with many attributes; we will meet a number of them along the way. The `shape` attribute gives the dimensions of the array.

In [5]:
np_matrix = np.array([[1, 2, 3, 4.], [5., -1, 2, 9]])

In [6]:
np_matrix.shape

(2, 4)
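
Besides `shape`, a few other attributes describe the layout of an array. The short sketch below reads them on the same matrix; the variable names on the left-hand side are only illustrative.

```python
import numpy as np

np_matrix = np.array([[1, 2, 3, 4.], [5., -1, 2, 9]])

n_rows, n_cols = np_matrix.shape   # number of rows and columns
n_axes = np_matrix.ndim            # number of dimensions (axes)
n_elements = np_matrix.size        # total number of elements
element_type = np_matrix.dtype     # common type of all elements

print(n_rows, n_cols, n_axes, n_elements, element_type)
```

Because the input mixes integers and floats, all elements are stored as floats, in line with the homogeneity of `ndarray`s.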

In many cases one has to initialize an `ndarray`, either with random coefficients or as a matrix of a specific type. Here are the standard constructors.

In [11]:
np.zeros((2, 3))

array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])

In [12]:
np.ones((2, 3))

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [13]:
np.identity(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [14]:
np.diag((1, 2, 3, 4))

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])
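
The constructors above accept a few useful variations. The following sketch shows three of them: the `dtype` argument, `np.full` for a constant array, and `np.eye` for a rectangular identity-like array. The variable names are illustrative.

```python
import numpy as np

# Integer zeros instead of the default floating-point zeros.
int_zeros = np.zeros((2, 3), dtype=int)

# A 2x2 array in which every element equals 7.0.
sevens = np.full((2, 2), 7.0)

# Ones on the main diagonal of a 2x3 array, zeros elsewhere.
rect_eye = np.eye(2, 3)

print(int_zeros)
print(sevens)
print(rect_eye)
```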

To build a random `ndarray` one can use the random generators built into `numpy`: `rand` draws from the uniform distribution on $[0, 1)$ and `randn` from the standard normal distribution.

In [7]:
from numpy.random import rand

In [8]:
rand(4, 4)

array([[ 0.41336162,  0.89542031,  0.56737693,  0.60750911],
       [ 0.93169608,  0.83901428,  0.75380584,  0.05409844],
       [ 0.84159329,  0.58687299,  0.51530587,  0.96867285],
       [ 0.65552795,  0.73184669,  0.01226483,  0.07002372]])

In [9]:
from numpy.random import randn

In [10]:
randn(200, 3)

array([[ -4.55080470e-01,   5.39304983e-02,   6.38493564e-01],
       [  1.23265778e+00,  -1.18745385e+00,  -6.69545744e-01],
       [ -1.28361235e+00,   4.05081434e-01,   5.68576169e-01],
       [  1.21525682e+00,   5.74137585e-01,   1.14004630e+00],
       [ -8.27522150e-01,  -8.40261991e-01,   1.02615353e-01],
       [ -3.24975917e-02,   1.08087243e+00,  -2.81723806e-01],
       [  5.39437615e-01,   9.63894558e-02,   5.07409353e-01],
       [ -1.30581632e+00,   1.60500924e+00,   8.03904981e-01],
       [ -3.71653313e-01,  -4.14321890e-01,   1.28911794e+00],
       [  1.65363698e+00,   4.16786360e-01,   3.06611590e-01],
       [ -6.29953487e-01,  -1.71565150e+00,  -9.71488349e-01],
       [ -2.25898429e+00,  -2.36749181e+00,  -5.61474953e-01],
       [ -1.04600949e+00,  -9.16620522e-01,   1.01584127e+00],
       [  4.68672317e-01,  -1.46080545e-01,   1.19693414e+00],
       [  8.12479547e-01,  -3.30502516e-01,  -4.72907739e-01],
       [  4.46873892e-01,   4.35076842e-01,  -5.0804829
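
Random arrays differ from run to run, which makes results hard to reproduce. One possible approach, sketched here under the assumption that NumPy 1.17 or later is installed, is to use a seeded `Generator` obtained from `np.random.default_rng`. The seed value `42` and the variable names are arbitrary choices for this illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

uniform_a = rng_a.random((4, 4))   # uniform samples on [0, 1)
uniform_b = rng_b.random((4, 4))   # identical to uniform_a

# Standard normal samples, same shape as the randn(200, 3) call above.
normal_a = rng_a.standard_normal((200, 3))

print(np.array_equal(uniform_a, uniform_b))
print(normal_a.shape)
```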

A useful way of building matrices is to reshape an existing array, for instance the one-dimensional array built from a list. Passing `-1` for one dimension lets `numpy` infer it from the total number of elements.

In [38]:
np_matrix.reshape(4, 2)

array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5., -1.],
       [ 2.,  9.]])

In [40]:
A = [1]*30
A

[1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1]

In [42]:
np_A = np.array(A)

In [48]:
np_B = np_A.reshape(3, -1)

In [49]:
np_B

array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

In [50]:
np_B.shape

(3, 10)
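
The role of `-1` in `reshape` can be sketched as follows: the missing dimension is inferred from the total number of elements, and a `ValueError` is raised when no integer dimension fits. The variable names are illustrative.

```python
import numpy as np

flat = np.array([1] * 30)

three_rows = flat.reshape(3, -1)   # 30 / 3 = 10 columns are inferred
five_cols = flat.reshape(-1, 5)    # 30 / 5 = 6 rows are inferred

# 30 is not divisible by 4, so this reshape cannot succeed.
try:
    flat.reshape(4, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(three_rows.shape, five_cols.shape, reshape_failed)
```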

Another useful constructor is `arange`, the `numpy` counterpart of the Python `range`. It returns a one-dimensional array containing an arithmetic sequence and follows the `range` syntax (start, stop, step), the stop value being excluded. Unlike `range`, it also accepts non-integer arguments.

In [20]:
np.arange?

In [17]:
np_A = np.arange(-1, 10, 2)
np_A

array([-1,  1,  3,  5,  7,  9])

In [19]:
np_A = np_A.reshape(-1, 2)
np_A

array([[-1,  1],
       [ 3,  5],
       [ 7,  9]])
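
A few more calls illustrate the `arange` syntax, including a non-integer step and a negative step. In each case the stop value is excluded. The variable names are illustrative.

```python
import numpy as np

first_five = np.arange(5)          # start defaults to 0: 0, 1, 2, 3, 4
countdown = np.arange(5, 0, -1)    # negative step: 5, 4, 3, 2, 1
quarters = np.arange(0, 1, 0.25)   # float step: 0.0, 0.25, 0.5, 0.75

print(first_five)
print(countdown)
print(quarters)
```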

In many applications one needs a sequence of evenly spaced floats sampling an interval of the real line. Such a `numpy` array can be generated with the `linspace` function.

In [21]:
np.linspace?

In [23]:
np_A = np.linspace(-10, 10, 100)
np_A

array([-10.        ,  -9.7979798 ,  -9.5959596 ,  -9.39393939,
        -9.19191919,  -8.98989899,  -8.78787879,  -8.58585859,
        -8.38383838,  -8.18181818,  -7.97979798,  -7.77777778,
        -7.57575758,  -7.37373737,  -7.17171717,  -6.96969697,
        -6.76767677,  -6.56565657,  -6.36363636,  -6.16161616,
        -5.95959596,  -5.75757576,  -5.55555556,  -5.35353535,
        -5.15151515,  -4.94949495,  -4.74747475,  -4.54545455,
        -4.34343434,  -4.14141414,  -3.93939394,  -3.73737374,
        -3.53535354,  -3.33333333,  -3.13131313,  -2.92929293,
        -2.72727273,  -2.52525253,  -2.32323232,  -2.12121212,
        -1.91919192,  -1.71717172,  -1.51515152,  -1.31313131,
        -1.11111111,  -0.90909091,  -0.70707071,  -0.50505051,
        -0.3030303 ,  -0.1010101 ,   0.1010101 ,   0.3030303 ,
         0.50505051,   0.70707071,   0.90909091,   1.11111111,
         1.31313131,   1.51515152,   1.71717172,   1.91919192,
         2.12121212,   2.32323232,   2.52525253,   2.72

In [25]:
np_A = np_A.reshape(-1, 4)
np_A.shape

(25, 4)
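
Unlike `arange`, `linspace` takes the number of points rather than the step, and includes the end point by default. The sketch below uses a small grid so that the values are easy to check; `retstep` and `endpoint` are optional keyword arguments of `np.linspace`, and the variable names are illustrative.

```python
import numpy as np

# Five points from 0 to 1 inclusive; retstep=True also returns the spacing.
grid, step = np.linspace(0.0, 1.0, 5, retstep=True)

# Four points from 0 to 1 with the end point excluded.
open_grid = np.linspace(0.0, 1.0, 4, endpoint=False)

print(grid, step)   # 0, 0.25, 0.5, 0.75, 1 with spacing 0.25
print(open_grid)    # 0, 0.25, 0.5, 0.75
```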

## Slicing 

There are many different ways of slicing an `ndarray`. One needs to be careful: some return a view on part of the array, while others copy it.

In [33]:
np_A[10:16:2, 1:3]

array([[-1.71717172, -1.51515152],
       [-0.1010101 ,  0.1010101 ],
       [ 1.51515152,  1.71717172]])

Standard slicing returns a view on a subset of the `ndarray`: no data is copied, so modifying the view also modifies the original array.

In [34]:
np_A

array([[-10.        ,  -9.7979798 ,  -9.5959596 ,  -9.39393939],
       [ -9.19191919,  -8.98989899,  -8.78787879,  -8.58585859],
       [ -8.38383838,  -8.18181818,  -7.97979798,  -7.77777778],
       [ -7.57575758,  -7.37373737,  -7.17171717,  -6.96969697],
       [ -6.76767677,  -6.56565657,  -6.36363636,  -6.16161616],
       [ -5.95959596,  -5.75757576,  -5.55555556,  -5.35353535],
       [ -5.15151515,  -4.94949495,  -4.74747475,  -4.54545455],
       [ -4.34343434,  -4.14141414,  -3.93939394,  -3.73737374],
       [ -3.53535354,  -3.33333333,  -3.13131313,  -2.92929293],
       [ -2.72727273,  -2.52525253,  -2.32323232,  -2.12121212],
       [ -1.91919192,  -1.71717172,  -1.51515152,  -1.31313131],
       [ -1.11111111,  -0.90909091,  -0.70707071,  -0.50505051],
       [ -0.3030303 ,  -0.1010101 ,   0.1010101 ,   0.3030303 ],
       [  0.50505051,   0.70707071,   0.90909091,   1.11111111],
       [  1.31313131,   1.51515152,   1.71717172,   1.91919192],
       [  2.12121212,   2

In [35]:
np_B = np_A[10:, 2]

In [37]:
np_A[10:, 2]

array([-1.51515152, -0.70707071,  0.1010101 ,  0.90909091,  1.71717172,
        2.52525253,  3.33333333,  4.14141414,  4.94949495,  5.75757576,
        6.56565657,  7.37373737,  8.18181818,  8.98989899,  9.7979798 ])
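
The difference between a view and a copy can be checked on a small array. In this sketch, `base`, `column_view` and `column_copy` are illustrative names; `np.shares_memory` reports whether two arrays share the same underlying data.

```python
import numpy as np

base = np.arange(12.0).reshape(3, 4)

column_view = base[:, 1]          # basic slicing: a view on base
column_copy = base[:, 1].copy()   # explicit copy: independent data

column_view[0] = -1.0   # writes through to base[0, 1]
column_copy[1] = -2.0   # leaves base[1, 1] untouched

print(base[0, 1], base[1, 1])
print(np.shares_memory(base, column_view))
print(np.shares_memory(base, column_copy))
```

When a slice must be modified without touching the original array, calling `.copy()` on it is the safe option.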

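Elements can also be selected with a boolean mask. Comparing an array with a scalar gives a boolean array of the same shape, which can then be used as an index.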

In [38]:
np_A < 2.

array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False]], dtype=bool)

In [41]:
np_A[np_A <  2]

array([-10.        ,  -9.7979798 ,  -9.5959596 ,  -9.39393939,
        -9.19191919,  -8.98989899,  -8.78787879,  -8.58585859,
        -8.38383838,  -8.18181818,  -7.97979798,  -7.77777778,
        -7.57575758,  -7.37373737,  -7.17171717,  -6.96969697,
        -6.76767677,  -6.56565657,  -6.36363636,  -6.16161616,
        -5.95959596,  -5.75757576,  -5.55555556,  -5.35353535,
        -5.15151515,  -4.94949495,  -4.74747475,  -4.54545455,
        -4.34343434,  -4.14141414,  -3.93939394,  -3.73737374,
        -3.53535354,  -3.33333333,  -3.13131313,  -2.92929293,
        -2.72727273,  -2.52525253,  -2.32323232,  -2.12121212,
        -1.91919192,  -1.71717172,  -1.51515152,  -1.31313131,
        -1.11111111,  -0.90909091,  -0.70707071,  -0.50505051,
        -0.3030303 ,  -0.1010101 ,   0.1010101 ,   0.3030303 ,
         0.50505051,   0.70707071,   0.90909091,   1.11111111,
         1.31313131,   1.51515152,   1.71717172,   1.91919192])
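
Two points about boolean masks are worth checking on a small example: selecting with a mask returns a flattened copy, whereas assigning through a mask modifies the array in place. Masks can be combined with `&` and `|`, with parentheses around each comparison. The variable names are illustrative.

```python
import numpy as np

values = np.linspace(-2.0, 2.0, 9).reshape(3, 3)

# Combine two conditions element-wise.
inside = (values > -1.0) & (values < 1.0)

selected = values[inside]   # 1-D copy holding -0.5, 0.0 and 0.5
selected[0] = 99.0          # changes the copy only

values[values < 0] = 0.0    # in-place assignment through a mask

print(selected)
print(values)
```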

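Behaviour of `ndarray`s within boolean conditions: an array with more than one element has no single truth value, so it must be reduced with `any()` or `all()` before being used in a condition.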
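
As a small sketch of what happens when a boolean array is used directly in a condition: `numpy` raises a `ValueError` because the truth value of an array with several elements is ambiguous. The methods `any()` and `all()` reduce the mask to a single boolean. The variable names are illustrative.

```python
import numpy as np

samples = np.array([-1.0, 0.5, 3.0])
mask = samples < 2.0   # element-wise comparison: True, True, False

# Using the mask directly as a condition is ambiguous.
try:
    if mask:
        pass
    ambiguous = False
except ValueError:
    ambiguous = True

some_below = mask.any()   # at least one element is below 2
all_below = mask.all()    # not every element is below 2

print(ambiguous, some_below, all_below)
```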

## Setting Coefficient Values

In [43]:
np_A[1, 2] = 1000.

In [49]:
np_A[:, 0] = -30

In [50]:
np_A

array([[ -3.00000000e+01,  -9.79797980e+00,  -9.59595960e+00,
         -9.39393939e+00],
       [ -3.00000000e+01,  -8.98989899e+00,   1.00000000e+03,
         -8.58585859e+00],
       [ -3.00000000e+01,  -8.18181818e+00,  -7.97979798e+00,
         -7.77777778e+00],
       [ -3.00000000e+01,  -7.37373737e+00,  -7.17171717e+00,
         -6.96969697e+00],
       [ -3.00000000e+01,  -6.56565657e+00,  -6.36363636e+00,
         -6.16161616e+00],
       [ -3.00000000e+01,  -5.75757576e+00,  -5.55555556e+00,
         -5.35353535e+00],
       [ -3.00000000e+01,  -4.94949495e+00,  -4.74747475e+00,
         -4.54545455e+00],
       [ -3.00000000e+01,  -4.14141414e+00,  -3.93939394e+00,
         -3.73737374e+00],
       [ -3.00000000e+01,  -3.33333333e+00,  -3.13131313e+00,
         -2.92929293e+00],
       [ -3.00000000e+01,  -2.52525253e+00,  -2.32323232e+00,
         -2.12121212e+00],
       [ -3.00000000e+01,  -3.00000000e+01,  -3.00000000e+01,
         -3.00000000e+01],
       [ -3.00000000e
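
The same kinds of assignment can be followed more easily on a small array. This sketch uses a hypothetical 3x4 array named `grid`: a single coefficient, a whole column set to one scalar, and a whole row set from a sequence of matching length.

```python
import numpy as np

grid = np.zeros((3, 4))

grid[1, 2] = 1000.0                 # one coefficient
grid[:, 0] = -30.0                  # scalar repeated down the first column
grid[2, :] = [1.0, 2.0, 3.0, 4.0]   # one value per column of the last row

print(grid)
```

Assignments are applied in order, so the last row overwrites the `-30` previously written in `grid[2, 0]`.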

## Universal Functions

Many standard mathematical functions are reimplemented in `numpy` as *universal functions*, which act element-wise on arrays and are implemented efficiently.

In [51]:
np.exp(np_A)

array([[  9.35762297e-14,   5.55637361e-05,   6.80029415e-05,
          8.32269459e-05],
       [  9.35762297e-14,   1.24662685e-04,              inf,
          1.86727806e-04],
       [  9.35762297e-14,   2.79692945e-04,   3.42308569e-04,
          4.18942123e-04],
       [  9.35762297e-14,   6.27518520e-04,   7.68002806e-04,
          9.39937692e-04],
       [  9.35762297e-14,   1.40789927e-03,   1.72308953e-03,
          2.10884229e-03],
       [  9.35762297e-14,   3.15875992e-03,   3.86592014e-03,
          4.73139424e-03],
       [  9.35762297e-14,   7.08698731e-03,   8.67357053e-03,
          1.06153465e-02],
       [  9.35762297e-14,   1.59003503e-02,   1.94600051e-02,
          2.38165696e-02],
       [  9.35762297e-14,   3.56739933e-02,   4.36604277e-02,
          5.34348070e-02],
       [  9.35762297e-14,   8.00380986e-02,   9.79564464e-02,
          1.19886224e-01],
       [  9.35762297e-14,   9.35762297e-14,   9.35762297e-14,
          9.35762297e-14],
       [  9.35762297e

In [53]:
m.exp(4.)

54.598150033144236
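
The two exponential functions used above behave differently. The sketch below illustrates this on small inputs: `np.exp` acts element-wise on arrays, whereas `math.exp` only accepts scalars. They also differ on overflow: `np.exp(1000.)` returns `inf` (normally with a `RuntimeWarning`, silenced here with `np.errstate`), which is where the `inf` entry in the array output above comes from, whereas `math.exp` raises an `OverflowError`. The variable names are illustrative.

```python
import math
import numpy as np

points = np.array([0.0, 1.0, 4.0])
exp_values = np.exp(points)   # element-wise exponential

# math.exp cannot convert a multi-element array to a scalar.
try:
    math.exp(points)
    scalar_only = False
except TypeError:
    scalar_only = True

# Overflow: numpy returns inf, math raises an exception.
with np.errstate(over="ignore"):
    overflowed = np.exp(1000.0)

try:
    math.exp(1000.0)
    math_overflows = False
except OverflowError:
    math_overflows = True

print(exp_values, scalar_only, overflowed, math_overflows)
```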

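Standard arithmetic operators are implemented for `ndarray`s and act element-wise: `*` is the element-wise product, not the matrix product. The matrix product is computed with `dot`, which requires the inner dimensions to agree.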

In [54]:
np_A * np_A

array([[  9.00000000e+02,   9.60004081e+01,   9.20824406e+01,
          8.82460973e+01],
       [  9.00000000e+02,   8.08182838e+01,   1.00000000e+06,
          7.37169677e+01],
       [  9.00000000e+02,   6.69421488e+01,   6.36771758e+01,
          6.04938272e+01],
       [  9.00000000e+02,   5.43720029e+01,   5.14335272e+01,
          4.85766758e+01],
       [  9.00000000e+02,   4.31078461e+01,   4.04958678e+01,
          3.79655137e+01],
       [  9.00000000e+02,   3.31496786e+01,   3.08641975e+01,
          2.86603408e+01],
       [  9.00000000e+02,   2.44975003e+01,   2.25385165e+01,
          2.06611570e+01],
       [  9.00000000e+02,   1.71513111e+01,   1.55188246e+01,
          1.39679625e+01],
       [  9.00000000e+02,   1.11111111e+01,   9.80512193e+00,
          8.58075707e+00],
       [  9.00000000e+02,   6.37690032e+00,   5.39740843e+00,
          4.49954086e+00],
       [  9.00000000e+02,   9.00000000e+02,   9.00000000e+02,
          9.00000000e+02],
       [  9.00000000e

In [55]:
np_A + np_A

array([[ -6.00000000e+01,  -1.95959596e+01,  -1.91919192e+01,
         -1.87878788e+01],
       [ -6.00000000e+01,  -1.79797980e+01,   2.00000000e+03,
         -1.71717172e+01],
       [ -6.00000000e+01,  -1.63636364e+01,  -1.59595960e+01,
         -1.55555556e+01],
       [ -6.00000000e+01,  -1.47474747e+01,  -1.43434343e+01,
         -1.39393939e+01],
       [ -6.00000000e+01,  -1.31313131e+01,  -1.27272727e+01,
         -1.23232323e+01],
       [ -6.00000000e+01,  -1.15151515e+01,  -1.11111111e+01,
         -1.07070707e+01],
       [ -6.00000000e+01,  -9.89898990e+00,  -9.49494949e+00,
         -9.09090909e+00],
       [ -6.00000000e+01,  -8.28282828e+00,  -7.87878788e+00,
         -7.47474747e+00],
       [ -6.00000000e+01,  -6.66666667e+00,  -6.26262626e+00,
         -5.85858586e+00],
       [ -6.00000000e+01,  -5.05050505e+00,  -4.64646465e+00,
         -4.24242424e+00],
       [ -6.00000000e+01,  -6.00000000e+01,  -6.00000000e+01,
         -6.00000000e+01],
       [ -6.00000000e

In [56]:
np_A.dot(np_A)

ValueError: shapes (25,4) and (25,4) not aligned: 4 (dim 1) != 25 (dim 0)
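
The error above arises because a matrix product needs the number of columns of the left factor to equal the number of rows of the right factor. One way to obtain a valid product from a $25 \times 4$ array is to transpose one of the factors, as sketched below on a hypothetical array named `tall`. The `@` operator is assumed to be available (Python 3.5 or later).

```python
import numpy as np

tall = np.arange(100.0).reshape(25, 4)

gram_small = tall.T.dot(tall)    # (4, 25) times (25, 4) gives (4, 4)
gram_large = tall.dot(tall.T)    # (25, 4) times (4, 25) gives (25, 25)
same_with_operator = tall.T @ tall   # operator form of the first product

print(gram_small.shape, gram_large.shape)
```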

## Exercise

Look into saving and loading numpy arrays.

## Exercise

Compare the efficiency of `numpy` matrix multiplication with that of a naive function using built-in structures.

## Exercise

Simulate a random walk using both `numpy` and built-in structures. Compare the two implementations.

* Using the `matplotlib` documentation, write a function that plots a random walk.