# Numpy

Numpy is one of the most useful Python libraries for numerical computation.
We will use numpy in many places throughout this course.

By the end of this notebook, you will have learned
- why the numpy library is so popular.
- some of the most commonly used functions in the numpy library.

## Numpy is fast
I want to show how fast numpy is compared to regular Python. To do that, we need a way to measure how long Python takes to run an operation. Thankfully, IPython provides cell magics for this: `%%time` times a single run of a cell, and `%%timeit` runs the cell many times and reports statistics over those runs.

Let's see how this works.

Create two lists `a` and `b`, each with about `N` numbers.
Take `N` to be 1000 to start (the run recorded below uses `N = 100000`). We want to calculate `c = a * b`, where the multiplication is elementwise.
For example, if `a = [1,2,3,4]` and `b = [-1,2,0,2]`, then `c = [-1, 4, 0, 8]`.

In [2]:
N = 100000
a = list(range(1, N))
b = list(range(N, 1, -1))
# print(f"a={a}")
# print(f"b={b}")

Write a for loop that calculates `a[i]*b[i]` and appends it to `c`.
Put `%%time` (or `%%timeit`) on the first line of your cell to measure the execution time.

In [3]:
c = []

In [4]:
%%time
for i in range(len(a)):
    c.append(a[i]*b[i])

#print(c)

CPU times: total: 15.6 ms
Wall time: 15 ms


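The cell above uses `%%time`, which runs the cell once and reports the CPU time and wall-clock time of that single run. If you use `%%timeit` instead, IPython repeats the cell many times and summarizes the runs, which gives a more reliable number for short operations.

What happens to the time if you change `N`, for example to 1000, 10000, or 1000000?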
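Outside a notebook, the same kind of repeated measurement can be sketched with the standard-library `timeit` module. The helper name `multiply_lists` and the repeat counts below are illustrative choices, not part of the original exercise, and the measured time will differ from machine to machine.

```python
import timeit

N = 1000
a = list(range(1, N))
b = list(range(N, 1, -1))

def multiply_lists(a, b):
    """Elementwise product of two equal-length lists using a plain loop."""
    c = []
    for i in range(len(a)):
        c.append(a[i] * b[i])
    return c

# Call the function 100 times per measurement, and take 5 measurements.
times = timeit.repeat(lambda: multiply_lists(a, b), number=100, repeat=5)

# Each entry is the total time for 100 calls, so divide to get the time per call.
best_ms = min(times) / 100 * 1000
print(f"best time per call: {best_ms:.4f} ms")
```

Taking the minimum of several measurements reduces the influence of other programs that happen to be running at the same time.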

Now let's try this with numpy.
First, import the numpy library.

In [5]:
import numpy as np

Instead of lists, we will use numpy arrays.
Numpy arrays can be created in several ways. One way is the `arange` function, which works much like Python's `range` function.

In [6]:
a = np.arange(1,N)
b = np.arange(N,1,-1)

What is the type of `a` and `b`?

In [7]:
type(a)

numpy.ndarray
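As a small illustration of how `np.arange` mirrors `range` and what an `ndarray` carries with it, the sketch below uses a short array so the output is easy to read. The variable name `from_list` is only for illustration.

```python
import numpy as np

a = np.arange(1, 6)          # like range(1, 6): the stop value is excluded
print(a)                     # [1 2 3 4 5]
print(type(a).__name__)      # ndarray
print(a.ndim, a.shape)       # 1 (5,)

# Unlike a list, every element of an array shares one data type.
print(np.issubdtype(a.dtype, np.integer))   # True

# A list converts to an array with np.array, and back with .tolist().
from_list = np.array([1, 2, 3])
print(from_list.tolist())    # [1, 2, 3]
```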

For elementwise multiplication of two numpy arrays, you can simply multiply them together. Remember to time it.

In [8]:
%%time
c = a*b

CPU times: total: 0 ns
Wall time: 0 ns


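Can you tell how much faster the numpy array was than the Python for loop? Does this depend on `N`? A single `%%time` run is too coarse to resolve the numpy operation (it reports 0 ns above), so `%%timeit` is the better tool for this comparison.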

Table of execution times in ms
\begin{array}{lrrr}
N      & \text{python} & \text{numpy} & \text{ratio} \\
\hline
10^3   &    0.16       &   0.0014     &  114.2 \\
10^4   &    1.72       &   0.0076     &  226.3 \\
10^5   &    17.4       &   0.699      &  24.8  \\
10^6   &    167        &   1.6        &  104.3 \\
10^7   &    1680       &   27.9       &  60.2
\end{array}
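Rows like the ones in this table could be regenerated with a short script along the following lines. This is a sketch, not the code that produced the table: the function name `time_both`, the repeat counts, and the explicit `int64` dtype are assumptions, and the numbers will vary by machine.

```python
import timeit
import numpy as np

def time_both(n, number=20):
    """Return (loop_ms, numpy_ms): time per call for inputs of about n elements."""
    a_list = list(range(1, n))
    b_list = list(range(n, 1, -1))
    # int64 is requested explicitly so that large products do not overflow.
    a_arr = np.arange(1, n, dtype=np.int64)
    b_arr = np.arange(n, 1, -1, dtype=np.int64)

    def loop():
        c = []
        for i in range(len(a_list)):
            c.append(a_list[i] * b_list[i])
        return c

    loop_s = min(timeit.repeat(loop, number=number, repeat=3)) / number
    numpy_s = min(timeit.repeat(lambda: a_arr * b_arr, number=number, repeat=3)) / number
    return loop_s * 1000, numpy_s * 1000

for n in (10**3, 10**4, 10**5):
    loop_ms, numpy_ms = time_both(n)
    print(f"N={n:>7}  python={loop_ms:8.4f} ms  numpy={numpy_ms:8.4f} ms  "
          f"ratio={loop_ms / numpy_ms:7.1f}")
```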

## What else can we do with numpy?

Execute the following lines of code and explain what each line does.

In [9]:
a1 = np.array([2, 3, 4, 5])
print(a1)
# what is the type of a1?

[2 3 4 5]


This is a 1-dimensional array 

![1d array img](https://drive.google.com/uc?id=1efi6X4VvraoGjOUhm-MMxCn5--h-kwb8)

In [10]:
type(a1)

numpy.ndarray

In [11]:
a1.shape

(4,)

In [12]:
a2 = np.array([[2,3,4,5], [6,7,8,9]])
print(a2)

[[2 3 4 5]
 [6 7 8 9]]


This is a 2-dimensional array 

![2d array img](https://drive.google.com/uc?id=1ShRBsFgp2YpkdbEUcb-C4VOS70KkPjhc)

In [13]:
a2.shape

(2, 4)

In [14]:
a3 = np.array([
               [[0, 1],   [2, 3],   [4, 5]],
               [[6, 7],   [8, 9],   [10, 11]],
               [[12, 13], [14, 15], [16, 17]],
               [[18, 19], [20, 21], [22, 23]] 
              ])

a3

array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]]])

This is a 3-dimensional array 
![3d array img](https://drive.google.com/uc?id=1D1OB3sl3XfUWQ2NnDN1ZzPcj0yP_Q8q0)

In [15]:
a3.shape

(4, 3, 2)

In [16]:
a3[1,2,0]

10
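To see why `a3[1,2,0]` is 10, it may help to peel off one index at a time. The sketch below rebuilds the same values with `np.arange(24).reshape(4, 3, 2)`, which is a shortcut and not how the cell above constructs `a3`.

```python
import numpy as np

# np.arange(24).reshape(4, 3, 2) holds the same values as a3 above.
a3 = np.arange(24).reshape(4, 3, 2)

# a3[1, 2, 0]: block 1, then row 2 of that block, then column 0 of that row.
print(a3[1])         # the 3x2 block [[6, 7], [8, 9], [10, 11]]
print(a3[1, 2])      # [10 11]
print(a3[1, 2, 0])   # 10

# A colon keeps a whole axis: column 0 of every row in every block.
print(a3[:, :, 0].shape)   # (4, 3)
```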

In [17]:
b = a2
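Note that `b = a2` does not copy the data; it gives the same array a second name. A minimal sketch of the difference between an alias and a copy (the name `b_copy` is illustrative):

```python
import numpy as np

a2 = np.array([[2, 3, 4, 5], [6, 7, 8, 9]])

b = a2                # b is another name for the same array
b_copy = a2.copy()    # an independent copy of the data

b[0, 0] = 100
print(a2[0, 0])       # 100: the change is visible through a2 as well
print(b_copy[0, 0])   # 2: the copy is unaffected
print(b is a2, np.shares_memory(b_copy, a2))   # True False
```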

In [18]:
c = np.zeros((4,2))
c

array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])

In [19]:
d = np.zeros((3,2)).astype('int')
d

array([[0, 0],
       [0, 0],
       [0, 0]])
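Converting with `.astype` after creation works, but the data type can also be requested when the array is created. A short sketch comparing the two (variable names `d1` and `d2` are illustrative):

```python
import numpy as np

d1 = np.zeros((3, 2)).astype(int)   # creates floats, then converts to integers
d2 = np.zeros((3, 2), dtype=int)    # creates integers directly

print(np.array_equal(d1, d2))   # True
print(d1.dtype == d2.dtype)     # True
print(np.ones((3, 4)).dtype)    # float64: the default for zeros and ones
```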

In [20]:
e = np.ones((3,4))
e

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [21]:
print(b)

[[2 3 4 5]
 [6 7 8 9]]


In [22]:
f = b>4
f

array([[False, False, False,  True],
       [ True,  True,  True,  True]])

In [23]:
g = (b>2).astype(int)
g

array([[0, 1, 1, 1],
       [1, 1, 1, 1]])
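Comparisons such as `b > 4` produce boolean arrays, often called masks. Besides converting a mask to integers, a mask can select or count elements, as in this small sketch (the name `mask` is illustrative):

```python
import numpy as np

b = np.array([[2, 3, 4, 5], [6, 7, 8, 9]])
mask = b > 4

print(b[mask])        # [5 6 7 8 9]: the selected elements, flattened to 1-D
print(mask.sum())     # 5: each True counts as 1
print(np.where(mask, b, 0))
# [[0 0 0 5]
#  [6 7 8 9]]
```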

In [24]:
g + b

array([[ 2,  4,  5,  6],
       [ 7,  8,  9, 10]])

In [25]:
g * b

array([[0, 3, 4, 5],
       [6, 7, 8, 9]])

In [26]:
g / b

array([[0.        , 0.33333333, 0.25      , 0.2       ],
       [0.16666667, 0.14285714, 0.125     , 0.11111111]])

In [27]:
b / g

# what is the problem here?


array([[inf,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9.]])
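The `inf` appears because `g[0, 0]` is 0, so `b / g` divides by zero at that position. Two common ways to handle this are sketched below; which one is appropriate depends on what the result is used for, and filling with 0 is only one possible choice.

```python
import numpy as np

b = np.array([[2, 3, 4, 5], [6, 7, 8, 9]])
g = (b > 2).astype(int)

# Option 1: silence the warning locally and accept inf in the result.
with np.errstate(divide="ignore"):
    ratio = b / g
print(np.isinf(ratio[0, 0]))   # True

# Option 2: divide only where g is nonzero; other entries keep the value in `out`.
safe = np.divide(b, g, out=np.zeros(b.shape, dtype=float), where=g != 0)
print(safe)
# [[0. 3. 4. 5.]
#  [6. 7. 8. 9.]]
```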

In [28]:
b.sum()

44

In [29]:
b.sum(axis=0)

array([ 8, 10, 12, 14])

In [30]:
b.sum(axis=1)

array([14, 30])

In [31]:
b.sum(axis=2)
# what is the problem here?

AxisError: axis 2 is out of bounds for array of dimension 2
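The error follows from the number of dimensions: a 2-dimensional array only has axes 0 and 1. The sketch below restates the three cells above and shows one way to catch the error.

```python
import numpy as np

b = np.array([[2, 3, 4, 5], [6, 7, 8, 9]])

print(b.ndim)           # 2, so the valid axes are 0 and 1
print(b.sum(axis=0))    # [ 8 10 12 14]: adds down each column
print(b.sum(axis=1))    # [14 30]: adds across each row
print(b.sum(axis=-1))   # [14 30]: -1 means the last axis

# Asking for axis 2 fails. AxisError subclasses both ValueError and IndexError.
try:
    b.sum(axis=2)
except (ValueError, IndexError) as err:
    print(type(err).__name__, err)
```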

In [None]:
b.mean()

In [None]:
np.median(b)

In [None]:
b.std()

In [None]:
b.T

In [None]:
np.matmul(b.T, b) # this is matrix multiplication

In [None]:
np.dot(a, 2*a)  # this is dot product
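The shapes involved in these two cells can be checked on small inputs. In this sketch the short vector `v` stands in for the much longer `a` used above, and `p` is an illustrative name.

```python
import numpy as np

b = np.array([[2, 3, 4, 5], [6, 7, 8, 9]])

# (4, 2) times (2, 4) gives (4, 4); the inner dimensions must match.
p = np.matmul(b.T, b)
print(p.shape)                       # (4, 4)
print(np.array_equal(p, b.T @ b))    # True: @ is the matmul operator

# b * b is elementwise, not matrix multiplication.
print((b * b).shape)                 # (2, 4)

# For 1-D arrays, np.dot is the sum of elementwise products.
v = np.array([1, 2, 3])
print(np.dot(v, 2 * v))              # 28
```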

In [None]:
m = b.reshape((8,1))
m

In [None]:
m.shape

In [None]:
n = b.reshape((2,-1))
n
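A short sketch of what `reshape` does with the `-1` placeholder, and what happens when the requested shape does not fit (the name `flat` is illustrative):

```python
import numpy as np

b = np.array([[2, 3, 4, 5], [6, 7, 8, 9]])

m = b.reshape((8, 1))     # 8 rows, 1 column
n = b.reshape((2, -1))    # -1 lets numpy infer the size: 8 / 2 = 4
flat = b.reshape(-1)      # 1-D array with all 8 elements

print(m.shape, n.shape, flat.shape)   # (8, 1) (2, 4) (8,)
print(flat)                           # [2 3 4 5 6 7 8 9]

# The total number of elements must stay the same.
try:
    b.reshape((3, 3))
except ValueError as err:
    print("cannot reshape:", err)
```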

In [None]:
l = np.random.random((10, 2))
l

In [None]:
np.sort(l)

In [None]:
np.sort(l, axis=0)

In [None]:
np.max(l, axis=0)

In [None]:
np.argmax(l, axis=0)
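Because `np.random.random` gives different numbers on every run, the effect of `axis` is easier to follow on a small fixed array. The values below are made up for illustration.

```python
import numpy as np

l = np.array([[0.3, 0.9],
              [0.8, 0.1],
              [0.5, 0.4]])

print(np.sort(l))            # default axis=-1: sorts within each row
print(np.sort(l, axis=0))    # sorts each column independently
print(np.max(l, axis=0))     # [0.8 0.9]: largest value in each column
print(np.argmax(l, axis=0))  # [1 0]: row index of that largest value
```

`np.sort` returns a sorted copy and leaves `l` unchanged.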

Run the following cell to have the tester installed in your notebook

# <font color='red'> **Practice Question 1:** </font>

In a later lecture, we will learn about the concept of a loss function.
A common loss function is the mean squared error, defined by
$f(x,y) = \frac{1}{N}\sum_{i=0}^{N-1} (x_i - y_i)^2$
where $x$ and $y$ are two vectors (numpy arrays) of length $N$, and $x_i$ and $y_i$ denote the $i^{th}$ element of each vector. Implement a function that computes $f$.

In [32]:
def lesson2_q1_mean_absolute_error(x, y):
    """
    Return the mean squared error of x and y.
    Note: despite "absolute" in the function name, this computes the squared error defined above.
    :param x: a list or a numpy array of floats
    :param y: a list or a numpy array of floats
    :return: float

    Examples:
    x = [5.0]
    y = [4.5]
    output = 0.25

    x = [5.0, 4.0]
    y = [4.0, 2.0]
    output = 2.5
    """

    x, y = np.array(x), np.array(y)
    return ((x - y) ** 2).mean()

    # n = len(x)
    # return (1 / n) * sum((x - y) ** 2)

lesson2_q1_mean_absolute_error([5.0, 4.0], [4.0, 2.0])

2.5
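One way to gain confidence in a vectorized implementation is to compare it with a direct loop over the formula. The function names `mse_numpy` and `mse_loop` below are hypothetical and separate from the graded function above.

```python
import numpy as np

def mse_numpy(x, y):
    """Vectorized mean squared error."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return ((x - y) ** 2).mean()

def mse_loop(x, y):
    """The same formula written as an explicit sum over i."""
    total = 0.0
    for xi, yi in zip(x, y):
        total += (xi - yi) ** 2
    return total / len(x)

print(mse_numpy([5.0], [4.5]))            # 0.25
print(mse_numpy([5.0, 4.0], [4.0, 2.0]))  # 2.5
print(mse_loop([5.0, 4.0], [4.0, 2.0]))   # 2.5
```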

# <font color='red'> **Practice Question 2:** </font>

Consider the function $f(x) = x^3 - x$ on the domain $[-1, 2]$. Use numpy to approximate the maximum of the function.

In [None]:
def lesson2_q2_compute_max():
    """
    this function returns the maximum of f(x) = x^3 - x over the interval [-1, 2]
    HINT1 : Divide the interval into many small sections, compute the function.
    HINT2 : You may find it useful to use `np.linspace` function.
    inputs
    :return: float
    
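    Critical points:
    f'(x) = 3x^2 - 1
    0 = 3x^2 - 1
    x^2 = 1 / 3
    x = +/- sqrt(1 / 3)
    On a closed interval the maximum can also occur at an endpoint (x = -1 or x = 2).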
    """

    # mathematical approach: compare f at the critical points and at the endpoints
    # f = lambda x : x ** 3 - x

    # x1 = (1 / 3) ** 0.5
    # x2 = -1 * x1

    # print(max(f(x1), f(x2), f(-1), f(2)))

    x = np.linspace(-1, 2, 1001)  # 1001 evenly spaced points, endpoints included
    y = x ** 3 - x

    return y.max()

lesson2_q2_compute_max()

6.0
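The result 6.0 can be cross-checked by also locating where the maximum occurs and comparing against the interior critical points. This is a sketch; the grid size of 3001 points and the names `f` and `x_crit` are illustrative choices.

```python
import numpy as np

def f(x):
    return x ** 3 - x

x = np.linspace(-1, 2, 3001)   # 3001 points, endpoints included
y = f(x)

i = np.argmax(y)
print(x[i], y[i])              # 2.0 6.0: the maximum sits at the right endpoint

# The interior critical points from f'(x) = 3x^2 - 1 = 0 give smaller values.
x_crit = np.array([-1, 1]) / np.sqrt(3)
print(f(x_crit))               # roughly [ 0.385 -0.385]
```

Because `np.linspace` includes both endpoints by default, the grid contains x = 2 exactly, so the grid maximum matches the true maximum here. For a function whose maximum lies strictly inside the interval, the grid value would only be an approximation.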