# Numpy
Numpy is a core library for scientific computing in Python. It provides a high-performance multidimensional array object and built-in functions that operate on these arrays. Numpy arrays are commonly used to store matrices, and Numpy built-in functions are commonly used to perform operations on them (transpose, dot product, etc.).

### 1) Defining a Numpy array
An array is an ordered sequence of elements. The elements are indexed and must all be of the same type.

In [4]:
import numpy as np

In [5]:
arr = np.array([1,2,3])

In [6]:
type(arr)

numpy.ndarray

We use the attribute shape to obtain the size of the array along each dimension.

In [7]:
arr.shape

(3,)

We will use numpy arrays to compute with matrices. First, we can store $\begin{pmatrix} 4 & 6 &8 \\ 1 & 3 & 9 \\ 2 & 3 & 2 \\ 0 &1 &7  \end{pmatrix}$ as follows:

In [8]:
matrix = np.array([[4,6,8],[1,3,9],[2,3,2],[0,1,7]]) 

matrix is a 2D array with 4 rows and 3 columns.

**Practice**: remove the 7 from the last row and check the shape again.

In [9]:
matrix.shape

(4, 3)
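As a small complement, the sketch below compares the shape of a 1D and a 2D array, together with the related attributes ndim (number of dimensions) and size (total number of elements). The names vec and mat are only used for this illustration.

```python
import numpy as np

vec = np.array([1, 2, 3])                 # 1D array with 3 elements
mat = np.array([[4, 6, 8], [1, 3, 9],
                [2, 3, 2], [0, 1, 7]])    # 2D array: 4 rows, 3 columns

# shape is a tuple with one entry per dimension
print(vec.shape, vec.ndim, vec.size)      # (3,) 1 3
print(mat.shape, mat.ndim, mat.size)      # (4, 3) 2 12

# The tuple can be unpacked to get the number of rows and columns
n_rows, n_cols = mat.shape
print(n_rows, n_cols)                     # 4 3
```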

#### 1.1) Datatypes 

All the elements of an array must have the same type.

In [10]:
arr1 = np.array([1.,2,3])

In [11]:
arr1.dtype

dtype('float64')

In [12]:
arr2 = np.array([1,2.,3])

In [13]:
arr2.dtype

dtype('float64')

In [14]:
arr3 = np.array([2.,"Is this an array of string?",3])

In [15]:
arr3.dtype

dtype('<U32')

In [16]:
type(arr3[1])

numpy.str_

In [17]:
type(arr3[0])

numpy.str_

In [18]:
arr3[0]

'2.0'

In [19]:
arr2[1]

2.0

As you can see, Numpy forces all the elements to be of the same type. For example, in arr3, Numpy converts all the elements to strings. To explicitly specify the type of your array, use the argument dtype as follows:

In [20]:
arr1 = np.array([1,2.23,3],dtype=np.int64)

In [21]:
arr1[1]

2

In [22]:
arr2 = np.array([1,2.,3],dtype=np.float64)

In [23]:
arr2[0]

1.0

In [24]:
arr3 = np.array(["Is this an array of string?",2.,3],dtype=str)

In [25]:
arr3[1]

'2.0'
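The type of an existing array can also be changed after creation with the method astype, which returns a new array. The following is a minimal sketch; the array names are chosen only for this example.

```python
import numpy as np

floats = np.array([1.0, 2.23, -3.9])

# Converting float to int drops the fractional part (truncation toward zero)
as_int = floats.astype(np.int64)
print(as_int.tolist())       # [1, 2, -3]

# Converting to str turns every element into a string
as_str = floats.astype(str)
print(as_str.dtype.kind)     # U (unicode string)

# Mixing integers and floats at creation promotes everything to float64
mixed = np.array([1, 2.0, 3])
print(mixed.dtype)           # float64
```

Note that astype does not modify floats itself: the original array keeps its float64 type.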

#### 1.2) Indexing and slicing
Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, we must specify a slice for each dimension of the array.

In [26]:
matrix

array([[4, 6, 8],
       [1, 3, 9],
       [2, 3, 2],
       [0, 1, 7]])

In [27]:
matrix[0,1]

6

In [28]:
matrix[3]

array([0, 1, 7])

Using slicing, we extract the second column. matrix[:,1] means for all the rows (indicated by ":"), take the element at index 1.

In [29]:
matrix[:,1]

array([6, 3, 3, 1])

Using slicing, we extract the second and third rows. matrix[1:3,:] means take the rows at index 1 and 2 (the stop index 3 is excluded) and, for these rows, all the columns (indicated by ":").

In [30]:
matrix[1:3,:]

array([[1, 3, 9],
       [2, 3, 2]])

We can use the row index and column index in integer array indexing as follows:

In [31]:
matrix[[0,2],[1,2]] # we extract elements at index (0,1) and (2,2).

array([6, 2])

In [32]:
matrix[[0,1,2,3],[1,1,1,1]] 

array([6, 3, 3, 1])

Note that slicing and integer array indexing return Numpy arrays, so they allow us to build new arrays from existing ones.

In [33]:
vector = matrix[:,1]

In [34]:
vector

array([6, 3, 3, 1])

In [35]:
type(vector)

numpy.ndarray
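One point worth checking when building arrays this way: basic slicing returns a view on the original data, whereas integer array indexing returns a copy. The sketch below illustrates the difference on a separate array named demo (a name used only here, so that matrix is left untouched).

```python
import numpy as np

demo = np.array([[4, 6, 8], [1, 3, 9], [2, 3, 2], [0, 1, 7]])

column_view = demo[:, 1]                          # basic slicing: a view on demo
column_copy = demo[[0, 1, 2, 3], [1, 1, 1, 1]]    # integer array indexing: a copy

print(np.shares_memory(demo, column_view))   # True
print(np.shares_memory(demo, column_copy))   # False

column_view[0] = 60     # writing through the view also changes demo
print(demo[0, 1])       # 60

independent = demo[:, 1].copy()   # explicit copy of a slice
independent[0] = -1               # demo is not affected
print(demo[0, 1])       # 60
```

When the extracted array must be modified independently of the original, calling copy() on the slice is the safer choice.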

#### 1.3) Filling-in 
Numpy provides functions to fill in arrays.

In [36]:
np.ones((5,5),dtype=np.int64) # make a matrix of size (5,5) and fill it with 1

array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])

In [37]:
np.zeros((4,3),dtype=np.int64)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [38]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

To fill-in a matrix with random values, one option is to select values from a continuous uniform distribution over the interval [0, 1).

In [39]:
np.random.random((3,5))

array([[0.78351344, 0.72395946, 0.28729085, 0.54341278, 0.2534541 ],
       [0.8959879 , 0.16433621, 0.55446978, 0.42292729, 0.40334902],
       [0.66643564, 0.65416595, 0.43211368, 0.61549721, 0.5645928 ]])

To select values from a uniform distribution over the interval [10, 100), we scale and shift the samples:

In [40]:
(100 - 10) * np.random.random((3,5)) + 10

array([[28.98920901, 51.51678045, 34.35005065, 62.49158459, 90.39483775],
       [65.77615646, 14.66840737, 33.62438664, 32.20872636, 61.62767508],
       [51.54603412, 21.02448016, 21.70057352, 91.00655369, 23.92593062]])

randn generates an array whose values are drawn from a standard Gaussian distribution $N(0,1)$. 

In [41]:
np.random.randn(3,5)

array([[-0.21110676, -0.28544347,  2.1274904 , -1.46752182,  0.20892145],
       [-2.90243688, -1.37431035,  1.21090305,  1.08428733, -0.08487441],
       [ 0.18950053, -0.63563866, -0.90644151, -1.86856073, -2.11693588]])

For random samples from $N(\mu, \sigma^2)$, use sigma * np.random.randn(3,4) + mu, where sigma is the standard deviation $\sigma$ and mu is the mean $\mu$.
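The scaling and shifting of standard Gaussian samples can be checked numerically. In this sketch the values of mu and sigma are arbitrary, and the seed is fixed only to make the run reproducible.

```python
import numpy as np

mu, sigma = 5.0, 2.0      # hypothetical mean and standard deviation

np.random.seed(0)         # fixed seed so that the run is reproducible
samples = sigma * np.random.randn(100000) + mu

# With many samples, the empirical statistics approach mu and sigma
print(samples.mean())     # close to 5.0
print(samples.std())      # close to 2.0
```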

We use "zeros_like" or "ones_like" to create a matrix with the same shape as a given matrix and fill it with 0 or 1, respectively.

In [42]:
np.zeros_like(matrix)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [43]:
np.ones_like(matrix)

array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

### 2) Arithmetic
Basic mathematical operations operate **elementwise** on arrays, and are available both as built-in functions and as overloaded arithmetic operators.

Let V and U be 3D vectors $\begin{pmatrix}-1 \\ 3 \\ 4/5\end{pmatrix}$ and $\begin{pmatrix}6 \\ 0 \\ 3\end{pmatrix}$, respectively.  

In [44]:
V = np.array([[-1],[3],[4/5]],dtype=np.float64)

In [45]:
U = np.array([[6],[0],[3]],dtype=np.float64)

We can perform addition and multiplication by a scalar:

In [46]:
U + 2*V # using operator overloading

array([[4. ],
       [6. ],
       [4.6]])

In [47]:
np.add(U,np.multiply(2,V)) # using Numpy built-in functions

array([[4. ],
       [6. ],
       [4.6]])

Note that "*" and "np.multiply" perform elementwise multiplication (not matrix multiplication or dot product).

In [48]:
V*U

array([[-6. ],
       [ 0. ],
       [ 2.4]])

In [49]:
np.multiply(U,V)

array([[-6. ],
       [ 0. ],
       [ 2.4]])

In [50]:
V + 3

array([[2. ],
       [6. ],
       [3.8]])

Mathematical functions also operate elementwise on arrays.

In [51]:
m = np.array([[4,5.8],[3.2,1.4]])

In [52]:
np.log(m)

array([[1.38629436, 1.75785792],
       [1.16315081, 0.33647224]])

In [53]:
np.sin(m)

array([[-0.7568025 , -0.46460218],
       [-0.05837414,  0.98544973]])

In [54]:
np.exp(m)

array([[ 54.59815003, 330.29955991],
       [ 24.5325302 ,   4.05519997]])

In [55]:
np.power(m,3)

array([[ 64.   , 195.112],
       [ 32.768,   2.744]])

### 3) Matrix operations

#### 3.1) Dot product
Following the previous example, we want to rotate the vectors U and V by an angle $\theta$ about the z-axis. To do so, we compute the product of the rotation matrix $\begin{pmatrix}\cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0\\ 0 & 0 & 1\end{pmatrix}$ with each of the vectors U and V, using np.dot.

In [56]:
import math as m

In [57]:
teta = m.pi/6

In [58]:
rotz = np.array([[m.cos(teta),-m.sin(teta),0],[m.sin(teta),m.cos(teta),0],[0,0,1]])

In [59]:
rotz

array([[ 0.8660254, -0.5      ,  0.       ],
       [ 0.5      ,  0.8660254,  0.       ],
       [ 0.       ,  0.       ,  1.       ]])

In [61]:
np.dot(rotz,V)

array([[-2.3660254 ],
       [ 2.09807621],
       [ 0.8       ]])

In [62]:
np.dot(rotz,V).shape

(3, 1)

In [63]:
np.dot(rotz,U)

array([[5.19615242],
       [3.        ],
       [3.        ]])

In [64]:
rotz.dot(U)

array([[5.19615242],
       [3.        ],
       [3.        ]])
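For 2D arrays, the operator "@" gives the same result as np.dot. The sketch below rebuilds the rotation matrix (using the name theta for the angle) and also checks two properties one expects from a rotation: the length of the vector is unchanged, and multiplying by the transpose rotates back.

```python
import math
import numpy as np

theta = math.pi / 6
rotz = np.array([[math.cos(theta), -math.sin(theta), 0],
                 [math.sin(theta),  math.cos(theta), 0],
                 [0,                0,               1]])
U = np.array([[6.], [0.], [3.]])

rotated = rotz @ U                                # matrix product with the @ operator
print(np.allclose(rotated, np.dot(rotz, U)))      # True

# A rotation does not change the length of a vector
print(np.isclose(np.linalg.norm(rotated), np.linalg.norm(U)))   # True

# The transpose of a rotation matrix is the rotation by -theta
print(np.allclose(rotz.T @ rotated, U))           # True
```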

#### 3.2) Summation

In [65]:
matrix

array([[4, 6, 8],
       [1, 3, 9],
       [2, 3, 2],
       [0, 1, 7]])

In [66]:
matrix.sum() # computes the sum of all the elements

46

In [67]:
np.sum(matrix)

46

In [68]:
matrix.sum(axis=0) # sum along axis 0: one sum per column

array([ 7, 13, 26])

In [69]:
np.sum(matrix,axis=0)

array([ 7, 13, 26])

In [70]:
matrix.sum(axis=1) # sum along axis 1: one sum per row

array([18, 13,  7,  8])

In [71]:
np.sum(matrix,axis=1)

array([18, 13,  7,  8])

#### 3.3) Transpose

In [72]:
matrix.T

array([[4, 1, 2, 0],
       [6, 3, 3, 1],
       [8, 9, 2, 7]])

In [73]:
np.transpose(matrix)

array([[4, 1, 2, 0],
       [6, 3, 3, 1],
       [8, 9, 2, 7]])

In [74]:
(matrix.T).T == matrix

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

In [75]:
np.equal(matrix.T.T,matrix)

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])
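The comparisons above return one boolean per element. When a single True/False answer is needed, one option is np.array_equal, or calling all() on the elementwise result, as in this short sketch.

```python
import numpy as np

matrix = np.array([[4, 6, 8], [1, 3, 9], [2, 3, 2], [0, 1, 7]])

# A single boolean instead of an array of booleans
print(np.array_equal(matrix.T.T, matrix))    # True
print((matrix.T.T == matrix).all())          # True

# Transposing swaps the two dimensions
print(matrix.shape, matrix.T.shape)          # (4, 3) (3, 4)
```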

### 4) Linear algebra

In [76]:
x = np.array([[4,2,1],[2,6,3],[2,4,5]])

In [77]:
np.linalg.det(x)

59.999999999999986

In [78]:
np.linalg.eigvals(x)

array([ 3., 10.,  2.])

To solve the system of equations 3 * x0 + x1 = 9 and x0 + 2 * x1 = 8, we write it in matrix form a x = b and use np.linalg.solve:

In [80]:
a = np.array([[3,1],[1,2]])

In [81]:
b = np.array([9,8])

In [83]:
np.linalg.solve(a,b)

array([2., 3.])

In [85]:
x0, x1 = 2., 3.

In [87]:
3*x0 + x1

9.0

In [88]:
x0+2*x1

8.0
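Because these routines work in floating point, results can differ slightly from the exact values (the determinant above is printed as 59.999999999999986 rather than 60). A hedged way to verify results is to compare with a tolerance, for example using np.allclose and np.isclose as sketched here.

```python
import numpy as np

a = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])

solution = np.linalg.solve(a, b)
# Check that a @ solution reproduces b, up to rounding errors
print(np.allclose(a @ solution, b))          # True

x = np.array([[4, 2, 1], [2, 6, 3], [2, 4, 5]])
eigenvalues = np.linalg.eigvals(x)

# The determinant equals the product of the eigenvalues,
# and the trace equals their sum
print(np.isclose(np.prod(eigenvalues), np.linalg.det(x)))   # True
print(np.isclose(np.sum(eigenvalues), np.trace(x)))         # True
```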

### 4) List vs. array

Both lists and arrays are mutable objects.

In [89]:
m = np.array([[2,3],[4,5]])

In [90]:
m[1,1] = 8

In [91]:
m

array([[2, 3],
       [4, 8]])

In [92]:
id(m)

4547243584

In [93]:
p = m

In [94]:
id(p)

4547243584
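The identical ids show that an assignment does not copy the array: both names refer to the same object. The sketch below contrasts this with an explicit copy; the names original, alias and clone are used only for this illustration.

```python
import numpy as np

original = np.array([[2, 3], [4, 8]])

alias = original           # another name for the same array object
clone = original.copy()    # a new array with its own data

alias[0, 0] = 100          # visible through original as well
clone[1, 1] = -1           # does not affect original

print(alias is original)   # True
print(clone is original)   # False
print(original[0, 0], original[1, 1])   # 100 8
```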

Unlike a list, all the elements of an array must have the same type.

In [95]:
m[1,1] = "Can I add a string?"

ValueError: invalid literal for int() with base 10: 'Can I add a string?'

An array supports elementwise arithmetic operations, whereas a list does not.

In [96]:
m / np.sum(m)

array([[0.11764706, 0.17647059],
       [0.23529412, 0.47058824]])

In [99]:
[2,4,5]/2

TypeError: unsupported operand type(s) for /: 'list' and 'int'

In [100]:
m

array([[2, 3],
       [4, 8]])
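For comparison, the operators "+" and "*" are defined on lists, but they mean concatenation and repetition rather than arithmetic. A small sketch, with a list named values chosen for the example:

```python
import numpy as np

values = [2, 4, 5]

print(values * 2)       # [2, 4, 5, 2, 4, 5]  (repetition, not multiplication)
print(values + [1])     # [2, 4, 5, 1]        (concatenation, not addition)

# Elementwise division on a list requires an explicit loop or comprehension
halved_list = [v / 2 for v in values]
print(halved_list)      # [1.0, 2.0, 2.5]

# With an array, the same operation is a single expression
halved_array = np.array(values) / 2
print(halved_array.tolist())   # [1.0, 2.0, 2.5]
```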

Like a list, an array can be iterated over. Iterating over a 2D array yields its rows.

In [101]:
for r in m:
    print(np.sum(r))

5
12


### 5) Broadcasting
Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like this:

In [105]:
def add_arrays_of_different_size(x,vector):
    y = np.empty_like(x)   # Create an uninitialized matrix with the same shape and dtype as x

    # Add vector to each row of the matrix x with an explicit loop
    for i in range(len(x)):
        y[i, :] = x[i, :] + vector

    return y

In [106]:
matrix

array([[4, 6, 8],
       [1, 3, 9],
       [2, 3, 2],
       [0, 1, 7]])

In [107]:
add_arrays_of_different_size(matrix, [-2,3,1])

array([[ 2,  9,  9],
       [-1,  6, 10],
       [ 0,  6,  3],
       [-2,  4,  8]])

This works; however, when the matrix x is very large, running an explicit loop in Python can be slow. Note that adding the vector v to each row of the matrix x is equivalent to stacking multiple copies of v vertically to form a matrix of the same shape as x, and then adding the two matrices elementwise. Numpy **broadcasting** allows us to perform this computation without writing a loop or actually creating multiple copies of v.

In [108]:
matrix + [-2,3,1]

array([[ 2,  9,  9],
       [-1,  6, 10],
       [ 0,  6,  3],
       [-2,  4,  8]])
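The equivalence with stacked copies can be checked directly. In this sketch, np.tile builds the explicit copies and np.broadcast_to shows the shape that Numpy uses internally without copying the data; the variable names are chosen only for the example.

```python
import numpy as np

matrix = np.array([[4, 6, 8], [1, 3, 9], [2, 3, 2], [0, 1, 7]])
v = np.array([-2, 3, 1])

stacked = np.tile(v, (4, 1))        # 4 explicit copies of v, shape (4, 3)
explicit = matrix + stacked         # elementwise sum of two (4, 3) matrices
broadcast = matrix + v              # same result, without building the copies

print(np.array_equal(explicit, broadcast))    # True

# broadcast_to returns a read-only view with the target shape
virtual = np.broadcast_to(v, matrix.shape)
print(virtual.shape)                           # (4, 3)
```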

**Practice**: Explain the results of the following broadcasting.

In [112]:
matrix

array([[4, 6, 8],
       [1, 3, 9],
       [2, 3, 2],
       [0, 1, 7]])

In [109]:
matrix + 1

array([[ 5,  7,  9],
       [ 2,  4, 10],
       [ 3,  4,  3],
       [ 1,  2,  8]])

In [110]:
matrix * [[0],[2],[5],[-2]]

array([[  0,   0,   0],
       [  2,   6,  18],
       [ 10,  15,  10],
       [  0,  -2, -14]])

In [111]:
matrix * [[9],[3]]

ValueError: operands could not be broadcast together with shapes (4,3) (2,1) 

### 6) Problem: Binary logistic regression
Binary logistic regression is a machine learning method used for prediction. It is suitable for **classification problems** because it models the relationship between features $x_1$, $\ldots$, $x_n$ and a **binary output** $y$, for instance whether the features $x_1$, $\ldots$, $x_n$ describe **class A** or not. The estimation $\hat{y}$ is interpreted as the probability of the class.

Suppose that we have a model trained using binary logistic regression. Given a new data point with features $x_1$, $x_2$ and $x_3$, write Python code, using Numpy, to compute the estimation $\hat{y}$.

<img src="images/LRegression.png">

where:
- $x_1$, $x_2$ and $x_3$ are features.
- $w_1$, $w_2$ and $w_3$ are the weights. 
- $b$ is the bias
- $\sigma$ is the sigmoid function, i.e. $\sigma(z) = \frac{1}{1+e^{-z}}$, which gives a probability $\hat{y} \in (0,1)$


#### 6.1) Sigmoid function

In [None]:
def sigmoid(z):
    ### START CODE HERE ### (1 line of code)
    s = 
    ### END CODE HERE ###
    
    return s

In [None]:
sigmoid(3)

In [None]:
sigmoid(np.array([0,2]))

#### 6.2) Vectorization
Suppose we want to make predictions for 10 thousand data points $X_1$, $\ldots$, $X_{10000}$. Data point $X_j$ has three features $x_{1j}$, $x_{2j}$ and $x_{3j}$, where $1\leq j \leq 10000$. We could compute the prediction $\sigma(\sum^{3}_{i=1}w_{i} x_{ij} + b)$ in a for-loop, 10000 times! However, a much faster approach is to use Numpy arrays. We first collect all the data in arrays/matrices; this step is called vectorization.
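To give a feeling for the difference, the sketch below times a loop against a vectorized expression on a computation other than the prediction itself: the sum of squared features of each data point. The array name data and the chosen computation are assumptions made for this illustration, and the measured times depend on the machine.

```python
import time
import numpy as np

np.random.seed(0)
data = np.random.random((10000, 3))    # 10000 data points, 3 features each

# Version 1: explicit Python loops over data points and features
start = time.perf_counter()
loop_result = np.empty(len(data))
for j in range(len(data)):
    total = 0.0
    for i in range(data.shape[1]):
        total += data[j, i] ** 2
    loop_result[j] = total
loop_time = time.perf_counter() - start

# Version 2: one vectorized expression over the whole matrix
start = time.perf_counter()
vectorized_result = np.sum(data ** 2, axis=1)
vectorized_time = time.perf_counter() - start

print(np.allclose(loop_result, vectorized_result))   # True
print(loop_time, vectorized_time)                    # timings vary by machine
```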


What is the shape of the matrix **X** of 10000 input data? 

In [117]:
X = np.ones((10000,3)) # enter the size

In [118]:
X.shape

(10000, 3)

What is the size of vector **W** of weights?

In [119]:
W = np.zeros((3,1)) # enter the size

In [120]:
W

array([[0.],
       [0.],
       [0.]])

Guess the size of $\hat{y}$.

In [121]:
hat_y = np.random.random(10000)

In [122]:
hat_y

array([0.8206179 , 0.18273107, 0.22410058, ..., 0.7504201 , 0.37067835,
       0.91920483])

#### 6.3) Linear function
Implement the linear function in the above picture, i.e. $\sum^{3}_{i=1}w_{i} x_{ij} + b$ for data point $j$. **The function should be able to process more than one data point at once**. To that end, use the appropriate matrix operation.

In [None]:
def linear_fct(X, W, b):
    ### START CODE HERE ### (1 line of code)
    Z = 
    ### END CODE HERE ###
    
    return Z

#### 6.4) Estimation
Function estimation (i) computes the estimation $\hat{y}$ and (ii) classifies the data, e.g. if $\hat{y} \geq 0.7$ then the data point is in class 1, otherwise in class 0.

In [None]:
def estimation(Z):

    ### START CODE HERE ### (~ 2 lines of code)
    y_hat = 
    bclass = 
    ### END CODE HERE ###
    
    return y_hat, bclass

#### 6.5) Prediction
Compute the prediction of data X using the functions you implemented so far.

In [None]:
w, b, X = np.array([[1.],[2.]]), 2., np.array([[1.,2.,-1.],[3.,4.,-3.2]])