# Lecture 2: Notation, Scikit and Feature Engineering

## Notation in NumPy

In [9]:
import numpy as np

### Scalar

NumPy is the core Python package for linear algebra and for fast operations on arrays (vectors, matrices and higher-dimensional tensors).
NumPy does define its own scalar types (e.g. np.float64), but for plain scalars we simply use Python's built-in int and float.

In [10]:
a = 3.5 # connect the name to the object
w = 10

### Vector

To create a vector we use the array() function, which takes a list and converts it to an array. The same function creates any n-dimensional array, e.g. matrices and tensors. NumPy prints a vector as a row, whereas the lecture notation treats vectors as columns. A 1-dimensional array has no row or column orientation of its own, so transposing it has no effect; when an explicit column is needed, reshape it to shape (d, 1).

In [11]:
w_vec = np.array([1, 2, 3, 4.5])

Since at least one of the scalars in the vector is a float, the dtype of the vector is float64, i.e. each element of the vector is a float.

In [12]:
w_vec.dtype

dtype('float64')
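As a small sketch of how the dtype is inferred (the names `ints` and `mixed` are illustrative and not part of the lecture): a list of integers gives an integer dtype, while a single float is enough to promote every element to a float.

```python
import numpy as np

ints = np.array([1, 2, 3, 4])      # only integers: integer dtype (width is platform dependent)
mixed = np.array([1, 2, 3, 4.5])   # one float promotes all elements to float64

print(ints.dtype.kind)   # 'i' means signed integer
print(mixed.dtype)       # float64
print(mixed[0])          # 1.0, the integer 1 is stored as a float
```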

An array behaves much like a list, so it supports most basic list operations, e.g. indexing and slicing (taking a subvector of the vector).

In [13]:
w_vec[1] # index 1 selects the second element (second component) of w_vec

2.0

In [14]:
w_vec[1:3] # subvector from index 1 up to and including index 2 (the stop index 3 is excluded)

array([2., 3.])
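One difference from lists is worth a short sketch: a basic slice of a NumPy array is a view on the same data, not a copy. The array `v` below is an illustrative copy of the values used above, so that the original w_vec is left untouched.

```python
import numpy as np

v = np.array([1, 2, 3, 4.5])

sub = v[1:3]      # indices 1 and 2; the stop index 3 is excluded
last = v[-1]      # negative indices count from the end

# Writing through the slice also changes the original array,
# because sub is a view and not a copy.
sub[0] = 20.0
print(v[1])       # 20.0
```

If an independent subvector is needed, `v[1:3].copy()` returns a copy instead.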

### Matrix

The function array() can also create matrices, i.e. 2-dimensional arrays, from nested lists. To create an n x d matrix, we stack n rows, each a d-dimensional vector.

In [15]:
W = np.array([
    [1, 4, 7],
    [4.9, 4, 2]
])
# 2 x 3 matrix
W

array([[1. , 4. , 7. ],
       [4.9, 4. , 2. ]])
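The same matrix can also be assembled from vectors that already exist. The following is a minimal sketch, assuming two hypothetical row vectors `r0` and `r1` (n = 2, d = 3), stacked with np.vstack.

```python
import numpy as np

# Two hypothetical 3-dimensional vectors, used here as the rows of the matrix
r0 = np.array([1, 4, 7])
r1 = np.array([4.9, 4, 2])

W_stacked = np.vstack([r0, r1])   # stack the vectors as rows
print(W_stacked.shape)            # (2, 3)
```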

To select a specific row with index j, we use the standard indexing and slicing operations. The same holds for selecting columns.

In [16]:
W[0][:], W[0, :] # select row with index 0

(array([1., 4., 7.]), array([1., 4., 7.]))

In [17]:
W[:, :2] # select submatrix

array([[1. , 4. ],
       [4.9, 4. ]])

In [18]:
W[:, 1] # select column with index 1

array([4., 4.])

### Shape

The shape of an array is a tuple giving the number of elements along each dimension. We access it through the attribute shape. It is a handy debugging tool for checking that arrays have the dimensions we expect.

In [19]:
W.shape, w_vec.shape

((2, 3), (4,))
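As a sketch of the debugging use mentioned above, the shape can be checked before a matrix-vector product. The helper `check_compatible` is hypothetical and written only for this illustration.

```python
import numpy as np

def check_compatible(M, x):
    """Fail with a readable message if the product M @ x is not defined."""
    n, d = M.shape
    assert x.shape == (d,), f"expected a vector of shape ({d},), got {x.shape}"

W = np.array([[1, 4, 7], [4.9, 4, 2]])      # shape (2, 3)
check_compatible(W, np.array([2, 1, 1]))    # passes silently: the vector has 3 elements
```

Calling the helper with a 4-dimensional vector would raise an AssertionError that names the expected shape.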

### Transpose

The transpose of an array can be obtained with the attribute T (it is an attribute, not a method, so no parentheses are needed).

In [20]:
W_T = W.T
W_T

array([[1. , 4.9],
       [4. , 4. ],
       [7. , 2. ]])

In [21]:
W.shape, W_T.shape

((2, 3), (3, 2))
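For vectors the situation is slightly different. The sketch below, using the same values as w_vec, shows that T leaves a 1-dimensional array unchanged and that an explicit column vector can be obtained with reshape; the name `col` is illustrative.

```python
import numpy as np

w_vec = np.array([1, 2, 3, 4.5])

print(w_vec.T.shape)          # (4,): a 1-D array has no second axis to swap

col = w_vec.reshape(-1, 1)    # explicit column vector; -1 lets NumPy infer the length
print(col.shape)              # (4, 1)
print(col.T.shape)            # (1, 4): an explicit row vector
```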

## Operations on Vectors and Matrices

- Summing two vectors: x + y
- Multiplying a vector by a scalar: a*x
- Inner product of two vectors: x.dot(y) or np.dot(x, y)
- Hadamard (element-wise) product of two vectors: x * y

In [22]:
x_vec, y_vec = np.array([1, 2, 3, 4]), np.array([6, 7, 8, 9])

In [23]:
x_vec + y_vec

array([ 7,  9, 11, 13])

In [24]:
a * x_vec

array([ 3.5,  7. , 10.5, 14. ])

In [25]:
# adding a scalar to a vector: the scalar is added to every element
x_vec + 2

array([3, 4, 5, 6])

In [26]:
x_vec.dot(y_vec), np.dot(x_vec, y_vec)

(80, 80)

In [27]:
x_vec * y_vec

array([ 6, 14, 24, 36])
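The two products above are closely related: summing the entries of the Hadamard product gives the inner product. A minimal check with the same vectors (the names `hadamard` and `inner` are illustrative):

```python
import numpy as np

x_vec = np.array([1, 2, 3, 4])
y_vec = np.array([6, 7, 8, 9])

hadamard = x_vec * y_vec            # element-wise products
inner = hadamard.sum()              # summing them gives the inner product

print(inner == x_vec.dot(y_vec))    # True
```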

- Multiplying a matrix by a vector: M.dot(x) or M @ x
- Multiplying two matrices: np.dot(M1, M2) or M1 @ M2

In [28]:
A = np.array([
    [1, 0, 1],
    [0, 2, 1]
])

In [29]:
W @ np.array([2, 1, 1])

array([13. , 15.8])

In [30]:
W.shape, A.shape

((2, 3), (2, 3))

In [31]:
W @ A.T

array([[ 8. , 15. ],
       [ 6.9, 10. ]])

In [32]:
W.T @ A

array([[1. , 9.8, 5.9],
       [4. , 8. , 8. ],
       [7. , 4. , 9. ]])
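The transposes in the two products above are needed because a matrix product is only defined when the inner dimensions agree: an (n, d) matrix can multiply a (d, m) matrix, giving an (n, m) result. A short sketch with the same W and A:

```python
import numpy as np

W = np.array([[1, 4, 7], [4.9, 4, 2]])   # shape (2, 3)
A = np.array([[1, 0, 1], [0, 2, 1]])     # shape (2, 3)

try:
    W @ A                  # (2, 3) @ (2, 3): inner dimensions 3 and 2 differ
except ValueError as err:
    print("product not defined:", err)

print((W @ A.T).shape)     # (2, 2): (2, 3) @ (3, 2)
print((W.T @ A).shape)     # (3, 3): (3, 2) @ (2, 3)
```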

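NumPy has no dedicated function for computing the maximum of a function f(a) over a finite set S, but we can evaluate f on every element with a list comprehension and then apply np.max. Similarly, np.argmax gives the index of the element of S that attains the maximum.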

In [33]:
S = [2, 3, 0.5, 6, 1, -10]
def f(a):
    return a**2

np.max([f(a) for a in S]) 

100.0

In [34]:
S[np.argmax([f(a) for a in S])]

-10
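When f is built from operations that NumPy applies element-wise, as a**2 is, the list comprehension can be replaced by a single call on the whole array. This is an alternative sketch, not the approach used above; `S_arr`, `values`, `best_value` and `best_arg` are illustrative names.

```python
import numpy as np

S_arr = np.array([2, 3, 0.5, 6, 1, -10])

def f(a):
    return a**2                      # applied element-wise when a is an array

values = f(S_arr)                    # f evaluated on every element of S_arr
best_value = values.max()            # maximum of f over the set
best_arg = S_arr[values.argmax()]    # element of the set that attains it

print(best_value, best_arg)
```

This only works when f accepts arrays; for a general Python function the list comprehension remains the safe choice.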
