# NumPy

The original version of this notebook was used at the PSIML 2020 seminar (Practical Seminar in Machine Learning).
Check out the official NumPy documentation [here](https://numpy.org/doc/stable/).

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

In [1]:
import numpy as np

print("Numpy is loaded successfully!")

Numpy is loaded successfully!


## Arrays

In [2]:
# Define array
a = np.array([1,2,3])

# Some basic properties
print("Array a: ", a)
print("\nShape of array a: ", a.shape)
print("\nData type of array a: ", a.dtype)

Array a:  [1 2 3]

Shape of array a:  (3,)

Data type of array a:  int64


In [3]:
# Define matrix
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int64)

# Some basic properties
print("Matrix b: \n", b)
print("\nShape of matrix b: ", b.shape)
print("\nData type of matrix b: ", b.dtype)

Matrix b: 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Shape of matrix b:  (3, 3)

Data type of matrix b:  int64


In [4]:
# Multidim arrays - tensor
c = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]], dtype=np.float64)

# Some basic properties
print("Tensor c: \n", c)
print("\nShape of tensor c: ", c.shape)
print("\nData type of tensor c: ", c.dtype)

Tensor c: 
 [[[ 1.  2.  3.]
  [ 4.  5.  6.]]

 [[ 7.  8.  9.]
  [10. 11. 12.]]]

Shape of tensor c:  (2, 2, 3)

Data type of tensor c:  float64
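
Besides `shape` and `dtype`, a few other attributes are often useful when inspecting an array. The snippet below is a minimal sketch, assuming a tensor `t` with the same values as `c` above (the name `t` is only illustrative); it also shows `astype`, which returns a converted copy and leaves the original array unchanged.

```python
import numpy as np

# Same values as tensor c above; the name t is illustrative
t = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]], dtype=np.float64)

print("Number of dimensions:", t.ndim)    # 3
print("Number of elements:", t.size)      # 12
print("Bytes per element:", t.itemsize)   # 8 for float64

# astype returns a new array with the requested data type
as_int = t.astype(np.int32)
print("Converted dtype:", as_int.dtype)   # int32
print("Original dtype:", t.dtype)         # float64
```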


## Initialization functions

In [5]:
# All zeros
print("All zeros: \n", np.zeros((2,2)))

# All ones
print("\nAll ones: \n", np.ones((2,3,4)))

# All same value
print("\nAll same value: \n", np.full((2,2), 2))

# Identity matrix
print("\nIdentity matrix: \n", np.eye(3))

# Linspace
print("\nUniformly spaced elements: \n", np.linspace(start=0, stop=1, num=10, endpoint=False))

# Arange
print("\nNumpy array with range of numbers: \n", np.arange(start=0, stop=10, step=2, dtype=np.float64))

All zeros: 
 [[0. 0.]
 [0. 0.]]

All ones: 
 [[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]

All same value: 
 [[2 2]
 [2 2]]

Identity matrix: 
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Uniformly spaced elements: 
 [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]

Numpy array with range of numbers: 
 [0. 2. 4. 6. 8.]


NumPy offers a variety of ways to generate "random" (pseudo-random) numbers.

In [6]:
# All random
# Setting a random seed makes the results reproducible.
# It is good practice in ML to set it before moving to actual training, as it makes debugging a lot easier.
np.random.seed(5)
print("\nRandom uniform: \n", np.random.random((2,2)))

# Integers are drawn from [low, high). If high=None, they are drawn from [0, low).
# If size=None (the default), a single integer is returned.
print("\nRandom integer: \n", np.random.randint(low=2, high=9, size=(3, 3)))
print("\nRandom choice: \n", np.random.choice([1, 2, 3, 4, 5, 6], size=(4)))
print("\nSample from Binomial distribution: \n", np.random.binomial(n=10,p=0.5))


Random uniform: 
 [[0.22199317 0.87073231]
 [0.20671916 0.91861091]]

Random integer: 
 [[2 6 8]
 [5 2 8]
 [2 6 3]]

Random choice: 
 [6 1 4 5]

Sample from Binomial distribution: 
 4
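
The calls above rely on NumPy's legacy global random state. As a hedged aside, newer NumPy code (version 1.17 and later) often uses the `Generator` API obtained from `np.random.default_rng`. The sketch below mirrors the same kinds of draws; the seed value and variable names are illustrative, and the numbers it produces will differ from the output above.

```python
import numpy as np

# Illustrative seed; a Generator carries its own state instead of the global one
rng = np.random.default_rng(5)

uniform = rng.random((2, 2))                           # floats in [0, 1)
integers = rng.integers(low=2, high=9, size=(3, 3))    # integers in [2, 9)
choice = rng.choice([1, 2, 3, 4, 5, 6], size=4)        # sampled with replacement
binomial = rng.binomial(n=10, p=0.5)                   # successes in 10 trials

print("Random uniform:\n", uniform)
print("Random integer:\n", integers)
print("Random choice:\n", choice)
print("Sample from Binomial distribution:\n", binomial)

# A new generator with the same seed reproduces the same sequence of draws
rng_again = np.random.default_rng(5)
same = np.array_equal(uniform, rng_again.random((2, 2)))
print("Reproducible:", same)
```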


## Array indexing
Indexing starts from 0. Negative indices count from the end of the array (for example, -1 is the last element).

In [7]:
print("Array a: ", a)
print("First element of a: ", a[0])
print("Last element of a: ", a[2])
print("Last element of a: ", a[-1])

Array a:  [1 2 3]
First element of a:  1
Last element of a:  3
Last element of a:  3


Indexing a matrix or a tensor works the same way, with one index or slice per dimension, so we can select any row, column, or sub-block.

In [8]:
print("Matrix b: \n", b)
print("\nValue of b[0]: \n", b[0])
print("\nValue of b[-2]: \n", b[-2])
print("\nValue of b[0][1]: ", b[0][1])
print("Value of b[0, 1]: ", b[0, 1])
print("\nValue of b[0, :]: \n", b[0, :])
print("\nValue of b[0:2, 1:]: \n", b[0:2, 1:]) # General form is start:end (end is exclusive)

Matrix b: 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

Value of b[0]: 
 [1 2 3]

Value of b[-2]: 
 [4 5 6]

Value of b[0][1]:  2
Value of b[0, 1]:  2

Value of b[0, :]: 
 [1 2 3]

Value of b[0:2, 1:]: 
 [[2 3]
 [5 6]]


In [9]:
print("Tensor c: \n", c)
print("\nValue of c[0]: \n", c[0])
print("\nValue of c[-2]: \n", c[-2])
print("\nValue of c[0][1]: ", c[0][1])
print("Value of c[0][0][0]: ", c[0][0][0])
print("Value of c[0, 0, 0]: ", c[0, 0, 0])
print("\nValue of c[0, :, 0:2]: \n", c[0, :, 0:2])

Tensor c: 
 [[[ 1.  2.  3.]
  [ 4.  5.  6.]]

 [[ 7.  8.  9.]
  [10. 11. 12.]]]

Value of c[0]: 
 [[1. 2. 3.]
 [4. 5. 6.]]

Value of c[-2]: 
 [[1. 2. 3.]
 [4. 5. 6.]]

Value of c[0][1]:  [4. 5. 6.]
Value of c[0][0][0]:  1.0
Value of c[0, 0, 0]:  1.0

Value of c[0, :, 0:2]: 
 [[1. 2.]
 [4. 5.]]
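
The slices above use the `start:end` form. As a small supplementary sketch, a slice can also carry a step (`start:end:step`), and basic slicing returns a view of the original data, not a copy. The matrix `m` below is a made-up example.

```python
import numpy as np

# Made-up matrix with rows [0 1 2 3], [4 5 6 7], [8 9 10 11]
m = np.arange(12).reshape(3, 4)

every_other_col = m[:, ::2]   # columns 0 and 2
reversed_rows = m[::-1]       # rows in reverse order
last_two_cols = m[:, -2:]     # negative indices also work inside slices

print("Every other column:\n", every_other_col)
print("Rows reversed:\n", reversed_rows)
print("Last two columns:\n", last_two_cols)

# Basic slicing returns a view: writing through it also changes m
view = m[0, :2]
view[0] = 100
print("m[0, 0] after writing through the view:", m[0, 0])  # 100

# Use .copy() when an independent array is needed
independent = m[1, :2].copy()
independent[0] = -1
print("m[1, 0] after writing to the copy:", m[1, 0])       # 4
```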


### Exercise 1 
Create a 3x3 matrix with random integers between 0 and 9. Print the matrix and its shape. Print the first row of the matrix. Print the last column of the matrix. Print the middle element of the matrix. 

In [10]:
# initialize matrix with random values
d = np.random.randint(low=0, high=10, size=(3, 3))
print(f"Matrix d: \n {d}")
print(f"Matrix shape: {d.shape}")
print(f"First row: {d[0]}")
print(f"Last column: {d[:, -1]}")
print(f"Middle element: {d[d.shape[0] // 2, d.shape[1] // 2]}")

Matrix d: 
 [[1 4 6]
 [2 9 9]
 [9 9 1]]
Matrix shape: (3, 3)
First row: [1 4 6]
Last column: [6 9 1]
Middle element: 9


## Basic operations

In [11]:
x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

print("Matrix x: \n", x)
print("\nMatrix y: \n", y)

Matrix x: 
 [[1. 2.]
 [3. 4.]]

Matrix y: 
 [[5. 6.]
 [7. 8.]]


In [12]:
print("Addition:\n", x + y)
print("Subtraction:\n", y - x)
print("Elementwise multiplication:\n", x * y)
print("Multiplication:\n", np.matmul(x, y))
print("Multiplication (dot product):\n", np.dot(x, y))
print("Multiplication x@y:\n", x@y)
print("Division:\n", x / y)
print("x>2:\n", x>2)
print("Square root:\n", np.sqrt(x))
print("Exp:\n", np.exp(x))
print("Dot product:\n", np.dot(x[1], y[0]))
print("Transpose:\n", x.T)
print("Inverse:\n", np.linalg.inv(x))
print("Determinant:\n", np.linalg.det(y))

Addition:
 [[ 6.  8.]
 [10. 12.]]
Subtraction:
 [[4. 4.]
 [4. 4.]]
Elementwise multiplication:
 [[ 5. 12.]
 [21. 32.]]
Multiplication:
 [[19. 22.]
 [43. 50.]]
Multiplication (dot product):
 [[19. 22.]
 [43. 50.]]
Multiplication x@y:
 [[19. 22.]
 [43. 50.]]
Division:
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]
x>2:
 [[False False]
 [ True  True]]
Square root:
 [[1.         1.41421356]
 [1.73205081 2.        ]]
Exp:
 [[ 2.71828183  7.3890561 ]
 [20.08553692 54.59815003]]
Dot product:
 39.0
Transpose:
 [[1. 3.]
 [2. 4.]]
Inverse:
 [[-2.   1. ]
 [ 1.5 -0.5]]
Determinant:
 -2.000000000000005


In [13]:
w = [0, 3, -2, 9, 5]
print("w:\n", w)
print("Argmax w:\n", np.argmax(w))
print("Max w:\n", np.max(w))
print("Sorted w:\n", np.sort(w))

w:
 [0, 3, -2, 9, 5]
Argmax w:
 3
Max w:
 9
Sorted w:
 [-2  0  3  5  9]
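
The reductions above operate on a flat list. For matrices, the same functions accept an `axis` argument that selects the dimension to reduce over. A minimal sketch, using a made-up matrix `m`:

```python
import numpy as np

# Made-up 2x3 matrix
m = np.array([[3, 7, 1],
              [9, 2, 5]])

col_max = np.max(m, axis=0)         # max of each column: [9 7 5]
row_max = np.max(m, axis=1)         # max of each row: [7 9]
row_argmax = np.argmax(m, axis=1)   # column index of each row's max: [1 0]
total = np.sum(m)                   # no axis: reduce over all elements, 27
sorted_rows = np.sort(m, axis=1)    # sort each row independently

print("Max per column:", col_max)
print("Max per row:", row_max)
print("Argmax per row:", row_argmax)
print("Sum of all elements:", total)
print("Rows sorted:\n", sorted_rows)
```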


Boolean indexing: a boolean mask selects the elements at the positions where the mask is True.

In [14]:
arr = np.array([0, -3, 4, -1, 2, 8, 5])
print("arr:\n", arr)
print("Boolean indexing:\n", arr[[False, False, True, False, False, True, True]])
print("arr>2:\n", arr>2)
print("arr[arr>2]:\n", arr[arr>2])

arr:
 [ 0 -3  4 -1  2  8  5]
Boolean indexing:
 [4 8 5]
arr>2:
 [False False  True False False  True  True]
arr[arr>2]:
 [4 8 5]
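
Masks can also be combined. The following small sketch uses the same values as `arr` above with the element-wise operators `&`, `|` and `~`, plus `np.where`; the thresholds are arbitrary choices for illustration.

```python
import numpy as np

arr = np.array([0, -3, 4, -1, 2, 8, 5])

# Element-wise AND; the parentheses are required because & binds tighter than comparisons
between = arr[(arr > 0) & (arr < 6)]
print("0 < arr < 6:", between)                 # [4 2 5]

# Element-wise OR
extremes = arr[(arr < 0) | (arr > 5)]
print("arr < 0 or arr > 5:", extremes)         # [-3 -1  8]

# Element-wise NOT
non_negative = arr[~(arr < 0)]
print("not (arr < 0):", non_negative)          # [0 4 2 8 5]

# np.where chooses, per element, between two values based on the mask
clipped = np.where(arr < 0, 0, arr)
print("Negatives replaced by 0:", clipped)     # [0 0 4 0 2 8 5]
```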


### Exercise 2
Create a 3 x 4 matrix with random integers between 0 and 100. Reshape the matrix into an array with 12 elements (flatten) and filter out all odd elements. Find the maximum element of the resulting array.

In [15]:
d = np.random.randint(low=0, high=100, size=(3, 4))
print(f"Matrix d: \n {d}")
d = d.reshape((12,))
print(f"Reshaped array d: \n {d}")
d = d[d % 2 == 0]
print(f"Filtered array d: \n {d}")
max_element = np.max(d)
print(f"Max element: {max_element}")

Matrix d: 
 [[82 16 78  5]
 [58  0 80  4]
 [36 51 27 31]]
Reshaped array d: 
 [82 16 78  5 58  0 80  4 36 51 27 31]
Filtered array d: 
 [82 16 78 58  0 80  4 36]
Max element: 82


## Broadcasting
Broadcasting is one of the most important NumPy features. The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is "broadcast" across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations.

In [16]:
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 2.0])
print("a * b, a as vector, b as vector:", a * b)

# A one-element array is broadcast across a, just like a scalar would be
b = np.array([2])
print("a * b, a as vector, b as scalar:", a * b)

a * b, a as vector, b as vector: [2. 4. 6.]
a * b, a as vector, b as scalar: [2. 4. 6.]


In [17]:
a = np.array([[1,2,3], [4,5,6]])
b = np.array([2,4,6])

print("Matrix a:\n", a)
print("Vector b:", b)
print("a + b, a as matrix, b as vector:\n", a + b)
print("a * b, a as matrix, b as vector:\n", a * b)
print("Dot product of a and b:\n", np.dot(a, b))

Matrix a:
 [[1 2 3]
 [4 5 6]]
Vector b: [2 4 6]
a + b, a as matrix, b as vector:
 [[ 3  6  9]
 [ 6  9 12]]
a * b, a as matrix, b as vector:
 [[ 2  8 18]
 [ 8 20 36]]
Dot product of a and b:
 [28 64]
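
To make the constraints mentioned above concrete: shapes are compared starting from the trailing dimension, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below uses made-up arrays (`col`, `row`, `mat`) to show a compatible case, a case that needs an extra axis, and an incompatible case.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

# Shapes (3, 1) and (4,) broadcast to (3, 4)
table = col + row
print("Shape of col + row:", table.shape)
print(table)

mat = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])   # shape (2, 3)

# Column means have shape (3,), which lines up with the trailing dimension of mat
centered_cols = mat - mat.mean(axis=0)

# Row means need shape (2, 1) to broadcast; keepdims=True keeps the reduced axis
centered_rows = mat - mat.mean(axis=1, keepdims=True)
print("Rows centered:\n", centered_rows)

# Incompatible: trailing dimensions 3 and 2 differ and neither is 1
try:
    mat + np.array([1.0, 2.0])
except ValueError as err:
    print("Broadcast error:", err)
```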


### Exercise 3: Implement the sigmoid function

Sigmoid function:

\begin{equation*}
S(x) = \frac{1}{1 + e^{-x}}
\end{equation*}

You can find more at *https://en.wikipedia.org/wiki/Sigmoid_function*

In [18]:
def sigmoid(x):
    S = 1 / (1 + np.exp(-x))
    return S

In [19]:
print("Sigmoid of \"0\":", sigmoid(0))
print("Expected value: 0.5")
testArray = np.array([1,5])
print("Sigmoid of [1,5]:", sigmoid(testArray))
print("Expected value: [0.73105858 0.99330715]")

Sigmoid of "0": 0.5
Expected value: 0.5
Sigmoid of [1,5]: [0.73105858 0.99330715]
Expected value: [0.73105858 0.99330715]
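
The direct formula can trigger an overflow warning for large negative inputs, because `np.exp(-x)` becomes very large. One common workaround, shown here as a sketch that goes beyond the original exercise, evaluates the algebraically equivalent form $e^{x} / (1 + e^{x})$ for negative inputs. The name `stable_sigmoid` is hypothetical, and this version always returns an array (a scalar input gives a one-element array).

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never calls exp on a large positive number (illustrative sketch)."""
    x = np.atleast_1d(np.asarray(x, dtype=np.float64))
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) <= 1, so the standard form is safe
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, use exp(x) / (1 + exp(x)), where exp(x) < 1
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

print("Stable sigmoid of [1, 5]:", stable_sigmoid([1, 5]))
print("Stable sigmoid of [-1000, 0, 1000]:", stable_sigmoid([-1000.0, 0.0, 1000.0]))
```

For moderate inputs such as `[1, 5]` it should agree with `sigmoid` above; the difference only matters for inputs of large magnitude.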
