# Practical Introduction to Data Science: NumPy Practical (1)

Source file for tutorial: https://python-course.eu/numerical-programming/introduction-to-numpy.php

## Introduction to NumPy

In [1]:
import numpy

In [2]:
import numpy as np # the conventional alias is np

### Example 1: Converting a list of temperatures to Fahrenheit

In [3]:
cvalues = [20.1, 20.8, 21.9, 22.5, 22.7, 22.3, 21.8, 21.2, 20.9, 20.1]
C = np.array(cvalues) # turn our list "cvalues" into a one-dimensional NumPy array
print(C)

[20.1 20.8 21.9 22.5 22.7 22.3 21.8 21.2 20.9 20.1]


In [4]:
print(C * 9 / 5 + 32) #converting to Fahrenheit
print(C) #array C is unchanged by the expression

[68.18 69.44 71.42 72.5  72.86 72.14 71.24 70.16 69.62 68.18]
[20.1 20.8 21.9 22.5 22.7 22.3 21.8 21.2 20.9 20.1]


In [5]:
type(C)

numpy.ndarray

The conversion is much easier with a NumPy array than in plain Python, where we need a loop or a list comprehension, e.g.

In [6]:
fvalues = [ x*9/5 + 32 for x in cvalues] #Using Core Python
print(fvalues)

[68.18, 69.44, 71.42, 72.5, 72.86, 72.14, 71.24000000000001, 70.16, 69.62, 68.18]


### Time comparison between Python lists and NumPy arrays

In [19]:
import time
size_of_vec = 1000000

def pure_python_version():
    t1 = time.time()
    X = range(size_of_vec)
    Y = range(size_of_vec)
    Z = [X[i] + Y[i] for i in range(len(X)) ]
    return time.time() - t1

def numpy_version():
    t1 = time.time()
    X = np.arange(size_of_vec)    # numpy functions are called through the alias np
    Y = np.arange(size_of_vec)
    Z = X + Y                     #vector sum
    return time.time() - t1

In [20]:
t1 = pure_python_version()
t2 = numpy_version()

print(t1, t2)
print("NumPy is in this example " + str(t1/t2) + " times faster!")

0.3218955993652344 0.00400543212890625
NumPy is in this example 80.3647619047619 times faster!
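
A single `time.time()` measurement can vary a lot from run to run. As a rough sketch of a steadier comparison, the standard-library `timeit` module can repeat each version several times and keep the best result. The function names `add_lists` and `add_arrays` and the smaller vector size are assumptions made for this illustration; they are not part of the tutorial.

```python
import timeit
import numpy as np

size_of_vec = 100_000  # assumed smaller size so the repeated runs stay quick

def add_lists():
    # element-wise sum in core Python
    X = range(size_of_vec)
    Y = range(size_of_vec)
    return [X[i] + Y[i] for i in range(len(X))]

def add_arrays():
    # element-wise sum with NumPy
    X = np.arange(size_of_vec)
    Y = np.arange(size_of_vec)
    return X + Y

# best of 5 batches, each batch calling the function 10 times
t_list = min(timeit.repeat(add_lists, repeat=5, number=10)) / 10
t_numpy = min(timeit.repeat(add_arrays, repeat=5, number=10)) / 10

print(t_list, t_numpy)
print("Speed-up factor:", t_list / t_numpy)
```

The exact factor depends on the machine and on the vector size, so the numbers will differ from the ones shown above.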


## Creating NumPy Arrays

We can create multi-dimensional arrays in NumPy. 

### Zero-dimensional Arrays: Scalars

In [None]:
x = np.array(13)
print(type(x))
print(np.ndim(x)) #number of dimensions
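
As a small additional sketch (not from the tutorial), the shape of a zero-dimensional array is an empty tuple, and `item()` converts the value back to a plain Python scalar:

```python
import numpy as np

x = np.array(13)
print(x.ndim)     # 0
print(x.shape)    # () : an empty tuple, since there are no dimensions
print(x.item())   # 13, returned as a plain Python int
```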

### One-dimensional Arrays: Vectors

Unlike lists, arrays must contain items of the same type; vectors are homogeneous.

In [26]:
F = np.array([1, 1, 2, 3, 5, 8, 13, 21])
V = np.array([3.4, 6.9, 99.8, 12.8])
print("F: ", F)
print("V: ", V)
print("Type of F: ", F.dtype) #dtype to find out the type of the whole vector
print("Type of V: ", V.dtype)
print("Dimension of F: ", np.ndim(F))
print("Dimension of V: ", np.ndim(V))

F:  [ 1  1  2  3  5  8 13 21]
V:  [ 3.4  6.9 99.8 12.8]
Type of F:  int32
Type of V:  float64
Dimension of F:  1
Dimension of V:  1
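
To illustrate what "same type" means in practice, here is a minimal sketch with made-up values (the names `mixed`, `M` and `ints` are assumptions for this example). When the input mixes ints and floats, NumPy picks one common type for the whole array, and a float assigned into an integer array is truncated:

```python
import numpy as np

mixed = [1, 2.5, 3]        # a list may mix ints and floats
M = np.array(mixed)        # NumPy converts every item to one common type
print(M.dtype)             # float64

ints = np.array([1, 2, 3])
ints[0] = 7.9              # the array stays integer, so the value is truncated
print(ints[0])             # 7
```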


### Two- and Multidimensional Arrays (Matrices)

We create multidimensional arrays by passing nested lists or tuples to the array function of numpy.

In [27]:
A = np.array([ [3.4, 8.7, 9.9], 
               [1.1, -7.8, -0.7],
               [4.1, 12.3, 4.8]])
print(A)
print(A.ndim)

[[ 3.4  8.7  9.9]
 [ 1.1 -7.8 -0.7]
 [ 4.1 12.3  4.8]]
2


In [28]:
B = np.array([ [[111, 112], [121, 122]],
               [[211, 212], [221, 222]],
               [[311, 312], [321, 322]] ])
print(B)
print(B.ndim)

[[[111 112]
  [121 122]]

 [[211 212]
  [221 222]]

 [[311 312]
  [321 322]]]
3


### Shape of an Array

The function "shape" returns the shape of an array. The shape is a tuple of integers that denote the lengths of the corresponding dimensions. For the array below the output is (6, 3), which denotes 6 rows and 3 columns.

In [None]:
x = np.array([ [67, 63, 87],
               [77, 69, 59],
               [85, 87, 99],
               [79, 72, 71],
               [63, 89, 93],
               [68, 92, 78]])
print(np.shape(x))
# or look at the property of x for the same output
print(x.shape)
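
As a further sketch with small made-up arrays (`v`, `m` and `t` are names assumed for this example), the shape tuple has one entry per dimension, and the product of its entries is the number of elements:

```python
import numpy as np

v = np.array([1, 2, 3])                     # 1-D
m = np.array([[1, 2, 3], [4, 5, 6]])        # 2-D: 2 rows, 3 columns
t = np.array([[[111, 112], [121, 122]],
              [[211, 212], [221, 222]],
              [[311, 312], [321, 322]]])    # 3-D

print(v.shape)   # (3,)
print(m.shape)   # (2, 3)
print(t.shape)   # (3, 2, 2)
print(t.size)    # 12, the product of the shape entries
```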

The shape also indicates the order in which the indices are processed: first rows, then columns and then any further dimensions. "shape" can also be used to change the shape of an array, provided the total number of elements stays the same.

In [32]:
x.shape = (3, 6)
print(x)

[[67 63 87 77 69 59]
 [85 87 99 79 72 71]
 [63 89 93 68 92 78]]
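
Assigning to `shape` changes the array in place. As an alternative sketch (not used in the tutorial at this point), the `reshape` method returns a reshaped array and leaves the original shape untouched; passing -1 lets NumPy infer one length, and a shape with the wrong number of elements raises a `ValueError`:

```python
import numpy as np

x = np.arange(18)        # 18 elements: 0 .. 17
y = x.reshape(3, 6)      # x itself keeps its shape
z = x.reshape(6, -1)     # -1: NumPy infers the missing length (3 here)

print(x.shape)   # (18,)
print(y.shape)   # (3, 6)
print(z.shape)   # (6, 3)

try:
    x.reshape(4, 5)      # 4 * 5 = 20 elements, but x has 18
except ValueError as err:
    print("reshape failed:", err)
```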


### Indexing and Slicing

Accessing and indexing the elements of an array works much as it does for lists and tuples.

In [33]:
# Single indexing

F = np.array([1, 1, 2, 3, 5, 8, 13, 21])
# print the first element of F
print(F[0])
# print the last element of F
print(F[-1])

1
21


In [35]:
# Indexing multidimensional arrays

A = np.array([ [3.4, 8.7, 9.9], 
               [1.1, -7.8, -0.7],
               [4.1, 12.3, 4.8]])

print(A[1][0]) 
# we're accessing the element in the 2nd row and 1st column

# This is not as efficient as we are creating an 
# intermediate array:
tmp = A[1]
print(tmp)
print(tmp[0])

1.1
[ 1.1 -7.8 -0.7]
1.1


In [36]:
print(A[1, 0]) # more efficient

1.1


In [37]:
# Slicing in the same way as with lists

S = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print(S[2:5])
print(S[:4])
print(S[6:])
print(S[:])

[2 3 4]
[0 1 2 3]
[6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]


In [38]:
# Multidimensional slicing

A = np.array([
[11, 12, 13, 14, 15],
[21, 22, 23, 24, 25],
[31, 32, 33, 34, 35],
[41, 42, 43, 44, 45],
[51, 52, 53, 54, 55]])

print(A[:3, 2:]) 
# rows first, then columns:
# rows from the start up to but not including index 3,
# columns from index 2 up to the end

[[13 14 15]
 [23 24 25]
 [33 34 35]]


In [44]:
print(A[3:, :]) # slice the rows only, keep all columns

[[41 42 43 44 45]
 [51 52 53 54 55]]


In [45]:
X = np.arange(28).reshape(4, 7)
print(X)
print(X[::2, ::3]) # the third slice parameter is the step

[[ 0  1  2  3  4  5  6]
 [ 7  8  9 10 11 12 13]
 [14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27]]
[[ 0  3  6]
 [14 17 20]]


Slicing creates a new view on the original array rather than a copy. This has performance benefits because no data is copied, but it also means that changing the view changes the original array.
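
A minimal sketch of this behaviour, using a made-up array: writing through a slice changes the original, while an explicit `copy()` is independent. `np.shares_memory` is used here only to make the difference visible.

```python
import numpy as np

A = np.arange(10)
S = A[2:6]            # a view: shares its data with A
S[0] = 99             # writing through the view changes A as well
print(A)              # [ 0  1 99  3  4  5  6  7  8  9]

C = A[2:6].copy()     # an independent copy
C[1] = -1             # A is not affected
print(A[3])           # 3

print(np.shares_memory(A, S), np.shares_memory(A, C))   # True False
```

Use `copy()` whenever the slice should be modified without touching the original data.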

### Initializing Arrays with Ones and Zeros

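There are two ways of initializing arrays with zeros and ones. The first, np.ones(t) or np.zeros(t), takes a tuple t with the shape of the array and fills it with ones (or zeros). By default the elements are of type float. The second, np.ones_like(a) or np.zeros_like(a), takes the shape from an existing array a.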

In [46]:
# Ones
E = np.ones((2,3))
print(E)

F = np.ones((3,4),dtype=int)
print(F)

[[1. 1. 1.]
 [1. 1. 1.]]
[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]


In [47]:
# Zeros
Z = np.zeros((2,4))
print(Z)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [48]:
# Ones and zeros with the same shape as an existing array x
x = np.array([2,5,18,14,4])
E = np.ones_like(x)
print(E)

Z = np.zeros_like(x)
print(Z)

[1 1 1 1 1]
[0 0 0 0 0]
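
The `_like` functions copy the dtype as well as the shape, unless a dtype is given explicitly. A small sketch with a made-up two-dimensional float array:

```python
import numpy as np

x = np.array([[1.5, 2.5], [3.5, 4.5]])   # a 2 x 2 float array

E = np.ones_like(x)                # same shape and dtype as x
Z = np.zeros_like(x, dtype=int)    # same shape, dtype overridden

print(E.shape, E.dtype)   # (2, 2) float64
print(Z.shape, Z.dtype)   # (2, 2) and the platform's default integer type
```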


### Identity matrix / Identity Array

In linear algebra, the identity matrix, or unit matrix, of size n is the n × n **square matrix** with ones on the main diagonal and zeros elsewhere. NumPy offers two ways to create one: the identity function and the eye function.

In [49]:
# identity function, identity(n, dtype=None)
# n is an int, the number of rows and columns
# dtype, default is float

np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [51]:
np.identity(4, dtype=int)

array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])

In [58]:
# eye function, eye(N, M=None, k=0, dtype=float)
# can be rectangular!
# N is the number of rows, M is the number of columns (default is N)
# k is the position of the diagonal: 0 is the main diagonal,
# positive values are above it, negative values below it

np.eye(5, 8, k=1, dtype=int)

array([[0, 1, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0]])
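
As a short sketch tying the two functions together (the matrix `M` is made up for this example): for a square size and k=0 both functions give the same result, multiplying by the identity matrix leaves a matrix unchanged, and a negative k places the ones below the main diagonal.

```python
import numpy as np

I3 = np.identity(3)
print(np.array_equal(I3, np.eye(3)))   # True

M = np.array([[2., 0., 1.],
              [1., 3., 5.],
              [4., 4., 4.]])
print(np.array_equal(M @ I3, M))       # True: M times the identity is M

L = np.eye(3, k=-1, dtype=int)         # ones on the first diagonal below the main one
print(L)
```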