# Basics of NumPy

Documentation: https://numpy.org/doc/stable/reference/routines.linalg.html

In [168]:
import numpy as np

It is possible to specify the data type of the array elements using the dtype parameter. The following code shows how to create an array of 16-bit integers:

In [169]:
a = np.array([1,3,6], dtype='int16')
b = np.array([[1,3,6,5,9,10], [4.5,3.7,6.8,9,0,8]])

print(a)
print(b)

[1 3 6]
[[ 1.   3.   6.   5.   9.  10. ]
 [ 4.5  3.7  6.8  9.   0.   8. ]]


## Getting the dimensions of an array

How many dimensions does the array have?

In [170]:
b.ndim

2

## Getting the shape of the data

(number of arrays/ rows, number of elements/ columns)

In [171]:
b.shape

(2, 6)

## Getting the type of a variable

In [172]:
a.dtype

dtype('int16')
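
When no dtype is given, NumPy infers one from the values. The following is a minimal sketch of that behaviour; the variable names (`floats`, `small`, `as_float`) are illustrative and not part of the notebook above.

```python
import numpy as np

# A single float in the input promotes every element to float64.
floats = np.array([1, 3.5, 6])
print(floats.dtype)    # float64

# An explicit dtype overrides the inference.
small = np.array([1, 3, 6], dtype=np.int16)
print(small.dtype)     # int16

# astype returns a converted copy; the original array keeps its dtype.
as_float = small.astype(np.float32)
print(as_float.dtype)  # float32
print(small.dtype)     # int16
```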

## Getting the size of an object

Note that itemsize is the size of a single element in bytes, so the dtype chosen at initialization leads to different values for the two arrays: 2 bytes for int16 and 8 bytes for float64.

In [173]:
a.itemsize

2

In [174]:
b.itemsize

8

## Getting the total size of the array

itemsize * number of items in the array

In this example: size of one element (2 bytes) * 3 items in the array = 6 bytes

In [175]:
a.nbytes

6
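
The relationship between nbytes, size and itemsize can be checked directly. This is a small sketch with illustrative variable names (`small`, `wide`):

```python
import numpy as np

small = np.array([1, 3, 6], dtype=np.int16)   # 2 bytes per element
wide = np.array([1, 3, 6], dtype=np.float64)  # 8 bytes per element

# size is the number of elements, itemsize the bytes per element.
print(small.size, small.itemsize, small.nbytes)  # 3 2 6
print(wide.size, wide.itemsize, wide.nbytes)     # 3 8 24

# nbytes is always the product of the two.
print(small.nbytes == small.size * small.itemsize)  # True
```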

## Accessing elements

Get a specific element [row, column]. Alternatively one can think of it as [array, index]

In [176]:
b[1, 3]

9.0

Get a specific row. Returns all the columns/ elements from an array.

In [177]:
b[1, :]

array([4.5, 3.7, 6.8, 9. , 0. , 8. ])

Get a specific column (for all rows/ arrays). So the result will be one element from each row/ array at the specified index.

In [178]:
b[:, 2]

array([6. , 6.8])

Getting more specific elements [row/ array, startindex:endindex:stepsize]. The end index is exclusive.

In [179]:
b[1, 1:6:2]

array([3.7, 9. , 8. ])
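
A few more slice patterns on a single row, as a hedged sketch. The array `row` is a standalone copy of the second row used above, and `.tolist()` is only used to make the printed output unambiguous.

```python
import numpy as np

row = np.array([4.5, 3.7, 6.8, 9.0, 0.0, 8.0])

# start:stop:step, where stop is exclusive
print(row[1:6:2].tolist())  # [3.7, 9.0, 8.0]

# Omitted start/stop default to the beginning/end of the array.
print(row[::2].tolist())    # [4.5, 6.8, 0.0]

# Negative indices count from the end.
print(row[-2:].tolist())    # [0.0, 8.0]

# A negative step walks the array backwards.
print(row[::-1].tolist())   # [8.0, 0.0, 9.0, 6.8, 3.7, 4.5]
```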

## Changing elements

As simple as accessing the element and re-assigning it to the new value.

In [180]:
b[1,3] = 20
print(b)

[[ 1.   3.   6.   5.   9.  10. ]
 [ 4.5  3.7  6.8 20.   0.   8. ]]


Changing the value for all the rows to a specific value. In this example all the elements of each row/ array at the index/ column 2 are being assigned to the number 5.

In [181]:
b[:,2] = [5]
print(b)

[[ 1.   3.   5.   5.   9.  10. ]
 [ 4.5  3.7  5.  20.   0.   8. ]]


This also works if you want to assign a different number to each row instead of the same one. Say you want to set the element at index 2 to 1 in the first row/ array, and to 3 in the second row/ array. The list needs one value per row:

If you have two rows/ arrays: [value for row 1, value for row 2]

If you have three rows/ arrays: [value for row 1, value for row 2, value for row 3]

and so on...

In [182]:
b[:,2] = [1, 3]
print(b)

[[ 1.   3.   1.   5.   9.  10. ]
 [ 4.5  3.7  3.  20.   0.   8. ]]
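
The same rule can be sketched on an array with three rows. The array `grid` is made up for this illustration; the last assignment is expected to fail because the number of values does not match the number of rows.

```python
import numpy as np

grid = np.zeros((3, 4))

# A single value is broadcast to every row of the column.
grid[:, 2] = 5

# One value per row: the list length must equal the number of rows.
grid[:, 0] = [1, 2, 3]
print(grid.tolist())
# [[1.0, 0.0, 5.0, 0.0], [2.0, 0.0, 5.0, 0.0], [3.0, 0.0, 5.0, 0.0]]

# Two values cannot be spread over three rows, so NumPy raises ValueError.
mismatch_raised = False
try:
    grid[:, 1] = [1, 2]
except ValueError as err:
    mismatch_raised = True
    print("shape mismatch:", err)
```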


## Working with more dimensions

Access from the outside in. Start with the outermost block you want to access, move on to the row/ array inside it, then access the column/ index in question.

Example: Accessing the second block (index 1), then its first row/ array (index 0) and finally the second element (index 1).

In [183]:
c = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
c[1,0,1]

6
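
To see each step of the outside-in access separately, here is a short sketch. The name `cube` is illustrative and holds the same values as `c` above.

```python
import numpy as np

cube = np.array([[[1, 2], [3, 4]],
                 [[5, 6], [7, 8]]])
print(cube.shape)              # (2, 2, 2)

# Each additional index peels off one dimension.
print(cube[1].tolist())        # [[5, 6], [7, 8]]
print(cube[1, 0].tolist())     # [5, 6]
print(int(cube[1, 0, 1]))      # 6

# A colon keeps a whole dimension: the first row of every block.
print(cube[:, 0, :].tolist())  # [[1, 2], [5, 6]]
```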

## Ways to initialize arrays

All 0s matrix. One can specify the shape in the tuple (number of rows, number of columns)

In [184]:
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

It is also possible to initialize more dimensions. For a 3-D array the tuple reads (number of blocks, number of rows per block, number of columns per row)

In [185]:
np.zeros((2,3,2))

array([[[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]]])

All 1s matrix. Similar to the all 0s.

In [186]:
np.ones((2,3,2))

array([[[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]]])

Any other number. Second parameter takes in the number that should fill the matrix.

In [187]:
np.full((2,4),99)

array([[99, 99, 99, 99],
       [99, 99, 99, 99]])

It is also possible to create a new matrix with the same shape and dtype as an existing matrix using the full_like function. (matrix whose shape should be copied, number that should be inserted)

In [188]:
np.full_like(b, 99)

array([[99., 99., 99., 99., 99., 99.],
       [99., 99., 99., 99., 99., 99.]])

A matrix of random decimal numbers in the interval [0, 1), using the random.rand function. The desired shape of the array/ matrix is passed as separate arguments, not as a tuple.

In [189]:
np.random.rand(4, 2)

array([[0.11838582, 0.83756093],
       [0.78488659, 0.18996506],
       [0.84684159, 0.86204994],
       [0.74844822, 0.59994619]])

It is also possible to use random values with the shape from another matrix using random_sample, similar to the full_like method just with random numbers.

In [190]:
np.random.random_sample(b.shape)

array([[0.46120195, 0.75501192, 0.33708656, 0.05279249, 0.33446635,
        0.05844929],
       [0.06443765, 0.15244822, 0.63819267, 0.19351315, 0.51574339,
        0.80884471]])

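Random integer values with random.randint. It takes the starting value (inclusive), the upper bound (exclusive) and the size (shape) of the matrix. If only one bound is given, it is used as the upper bound and the range starts at 0.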

In [191]:
np.random.randint(5, 30, size=(3,4))

array([[21, 17,  8, 13],
       [26, 20, 19, 21],
       [29, 13, 17, 18]])
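
The functions above belong to NumPy's legacy random interface. As an alternative, the following sketch uses the Generator interface created with `np.random.default_rng`; the seed value 42 and the variable names are arbitrary choices for this example.

```python
import numpy as np

# A seeded generator produces the same numbers on every run.
rng = np.random.default_rng(seed=42)

# Floats in [0, 1); the shape is passed as a tuple.
uniform = rng.random((4, 2))

# Integers from 5 (inclusive) up to 30 (exclusive).
integers = rng.integers(5, 30, size=(3, 4))

print(uniform.shape)   # (4, 2)
print(integers.shape)  # (3, 4)
```

Seeding is optional; without a seed the generator is initialized from the operating system's entropy source.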

Identity matrix. Only need one value, because it will always be a square matrix.

In [192]:
np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

Repeating an existing array. First specify which array should be repeated. Then specify the number of repetitions. At the end, specify the axis in question.

In [193]:
arr = np.array([[1,2,3]])
np.repeat(arr, 3, axis=0)

array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
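
The axis argument decides what gets repeated. The sketch below contrasts both axes and adds `np.tile`, which repeats the array as a whole instead of element by element; `row` is an illustrative name for the same array as `arr` above.

```python
import numpy as np

row = np.array([[1, 2, 3]])

# axis=0 repeats whole rows.
print(np.repeat(row, 3, axis=0).tolist())  # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]

# axis=1 repeats each element within the row.
print(np.repeat(row, 2, axis=1).tolist())  # [[1, 1, 2, 2, 3, 3]]

# np.tile repeats the full pattern: once along axis 0, twice along axis 1.
print(np.tile(row, (1, 2)).tolist())       # [[1, 2, 3, 1, 2, 3]]
```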

In [194]:
test_array = np.zeros((5,5))

test_array[0:5:4, :] = [1]
test_array[:, 0:5:4] = [1]

test_array[2,2] = 9

print(test_array)

[[1. 1. 1. 1. 1.]
 [1. 0. 0. 0. 1.]
 [1. 0. 9. 0. 1.]
 [1. 0. 0. 0. 1.]
 [1. 1. 1. 1. 1.]]


In [195]:
test_array2 = np.ones((5,5))

injection_array = np.zeros((3,3))
injection_array[1,1]=9

test_array2[1:4, 1:4] = injection_array

print(test_array2)

[[1. 1. 1. 1. 1.]
 [1. 0. 0. 0. 1.]
 [1. 0. 9. 0. 1.]
 [1. 0. 0. 0. 1.]
 [1. 1. 1. 1. 1.]]


## Being careful when copying arrays

Assigning an array to a new variable does not copy the array; it only creates a second reference to the same data. To really copy the content and leave the original array untouched, use .copy().

"Wrong" way:

In [196]:
a = np.array([1,2,3])
b = a
b[0] = 100

print(a)
print(b)

[100   2   3]
[100   2   3]


Right way:

In [197]:
a = np.array([1,2,3])
b = a.copy()
b[0] = 100

print(a)
print(b)

[1 2 3]
[100   2   3]
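
The same caveat applies to slices, which are views on the original data. A minimal sketch with illustrative names; `np.shares_memory` is used to check whether two arrays use the same memory.

```python
import numpy as np

original = np.array([1, 2, 3, 4])

# Basic slicing returns a view, so writing to it changes the original.
view = original[1:3]
view[0] = 100
print(original.tolist())  # [1, 100, 3, 4]

# A copied slice is independent of the original.
independent = original[1:3].copy()
independent[0] = -1
print(original.tolist())  # [1, 100, 3, 4]

print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False
```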


## Basic mathematical operations

In [198]:
a = np.array([1,2,3])

a - 2

array([-1,  0,  1])

In [199]:
a * 2

array([2, 4, 6])

In [200]:
a ** 2

array([1, 4, 9])

In [201]:
np.cos(a)

array([ 0.54030231, -0.41614684, -0.9899925 ])

In [202]:
b = np.array([4,5,6])

a + b

array([5, 7, 9])

In [203]:
a * b

array([ 4, 10, 18])
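
Element-wise operations also work between arrays of different shapes as long as the shapes are compatible (broadcasting). A hedged sketch, with made-up arrays `matrix` and `offsets`:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
offsets = np.array([10, 20, 30])

# The 1-D array is applied to every row of the 2-D array.
print((matrix + offsets).tolist())  # [[11, 22, 33], [14, 25, 36]]

# Shapes (2, 3) and (2,) do not line up on the last axis, so this fails.
broadcast_failed = False
try:
    matrix + np.array([1, 2])
except ValueError as err:
    broadcast_failed = True
    print("cannot broadcast:", err)
```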

## Linear Algebra

For matrix multiplication, the number of columns of the first matrix needs to be equal to the number of rows of the second matrix.

In [204]:
a = np.ones((2,3))
print(a)

b = np.full((3,2), 3)
print(b)

np.matmul(a,b)

[[1. 1. 1.]
 [1. 1. 1.]]
[[3 3]
 [3 3]
 [3 3]]


array([[9., 9.],
       [9., 9.]])
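
The `@` operator is shorthand for `np.matmul`. The sketch below also shows what happens when the inner dimensions do not match; the names `left` and `right` are illustrative.

```python
import numpy as np

left = np.ones((2, 3))
right = np.full((3, 2), 3)

# (2, 3) @ (3, 2) gives a (2, 2) result.
product = left @ right
print(product.tolist())  # [[9.0, 9.0], [9.0, 9.0]]

# (2, 3) @ (2, 3) fails: 3 columns on the left, 2 rows on the right.
mismatch_raised = False
try:
    left @ left
except ValueError as err:
    mismatch_raised = True
    print("incompatible shapes:", err)
```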

Finding the determinant of a matrix:

In [205]:
c = np.identity(1)

np.linalg.det(c)

1.0

Finding the eigenvalues for a matrix:

In [206]:
# eigvals expects a square 2-D array, so the 1x1 identity matrix c from above is used
np.linalg.eigvals(c)

array([1.])
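
A 1x1 matrix makes both results trivial. As a slightly larger sketch, the diagonal matrix below (made up for this illustration) has its eigenvalues on the diagonal, and its determinant is their product. Results are compared with a tolerance because the routines work in floating point.

```python
import numpy as np

diag = np.array([[2.0, 0.0],
                 [0.0, 3.0]])

det = np.linalg.det(diag)
eigenvalues = np.linalg.eigvals(diag)

# For a diagonal matrix the eigenvalues are the diagonal entries.
print(np.allclose(np.sort(eigenvalues), [2.0, 3.0]))  # True

# The determinant equals the product of the eigenvalues.
print(np.isclose(det, np.prod(eigenvalues)))          # True
```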

## Statistics

In [207]:
stats = np.array([[2, 4, 6, 10], [1,3,5,7]])

np.min(stats)

1

In [208]:
np.max(stats)

10

Returns the smallest value of each column (axis=0 compares across the rows):

In [209]:
np.min(stats, axis=0)

array([1, 3, 5, 7])

Returns the smallest value of each row (axis=1 compares across the columns):

In [210]:
np.min(stats, axis=1)

array([2, 1])

Returns the sum of all the values in the array:

In [211]:
np.sum(stats)

38

Returns the sum of columns. In this example the following applies:

[2, 4, 6, 10]

plus

[1, 3, 5, 7]

equals

[3, 7, 11, 17]

In [212]:
np.sum(stats, axis=0)

array([ 3,  7, 11, 17])

Returns the sum of the values in each row:

In [213]:
np.sum(stats, axis=1)

array([22, 16])
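
One way to remember the axis argument: the given axis is the one that gets collapsed. A short sketch on the same values as `stats`; the `keepdims` argument is an addition not used above.

```python
import numpy as np

stats = np.array([[2, 4, 6, 10],
                  [1, 3, 5, 7]])

# axis=0 collapses the rows, leaving one result per column.
print(stats.sum(axis=0).tolist())   # [3, 7, 11, 17]

# axis=1 collapses the columns, leaving one result per row.
print(stats.sum(axis=1).tolist())   # [22, 16]
print(stats.mean(axis=1).tolist())  # [5.5, 4.0]

# keepdims=True keeps the collapsed axis with length 1.
print(stats.sum(axis=1, keepdims=True).shape)  # (2, 1)
```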

## Reorganizing/ Reshaping arrays

Gives a new shape to an array without changing its data. To reshape the array, the new shape must be compatible with the initial shape.

This means if a shape is for example (2, 4), any new shape whose dimensions multiply to 2 * 4 = 8 is possible. This means the following are all valid reshapes:

- (1, 8)
- (4, 2)
- (8, 1)
- (2, 2, 2)

In [214]:
initial = np.array([[1,2,3,4],[5,6,7,8]])
print(initial)
np.shape(initial)

[[1 2 3 4]
 [5 6 7 8]]


(2, 4)

In [215]:
after = initial.reshape((4,2))
print(after)
np.shape(after)

[[1 2]
 [3 4]
 [5 6]
 [7 8]]


(4, 2)
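
One dimension can be given as -1, in which case NumPy computes it from the total number of elements. The sketch below also shows the error raised for an incompatible shape; `values` is an illustrative array with the same content as `initial`.

```python
import numpy as np

values = np.arange(1, 9).reshape(2, 4)  # the numbers 1..8 in two rows

# -1 lets NumPy infer the missing dimension.
print(values.reshape(-1, 2).shape)     # (4, 2)
print(values.reshape(2, 2, -1).shape)  # (2, 2, 2)
print(values.reshape(-1).tolist())     # [1, 2, 3, 4, 5, 6, 7, 8]

# 8 elements cannot fill a 3x3 array, so this raises ValueError.
reshape_failed = False
try:
    values.reshape(3, 3)
except ValueError as err:
    reshape_failed = True
    print("invalid reshape:", err)
```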

It is possible to stack two matrices on top of each other, i.e. vertically:

In [216]:
stack1 = np.array([1,2,3])
stack2 = np.array([4,5,6]) 

np.vstack([stack1, stack2])

array([[1, 2, 3],
       [4, 5, 6]])

They can be stacked as many times as desired:

In [217]:
np.vstack([stack1, stack2, stack1, stack2])

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

The same is also possible with horizontal stacking, i.e. adding the values of one matrix to the end of another matrix:

In [218]:
np.hstack([stack1, stack2])

array([1, 2, 3, 4, 5, 6])

In [219]:
stack3 = np.zeros((2,3))
stack4 = np.ones((2,3))

np.hstack([stack3, stack4])

array([[0., 0., 0., 1., 1., 1.],
       [0., 0., 0., 1., 1., 1.]])
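
Stacking only works when the arrays agree on the dimension that is not being extended. A minimal sketch with made-up arrays `top` and `bottom`; `np.concatenate` is shown as the more general function with an explicit axis.

```python
import numpy as np

top = np.zeros((2, 3))
bottom = np.ones((1, 3))

# Vertical stacking needs the same number of columns.
stacked = np.vstack([top, bottom])
print(stacked.shape)  # (3, 3)

# np.concatenate with axis=0 gives the same result for 2-D inputs.
print(np.array_equal(stacked, np.concatenate([top, bottom], axis=0)))  # True

# Horizontal stacking needs the same number of rows (2 vs. 1 here).
hstack_failed = False
try:
    np.hstack([top, bottom])
except ValueError as err:
    hstack_failed = True
    print("cannot stack:", err)
```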

## Importing data from files

It is generally recommended to use Pandas to handle data, but NumPy also offers functions to import data from files, such as genfromtxt():

In [220]:
filedata = np.genfromtxt('data.txt', delimiter=',')
# it is possible to cast the data to a specific data type
filedata = filedata.astype('int32')
print(filedata)

[[  1  13  21  11 196  75   4   3  34   6   7   8   0   1   2   3   4   5]
 [  3  42  12  33 766  75   4  55   6   4   3   4   5   6   7   0  11  12]
 [  1  22  33  11 999  11   2   1  78   0   1   2   9   8   7   1  76  88]]


## Boolean masking & advanced indexing

Can be used to find specific values in data which conform to specified boolean operations.

In [221]:
filedata[filedata > 50]

array([196,  75, 766,  75,  55, 999,  78,  76,  88], dtype=int32)

In [222]:
filedata[(filedata > 50) & (filedata < 100)]

array([75, 75, 55, 78, 76, 88], dtype=int32)
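
The comparison itself produces a boolean array, which can be stored, combined and also used for assignment. A small sketch on made-up values (`data` is not the file data from above):

```python
import numpy as np

data = np.array([[1, 13, 196],
                 [3, 766, 75]])

# The mask has the same shape as the data.
mask = data > 50
print(mask.tolist())        # [[False, False, True], [False, True, True]]
print(data[mask].tolist())  # [196, 766, 75]

# | means "or"; each condition needs its own parentheses.
print(data[(data < 5) | (data > 500)].tolist())  # [1, 3, 766]

# A mask on the left-hand side assigns to the matching elements only.
capped = data.copy()
capped[capped > 100] = 100
print(capped.tolist())      # [[1, 13, 100], [3, 100, 75]]
```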

It is possible to index with a list in NumPy. In the following example we are accessing the first row (0) and inside this row we are accessing the indices [1,2,3] to get their values:

In [223]:
filedata[(0,[1,2,3])]

array([13, 21, 11], dtype=int32)

Finding out if any values pass a specified boolean operation on a given axis. In this example it checks all the values in each column. 
```
[[ 1 ]
 [ 766 ]
 [ 11 ]]
 ```
This example would give the following result using axis=0:

```
[[ True ]]
 ```

In [224]:
np.any(filedata > 50, axis=0)

array([False, False, False, False,  True,  True, False,  True,  True,
       False, False, False, False, False, False, False,  True,  True])

The same is also possible with the all() function, which checks if all the values along a given axis match the boolean operation.

```
[[ 1 ]
 [ 766 ]
 [ 11 ]]
 ```
This example would give the following result using axis=0:

```
[[ False ]]
 ```
---
 ```
[[ 766 ]
 [ 766 ]
 [ 766 ]]
 ```
This example would give the following result using axis=0:

```
[[ True ]]
 ```

In [225]:
np.all(filedata > 50, axis=0)

array([False, False, False, False,  True, False, False, False, False,
       False, False, False, False, False, False, False, False, False])
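
Both checks can be compared side by side on a small array. The two columns below are made up to mirror the single-column illustrations above: the first is mixed, the second contains only 766.

```python
import numpy as np

data = np.array([[1, 766],
                 [766, 766],
                 [11, 766]])

# axis=0 evaluates each column.
print(np.any(data > 50, axis=0).tolist())  # [True, True]
print(np.all(data > 50, axis=0).tolist())  # [False, True]

# axis=1 evaluates each row.
print(np.any(data > 50, axis=1).tolist())  # [True, True, True]
print(np.all(data > 50, axis=1).tolist())  # [False, True, False]
```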

In [227]:
test_index = np.array([[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20],[21,22,23,24,25],[26,27,28,29,30]])
print(test_index)

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]
 [26 27 28 29 30]]


In [235]:
test_index[2:4, 0:2]

array([[11, 12],
       [16, 17]])

In [254]:
test_index[[0,1,2,3],[1,2,3,4]] 

array([ 2,  8, 14, 20])

In [267]:
test_index[[0,4,5], 3:5] 

array([[ 4,  5],
       [24, 25],
       [29, 30]])
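
When two index lists are given, NumPy pairs them up element by element instead of selecting a rectangular block. The sketch below rebuilds the same 6x5 array under the illustrative name `grid` and adds `np.ix_`, which is one way to select every combination of the listed rows and columns.

```python
import numpy as np

grid = np.arange(1, 31).reshape(6, 5)

# Paired lists pick the positions (0,1), (1,2), (2,3) and (3,4).
print(grid[[0, 1, 2, 3], [1, 2, 3, 4]].tolist())  # [2, 8, 14, 20]

# np.ix_ selects all combinations of rows 0, 4 and columns 3, 4.
print(grid[np.ix_([0, 4], [3, 4])].tolist())      # [[4, 5], [24, 25]]

# Indexing with a list returns a copy, so the original stays unchanged.
picked = grid[[0, 4, 5], 3:5]
picked[0, 0] = -1
print(int(grid[0, 3]))                            # 4
```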