# Introduction to NumPy
---

### Import of Python Libraries
- Import a library with `import name as alias`, e.g. `import numpy as np`
- All functions of the library can then be called through the alias, e.g. `np.array()` to create a NumPy array or `np.sum(values)` to sum over `values`
- Official NumPy Tutorial: https://numpy.org/learn/
- Here we only cover a handful of functions that are likely to matter for our Opinion Dynamics project

In [2]:
# Import Numpy
import numpy as np

### Creation of NumPy Arrays:
- The core of NumPy is the NumPy array (`ndarray`), an n-dimensional grid of values; a 2D array behaves like a matrix
- The whole library is built around using these arrays for numerical computation
- We can build arrays by hand (from specific values) or with a NumPy function

In [5]:
# 2D Matrix by Hand:
arr1 = np.array([[1,2,3],[4,5,6],[7,8,9]])

print('2D matrix of 3 row and column elements:\n', arr1, '\n')

# 6x4 Matrix of random values inside range [-10,10)
arr2 = np.random.uniform(low=-10, high=10, size=(6,4))

print('6x4 Matrix of random values inside range [-10,10):\n', arr2, '\n')

# 10 linearly spaced values between 1 and 10 (1D array, both endpoints included)
arr3 = np.linspace(start=1, stop=10, num=10)

print('10 linearly spaced values between 1 and 10:\n', arr3, '\n')

# Transform the previous row-vector to a column-vector of shape (10, 1)
arr4 = arr3[:,np.newaxis]

print('Previous row-vector as column-vector:\n', arr4, '\n')

2D matrix of 3 row and column elements:
 [[1 2 3]
 [4 5 6]
 [7 8 9]] 

6x4 Matrix of random values inside range [-10,10):
 [[-3.41414117 -8.30284261  6.10022431 -0.43188695]
 [ 3.38141135 -1.91676864  7.00523072 -5.11327225]
 [ 8.70230954 -5.88653805 -7.94001931  6.54621209]
 [ 4.48592229 -4.57851461  4.46324927  6.07227625]
 [-2.43082735 -6.95534859  0.16585593  5.59416252]
 [-2.85836037 -4.64664758  0.81735982 -0.84490079]] 

10 linearly spaced values between 1 and 10:
 [ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10.] 

Previous row-vector as column-vector:
 [[ 1.]
 [ 2.]
 [ 3.]
 [ 4.]
 [ 5.]
 [ 6.]
 [ 7.]
 [ 8.]
 [ 9.]
 [10.]] 
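
The difference between a 1D array and a column vector is easiest to see in the `shape` attribute. The following is a minimal sketch; the names `row`, `col`, `zeros` and `steps` are chosen for illustration only, and `np.zeros` and `np.arange` are two further NumPy creation functions that are not used elsewhere in this notebook.

```python
import numpy as np

row = np.linspace(start=1, stop=10, num=10)   # 1D array with 10 elements
col = row[:, np.newaxis]                      # 2D array with 10 rows and 1 column

print(row.shape)   # (10,)
print(col.shape)   # (10, 1)

# Two more ways to create arrays with a NumPy function
zeros = np.zeros((2, 3))      # 2x3 array filled with 0.0
steps = np.arange(0, 10, 2)   # integers from 0 up to (not including) 10 in steps of 2

print(zeros.shape)   # (2, 3)
print(steps)         # [0 2 4 6 8]
```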



### Accessing elements in NumPy Arrays:
- Use square brackets to access an element
- As everywhere in Python, indexing starts at zero, i.e. the element at index zero is the first element
- Example: `arr1[1,2]` returns the element at row=1 and column=2 (second row, third column)
- With `:` we can select several elements at once, i.e. slice out a portion of an array
- Example: `arr1[1:3,0:2]` returns the elements in rows 1 and 2 and columns 0 and 1. The upper bound of a slice is not included
- To start at the first index or run until the last index, the bound can be omitted: `arr1[1:,:2]` is the same as above
- Accessing a non-existing integer index, e.g. `arr1[3,0]`, raises an `IndexError`; slices that run past the end are truncated instead

In [7]:
print(arr1[1:3,0:2], '\n')
print(arr1[1:,:2], '\n')
print(arr2[2:,:], '\n')

[[4 5]
 [7 8]] 

[[4 5]
 [7 8]] 

[[ 8.70230954 -5.88653805 -7.94001931  6.54621209]
 [ 4.48592229 -4.57851461  4.46324927  6.07227625]
 [-2.43082735 -6.95534859  0.16585593  5.59416252]
 [-2.85836037 -4.64664758  0.81735982 -0.84490079]] 
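
A few more indexing cases, as a small sketch on a fixed 3x3 array (the name `arr` is only used for this illustration). It shows single elements, negative indices, a whole column, and the difference between an out-of-range slice and an out-of-range integer index.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr[1, 2])     # 6, second row and third column
print(arr[-1, -1])   # 9, negative indices count from the end
print(arr[:, 0])     # [1 4 7], the whole first column

# A slice that runs past the end is truncated, no error is raised
print(arr[1:10, :2].shape)   # (2, 2)

# An integer index outside the array raises an IndexError
try:
    arr[3, 0]
except IndexError as err:
    print("IndexError:", err)
```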



### Numerics with NumPy Arrays:
- Arithmetic operators such as `+` and `*` combine NumPy arrays element-wise; matrix multiplication has its own operator `@`
  - `arr_a + arr_b` will add element-wise
  - `arr_a * arr_b` will multiply element-wise
  - `arr_a @ arr_b` will perform a matrix multiplication

In [17]:
arr_a = np.random.randint(low=-5,high=5,size=(3,3))
arr_b = np.random.randint(low=-5,high=5,size=(3,3))

print("Array a:\n", arr_a, "\n")
print("Array b:\n", arr_b, "\n")
print("Array a + Array b:\n", arr_a+arr_b, "\n")
print("Array a * Array b:\n", arr_a*arr_b, "\n")
print("Array a @ Array b:\n", arr_a@arr_b, "\n")

Array a:
 [[-1  1 -5]
 [ 0 -3 -5]
 [-3 -4 -4]] 

Array b:
 [[-1 -5 -5]
 [-1  1  1]
 [ 4 -4 -1]] 

Array a + Array b:
 [[ -2  -4 -10]
 [ -1  -2  -4]
 [  1  -8  -5]] 

Array a * Array b:
 [[  1  -5  25]
 [  0  -3  -5]
 [-12  16   4]] 

Array a @ Array b:
 [[-20  26  11]
 [-17  17   2]
 [ -9  27  15]] 
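
Because the arrays above are random, the difference between `*` and `@` is easier to check by hand on small fixed values. A minimal sketch, with `a` and `b` chosen only for this illustration:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

elementwise = a * b   # multiplies the entries at the same position
matmul = a @ b        # rows of a times columns of b

print(elementwise)
print(matmul)

# One entry of the matrix product computed by hand:
# first row of a times first column of b
top_left = a[0, 0] * b[0, 0] + a[0, 1] * b[1, 0]
print(top_left == matmul[0, 0])   # True
```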



- Some useful NumPy functions:
  - `np.sum(arr)` calculates the sum over all elements in `arr`
  - `np.mean(arr)` calculates the mean over all elements in `arr`
  - `np.abs(arr)` returns the absolute value of each element in `arr`
  - `arr.shape` returns the number of elements per dimension of `arr`, e.g. if `arr` is 2D, `arr.shape[0]` is the number of rows and `arr.shape[1]` the number of columns
  - `np.random.choice(arr)` randomly samples a value from a 1D array `arr`
  - `np.where(condition)` returns all indices where `condition` is True, e.g. `np.where(arr==1)` returns the indices of all elements of `arr` that equal 1
- Most functions accept an `axis` argument that specifies the dimension over which the function is applied
  - If `arr` is a 2D array, `np.sum(arr, axis=0)` sums over the rows, giving one sum per column
  - If `arr` is a 2D array, `np.mean(arr, axis=1)` averages over the columns, giving one mean per row

In [23]:
print("Array a:\n", arr_a, "\n")
print("Sum over rows (one value per column):\n", np.sum(arr_a,axis=0), "\n")
print("Mean over columns (one value per row):\n", np.mean(arr_a,axis=1), "\n")

Array a:
 [[-1  1 -5]
 [ 0 -3 -5]
 [-3 -4 -4]] 

Sum over rows (one value per column):
 [ -4  -6 -14] 

Mean over columns (one value per row):
 [-1.66666667 -2.66666667 -3.66666667] 
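
The cell above only demonstrates `axis`. The following sketch covers `shape`, `np.where` and `np.random.choice` on a small fixed array (`arr`, `rows`, `cols` and `sample` are names chosen for this illustration). Note that `np.random.choice` expects a 1D input, so the 2D array is flattened with `ravel()` first.

```python
import numpy as np

arr = np.array([[1, 0, 1], [2, 1, 0]])

print(arr.shape)   # (2, 3), i.e. 2 rows and 3 columns

# For a 2D array np.where returns one index array per dimension
rows, cols = np.where(arr == 1)
print(rows)              # [0 0 1]
print(cols)              # [0 2 1]
print(arr[rows, cols])   # [1 1 1]

# Draw one random element; the result differs from run to run
sample = np.random.choice(arr.ravel())
print(sample)
```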



### Functions in Python
- In general, it is advantageous to bundle multiple lines of code into a function and to separate your code by concern
- Use `def function_name(argument1, argument2):` to define a function called `function_name` which takes the arguments `argument1` and `argument2`
- The code executed when calling the function is the indented function body
- If the function should return a value, e.g. `x`, finish the function body with `return x`

In [27]:
def return_where_value_matches(arr1, arr2, value):
    """Return all elements of arr1 at the positions where arr2 equals value."""

    idx     = np.where(arr2==value)   # indices at which arr2 equals value
    arr3    = arr1[idx]               # elements of arr1 at those indices

    return arr3

shape   = (10,10)
arr_1   = np.random.randint(low=-5,high=5,size=shape)
arr_2   = np.random.randint(low=-5,high=5,size=shape)

print("Array 1:\n", arr_1, "\n")
print("Array 2:\n", arr_2, "\n")

# value=0 is an arbitrary example; any integer in [-5, 5) can occur in arr_2
arr3    = return_where_value_matches(arr1=arr_1, arr2=arr_2, value=0)

Array 1:
 [[-2  4 -1 -4 -2 -5 -1 -3  0 -2]
 [-5 -5  3 -2  3  3 -5 -5  4 -1]
 [ 1 -1  0 -3  4 -3  4 -5  1  3]
 [ 0 -5  3 -2 -4 -5 -1 -5  2  2]
 [ 0  4 -2  3  0 -3  3  2  0 -4]
 [ 2 -3 -4  0 -2 -2  1 -5  3  4]
 [ 1 -2  3 -1  2 -3  3  4  1 -5]
 [ 4 -5  4 -2  4 -3  4  0 -1  3]
 [-5 -4  2 -2  3  2 -3 -3  0 -5]
 [ 3  4  3  3  3  4  0  1  1  3]] 

Array 2:
 [[-2  0  0  1  0  1  1 -5 -5  2]
 [ 1  2  4  0 -2  0  1  2 -3 -5]
 [-4  1 -5 -2 -3 -3 -5  3  4  3]
 [ 4 -4 -1  4  4 -5  1  0 -5  2]
 [ 2  3  4 -4 -4 -1 -2  2  4  1]
 [ 1  0  4  4 -4 -2 -3  4 -2 -1]
 [-4 -2 -1  0  2 -5  1  2 -2  4]
 [ 3 -3 -4  2 -1 -5  4  0  1  2]
 [ 3 -2  1 -5  0 -4  4  3 -3 -5]
 [ 4 -3  1  2 -5 -2 -2 -2 -4 -1]] 
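
With 10x10 random arrays the result is hard to verify by eye. The sketch below repeats the function on two small fixed arrays; `values` and `labels` are hypothetical inputs chosen for this illustration. It also shows that indexing with a boolean mask gives the same result.

```python
import numpy as np

def return_where_value_matches(arr1, arr2, value):
    """Return all elements of arr1 at the positions where arr2 equals value."""
    idx = np.where(arr2 == value)
    return arr1[idx]

values = np.array([[10, 20, 30], [40, 50, 60]])
labels = np.array([[0, 1, 0], [1, 0, 0]])

picked = return_where_value_matches(arr1=values, arr2=labels, value=0)
print(picked)   # [10 30 50 60]

# Equivalent shortcut: index directly with the boolean mask
print(values[labels == 0])   # [10 30 50 60]
```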



In [5]:
# As shown above, a list can always be converted into an array
# For the list above, which is filled with 2D arrays, this yields a 3D array
my_arrays_in_an_array = np.array(my_arrays_in_a_list)
my_arrays_in_an_array

array([[[-0.36649819],
        [-4.40928346],
        [-4.70884276],
        [-3.91577457],
        [-2.44127063]],

       [[ 0.24132285],
        [ 0.60454247],
        [-2.48688499],
        [-3.61759201],
        [-4.13802158]],

       [[-4.45419864],
        [-3.73258923],
        [-0.61330219],
        [-1.40141049],
        [ 0.73043485]],

       [[-0.32801036],
        [-1.11066146],
        [-3.5702478 ],
        [-1.02153367],
        [-2.099118  ]],

       [[-2.29326359],
        [-3.66110123],
        [-0.66053178],
        [-3.80032446],
        [-0.38227265]]])

In [6]:
# Shape of an array (number of elements along each dimension):
my_arrays_in_an_array.shape

(5, 5, 1)
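
A self-contained sketch of the same idea: a list of five 2D arrays of shape (5, 1) becomes one 3D array of shape (5, 5, 1). The list `example_list` and its value range are assumptions made for this illustration and are not the list used in the cell above. All arrays in the list need the same shape for this to work.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical list of five 2D arrays, each with 5 rows and 1 column
example_list = [rng.uniform(low=-5, high=1, size=(5, 1)) for _ in range(5)]

stacked = np.array(example_list)

print(len(example_list))   # 5
print(stacked.shape)       # (5, 5, 1)
print(stacked[0].shape)    # (5, 1), the first array of the list

# np.stack builds the same 3D array
print(np.array_equal(stacked, np.stack(example_list)))   # True
```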

Here are a few more useful functions:

In [7]:
# 2D array (4 rows, 4 columns) of uniformly distributed values in [-1, 10)
arr5 = np.random.uniform(low=-1, high=10, size=(4,4))

# 2D array (4 rows, 1 column) of normally distributed values with mean 0 and standard deviation 2
arr6 = np.random.normal(loc=0, scale=2, size=(4,1))

# Matrix multiplication
arr7 = arr5 @ arr6
arr7

array([[-33.47082857],
       [-39.39258223],
       [-31.71170567],
       [-40.45936034]])

In [8]:
# Arithmetic operations are performed element-wise; arr6 with shape (4,1) is broadcast across the columns of arr5
arr7 = arr5 * arr6
arr7

array([[-19.3841036 , -14.09295055, -11.6890637 , -10.23926606],
       [ -5.85098293, -22.26530662, -24.71433019, -22.18874597],
       [ -0.72639109,  -5.04818322,  -3.15675094,  -1.90796575],
       [ -2.40080135,  -3.53694958,   0.30173353,  -3.23663142]])
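
Multiplying a (4, 4) array by a (4, 1) array element-wise works because NumPy broadcasts the single column across all columns of the larger array. A minimal sketch with small fixed values (`m` and `c` are chosen for this illustration) that contrasts this with the matrix product:

```python
import numpy as np

m = np.array([[1.0, 2.0], [3.0, 4.0]])   # shape (2, 2)
c = np.array([[10.0], [100.0]])          # shape (2, 1)

scaled = m * c    # c is stretched across the columns: row 0 times 10, row 1 times 100
product = m @ c   # matrix product of a (2, 2) and a (2, 1) array

print(scaled)
print(product)
print(scaled.shape, product.shape)   # (2, 2) (2, 1)
```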

In [9]:
# Maximum
print("Max of arr5:\n", np.max(arr5))

# Minimum
print("Min of arr5:\n", np.min(arr5))

# Mean
print("Mean of arr5:\n", np.mean(arr5))

# Standard Deviation
print("Std of arr5:\n", np.std(arr5))

# Absolute
print("Abs of arr5:\n", np.abs(arr5))

# Sum
print("Sum of arr5:\n", np.sum(arr5))

# Round
print("Round arr5:\n", np.round(arr5,2))

Max of arr5:
 9.896819123589744
Min of arr5:
 -0.6788937916997706
Mean of arr5:
 5.40192578413196
Std of arr5:
 3.0159138342327587
Abs of arr5:
 [[5.90689425 4.29452764 3.56199413 3.12019906]
 [2.34301797 8.91611104 9.89681912 8.88545244]
 [1.30963561 9.10154403 5.69141539 3.4399374 ]
 [5.40175016 7.95805867 0.67889379 7.28234942]]
Sum of arr5:
 86.43081254611135
Round arr5:
 [[ 5.91  4.29  3.56  3.12]
 [ 2.34  8.92  9.9   8.89]
 [ 1.31  9.1   5.69  3.44]
 [ 5.4   7.96 -0.68  7.28]]


In [10]:
# To apply a function along rows or columns, use the axis argument
print("Array:\n", arr2)

# Mean of each column
print("Column means:\n", np.mean(arr2,axis=0))

# Mean of each row
print("Row means:\n", np.mean(arr2,axis=1))

Array:
 [[-4.72145136 -8.72371301  3.50358448  6.56120278]
 [ 4.84222374  2.7209878   4.70053277  5.41546621]
 [ 2.13844765  8.75570377 -6.88148575  5.57440398]
 [ 1.29635062  0.13240471  0.17978886  1.1950813 ]
 [-5.89621164  8.43250484 -7.15642436 -0.78529914]
 [ 7.61674607  5.43882763 -3.76713469  6.04573467]]
Column means:
 [ 0.87935085  2.79278596 -1.57018978  4.0010983 ]
Row means:
 [-0.84509428  4.41980263  2.39676741  0.70090637 -1.35135757  3.83354342]
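
The `axis` argument names the dimension that is collapsed. A small sketch on a fixed 2x3 array (`arr` is chosen for this illustration) makes the resulting shapes easy to check; `keepdims=True` is an optional argument that keeps the collapsed dimension with length 1.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # 2 rows, 3 columns

col_sums = np.sum(arr, axis=0)    # collapses the rows: one value per column
row_means = np.mean(arr, axis=1)  # collapses the columns: one value per row

print(col_sums)    # [5 7 9]
print(row_means)   # [2. 5.]

# keepdims=True keeps the result 2D, here a column vector
kept = np.mean(arr, axis=1, keepdims=True)
print(kept.shape)   # (2, 1)
```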


Exercise:\
Write a function that generates an AR(p) time series.\
    $ AR(p) : Y_t = \sum_{t_0=1}^p \phi_{t_0} Y_{t-t_0} + \epsilon_{t} $ \
The function should take the length of the time series (length), the number of lags considered (p), and the parameters (phi) as arguments.\
The values for phi should be passed as a list, i.e. `[phi1, phi2, phi3, ...]`, where the length of the list is set by p.\
Generate and store the values for `phi=[0.2, 0.1, -0.6, 0.0, -0.4, 0.3]`, `p=6`, and `length=1000`.\
Save the array with `np.savetxt(fname='time_series.csv', X=ts_array)`. The first argument gives the path and file name, the second is the array to save.
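
One possible way to approach this exercise is sketched below; it is not the only solution. Several choices are assumptions that the exercise does not specify: the noise $\epsilon_t$ is drawn from a normal distribution with standard deviation `sigma`, values before the start of the series are treated as zero, and the extra arguments `sigma` and `seed` as well as the function name `generate_ar` are made up for this illustration.

```python
import numpy as np

def generate_ar(length, p, phi, sigma=1.0, seed=None):
    """Generate an AR(p) series Y_t = sum_i phi_i * Y_{t-i} + eps_t (sketch)."""
    phi = np.asarray(phi, dtype=float)
    if phi.shape != (p,):
        raise ValueError("phi must contain exactly p values")

    rng = np.random.default_rng(seed)
    # Assumption: Gaussian noise with mean 0 and standard deviation sigma
    eps = rng.normal(loc=0.0, scale=sigma, size=length)

    y = np.zeros(length)
    for t in range(length):
        k = min(t, p)   # number of lags that already exist at time t
        if k > 0:
            # y[t-k:t][::-1] is Y_{t-1}, Y_{t-2}, ..., Y_{t-k}
            y[t] = phi[:k] @ y[t - k:t][::-1]
        y[t] += eps[t]
    return y

phi = [0.2, 0.1, -0.6, 0.0, -0.4, 0.3]
ts_array = generate_ar(length=1000, p=6, phi=phi, seed=0)
print(ts_array.shape)   # (1000,)
```

The resulting `ts_array` can then be written to disk with the `np.savetxt` call given in the exercise.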

Exercise:\
Write a function that generates an MA(q) time series.\
    $ MA(q) :  Y_t = \mu + \sum_{t_0=1}^q \theta_{t_0} \epsilon_{t-t_0} + \epsilon_t $ \
The function should take the length of the time series (length), the number of lags considered (q), and the parameters (theta) as arguments.
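
A possible starting point for this exercise, again only as a sketch. The assumptions are labelled in the code: the noise is Gaussian with standard deviation `sigma`, `q` extra noise values are drawn so that the first observation already has `q` past shocks, and `mu`, `sigma`, `seed` and the function name `generate_ma` are additions made for this illustration.

```python
import numpy as np

def generate_ma(length, q, theta, mu=0.0, sigma=1.0, seed=None):
    """Generate an MA(q) series Y_t = mu + sum_i theta_i * eps_{t-i} + eps_t (sketch)."""
    theta = np.asarray(theta, dtype=float)
    if theta.shape != (q,):
        raise ValueError("theta must contain exactly q values")

    rng = np.random.default_rng(seed)
    # Assumption: Gaussian noise; q extra draws serve as shocks before the first observation
    eps = rng.normal(loc=0.0, scale=sigma, size=length + q)

    y = np.empty(length)
    for t in range(length):
        current = eps[t + q]         # eps_t
        past = eps[t:t + q][::-1]    # eps_{t-1}, eps_{t-2}, ..., eps_{t-q}
        y[t] = mu + theta @ past + current
    return y

# Hypothetical parameters, chosen only to show the call
ma_series = generate_ma(length=500, q=2, theta=[0.5, -0.3], mu=1.0, seed=1)
print(ma_series.shape)   # (500,)
```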