# Welcome to Machine Learning -- An Interdisciplinary Introduction

In our exercises, we will introduce and utilize several machine learning techniques.


## Introduction
### Simple Data Structures
This first Jupyter notebook explains the basic data structure of machine learning: the multi-dimensional array.
Many of you are surely familiar with scalar values.
These are just single numbers that have a specific value.



In [None]:
scalar = 0.5

In Python, different data structures are used regularly.
One of these structures is a list, which can contain various different Python objects.
Note that in Python, everything is an object; even imported modules are objects that can be put into a list:

In [None]:
import math
values = [0, 1., 1e-4, True, "Wednesday", None, math]

In machine learning, we typically rely on numerical data only; how other types of data are handled will be discussed later in this lecture series.
Any list of numerical values can be represented as a mathematical vector:

In [None]:
vector = [0.,1.,2.,3.,4.]

Now, since a list can contain any type of object, lists can also contain lists, in which case we talk about nested lists.
We define a nested list of lists with different lengths:

In [None]:
nested = [
  [0.,1.,2.,3.,4.],
  [6.,7.,8.,9.],
  [11.,12.,13.]
]

When all of the nested lists have the same number of elements (the same dimensionality), they build the mathematical concept of a matrix:

In [None]:
matrix = [
  [0.,1.,2.,3.,4.],
  [6.,7.,8.,9.,10.],
  [11.,12.,13.,14.,15.]
]

In the above example, the matrix has three rows and five columns.
Similarly, we can extend this idea to build a tensor, which is basically adding another layer of nesting:

In [None]:
tensor = [
  [[0.,1.], [2.,3.], [4.,5.], [6.,7.]],
  [[8.,9.], [10.,11.], [12.,13.], [14.,15.]],
  [[16.,17.], [18.,19.], [20.,21.], [22.,23.]]
]

The above tensor has the dimensionality $3\times4\times2$.
In fact, scalars, vectors and matrices are also special cases of tensors with zero, one or two levels.
Furthermore, tensors are not limited to three levels; larger structures are also possible. For example, deep learning systems usually use four- or five-level tensors.

## Task 1

Obtain the dimensionality of the above data structures.
The length of a list can be obtained using the `len` function in Python.
To index an element in Python, you can use the index operator, e.g., `values[1]`.
Apply both to the vector, the matrix and the tensor to obtain their dimensionalities.

In [None]:
# print the length of the vector
print(...)

# print the dimensionality of the matrix
print(..., ...)

# print the dimensionality of the tensor
print(..., ..., ...)


### PyTorch Tensors
While it is possible to work with these types of nested lists, it is often easier and faster to work with a dedicated data structure for numerical data.
Such a data structure is provided in the `torch` Python library.
The basic data structure represents any multidimensional mathematical tensor, therefore the data structure is called `torch.Tensor`.
`torch.Tensor` is a Python class, and its instances can be constructed in various ways, most commonly with the `torch.tensor` function.

The easiest way of defining `torch.tensor`s is to construct them from data:

In [None]:
import torch
my_data = torch.tensor([
  [-1., 2., 3],
  [3., -4., 1e-4]
])

We can print the contents of the tensor using Python's built-in `print` function.
We can also obtain the dimensionality of the data by asking for its `shape`:

In [None]:
print(my_data)
print(my_data.shape)
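
Besides `shape`, a tensor exposes a few other attributes that are handy for inspection. The following is a minimal sketch; the variable name `example` is only illustrative, and the printed data type assumes that the default floating-point type of PyTorch has not been changed.

```python
import torch

# a small tensor with two rows and three columns, similar to my_data above
example = torch.tensor([
  [-1., 2., 3.],
  [3., -4., 1e-4]
])

print(example.shape)    # size of each level
print(example.ndim)     # number of levels
print(example.numel())  # total number of elements
print(example.dtype)    # data type of the elements
```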

## Task 2

Create `torch.tensor`s for all our data structures from above.
Print the shapes of all structures.
What happens to our nested list above?

In [None]:
# turn all of our data structures into torch tensors
torch_scalar = ...
torch_vector = ...
torch_matrix = ...
torch_tensor = ...

# print their shapes
...

# what about nested arrays of different lengths?
torch_nested = ...

Another way of defining `torch.tensor`s is by defining their `shape`.
The function to create an empty tensor is `torch.empty`.
A specific data type can be specified using the `dtype` argument, which can take various types, such as `torch.float32` (the default), `torch.int`, `torch.bool`, `torch.complex64` and the like.
The data is **uninitialized** and contains whatever resides in that region of memory:

In [None]:
empty_tensor = torch.empty((4,7), dtype=torch.float32)
print(empty_tensor)

int_tensor = torch.empty((2,3), dtype=torch.uint8)
print(int_tensor)

bool_tensor = torch.empty((2,3), dtype=torch.bool)
print(bool_tensor)

In order to initialize the data, you can use various functions.
For example, you can initialize with `0` or `1`:

In [None]:
zero_matrix = torch.zeros((3,4))
print(zero_matrix)

one_matrix = torch.ones((2,4,3), dtype=torch.int)
print(one_matrix)

Another option is to create random data using functionality from `torch`.
Random values between 0 and 1 are obtained using `torch.rand`.

In [None]:
random_tensor = torch.rand((2,4))
print(random_tensor)

Normally distributed values can be obtained with `torch.normal`, where one can specify the mean and the standard deviation of the values.
For example, if you want to have 6 two-dimensional normally distributed vectors with `mean` $(-3,1)$ and standard deviation `std` of $(2,1)$, you can write:

In [None]:
normal_tensor = torch.normal(mean = torch.tensor([(-3,1.)]*6), std = torch.tensor([(2.,1.)]*6))
print(normal_tensor)

More random distributions can be found in the `torch.distributions` package; for more information, check the [documentation](https://pytorch.org/docs/stable/distributions.html):

In [None]:
normal_tensor = torch.distributions.normal.Normal(loc=torch.tensor((-3.,1.)), scale=torch.tensor((2.,1.))).sample([6])
print(normal_tensor)

bernoulli_tensor = torch.distributions.bernoulli.Bernoulli(probs=0.5).sample([4,3])
print(bernoulli_tensor)

Tensors can also be combined using `torch` functionality.
For example, `torch.concat` will concatenate two tensors (where all but the first level must be identically shaped), while `torch.stack` will produce a new dimension (all dimensions must be identically shaped):

In [None]:
concatenated = torch.concat((torch.zeros((2,4)), torch.ones((3,4))))
print(concatenated)

stacked = torch.stack((torch.zeros((2,4)), torch.ones((2,4))))
print(stacked)
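
To make the difference between the two functions more tangible, the following sketch prints only the resulting shapes. The tensor names `a`, `b` and `c` are illustrative and not used elsewhere in this notebook.

```python
import torch

a = torch.zeros((2,4))
b = torch.ones((3,4))
c = torch.ones((2,4))

# concatenation along the first level: 2 + 3 rows, 4 columns
concat_rows = torch.concat((a, b))
print(concat_rows.shape)

# concatenation along the second level requires the same number of rows
concat_columns = torch.concat((a, c), dim=1)
print(concat_columns.shape)

# stacking adds a new first level in front of the existing ones
stacked_pair = torch.stack((a, c))
print(stacked_pair.shape)
```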

### Tensor Operations
All mathematical operations can be applied to tensors, and we can use the default python operators `+, -, *, /, //, %, **`.
You can also modify tensors in place using `+=, -=, *=, /=, //=, %=, **=`.
By default, all operations are just done element-wise.
This is possible when the dimensionalities of the operands are identical:

In [None]:
negatives = torch.zeros((3,2)) - torch.ones((3,2))
print(negatives)

Mathematical operations can also be **broadcast**, i.e., dimensionalities are automatically adapted.
The simplest broadcast is with a scalar, but more complicated broadcasts are possible:

In [None]:
random_range = torch.rand((4,2)) * 10. - 5.
print(random_range)

mixes = torch.ones((3,2)) * torch.tensor([-1,3])
print(mixes)
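
As a further illustration of broadcasting, the following sketch combines a column and a row; levels of size 1 are stretched to match the other operand. The names `column`, `row` and `table` are illustrative.

```python
import torch

column = torch.tensor([[0.], [10.], [20.]])  # shape (3,1)
row = torch.tensor([[1., 2., 3., 4.]])       # shape (1,4)

# both operands are broadcast to the common shape (3,4)
table = column + row
print(table.shape)
print(table)
```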

Comparison of tensors can be done using the default comparison operators `>, >=, <, <=, ==`.
These operations are applied element-wise and result in boolean tensors.
If you want to reduce these to a single value (to be used in an `if` condition), you can use the `all` or `any` functions.
Broadcasting applies to comparison operators as well:

In [None]:
random_positives = random_range > 0
print(random_positives)
print(random_positives.all())
print(random_positives.any())

Other mathematical functions are also applied element-wise:

In [None]:
x = torch.tensor([1.,2.,3.,4.,5.])
print(torch.exp(x))
print(torch.sin(x))
print(torch.log(x))

### Indexing Tensors
Tensors can be indexed in several ways, typically using the index operator `[]`.
To obtain a certain value, you can specify the exact index.
Indexing starts at 0, and negative indexes work the same way as with Python lists.
Unlike Python lists, tensors can also be indexed with tuples:

In [None]:
print(normal_tensor[3][1])
print(normal_tensor[3,1])

As you can see, when using the index operator, a scalar value is returned, which is defined as a `torch.tensor` with a single value.
If you want to obtain the raw data, you can use the `.item()` function on the tensor; note that this only works for scalars, not for vectors, matrices or higher-order tensors.

In [None]:
print(normal_tensor[3,1].item())
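
For tensors with more than one element, `.tolist()` can be used instead to obtain (nested) Python lists. A minimal sketch, where the tensor `small` is an illustrative example:

```python
import torch

small = torch.tensor([[1., 2.], [3., 4.]])

# a single element can be converted to a Python number
value = small[1, 0].item()
print(value, type(value))

# tensors with several elements are converted to nested Python lists
as_list = small.tolist()
print(as_list)
```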

You can also use boolean tensors for indexing.
Note that integer tensors are also allowed, but they select elements by position rather than acting as a mask, so results might differ from what you expect.

In [None]:
random_range[random_positives] *= -1
print(random_range)
print(torch.all(random_range <= 0))
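
The difference between boolean and integer indexing can be seen in the following sketch. All names (`data`, `mask`, `indexes`, `zero_one`) are illustrative.

```python
import torch

data = torch.tensor([10., 20., 30., 40.])

# boolean mask: selects the elements where the mask is True
mask = torch.tensor([True, False, True, False])
masked = data[mask]
print(masked)

# integer indexes: select the elements at the given positions,
# in the given order; repetitions are allowed
indexes = torch.tensor([3, 0, 0])
picked = data[indexes]
print(picked)

# a 0/1 integer tensor is therefore NOT interpreted as a mask
zero_one = torch.tensor([1, 0, 1, 0])
not_a_mask = data[zero_one]
print(not_a_mask)
```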

When indexing tensors with fewer indexes than levels, sub-tensors are returned.
Indexing always starts at the first level; missing levels are assumed to cover all elements:

In [None]:
print(torch_matrix[0])
print(torch_matrix[1])
print(torch_tensor[0])
print(torch_tensor[0,2])

You can also use `start:end` to define ranges, where `:` just represents all elements at this level.
Similarly, you can use `start:end:step` to define slices.
This can be useful when you actually want to index the second level only.
Note that negative `step`s are not allowed in PyTorch (in `numpy`, they are allowed).

In [None]:
print(torch_vector[1:3])
print(torch_matrix[:,1])
print(torch_tensor[-1,0:2:2,::2])
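
Since negative steps are not available, one possible way to reverse a tensor is `torch.flip`. This is a small sketch with an illustrative tensor `numbers`:

```python
import torch

numbers = torch.arange(5)

# numbers[::-1] is not supported in PyTorch; torch.flip reverses the given levels
reversed_numbers = torch.flip(numbers, dims=[0])
print(reversed_numbers)

# positive steps work as usual
every_second = numbers[::2]
print(every_second)
```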

## Task 3
Generate a random tensor of shape `(5,4)`, where the values are in range $[-2, 4]$.
Multiply all negative values by 4, divide all positive values by 4.
Print the tensor.
Compute and print the sum of all elements in the tensor.

In [None]:
# generate random tensor in range [-2,4] with shape (5,4)
random_array = ...
# multiply negative values with 4
...
# divide positive values by 4
...

# print the tensor
...

# compute and print the sum of the elements in the tensor
...

### Sorting Tensors
Often, you will need to sort your tensors in ascending order.
Similarly to the `sorted` function in Python, you can sort a tensor using `torch.sort`.
Note that this function returns two values: the sorted tensor, and the indexes that, applied to the original unsorted tensor, will provide the sorted tensor:

In [None]:
unsorted = torch.tensor([-4, 3, 7, -5, 0, 6, 4])
sorted_array, sorted_indexes = torch.sort(unsorted)
print(sorted_array)
print(sorted_indexes)
print(unsorted[sorted_indexes])
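
`torch.sort` also accepts the optional arguments `descending` and `dim`. The following sketch shows both; the names `values_1d` and `grid` are illustrative.

```python
import torch

values_1d = torch.tensor([-4, 3, 7, -5, 0, 6, 4])

# sort from the largest to the smallest value
descending, _ = torch.sort(values_1d, descending=True)
print(descending)

grid = torch.tensor([[3, 1, 2], [9, 7, 8]])

# sort each column (dim=0) or each row (dim=1) independently
sorted_columns = torch.sort(grid, dim=0).values
sorted_rows = torch.sort(grid, dim=1).values
print(sorted_columns)
print(sorted_rows)
```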

### Matrix Operations
As mentioned above, by default all tensor operations are performed element-wise.
Mathematical matrix operations can be applied using `torch` functionality.
For example, the matrix product between two tensors (matrices, vectors) can be computed via `torch.matmul`.
Vector operations include, for example, `torch.inner` and `torch.outer`.
Matrices can be transposed using `torch.transpose` or, more concisely, the `.T` member.

In [None]:
vector = torch.rand((4))
matrix = torch.rand((3,4))
print(torch.matmul(matrix, vector))
print(torch.matmul(vector, matrix.T))
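
The vector operations and the transpose mentioned above can be tried on small, fixed inputs. In this sketch, the names `u`, `v` and `m` are illustrative.

```python
import torch

u = torch.tensor([1., 2., 3.])
v = torch.tensor([4., 5., 6.])

# inner product: 1*4 + 2*5 + 3*6
inner = torch.inner(u, v)
print(inner)

# outer product: matrix with entries u[i] * v[j]
outer = torch.outer(u, v)
print(outer)

# both ways of transposing a matrix give the same shape
m = torch.tensor([[1., 2., 3.], [4., 5., 6.]])
print(m.T.shape)
print(torch.transpose(m, 0, 1).shape)
```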

We can also reduce matrices by computing min, max, mean and standard deviation across a specific dimension `dim`.
Please note that `torch.max` additionally returns the indexes across the given `dim` when a `dim` is specified.

In [None]:
normal_array = torch.distributions.normal.Normal(torch.tensor((-3,4.)), torch.tensor((6.,7.))).sample([1000])
print(torch.min(normal_array))
print(torch.max(normal_array, dim=0))
print(torch.mean(normal_array, dim=0))
print(torch.std(normal_array, dim=0))
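
Because the samples above are random, the effect of `dim` is easier to follow on a small, fixed matrix. The name `scores` is illustrative.

```python
import torch

scores = torch.tensor([
  [1., 8., 3.],
  [7., 2., 9.]
])

# without dim: a single value over all elements
overall_max = torch.max(scores)
print(overall_max)

# with dim=0: one result per column, plus the row index of each maximum
column_max = torch.max(scores, dim=0)
print(column_max.values)
print(column_max.indices)

# with dim=1: one result per row
row_mean = torch.mean(scores, dim=1)
print(row_mean)
```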

We can also quickly compute norms of vectors and matrices, or invert matrices:

In [None]:
print(torch.norm(vector).item())
print(torch.inverse(torch.tensor([[2.,0.], [0.,-2.]])))
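
As a quick sanity check, the product of a matrix and its inverse should give the identity matrix, and the default norm of a vector is its Euclidean length. A minimal sketch with illustrative names `A`, `A_inv` and `w`:

```python
import torch

A = torch.tensor([[2., 0.], [0., -2.]])
A_inv = torch.inverse(A)

# the product of a matrix and its inverse is the identity matrix
product = torch.matmul(A, A_inv)
print(product)
print(torch.allclose(product, torch.eye(2)))

# the default norm of a vector is the Euclidean (L2) norm
w = torch.tensor([3., 4.])
length = torch.norm(w).item()
print(length)
```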


Since PyTorch is optimized for parallel processing, distance computations are not as straightforward.
We first need to instantiate a distance object, and then we can compute distances:

In [None]:
euclidean = torch.nn.PairwiseDistance(p=2)
print(euclidean(vector, vector-2).item())

cosine = torch.nn.CosineSimilarity(dim=0)
print(cosine(vector, matrix[0]).item())
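
For comparison, both measures can also be written out with basic tensor operations. This is only a sketch on fixed, illustrative vectors (`a`, `b`, `c`, `d`). The values can differ very slightly from those of the distance objects, since, to our knowledge, `torch.nn.PairwiseDistance` adds a small `eps` for numerical stability.

```python
import torch

a = torch.tensor([3., 4.])
b = torch.tensor([0., 0.])

# Euclidean distance as the norm of the difference vector
distance = torch.norm(a - b).item()
print(distance)

# cosine similarity as the normalized inner product
c = torch.tensor([1., 0.])
d = torch.tensor([0., 1.])
similarity = (torch.inner(c, d) / (torch.norm(c) * torch.norm(d))).item()
print(similarity)
```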

### Plotting with matplotlib

Plotting is usually done with `matplotlib`, which typically handles `torch.tensor`s as expected.
Here, we describe only standard functionality; much more is possible.
All plots in the slides are generated with `matplotlib`.

### Line Plotting
There are several ways of plotting a line in `matplotlib`.
A straight line can be defined by a start and an end point.
The `pyplot.plot` function will take all x-positions and y-positions in a list/array/tensor.
How the line is plotted is defined using a format string, which combines a color from `rgbkmcy` and a style from `.+*xosd-:`.
A blue line is plotted with format `b-`:

In [None]:
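from matplotlib import pyplot
# each row is one point (x, y): the start and the end point of the line
two_points = torch.tensor([
  (-1,1),
  (2,-2)
])
# the first column holds the x-positions, the second column the y-positions
pyplot.plot(two_points[:,0], two_points[:,1], "b-")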

Generally, `matplotlib` plots points connected with straight lines.
To obtain a smoother curve, use more closely spaced x-positions:

In [None]:
x = torch.arange(0,6.001,2)
pyplot.plot(x,torch.exp(x), "r-")

x = torch.arange(0,6.001,0.01)
pyplot.plot(x,torch.exp(x), "g-")

You can also add a legend, axis labels, and more:

In [None]:
pyplot.plot(x, torch.cos(x), "g:", label="cos")
pyplot.plot(x, torch.sin(x), "r--", label="sin")
pyplot.legend()
pyplot.xlabel("x")
pyplot.ylabel("y")

### Plotting Points
Points can be plotted in various ways. 
For example, you can select a different marker instead of a line:

In [None]:
x = torch.arange(0, 6.01, 0.5)
pyplot.plot(x, x**2, "gx", label="$x^2$")
pyplot.plot(x, 2**x, "mo", label="$2^x$")
pyplot.legend()
pyplot.xlabel("x")
pyplot.ylabel("y")

Note that the points do not need to be related in any way:

In [None]:
points = torch.rand(size=(100,2))
pyplot.plot(points[:,0], points[:,1], "rs")

## Task 4
Create equidistant and sorted input values $-4\leq x \leq4$. (Hint: as shown before, you can do this with [torch.arange()](https://pytorch.org/docs/stable/generated/torch.arange.html))

Compute the function $y = f(x)=2 x^2 + 10\cos(x)$.

Plot the function $(x,y)$ as a green line plot with the label `function`. (Hint: look at how the previous plots were done with matplotlib)

Add noise to your output $t = y + \mathcal N(0,3)$.

Plot the points $(x,t)$ as separate points with star shape and red color and the label `points`.

In [None]:
# create equidistant input values x in range [-4, 4]
x = ...
# compute function y = 2x^2 + 10 cos(x)
y = ...

# plot function by providing points for x and y, with the label "function"
...

# select our targets t to be the function y with some additional Gaussian-distributed noise with mean 0 and std 3.
t = y + ...

# plot the points with a star marker with the label ("points")
...

# add the label using .legend() function
...