# General Remarks

- Please complete all tasks (text and code) directly in this notebook
- Save the notebook with your first name and surname in the filename, e.g. **Klausur_ThomasManke.ipynb**
- Upload the notebook to the CQ portal (and share with me)
- This test will cover three parts: Markov Chains, Hidden Markov Models, Artificial Neural Networks
- Each part needs its own packages, some of which overlap (e.g. numpy). Even if it is redundant, import the relevant packages explicitly at the beginning of each part.
- Complete the code cells in their respective sections and add (concise) text, where more verbal explanations are required. Comments in the code cells are also welcome.
- Feel free to add multiple code cells if you prefer, but make sure that they stay in their respective sections
- All tasks should run on any modern laptop (with 2 GB of free RAM), but make sure to close other resource-hungry programs.

- If you encounter any technical problems, please inform me immediately!
- Deadline for submission: **30.09.2022, 15:30**


---



# Markov Chains

##  The story: A ball game

Alice, Bob and young Clemens are playing a new ball game - here are the rules:
- If Alice has the ball, she will throw a (fair 6-sided) die and keep the ball if she throws a 6, otherwise she'll pass the ball to Bob
- If Bob has the ball, he'll pass it to Alice or Clemens, based on the throw of a fair coin
- If Clemens has the ball he'll return it to the child from whom he got it 

Let's assume that this game will continue for a very long time.

At the beginning of the game, their father throws the ball to Alice or Bob.
However, he is three times more likely to throw it to Alice, and he never throws it to Clemens.


## The Tasks

Translate the story into a Markov Model. 
Optionally: add a scanned drawing of the Markov graph as a JPEG file to this notebook.

- What are the states and how many states are there?
- What is the initial state distribution? Write it down as a numpy.array below.
- Write down the transition matrix as a numpy.array.
- For each child, give their long-term probabilities that they hold the ball. 
- Bonus: If Alice has a biased die - will there be a unique stationary distribution?
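The long-term probabilities correspond to a stationary distribution $\pi$ with $\pi P = \pi$. As a minimal sketch of one way to compute it, the block below takes the left eigenvector of a row-stochastic matrix for eigenvalue 1. The two-state matrix `P_toy` and the helper name `stationary_distribution` are hypothetical and serve only as an illustration; they are not the ball game.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of a row-stochastic matrix P (rows sum to 1)."""
    # left eigenvectors of P are right eigenvectors of P.T
    eigvals, eigvecs = np.linalg.eig(P.T)
    # select the eigenvector whose eigenvalue is closest to 1
    k = np.argmin(np.abs(eigvals - 1.0))
    v = np.real(eigvecs[:, k])
    # normalise to a probability vector (this also fixes the sign)
    return v / v.sum()

# hypothetical two-state chain, for illustration only
P_toy = np.array([[0.9, 0.1],
                  [0.5, 0.5]])
pi_toy = stationary_distribution(P_toy)
print(pi_toy)           # stationary distribution
print(pi_toy @ P_toy)   # should reproduce pi_toy
```

For this toy chain the balance condition $0.1\,\pi_0 = 0.5\,\pi_1$ gives $\pi = (5/6, 1/6)$, which can be used to check the numerical result.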


## Your solutions

In [None]:
# import the necessary modules

# write down the necessary code and describe it with comments


# Hidden Markov Models

## A story

The DNA of a (hypothetical) organism exists in 3 different configurations (0,1,2) that cannot be observed directly. They are, however, characterized by a specific distribution of observable nucleotides (A,C,G,T) that are emitted from each state. The state transition rates and emission rates are shown in the figure below.
<div>
   <img src="https://github.com/thomasmanke/ABS/raw/main/figures/HMM_DNA.jpg" width="1000">
</div>


This problem can be modelled as a hidden Markov Model.










## Tasks

1. Write down the HMM parameters as numpy arrays. The initial state probability $\pi$ is not given, but you may assume that it is the stationary distribution of state transitions - calculate it and report it.

2. Using MultinomialHMM() from the hmmlearn package, set up a sequence generating model with the parameters $(\pi, P, E)$.
(Use CategoricalHMM() for hmmlearn version 0.2.8 or higher)

3.  Sample a sequence of 5000 hidden states $Z$ and the corresponding observations $X$ from the model. Use a random seed = 42 for reproducibility.
Report the first 20 pairs of hidden states and observations.


4. Calculate the logarithm of the probability $\log Pr(X)$ given the model from which you generated $X$. Why is it so low (1-2 sentences)?

5. Name two algorithms to decode the "best" possible path of hidden states $Z$ from observations $X$ and a given model. Briefly describe their different goals (2 sentences).
Run the respective function from hmmlearn to calculate 
$Z$ for both methods, given the $X$ and the current model.
Save the result as $Z_1$ and $Z_2$ and report the number of differences between $Z_1$ and $Z_2$.

6. Use the hmmlearn implementation of the Baum-Welch algorithm to determine the best parameters for the HMM model, if only $X$ is given. 
  - First define a new model that does not yet know any parameters (e.g. model_fit).
  - You may assume that the number of hidden states is known to be 3.
  - Run 10 different fits with at most 200 iterations each, but keep only the best-scoring model.
  - Compare the fitted parameters with your knowledge of the parameters from the generating model for $X$.
  - Comment on possible differences and name two ways in which you might improve the parameter fit.
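To illustrate what $\log Pr(X)$ in task 4 measures, here is a minimal numpy sketch of the scaled forward algorithm. The two-state, two-symbol parameters (`pi_hmm`, `P_hmm`, `E_hmm`) and the sequence `x_toy` are hypothetical and are not the values from the figure. The `score()` method in hmmlearn reports the same kind of quantity for a fitted or specified model.

```python
import numpy as np

def forward_loglik(pi, P, E, x):
    """log Pr(x) for a discrete HMM via the scaled forward algorithm."""
    # initialise with the first observation
    alpha = pi * E[:, x[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c
    # propagate through the remaining observations
    for obs in x[1:]:
        alpha = (alpha @ P) * E[:, obs]
        c = alpha.sum()          # Pr(x_t | x_1, ..., x_{t-1})
        loglik += np.log(c)      # one log term per observation
        alpha = alpha / c        # rescale to avoid numerical underflow
    return loglik

# hypothetical toy parameters, for illustration only
pi_hmm = np.array([0.5, 0.5])
P_hmm = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
E_hmm = np.array([[0.5, 0.5],
                  [0.1, 0.9]])
x_toy = np.array([0, 1, 1, 0, 1, 1, 1, 0, 0, 1] * 5)

ll_short = forward_loglik(pi_hmm, P_hmm, E_hmm, x_toy[:10])
ll_long = forward_loglik(pi_hmm, P_hmm, E_hmm, x_toy)
print(ll_short, ll_long)
```

Each observation adds the logarithm of a conditional probability, so the total shrinks as the sequence gets longer.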

## Install and load the Software

In [None]:
# install hmmlearn


In [None]:
# import required modules you need


## Your solution

In [None]:
%%script complete this cell

# numpy arrays of transition and emission matrices
P = 
E =  

# stationary distribution
pi = 

# define model
model_gen

# sample 5000 observations and hidden states from model
np.random.seed(42)
X, Z = 

# show first 20 of X and Z

# Pr(X)
print('log Pr(X) = ', ....)

# two ways to predict best path Z1 or Z2
...

# number of differences between two paths
print('differences (Z1-Z2): ', ...)

## Fitting

In [None]:
# fit model from 10 different starts and keep only the best model
np.random.seed(42)

...
...

# compare fitted parameters with known parameters
print('fitted P: \n', np.round(best_model.transmat_,2))
print('known P: \n', P)
print('\n')
print('fitted E: \n'  , np.round(best_model.emissionprob_,2))
print('known E: \n', E)

# comment on possible differences and improvements

# Artificial Neural Networks




## The Data

The Fashion-MNIST dataset contains a large number of (small and coarse-grained) images of fashion items. Both the training and the test data sets have been annotated with labels.

Link: https://www.tensorflow.org/datasets/catalog/fashion_mnist

The goal is to construct a neural network that can predict the fashion label from a given image.

The sections below will describe the individual tasks.

## The Tasks

## Load Packages

In [None]:
# import required modules 
....

# plotting function for confusion matrix
def plot_cm(mat):
  # class indices, one per row of the matrix passed in
  classes = np.arange(mat.shape[0])
  plt.imshow(mat, cmap=plt.cm.Blues)
  for (j,i),label in np.ndenumerate(mat):
    plt.text(i,j,np.round(label,2),ha='center',va='center')

  plt.colorbar()
  plt.title('Confusion Matrix')
  plt.xlabel('True label')
  plt.ylabel('Pred label')
  plt.xticks(classes)
  plt.yticks(classes)
  plt.show()

## Load Data

In [None]:
mnist = tf.keras.datasets.fashion_mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# normalization
X_train, X_test = X_train / 255.0, X_test / 255.0

## Data Exploration and Preprocessing

- How many images (=samples) are included in the training data? 
- What is the shape of these images?
- How many distinct labels does it have? 
- visualize a sample image of your choice

In [None]:
# write the code to answer the above questions


## Define Model and Learning Strategy

Construct an artificial neural network with

- an input layer that takes the proper shape of images
- a dense layer with 128 nodes and a ReLU activation function for non-linear mapping
- an output layer corresponding to the number of classes in the problem and a softmax activation function

Use the Adam optimizer and define a suitable loss function.
Make sure that during the learning process you track a) the loss for multiclass classification and b) 'sparse_categorical_accuracy' as an additional metric.

Summarize the model. How many parameters does it have?
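As a cross-check for the parameter count reported by the model summary, the following sketch counts weights and biases in a fully connected network by hand. The helper `dense_param_count` and the 4-3-2 toy architecture are hypothetical and are not the network requested above.

```python
def dense_param_count(layer_sizes):
    """Number of trainable parameters in a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    Each dense layer has (inputs * outputs) weights plus one bias per output.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# hypothetical toy network: 4 inputs -> 3 hidden units -> 2 outputs
toy_count = dense_param_count([4, 3, 2])
print(toy_count)  # (4*3 + 3) + (3*2 + 2) = 23
```

A layer that only reshapes the input (such as flattening an image into a vector) adds no parameters, so only the dense layers enter this count.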

In [None]:
...


## Fit the Model

Fit the model to the training data for 15 epochs - 
use 10% of the training data for validation.

Once the fit is finished you may save the model.

In [None]:
... fit ...

... optional: save model ...

## Evaluate the Model

Plot the history of loss and accuracy for the training and validation sets and compare them with the same metrics obtained (after fitting) for the **test data**.

Are there any indications for overfitting - explain this briefly (1-2 sentences).

In [None]:
# evaluate the model

# plot history of losses and accuracy


## Inspect predictions

Inspect the test image with index 43 and compare the predicted label with the true label.

Compare all predicted labels from the test set with all true labels. You may want to use the plot_cm() function defined above.
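The plot_cm() helper labels the x-axis "True label" and the y-axis "Pred label", so it expects a matrix whose rows index the predicted class and whose columns index the true class. A minimal numpy sketch of building such a matrix follows. The helper name `confusion_pred_by_true` and the toy labels are hypothetical. Library routines often use the transposed convention (rows = true labels), in which case the matrix should be transposed before plotting.

```python
import numpy as np

def confusion_pred_by_true(y_true, y_pred, n_classes):
    """Count matrix with rows = predicted class, columns = true class."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    mat = np.zeros((n_classes, n_classes), dtype=int)
    # add 1 at (pred, true) for every sample; handles repeated index pairs
    np.add.at(mat, (y_pred, y_true), 1)
    return mat

# hypothetical toy labels, for illustration only
y_true_toy = [0, 0, 1, 2, 2]
y_pred_toy = [0, 1, 1, 2, 0]
cm_toy = confusion_pred_by_true(y_true_toy, y_pred_toy, n_classes=3)
print(cm_toy)
# the diagonal counts correct predictions
print('accuracy:', np.trace(cm_toy) / cm_toy.sum())
```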

In [None]:
id=43


## Suggestions for improvements

Make suggestions for possible improvements to the model and the fitting process:

- 
- 
-

