In [8]:
import numpy as np

# Applied Neural Networks - Exercises

**NOTICE:**
1. You are allowed to work in groups of up to three people, but you **have to document** your group's\
 members in the top cell of your notebook.
2. **Comment your code** and explain what you do (refer to the slides). It will help you understand the topics\
 and help me understand your thought process. The quality of the comments will be graded.
3. **Discuss** and analyze your results, and **write down what you learned**. These are not programming\
 exercises; they are about learning and getting a feel for these methods. Such questions might be asked in the\
 final exam.
4. Feel free to **experiment** with these methods. Change parameters, think about improvements, and write down\
 what you learned. This is not only about collecting points for the final grade; it is about understanding\
 the methods.

### Exercise 1 - Data Normalization and Standardization


**Summary:** In this exercise you will implement min-max normalization and standardization and compare your\
implementations to sklearn's. It is important to remember that we always normalize or standardize over all samples\
 along a single feature dimension, i.e. each feature (column) is scaled separately.
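
For reference, the two transformations can be written out per feature. This is a sketch in assumed notation (not taken from the slides): $x_{i,j}$ is feature $j$ of sample $i$, $n$ is the number of samples, and $\mu_j$, $\sigma_j$ are the mean and population standard deviation of feature $j$.

$$
x^{\text{minmax}}_{i,j} = \frac{x_{i,j} - \min_k x_{k,j}}{\max_k x_{k,j} - \min_k x_{k,j}},
\qquad
x^{\text{std}}_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j},
\qquad
\mu_j = \frac{1}{n}\sum_{i=1}^{n} x_{i,j},
\quad
\sigma_j = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(x_{i,j} - \mu_j\right)^2}
$$

Min-max normalization maps every feature to the interval $[0, 1]$, while standardization gives every feature zero mean and unit variance.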


**Provided Code:** In the cell below I have provided some sample code to initialize dummy data.\
The parameter ```n_samples``` defines the number of samples in the training set (the number of $x_i$),\
while ```n_features``` defines the number of dimensions of each sample's feature vector.


**Your Tasks in this exercise:**
1. Implement the MinMax Normalization and Standardization.
2. Use the ```MinMaxScaler``` and ```StandardScaler``` from sklearn to verify your results.


In [9]:
from sklearn.datasets import make_regression
from sklearn.preprocessing import MinMaxScaler, StandardScaler

x,y = make_regression(n_samples=10, n_features=5)

In [21]:
x

array([[ 1.13586403, -0.10179095,  0.36380753, -1.28729935, -1.40260827],
       [ 0.29280375, -1.8521821 , -1.49523347, -0.13074253, -1.87733858],
       [ 1.092428  ,  0.73758805, -0.81709332,  0.08346904, -0.39307965],
       [-1.86501077,  0.1742972 ,  0.38576889,  0.24531084, -0.80732322],
       [ 1.10589195, -0.77424238,  0.34495833, -0.2434372 , -0.12753742],
       [-0.01870997,  1.63154691, -0.70545114, -0.08225763,  1.4592431 ],
       [ 0.82476867,  0.01529071,  1.26427446,  0.58641108,  0.56616575],
       [-0.41810317,  0.8286818 ,  4.28627745, -1.08418892,  0.1920402 ],
       [-0.50834768, -0.17360891, -1.27929847,  0.61060796, -0.48145071],
       [ 1.76410365,  0.47255802,  0.81613702,  1.23262023,  0.95300645]])

## Normalize

In [27]:
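# homemade
def min_max_normalize(data):
    # Scale each feature (column) to [0, 1] using its own min and max over all samples (axis=0).
    col_min = np.min(data, axis=0)
    col_max = np.max(data, axis=0)
    return (data - col_min) / (col_max - col_min)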

In [32]:
# scikit
scaler_minmax = MinMaxScaler()
scaler_minmax.fit(x)

print(scaler_minmax.data_max_)

[1.76410365 1.63154691 4.28627745 1.23262023 1.4592431 ]


In [50]:
# test
normalized_homemade = min_max_normalize(x)
normalized_scikit = scaler_minmax.transform(x)

In [51]:
normalized_homemade

array([[0.826889  , 0.50244756, 0.32154934, 0.        , 0.14228044],
       [0.59458432, 0.        , 0.        , 0.45896577, 0.        ],
       [0.81492023, 0.74339024, 0.11729462, 0.54397307, 0.44484418],
       [0.        , 0.58169832, 0.32534789, 0.60819806, 0.32069209],
       [0.81863022, 0.30942123, 0.31828908, 0.41424423, 0.52442929],
       [0.50874692, 1.        , 0.13660483, 0.47820642, 1.        ],
       [0.74116689, 0.53605571, 0.47729875, 0.74355962, 0.73233763],
       [0.3986944 , 0.76953859, 1.        , 0.08060195, 0.62020924],
       [0.37382759, 0.48183231, 0.03734923, 0.75316186, 0.41835867],
       [1.        , 0.66731371, 0.39978658, 1.        , 0.84827686]])

In [52]:
normalized_scikit

array([[0.826889  , 0.50244756, 0.32154934, 0.        , 0.14228044],
       [0.59458432, 0.        , 0.        , 0.45896577, 0.        ],
       [0.81492023, 0.74339024, 0.11729462, 0.54397307, 0.44484418],
       [0.        , 0.58169832, 0.32534789, 0.60819806, 0.32069209],
       [0.81863022, 0.30942123, 0.31828908, 0.41424423, 0.52442929],
       [0.50874692, 1.        , 0.13660483, 0.47820642, 1.        ],
       [0.74116689, 0.53605571, 0.47729875, 0.74355962, 0.73233763],
       [0.3986944 , 0.76953859, 1.        , 0.08060195, 0.62020924],
       [0.37382759, 0.48183231, 0.03734923, 0.75316186, 0.41835867],
       [1.        , 0.66731371, 0.39978658, 1.        , 0.84827686]])

In [70]:
np.mean(np.absolute(normalized_homemade - normalized_scikit)) < 0.00001

np.True_
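
The mean-based comparison above averages over all entries, so a single wrong entry can be hidden by many correct ones. A minimal sketch of the difference to an element-wise check with `np.allclose` follows; the arrays `reference` and `candidate` are made-up stand-ins, not the notebook's data.

```python
import numpy as np

# Stand-in for a trusted result (e.g. the scikit output); values are made up.
reference = np.linspace(0.0, 1.0, 50).reshape(10, 5)

# Stand-in for a homemade result with exactly one wrong entry.
candidate = reference.copy()
candidate[3, 2] += 4e-4

# Mean absolute difference: the single error is averaged over 50 entries.
mean_check = np.mean(np.abs(candidate - reference)) < 0.00001

# Element-wise check: every entry has to be within tolerance.
elementwise_check = np.allclose(candidate, reference)

# The mean-based check passes, the element-wise check does not.
print(mean_check, elementwise_check)
```

For the results above both checks agree, but the element-wise variant is the stricter test when verifying an implementation.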

## Standardize

In [42]:
def standardize(data):
    # Subtract each feature's (column's) mean and divide by its population standard deviation (ddof=0).
    return (data - np.mean(data, axis=0)) / np.std(data, axis=0)

In [62]:
standardized_homemade = standardize(x)

In [63]:
scaler_standard = StandardScaler()
scaler_standard.fit(x)


In [65]:
standardized_scikit = scaler_standard.transform(x)

In [66]:
standardized_homemade

array([[ 0.7822393 , -0.21921814,  0.02993001, -1.77238086, -1.23743527],
       [-0.04698096, -2.16106121, -1.14411177, -0.17136454, -1.72264079],
       [ 0.73951633,  0.71196933, -0.71584533,  0.12516756, -0.20563082],
       [-2.16937194,  0.08706761,  0.04379929,  0.34920443, -0.62901491],
       [ 0.75275926, -0.96522005,  0.01802617, -0.32736725,  0.06577076],
       [-0.35338075,  1.70370642, -0.64533984, -0.10424711,  1.6875645 ],
       [ 0.47625101, -0.08933047,  0.59860268,  0.82138793,  0.77478088],
       [-0.74621736,  0.81302661,  2.50709087, -1.49121602,  0.39240004],
       [-0.83498037, -0.2988913 , -1.00774214,  0.85488355, -0.29595184],
       [ 1.40016549,  0.41795121,  0.31559006,  1.71593229,  1.17015745]])

In [67]:
standardized_scikit

array([[ 0.7822393 , -0.21921814,  0.02993001, -1.77238086, -1.23743527],
       [-0.04698096, -2.16106121, -1.14411177, -0.17136454, -1.72264079],
       [ 0.73951633,  0.71196933, -0.71584533,  0.12516756, -0.20563082],
       [-2.16937194,  0.08706761,  0.04379929,  0.34920443, -0.62901491],
       [ 0.75275926, -0.96522005,  0.01802617, -0.32736725,  0.06577076],
       [-0.35338075,  1.70370642, -0.64533984, -0.10424711,  1.6875645 ],
       [ 0.47625101, -0.08933047,  0.59860268,  0.82138793,  0.77478088],
       [-0.74621736,  0.81302661,  2.50709087, -1.49121602,  0.39240004],
       [-0.83498037, -0.2988913 , -1.00774214,  0.85488355, -0.29595184],
       [ 1.40016549,  0.41795121,  0.31559006,  1.71593229,  1.17015745]])

In [68]:
np.mean(np.absolute(standardized_homemade - standardized_scikit)) < 0.00001

np.True_
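
The homemade version matches because `np.var` and `np.std` default to the population estimator (`ddof=0`), which is also what `StandardScaler` uses. The sketch below, on made-up data, shows how the result changes if the sample estimator (`ddof=1`) is used instead.

```python
import numpy as np

# Made-up data: 4 samples, 2 features on very different scales.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0],
                 [4.0, 40.0]])

centered = data - data.mean(axis=0)

# Population standard deviation (divide by n), the NumPy default.
z_population = centered / data.std(axis=0)

# Sample standard deviation (divide by n - 1).
z_sample = centered / data.std(axis=0, ddof=1)

# Only the population variant has exactly unit (population) standard deviation.
print(z_population.std(axis=0))
print(z_sample.std(axis=0))
```

With only a few samples the two variants differ noticeably, so a homemade implementation using `ddof=1` would not pass the comparison above.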

### Exercise 2 - Softmax

**Summary:** In this exercise you will implement the softmax activation using both the naive approach and the\
numerically more stable log-sum variation.


**Provided Code:** In the cell below there is some sample code that generates sample inputs.


**Your Tasks in this exercise:**
1. Implement the softmax function using the naive approach.
2. Implement the softmax function using the log-sum trick.
3. Compare your two implementations for numerical stability\
(experiment with different values of ```std```) and verify\
your results using ```tf.nn.softmax```.



In [None]:
import numpy as np
import tensorflow as tf

mu = 0
std = 10
xi = mu + std * np.random.randn(10)
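
One possible way to approach tasks 1 and 2 is sketched below in plain NumPy. The function names, the fixed seed and the test inputs are assumptions made for illustration; the verification against `tf.nn.softmax` is left out so that the example runs without TensorFlow. The stable variant uses the identity $\log\sum_j e^{x_j} = m + \log\sum_j e^{x_j - m}$ with $m = \max_j x_j$.

```python
import numpy as np

def softmax_naive(x):
    # Direct definition: exp(x_i) / sum_j exp(x_j). exp() overflows for large x.
    e = np.exp(x)
    return e / np.sum(e)

def softmax_stable(x):
    # Log-sum-exp trick: shift by the maximum before exponentiating,
    # then compute softmax as exp(x_i - logsumexp(x)).
    m = np.max(x)
    log_sum = m + np.log(np.sum(np.exp(x - m)))
    return np.exp(x - log_sum)

rng = np.random.default_rng(0)           # assumed seed, for reproducibility only
moderate = 0 + 10 * rng.standard_normal(10)
large = np.array([1000.0, 1001.0, 1002.0])

# Silence the overflow warnings that the naive version triggers on purpose.
with np.errstate(over="ignore", invalid="ignore"):
    naive_moderate = softmax_naive(moderate)
    naive_large = softmax_naive(large)

stable_moderate = softmax_stable(moderate)
stable_large = softmax_stable(large)

print(np.allclose(naive_moderate, stable_moderate))  # both agree on moderate inputs
print(naive_large)                                    # inf / inf: no usable result
print(stable_large)                                   # finite probabilities
```

For moderate inputs both versions agree. For large inputs `np.exp` overflows to `inf` in the naive version, while the shifted version only ever exponentiates values less than or equal to zero.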

### Exercise 3 - Chess Endgames

**Summary:** In this exercise your task is to predict the optimal depth-of-win for white in   
chess-endgames. In particular, we will focus on **king-rook** vs. **king** endgames. The   
possible outcomes are either a **draw** or a **number of moves** for white to win (0 to 16).


**Provided Code:** The code below loads the original (*unprepared*) raw dataset.   
You will have to prepare it before it can be used with neural nets.

The structure of each row in the dataset is:
1. White King column (a-h)
2. White King row (1-8)
3. White Rook column (a-h)
4. White Rook row (1-8)
5. Black King column (a-h)
6. Black King row (1-8)
7. Optimal depth-of-win for White (0 to 16 moves) or a draw


**Your Tasks in this exercise:**
1. Train a neural net to predict the depth-of-win (or draw) given a board position
    * You will have to prepare your data to make it compatible   
    with neural nets. Think about input and output encodings, normalization or standardization.
    * Decide whether to model this problem as a regression or a classification task.
    * Build a fully connected neural net with an appropriate configuration and loss and train it.
    * Use appropriate cross-validation for training and validation (two datasets are enough).
2. Explain in writing:
    * How and why did you prepare the data?
    * How did you model the problem task?
    * What is your neural network architecture/configuration/loss?
    * Plot your loss while training.
    * Interpret and explain your results.
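
As a starting point for the data preparation, here is a minimal sketch of one possible encoding: every board coordinate is one-hot encoded, and the outcome is treated as one of 18 classes. The exact value types in the pickle are not shown here, so the row layout, the label values in `LABELS` and both function names are assumptions to be adapted to the real data.

```python
import numpy as np

COLUMNS = "abcdefgh"
# Assumed label set: "draw" plus depths 0..16. The real dataset may use other values.
LABELS = ["draw"] + list(range(17))

def encode_position(row):
    """One-hot encode the six board coordinates of a row into a 48-dim vector."""
    wk_col, wk_row, wr_col, wr_row, bk_col, bk_row = row[:6]
    indices = [
        COLUMNS.index(wk_col), int(wk_row) - 1,
        COLUMNS.index(wr_col), int(wr_row) - 1,
        COLUMNS.index(bk_col), int(bk_row) - 1,
    ]
    features = np.zeros((6, 8))
    features[np.arange(6), indices] = 1.0   # one active entry per coordinate
    return features.ravel()

def encode_label(label):
    """One-hot encode the outcome for a classification setup."""
    target = np.zeros(len(LABELS))
    target[LABELS.index(label)] = 1.0
    return target

example_row = ("a", 1, "b", 3, "c", 2, "draw")  # made-up row for illustration
x_vec = encode_position(example_row)
y_vec = encode_label(example_row[6])
print(x_vec.shape, y_vec.shape)
```

One-hot inputs need no further normalization. If the coordinates are instead kept as numbers from 1 to 8, scaling them as in Exercise 1 is worth considering.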
    



In [None]:
!wget https://github.com/shegenbart/Jupyter-Exercises/raw/main/data/chess_endgames.pickle -P ../data
import pickle
with open('../data/chess_endgames.pickle', 'rb') as fd:
    chess_endgames = pickle.load(fd)


The command "wget" is either misspelled or
could not be found.
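
The error above appears because `wget` is not available on this system. A portable alternative is sketched below using only the Python standard library. The helper name `load_chess_endgames` is made up for illustration, and the URL and target path are the ones from the cell above.

```python
import pickle
import urllib.request
from pathlib import Path

URL = "https://github.com/shegenbart/Jupyter-Exercises/raw/main/data/chess_endgames.pickle"

def load_chess_endgames(url=URL, target="../data/chess_endgames.pickle"):
    """Download the pickle once (if it is not there yet) and load it."""
    target = Path(target)
    if not target.exists():
        # Create the data directory and fetch the file without relying on wget.
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(target))
    with target.open("rb") as fd:
        return pickle.load(fd)
```

Calling `chess_endgames = load_chess_endgames()` then replaces both the `!wget` line and the manual `pickle.load`. As always, only unpickle files from a source you trust.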
