# Going deep and convoluted
The one trick that we haven't used yet is [Convolutional Neural Network](https://en.wikipedia.org/wiki/Convolutional_neural_network) (CNN) layers. So far, we have fed the board state into the Neural Network as a 1D array. This makes it hard for the Neural Network to exploit the inherently 2D nature of the board.

CNN layers slide a 2D window over the board state and can thus learn two-dimensional patterns:

<table width=100%> <tr> <td>
<img src="https://upload.wikimedia.org/wikipedia/commons/6/68/Conv_layer.png" width="250" />
    </td><td>
<img src="https://cdn-images-1.medium.com/max/1600/0*iqNdZWyNeCr5tCkc." width="200" />
    </td></tr>
<tr> <td>
    [Source: Wikipedia](https://en.wikipedia.org/wiki/File:Conv_layer.png)
    </td><td>
    [Source: Daphne Cornelisse "An intuitive guide to Convolutional Neural Networks"](https://medium.freecodecamp.org/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050)
    </td></tr>

</table>
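
To make the sliding window concrete, here is a minimal pure-Python sketch. It is not the code used in this tutorial: the function name `slide_window`, the 0/1 board encoding, and the hand-written 2x2 kernel are all assumptions made for illustration. In a real CNN layer the kernel values are learned rather than fixed.

```python
from typing import List

Matrix = List[List[float]]


def slide_window(board: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `board` (stride 1, no padding) and return the response map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(board) - k_rows + 1
    out_cols = len(board[0]) - k_cols + 1
    result = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the window element-wise with the kernel and sum up.
            total = sum(board[r + i][c + j] * kernel[i][j]
                        for i in range(k_rows)
                        for j in range(k_cols))
            row.append(total)
        result.append(row)
    return result


# Hypothetical encoding: 1 marks a cross, 0 marks anything else.
crosses = [[1, 0, 0],
           [0, 1, 0],
           [0, 0, 1]]

# A hand-written kernel that responds to two diagonally adjacent pieces.
diagonal_kernel = [[1, 0],
                   [0, 1]]

response = slide_window(crosses, diagonal_kernel)
print(response)
```

The response is highest wherever the window covers two diagonally adjacent crosses, which is exactly the kind of 2D pattern a flat 1D input makes hard to see.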

And indeed, many contemporary NN approaches to learning how to play board games use CNN layers as part of their topology. The most prominent example is [AlphaGo](https://deepmind.com/blog/alphago-zero-learning-scratch/) by DeepMind. If it's good enough for them, it should be good enough for us. We'll take the NN from our previous part and simply pop a few CNN layers on top. We'll also have to change our input state encoding to be 3 2D planes, one plane each for crosses, naughts, and empty fields.
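
The three-plane encoding could look roughly like the sketch below. The cell constants `EMPTY`, `NAUGHT`, and `CROSS`, the flat 9-element `state` list, and the function `board_to_planes` are hypothetical names chosen for this example; the actual board class may represent things differently.

```python
from typing import List

# Hypothetical cell values; the constants in the real board class may differ.
EMPTY, NAUGHT, CROSS = 0, 1, 2


def board_to_planes(state: List[int], size: int = 3) -> List[List[List[int]]]:
    """Turn a flat board into 3 binary planes, ordered crosses, naughts, empty."""
    planes = []
    for piece in (CROSS, NAUGHT, EMPTY):
        # Each plane holds 1 where the cell contains `piece`, and 0 elsewhere.
        plane = [[1 if state[r * size + c] == piece else 0 for c in range(size)]
                 for r in range(size)]
        planes.append(plane)
    return planes


state = [CROSS, EMPTY, EMPTY,
         EMPTY, NAUGHT, EMPTY,
         EMPTY, EMPTY, CROSS]

planes = board_to_planes(state)
for name, plane in zip(("crosses", "naughts", "empty"), planes):
    print(name, plane)
```

Every cell is set in exactly one of the three planes. Note that this sketch puts the plane index first; a convolution layer may expect the planes as the last axis instead, in which case the axes would need to be reordered.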

# Some other changes while we are at it
We make two more small changes at this stage: we add regularization loss and [TensorBoard](https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard) support.

## Regularization Loss
One problem Neural Networks can face is that weights grow increasingly large, which in turn can cause numerical instabilities. A straightforward way to deal with this is to define a loss based on how large the weights are and add it to the total loss function of the Neural Network. This gives the training step an incentive to keep the weights small.

We need to balance the regularization loss against the rest of the loss function, however. Otherwise the Neural Network may optimize the regularization loss at the expense of a higher loss in our Q function, which we don't want.

In the code, this means adding a `kernel_regularizer` to every layer, for example:

```Python
return tf.layers.dense(input_tensor, output_size, activation=activation_fn,
                       kernel_initializer=tf.contrib.layers.variance_scaling_initializer(),
                       kernel_regularizer=tf.contrib.layers.l1_l2_regularizer(),
                       name=name)
```

We then add the regularization loss to the total loss:

```Python
self.reg_losses = tf.identity(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES, scope=name),
                              name="reg_losses")
reg_loss = self.beta * tf.reduce_mean(self.reg_losses)

self.total_loss = tf.add(self.loss, reg_loss, name="total_loss")
```
with `self.beta` being a factor smaller than 1 to scale down `reg_loss` and prevent it from dominating the optimization process.
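
The balancing effect of `beta` can be illustrated without TensorFlow. The following is a simplified sketch, not the network code: `l1_l2_penalty` and `total_loss` are hypothetical helpers, the weights are made-up numbers, and the exact scaling constants in TensorFlow's regularizers may differ.

```python
from typing import List


def l1_l2_penalty(weights: List[float], l1_scale: float = 1.0,
                  l2_scale: float = 1.0) -> float:
    """Combined L1 + L2 penalty for the weights of one layer (simplified)."""
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return l1_scale * l1 + l2_scale * l2


def total_loss(q_loss: float, layer_weights: List[List[float]], beta: float) -> float:
    """Q loss plus the mean per-layer penalty, scaled down by `beta`."""
    penalties = [l1_l2_penalty(w) for w in layer_weights]
    reg_loss = beta * sum(penalties) / len(penalties)
    return q_loss + reg_loss


# Same Q loss, but very different weight magnitudes (made-up values).
small = total_loss(0.5, [[0.1, -0.2], [0.3]], beta=0.01)
large = total_loss(0.5, [[10.0, -20.0], [30.0]], beta=0.01)
print(small, large)
```

With small weights the regularization term barely changes the total, while large weights are clearly penalized. Choosing `beta` too large would let the penalty swamp the Q loss even for reasonable weights.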

## TensorBoard support
The second change we are making is to add support for TensorBoard. TensorBoard is a dashboard that comes with TensorFlow and helps visualize NN graphs as well as all kinds of metrics around training and performance.

So far, the only metric we have looked at was the win-loss-draw ratio our NN achieved. By using TensorBoard we might be able to get some additional insights.

In order to do this, we need to create a summary writer, write to it, and in the end close it:

```Python
sess = TFSessionManager.get_session()
writer = tf.summary.FileWriter(TENSORLOG_DIR, sess.graph)
[...]
writer.close()
```

To capture the actual metrics, we add the appropriate lines when building the graph:
```Python
def build_graph(self, name: str):
    """
    Builds a new TensorFlow graph with scope `name`
    :param name: The scope for the graph. Needs to be unique for the session.
    """
    with tf.variable_scope(name):
        [...]
        self.q = tf.reduce_sum(tf.multiply(self.q_values, self.actions_onehot), axis=1, name="selected_action_q")

        tf.summary.histogram("Action Q values", self.q)

        [...]
        self.loss = tf.reduce_mean(self.td_error, name="q_loss")

        tf.summary.scalar("Q Loss", self.loss)
        [...]
        reg_loss = self.beta * tf.reduce_mean(self.reg_losses)
        tf.summary.scalar("Regularization loss", reg_loss)

        self.merge = tf.summary.merge_all()

        [...]

```

This creates a Tensor `self.merge` which we can run with the training step:

```Python
def final_result(self, result: GameResult):
    [...]
    
    summary, _ = TFSN.get_session().run([self.q_net.merge, self.q_net.train_step], feed_dict=[...])
    
    [...]
    
    if self.writer is not None:
        self.writer.add_summary(summary, self.game_counter)
        summary = tf.Summary(value=[tf.Summary.Value(tag='Random Move Probability',
                                                     simple_value=self.random_move_prob)])
        self.writer.add_summary(summary, self.game_counter)
```