Karl Sundequist Blomdahl edited this page Feb 21, 2022 · 19 revisions

Welcome to the dream-go wiki!

Neural Network Architecture

The neural network architecture suggested by DeepMind seems to have serious trouble identifying Life & Death correctly, and tends to memorize sequences of play instead of generalizing. We therefore suggest an alternative set of input features that includes more domain information and helps to address some of these problems. Some of these features are inspired by the KataGo features, but with our own spin.

Input features

We suggest the following input features:

  • Global properties
    • If the current player is black
    • If the current player is white
    • Komi, normalized to the range -1 to +1 (with ±7.5 as the maximum recognized komi)
  • Vertex properties
    • Is corner (vertex has max 2 liberties)
    • Is edge (vertex has max 3 liberties)
    • Is middle (vertex has max 4 liberties)
  • History (Last 8 moves)
    • Player
      • Is ladder capture if played
      • Is ladder escape if played
      • Is filled
    • Opponent
      • Is ladder capture if played
      • Is ladder escape if played
      • Is filled
    • Liberties
      • Has 2 liberties
      • Has 3 liberties
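
The global and vertex-property features above can be sketched as follows. This is a hypothetical illustration, not the actual dream-go implementation; the names `komi_feature` and `vertex_planes` are our own, and we assume features are stored as per-vertex planes.

```python
SIZE = 19
MAX_KOMI = 7.5  # maximum recognized komi, mapped to +1.0

def komi_feature(komi):
    """Normalize komi into [-1, +1], clamping at the recognized maximum."""
    return max(-1.0, min(1.0, komi / MAX_KOMI))

def vertex_planes(size=SIZE):
    """Three binary planes: corner (max 2 liberties), edge (max 3 liberties),
    and middle (max 4 liberties). Every vertex belongs to exactly one plane."""
    corner = [[0.0] * size for _ in range(size)]
    edge = [[0.0] * size for _ in range(size)]
    middle = [[0.0] * size for _ in range(size)]

    for y in range(size):
        for x in range(size):
            on_edge_x = x in (0, size - 1)
            on_edge_y = y in (0, size - 1)

            if on_edge_x and on_edge_y:
                corner[y][x] = 1.0
            elif on_edge_x or on_edge_y:
                edge[y][x] = 1.0
            else:
                middle[y][x] = 1.0

    return corner, edge, middle
```

Since the three vertex-property planes partition the board, they sum to exactly 1 at every vertex.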

Data augmentation

During training we perform two independent augmentations for the data:

  • Symmetry, for each training example a random symmetry is picked and applied to the example (and answer).
  • History Erasure, each training example has a 10% chance to get its history features erased.

This totals up to a 16x data augmentation (8 board symmetries × 2 history states).
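The two augmentations can be sketched as below. This is a minimal illustration under our own assumptions (feature planes as square lists-of-lists, history planes identified by index); the function names are hypothetical. The 8 symmetries are the dihedral group of the square board.

```python
import random

def apply_symmetry(plane, s):
    """Apply one of the 8 board symmetries (s in 0..7) to a square plane."""
    if s >= 4:  # transpose first for the reflected half of the group
        plane = [list(row) for row in zip(*plane)]
    for _ in range(s % 4):  # then rotate 90 degrees clockwise, s % 4 times
        plane = [list(row) for row in zip(*plane[::-1])]
    return plane

def augment(planes, policy, history_indices, rng=random):
    """Pick a random symmetry, apply it to every input plane and to the
    answer (policy), then erase the history planes with 10% probability."""
    s = rng.randrange(8)  # same symmetry for features and answer
    planes = [apply_symmetry(p, s) for p in planes]
    policy = apply_symmetry(policy, s)

    if rng.random() < 0.10:  # 10% chance to erase the history features
        for i in history_indices:
            planes[i] = [[0.0] * len(row) for row in planes[i]]

    return planes, policy
```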

Adversarial networks

There are some features of the game that we do not want the network to discover, as they represent bias in the training data. These properties include:

  1. Whether the current player is black or white, as inferred from the win rate.

We add an adversarial network to prevent the main network from learning these properties. The training procedure is roughly the following [1]:

  • 3x: We train the adversarial network to minimise its prediction loss, so that it learns to accurately predict the given properties.
  • 1x: We train the main network to minimise its own loss while maximising the adversarial network's prediction loss, so that the given properties cannot be predicted.
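
The 3:1 alternating schedule can be sketched as below. This is a minimal sketch of the schedule only; `train_adversary_step` and `train_main_step` are hypothetical stand-ins for the real optimizer steps (which would compute the losses described above).

```python
ADVERSARY_STEPS = 3  # adversary updates per main-network update

def training_loop(batches, train_adversary_step, train_main_step):
    """Alternate: 3 steps training the adversary (minimise its prediction
    loss), then 1 step training the main network (minimise its own loss,
    maximise the adversary's loss)."""
    schedule = []
    for i, batch in enumerate(batches):
        if i % (ADVERSARY_STEPS + 1) < ADVERSARY_STEPS:
            train_adversary_step(batch)
            schedule.append('adv')
        else:
            train_main_step(batch)
            schedule.append('main')
    return schedule
```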