Note: This does not cover NEAT itself, only what HyperNEAT adds on top of it.

# HyperNEAT

The motivation for this and similar techniques is that "Because natural evolution discovered intelligent brains with billions of neurons and trillions of connections, perhaps neuroevolution can do the same."

However, the scale found in nature is still far out of reach.

HyperNEAT uses *connective Compositional Pattern Producing Networks* (connective CPPNs). The advantage of this approach is that it can exploit the geometry of the task by mapping its regularities onto the topology of the network, thereby shifting problem difficulty away from dimensionality and toward the underlying problem structure.

The design principle seems to be that the brain has "precise, intricate, repeating motifs" which is what they try to mimic, albeit at a smaller scale I guess.

TODO: concept of *reuse* here

The biggest difference from NEAT seems to be that HyperNEAT uses an indirect encoding, with a mapping from genotype to phenotype that is very expressive?

Something about not wanting to disregard the order of the input? With regular ANNs we can feed in the input data in any order and still train successfully, but this forces the network to learn correlations in different parts of the network, which makes reuse through the encoding's mapping harder??? They use connective CPPNs to represent a connectivity pattern so that useful geometric information is not thrown away. I think this is where the hypercube name comes from, since the input (only the input? or does it apply in other places too?) could be connected as a hypercube. Consider a 2d board game state where neighboring tiles in all directions could be important. They want to capture this information, and then generalize it to higher dimensions I guess.


TODO: initial understanding: HyperNEAT uses NEAT to evolve CPPNs, which then describe a big network that solves the task???

HyperNEAT "paints" regular patterns onto a hypercube.


## Background: CPPNs
CPPNs are capable of generating complex spatial patterns in Cartesian space.

In biological genetic encoding, the mapping between genotype and phenotype is indirect: about 30,000 genes give rise to a brain with around 100 trillion connections. Because phenotypic structures often occur in repeating patterns, the same group of genes can provide the specification each time a pattern repeats. Fingers and toes are examples of repeating patterns in biology.

CPPNs produce spatial patterns by composing basic functions. Simple canonical functions can be composed to create an overall network that produces complex regularities and symmetries. 

Each component function creates a novel geometric *coordinate frame* within which other functions can reside. The main idea is that these simple canonical functions are abstractions of specific events in development such as establishing bilateral symmetry (e.g. with a symmetric function such as Gaussian) or the division of the body into discrete segments (e.g. with a periodic function such as sine).

CPPNs take spatial coordinates as input and output a value at that position. One of the interesting things is the symmetric and repeating patterns they can produce. The counterpart in biology is how many animals have left-right symmetry, or the repeating receptive fields in the eyes.

CPPNs can include many different basic functions to build the more complex ones. Some example outputs below.

<img src="figs/HyperNEAT/cppn-generated-regularities.png" width="80%" height="80%">
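To make the composition idea concrete, here is a minimal sketch of a spatial CPPN. The wiring is made up by hand for illustration (nothing is evolved), and the names `gaussian`, `spatial_cppn`, and `render` are my own, not from the paper. A Gaussian on `x` stands in for bilateral symmetry and a sine on `y` stands in for segmentation.

```python
import math


def gaussian(z):
    """Symmetric about zero, so it gives the same value for z and -z."""
    return math.exp(-z * z)


def spatial_cppn(x, y):
    """Hand-wired toy CPPN: a Gaussian (symmetry) composed with a sine (repetition)."""
    symmetric_x = gaussian(x)      # identical for x and -x
    segments = math.sin(4.0 * y)   # repeats along the y axis
    return math.tanh(symmetric_x + segments)


def render(resolution=9):
    """Sample the CPPN on a grid covering [-1, 1] x [-1, 1]; one row per y value."""
    coords = [-1.0 + 2.0 * i / (resolution - 1) for i in range(resolution)]
    return [[spatial_cppn(x, y) for x in coords] for y in coords]


pattern = render()
for row in pattern:
    print(" ".join(f"{value:+.2f}" for value in row))
```

Each printed row reads the same left to right and right to left, because the only dependence on `x` goes through the symmetric Gaussian.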

TODO: better descriptions of CPPNs maybe

## Basics of HyperNEAT
The idea is to evolve CPPNs with NEAT and then use the CPPNs to generate networks.

In original NEAT, only hidden nodes with sigmoid activations are used. In CPPN-NEAT the nodes can have different activation functions from a fixed set, such as Gaussian, periodic (e.g. sine), and sigmoid.

The goal is then to transfer the regularities of CPPNs evolved with NEAT to evolved connectivity patterns. The representational power of CPPNs could then be used to evolve big, complex ANNs with symmetries and repeating patterns.

A connectivity pattern says which nodes on the grid are connected.

### Mapping spatial patterns to connectivity
The problem to solve is how to find a way to have spatial patterns describe connectivity.

Spatial patterns are what we see in the output of a CPPN: each coordinate is fed into the CPPN, and the output is basically the intensity at that coordinate. The intensities over all coordinates form the spatial pattern. The spatial pattern can be viewed as the phenotype, and the function f that computes each intensity is the genotype.

**Anyway, the main idea to solve this is to let the input to the CPPN be the *two points* that define a connection instead of just each coordinate independently. The output is then interpreted as the weight of this connection instead of the intensity of a single coordinate.**

For a 2d problem, the CPPN is defined for 4 inputs, $CPPN(x_1, y_1, x_2, y_2)$. This way we get a weight between every pair of nodes in the grid. This includes recurrent connections as well (how?? by just querying with the inputs in the other order, $(x_2, y_2, x_1, y_1)$?).

By convention, if the magnitude of a computed weight is too small, it is not expressed in the phenotype network. The magnitudes of all weights above this threshold are scaled to lie between zero and a maximum weight for the resulting network. (So the sign is kept and negative weights are still possible?) In this way we can get any network topology. (But I guess we still have to choose a maximum number of nodes?)
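A minimal sketch of the query step described above. The CPPN here is a hand-written toy function rather than an evolved network, and `threshold`, `max_weight`, and all function names are illustrative choices of mine. Whether the sign survives the scaling is flagged as a question in these notes; the sketch assumes it does and scales only the magnitude.

```python
import itertools
import math


def connective_cppn(x1, y1, x2, y2):
    """Toy hand-wired CPPN (an assumption, not an evolved one); output lies in [-1, 1]."""
    return math.sin(2.0 * (x2 - x1)) * math.exp(-((y2 - y1) ** 2))


def build_connections(coords, threshold=0.2, max_weight=3.0):
    """Query every ordered pair of substrate nodes and keep the strong connections."""
    nodes = list(itertools.product(coords, coords))
    connections = {}
    for source, target in itertools.product(nodes, nodes):
        out = connective_cppn(*source, *target)
        if abs(out) <= threshold:
            continue  # too weak: this connection is not expressed
        # Rescale the magnitude from (threshold, 1] to (0, max_weight].
        # Keeping the sign is an assumption of this sketch.
        magnitude = (abs(out) - threshold) / (1.0 - threshold) * max_weight
        connections[(source, target)] = math.copysign(magnitude, out)
    return connections


substrate_coords = [-1.0, 0.0, 1.0]
connections = build_connections(substrate_coords)
print(len(connections), "of", len(substrate_coords) ** 4, "possible connections expressed")
```

Because the loop runs over ordered pairs, the connection from A to B and the one from B to A are separate queries with swapped inputs, which is one way a connection in each direction between the same two nodes can show up.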

<img src="figs/HyperNEAT/hyperneat-connectivity.png" width="60%" height="60%">

"Hypercube-based Geometric Connectivity Pattern Interpretation. A grid of nodes, called the substrate, is assigned coordinates such that the center node is at the origin. 

(1) Every potential connection in the substrate is queried to determine its presence and weight; the dark directed lines shown in the substrate represent a sample of connections that are queried. 

(2) For each query, the CPPN takes as input the positions of the two endpoints and 

(3) outputs the weight of the connection between them. After all connections are determined, a pattern of connections and connection-weights results that is a function of the geometry of the substrate. In this way, connective CPPNs produce regular patterns of connections in space."

The resulting connectivity exhibits a pattern that is a function of the substrate geometry.

**NOTE**: The connectivity pattern produced by a CPPN in this way is called the substrate, so that it can be verbally distinguished from the CPPN itself, which has its own internal topology.

CPPNs that are interpreted to produce connectivity patterns are called *connective CPPNs*, while CPPNs that generate spatial patterns are called *spatial CPPNs*. This paper uses connective CPPNs.

A key insight the authors point out is that since the connective CPPN is a function of four dimensions, the two-dimensional connectivity pattern expressed by the CPPN is isomorphic to a spatial pattern embedded in a four-dimensional hypercube. This is important because it means the symmetries and regularities a CPPN can produce as spatial patterns now show up as the corresponding regularities in connectivity, which was the goal. Put differently: a spatial pattern in 4d is interpreted as a connectivity pattern in 2d. See the images below, which show the resulting networks.

<img src="figs/HyperNEAT/hyperneat-connectivity-patterns.png" width="60%" height="60%">
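The 4d-pattern versus 2d-connectivity view can be checked on a tiny example. The sketch below samples a hand-written toy CPPN (my own choice, not from the paper) at every point of a 5x5x5x5 grid, then reads the same numbers back as connections between nodes of a 5x5 substrate. Since the toy CPPN only depends on `x` through a symmetric Gaussian of the x-offset, the connectivity should be unchanged when the substrate is mirrored left to right.

```python
import itertools
import math


def connective_cppn(x1, y1, x2, y2):
    """Toy CPPN (hypothetical): Gaussian of the x-offset times cosine of the y-offset."""
    return math.exp(-((x2 - x1) ** 2)) * math.cos(math.pi * (y2 - y1))


n = 5
axis = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]

# One entry per sampled point of the 4-D hypercube, keyed by grid indices (i1, j1, i2, j2).
hypercube = {
    (i1, j1, i2, j2): connective_cppn(axis[i1], axis[j1], axis[i2], axis[j2])
    for i1, j1, i2, j2 in itertools.product(range(n), repeat=4)
}


def weight(source, target):
    """Read the 4-D pattern as a 2-D connection from node `source` to node `target`."""
    return hypercube[source + target]


def mirror(node):
    """Reflect a node index left-right across the substrate's vertical axis."""
    i, j = node
    return (n - 1 - i, j)


nodes = list(itertools.product(range(n), repeat=2))
is_symmetric = all(
    math.isclose(weight(a, b), weight(mirror(a), mirror(b)))
    for a in nodes
    for b in nodes
)
print(len(hypercube), is_symmetric)
```

The dictionary and the `weight` function hold exactly the same values; only the interpretation of the four indices changes.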


### Substrate configuration
The substrate does not necessarily have to be the 2d grid of nodes shown in the example before. Different substrate layouts are likely suited to different problems. Depending on the problem, it might even need to be a 3d grid.

<img src="figs/HyperNEAT/hyperneat-connectivity-substrate-patterns.png" width="60%" height="60%">
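A small sketch of what changing the substrate dimension means for the CPPN. Assuming each query simply concatenates the coordinates of the two endpoints, a 2d substrate needs 4 CPPN inputs and a 3d substrate needs 6. The `toy_cppn` below is a hypothetical stand-in that accepts any even number of inputs; it is not how an evolved CPPN would look.

```python
import itertools
import math


def toy_cppn(*inputs):
    """Hypothetical stand-in for an evolved CPPN: first half is the source, second half the target."""
    half = len(inputs) // 2
    source, target = inputs[:half], inputs[half:]
    return math.exp(-math.dist(source, target) ** 2)


def query_all(nodes):
    """Return {(source, target): cppn_output} for every ordered pair of nodes."""
    return {(s, t): toy_cppn(*s, *t) for s, t in itertools.product(nodes, repeat=2)}


axis = [-1.0, 0.0, 1.0]
grid_2d = list(itertools.product(axis, repeat=2))  # 9 nodes, 4 CPPN inputs per query
grid_3d = list(itertools.product(axis, repeat=3))  # 27 nodes, 6 CPPN inputs per query

weights_2d = query_all(grid_2d)
weights_3d = query_all(grid_3d)
print(len(weights_2d), len(weights_3d))
```

The number of queries grows with the square of the node count, so the substrate layout also decides how expensive it is to build the phenotype.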

### Input and output placement
Part of the substrate design is to decide which nodes are inputs and which are outputs. The fact that the CPPN is aware of where they are placed (which a normal ANN is not) can be an advantage, because the geometric relationships between them can be exploited.


### Substrate resolution
TODO


### Evolving connective CPPNs
In this paper, the connective CPPNs are evolved via NEAT.

1. Choose a substrate configuration.
2. Initialize a population of minimal CPPNs with random weights.
3. Repeat:
   1. For each solution in the population, query its CPPN for each possible connection to form the phenotype network.
   2. Run the phenotype network on the task to find its fitness.
   3. Create the next population according to the NEAT algorithm.
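The loop above can be sketched as follows. Big caveat: this is not NEAT. To keep it short, the CPPN has a fixed three-parameter form and the "next population" step is plain elitism plus Gaussian mutation, with no speciation, crossover, or structural mutation. The substrate (three inputs feeding three outputs), the toy task, and every name and constant are assumptions made for illustration.

```python
import math
import random

XS = [-1.0, 0.0, 1.0]  # step 1: inputs sit at y = -1, outputs at y = +1


def cppn(genome, x1, y1, x2, y2):
    """Fixed-form toy CPPN; y1 and y2 are accepted but ignored here."""
    a, b, c = genome
    return math.tanh(a * math.exp(-((b * (x2 - x1)) ** 2)) + c)


def build_phenotype(genome, threshold=0.2):
    """Step 3.1: query the CPPN for every input -> output connection."""
    weights = []
    for x_out in XS:
        row = []
        for x_in in XS:
            w = cppn(genome, x_in, -1.0, x_out, 1.0)
            row.append(w if abs(w) > threshold else 0.0)  # weak links are dropped
        weights.append(row)
    return weights


def fitness(genome):
    """Step 3.2: toy task, each output should respond only to the input directly below it."""
    weights = build_phenotype(genome)
    error = 0.0
    for j in range(len(XS)):
        for i in range(len(XS)):
            target = 1.0 if i == j else 0.0
            error += (weights[j][i] - target) ** 2
    return -error


def evolve(generations=30, population_size=20, seed=0):
    rng = random.Random(seed)
    # Step 2: random initial genomes (a real NEAT run would start from minimal topologies).
    population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(population_size)]
    best_history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        best_history.append(fitness(ranked[0]))
        parents = ranked[: population_size // 4]
        # Step 3.3 stand-in: keep the parents, fill the rest with mutated copies.
        children = [
            [gene + rng.gauss(0.0, 0.3) for gene in rng.choice(parents)]
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness), best_history


best, history = evolve()
print("best genome:", [round(g, 2) for g in best], "fitness:", round(fitness(best), 3))
```

The part worth noticing is that fitness is never computed on the CPPN directly: every genome is first expanded into a phenotype weight matrix via the substrate, and only that network is scored.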

This means that the connective CPPNs start out small and add nodes and edges as they "discover new global dimensions of variations in connectivity across the substrate".

Each new connection in the CPPN represents a new regularity in connectivity in the substrate.

TODO: Are the activation functions just randomly chosen in NEAT's add node structural mutation?

### TLDR
Have some grid of nodes (the substrate) which defines the possible connections.

A connective CPPN that takes two coordinates in the grid and outputs a weight for that connection is evolved with NEAT.

The phenotype, i.e. the actual network, is obtained by passing each coordinate pair on the substrate through the connective CPPN and seeing which connection weights are high enough.

Are the cool activation functions only in the CPPNs and not in the phenotype network?

## Experiments (examples of applications)
The paper performs two experiments using HyperNEAT.

1. The first is a visual discrimination task, where the goal is to show HyperNEAT's ability to exploit regularity. The actual task is to discriminate a large object from a small object.
2. The second is a food gathering task for a robot in a room. The robot has rangefinder sensors pointing in each direction.


TODO


## Discussion and Thoughts
Find recent uses of this

Their concern about connectivity being disregarded in regular ANNs (?). Is this solved with convolutional neural networks basically? I think this is a big reason convolutions are so good in image related stuff, since it preserves local activations in a good way?

"Vision is well-suited to testing learning methods on high-dimensional input. Natural vision also has the intriguing property that the same stimulus can be recognized equivalently at different locations in the visual field." This seems like basically the same rationale as convolutions in CNNs, so HyperNEAT is probably outdated for these types of tasks.

They only talk about the 2d case, does it generalize well to higher dimensions, i.e. connections in 3d would be a 6d connective CPPN I guess?

We have to choose a maximum number of nodes (the grid in the "substrate")? Is this a weakness? With GPUs, using the same max size, would we get similar results faster anyway?

What about recurrent connections?

It came out in 2009, before deep learning got popular, does it mean HyperNEAT is outdated?
