# Neuroevolution Ticket Search: Finding Sparse Trainable DNN Initialisations

This is a blog post describing the workshop paper currently under review at the [ICLR 2023 Workshop on Sparsity in Neural Networks (SNN 2023)](https://www.sparseneural.net/) to be held in Kigali, Rwanda.
The workshop's focus is on practical limitations and tradeoffs between sustainability and efficiency.



## Background on Artificial Neural Networks

An artificial neural network is a computerised *model* inspired by the brains of organisms that occur in nature (for example, human brains).
The key idea is that the brain is a network of neurons that are connected together in a way that allows them to process information.

We know that if an artificial neural network is configured correctly, it can approximate any continuous function.
In this context, a *function* is a mathematical formula/algorithm/operation that takes a number as input and returns another number as output; *continuous* means, roughly, that a small change in the input produces only a small change in the output. For example, the function `square` takes a number (e.g. 2) and returns another number (e.g. 4).
To approximate a function means to define a way of computing the output from the input that is as close as possible to the actual/true value.

**This is a very powerful idea because it means that we can use neural networks to *learn* the relationship between inputs (e.g. the pixels of an image) and outputs (e.g. the probability that the image is a cat).**


### A Single Neuron

The foundation of the model is a *neuron*, which is a simple mathematical function that takes a weighted sum of its inputs and applies an *activation function* to the result.

The diagram below shows a single neuron with four inputs and one output. This is as complicated as the maths will get!

<center>

![Neuron](../images/neuron.drawio.png)

</center>

- The inputs are represented by the circles $x_1, x_2, \dots, x_4$.
- Each input is multiplied by a weight $w_1, w_2, \dots, w_4$.
- The weighted inputs are summed together to produce a single value, denoted by $\Sigma$.
- The sum is then passed through an activation function $f$ to produce the final output $y$.



Mathematically, this is represented by the following equation:

$$y = f(\Sigma) = f(w_1x_1 + w_2x_2 + w_3x_3 + w_4x_4)$$
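For readers who prefer code to equations, here is a minimal Python sketch of the same computation. The function name `neuron`, the example numbers, and the choice of `math.tanh` as a stand-in activation are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def neuron(inputs, weights, activation):
    """Compute y = f(w_1*x_1 + ... + w_n*x_n) for a single neuron."""
    # Multiply each input by its weight and add the results together (the sum, Sigma).
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Pass the sum through the activation function f to get the output y.
    return activation(weighted_sum)


# Hypothetical example values: four inputs and four weights.
example_inputs = [1.0, 2.0, 3.0, 4.0]
example_weights = [0.5, -1.0, 0.25, 0.0]

# With an activation that does nothing, the output is just the weighted sum:
# 0.5*1.0 + (-1.0)*2.0 + 0.25*3.0 + 0.0*4.0 = -0.75
raw_sum = neuron(example_inputs, example_weights, lambda s: s)

# With a non-linear activation (tanh is used here purely as a placeholder).
y = neuron(example_inputs, example_weights, math.tanh)
```

Passing the activation in as an argument keeps the weighted sum and the choice of $f$ separate, which mirrors how the equation is written.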

The activation function $f$ is typically a non-linear function.
Non-linear means that, if you were to plot the function on a graph, it would not be a straight line.
For example, one common function, the ReLU function, is shown as a graph below.

<center>

![ReLU](../images/relu.drawio.png)

</center>

In essence, it takes any number as input: if the number is positive, it returns the number; if the number is negative, it returns zero.
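Written as a formula, ReLU is $f(x) = \max(0, x)$. The short Python sketch below implements that rule directly; the function name `relu` and the sample values are chosen here for illustration only.

```python
def relu(x):
    """Return x if it is positive, otherwise return zero."""
    return max(0.0, x)


# A few sample inputs, with the results collected in a list.
sample_inputs = [-2.0, -0.5, 0.0, 1.5, 3.0]
sample_outputs = [relu(x) for x in sample_inputs]
# sample_outputs is [0.0, 0.0, 0.0, 1.5, 3.0]
```

Plotting `sample_outputs` against `sample_inputs` would give the bent line in the graph above: flat at zero for negative inputs, then rising in a straight line for positive ones.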


### A Neural Network

A neural network is a collection of neurons: in artificial neural networks, the neurons are typically arranged in layers.
The diagram below shows two different neural networks, each with four layers.
We say that these networks are equally deep because they have the same number of layers (four).

<center>

![Neural Network](../images/dense_sparse_nets.drawio.png)

</center>

However, notice that, for the network on the right, not every neuron is connected to every neuron in the next layer.
We say that the network on the right is *sparse* because it has a lot of connections missing.
Similarly, the network on the left is *dense* because every neuron is connected to every neuron in the next layer.
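One way to make the dense/sparse distinction concrete is to record which connections exist between two adjacent layers as a grid of 0s and 1s, often called a *mask*. The sketch below is illustrative only: the function names, the layer sizes, and the particular sparse pattern are assumptions and do not correspond to the networks in the diagram.

```python
def count_connections(mask):
    """Count the connections that are present (entries equal to 1)."""
    return sum(sum(row) for row in mask)


def sparsity(mask):
    """Return the fraction of possible connections that are missing."""
    total = len(mask) * len(mask[0])
    return 1.0 - count_connections(mask) / total


# Each row is a neuron in the next layer; each column is a neuron in the
# current layer. A 1 means "connected", a 0 means "no connection".
dense_mask = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
sparse_mask = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]

dense_count = count_connections(dense_mask)    # 12 connections
sparse_count = count_connections(sparse_mask)  # 4 connections
dense_sparsity = sparsity(dense_mask)          # 0.0, nothing is missing
sparse_sparsity = sparsity(sparse_mask)        # two thirds are missing
```

In this toy example the sparse layer keeps only 4 of the 12 possible connections, so it has a third of the weights to store and multiply.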

Sparse networks are considered to be more efficient than dense networks because they have fewer connections and can often perform just as well.
However, it is often much easier to find and train a dense network than a sparse one.

One of the main ways of mitigating this problem is to train a dense network first, and then prune it to make it sparse.
However, this requires fully training a dense network, which can be very computationally expensive and may not even be possible in most cases.
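To give a feel for what "pruning" means, here is a minimal sketch of one common criterion, magnitude pruning, which removes the weights closest to zero. This post does not say which pruning rule is meant, so the criterion, the function name `magnitude_prune`, and the example weights (standing in for an already trained layer) are all assumptions.

```python
def magnitude_prune(weights, fraction):
    """Zero out roughly `fraction` of the weights, smallest magnitudes first.

    If several weights tie with the threshold, all of them are removed,
    so slightly more than `fraction` may be pruned.
    """
    magnitudes = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(magnitudes) * fraction)
    if n_prune == 0:
        # Nothing to prune: return an unchanged copy.
        return [row[:] for row in weights]
    # Largest magnitude among the weights that will be removed.
    threshold = magnitudes[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


# Hypothetical weights of a small layer, pretending training has finished.
trained_weights = [
    [0.8, -0.05, 0.3],
    [0.01, -0.6, 0.2],
]

# Remove the half of the weights with the smallest magnitudes.
pruned_weights = magnitude_prune(trained_weights, 0.5)
# pruned_weights is [[0.8, 0.0, 0.3], [0.0, -0.6, 0.0]]
```

Note that the expensive part is not this step but obtaining `trained_weights` in the first place, which is exactly the cost described above.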

**The key question behind this blog post is: can we find sparse networks without training a dense one first? Our tentative answer is: yes!**