# DAML 10 - ANN Architectures

Michal Grochmal <michal.grochmal@city.ac.uk>

Apart from multiple perceptrons interconnected with each other,
several neural network architectures have been developed over the years.

## Deep Architectures

- Deep Neural Networks are simply networks with lots of layers,
  but training many layers through backpropagation turns out to be hard.
  The *vanishing gradient* problem is tackled by careful selection of activation functions.

- Convolutional Neural Nets are networks with one or more convolutional layers.
  These layers are not fully connected: each neuron sees only part of the input,
  which provides feature selection based on parts of the input.

- Recurrent Neural Networks have connections going backwards, i.e. the output
  of a neuron feeds into the input of a neuron in the same or previous layer.
  RNNs do not need to be deep networks but excel as such.

- Long Short-Term Memory (LSTM) networks are specifically constructed RNNs,
  built from gated memory cells in which each gate has its own specific activation function.

- Autoencoders are DNNs trained to reproduce at the output the patterns they were presented with.
  Since the target is the input itself, these are trained through unsupervised learning.
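
The *vanishing gradient* problem mentioned above can be made concrete with a little arithmetic.
The sketch below is only an illustration under simplifying assumptions: a chain of single neurons,
all weights equal to 1, and the same pre-activation value at every layer.
The helper names (`gradient_scale` and friends) are made up for this example.
Backpropagation multiplies one activation derivative per layer,
and the sigmoid derivative is never larger than 0.25.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of the sigmoid, its maximum is 0.25 at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0


def gradient_scale(activation_grad, pre_activation, n_layers):
    """Product of activation derivatives along a chain of n_layers neurons.

    Assumption: every weight is 1 and every layer sees the same
    pre-activation, so only the effect of the activation is visible.
    """
    scale = 1.0
    for _ in range(n_layers):
        scale *= activation_grad(pre_activation)
    return scale


# Best case for the sigmoid (x = 0) against an active ReLU (x = 1).
sigmoid_scale = gradient_scale(sigmoid_grad, 0.0, 10)
relu_scale = gradient_scale(relu_grad, 1.0, 10)
print(sigmoid_scale, relu_scale)
```

Even in the sigmoid's best case the gradient reaching the first of ten layers is scaled by 0.25 ten times over,
whilst an active ReLU passes it through unchanged.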

## Other Architectures

- Hopfield Networks are early *autoencoders*: they can reproduce a known
  pattern when presented with a similar one.

- Boltzmann Machines are networks of probabilistic neurons where all neurons
  are connected in all directions.
  Input and output both go through the same neurons (the visible neurons).

- Restricted Boltzmann Machines are BMs in which the hidden layer neurons
  are not interconnected.
  These are much easier to train than full BMs.

- Deep Belief Networks are RBMs stacked on top of each other.
  Each RBM can be trained separately, and we can stack several layers of RBMs.
  These were the early DNNs.

- Self-Organizing Maps are unsupervised networks for data visualization
  and dimensionality reduction.
  They use the concept that connections through which
  data passes should be reinforced whilst all other connections should decay.
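
The pattern recall of a Hopfield Network described above can be sketched in a few lines.
This is a minimal illustration, assuming bipolar (+1/-1) neurons, the Hebbian learning rule
and one-neuron-at-a-time updates; the function names `train_hopfield` and `recall`
are made up for this example and do not come from any library.

```python
def train_hopfield(patterns):
    """Hebbian rule: w[i][j] sums p[i] * p[j] over all stored patterns.

    The diagonal stays at zero, a neuron is not connected to itself.
    """
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_sweeps=10):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(w * s for w, s in zip(weights[i], state))
            new = 1 if total >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state


stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])

noisy = list(stored)
noisy[0] = -noisy[0]  # corrupt the pattern by flipping one neuron

recalled = recall(weights, noisy)
print(recalled == stored)
```

The network is presented with a pattern similar to the stored one and settles back onto the stored pattern.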

## TensorFlow

TensorFlow is currently the top library for most neural network computing.
At its core it is a directed acyclic graph (DAG) processing engine operating on tensors,
where tensors are pretty much multi-dimensional arrays (e.g. NumPy arrays).
Its main selling point is `tensorboard`, a web UI to monitor the processing
of the graph, and therefore to monitor the network training.
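
To make the idea of a DAG processing engine more tangible, here is a toy sketch in plain Python.
It is *not* TensorFlow's API: the `Node` class and the `constant`, `matmul`, `add` and `evaluate`
helpers are hypothetical names invented for this illustration, and nested lists stand in for tensors.
The point it tries to show is that the graph is built first and only computed when evaluated.

```python
class Node:
    """One operation in the graph, its inputs are other nodes."""

    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = tuple(inputs)


def constant(value):
    # A node without inputs that simply returns its value.
    return Node(lambda: value)


def matmul(a, b):
    def op(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
                 for j in range(len(y[0]))]
                for i in range(len(x))]
    return Node(op, (a, b))


def add(a, b):
    def op(x, y):
        return [[p + q for p, q in zip(row_x, row_y)]
                for row_x, row_y in zip(x, y)]
    return Node(op, (a, b))


def evaluate(node, cache=None):
    """Evaluate the inputs of a node first, computing each node only once."""
    if cache is None:
        cache = {}
    if id(node) not in cache:
        args = [evaluate(n, cache) for n in node.inputs]
        cache[id(node)] = node.op(*args)
    return cache[id(node)]


x = constant([[1.0, 2.0]])    # 1x2 input
w = constant([[3.0], [4.0]])  # 2x1 weights
b = constant([[0.5]])         # 1x1 bias

y = add(matmul(x, w), b)      # builds the graph, nothing is computed yet
result = evaluate(y)          # walks the graph and computes x @ w + b
print(result)
```

A real engine adds, among many other things, automatic differentiation and hardware placement on top of such a graph,
and a tool in the spirit of `tensorboard` visualises the nodes as they are processed.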

A demo of a similar interface (limited to a handful of 2-dimensional problems)
can be found at [TensorFlow Playground][tfpl].

[tfpl]: http://playground.tensorflow.org