## Setting up a perceptron as the neural network

In this part we are going to set up a network and look in detail at its loss function, its activation function and how to control its topology.

> At the moment TATi can only set up multi-layer perceptrons. We'll come to this at the very end.

Let's start by importing TATi's `simulation` module.

In [1]:
import TATi.simulation as tati

The `simulation` module is an especially *light-weight*, yet powerful interface to TATi.

> Although the class name is `Simulation` (inside a module `simulation`), we will refer to it as `tati` here.

It relies on a `dict` of **options**. These options control every aspect of TATi: the network, the dataset, how and what files are written, ...

A specific command `tati.help()` lists all available options.

In [2]:
tati.help()

averages_file:             CSV file name to write ensemble averages information such as average kinetic, potential, virial
batch_data_file_type:      type of the files to read input from
batch_data_files:          set of files to read input from
batch_size:                The number of samples used to divide sample set into batches in one sampleing step.
burn_in_steps:             number of initial steps to drop when computing averages
collapse_walkers:          Whether to regularly collapse all dependent walkers to restart from a single position again, maintaining harmonicapproximation for ensemble preconditioning. 0 will not collapse.
covariance_after_steps:    Number of steps after which to regularly recompute the covariance matrix. This will require communication between the walkers. 0 will never compute a covariance matrix.
covariance_blending:       mixing for preconditioning matrix to gradient update, identity matrix plus this times the covariance matrix obtained from the other 

We will not go through all of them, but let's at least take a closer look at one of them: **hidden_dimension**.

In [3]:
tati.help("hidden_dimension")   # mind that the option name needs to be a string

Option name: hidden_dimension
Description: Dimension of each hidden layer, e.g. 8 8 for two hidden layers each with 8 nodes fully connected
Type       : list of <class 'int'>
Default    : []


There we have the option's name, a brief description, its type and the default value. Here, the default of **hidden_dimension** is an empty list.

### Single-layer perceptron

Perceptrons have the following properties:

- input layer dimension
- output layer dimension
- number of hidden layers and their dimension
- additional drop-out layers
- activation function per node
- loss function

Each of these properties can be tuned with one of the options above.

The whole network is set up by instantiating `tati` with a given set of options. We can simply pass the options whose default value we want to change, by giving them as *keyword arguments (kwargs)* to the constructor of the class.

In [5]:
nn = tati(input_dimension=2,
          output_dimension=1,
          hidden_dimension=0,
          hidden_activation="linear", output_activation="relu",
          loss="mean_squared")

ValueError: Option hidden_dimension needs to be list of type <class 'int'>

Oops, we made a mistake! What was the type of **hidden_dimension** again?

Of course, we already knew that it needs to be a list of ints. So let's fix the above instantiation.

In [6]:
nn = tati(input_dimension=2,
          output_dimension=1,
          hidden_dimension=[0],
          hidden_activation="linear", output_activation="relu",
          loss="mean_squared")
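
The type check that tripped us up can be illustrated in a few lines of plain Python. This is only a sketch of the idea and not TATi's actual implementation: the helper name `check_list_of` is hypothetical, and the error text simply mimics the message shown above.

```python
def check_list_of(name, value, element_type):
    """Return value unchanged if it is a list of element_type, else raise.

    Hypothetical helper for illustration; not part of TATi.
    """
    is_valid = isinstance(value, list) and all(
        isinstance(item, element_type) for item in value
    )
    if not is_valid:
        raise ValueError(
            "Option %s needs to be list of type %s" % (name, element_type)
        )
    return value


# A bare int is rejected, a list of ints is accepted.
try:
    check_list_of("hidden_dimension", 0, int)
except ValueError as error:
    print("ValueError:", error)

print(check_list_of("hidden_dimension", [0], int))
```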

This single-layer perceptron should have three degrees of freedom. Let's check using `num_parameters()`.

In [7]:
print(nn.num_parameters())

AttributeError: Neural network has not been constructed, dataset provided?

The *dataset* is an essential part of the network. Its dimensions define the type and number of input and output nodes. Therefore, the network is internally *constructed only once a dataset is provided*.
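
To make the link between dataset and network dimensions concrete, here is a minimal sketch in plain Python. The helper `infer_dimensions` is hypothetical; it only mirrors the idea that the number of feature and label columns fixes the number of input and output nodes. TATi's own construction logic may well differ.

```python
def infer_dimensions(features, labels):
    """Derive (input_dimension, output_dimension) from a dataset.

    Both arguments are lists of lists: one inner list per sample.
    Hypothetical helper for illustration; not part of TATi.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels need the same number of samples")
    if not features:
        raise ValueError("dataset must contain at least one sample")
    return len(features[0]), len(labels[0])


# One sample with two features and one label.
print(infer_dimensions([[0.0, 0.0]], [[1]]))
```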

Let us provide a dummy dataset by assigning it to `nn.dataset` and check again for the number of degrees of freedom.

In [8]:
import numpy as np
# mind that features, labels need to be lists of lists
nn.dataset = [np.asarray([[0,0]], dtype=np.float32), np.asarray([[1]], dtype=np.int32)]  
print(nn.num_parameters())

3


That's the correct result: two weights and a single bias.

#### Inspecting the options

In case you are curious about the options inside `tati`, you can inspect its private member variable `_options`.

In [9]:
print(nn._options)

{'averages_file': None, 'batch_data_file_type': csv, 'batch_data_files': [], 'batch_size': 1, 'burn_in_steps': 0, 'collapse_walkers': False, 'covariance_after_steps': 100, 'covariance_blending': 0.0, 'diffusion_map_method': vanilla, 'dimension': 1, 'directions_file': None, 'do_hessians': False, 'dropout': None, 'every_nth': 1, 'fix_parameters': None, 'friction_constant': 0.0, 'hamiltonian_dynamics_time': 1.0, 'hidden_activation': linear, 'hidden_dimension': [0], 'in_memory_pipeline': True, 'input_columns': [], 'input_dimension': 2, 'inter_ops_threads': 1, 'intra_ops_threads': None, 'inverse_temperature': 1.0, 'learning_rate': 0.03, 'loss': mean_squared, 'max_steps': 1000, 'number_of_eigenvalues': 4, 'number_walkers': 1, 'optimizer': GradientDescent, 'output_activation': relu, 'output_dimension': 1, 'output_type': binary_classification, 'parse_parameters_file': None, 'parse_steps': [], 'prior_factor': 1.0, 'prior_lower_boundary': None, 'prior_power': 1.0, 'prior_upper_boundary': None, '

> *WARNING:* Do not change `nn._options["input_dimension"]=3` directly; use `tati.set_options()` instead.

This is because some options affect the network topology so severely that the network would need to be reinstantiated. `set_options()` takes this into account ...

In [10]:
nn.set_options(input_dimension=3)

ValueError: Changing the network is not yet supported.

... and properly warns you in case the change is too severe.

#### Fixing degrees of freedom

Consider the case where we only want a network with two weights and no bias. The bias is removed if we set it to zero. How can we fix the single bias to this value?

> Fixing the bias is essentially changing the network, hence we need to add this parameter at the start. Let's reinstantiate `tati`.

In [12]:
nn = tati(input_dimension=2,
          output_dimension=1,
          fix_parameters="output/biases/Variable:0=0.",
          hidden_dimension=[0],
          hidden_activation="linear", output_activation="relu",
          loss="mean_squared")

# mind that features, labels need to be lists of lists
nn.dataset = [np.asarray([[0,0]], dtype=np.float32), np.asarray([[1]], dtype=np.int32)]  
print(nn.num_parameters())

2


This is the critical change! The bias degree of freedom has been effectively removed.

The above string `output/biases/Variable:0=0.` needs some explanation. The string addresses a particular variable inside TensorFlow, namely `Variable:0` in the name scopes `output` and `biases`. Moreover, we assign (`=`) this variable the fixed value `0.`. In case you want to fix a weight, replace `biases` by `weights`. In case it is the first hidden layer, use `layer1` in place of `output`. If the name cannot be found, you'll get a helpful error message.

> There need to be as many values as the variable has components (comma-separated list). Moreover, it is not possible to fix single components. At the moment only all weights of a layer or all biases of a layer can be fixed.
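
To see how such a string decomposes, the following sketch splits it into name scopes, variable name and values. The function `parse_fix_parameter` is a hypothetical helper written for this illustration; it is not how TATi parses the option internally, and it handles only a single `name=values` entry.

```python
def parse_fix_parameter(spec):
    """Split 'scope/.../Variable:0=v1,v2,...' into scopes, variable, values.

    Hypothetical helper for illustration; not part of TATi.
    """
    name, separator, value_string = spec.partition("=")
    if not separator or not name or not value_string:
        raise ValueError("expected '<name>=<comma-separated values>': %r" % spec)
    # Everything before the last '/' is a name scope, the rest is the variable.
    *scopes, variable = name.split("/")
    values = [float(item) for item in value_string.split(",")]
    return scopes, variable, values


print(parse_fix_parameter("output/biases/Variable:0=0."))
```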

### Multi-layer perceptron

Let us return to `set_options()` and changing *hidden_dimension*.

What do you do in case you really want a different network? You need to reinstantiate `tati` with a different set of options. This will automatically reset TensorFlow's internal computational graph.

> Therefore, you cannot have two instances of `tati` at the same time.

Let's add two hidden layers, each with 8 nodes. Moreover, we want to use the *sigmoid* function for activation. Finally, we need to use the cross entropy function with softmax as the loss function.

In [13]:
nn = tati(input_dimension=2,
          output_dimension=1,
          hidden_dimension=[8, 8],
          hidden_activation="sigmoid", output_activation="relu",
          loss="softmax_cross_entropy")

Let us check the number of degrees of freedom. Note that this becomes less tedious once you see how to pass a dataset easily.

In [14]:
# mind that features, labels need to be lists of lists
nn.dataset = [np.asarray([[0,0]], dtype=np.float32), np.asarray([[1]], dtype=np.int32)]  
print(nn.num_parameters())

105


Let us briefly check whether this is true: 

In [15]:
print(2*8+8*8+8*1+8+8+1)

105


Seems correct.
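
The hand calculation generalizes to any fully connected perceptron: each pair of consecutive layers contributes one weight per connection, and every non-input node contributes one bias. The sketch below is plain Python and independent of TATi; the function name `count_parameters` is hypothetical, and treating a hidden layer of size 0 as "no hidden layer" is an assumption inferred from the result of 3 obtained with `hidden_dimension=[0]`.

```python
def count_parameters(input_dim, hidden_dims, output_dim):
    """Count weights and biases of a fully connected perceptron.

    Hypothetical helper for illustration; not part of TATi.
    """
    # Assumption: a hidden layer of size 0 stands for "no hidden layer".
    sizes = [input_dim] + [h for h in hidden_dims if h > 0] + [output_dim]
    # One weight per connection between consecutive layers.
    weights = sum(a * b for a, b in zip(sizes[:-1], sizes[1:]))
    # One bias per node in every layer except the input layer.
    biases = sum(sizes[1:])
    return weights + biases


print(count_parameters(2, [0], 1))     # single-layer perceptron
print(count_parameters(2, [8, 8], 1))  # two hidden layers with 8 nodes each
```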

That is all there is to setting up the network. Next we will look at specifying the dataset.