# Homework 12

## ASTR 5900, Fall 2017, University of Oklahoma

### Neural Networks

# Problem 1

Imagine a neural network that has already been trained to classify configurations of classical spins in a lattice as one of two magnetic phases: ferromagnetic or paramagnetic.  A training set was generated using Monte Carlo at a range of temperatures, and the target classifications were determined based on their temperatures relative to the known critical temperature for the 2D ferromagnetic Ising model with Hamiltonian $$ H = -J\sum_{\langle i,j \rangle} \sigma^z_i \sigma^z_j $$
where $\sigma^z_k \in \{-1, 1 \}$ is the spin at site $k$ and the summation is over adjacent spin sites (sites that share an edge).  This Hamiltonian says that each pair of adjacent sites with opposite spins raises the energy, while each aligned pair lowers it.  Because of this, at low temperatures the spins generally point in a single direction (either up or down).  As the temperature increases and crosses the critical temperature $T_C$, this aggregate behavior fades and the Ising system becomes disordered with an average magnetization of zero.  This happens because at higher temperatures the lattice is more likely to reach configurations with high energy (i.e. those with opposing spins next to each other).


### Part A

In this problem we want to locate the critical temperature of the Ising model.  To do this we will study the output of the trained neural net mentioned in the above prompt from a series of samples that we will generate at various temperatures.

In this problem we will study square lattices of size $10 \times 10$ with periodic boundary conditions.
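As an illustration of what periodic boundary conditions mean for the Hamiltonian, here is one possible sketch of an energy calculation. It assumes $J = 1$ by default and a 2D NumPy array of $\pm 1$ values; using `np.roll` to reach the wrapped-around neighbors is an implementation choice, not something the assignment prescribes.

```python
import numpy as np

def energy(spins, J=1.0):
    """Energy of a 2D array of +/-1 spins under H = -J * sum over adjacent pairs."""
    # np.roll wraps around the edges, which gives periodic boundaries.
    # Pairing each site with its right and lower neighbor counts every bond once.
    right = np.roll(spins, -1, axis=1)
    down = np.roll(spins, -1, axis=0)
    return -J * np.sum(spins * (right + down))

uniform = np.ones((10, 10), dtype=int)
flipped = uniform.copy()
flipped[3, 4] = -1  # flip a single spin

print(energy(uniform))  # -200.0: 200 bonds, each contributing -J
print(energy(flipped))  # -192.0: four bonds change from -J to +J
```

A $10 \times 10$ periodic lattice has $2N = 200$ bonds, so the fully aligned state has energy $-200J$ and a single flipped spin costs $8J$.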

Write 3 Python functions that will be used to create our spin samples.  The first function `energy` should take a spin configuration (an array of 1's and -1's) and return its energy as determined from the Hamiltonian above.  The second is the `metropolis` function, which takes a temperature as an argument and returns a set of spin configurations sampled at that temperature.  This must be performed using a Metropolis-Hastings algorithm, described below:

1.  Initialize a uniform spin configuration $s$.
2. Loop $n$ times
    1. Generate a spin configuration $s'$ that is a 'neighbor' of $s$
    2. Assign variable `a` = $\text{min}(1, \exp(\frac{E(s) - E(s')}{T}))$, where $E(s)$ and $E(s')$ are the energies of the current and proposed configurations
    3. With probability `a` accept state $s'$: set $s = s'$ and store it in your sample array.  If rejected, keep $s$ and store it again.
3. Return the array of stored states

The third function is `neighbor` and it returns a 'neighboring' state that is close to the input state.  It takes a state as an argument, makes a copy of it with `numpy.copy`, and flips the spin of 3 random sites in the copied state.  The altered state is then returned.
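The three functions described above could fit together roughly as follows. This is a minimal sketch under stated assumptions, not a reference solution: $J = 1$, temperature in units where $k_B = 1$, the 3 flipped sites are taken to be distinct, and the extra arguments `n`, `L`, `seed`, `rng`, and `n_flips` are hypothetical conveniences that the assignment does not specify.

```python
import numpy as np

def energy(spins, J=1.0):
    """Energy under H = -J * sum over adjacent pairs, periodic boundaries."""
    right = np.roll(spins, -1, axis=1)
    down = np.roll(spins, -1, axis=0)
    return -J * np.sum(spins * (right + down))

def neighbor(state, rng=None, n_flips=3):
    """Return a copy of `state` with `n_flips` randomly chosen spins flipped."""
    rng = np.random.default_rng() if rng is None else rng
    new_state = np.copy(state)
    # Assumption: the flipped sites are distinct (sampled without replacement).
    sites = rng.choice(state.size, size=n_flips, replace=False)
    rows, cols = np.unravel_index(sites, state.shape)
    new_state[rows, cols] *= -1
    return new_state

def metropolis(T, n=10000, L=10, seed=None):
    """Return an (n, L, L) array of configurations sampled at temperature T."""
    rng = np.random.default_rng(seed)
    s = np.ones((L, L), dtype=int)      # step 1: uniform initial configuration
    E = energy(s)
    samples = np.empty((n, L, L), dtype=int)
    for k in range(n):                  # step 2: loop n times
        s_new = neighbor(s, rng)
        E_new = energy(s_new)
        # a = min(1, exp((E - E_new) / T)); the branch avoids overflow in exp.
        a = 1.0 if E_new <= E else np.exp((E - E_new) / T)
        if rng.random() < a:            # accept with probability a
            s, E = s_new, E_new
        samples[k] = s                  # store the current state either way
    return samples                      # step 3

samples = metropolis(T=1.0, n=200, seed=0)
print(samples.shape)  # (200, 10, 10)
```

Caching the current energy `E` avoids recomputing it on every iteration; only the proposed configuration needs a fresh call to `energy`.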

### Part B

It's time to generate data with the functions we just defined and apply it to the trained neural net.  As you know, a standard feedforward neural network is a series of layers of connected neurons that fire (or fractionally fire) based on the weighted sum of the outputs of their respective previous layer.

The neural network in question has 3 layers: an input layer of size 100 (because the input is a spin configuration on a $10 \times 10$ lattice), a hidden layer with 3 perceptrons, and an output layer of 2 perceptrons.  The weights and biases of the hidden layer are:

$$ W_1 = \frac{1}{N(1 + \epsilon)}
 \begin{pmatrix}
  1 & 1 & \cdots & 1 \\
  -1 & -1 & \cdots & -1 \\
  1 & 1 & \cdots & 1
 \end{pmatrix} \; \; \text{and} \; \; b_1 = \frac{\epsilon}{1+\epsilon} \begin{pmatrix}
  -1 \\
  -1 \\
  1
 \end{pmatrix}$$
 
This works out to be $$ W_1 x + b_1 = \frac{1}{1 + \epsilon}\begin{pmatrix}
  m(x) - \epsilon \\
  -m(x) - \epsilon \\
  m(x) + \epsilon
 \end{pmatrix}$$
 
where $m(x) = \frac{1}{N}\sum_i \sigma^z_i$ is the magnetization per site of configuration $x$ and $N = 100$ is the number of lattice sites.  $\epsilon$ is a parameter ranging from 0 to 1 that determines how polarized the spins have to be for a configuration to be classified as ferromagnetic rather than paramagnetic.  Start with $\epsilon = 0.3$, but feel free to vary it and discuss its effect.  A Heaviside step function is applied to each of these elements because the hidden neurons are perceptrons.
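The hidden-layer expression can be checked numerically. The sketch below builds the weight matrix and bias from the definitions above and compares the weighted sum against the closed form in terms of $m(x)$. The helper name `hidden_layer` is hypothetical, and the convention that the step function returns 0 at exactly 0 is an assumption.

```python
import numpy as np

def hidden_layer(x, eps=0.3):
    """Return (pre-activation, perceptron outputs) of the 3 hidden neurons."""
    x = np.asarray(x, dtype=float).ravel()
    N = x.size
    W1 = np.vstack([np.ones(N), -np.ones(N), np.ones(N)]) / (N * (1 + eps))
    b1 = eps / (1 + eps) * np.array([-1.0, -1.0, 1.0])
    z = W1 @ x + b1
    # Assumption: the Heaviside step is taken to be 0 at exactly z = 0.
    return z, np.heaviside(z, 0.0)

eps = 0.3
all_up = np.ones(100)
all_down = -np.ones(100)
half = np.concatenate([np.ones(50), -np.ones(50)])  # m(x) = 0

for x in (all_up, all_down, half):
    z, h = hidden_layer(x, eps)
    m = x.mean()
    closed_form = np.array([m - eps, -m - eps, m + eps]) / (1 + eps)
    print(m, np.allclose(z, closed_form), h)
```

The three hidden neurons report, in order, whether $m(x) > \epsilon$, whether $m(x) < -\epsilon$, and whether $m(x) > -\epsilon$.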
 
The output layer's weights and biases are:
$$ W_2 = 
 \begin{pmatrix}
  2 & 1 &  -1 \\
      -2 & -2 & 1
 \end{pmatrix} \; \; \text{and} \; \; b_2 =\begin{pmatrix}
  0 \\
    0
 \end{pmatrix}$$

The output features 2 neurons: one that fires if the input is in the cold, ferromagnetic state and the other if the input is in the warm, paramagnetic state.  Write a function `toy_model` that maps an Ising configuration to a vector of length 2 with these specifications.  This function will be the neural net, effectively.
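One way the full forward pass might look is sketched below. It assumes a Heaviside step on the output layer as well (since the output neurons are described as perceptrons) and takes the step to be 0 at exactly 0; the `eps` keyword argument is a hypothetical convenience for varying $\epsilon$.

```python
import numpy as np

def toy_model(config, eps=0.3):
    """Map an Ising configuration to (ferromagnetic, paramagnetic) outputs."""
    x = np.asarray(config, dtype=float).ravel()
    N = x.size

    # Hidden layer: 3 perceptrons acting on the magnetization per site.
    W1 = np.vstack([np.ones(N), -np.ones(N), np.ones(N)]) / (N * (1 + eps))
    b1 = eps / (1 + eps) * np.array([-1.0, -1.0, 1.0])
    hidden = np.heaviside(W1 @ x + b1, 0.0)

    # Output layer: 2 perceptrons (assumption: step activation here too).
    W2 = np.array([[2.0, 1.0, -1.0],
                   [-2.0, -2.0, 1.0]])
    b2 = np.zeros(2)
    return np.heaviside(W2 @ hidden + b2, 0.0)

ordered = np.ones((10, 10))
disordered = np.indices((10, 10)).sum(axis=0) % 2 * 2 - 1  # checkerboard, m = 0

print(toy_model(ordered))     # [1. 0.]
print(toy_model(-ordered))    # [1. 0.]
print(toy_model(disordered))  # [0. 1.]
```

Both fully polarized states activate the first (ferromagnetic) neuron, while a configuration with zero net magnetization activates the second (paramagnetic) neuron.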

Sample 20 evenly spaced temperatures between 1 and 5 inclusive.  For each temperature, generate 10000 spin configurations with `metropolis`.  Determine the average of each output neuron over those configurations, and plot the results as a function of temperature on the same figure.  The critical temperature is understood to be the point at which the two lines cross.  What is $T_C$?
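Once the two averaged curves are available, the crossing point can be read off the plot or estimated by linear interpolation between the two grid temperatures that bracket it. The sketch below shows the interpolation step only; the function name `crossing_temperature` is hypothetical, and the arrays in the usage example are synthetic placeholders, not results from the Ising simulation.

```python
import numpy as np

def crossing_temperature(temps, ferro, para):
    """Estimate where the two curves cross by linear interpolation.

    Returns None if the curves do not cross on the sampled grid.
    """
    temps = np.asarray(temps, dtype=float)
    diff = np.asarray(ferro, dtype=float) - np.asarray(para, dtype=float)
    # Find the first interval where the difference changes sign or hits zero.
    idx = np.where(diff[:-1] * diff[1:] <= 0)[0]
    if idx.size == 0:
        return None
    k = idx[0]
    if diff[k] == diff[k + 1]:  # both zero: curves coincide on this interval
        return temps[k]
    frac = diff[k] / (diff[k] - diff[k + 1])
    return temps[k] + frac * (temps[k + 1] - temps[k])

# Synthetic placeholder curves, for illustration only.
temps = [1.0, 2.0, 3.0, 4.0]
ferro = [1.0, 0.8, 0.2, 0.0]
para = [0.0, 0.2, 0.8, 1.0]
print(crossing_temperature(temps, ferro, para))  # 2.5
```

With `np.linspace(1, 5, 20)` as the temperature grid, the spacing is about 0.21, so the interpolated estimate is only as precise as the grid and the Monte Carlo noise allow.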

# Problem Z

Comment on the amount of time this assignment required.