In [1]:
import numpy as np
from numpy import tanh
import matplotlib.pyplot as plt
import random
from math import sqrt
from model import Model
from rbm import RBM
from rbm_operator import Operator, Sx_, Sy_, Sz_, SzSz_, set_h_Hamiltonian, set_J1_Hamiltonian, set_J2_Hamiltonian

# Simple attempt to implement a variational quantum state calculation 

Reminder: the operators are defined as 
$$S\cdot S = S_x \cdot S_x + S_y \cdot S_y + S_z \cdot S_z = \frac{1}{2} (S_+ \cdot S_- + S_- \cdot S_+) + S_z \cdot S_z$$

where $S_z = \frac{\hbar}{2} \begin{pmatrix} 1 & 0 \\ 0  & -1 \end{pmatrix}$, $S_+ = \hbar \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = S_x + i S_y$, $S_- = \hbar \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = S_x - i S_y$. For two spins coupled by $J\, S_1 \cdot S_2$ (with $\hbar = 1$), the singlet state has energy $-3J/4$ and the triplet states have energy $J/4$.
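As a quick numerical check of the reminder above, the sketch below builds the spin-1/2 matrices with NumPy and diagonalizes $S_1 \cdot S_2$ for two spins. It assumes $\hbar = 1$ and $J = 1$; the variable names are illustrative and are not taken from `rbm_operator`.

```python
import numpy as np

# Spin-1/2 operators with hbar = 1 (an assumption of this sketch).
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
Sp = np.array([[0, 1], [0, 0]], dtype=complex)   # S_+ = S_x + i S_y
Sm = np.array([[0, 0], [1, 0]], dtype=complex)   # S_- = S_x - i S_y
Sx = 0.5 * (Sp + Sm)
Sy = -0.5j * (Sp - Sm)

# Two-spin coupling S_1 . S_2 written in Cartesian and in ladder form.
SdotS_xyz = np.kron(Sx, Sx) + np.kron(Sy, Sy) + np.kron(Sz, Sz)
SdotS_ladder = 0.5 * (np.kron(Sp, Sm) + np.kron(Sm, Sp)) + np.kron(Sz, Sz)

# eigvalsh returns the eigenvalues of a Hermitian matrix in ascending order.
eigenvalues = np.linalg.eigvalsh(SdotS_xyz)
print(np.allclose(SdotS_xyz, SdotS_ladder))
print(np.round(eigenvalues, 6))
```

The two forms agree, and the spectrum consists of one singlet level at $-3/4$ and three triplet levels at $+1/4$.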

As an example, we start with a very simple J1/J2 Heisenberg spin chain.

We define a class `Model` that contains a Hamiltonian and a rule for generating spin configurations according to the MCMC algorithm.

We then need a way to obtain the wavefunction amplitude $\psi_\theta(s)$ of a spin configuration $s$, where $\theta$ denotes the variational parameters.

From $\psi_\theta(s)$ evaluated on the sampled spin configurations, we can produce an estimate of the energy of the variational ground state.

Then we can perform a stochastic gradient descent based on it.

# Implementation of the RBM class

With classes for the spin configuration and the Hamiltonian in place, consider a wavefunction object that takes a spin configuration and returns a number. The variational quantum state is defined as 

\begin{align*}
\Psi_M(S;W) = \sum_{h_i} e^{\sum_j a_j \sigma^z_j + \sum_i b_i h_i + \sum_{ij} W_{ij} h_i \sigma^{z}_j}
\end{align*}

Here the free parameters of the model are $a_j, b_i, W_{ij}$, and $h_1, \dots, h_M$ are auxiliary spin variables in the network. The internal spins can be explicitly traced out to give

\begin{align*}
\Psi_M(S;W) = e^{\sum_j a_j \sigma^z_j} \times \prod_{i=1}^M F_i(S)
\end{align*}

where

\begin{align*}
F_i(S) = 2\cosh\left[ b_i + \sum_j W_{ij} \sigma^z_j\right] =  2\cosh\theta_i
\end{align*}

The object would be a neural network; the most basic example is the RBM of Carleo and Troyer, whose amplitude has the closed form above. Here $a_j$ are the biases of the visible layer (the physical spins), $b_i$ are the biases of the hidden layer (an arbitrary number $M$ of hidden spins), and $W_{ij}$ are the weights connecting hidden spin $i$ to physical spin $j$.
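To make the traced-out form concrete, here is a minimal sketch that evaluates $\log\Psi_M(S;W)$ and compares it with the explicit sum over all hidden configurations. It is not the notebook's `RBM` class: the function names are hypothetical, spins are stored as $\pm 1$ entries rather than one-hot pairs, and the parameters are taken real for simplicity, whereas the weights printed later in the notebook are complex.

```python
import itertools
import numpy as np

def log_psi(sigma, a, b, W):
    """log Psi(S) with the hidden units traced out; W has shape (M, N)."""
    theta = b + W @ sigma
    return a @ sigma + np.sum(np.log(2.0 * np.cosh(theta)))

def psi_brute_force(sigma, a, b, W):
    """Explicit sum over all 2^M hidden configurations h_i = +/-1."""
    total = 0.0
    for h in itertools.product([-1, 1], repeat=len(b)):
        h = np.array(h)
        total += np.exp(a @ sigma + b @ h + h @ W @ sigma)
    return total

rng = np.random.default_rng(0)
n_visible, n_hidden = 4, 3
a = rng.normal(scale=0.1, size=n_visible)
b = rng.normal(scale=0.1, size=n_hidden)
W = rng.normal(scale=0.1, size=(n_hidden, n_visible))
sigma = np.array([1, -1, 1, 1])

closed_form = np.exp(log_psi(sigma, a, b, W))
explicit_sum = psi_brute_force(sigma, a, b, W)
print(closed_form, explicit_sum)
```

Working with $\log\Psi$ rather than $\Psi$ keeps the product over hidden units from overflowing and is also the quantity whose derivatives enter the gradient.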

I need three functions: one to compute the variational energy, one to compute its gradient, and one to perform the gradient descent.


# Operators and derivatives: 

Note that an operator can be characterized by its local value

\begin{align*}
O_{loc}(s) = \sum_{s'}  \left< s\middle| O \middle| s' \right>\frac{\left<s' \middle| \Psi_\theta\right>}{\left<s \middle| \Psi_\theta\right>}
\end{align*}

and the operator's expectation value can be reasonably approximated by 

\begin{align*}
\left<O \right> = \sum_{s} P(s) O_{loc}(s) \approx \frac{1}{N_s} \sum_{i=1}^{N_s} O_{loc}(s_i)
\end{align*}

where $P(s) = |\left<s\middle|\Psi_\theta\right>|^2 / \sum_{s'} |\left<s'\middle|\Psi_\theta\right>|^2$ and the $N_s$ configurations $s_i$ are sampled from $P(s)$.
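The local-value identity can be checked exactly on a small system. The sketch below assumes a hypothetical random amplitude vector for three spins and uses $S^x$ on site 0, for which the only nonzero matrix element connects $s$ to the configuration with that spin flipped. None of these names come from `rbm_operator`.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n_spins = 3
configs = [np.array(c) for c in itertools.product([1, -1], repeat=n_spins)]
index = {tuple(c): k for k, c in enumerate(configs)}

# Hypothetical unnormalized complex amplitudes psi(s), one per configuration.
psi = rng.normal(size=len(configs)) + 1j * rng.normal(size=len(configs))

def sx_local(s, site=0):
    """O_loc(s) for S^x on `site`: <s|S^x|s'> = 1/2 only when s' is s
    with the spin on `site` flipped."""
    flipped = s.copy()
    flipped[site] *= -1
    return 0.5 * psi[index[tuple(flipped)]] / psi[index[tuple(s)]]

# Exact average over P(s) = |psi(s)|^2 / sum |psi|^2 (no sampling noise).
prob = np.abs(psi) ** 2 / np.sum(np.abs(psi) ** 2)
estimate = sum(p * sx_local(s) for p, s in zip(prob, configs))

# Reference: dense matrix element <psi|S^x_0|psi> / <psi|psi>.
sx = 0.5 * np.array([[0, 1], [1, 0]])
sx_full = np.kron(sx, np.eye(2 ** (n_spins - 1)))
exact = np.vdot(psi, sx_full @ psi) / np.vdot(psi, psi)
print(estimate, exact)
```

Note that the local value of an off-diagonal operator on a single configuration is in general complex; only the $P(s)$-weighted average has to be real.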

Now consider the energy minimization. Let $O_p(s) = \frac{\partial}{\partial \theta_p} \log \left<s\middle|\Psi_\theta\right> = \left<s\middle|O_p \middle| s\right> $,

\begin{align*}
\frac{\partial E(\theta)}{\partial \theta_p} &= 2 \Re \left[ \left\langle E_{\text{loc}}(\mathbf{s}) O_p^*(\mathbf{s}) \right\rangle - \left\langle E_{\text{loc}}(\mathbf{s}) \right\rangle \left\langle O_p^*(\mathbf{s}) \right\rangle \right] \\
&= 2 \Re \left[ \left\langle (E_{\text{loc}}(\mathbf{s}) - \left\langle E_{\text{loc}}(\mathbf{s}) \right\rangle) O_p^*(\mathbf{s}) \right\rangle \right]
\end{align*}

To evaluate the derivatives, note that, with $\theta_i(S) = b_i + \sum_j W_{ij}\sigma^z_j$ ($i$ labels hidden spins, $j$ labels physical spins),

\begin{align*}
O_{a_j} = \frac{\partial}{\partial a_j} \log \left<s\middle|\Psi_\theta\right> &= \sigma_j^z \\
O_{b_i} = \frac{\partial}{\partial b_i} \log \left<s\middle|\Psi_\theta\right> &= \tanh(\theta_i(S)) \\
O_{W_{ij}} = \frac{\partial}{\partial W_{ij}} \log \left<s\middle|\Psi_\theta\right> &= \sigma^z_j\tanh(\theta_i(S))
\end{align*}

It follows that the gradient is given by (up to proportionality factors)

\begin{align*}
\frac{\partial E(\theta)}{\partial a_j}  &= \left[ \left\langle E_{\text{loc}}(\mathbf{s}) \sigma_j^z \right\rangle - \left\langle E_{\text{loc}}(\mathbf{s}) \right\rangle \left\langle \sigma_j^z \right\rangle \right] \\
\frac{\partial E(\theta)}{\partial b_i}  &= \left[ \left\langle E_{\text{loc}}(\mathbf{s}) \tanh(\theta_i(s)) \right\rangle - \left\langle E_{\text{loc}}(\mathbf{s}) \right\rangle \left\langle \tanh(\theta_i(s)) \right\rangle \right] \\
\frac{\partial E(\theta)}{\partial W_{ij}}  &= \left[ \left\langle E_{\text{loc}}(\mathbf{s}) \sigma^z_j\tanh(\theta_i(s)) \right\rangle - \left\langle E_{\text{loc}}(\mathbf{s}) \right\rangle \left\langle \sigma^z_j\tanh(\theta_i(s)) \right\rangle \right]
\end{align*}
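The log-derivatives and the sample-based gradient can be sketched as follows. This is an illustration under assumptions, not the notebook's `rbm.get_deltas`: the function names are hypothetical, spins are $\pm 1$ entries, the parameters are real, and `W` is stored with shape (hidden, visible). A finite-difference check confirms the $W$ derivative.

```python
import numpy as np

def log_psi(sigma, a, b, W):
    theta = b + W @ sigma
    return a @ sigma + np.sum(np.log(2.0 * np.cosh(theta)))

def log_derivatives(sigma, a, b, W):
    """Return (O_a, O_b, O_W) for one configuration; W has shape (M, N)."""
    t = np.tanh(b + W @ sigma)
    return sigma.astype(float), t, np.outer(t, sigma)

def energy_gradient(e_loc, O):
    """2 Re[<E_loc O*> - <E_loc><O*>] from per-sample arrays.
    e_loc has shape (n_samples,), O has shape (n_samples, ...)."""
    e_centered = e_loc - e_loc.mean()
    e_centered = e_centered.reshape((-1,) + (1,) * (O.ndim - 1))
    return 2.0 * np.real(np.mean(e_centered * np.conj(O), axis=0))

rng = np.random.default_rng(2)
a, b = rng.normal(size=4), rng.normal(size=3)
W = rng.normal(size=(3, 4))
sigma = np.array([1, -1, -1, 1])

# Finite-difference check of the derivative with respect to W[1, 2].
eps = 1e-6
W_shift = W.copy()
W_shift[1, 2] += eps
numeric = (log_psi(sigma, a, b, W_shift) - log_psi(sigma, a, b, W)) / eps
analytic = log_derivatives(sigma, a, b, W)[2][1, 2]
print(numeric, analytic)
```

Centering the local energies before averaging is equivalent to subtracting $\langle E_{\text{loc}}\rangle\langle O^*\rangle$, so a batch with constant local energy yields a zero gradient.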


Technically, I can use naive gradient descent, but it tends to converge poorly because the wavefunction depends on the parameters in a highly non-linear and correlated way. Stochastic reconfiguration is known to perform better. 
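For reference, a stochastic-reconfiguration step preconditions the gradient with the covariance matrix of the log-derivatives. The sketch below is one common way to write it and is not the notebook's `rbm.train`; the function name `sr_update` and the `learning_rate` and `regularization` parameters are assumptions made for illustration.

```python
import numpy as np

def sr_update(e_loc, O, learning_rate=0.05, regularization=1e-3):
    """One stochastic-reconfiguration step from per-sample data.

    e_loc: shape (n_samples,), local energies.
    O:     shape (n_samples, n_params), flattened log-derivatives.
    Returns the change to add to the current parameters.
    """
    n_samples = O.shape[0]
    O_centered = O - O.mean(axis=0)
    # Covariance matrix S_pq = <O_p^* O_q> - <O_p^*><O_q>.
    S = O_centered.conj().T @ O_centered / n_samples
    # Force vector F_p = <E_loc O_p^*> - <E_loc><O_p^*>.
    F = O_centered.conj().T @ (e_loc - e_loc.mean()) / n_samples
    # A diagonal shift keeps S invertible when samples are few or redundant.
    S_reg = S + regularization * np.eye(S.shape[0])
    return -learning_rate * np.linalg.solve(S_reg, F)

rng = np.random.default_rng(3)
O = rng.normal(size=(200, 5)) + 1j * rng.normal(size=(200, 5))
e_loc = rng.normal(size=200)
delta = sr_update(e_loc, O)
print(delta.shape)
```

Setting `S_reg` to the identity recovers plain gradient descent (up to the factor of 2), which makes it easy to compare the two update rules on the same batch.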

## Sanity check: expectation value of operators 

### Two-site operators (SzSz only, for now)

In [4]:
model = Model(2,2)
rbm = RBM(model)
average_expectations = [rbm.expectation_value(SzSz_(0,0,0,1,model), np.array([[[1, 0], [0, 1]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation <SzSz|ud__|SzSz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
average_expectations = [rbm.expectation_value(SzSz_(0,0,0,1,model), np.array([[[0, 1], [0, 1]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation  <Sz|dd__|Sz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
rbm = RBM(model)
batch = rbm.create_batch(200) #This uses evaluate_dummy, which gives 2/3 for spin up and 1/3 for spin down 
average_expectations = [rbm.expectation_value(SzSz_(0,0,0,1,model), batch[i]) for i in range(len(batch))]
print(f"The average expectation for a 2:1 mixed state is {np.mean(average_expectations)})")

The average expectation <SzSz|ud__|SzSz> is (-0.25+0j) with standard deviation 0.0
The average expectation  <Sz|dd__|Sz> is (0.25+0j) with standard deviation 0.0
The average expectation for a 2:1 mixed state is (-0.0125+0j))


### One-site operator (Sz makes sense. Unsure if Sx does, but I'll sweep it under the rug for now.)

In [18]:
model = Model(2,2)
rbm = RBM(model)
average_expectations = [rbm.expectation_value(Sz_(0,0,model), np.array([[[1, 0], [0, 1]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation <Sz|u___|Sz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
average_expectations = [rbm.expectation_value(Sz_(0,0,model), np.array([[[0, 1], [1, 0]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation  <Sz|d___|Sz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
rbm = RBM(model)
batch = rbm.create_batch(200) #This uses evaluate_dummy, which gives 2/3 for spin up and 1/3 for spin down 
average_expectations = [rbm.expectation_value(Sz_(0,0,model), batch[i]) for i in range(len(batch))]
print(f"The average expectation for a 2:1 mixed state is {np.mean(average_expectations)})")

The average expectation <Sz|u___|Sz> is (0.5+0j) with standard deviation 0.0
The average expectation  <Sz|d___|Sz> is (-0.5+0j) with standard deviation 0.0
The average expectation for a 2:1 mixed state is (0.42+0j))


In [19]:
model = Model(2,2)
rbm = RBM(model)
average_expectations = [rbm.expectation_value(Sz_(1,0,model)+Sz_(0,1,model), np.array([[[1, 0], [0, 1]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation <Sz|u___|Sz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
average_expectations = [rbm.expectation_value(Sz_(1,0,model)+Sz_(0,1,model), np.array([[[0, 1], [1, 0]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation  <Sz|d___|Sz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
rbm = RBM(model)
batch = rbm.create_batch(200) #This uses evaluate_dummy, which gives 2/3 for spin up and 1/3 for spin down 
average_expectations = [rbm.expectation_value(Sz_(1,0,model)+Sz_(0,1,model), batch[i]) for i in range(len(batch))]
print(f"The average expectation for a 2:1 mixed state is {np.mean(average_expectations)})")

The average expectation <Sz|u___|Sz> is (-1+0j) with standard deviation 0.0
The average expectation  <Sz|d___|Sz> is 0j with standard deviation 0.0
The average expectation for a 2:1 mixed state is (0.335+0j))


In [20]:
operator = Operator(model)
for i in range(2):
    for j in range(2):
        operator += Sz_(i, j, model)
model = Model(2,2)
rbm = RBM(model)
average_expectations = [rbm.expectation_value(operator, np.array([[[1, 0], [0, 1]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation <Sz|u___|Sz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
average_expectations = [rbm.expectation_value(operator, np.array([[[0, 1], [1, 0]], [[0, 1], [1, 0]]])) for _ in range(40)]
print(f"The average expectation  <Sz|d___|Sz> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
rbm = RBM(model)
batch = rbm.create_batch(200) #This uses evaluate_dummy, which gives 2/3 for spin up and 1/3 for spin down 
average_expectations = [rbm.expectation_value(operator, batch[i]) for i in range(len(batch))]
print(f"The average expectation for a 2:1 mixed state is {np.mean(average_expectations)})")

The average expectation <Sz|u___|Sz> is 0j with standard deviation 0.0
The average expectation  <Sz|d___|Sz> is 0j with standard deviation 0.0
The average expectation for a 2:1 mixed state is (0.88+0j))


In [21]:
#NB: This assumes that the spin is up with probability 2/3 and down with probability 1/3
#If you set them to be the same, the spin expectation is 1/2, which checks out for |up>+|dn>
model = Model(2,3)
rbm = RBM(model)
average_expectations = [rbm.expectation_value(Sx_(1,0,model), np.array([[[1, 0], [0, 1], [1, 0]], [[0, 1], [1, 0], [1,0]]])) for _ in range(40)]
print(f"The average expectation <Sx|u___|Sx> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
average_expectations = [rbm.expectation_value(Sx_(1,0,model), np.array([[[1, 0], [0, 1], [1, 0]], [[0, 1], [1, 0], [1,0]]])) for _ in range(40)]
print(f"The average expectation  <Sx|d___|Sx> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
rbm = RBM(model)
batch = rbm.create_batch(100) #This uses evaluate_dummy, which gives 2/3 for spin up and 1/3 for spin down 
average_expectations = [rbm.expectation_value(Sx_(0,0,model), batch[i]) for i in range(len(batch))]
print(f"The average expectation for a 2:1 mixed state is {np.mean(average_expectations)})")

The average expectation <Sx|u___|Sx> is (0.5+0j) with standard deviation 0.0
The average expectation  <Sx|d___|Sx> is (0.5+0j) with standard deviation 0.0
The average expectation for a 2:1 mixed state is (0.3575000000000001+0j))


In [24]:
model = Model(2,3)
rbm = RBM(model)
average_expectations = [rbm.expectation_value(Sy_(1,0,model), np.array([[[1, 0], [0, 1], [1, 0]], [[0, 1], [1, 0], [1,0]]])) for _ in range(40)]
print(f"The average expectation <Sy|u___|Sy> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
average_expectations = [rbm.expectation_value(Sy_(1,0,model), np.array([[[1, 0], [0, 1], [1, 0]], [[0, 1], [1, 0], [1,0]]])) for _ in range(40)]
print(f"The average expectation  <Sy|d___|Sy> is {np.mean(average_expectations)} with standard deviation {np.std(average_expectations)}")
rbm = RBM(model)
batch = rbm.create_batch(200) #This uses evaluate_dummy, which gives 2/3 for spin up and 1/3 for spin down 
average_expectations = [rbm.expectation_value(Sy_(1,0,model), batch[i]) for i in range(len(batch))]
print(f"The average expectation for a 2:1 mixed state is {np.mean(average_expectations)})")

The average expectation <Sy|u___|Sy> is -0.5j with standard deviation 0.0
The average expectation  <Sy|d___|Sy> is -0.5j with standard deviation 0.0
The average expectation for a 2:1 mixed state is 0.01j)


# Implementation of the learning process

Idea:

For each iteration:

Start with a set of weights.

For the current weights, generate a batch of spin configurations by MCMC. For each configuration, compute the amplitude. Together these give an estimate of the energy, and from it an estimate of the energy gradient. Update the weights along the negative gradient. 
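The sampling step in this loop can be sketched as a single-spin-flip Metropolis chain targeting $|\psi_\theta(s)|^2$. This is an assumption about how such a sampler might look, not the code behind `rbm.create_batch`: the function name is hypothetical, spins are stored as $\pm 1$ entries rather than one-hot pairs, and only the `burn_in` and `skip` names mirror arguments used elsewhere in the notebook.

```python
import numpy as np

def metropolis_batch(log_psi, n_spins, n_samples, burn_in=100, skip=None, rng=None):
    """Sample configurations (entries +/-1) from |psi|^2 by single-spin flips."""
    rng = np.random.default_rng() if rng is None else rng
    skip = n_spins if skip is None else skip
    sigma = rng.choice([-1, 1], size=n_spins)
    current = 2.0 * np.real(log_psi(sigma))       # log |psi(sigma)|^2
    samples = []
    for step in range(burn_in + n_samples * skip):
        site = rng.integers(n_spins)
        sigma[site] *= -1                          # propose a flip
        proposed = 2.0 * np.real(log_psi(sigma))
        if np.log(rng.random()) < proposed - current:
            current = proposed                     # accept
        else:
            sigma[site] *= -1                      # reject: undo the flip
        # Keep one configuration every `skip` steps after the burn-in.
        if step >= burn_in and (step - burn_in) % skip == skip - 1:
            samples.append(sigma.copy())
    return np.array(samples)

# Toy product state psi(s) ~ exp(a * sum(s)) with exp(4a) = 2, so each spin
# is up with probability 2/3 and the mean of sigma should be close to 1/3.
a = np.log(2.0) / 4.0
batch = metropolis_batch(lambda s: a * np.sum(s), n_spins=6, n_samples=4000,
                         rng=np.random.default_rng(4))
print(batch.shape, batch.mean())
```

Only the ratio $|\psi(s')|^2 / |\psi(s)|^2$ enters the acceptance test, so the wavefunction never needs to be normalized.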



In [2]:
model = Model(2,3)
rbm = RBM(model)

N = 100
batch = rbm.create_batch(N)
Ham =  set_h_Hamiltonian(model, h = 4) #set_J1_Hamiltonian(model, J = 1) + set_h_Hamiltonian(model, h = 4)

Szs = set_h_Hamiltonian(model, h = 1)

print(f"the spin expectation value is {rbm.expectation_value_batch(Szs, batch)}")

def calculate_Sz_expectation_brute_force(batch):
    return  np.mean(np.sum(batch[:, :, :, 0]/2 - batch[:, :, :, 1]/2, axis=(1, 2)))

print(f"the naive spin expectation value is {calculate_Sz_expectation_brute_force(batch)}")
      

the spin expectation value is (0.9-5.361749559878048e-18j)
the naive spin expectation value is 0.9


In [3]:
model = Model(2,3)
Ham = set_h_Hamiltonian(model, h = 1)
#Ham += set_J1_Hamiltonian(model, J = 1)
spin1 = np.array([[[1,0] for _ in range(model.L2)] for _ in range(model.L1)])
print(f"Hamiltonian expectation: <spin1|H|spin1>={Ham.vdot(spin1, spin1)}")
spin1[0, 0] = [0,1]
print(f"Flipping a spin in s Hamiltonian gives: <spin1|H|spin1>={Ham.vdot(spin1, spin1)}")
spin1[1, 2] = [0,1]
print(f"Flipping a spin in s Hamiltonian gives: <spin1|H|spin1>={Ham.vdot(spin1, spin1)}")

Hamiltonian expectation: <spin1|H|spin1>=(3+0j)
Flipping a spin in s Hamiltonian gives: <spin1|H|spin1>=(2+0j)
Flipping a spin in s Hamiltonian gives: <spin1|H|spin1>=(1+0j)


In [4]:
model = Model(2,3)
rbm = RBM(model)
Ham =  set_h_Hamiltonian(model, h = 4)

N= 50
test = [rbm.expectation_value_batch(Ham, rbm.create_batch(N)) for _ in range(10)]
print(f"The average energy is {np.mean(test)} with standard deviation {np.std(test)}")

N = 500
test = [rbm.expectation_value_batch(Ham,  rbm.create_batch(N)) for _ in range(10)]
print(f"The average energy is {np.mean(test)} with standard deviation {np.std(test)}")

# N = 5000
# test = [rbm.expectation_value_batch(Ham,  rbm.create_batch(N)) for _ in range(10)]
# print(f"The average energy is {np.mean(test)} with standard deviation {np.std(test)}")

The average energy is (2.04+1.0882532906389126e-17j) with standard deviation 0.36353816856005644
The average energy is (1.7048000000000003+9.166667502534988e-18j) with standard deviation 0.14934309491904874


In [24]:
#get the current weights and energy
a, b, M = rbm.get_weights()
batch = rbm.create_batch(N)
print(f"the initial energy is {rbm.expectation_value_batch(Ham, batch)}")
delta_a, delta_b, delta_M = rbm.get_deltas(Ham, 1)
gamma = 0.000000001
a -= gamma * delta_a
b -= gamma * delta_b
M -= gamma * delta_M
rbm.set_weights(a, b, M)
batch = rbm.create_batch(N)
print(f"the updated energy is {rbm.expectation_value_batch(Ham, batch)}")

the initial energy is (1.84+0j)
the updated energy is (2.48+1.8718492984125588e-18j)


In [22]:
calculate_Sz_expectation_brute_force(test)

1.0

In [3]:
rbm.train(Ham, 1)

Updating weights at iteration 0
Current energy: 0j
Current Sz: 0.0
Updating weights at iteration 1
Current energy: (0.7199999999999999+0j)
Current Sz: 0.18
Updating weights at iteration 2
Current energy: (-0.84+0j)
Current Sz: -0.21
Updating weights at iteration 3
Current energy: (1.16-2.2684320903241372e-18j)
Current Sz: 0.29
Updating weights at iteration 4
Current energy: (-1.96+5.73936057095251e-18j)
Current Sz: -0.49
Updating weights at iteration 5
Current energy: (-0.96+6.857054125199013e-18j)
Current Sz: -0.24
Updating weights at iteration 6
Current energy: (-1.56+0j)
Current Sz: -0.39
Updating weights at iteration 7
Current energy: (-2.32+0j)
Current Sz: -0.58
Updating weights at iteration 8
Current energy: (-0.76+0j)
Current Sz: -0.19
Updating weights at iteration 9
Current energy: (-1.52+4.9458343384645615e-18j)
Current Sz: -0.38


(array([ 0.31466344+0.01420613j, -0.17781718+0.46356555j,
        -0.56087053+0.007771j  ,  0.3094495 +0.45058636j,
        -0.49944333+0.43138525j,  0.43036907-0.47256118j]),
 array([ 0.00629268-0.10409859j,  0.12635185+0.30579703j,
        -0.21771644+0.38645257j, -0.42647314+0.52715781j,
        -0.31079108+0.13845925j, -0.31027842-0.09731303j,
         0.25961849+0.0531559j ,  0.53681481+0.11855583j,
         0.36300586-0.40619028j]),
 array([[ 0.00679065-0.34423366j, -0.44590654+0.263989j  ,
          0.42339262+0.28324037j, -0.21313809-0.31825113j,
          0.29275988+0.35861221j, -0.27482315-0.40352174j],
        [ 0.39140067-0.5150925j , -0.11621116+0.1003381j ,
          0.46934634+0.3705505j ,  0.39381741+0.27205987j,
         -0.1699463 +0.31992682j, -0.03149516-0.07186602j],
        [ 0.24818596+0.22526289j,  0.44133083+0.33749219j,
         -0.21044856-0.02471558j,  0.12605445+0.37861303j,
          0.18056459+0.34234216j,  0.36909677+0.39050799j],
        [-0.29868073+0.

In [4]:
rbm.get_deltas(Ham, 1)

(array([0.0108162 +0.j, 0.01036182+0.j, 0.01164751+0.j, 0.00111059+0.j,
        0.01658372+0.j, 0.01598826+0.j]),
 array([ 8.86446255e-03+0.j,  6.97707050e-06+0.j, -5.90236836e-03+0.j,
        -8.85677324e-03+0.j, -6.40614383e-03+0.j, -9.30838656e-03+0.j,
         2.27635936e-03+0.j, -7.17698741e-03+0.j,  5.07102749e-03+0.j]),
 array([[ 0.00279604+0.j,  0.0035785 +0.j,  0.00168032+0.j,
          0.00241863+0.j,  0.00525859+0.j,  0.00452512+0.j],
        [ 0.00083935+0.j,  0.00193649+0.j,  0.00072205+0.j,
          0.00047375+0.j,  0.00203627+0.j,  0.00216819+0.j],
        [ 0.00304152+0.j,  0.0038329 +0.j,  0.00461035+0.j,
         -0.00120154+0.j,  0.00558658+0.j,  0.00499   +0.j],
        [-0.00404507+0.j, -0.00322722+0.j, -0.00306601+0.j,
         -0.00318194+0.j, -0.00533932+0.j, -0.00522522+0.j],
        [ 0.00420897+0.j,  0.00430301+0.j,  0.00443395+0.j,
         -0.00116703+0.j,  0.00690323+0.j,  0.00706373+0.j],
        [ 0.00303222+0.j,  0.00275894+0.j,  0.00412535+0.j,
      

In [3]:
for spin in batch:
    #This gets theta_spin
    print(rbm.theta(spin))
    #This gets E(s)
    print(rbm.expectation_value(Ham, spin))
    #This gets the average of sigma_z
    print(rbm.expectation_value_Sz(spin))

[ 0.14435618  0.3768973  -0.19366966 -0.56948488 -0.99481265 -0.85582976
 -0.2821961  -0.93864762  0.87933583]
(4.508225285640611+0j)
1.0
[-0.59409852  0.27184292  0.1574936  -0.38034904 -0.47300434 -0.30486181
  0.69757572 -0.18282368  0.35413301]
(1.3231432298490877+0j)
0.0
[-0.77593477  0.60332223 -0.07556099 -0.57118212 -0.37680926 -0.74358579
  0.4109133  -0.26514011  0.60281849]
(5.6613620987431155+0j)
1.0
[ 0.33208158  0.6189694  -0.56930629 -0.4942077  -1.07872507 -0.68477643
 -0.72341633 -0.72390129  0.42001181]
(8.982271455693532+0j)
2.0
[-0.14827851  0.41047176  0.05836203 -0.81667613 -0.61839423 -0.49944117
  0.41999488 -0.41501895  0.26809093]
(-2.9979916108965994+0j)
-1.0
[-0.7477346   0.57007438 -0.10110272 -0.28184757 -0.22507829 -0.26309392
 -0.23143179  0.0876631  -0.17986254]
(8.210360625128997+0j)
1.0
[ 0.02707904  0.29265075 -0.35898737  0.37683656 -0.46262703 -0.19336898
 -0.81440219 -0.07988083  0.61437522]
(10.440798070302165+0j)
2.0
[-0.16064636  0.05057865  0.

In [39]:
test = [np.mean(np.sum(rbm.create_batch(400, burn_in = 100, skip = 10)[:,:,:,0], axis = (1,2))) for _ in range(40)]
print(np.mean(test), np.std(test))

3.4515625 0.05705449012785935


### Symmetry of the RBM.

Consider a translation operator $T_s$ acting as $\tilde{\sigma}_j(s) = T_s \sigma_j$. Translation symmetry requires $\Psi_\theta(\sigma) = \Psi_\theta(T_s\sigma)$ for every translation $T_s$. An obvious way to implement this is to sum the output over all translations, $\sum_s \Psi_\theta(T_s\sigma)$, but this multiplies the cost of every evaluation by the number of translations and won't improve the efficiency of the algorithm.

A more tractable way is to use convolutions. Previously, we wrote the RBM as 

\begin{align*}
\Psi_M(S;W) &= \sum_{h_i} e^{\sum_j a_j \sigma^z_j + \sum_i b_i h_i + \sum_{ij} W_{ij} h_i \sigma^{z}_j}\\
\end{align*}

Now we can rewrite it as below, where $f = 1, \dots, \alpha_s$ labels the feature maps, $s = 1, \dots, S$ labels the translations, and the weights $W_{j}^{(f)}$ have $\alpha_s \times N$ elements in total. Because every term is summed over all translations of the spins, the result is translation invariant.

\begin{equation}
\Psi_{\alpha}(\mathbf{S}; \mathbf{W}) = \sum_{h_{f,s}} \exp \left[ \sum_{f}^{\alpha_s} \left( a^{(f)} \sum_{s}^{S} \sum_{j}^{N} \tilde{\sigma}_{j}^{z}(s) + b^{(f)} \sum_{s}^{S} h_{f,s} + \sum_{s}^{S} \sum_{j}^{N} h_{f,s} W_{j}^{(f)} \tilde{\sigma}_{j}^{z}(s) \right) \right],
\end{equation}
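A minimal sketch of this shared-weight form, with the hidden units traced out, is given below. It assumes a 1D ring where the translations are cyclic shifts (the notebook's lattice is 2D), $\pm 1$ spin entries, real parameters, and one visible bias $a^{(f)}$, one hidden bias $b^{(f)}$, and one filter $W^{(f)}_j$ per feature map. The function name is hypothetical.

```python
import numpy as np

def log_psi_symmetric(sigma, a, b, W):
    """Translation-symmetric RBM on a 1D ring, hidden units traced out.

    a and b have shape (n_features,); W has shape (n_features, N).
    """
    n_sites = len(sigma)
    # All N cyclic translations of the configuration, shape (N, N).
    translated = np.array([np.roll(sigma, s) for s in range(n_sites)])
    # theta[f, s] = b^(f) + sum_j W_j^(f) * sigma_j(s)
    theta = b[:, None] + W @ translated.T
    visible_term = np.sum(a) * np.sum(translated)
    return visible_term + np.sum(np.log(2.0 * np.cosh(theta)))

rng = np.random.default_rng(5)
n_sites, n_features = 6, 2
a = rng.normal(scale=0.1, size=n_features)
b = rng.normal(scale=0.1, size=n_features)
W = rng.normal(scale=0.1, size=(n_features, n_sites))
sigma = np.array([1, 1, -1, 1, -1, -1])

original = log_psi_symmetric(sigma, a, b, W)
shifted = log_psi_symmetric(np.roll(sigma, 2), a, b, W)
print(original, shifted)
```

Shifting the input only permutes the rows of `translated`, so the sum over translations is unchanged and the two printed values agree.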
