# l2hmc-qcd

 * Available at:
[https://github.com/saforem2/l2hmc-qcd](https://github.com/saforem2/l2hmc-qcd)

$\newcommand{\bfL}{\mathbf{L}}$
$\newcommand{\bfF}{\mathbf{F}}$
$\newcommand{\FLxi}{\mathbf{F}\mathbf{L}\xi}$
$\newcommand{\floor}[1]{\lfloor #1 \rfloor}$

## <font color="#87ff00">Motivation: </font>


### <font color='#f92672'>Lattice QCD:</font>

 - Non-perturbative approach to solving the QCD theory of the strong interaction between quarks and gluons.
 
 - Calculations proceed in 3 steps:
   1.  <font color='#03a9f4'>**Gauge field generation**:</font> Use MCMC methods for sampling independent gauge field (gluon) configurations.
   2. **Propagator calculations**: Compute how quarks propagate in these fields ("quark propagators").
   3. **Contractions**: Combine quark propagators into correlation functions and observables.
   
   
Gauge field generation remains one of the major bottlenecks in Lattice QCD, due to <font color='#03a9f4'>*critical slowing down*</font>.

### <font color='#f92672'>Critical slowing down:</font>

 * Simulations are performed using some finite lattice spacing, $a$.
 
 
 * We wish to extrapolate from these results to the *continuum limit* $a \rightarrow 0$.
 
 
 * As $a\rightarrow 0$, MCMC updates get stuck in sectors of fixed gauge topology.
  
  
 * This causes the number of steps needed to adequately sample different sectors to grow exponentially (divergent *autocorrelation time* $\tau_{\mathcal{O}}^{\mathrm{int}}$)!
  
  
  * <font color='#ffff00'>  **Idea**</font>: *Use ML to more efficiently generate sample configurations.*

## <font color='#87ff00'>Markov Chain Monte Carlo (MCMC)</font>
---

**GOAL**: Generate a sequence of random samples which converge to being distributed according to a target probability distribution $p(x)$ for which direct sampling is difficult.

In Lattice QCD, this sequence can be used to estimate integrals with respect to the target distribution (i.e. expected values of physical observables).

### <font color='#f92672'>Metropolis-Hastings algorithm</font>
---

Given a transition kernel $q(x^{\prime}|x)$ and an initialization distribution $p_{0}$ proceed as follows:

  1. Initialize $x_{0} \sim p_{0}$
     
     
  2. for $t = 0$ to $N$:   
     
    1. Sample $x^{\prime} \sim q(\cdot|x_{t})$
       
    2. Compute the acceptance probability:
    \begin{equation}
    A(x^{\prime}|x_{t}) =
      \min\left(1, \frac{p(x^{\prime})\,q(x_{t}|x^{\prime})}
      {p(x_{t})\,q(x^{\prime}|x_{t})}\right)
      = \underbrace{\min\left(1, \frac{p(x^{\prime})}
      {p(x_{t})}\right)}_{\text{if}\,\, q(x^{\prime}|x_{t}) 
      \,=\, q(x_{t}|x^{\prime})}
    \end{equation}
       
    3. With probability $A$, accept the proposed value and set $x_{t+1} = x^{\prime}$. Otherwise, set $x_{t+1} = x_{t}$.
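
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the sampler used in this project: it assumes a one-dimensional target, a symmetric Gaussian random-walk proposal (so the acceptance probability reduces to the simplified form), and the names `metropolis_hastings`, `log_p`, and `step_size` are chosen here for illustration.

```python
import math
import random


def metropolis_hastings(log_p, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1D target.

    `log_p` is the log of the (unnormalized) target density. The proposal
    q(x'|x) = N(x, step_size^2) is symmetric, so the acceptance probability
    reduces to min(1, p(x') / p(x_t)).
    """
    rng = random.Random(seed)
    x = x0
    samples, n_accepted = [], 0
    for _ in range(n_steps):
        x_prop = x + step_size * rng.gauss(0.0, 1.0)  # x' ~ q(.|x_t)
        log_ratio = log_p(x_prop) - log_p(x)          # log[p(x') / p(x_t)]
        # Accept with probability A = min(1, exp(log_ratio)).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = x_prop
            n_accepted += 1
        samples.append(x)  # on rejection the old state is repeated
    return samples, n_accepted / n_steps


def log_normal(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


samples, accept_rate = metropolis_hastings(
    log_normal, x0=0.0, n_steps=20000, step_size=2.0
)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"accept rate = {accept_rate:.2f}, mean = {mean:.3f}, var = {var:.3f}")
```

Working with $\log p$ instead of $p$ avoids overflow in the ratio and means the normalization constant of the target never has to be known.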

## <font color='#87ff00'>Hamiltonian Monte Carlo (HMC)</font>
---

<font color="#ffff00">GOAL:</font>

  - Sample from target distribution, $p(x)$
  - Use Hamiltonian Dynamics to "guide" the simulation through phase space

<img src="./img/hmc_phase_space.png" width=80% height=auto/>

 * Rewrite target density $\longrightarrow$ **potential energy**:
 
     \begin{equation}
     p(x) \propto \exp\left(-\beta U(x)\right)
     \end{equation}
     
    where $\beta = 1 / T$ is the *inverse temperature*
     
 
 
 * Introduce a momentum $v \sim \mathcal{N}(0, \mathbb{1})$, distributed independently of $x$.
 
     \begin{align}
       p(v) &\propto \exp\left(-\frac{1}{2}v^{T}v\right)\\
       &= \exp\big[-K(v)\big]
     \end{align}
 
 
 
 * The joint distribution can then be written in terms of the **Hamiltonian**, $H(x, v) = U(x) + K(v)$:
 
    \begin{align}
    p(x, v) &= \frac{1}{\mathcal{Z}}\exp\left(-H(x, v)\right) \\
    &\propto p(x) p(v)
    \end{align}


### <font color='#f92672'>Hamiltonian Dynamics / Leapfrog Integrator:</font>

Recall Hamilton's equations:

\begin{align}
  \begin{split}
    \frac{d x}{dt} &= \frac{\partial H}{\partial v} = v \\
    \frac{d v}{dt} &= -\frac{\partial H}{\partial x} = -\frac{\partial U}{\partial x}
  \end{split}
\label{eq:hamiltons_eqs}
\end{align}


<!---
 * Evolve the system using Eq. \ref{eq:hamiltons_eqs}:
 
     \begin{align}
     x_{n+1} \longleftarrow x_{n}
     \end{align}
 
     \begin{equation}
       \big[x(0), v(0)\big] 
           \xrightarrow{\int_{0}^{dT} (\ref{eq:hamiltons_eqs})\,\, dt}
         \big[x(T), v(T)\big]
     \end{equation}
--->
     
    

 1. Draw a fresh momentum $v_{n} \sim \mathcal{N}(0, \mathbb{1})$
 
 
 2. Alternate half-step updates of the momentum and full-step updates of the position:

     \begin{align}
       v_{n+\frac{1}{2}} &= v_{n} - \frac{\varepsilon}{2} \frac{\partial U(x_{n})}{\partial x_{n}} \\
       x^{\prime} \equiv x_{n+1} &= x_{n} + \varepsilon v_{n+\frac{1}{2}} \\
       v^{\prime} \equiv v_{n+1} &= v_{n+\frac{1}{2}} - \frac{\varepsilon}{2} \frac{\partial U(x_{n+1})}{\partial x_{n+1}}
     \end{align}
 
     where $\varepsilon$ is the *step size*.
     
     
<!---
the operator that performs $M$ successive applications of the above updates:
--->
  3. Let $\xi \equiv (x, v)$ and $\mathbf{L}$ and $\mathbf{F}$ denote the operators:
  
      \begin{align}
        \text{(leapfrog operator)}\quad 
        \bfL &:
          \xi\longrightarrow\left(\xi^{\prime}\right)^{\times M} \\
        \text{(momentum flip)}\quad
        \bfF &:
          \xi = (x, v) \longrightarrow (x, -v)
      \end{align}
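
As a rough sketch of the leapfrog operator $\mathbf{L}$ and the momentum flip $\mathbf{F}$, the snippet below integrates a scalar harmonic oscillator, $U(x) = x^{2}/2$. The potential, the step size, and the function names are assumptions made for this example only.

```python
def leapfrog(x, v, grad_U, eps, n_steps):
    """Apply `n_steps` leapfrog updates to a scalar state (x, v)."""
    for _ in range(n_steps):
        v = v - 0.5 * eps * grad_U(x)  # half-step momentum update
        x = x + eps * v                # full-step position update
        v = v - 0.5 * eps * grad_U(x)  # second half-step momentum update
    return x, v


def hamiltonian(x, v):
    """H(x, v) = U(x) + K(v) with U(x) = x^2 / 2 and K(v) = v^2 / 2."""
    return 0.5 * x * x + 0.5 * v * v


def grad_U(x):
    """Gradient of U(x) = x^2 / 2."""
    return x


x0, v0 = 1.0, 0.5
x1, v1 = leapfrog(x0, v0, grad_U, eps=0.1, n_steps=20)

# The energy is conserved only approximately by the discretized dynamics.
energy_error = abs(hamiltonian(x1, v1) - hamiltonian(x0, v0))

# Reversibility: flip the momentum, integrate again, flip back.
x_back, v_back = leapfrog(x1, -v1, grad_U, eps=0.1, n_steps=20)
v_back = -v_back
print(energy_error, x_back, v_back)
```

The small but nonzero `energy_error` is the numerical error that the accept/reject step corrects for, and the round trip back to `(x0, v0)` illustrates why composing $\mathbf{L}$ with $\mathbf{F}$ gives a reversible proposal.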
      
      
&nbsp;

 * To account for the numerical errors accumulated during the integration, we apply a Metropolis-Hastings accept/reject step.
 

 * The probability of accepting the proposed configuration is given by:

    \begin{equation}
        A\left(\FLxi|\xi\right) 
        = \min\bigg[1, 
            \frac{p(\FLxi)}{p(\xi)}\underbrace{\,\,
                \left|\frac{\partial\,[\FLxi]} 
                {\partial \xi^{T}}\right|
            \,\,}_{\equiv 1\,\,\textbf{for HMC}}
        \bigg]
    \end{equation}

   where $\left|\frac{\partial\,[\FLxi]}{\partial \xi^{T}}\right|$
   is the determinant of the **Jacobian** of the transformation.
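
Putting the pieces together, one HMC transition might look like the following sketch. It assumes a scalar standard-normal target ($U(x) = x^{2}/2$) and uses illustrative names (`hmc_step`, `eps`, `n_steps`); since the Jacobian determinant is 1 for HMC, only the energy difference enters the acceptance probability.

```python
import math
import random


def leapfrog(x, v, grad_U, eps, n_steps):
    """M = n_steps leapfrog updates of a scalar state (x, v)."""
    for _ in range(n_steps):
        v = v - 0.5 * eps * grad_U(x)
        x = x + eps * v
        v = v - 0.5 * eps * grad_U(x)
    return x, v


def hmc_step(x, U, grad_U, eps, n_steps, rng):
    """One HMC transition; returns the new position and the acceptance probability."""
    v = rng.gauss(0.0, 1.0)                 # fresh momentum v ~ N(0, 1)
    h_old = U(x) + 0.5 * v * v              # H(x, v) = U(x) + K(v)
    x_new, v_new = leapfrog(x, v, grad_U, eps, n_steps)
    v_new = -v_new                          # momentum flip F
    h_new = U(x_new) + 0.5 * v_new * v_new
    # |Jacobian| = 1 for HMC, so A = min(1, exp(H_old - H_new)).
    accept_prob = math.exp(min(0.0, h_old - h_new))
    if rng.random() < accept_prob:
        return x_new, accept_prob
    return x, accept_prob


def U(x):
    """Potential energy of a standard normal target."""
    return 0.5 * x * x


def grad_U(x):
    return x


rng = random.Random(0)
x, samples, probs = 0.0, [], []
for _ in range(5000):
    x, a = hmc_step(x, U, grad_U, eps=0.2, n_steps=10, rng=rng)
    samples.append(x)
    probs.append(a)

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
mean_accept = sum(probs) / len(probs)
print(f"mean = {mean:.3f}, var = {var:.3f}, <A> = {mean_accept:.3f}")
```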

### <font color='#f92672'>DEMO:</font>

In [57]:
from IPython.display import IFrame
IFrame('https://chi-feng.github.io/mcmc-demo/app.html', height=750,  width=1000)

### <font color='#f92672'>Aside (HMC):</font>

 * Add momentum variable $v$ of same dimensionality as $x$ and consider
 
     \begin{equation}
     p(x, v) = p(v|x)p(x)
     \end{equation}
 
 where we are free to choose $p(v|x)$.
 
 * *In practice*, we often choose $p(v|x) = \mathcal{N}(0, \mathbb{1})$.
  
  
 * Define the **Hamiltonian**:
 
     \begin{align}
         H(x, v) &= -\log p(x, v) \\
         &= -\log p(v|x) - \log p(x) \\
         &= K(x, v) + U(x)
     \end{align}

## <font color='#87ff00'>Learning to Hamiltonian Monte Carlo: *l2hmc*</font>
---

### <font color='#f92672'> Motivation:</font> <a class="tocSkip">
   

#### <font color='#03a9f4'> Issues with HMC:</font> <a class="tocSkip">

 - Randomly selected energy levels $\Rightarrow$ *slow mixing*
 - Difficult to traverse low-density zones

#### <font color='#03a9f4'> A "good" sampler should:</font>  <a class="tocSkip">
    
 - Mix quickly
    
 - Converge (burn-in) quickly
    
 - Be able to mix across energy levels
    
 - Be able to mix between modes
    
 
    

### <font color='#f92672'>Idea:</font> <a class="tocSkip">
    

 * Minimize the **autocorrelation time** by maximizing the expected-squared jump-distance (ESJD).
 
    
 * Introduce into the leapfrog integrator 6 new functions: $S_{i},\, T_{i},\, Q_{i}$, for $i = x, v$.
    
    
 * <font color='#ffff00'>Each of these functions is parameterized by the weights of a neural network that can be trained by minimizing a suitably chosen loss function.</font>


 * Let $x, v \in \mathbb{R}^{n}$ with $v \sim \mathcal{N}(0, \mathbb{1})$.
 
 
 * Introduce a binary direction variable $d \in \{-1, +1\}$, drawn from a uniform distribution.
 
 
 * Denote the complete state by $\xi = (x, v, d)$, with probability density:
 
     \begin{equation}
     p(\xi) = p(x)\, p(v)\, p(d)
     \end{equation}
     
     
<!---
 * Begin with a subset of the augmented state space, *independent of the momentum*, $\zeta_{1} = (x, \partial_{x}U(x), t)$, and introduce three new functions of $\zeta_{1}$: $S_{v},\, T_{v},\, Q_{v}$
--->

<!---
 * Now, however, we first update a subset of the coordinates of 
 $x$, followed by the complementary subset.
--->

 * <font color='#03a9f4'>**Binary masks**:</font>

   &nbsp;
   - <font color='#03a9f4'>$m^{t} \in \{0, 1\}^{n}$   </font>
     * Fixed random binary mask with half of its entries equal to 0 and the other half equal to 1.
 
   &nbsp;
   - <font color='#03a9f4'>$\bar{m}^{t} \equiv \mathbb{1} - m^{t}$   </font>
     * The complement of $m^{t}$
   
   &nbsp;
   - <font color='#03a9f4'>$x_{m^{t}} \equiv x\odot m^{t}$ </font>
     * Masking applied to $x$, where $\odot$ denotes elementwise multiplication.
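
A small sketch of how such masks could be built and applied, using plain Python lists. The helper names `make_masks` and `apply_mask` are hypothetical and are not taken from the project code.

```python
import random


def make_masks(n, rng):
    """Return a random binary mask m with n // 2 ones, and its complement."""
    ones = set(rng.sample(range(n), n // 2))
    m = [1 if i in ones else 0 for i in range(n)]
    m_bar = [1 - mi for mi in m]
    return m, m_bar


def apply_mask(x, m):
    """Elementwise product of x with the mask m."""
    return [xi * mi for xi, mi in zip(x, m)]


rng = random.Random(0)
x = [0.5, -1.2, 3.0, 0.7, -0.1, 2.2]
m, m_bar = make_masks(len(x), rng)
x_m, x_m_bar = apply_mask(x, m), apply_mask(x, m_bar)
print(m, m_bar)
```

Because $m^{t}$ and $\bar{m}^{t}$ select disjoint coordinates, the two masked pieces add back up to $x$, which is what lets the position update be split into two sub-updates.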
 
---
<font color='#ff005b'>**Note:** </font> In the limit $S_{i}, Q_{i}, T_{i} \rightarrow 0$, we recover generic HMC (as expected).

### <font color='#f92672'>Augmented leapfrog integrator, $\mathbf{L}_{\theta}$:</font>

 1. <font color='#ffff00'>**Momentum (half-step) update** </font>$\,\,v_{n}\,\,\,\,\longrightarrow v_{n+\frac{1}{2}}$: 
 
     \begin{equation}
     v^{\prime}
     = v\odot\overbrace{%
             \exp\left(\frac{\varepsilon}{2}S_{v}(\zeta_{1})\right)
         }^{\textbf{Momentum scaling}}
         - \frac{\varepsilon}{2} \bigg[\,\,%
             \partial_{x}U(x)\odot\overbrace{
                   \exp(\varepsilon Q_{v}(\zeta_{1}))
             }^{\textbf{Gradient scaling}}
             + \overbrace{%
                 T_{v}(\zeta_{1})
             }^{\textbf{Translation}}\,\,
           \bigg]
     \end{equation}
     
   - **Inputs**: $\zeta_{1} = (x, \partial_{x}U(x), t)$
     
   - **Jacobian factor**: $\mathcal{J} = \exp\left(\frac{\varepsilon}{2}\mathbb{1}\cdot S_{v}(\zeta_{1})\right)$
---

 2. <font color='#ffff00'>**Position (first) sub-update** </font>$\,\,x_{m^{t}}\,\,\,\,\longrightarrow x^{\prime}$:
 
     \begin{equation}
       x^{\prime} = x_{\bar{m}^{t}} + m^{t}\odot\left[
         x\odot\exp(\varepsilon S_{x}(\zeta_{2}))
         + \varepsilon\left(v^{\prime}\odot\exp(\varepsilon Q_{x}(\zeta_{2})) + T_{x}(\zeta_{2})\right)
       \right]
     \end{equation}
     
   - **Inputs**: $\zeta_{2} = (x_{\bar{m}^{t}}, v^{\prime}, t)$
   - **Jacobian factor**: $\mathcal{J} = \exp\left(\varepsilon m^{t} \cdot S_{x}(\zeta_{2})\right)$
---

 3. <font color='#ffff00'> **Position (second) sub-update** </font>$\,\,x^{\prime}\,\,\,\,\longrightarrow x^{\prime\prime}$:
 
     \begin{equation}
       x^{\prime\prime} = x^{\prime}_{m^{t}} + \bar{m}^{t}\odot\left[
         x^{\prime}\odot\exp(\varepsilon S_{x}(\zeta_{3}))
         + \varepsilon\left(
           v^{\prime}\odot\exp(\varepsilon Q_{x}(\zeta_{3}))
           + T_{x}(\zeta_{3})
         \right)
       \right]
     \end{equation}
 
   - **Inputs**: $\zeta_{3} = (x^{\prime}_{m^{t}}, v^{\prime}, t)$
   - **Jacobian factor**: $\mathcal{J} = \exp\left(\varepsilon \bar{m}^{t} \cdot S_{x}(\zeta_{3})\right)$
---

 4. <font color='#ffff00'>**Momentum (half-step) update**</font> $\,\,v^{\prime}\,\,\,\,\longrightarrow v^{\prime\prime}$: 
 
     \begin{equation}
       v^{\prime\prime}
       = v^{\prime}\odot\exp\left(\frac{\varepsilon}{2}S_{v}(\zeta_{4})\right) 
         - \frac{\varepsilon}{2} \left[%
             \partial_{x}U(x^{\prime\prime})\odot\exp(\varepsilon Q_{v}(\zeta_{4})) 
             + T_{v}(\zeta_{4})
         \right]
     \end{equation}
     
   - **Inputs**: $\zeta_{4} = (x^{\prime\prime}, \partial_{x}U(x^{\prime\prime}), t)$
     
   - **Jacobian factor**: $\mathcal{J} = \exp\left(\frac{\varepsilon}{2}\mathbb{1}\cdot S_{v}(\zeta_{4})\right)$

---


### <font color='#f92672'>Jacobian determinant:</font> <a class="tocSkip">
    

\begin{align*}
   \log|\mathcal{J}| &= \log\left|\frac{\partial[\bfF \bfL_{\theta} \xi]}{\partial \xi^{T}}\right|\\
    &= d \sum_{t\leq M}\left[\frac{\varepsilon}{2}\mathbb{1}\cdot S_{v}(\zeta_{1}^{t})
      + \varepsilon m^{t}\cdot S_{x}(\zeta_{2}^{t})
      + \varepsilon \bar{m}^{t}\cdot S_{x}(\zeta_{3}^{t})
      + \frac{\varepsilon}{2}\mathbb{1}\cdot S_{v}(\zeta_{4}^{t})
    \right]
 \end{align*}
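
The four sub-updates and the accumulated log-determinant can be sketched as follows for the forward direction ($d = +1$) and a single leapfrog step. This is only an illustration under stated assumptions: `vnet` and `xnet` are placeholder callables returning the triple $(S, Q, T)$ as lists, standing in for the trained networks, and vectors are plain Python lists.

```python
import math


def l2hmc_step(x, v, t, grad_U, m, vnet, xnet, eps):
    """One forward augmented leapfrog step; returns (x'', v'', log|J|)."""
    n = len(x)
    m_bar = [1 - mi for mi in m]

    def momentum_update(x, v):
        g = grad_U(x)
        S, Q, T = vnet(x, g, t)  # inputs (x, dU/dx, t)
        v_new = [v[i] * math.exp(0.5 * eps * S[i])
                 - 0.5 * eps * (g[i] * math.exp(eps * Q[i]) + T[i])
                 for i in range(n)]
        return v_new, 0.5 * eps * sum(S)

    def position_update(x, v, mask, mask_bar):
        """Update the entries of x selected by `mask`; the rest stay fixed."""
        S, Q, T = xnet([x[i] * mask_bar[i] for i in range(n)], v, t)
        x_new = [x[i] * mask_bar[i] + mask[i] * (
                     x[i] * math.exp(eps * S[i])
                     + eps * (v[i] * math.exp(eps * Q[i]) + T[i]))
                 for i in range(n)]
        return x_new, eps * sum(mask[i] * S[i] for i in range(n))

    v, ld1 = momentum_update(x, v)              # 1. momentum half-step
    x, ld2 = position_update(x, v, m, m_bar)    # 2. first position sub-update
    x, ld3 = position_update(x, v, m_bar, m)    # 3. second position sub-update
    v, ld4 = momentum_update(x, v)              # 4. momentum half-step
    return x, v, ld1 + ld2 + ld3 + ld4


def grad_U(x):
    """Gradient of U(x) = |x|^2 / 2."""
    return list(x)


def zero_net(a, b, t):
    """S = Q = T = 0: the step reduces to the ordinary leapfrog update."""
    zeros = [0.0] * len(a)
    return zeros, zeros, zeros


def const_vnet(a, b, t):
    """Constant momentum scaling S_v = 0.1, with Q_v = T_v = 0."""
    return [0.1] * len(a), [0.0] * len(a), [0.0] * len(a)


def const_xnet(a, b, t):
    """Constant position scaling S_x = 0.2, with Q_x = T_x = 0."""
    return [0.2] * len(a), [0.0] * len(a), [0.0] * len(a)


x0, v0 = [1.0, -0.5, 0.3, 2.0], [0.2, 0.1, -0.4, 0.0]
m, eps = [1, 0, 1, 0], 0.1

x1, v1, logdet_zero = l2hmc_step(x0, v0, 0.0, grad_U, m, zero_net, zero_net, eps)
_, _, logdet_const = l2hmc_step(x0, v0, 0.0, grad_U, m, const_vnet, const_xnet, eps)
print(logdet_zero, logdet_const)
```

With all three functions set to zero the log-determinant vanishes and the update coincides with standard leapfrog, in line with the note on the generic HMC limit. With the constant scalings chosen here, the four Jacobian terms add up to $0.02 + 0.04 + 0.04 + 0.02 = 0.12$.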

### <font color='#f92672'>MCMC Transitions:</font>

 * Sampling then consists of applying steps <font color='#ffff00'>(1.) - (4.)</font> $M$ times in succession, which defines $\bfL_{\theta}$, immediately followed by a momentum-flip $\bfF$:
 
     \begin{equation}
       \xi^{\prime} \equiv \bfF \bfL_{\theta} \xi,
       \qquad
       \bfL_{\theta} \xi
         = \bfL_{\theta}(x, v, d) 
         = \left({x^{\prime\prime}}^{\times M},\, {v^{\prime\prime}}^{\times M},\, d\right)
     \end{equation}
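
The proposal $\xi^{\prime}$ is then accepted or rejected as in HMC, except that the Jacobian factor no longer equals 1. A minimal sketch of that acceptance probability, written in terms of the Hamiltonian and $\log|\mathcal{J}|$ (the function name and argument names are illustrative):

```python
import math


def accept_prob(h_old, h_new, logdet=0.0):
    """A = min(1, exp(H(xi) - H(xi') + log|J|)), evaluated in log space."""
    return math.exp(min(0.0, h_old - h_new + logdet))


a_same = accept_prob(1.0, 1.0)                # no energy change
a_worse = accept_prob(1.0, 2.0)               # energy increases by 1
a_fixed = accept_prob(1.0, 2.0, logdet=1.0)   # Jacobian term compensates
a_tiny = accept_prob(0.0, 1000.0)             # underflows to 0.0 safely
print(a_same, a_worse, a_fixed, a_tiny)
```

Clamping the exponent at zero before calling `math.exp` keeps the function from overflowing when the proposal has much lower energy than the current state.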

### <font color='#f92672'>Network architecture:</font>

 * Each of the functions $S, T, Q$ is implemented using a multi-layer perceptron.
 
 
 * <font color='#ffff00'>Note:</font> We actually have two separate networks with identical architectures:
 
   1. **VNet**: Implements $S_{v}, T_{v}, Q_{v}$, used for updating $v$
   2. **XNet**: Implements $S_{x}, T_{x}, Q_{x}$, used for updating $x$
---   

### <font color='#ffff00'>**VNet**</font> <a class="tocSkip">

<img src="./img/net_arch.png" width=100% height=auto/>
<!---
<img src="./img/generic_net_darkbg.png" width=100% height=auto/>
--->

To simplify notation, we denote by $F = \partial_{x}U(x)$ the gradient of the potential energy.


 1.  **Input**: $\zeta_{1} = (x, F, t)$
    
 2. Pass the inputs through separate dense layers:
 
    \begin{align*}
    \tilde x &= W_{x}\, x + b_{x} \\
    \tilde{F} &= W_{F}\, F + b_{F} \\
    \tilde t &= W_{t}\, t + b_{t} \\
    \end{align*}
    
     
 3. Compute:
 
     \begin{align}
     h_{1} &= \sigma(\tilde{x} + \tilde{F} + \tilde{t})\\
     h_{2} &= \sigma\left(W_{h}\, h_{1} + b_{h}\right)
     \end{align}
     
     
 4. **Outputs**:
 
     \begin{align}
     S_{v} &= \lambda_{S} \tanh\left(W_{S}\, h_{2} + b_{S}\right) \\
     Q_{v} &= \lambda_{Q} \tanh\left(W_{Q}\, h_{2} + b_{Q}\right) \\
     T_{v} &= W_{T} \, h_{2} + b_{T}
     \end{align}
     
 where $\sigma$ denotes the $\mathrm{ReLU}$ activation function, and $\lambda_{S}, \lambda_{Q}$ are additional trainable parameters.
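
The forward pass described above can be written out as a short sketch. It is not the project's TensorFlow implementation: weights are small random numbers, $\lambda_{S}$ and $\lambda_{Q}$ are fixed to 1 instead of being trained, $t$ is assumed to be a length-2 vector, and the layer sizes are arbitrary.

```python
import math
import random


def dense(W, b, x):
    """Affine map W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(h):
    return [max(0.0, hi) for hi in h]


def init_layer(n_out, n_in, rng, scale=0.1):
    """Small random weights and zero biases (an arbitrary choice here)."""
    W = [[scale * rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def vnet_forward(params, x, F, t):
    """Return (S_v, Q_v, T_v) for inputs zeta_1 = (x, F, t)."""
    x_t = dense(*params["x"], x)
    F_t = dense(*params["F"], F)
    t_t = dense(*params["t"], t)
    h1 = relu([a + b + c for a, b, c in zip(x_t, F_t, t_t)])
    h2 = relu(dense(*params["h"], h1))
    S = [params["lambda_S"] * math.tanh(s) for s in dense(*params["S"], h2)]
    Q = [params["lambda_Q"] * math.tanh(q) for q in dense(*params["Q"], h2)]
    T = dense(*params["T"], h2)  # the translation has no output nonlinearity
    return S, Q, T


rng = random.Random(0)
n, n_t, n_hidden = 4, 2, 8
params = {
    "x": init_layer(n_hidden, n, rng),
    "F": init_layer(n_hidden, n, rng),
    "t": init_layer(n_hidden, n_t, rng),
    "h": init_layer(n_hidden, n_hidden, rng),
    "S": init_layer(n, n_hidden, rng),
    "Q": init_layer(n, n_hidden, rng),
    "T": init_layer(n, n_hidden, rng),
    "lambda_S": 1.0,
    "lambda_Q": 1.0,
}

x = [0.1, -0.4, 0.3, 0.9]
F = list(x)  # gradient of U(x) = |x|^2 / 2, used only as example input
S, Q, T = vnet_forward(params, x, F, t=[1.0, 0.0])
print(S, Q, T)
```

The $\tanh$ on the scaling outputs bounds $|S_{v}|$ and $|Q_{v}|$ by $\lambda_{S}$ and $\lambda_{Q}$, which keeps the exponentials in the leapfrog updates under control, while the translation $T_{v}$ is left unbounded.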

### <font color='#f92672'>Loss function $\mathcal{L}(\theta)$:</font>

 * Starting from $\xi$, use $S, T, Q$ in $\bfL_{\theta}$ to get $\xi^{\prime}$.
 
 * Define:
 
     \begin{equation}
     \delta(\xi, \xi^{\prime}) = \|x - x^{\prime}\|^{2}_{2}
     \end{equation}


 * We aim to minimize the *lag-one autocorrelation*
 
  - Or equivalently, to maximize the *expected squared jump distance* (ESJD)
  
  
 * <font color="#ffff00">**ESJD**:</font>
     \begin{equation}
     \mathcal{L}(\theta) \equiv \mathbb{E}_{\xi\sim p(\xi)}\left[\delta(\xi^{\prime}, \xi) \cdot A(\xi^{\prime}|\xi)\right]
     \end{equation}
 
  where $A(\xi^{\prime}|\xi)$ is the probability of accepting $\xi^{\prime}$ given $\xi$.
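
In practice the expectation is estimated by averaging over a batch of chains. A minimal sketch, with a hypothetical helper `esjd` and a hand-made batch of two chains:

```python
def esjd(x_batch, x_prop_batch, accept_probs):
    """Batch estimate of E[ ||x - x'||^2 * A(xi'|xi) ]."""
    total = 0.0
    for x, x_prop, a in zip(x_batch, x_prop_batch, accept_probs):
        delta = sum((xi - yi) ** 2 for xi, yi in zip(x, x_prop))
        total += delta * a
    return total / len(x_batch)


x_batch = [[0.0, 0.0], [1.0, 1.0]]
x_prop_batch = [[3.0, 4.0], [1.0, 1.0]]   # the second chain does not move
accept_probs = [0.5, 1.0]

value = esjd(x_batch, x_prop_batch, accept_probs)  # (25 * 0.5 + 0 * 1.0) / 2 = 6.25
print(value)
```

Weighting by the acceptance probability means that a large jump only counts if it is likely to be accepted. Since a larger ESJD is better, an optimizer that minimizes its objective would be given the negative of this quantity.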

### <font color="#f92672">Example: Gaussian Mixture Model</font>

<!---
<font size="22"><font color='#87ff00'><font style:bold>HMC</font></font> |  <font size="22"><font color='#f92672'><font style:bold>L2HMC</font></font>
:-------------------------:|:-------------------------:
<img src="./img/gmm_hmc.png" width=80% height=auto/>  |  <img src="./img/gmm_l2hmc.png" width=80% height=auto/>
--->
    
<!---
<p float="left">
  <img src="./img/gmm_hmc.png" width="45%" />
  <img src="./img/gmm_l2hmc.png" width="45%" /> 
</p>
--->

<center><font size="6" color='#87ff00'><b>HMC</b></font></center>
<img src="./img/gmm_hmc.png" width=75% height=auto/>
    
&nbsp;
<center><font size="6" color='#f92672'><b>L2HMC</b></font></center>
<img src="./img/gmm_l2hmc.png" width=75% height=auto/>
    

&nbsp;
<center><font size="6">Autocorrelation</font></center>
        
<img src="./img/gmm_acl.png" width=75% height=auto/>

## <font color='#87ff00'>Lattice QCD</font>
---

 * Start with simpler 2d $U(1)$ Lattice Gauge Theory
 
 
 * Dynamical variables $U_{\mu}(k)$ are defined on the *links* of a two-dimensional lattice with periodic boundary conditions.
 
 
 
 * Each link $U_{\mu}(k) \in U(1)$ can be expressed in terms of an angle $\phi_{\mu}(k) \in [-\pi, \pi)$ as:
   
    \begin{equation}
      U_{\mu}(k) = \exp\left(i\phi_{\mu}(k)\right) \in U(1)
    \end{equation}

### <font color='#f92672'> Wilson gauge action:</font>

\begin{equation}
  S = \sum_{k} \left[1 - \cos\left(\phi_{\mu\nu}(k)\right)\right]
\end{equation}

where

\begin{equation}
   \phi_{\mu\nu}(k)
   = \phi_{\mu}(k) + \phi_{\nu}(k+\hat{\mu}) 
     - \phi_{\mu}(k+\hat{\nu}) - \phi_{\nu}(k)
\end{equation}


is the sum of the link variables around the elementary plaquette, as shown below.

<img src="./img/plaq.png" width=100% height=auto />
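
The plaquette sum and the action can be evaluated directly on a small lattice. The sketch below assumes a storage layout chosen for illustration, `phi[mu][i][j]` for the link leaving site $k = (i, j)$ in direction $\mu$, and is not the layout used by the project code.

```python
import math


def plaquette(phi, i, j):
    """phi_{mu nu}(k) at site k = (i, j) on a periodic L x L lattice.

    phi[mu][i][j] is the angle of the link leaving site (i, j) in
    direction mu, with mu = 0 along i and nu = 1 along j.
    """
    L = len(phi[0])
    return (phi[0][i][j] + phi[1][(i + 1) % L][j]
            - phi[0][i][(j + 1) % L] - phi[1][i][j])


def wilson_action(phi):
    """S = sum_k [1 - cos(phi_{mu nu}(k))]."""
    L = len(phi[0])
    return sum(1.0 - math.cos(plaquette(phi, i, j))
               for i in range(L) for j in range(L))


def zero_links(L):
    """All link angles set to zero."""
    return [[[0.0] * L for _ in range(L)] for _ in range(2)]


L = 4
phi = zero_links(L)
s_cold = wilson_action(phi)      # every plaquette vanishes

phi[0][0][0] = math.pi / 2       # a single link enters two plaquettes
s_one_link = wilson_action(phi)
print(s_cold, s_one_link)
```

Changing one link angle to $\pi/2$ affects exactly the two plaquettes that contain it, with opposite signs, so the action rises from 0 to $2\,(1 - \cos(\pi/2)) = 2$.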

### <font color='#f92672'> Topological charge, $\mathcal{Q}$:</font>

\begin{equation}
  \mathcal{Q} = \frac{1}{2\pi}\sum_{k}\tilde \phi_{\mu\nu}
\end{equation}


where

\begin{equation}
\tilde\phi_{\mu\nu} \equiv \phi_{\mu\nu} - 2\pi\left\lfloor\frac{\phi_{\mu\nu}+\pi}{2\pi}\right\rfloor \in [-\pi, \pi)
\end{equation}
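
A short sketch of the projection and the resulting charge on a random configuration. As before, the `phi[mu][i][j]` layout and the function names are assumptions made for this example.

```python
import math
import random


def project(angle):
    """angle - 2 pi * floor((angle + pi) / (2 pi))."""
    return angle - 2.0 * math.pi * math.floor((angle + math.pi) / (2.0 * math.pi))


def plaquette(phi, i, j):
    """Plaquette angle at site (i, j) on a periodic L x L lattice."""
    L = len(phi[0])
    return (phi[0][i][j] + phi[1][(i + 1) % L][j]
            - phi[0][i][(j + 1) % L] - phi[1][i][j])


def topological_charge(phi):
    """Q = (1 / 2 pi) * sum_k of the projected plaquette angles."""
    L = len(phi[0])
    total = sum(project(plaquette(phi, i, j))
                for i in range(L) for j in range(L))
    return total / (2.0 * math.pi)


rng = random.Random(0)
L = 4
phi = [[[rng.uniform(-math.pi, math.pi) for _ in range(L)] for _ in range(L)]
       for _ in range(2)]

Q = topological_charge(phi)
print(Q)
```

On a periodic lattice every link appears in exactly two plaquettes with opposite signs, so the unprojected plaquette angles sum to zero and only the floor terms contribute. As a result, $\mathcal{Q}$ comes out as an integer up to floating-point round-off.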

<font color='#87ff00' size="6">HMC:</font>

<img src="./img/q_tplot_hmc.png" width=100% height=auto/>

<font color='#f92672' size="6">L2HMC:</font>
    
<img src="./img/q_tplot_l2hmc.png" width=100% height=auto/>

## <font color="#87ff00">Implementation</font>

### <font color='#f92672'>Imports</font><a class='tocSkip'>

In [15]:
import sys
import os 

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from IPython.display import HTML, Image

modulepath = os.path.abspath('..')
if modulepath not in sys.path:
    sys.path.append(modulepath)
    
import utils.file_io as io

sns.set_palette('bright')

### <font color='#f92672'>Build `Dynamics`</font>

In [46]:
from utils.attr_dict import AttrDict
from utils.data_containers import DataContainer
from utils.training_utils import build_dynamics, HEADER

FLAGS = AttrDict({
    'hmc': False,
    'hmc_start': False,
    'rand': True,
    'restore': False,
    'horovod': False,
    'eager_execution': True,
    'save_train_data': True,
    'eps_fixed': False,
    'eps': 0.1,
    'num_steps': 2,
    'beta_init': 3.5,
    'beta_final': 3.5,
    'train_steps': 50,
    'save_steps': 10, 
    'print_steps': 1,
    'logging_steps': 1,
    'dropout_prob': 0.25,
    'warmup_lr': True,
    'lr_init': 0.001,
    'lr_decay_steps': 10,
    'lr_decay_rate': 0.96,
    'plaq_weight': 0.1,
    'charge_weight': 0.1,
    'network_type': 'GaugeNetwork',
    'separate_networks': False,
    'units': 6 * [1024],
    'lattice_shape': [128, 16, 16, 2]  # batch_dim, s
})

dynamics, FLAGS = build_dynamics(FLAGS)

### <font color='#f92672'>Training</font>

#### <font color='#ffff00'>Setup:</font>

In [17]:
# Setup directories, etc.
log_dir = io.make_log_dir(FLAGS, 'GaugeModel')
train_dir = os.path.join(log_dir, 'training')
data_dir = os.path.join(train_dir, 'train_data')
ckpt_dir = os.path.join(train_dir, 'checkpoints')
log_file = os.path.join(train_dir, 'train_log.txt')
io.check_else_make_dir([train_dir, data_dir, ckpt_dir])

Creating directory: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429
Creating directory: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/training
Creating directory: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/training/train_data
Creating directory: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/training/checkpoints


In [22]:
from config import TF_FLOAT

train_data = DataContainer(FLAGS.train_steps, header=HEADER)
ckpt = tf.train.Checkpoint(dynamics=dynamics, optimizer=dynamics.optimizer)
manager = tf.train.CheckpointManager(ckpt, directory=ckpt_dir, max_to_keep=5)

# Start from a random configuration with link angles in [-pi, pi).
# Defined outside the branches so that `x` also exists after a restore.
x = tf.random.uniform(shape=dynamics.config.input_shape,
                      minval=-np.pi, maxval=np.pi)

if manager.latest_checkpoint:
    ckpt.restore(manager.latest_checkpoint)
    io.log(f'Restored from: {manager.latest_checkpoint}')
    train_data.restore(data_dir)
    current_step = dynamics.optimizer.iterations.numpy()
else:
    current_step = tf.convert_to_tensor(0, dtype=tf.int64)
    
train_steps = tf.range(FLAGS.train_steps)
# Inverse-coupling schedule: one value of beta per training step.
betas = tf.cast(
    np.linspace(FLAGS.beta_init, FLAGS.beta_final, num=FLAGS.train_steps),
    dtype=TF_FLOAT,
)

betas = betas[current_step:]
train_steps = train_steps[current_step:]

#### <font color='#ffff00'>Training loop:</font>

In [29]:
io.log(HEADER)
# Iterate over the remaining steps so that `step` stays aligned with `betas`
# when resuming from a checkpoint.
for step, beta in zip(train_steps, betas):
    x, metrics = dynamics.train_step(x, beta)
    
    if step % FLAGS.print_steps == 0:
        data_str = train_data.get_fstr(step, metrics)
        io.log(data_str)
        
    if step % FLAGS.logging_steps == 0 and FLAGS.save_train_data:
        train_data.update(step, metrics)
        
    if step % FLAGS.save_steps == 0:
        manager.save()
        train_data.save_data(data_dir)
        train_data.flush_data_strs(log_file, rank=0, mode='a')
        
manager.save()
train_data.flush_data_strs(log_file, rank=0, mode='a')

------------------------------------------------------------------------------------------------------------
    STEP         dt         LOSS         px         eps         BETA     sumlogdet       dQ       plaq_err  
------------------------------------------------------------------------------------------------------------
     0/50       0.4193      -450.7       0.695      0.1325        3.5     -0.0001699    0.1573     0.006343   
     1/50       0.4138      -431.8      0.6552      0.1328        3.5     -6.806e-05    0.1605     0.007405   
     2/50       0.4618      -459.9      0.6999      0.1331        3.5      4.874e-05    0.1496     0.006409   
     3/50       0.4183      -438.7      0.6542      0.1335        3.5     -0.0001358    0.1487     0.005778   
     4/50       0.4348      -444.8      0.6591      0.1338        3.5     -0.0001011    0.1353     0.004829   
     5/50       0.5002      -409.3      0.6008      0.1341        3.5      1.482e-05    0.1457     0.005037   
     6/

### <font color='#f92672'>Inference loop:</font>

In [42]:
run_steps = 100
run_data = DataContainer(run_steps, header=HEADER)
run_dir = io.make_run_dir(FLAGS, os.path.join(log_dir, 'inference'))

for step in tf.range(run_steps):
    x, metrics = dynamics.test_step(x, beta)
    run_data.update(step, metrics)
    if step % dynamics.print_steps == 0:
        data_str = run_data.get_fstr(step, metrics)
        io.log(data_str)

    if step % 100 == 0:
        io.log(HEADER)

Creating directory: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247
     0/100      0.4549      -456.9       0.608      0.1426        3.5      6.473e-05    0.03906    0.001858   
------------------------------------------------------------------------------------------------------------
    STEP         dt         LOSS         px         eps         BETA     sumlogdet       dQ       plaq_err  
------------------------------------------------------------------------------------------------------------
     1/100      0.4952      -433.8       0.578      0.1426        3.5     -0.0001515    0.05469    0.001919   
     2/100      0.4523      -446.6      0.5863      0.1426        3.5     -0.0003302    0.07031    0.002384   
     3/100      0.4393      -468.9      0.6179      0.1426        3.5      0.0001929    0.05469    0.002752   
     4/100      0.4316      -445.7      0.5918      

    70/100      0.6067      -494.6      0.6602      0.1426        3.5      9.044e-05    0.02344    0.001043   
    71/100       0.623      -432.6      0.5696      0.1426        3.5      2.121e-05    0.03125    0.001757   
    72/100      0.7012      -457.4      0.6002      0.1426        3.5     -0.0001271    0.07813    0.001224   
    73/100      0.6829      -459.8      0.6121      0.1426        3.5      5.612e-05    0.03125    0.001893   
    74/100      0.6754      -475.6      0.6389      0.1426        3.5     -0.0005095    0.03906    0.0005029  
    75/100      0.7197      -476.2      0.6357      0.1426        3.5      0.0001793    0.02344    0.0003347  
    76/100      0.7423      -427.5      0.5663      0.1426        3.5     -0.0001553    0.03125    0.0003972  
    77/100      0.6565       -450        0.602      0.1426        3.5      2.506e-05    0.05469    0.0005099  
    78/100      0.7027      -428.3      0.5686      0.1426        3.5      5.36e-05     0.03906    0.001144   
 

In [44]:
from utils.plotting_utils import plot_data

plot_data(run_data, run_dir, FLAGS, thermalize=True)

Creating directory: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/dt.png.
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/beta.png.
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/loss.png.




Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/plaqs_traceplot.png.
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/eps.png.




Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/charges_traceplot.png.




Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/accept_prob_traceplot.png.




Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/sumlogdet_traceplot.png.




Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/dq_traceplot.png.
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/plaqs_avg.png.
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/charges_avg.png.
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/accept_prob_avg.png.
Saving figure to: /Users/saforem2/l2hmc-qcd/gauge_logs_eager/test/2020_07/DEBUG_L16_b128_lf2_qw01_pw01_dp025-2020-07-02-092429/inference/beta35_lf2_eps01_b128-2020-07-02-1247/plots/sumlogdet_avg.png.
Savin