# Grand canonical ensemble transition-matrix Monte Carlo

In this example, flat histogram methods are employed for a small macrostate range from 0 to 5 particles.
Flat histogram acceptance criteria and Monte Carlo are defined using `fh.py`.
To begin, the system is initialized with the minimum number of particles by setting Metropolis acceptance criteria with favorable conditions for adding particles.
The Metropolis criteria are then replaced with the flat histogram criteria.
At this point, the typical analyses from the previous tutorials are added.
Checkpoint files, the criteria status, and the average energy of each macrostate are also written.
Finally, the simulation runs until the requested number of iterations of the flat histogram algorithm is complete.

A small macrostate range allows the simulation to run quickly with good sampling, which makes it an ideal starting point for testing the simulations. To begin, read the previously published values from the NIST Standard Reference Simulation Website (SRSW) from file for comparison.

In [1]:
import pandas as pd
import feasst as fst

# Read the SRSW reference data once; only the first 6 macrostates (N = 0 to 5) are used
srsw = pd.read_csv("../test/data/stat150.csv")
ln_prob_srsw = fst.LnProbability(srsw["lnPI"].values[:6])
ln_prob_srsw.normalize() # normalize to account for a smaller macrostate range
df = pd.DataFrame(data=ln_prob_srsw.values(), columns=["ln_prob_srsw"])  # columns must be a list, not a set
df['ln_prob_srsw_std'] = 0.04
df['u_srsw'] = srsw["energy"]  # assignment aligns on the index, so only rows 0 to 5 are kept
df['u_srsw_std'] = srsw["energystd"]
df

Unnamed: 0,ln_prob_srsw,ln_prob_srsw_std,u_srsw,u_srsw_std
0,-18.70757,0.04,-2.312265e-10,6.689238e-10
1,-14.037373,0.04,-0.0006057402,6.709198e-10
2,-10.050312,0.04,-0.03057422,9.649147e-06
3,-6.458921,0.04,-0.08992832,0.0001387472
4,-3.145637,0.04,-0.1784571,3.315245e-05
5,-0.045677,0.04,-0.296192,1.348791e-05
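
The normalization step can be illustrated without FEASST. The following is a minimal sketch, assuming that normalizing means shifting the log-probabilities so that the probabilities of the retained macrostates sum to one. `normalize_ln_prob` is a hypothetical helper for illustration, not part of the FEASST API, and the constant shift of 3.0 is arbitrary.

```python
import math

def normalize_ln_prob(ln_prob):
    """Shift ln_prob so that sum(exp(ln_prob)) equals 1.

    The maximum is factored out before exponentiating (log-sum-exp)
    to avoid overflow or underflow.
    """
    ln_max = max(ln_prob)
    ln_sum = ln_max + math.log(sum(math.exp(v - ln_max) for v in ln_prob))
    return [v - ln_sum for v in ln_prob]

# Normalized values copied from the table above.
ln_prob_table = [-18.70757, -14.037373, -10.050312, -6.458921, -3.145637, -0.045677]

# Add an arbitrary constant (an assumption, for illustration only) to mimic
# an unnormalized distribution, then normalize it again.
unnormalized = [v + 3.0 for v in ln_prob_table]
normalized = normalize_ln_prob(unnormalized)
total = sum(math.exp(v) for v in normalized)
print(total)
for original, recovered in zip(ln_prob_table, normalized):
    print(original, recovered)
```

Because only the constant offset changes, differences between any two macrostates (and thus relative free energies) are unaffected by the normalization.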


In [2]:
import unittest
import pyfeasst

def run_sample_lj_tm_mc(checkpoint_file_name):
    monte_carlo = fst.MonteCarlo()
    monte_carlo.set(fst.lennard_jones(fst.args({"cubic_box_length": "8"})))
    monte_carlo.set(fst.MakeFlatHistogram(
        fst.MakeMacrostateNumParticles(fst.Histogram(fst.args({"width": "1", "min": "0", "max": "5"}))),
        fst.MakeTransitionMatrix(fst.args({"min_sweeps": "50"})),
        fst.args({"beta": str(1./1.5), "chemical_potential": "-2.352321"})))
    monte_carlo.add(fst.MakeTrialTranslate(fst.args({"weight": "0.25", "tunable_param": "1."})))
    monte_carlo.add(fst.MakeTrialTransfer(fst.args({"weight": "1", "particle_type": "0"})))
    monte_carlo.add(fst.MakeLogAndMovie(fst.args({"steps_per": str(1e5), "file_name": "lj"})))
    monte_carlo.add(fst.MakeCheckEnergyAndTune(fst.args({"steps_per": str(1e5), "tolerance": str(1e-8)})))
    monte_carlo.add(fst.MakeCriteriaUpdater(fst.args({"steps_per": str(1e5)})))
    monte_carlo.add(fst.MakeCriteriaWriter(fst.args({"steps_per": str(1e5), "file_name": "lj_fh.txt"})))
    monte_carlo.add(fst.MakeEnergy(fst.args({"file_name": "lj_en.txt", "steps_per_update": "1",
        "steps_per_write": str(1e5), "multistate": "true"})))
    monte_carlo.set(fst.MakeCheckpoint(fst.args({"file_name": checkpoint_file_name, "num_hours": "0.001"})))
    monte_carlo.run_until_complete()

class TestFlatHistogramLJ(unittest.TestCase):
    """Test flat histogram grand canonical ensemble Monte Carlo simulations"""
    def test_serial_5max(self):
        """Compare the free energies and potential energies with the NIST SRSW
        https://www.nist.gov/programs-projects/nist-standard-reference-simulation-website
        https://mmlapps.nist.gov/srs/LJ_PURE/eostmmc.htm
        """
        # To emulate post-processing, obtain monte_carlo from checkpoint file
        checkpoint_file_name='checkpoint.txt'
        run_sample_lj_tm_mc(checkpoint_file_name)
        monte_carlo = fst.MonteCarlo().deserialize(pyfeasst.read_checkpoint(checkpoint_file_name))

        # To compare with previous values, make a deep copy of the FlatHistogram derived class
        criteria = fst.FlatHistogram(monte_carlo.criteria())
        print('lnpi energy')
        for macro in range(criteria.num_states()):
            self.assertAlmostEqual(
                df["ln_prob_srsw"][macro],
                criteria.bias().ln_prob().value(macro),
                delta=df["ln_prob_srsw_std"][macro])
            energy_analyzer = monte_carlo.analyze(monte_carlo.num_analyzers() - 1)
            energy_accumulator = energy_analyzer.analyze(macro).accumulator()
            stdev = (df["u_srsw_std"][macro]**2 + energy_accumulator.block_stdev()**2)**(1./2.)
            #print(criteria.bias().ln_prob().value(macro), energy_accumulator.average())
            self.assertAlmostEqual(
                df["u_srsw"][macro],
                energy_accumulator.average(),
                delta=5*stdev)

In [3]:
%%time
unittest.main(argv=[''], verbosity=2, exit=False)

test_serial_5max (__main__.TestFlatHistogramLJ)
Compare the free energies and potential energies with the NIST SRSW ... 

lnpi energy
CPU times: user 10.1 s, sys: 1.37 s, total: 11.5 s
Wall time: 11.5 s


ok

----------------------------------------------------------------------
Ran 1 test in 11.453s

OK


<unittest.main.TestProgram at 0x7f623aa4ccf8>

A number of files should also have been created, including the log file `lj.txt`, the criteria file `lj_fh.txt`, and the energy file `lj_en.txt`.
If the flat histogram method is sampling perfectly, the simulation performs a random walk along the macrostate.
For larger ranges of macrostates, or for more difficult sampling cases, monitoring the macrostate can help you determine what conditions are preventing convergence.
For example, the log file reports the state as a function of the number of attempts:


In [4]:
pd.read_csv("lj.txt", header=0).dropna(axis='columns')

Unnamed: 0,volume,p0,state,energy,attempt,TrialTranslate,tunable,TrialAdd,TrialRemove
0,512,5,5,-0.9423,100000,0.939505,1,0.0430282,0.0429021
1,512,2,2,-0.00242296,200000,0.965169,1.05,0.565513,0.565051
2,512,5,5,-0.10326,300000,0.974583,1.1025,0.771117,0.977977
3,512,2,2,-0.00242296,400000,0.966711,1.15763,0.815167,0.973676
4,512,2,2,-0.00242296,500000,0.968584,1.21551,0.807139,0.972768
...,...,...,...,...,...,...,...,...,...
100,512,2,2,-0.00242296,4800000,0.956784,3.92013,0.812945,0.972064
101,512,0,0,1.70979e-16,4900000,0.956441,3.92013,0.814978,0.974317
102,512,4,4,-0.76314,5000000,0.957303,3.92013,0.813699,0.973621
103,512,5,5,-0.894897,5100000,0.959727,3.92013,0.814327,0.973249


Note that states are zero-based integer indices (e.g., 0, 1, 2, ..., criteria.num_states() - 1).
The state and macrostate happen to be the same when the minimum macrostate is 0 and the macrostate is the integer number of particles.
But if the minimum macrostate were 1, then state 0 would correspond to macrostate 1.0.
Obtain the macrostate value for an arbitrary state as follows.

In [5]:
monte_carlo = fst.MonteCarlo().deserialize(pyfeasst.read_checkpoint("checkpoint.txt"))
criteria = fst.FlatHistogram(monte_carlo.criteria())
print('state macrostate')
for state in range(criteria.num_states()):
    print(state, criteria.macrostate().value(state))

state macrostate
0 0.0
1 1.0
2 2.0
3 3.0
4 4.0
5 5.0
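
The relationship between state and macrostate can be sketched in plain Python. This is a minimal illustration, assuming uniformly spaced bins whose first bin is centered on the minimum macrostate, which is consistent with the output above. The helpers `state_to_macrostate` and `macrostate_to_state` are hypothetical and are not FEASST functions.

```python
def state_to_macrostate(state, macro_min=0.0, width=1.0):
    """Return the bin-center macrostate value for a zero-based state index."""
    return macro_min + state * width

def macrostate_to_state(macrostate, macro_min=0.0, width=1.0):
    """Return the zero-based state index of the bin nearest to a macrostate value."""
    return int(round((macrostate - macro_min) / width))

# With a minimum macrostate of 1, state 0 corresponds to macrostate 1.0.
for state in range(3):
    print(state, state_to_macrostate(state, macro_min=1.0))
```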


Many simulation parameters may be obtained from the checkpoint file to automate your analysis.

In [6]:
print('volume', monte_carlo.configuration().domain().volume())
print('beta', monte_carlo.criteria().beta())
print('beta_mu', monte_carlo.criteria().beta_mu())
print('macro_min', criteria.macrostate().value(0))  # monte_carlo.criteria() doesn't know the macrostate. Use the copy of the derived class
print('macro_max', criteria.macrostate().value(criteria.num_states() - 1))
print('macro_max', criteria.macrostate().histogram().center_of_last_bin())

volume 512.0
beta 0.6666666666666666
beta_mu -1.5682139999999998
macro_min 0.0
macro_max 5.0
macro_max 5.0


The energy of each macrostate may also be compared with the published values in the NIST SRSW.

In [7]:
en = pd.read_csv("lj_en.txt").rename(columns={"average": "u", "block_stdev": "u_std"})
pd.concat([pd.DataFrame(en[["u", "u_std"]]), df[["u_srsw", "u_srsw_std"]]], axis=1)

Unnamed: 0,u,u_std,u_srsw,u_srsw_std
0,4.588091e-16,3.784382e-16,-2.312265e-10,6.689238e-10
1,-0.00060574,0.0,-0.0006057402,6.709198e-10
2,-0.03053308,0.0002170894,-0.03057422,9.649147e-06
3,-0.08954303,0.0003221287,-0.08992832,0.0001387472
4,-0.1775581,0.0008888823,-0.1784571,3.315245e-05
5,-0.2964494,0.002584105,-0.296192,1.348791e-05
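
The comparison in the table can also be automated. The following is a minimal sketch that mirrors the tolerance used in the unit test above, where two values agree if they differ by no more than five combined standard deviations. The helper `agrees` is hypothetical, and the rows are copied by hand from the table rather than read from `lj_en.txt`.

```python
import math

def agrees(a, a_std, b, b_std, num_std=5.0):
    """Return True if a and b differ by at most num_std combined standard deviations."""
    combined = math.sqrt(a_std**2 + b_std**2)
    return abs(a - b) <= num_std * combined

# Each row is (u, u_std, u_srsw, u_srsw_std), copied from the table above.
rows = [
    (4.588091e-16, 3.784382e-16, -2.312265e-10, 6.689238e-10),
    (-0.00060574, 0.0, -0.0006057402, 6.709198e-10),
    (-0.03053308, 0.0002170894, -0.03057422, 9.649147e-06),
    (-0.08954303, 0.0003221287, -0.08992832, 0.0001387472),
    (-0.1775581, 0.0008888823, -0.1784571, 3.315245e-05),
    (-0.2964494, 0.002584105, -0.296192, 1.348791e-05),
]
results = [agrees(*row) for row in rows]
print(results)
```

Adding the two standard deviations in quadrature treats the simulation and the reference as independent estimates, which is the same assumption made in the unit test.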


You may also compare the natural logarithm of the macrostate probability with the SRSW values.

In [8]:
pd.concat([df["ln_prob_srsw"], pd.read_csv("lj_fh.txt", header=3)['ln_prob']], axis=1)

Unnamed: 0,ln_prob_srsw,ln_prob
0,-18.70757,-18.701628
1,-14.037373,-14.031816
2,-10.050312,-10.046629
3,-6.458921,-6.456832
4,-3.145637,-3.145035
5,-0.045677,-0.045708


The macrostate probability distribution depends upon the choice of the chemical potential, but can be reweighted to different chemical potentials.
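
As a rough illustration of reweighting, the standard histogram reweighting relation for the grand canonical ensemble is $\ln\Pi(N;\mu) = \ln\Pi(N;\mu_0) + \beta(\mu - \mu_0)N + C$, where $C$ is fixed by normalization. The sketch below applies it in plain Python to the `ln_prob` values printed above. The helper `reweight` and the target chemical potential of -3.0 are assumptions chosen for illustration and are not part of FEASST. Because the macrostate range here is truncated at 5 particles, the result is only illustrative.

```python
import math

def reweight(ln_prob, num_particles, delta_beta_mu):
    """Reweight ln_prob by delta_beta_mu = beta*(mu_new - mu_old) and renormalize."""
    shifted = [lp + delta_beta_mu * n for lp, n in zip(ln_prob, num_particles)]
    ln_max = max(shifted)
    ln_sum = ln_max + math.log(sum(math.exp(v - ln_max) for v in shifted))
    return [v - ln_sum for v in shifted]

def average_num_particles(ln_prob, num_particles):
    """Return the ensemble-average number of particles for a normalized ln_prob."""
    return sum(n * math.exp(lp) for n, lp in zip(num_particles, ln_prob))

# ln_prob values copied from the simulation output above.
ln_prob = [-18.701628, -14.031816, -10.046629, -6.456832, -3.145035, -0.045708]
num_particles = list(range(6))
beta = 1.0 / 1.5
mu_old = -2.352321
mu_new = -3.0  # hypothetical target chemical potential

ln_prob_new = reweight(ln_prob, num_particles, beta * (mu_new - mu_old))
average_n_old = average_num_particles(ln_prob, num_particles)
average_n_new = average_num_particles(ln_prob_new, num_particles)
print(average_n_old, average_n_new)
```

Lowering the chemical potential shifts probability toward smaller numbers of particles, so the average number of particles decreases.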

Did this tutorial work as expected? Did you find any inconsistencies or have any comments? Please [contact](../../../CONTACT.rst) us. Any feedback is appreciated!