# Prepare from bootstrap paths

Now we will use the initial trajectories we obtained from bootstrapping to run an MSTIS simulation. This will show both how objects can be regenerated from storage and how regenerated equivalent objects can be used in place of objects that weren't stored.

Tasks covered in this notebook:
* Loading OPS objects from storage
* Ways of assigning initial trajectories to initial samples
* Setting up a path sampling simulation with various move schemes
* Visualizing trajectories while the path sampling is running

In [1]:
%matplotlib inline
import openpathsampling as paths
import numpy as np
import math

In [2]:
# Imports for plotting
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
from matplotlib.legend_handler import HandlerLine2D

### Loading things from storage

First we'll reload some of the stuff we stored before. Of course, this starts with opening the file.

In [3]:
old_store = paths.AnalysisStorage("ala.nc")

A lot of information can be recovered from the old storage, so we don't have to recreate it. If the network had not been saved, we would have to create a new one, and since the network creates the ensembles, we would then have to translate the trajectories from the old ensembles to the new ones. Here the file does contain a network (see the counts below), so we can load it directly.

In [4]:
print "PathMovers:", len(old_store.pathmovers)
print "Engines:", len(old_store.engines)
print "Samples:", len(old_store.samples)
print "Ensembles:", len(old_store.ensembles)
print "SampleSets:", len(old_store.samplesets)
print "Snapshots:", len(old_store.snapshots)
print "Networks:", len(old_store.networks)

PathMovers: 0
Engines: 3
Samples: 28
Ensembles: 351
SampleSets: 1
Snapshots: 6010
Networks: 1


Loading from storage is very easy. Each store behaves like a list. We take the 0th snapshot as a template (it doesn't actually matter which one) for the next storage we'll create. The file lists three engines, so we select the one saved under the name `'default'`.

In [5]:
template = old_store.snapshots[0]

In [6]:
engine = old_store.engines['default']

In [8]:
mstis = old_store.networks[0]
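The cells above index stores both by position (`snapshots[0]`) and by name (`engines['default']`). As a rough mental model only, and not the actual OPS implementation, a store can be pictured as a list with an optional name index. `ToyStore` below is a hypothetical stand-in written purely for illustration.

```python
class ToyStore(object):
    """Hypothetical stand-in for a store: list-like, with optional names."""

    def __init__(self):
        self._items = []
        self._by_name = {}

    def save(self, obj, name=None):
        # Append the object and remember its name, if one was given.
        self._items.append(obj)
        if name is not None:
            self._by_name[name] = obj
        return len(self._items) - 1

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        # String keys look up by name; anything else is a positional index.
        if isinstance(key, str):
            return self._by_name[key]
        return self._items[key]


engines = ToyStore()
engines.save("engine-a")
engines.save("engine-b", name="default")
engines.save("engine-c")

print("Engines: %d" % len(engines))
print(engines["default"])
print(engines[0])
```

With several engines in the file, looking one up by name is less fragile than relying on its position.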

Now we need to set up real trajectories that we can use for each of these. We can start by loading the stored sample set.

In [9]:
sset = paths.SampleSet(old_store.samplesets[0])

In [10]:
sset.sanity_check()
assert(len(sset)==28)

### Setting up special ensembles

Whichever way we initially set up the `SampleSet`, at this point it only contains samples for the main sampling trajectories of each transition. Now we need to put trajectories into various auxiliary ensembles.

#### Multiple state outer ensemble

The multiple state outer ensemble is, in fact, sampled during the bootstrapping. However, it is actually sampled once for every state that shares it. It is very easy to find a trajectory that satisfies the ensemble and to add that sample to our sample set.

In [11]:
for outer_ens in mstis.special_ensembles['ms_outer']:
    # doesn't matter which we take, so we take the first
    traj = next(s.trajectory for s in old_store.samplesets[0] if outer_ens(s.trajectory))
    samp = paths.Sample(
            replica=None,
            ensemble=outer_ens,
            trajectory=traj
    )
    # now we apply it and correct for the replica ID
    sset.append_as_new_replica(samp)

In [12]:
sset.sanity_check()
assert(len(sset)==29)
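Cell 11 uses `next(...)` over a generator expression to take the first stored trajectory that satisfies the outer ensemble. If nothing matches, `next` raises `StopIteration`. The sketch below shows the same pattern on plain lists, with a default value so that a missing match can be handled explicitly. The `in_outer` predicate and the toy trajectories are invented for illustration and have nothing to do with the real ensembles.

```python
def first_matching(items, predicate):
    """Return the first item for which predicate(item) is true, or None."""
    return next((item for item in items if predicate(item)), None)


def in_outer(traj):
    # Hypothetical ensemble test: the path must reach a value of at least 5.
    return max(traj) >= 5


toy_trajectories = [[0, 1, 2, 0], [0, 3, 6, 1], [0, 7, 0]]

match = first_matching(toy_trajectories, in_outer)
print(match)

# No toy trajectory gets anywhere near 100, so the default is returned.
missing = first_matching(toy_trajectories, lambda traj: max(traj) > 100)
print(missing)
```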

#### Minus interface ensemble

The minus interface ensembles do not yet have a trajectory. We will generate them by starting with same-state trajectories (A-to-A, B-to-B, C-to-C) in each interface, and extending into the minus ensemble.

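* Check whether the trajectory in the innermost interface is A-to-A, i.e. starts and ends in the same state.
* Extend that trajectory into the minus ensemble.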

First we need to make sure that the trajectory in the innermost ensemble of each state also ends in that state. This is necessary so that when we extend the trajectory, it can extend into the minus ensemble.

If the trajectory isn't right, we run a shooting move on it until it is.

In [13]:
for transition in mstis.sampling_transitions:
    innermost_ensemble = transition.ensembles[0]
    shooter = None
    if not transition.stateA(sset[innermost_ensemble].trajectory[-1]):
        shooter = paths.OneWayShootingMover(ensemble=innermost_ensemble,
                                            selector=paths.UniformSelector(),
                                            engine=engine)
        pseudoscheme = paths.LockedMoveScheme(root_mover=shooter)
        pseudosim = paths.PathSampling(storage=None, 
                                       move_scheme=pseudoscheme, 
                                       globalstate=sset)
    while not transition.stateA(sset[innermost_ensemble].trajectory[-1]):
        pseudosim.run(1)
        sset = pseudosim.globalstate

    

Working on Monte Carlo cycle number 25.
DONE! Completed 25 Monte Carlo cycles.
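The loop in cell 13 repeats single shooting moves until the innermost trajectory ends in its own state. The control flow can be illustrated with a toy one-dimensional random walk. Everything here is an assumption made for the sketch: the states (`x <= 0` for A, `x >= 4` for B), the forward-only "shot", and the fact that every shot is accepted. It is not how `OneWayShootingMover` is implemented.

```python
import random


def in_state_a(x):
    return x <= 0


def in_state_b(x):
    return x >= 4


def toy_forward_shot(traj, rng):
    """Keep frames up to a random interior frame, then regrow until a state is hit."""
    shoot_index = rng.randrange(1, len(traj) - 1)
    new_traj = list(traj[:shoot_index + 1])
    while not (in_state_a(new_traj[-1]) or in_state_b(new_traj[-1])):
        new_traj.append(new_traj[-1] + rng.choice([-1, 1]))
    return new_traj


rng = random.Random(0)
traj = [0, 1, 2, 3, 4]  # starts in A but ends in B, so it cannot be extended yet
cycles = 0
max_cycles = 1000  # safety guard so the sketch can never loop forever

while not in_state_a(traj[-1]) and cycles < max_cycles:
    traj = toy_forward_shot(traj, rng)
    cycles += 1

print("cycles used: %d" % cycles)
print("ends in A: %s" % in_state_a(traj[-1]))
```

As in the notebook, the number of cycles needed is not known in advance; the loop simply runs until the end-point condition holds.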


Now that all the innermost ensembles are safe to use for extending into a minus interface, we extend them into a minus interface:

In [14]:
def traj_info(traj, ensemble):
    return traj.summarize_by_volumes_str(
            {"A" : ensemble.state_vol,
             "I" : ~ensemble.state_vol & ensemble.innermost_vol,
             "X" : ~ensemble.innermost_vol})

In [15]:
minus_samples = []
for transition in mstis.sampling_transitions:
#    paths.tools.refresh_output('Creating minus sample for %s' % transition.name) 
    print transition.name,
    choices = list()
    from_state = transition.stateA
    max_length = 10
    while len(choices) == 0: 
        leaveSt = paths.SequentialEnsemble([
                paths.AllInXEnsemble(from_state),
                paths.SingleFrameEnsemble(
                    paths.AllOutXEnsemble(from_state)
                )
            ])
        extend = paths.LengthEnsemble(slice(0,max_length)) & paths.SequentialEnsemble([
                paths.AllOutXEnsemble(from_state),
                paths.SingleFrameEnsemble(
                    paths.AllInXEnsemble(from_state)
                )
            ])
        part1 = sset[transition.ensembles[0]].trajectory
        part2 = engine.generate_forward(part1[-1], leaveSt)
        part3 = engine.generate_forward(part2[-1], extend)
        
#        print len(part1), len(part2), len(part3)
        
        attempt = part1 + part2[1:] + part3[1:]
        
#         attempt = transition.minus_ensemble.populate_minus_ensemble(
#             partial_traj=sset[transition.ensembles[0]].trajectory,
#             minus_replica_id=-len(minus_samples)-1,
#             engine=engine
#         )
        choices = transition.minus_ensemble.split(attempt)
    
        print '%2d' % max_length,
        
        max_length += 5

    print '(%2d) %s' % (len(attempt), traj_info(attempt, transition.minus_ensemble))
        
    minus_samples.append(
        paths.Sample(
            replica=-len(minus_samples)-1,
            trajectory=choices[0],
            ensemble=transition.minus_ensemble
        )        
    )
    
sset = sset.apply_samples(minus_samples)

Out B0 10 15 20 (14) A-X-A-X-A
Out A0 10 15 (10) A-X-A-X-A
Out D0 10 (17) A-X-A-X-A
Out E0 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 (70) A-X-A-X-A
Out C0 10 (12) A-X-A-X-A
Out F0 10 (35) A-X-A-X-A
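Each line of the output above ends with a string such as `A-X-A-X-A`, which reads as the sequence of regions the trajectory visits: in the state, outside the innermost interface, back in the state, and so on. A minimal sketch of how such a summary can be built, assuming it simply collapses runs of consecutive frames that carry the same label, is given below. The function names and the one-dimensional volumes are hypothetical, and this is not the code behind `summarize_by_volumes_str`.

```python
def summarize_by_labels(frames, label_of):
    """Collapse consecutive frames with the same label into a single entry."""
    labels = []
    for frame in frames:
        label = label_of(frame)
        if not labels or labels[-1] != label:
            labels.append(label)
    return "-".join(labels)


def toy_label(x):
    # Hypothetical 1D volumes: A for x <= 0, I for 0 < x <= 1, X beyond that.
    if x <= 0:
        return "A"
    if x <= 1:
        return "I"
    return "X"


toy_minus_path = [0, 2, 3, 2, 0, -1, 0, 2, 2, 0]
print(summarize_by_labels(toy_minus_path, toy_label))

toy_slow_path = [0, 1, 2, 1, 0]
print(summarize_by_labels(toy_slow_path, toy_label))
```

The first toy path leaves A twice and returns twice, the same pattern the minus trajectories show.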


In [16]:
sset.sanity_check()
assert(len(sset)==35)
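In cell 15 the candidate trajectory is assembled as `part1 + part2[1:] + part3[1:]`. Each forward-generated segment starts from the last frame of the previous one, so that shared frame has to be dropped to avoid duplicating it. The sketch below shows the same bookkeeping on plain lists; the numbers are made-up positions with state A taken to be `x <= 0`.

```python
def join_segments(*segments):
    """Concatenate segments where each one starts on the previous one's last frame."""
    joined = list(segments[0])
    for segment in segments[1:]:
        # The first frame of this segment repeats the current last frame.
        assert segment[0] == joined[-1]
        joined.extend(segment[1:])
    return joined


part1 = [0, 1, 0]      # existing A-to-A path
part2 = [0, -1, 0, 1]  # stays in A, then a single frame outside
part3 = [1, 2, 1, 0]   # stays outside, then a single frame back in A

attempt = join_segments(part1, part2, part3)
print(attempt)
print(len(attempt) == len(part1) + len(part2) + len(part3) - 2)
```

Two frames are shared between neighbouring segments, so the joined path is two frames shorter than the sum of the segment lengths.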

In [17]:
for s in sset:
    print s.replica, len(s.trajectory), s.ensemble.name, s.ensemble(s.trajectory)

0 3 I'face 0 True
1 29 I'face 1 True
2 52 I'face 2 True
3 71 I'face 3 True
4 71 [TISEnsemble] True
5 9 I'face 0 True
6 28 I'face 1 True
7 24 I'face 2 True
8 24 [TISEnsemble] True
9 3 I'face 0 True
10 25 I'face 1 True
11 48 I'face 2 True
12 46 I'face 3 True
13 46 [TISEnsemble] True
14 10 I'face 0 True
15 16 I'face 1 True
16 94 I'face 2 True
17 136 I'face 3 True
18 136 [TISEnsemble] True
19 15 I'face 0 True
20 16 I'face 1 True
21 71 I'face 2 True
22 71 [TISEnsemble] True
24 86 I'face 1 True
25 70 I'face 2 True
26 70 I'face 3 True
27 70 [TISEnsemble] True
28 52 [UnionEnsemble] True
23 30 I'face 0 True
-1 14 [MinusInterfaceEnsemble] True
-2 10 [MinusInterfaceEnsemble] True
-3 17 [MinusInterfaceEnsemble] True
-4 70 [MinusInterfaceEnsemble] True
-5 12 [MinusInterfaceEnsemble] True
-6 35 [MinusInterfaceEnsemble] True
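The listing above is a manual consistency check: every replica ID appears once, and every trajectory satisfies its ensemble (the final `True`). A hedged sketch of an automated version on plain tuples follows. It is an illustration only; the tuple layout and the `starts_and_ends_at_zero` test are assumptions and not a description of what `sanity_check()` does internally.

```python
from collections import Counter


def check_samples(samples):
    """Each sample is (replica, trajectory, ensemble_test). Return a list of problems."""
    problems = []
    counts = Counter(replica for replica, _, _ in samples)
    for replica, count in sorted(counts.items()):
        if count > 1:
            problems.append("replica %d appears %d times" % (replica, count))
    for replica, traj, ensemble_test in samples:
        if not ensemble_test(traj):
            problems.append("replica %d does not satisfy its ensemble" % replica)
    return problems


def starts_and_ends_at_zero(traj):
    # Hypothetical ensemble test for the toy data.
    return traj[0] == 0 and traj[-1] == 0


good = [
    (0, [0, 1, 0], starts_and_ends_at_zero),
    (1, [0, 2, 0], starts_and_ends_at_zero),
    (-1, [0, 1, 0, 1, 0], starts_and_ends_at_zero),
]
bad = good + [(1, [0, 2, 3], starts_and_ends_at_zero)]

print(check_samples(good))
print(check_samples(bad))
```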


## Equilibration

In molecular dynamics, you need to equilibrate if you don't start with an equilibrium frame (e.g., if you start with solvent molecules on a grid, your system should equilibrate before you start taking statistics). Similarly, if you start with a set of paths which are far from the path ensemble equilibrium, you need to equilibrate. This could either be because your trajectories are not from the real dynamics (generated with metadynamics, high temperature, etc.) or because your trajectories are not representative of the path ensemble (e.g., if you put transition trajectories into all interfaces).

As with MD, running equilibration can be the same process as running the total simulation. However, in path sampling, it doesn't have to be: we can equilibrate without replica exchange moves or path reversal moves, for example. In the example below, we create a `MoveScheme` that only includes shooting movers.

In [18]:
equil_scheme = paths.MoveScheme(mstis)
import openpathsampling.analysis.move_strategy as strat
equil_scheme.append([
        strat.OneWayShootingStrategy(engine=engine), 
        strat.OrganizeByMoveGroupStrategy()
    ])

In [19]:
equilibration = paths.PathSampling(
    storage=None,
    globalstate=sset,
    move_scheme=equil_scheme
)

In [20]:
equilibration.run(20)

Working on Monte Carlo cycle number 20.
DONE! Completed 20 Monte Carlo cycles.


In [21]:
sset = equilibration.globalstate
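The equilibration above uses a scheme restricted to shooting moves. Conceptually, a move scheme can be pictured as a set of move types with selection weights, and the equilibration scheme as the subset that keeps only shooting. The toy below illustrates that idea. The move names and weights are invented, and the selection routine is a generic weighted choice, not the OPS `MoveScheme` machinery.

```python
import random

# Hypothetical move types and weights; not taken from an actual OPS scheme.
full_scheme = {"shooting": 1.0, "repex": 0.5, "pathreversal": 0.5, "minus": 0.2}

# Equilibration variant: keep only the shooting moves.
equil_scheme = {name: weight for name, weight in full_scheme.items()
                if name == "shooting"}


def pick_move(scheme, rng):
    """Pick a move type with probability proportional to its weight."""
    names = sorted(scheme)
    threshold = rng.random() * sum(scheme[name] for name in names)
    running = 0.0
    for name in names:
        running += scheme[name]
        if threshold < running:
            return name
    return names[-1]  # guard against floating-point round-off


rng = random.Random(1)
equil_moves = [pick_move(equil_scheme, rng) for _ in range(20)]
full_moves = [pick_move(full_scheme, rng) for _ in range(20)]

print(sorted(set(equil_moves)))
```

Every cycle drawn from the restricted scheme is a shooting move, whereas the full scheme mixes in the other move types.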

In [22]:
storage = paths.storage.Storage("weina_pre.nc", "w")

In [24]:
storage.save(template)

(store.snapshots[BaseSnapshot],
 2,
 UUID('646b9f7a-4750-11e6-8d6d-00000000000c'))

In [71]:
storage.snapshots.store_snapshot_list

[store.snapshot0[Snapshot(BaseSnapshot)]]

In [84]:
s0 = storage.stores['snapshot0']

In [89]:
storage.variables.keys()

['stores_json',
 'stores_uuid',
 'stores_name',
 'transitions_json',
 'transitions_uuid',
 'transitions_name',
 'pathmovechanges_uuid',
 'pathmovechanges_details',
 'pathmovechanges_mover',
 'pathmovechanges_cls',
 'pathmovechanges_subchanges',
 'pathmovechanges_samples',
 'engines_json',
 'engines_uuid',
 'engines_name',
 'topologies_json',
 'topologies_uuid',
 'topologies_name',
 'pathsimulators_json',
 'pathsimulators_uuid',
 'pathsimulators_name',
 'trajectories_json',
 'trajectories_uuid',
 'trajectories_snapshots',
 'tag_json',
 'tag_uuid',
 'tag_name',
 'cvs_json',
 'cvs_uuid',
 'cvs_name',
 'snapshots_uuid',
 'snapshots_store',
 'snapshottype',
 'cvcache',
 'pathmovers_json',
 'pathmovers_uuid',
 'pathmovers_name',
 'schemes_json',
 'schemes_uuid',
 'schemes_name',
 'steps_uuid',
 'steps_change',
 'steps_active',
 'steps_previous',
 'steps_simulation',
 'steps_mccycle',
 'details_json',
 'details_uuid',
 'samples_uuid',
 'samples_trajectory',
 'samples_ensemble',
 'samples_repl

In [75]:
storage.snapshots.cache._cache.keys()

[6, 8, 10, ..., 350, 352, 354,
(every even key from 6 through 354; the output is cut off here)

In [72]:
storage.snapshots[10000]

In [70]:
storage.snapshots.vars['store'

IndexError: 

In [36]:
len(storage.snapshots)

2056

In [41]:
f = next(iter(sset))




In [49]:
f.__dict__

{'__uuid__': UUID('0812e2b8-4752-11e6-bcc8-000000074a3a'),
 '_lazy': {<openpathsampling.netcdfplus.proxy.DelayedLoader at 0x122d3a2d0>: None,
  <openpathsampling.netcdfplus.proxy.DelayedLoader at 0x122d3a310>: None,
  <openpathsampling.netcdfplus.proxy.DelayedLoader at 0x122d3a350>: None},
 'bias': 1.0,
 'ensemble': <openpathsampling.ensemble.TISEnsemble at 0x123a34750>,
 'replica': 1,
 'trajectory': Trajectory[29]}

In [45]:
f.trajectory[0].engine

<openpathsampling.engines.openmm.engine.OpenMMEngine at 0x128ebe690>

In [56]:
storage.cvs.save(old_store.cvs['opA'])

AttributeError: 'NoneType' object has no attribute 'engine'

In [51]:
storage.ensembles.save(f.ensemble)

AttributeError: 'NoneType' object has no attribute 'engine'

In [50]:
for samp in sset:
#    print samp, storage.samples.save(samp)
    if samp.ensemble:
        storage.ensembles.save(samp.ensemble)
#    storage.trajectories.save(samp.trajectory)
#    for s in samp.trajectory:
#        print samp, s, storage.save(s)

AttributeError: 'NoneType' object has no attribute 'engine'

In [38]:
storage.save(sset)
storage.save(mstis)
storage.save(engine)

AttributeError: 'NoneType' object has no attribute 'engine'

In [24]:
for tag in ['states', 'state_centers', 'state_letter', 'interface_levels']:
    storage.tag[tag] = old_store.tag[tag]

In [25]:
storage.tag['engine'] = engine
storage.tag['initial_sampleset'] = sset
storage.tag['mstis'] = mstis

In [26]:
import os

In [27]:
print 'filesize : %d MB' % (os.stat('weina_pre.nc').st_size / 1024 / 1024)

filesize : 27 MB
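The file size report above relies on integer division, which in Python 3 would produce a float and make the `%d` format truncate it. A version-independent sketch is shown below. It uses `os.path.getsize` with explicit float division and a temporary file of known size, so it does not touch `weina_pre.nc`. The helper name `size_in_mb` is introduced here for illustration.

```python
import os
import tempfile


def size_in_mb(path):
    """Return the size of a file in megabytes (1 MB = 1024 * 1024 bytes)."""
    return os.path.getsize(path) / (1024.0 * 1024.0)


# Write a 3 MB scratch file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"\0" * (3 * 1024 * 1024))
    tmp_path = handle.name

size = size_in_mb(tmp_path)
print("filesize : %.1f MB" % size)

os.remove(tmp_path)
```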


In [28]:
storage.close()