In [1]:
import openpathsampling as paths

# used to make the pictures below
class Plotter(object):
    pass

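# used to make it easy to generate 1D trajectories from a list of positions
def trajectory_1D(positions):
    """Return a Trajectory with one toy snapshot (zero velocity) per position."""
    snapshots = [
        paths.engines.toy.Snapshot(coordinates=[[x]], velocities=[[0.0]])
        for x in positions
    ]
    return paths.Trajectory(snapshots)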

# Understanding sequential ensembles

The `SequentialEnsemble` object is one of the more powerful, but also more difficult, tools in the OpenPathSampling toolkit.

At first, it looks deceptively simple: it is just a list of path ensembles which must be applied in order. However, in practice there are several subtle points to pay attention to.

To understand all of this, we'll consider one-dimensional trajectories, plotted with time along the horizontal axis and the value along the vertical axis.

## Simple example

First let's consider an easy example. We'll define a single state. The trajectory we're interested in will begin in the state, then exit the state, then again return to the state. So we have the sequence `[AllInXEnsemble(state), AllOutXEnsemble(state), AllInXEnsemble(state)]`.

In [2]:
xval = paths.FunctionCV(name="x", f=lambda snap : snap.coordinates[0][0])
state = paths.CVDefinedVolume(xval, lambda_min=float("-inf"), lambda_max=0.0)

# define "in" and "out"
x_in = -1.0
x_out = 1.0

In [3]:
traj1 = trajectory_1D([x_in, x_out, x_in])
traj2 = trajectory_1D([x_in, x_in, x_out, x_out, x_in, x_in])
traj3 = trajectory_1D([x_in, x_out, x_in, x_in])
all_in_traj = trajectory_1D([x_in, x_in, x_in])

In [4]:
seq_ens_01 = paths.SequentialEnsemble([
    paths.AllInXEnsemble(state),
    paths.AllOutXEnsemble(state),
    paths.AllInXEnsemble(state)
])

In [5]:
seq_ens_01(all_in_traj)

False

In [6]:
seq_ens_01(traj1)

True

In [7]:
seq_ens_01(traj2)

True
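
As a library-independent cross-check, the same three results can be reproduced with a toy model. This is only a sketch under stated assumptions, not how OpenPathSampling evaluates ensembles: each frame is labelled `I` (inside the state, taken here as `x < 0`) or `O` (outside), and the sequence `[AllInXEnsemble, AllOutXEnsemble, AllInXEnsemble]` is modelled as the regular expression `I+O+I+`. The names `labels` and `satisfies` are hypothetical helpers introduced for this illustration.

```python
import re


def labels(values, boundary=0.0):
    """Label each 1D frame 'I' (x < boundary, inside the state) or 'O' (outside)."""
    return "".join("I" if x < boundary else "O" for x in values)


# Toy stand-in for [AllInX, AllOutX, AllInX]:
# one or more frames inside, then one or more outside, then one or more inside.
IN_OUT_IN = re.compile(r"I+O+I+")


def satisfies(values):
    """True if the whole label string matches the in/out/in pattern."""
    return IN_OUT_IN.fullmatch(labels(values)) is not None


x_in, x_out = -1.0, 1.0
print(satisfies([x_in, x_in, x_in]))                      # all_in_traj
print(satisfies([x_in, x_out, x_in]))                     # traj1
print(satisfies([x_in, x_in, x_out, x_out, x_in, x_in]))  # traj2
```

The regular expression works as an analogy here because the adjacent ensembles are disjoint: a frame is either inside or outside the state, so there is only one way to split the trajectory into segments.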

## Slightly less simple: cap ends with `LengthEnsemble(1)`

In [8]:
seq_ens_02 = paths.SequentialEnsemble([
    paths.AllInXEnsemble(state) & paths.LengthEnsemble(1),
    paths.AllOutXEnsemble(state),
    paths.AllInXEnsemble(state) & paths.LengthEnsemble(1)
])

In [9]:
seq_ens_02(all_in_traj)

False

In [10]:
seq_ens_02(traj1)

True

In [11]:
seq_ens_02(traj2)

False
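
To see what the length-1 caps change, here is a toy model of `seq_ens_02`. It is a sketch under assumptions, not the OpenPathSampling implementation: frames are labelled `I` (inside, taken as `x < 0`) or `O` (outside), and the capped sequence is modelled as the regular expression `IO+I`, that is, exactly one frame inside, one or more outside, exactly one inside. The helper names are hypothetical.

```python
import re


def labels(values, boundary=0.0):
    """Label each 1D frame 'I' (x < boundary, inside the state) or 'O' (outside)."""
    return "".join("I" if x < boundary else "O" for x in values)


# Toy stand-in for
# [AllInX & Length(1), AllOutX, AllInX & Length(1)]:
# exactly one frame inside at each end, one or more frames outside in between.
CAPPED = re.compile(r"IO+I")


def satisfies_capped(values):
    """True if the whole label string matches the capped in/out/in pattern."""
    return CAPPED.fullmatch(labels(values)) is not None


x_in, x_out = -1.0, 1.0
for name, values in [
    ("all_in_traj", [x_in, x_in, x_in]),
    ("traj1", [x_in, x_out, x_in]),
    ("traj2", [x_in, x_in, x_out, x_out, x_in, x_in]),
    ("traj3", [x_in, x_out, x_in, x_in]),
]:
    print(name, satisfies_capped(values))
```

In this toy model only `traj1` passes: `traj2` and `traj3` each have more than one consecutive frame inside the state at one end, which the length-1 caps forbid.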

### Exercises

1. How would you implement a sequential ensemble which only has the length-1 cap at the beginning of the trajectory? Which of the given trajectories would satisfy that ensemble? 

2. How could you extend trajectories that do satisfy that ensemble so that they do not satisfy it?

3. Implement that ensemble. Test your predictions from question 1. Create trajectories and test your predictions from question 2.

## Can-append in the sequential ensemble

### Exercises

## Optional parts of the sequential ensemble

Up to here, this should be pretty straightforward: everything works exactly the way you'd expect it to work. But now we start to make things a little more complicated. In order to carefully define a generic version of a sequential ensemble, there are many cases where you might want to consider some sort of "optional" step in the sequence.

???


As a technical side-note, this is implemented by forming the union of the ensemble with a zero-length trajectory. This means that the zero-length trajectory must be excluded from any other ensemble used to build a sequential ensemble.
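
The union-with-zero-length idea can be illustrated with a regular-expression analogy. This is a sketch, not the `OptionalEnsemble` implementation: frames are written directly as `I` (inside) and `O` (outside) labels, and the hypothetical helper `optional` wraps a pattern so that it also accepts the empty match.

```python
import re


def optional(pattern):
    """Union of `pattern` with the zero-length match (regex analogue of an optional step)."""
    return "(?:" + pattern + ")?"


# The excursion outside the state is required in one version and optional in the other.
required = re.compile("I+" + "O+" + "I+")
with_optional = re.compile("I+" + optional("O+") + "I+")

for frames in ["IOI", "II", "I"]:
    print(
        frames,
        required.fullmatch(frames) is not None,
        with_optional.fullmatch(frames) is not None,
    )

# A step only becomes optional when the empty match is added explicitly:
print(re.fullmatch("O+", "") is not None)            # plain step rejects zero length
print(re.fullmatch(optional("O+"), "") is not None)  # optional step accepts it
```

In this analogy, `O+` plays the role of an ensemble that excludes the zero-length trajectory. If a step already accepted zero frames (like `O*`), it would be silently optional, which is why the zero-length trajectory has to be kept out of the ordinary building blocks.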

## Using the `OptionalEnsemble` to generalize to the case of interstitials

One example where we have frequently used the `OptionalEnsemble` class is when we have interstitial space between the edge of the state and the innermost interface. In simple cases, the innermost interface of TIS is usually set to be exactly the boundary of the state. But this isn't a requirement of the method.

### Exercises

1. Consider an ensemble of trajectories which start ???. If the innermost interface is equivalent to the state border, then it can be defined as ???. How would you adjust this ensemble to account for the possibility of interstitial space between the state border and the innermost interface?

2. Which of the following trajectories should satisfy the ensemble with interstitials? Implement your proposal from question 1 and test it.
   1. traj1
   2. traj2

## Sequential ensembles with non-disjoint adjacent ensembles

Up to this point, we've used sequential ensembles where the successive ensembles can't have overlapping frames. That is, you can't have a frame which is simultaneously inside and outside of the same state. You *can* describe sequential ensembles where such overlaps are allowed, but they become much more complicated and more subtle. In particular, you must pay special attention to whether you can sample the same ensemble using either the `can_append` or `can_prepend` functions.

In this section, we've touched on the concepts of "consistent ensembles" and "efficient ensembles" from path ensemble theory. If you're developing very complicated ensembles, we highly recommend becoming familiar with the mathematical structure described in path ensemble theory, because it will allow you to develop new path ensembles safely.

## Sequential ensembles check status in the forward direction

A trajectory can match a sequential ensemble when checked with forward propagation but not when checked with backward propagation (or vice versa), so it is important to know that, in practice, the code checks using forward propagation. We emphasize, however, that an ensemble which doesn't give the same result in both directions is probably not suitable for most practical purposes.
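
The direction dependence can be illustrated with a toy greedy matcher. This is a sketch under assumptions and is not the OpenPathSampling algorithm: frames are single-character labels, each stage is a hypothetical `(allowed_labels, max_length)` pair, and each stage greedily consumes the longest run it can, without backtracking. The functions `greedy_forward` and `greedy_backward` are names introduced only for this example.

```python
def greedy_forward(stages, frames):
    """Consume frames left to right; each stage takes the longest run it allows."""
    i = 0
    for allowed, max_len in stages:
        start = i
        while (
            i < len(frames)
            and frames[i] in allowed
            and (max_len is None or i - start < max_len)
        ):
            i += 1
        if i == start:  # every stage must match at least one frame
            return False
    return i == len(frames)  # all frames must be consumed


def greedy_backward(stages, frames):
    """Same matcher, run from the last frame toward the first."""
    return greedy_forward(stages[::-1], frames[::-1])


# Stage 1: any number of frames labelled 'a' or 'b'.
# Stage 2: exactly one frame labelled 'b'.
# The two stages overlap: a 'b' frame could belong to either.
overlapping = [({"a", "b"}, None), ({"b"}, 1)]
print(greedy_forward(overlapping, "ab"))   # stage 1 swallows 'b', stage 2 gets nothing
print(greedy_backward(overlapping, "ab"))  # stage 2 takes 'b' first, stage 1 takes 'a'

# With disjoint stages, both directions agree.
disjoint = [({"a"}, None), ({"b"}, 1)]
print(greedy_forward(disjoint, "ab"), greedy_backward(disjoint, "ab"))
```

With the overlapping stages, the forward check fails and the backward check succeeds for the same frames, which is the kind of inconsistency to avoid. Making the adjacent stages disjoint removes the ambiguity.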

Here is a small function `backward_check` which takes an ensemble and a trajectory and checks whether the trajectory satisfies the ensemble when checked with backward propagation. You will use it in the exercises below.

### Exercises

1. Given the ensemble ???, which of the following trajectories satisfy the ensemble based on forward propagation? Which satisfy it based on backward propagation? Test your predictions.
   1. traj1
   2. traj2

2. Redefine the ensemble so that all the trajectories in question 1 work for either forward or backward propagation.