## Advanced Universe creation

This notebook contains examples of more complicated `Universe` construction.

In [1]:
import MDAnalysis as mda
import MDAnalysisData as data

### transfer_to_memory

By default, MDAnalysis holds only a single frame of trajectory data in memory at any time, because loading an entire trajectory at once can require a large amount of memory.

By using the `in_memory` keyword at `Universe` creation (or by calling the `Universe.transfer_to_memory()` method afterwards),
the entire trajectory can be read into memory.
This requires significantly more memory on the workstation,
typically an amount similar to the file size of the trajectory.
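
As a rough guide to the memory cost, the sketch below estimates the size of the coordinate array for the trajectory used in this notebook (4187 frames of 3341 atoms). It assumes positions are stored as three 32-bit floats per atom, which is an assumption about the storage layout; velocities, forces and other per-frame data would add to this. The helper name `estimate_trajectory_bytes` is hypothetical.

```python
def estimate_trajectory_bytes(n_frames, n_atoms, bytes_per_value=4):
    """Estimate the size of an in-memory coordinate array.

    Assumes 3 coordinates (x, y, z) per atom and, by default, 4 bytes per
    value (32-bit floats); both are assumptions made for this sketch.
    """
    return n_frames * n_atoms * 3 * bytes_per_value


n_bytes = estimate_trajectory_bytes(n_frames=4187, n_atoms=3341)
print(f"{n_bytes / 1024**2:.0f} MiB")  # roughly 160 MiB
```

For trajectories stored in a compressed format, the in-memory size may be noticeably larger than the file on disk.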

In [2]:
adk = data.datasets.fetch_adk_equilibrium()

In [3]:
regular_u = mda.Universe(adk['topology'], adk['trajectory'])

%timeit [ts.frame for ts in regular_u.trajectory]

600 ms ± 92.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Iterating through a trajectory can be much faster without having to read from the trajectory file for each frame.

In [4]:
memory_u = mda.Universe(adk['topology'], adk['trajectory'], in_memory=True)

%timeit [ts.frame for ts in memory_u.trajectory]

24.1 ms ± 672 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


Transferring a trajectory to memory converts the `Universe.trajectory` object to a `MemoryReader`.
One notable difference with this `Reader` is that any changes made to atom positions are permanent!
This can be useful when you want to apply a coordinate transformation (e.g. aligning the structure) and then analyse the result afterwards.
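
The difference in behaviour can be illustrated with a toy model. This is not MDAnalysis code: the two classes below are hypothetical stand-ins, one that rebuilds each frame on every access (as a file-backed reader effectively does) and one that hands out the stored frame itself.

```python
class FileBackedReader:
    """Toy reader: every access re-creates the frame, like re-reading a file."""

    def __init__(self, frames):
        self._source = [list(frame) for frame in frames]

    def __getitem__(self, index):
        return list(self._source[index])  # fresh copy on each access


class InMemoryReader:
    """Toy reader: every access returns the stored frame itself."""

    def __init__(self, frames):
        self._frames = [list(frame) for frame in frames]

    def __getitem__(self, index):
        return self._frames[index]


frames = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
file_reader = FileBackedReader(frames)
memory_reader = InMemoryReader(frames)

file_reader[0][0] = 99.0    # edits a temporary copy
memory_reader[0][0] = 99.0  # edits the stored frame

print(file_reader[0][0])    # 1.0: the edit was discarded
print(memory_reader[0][0])  # 99.0: the edit persisted
```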

In [6]:
print(memory_u.trajectory)

<MemoryReader with 4187 frames of 3341 atoms>


### guess_bonds

By default, bond information is only present in a `Universe` if the topology file provides it.
This means that attributes which rely on bonds, such as `.fragments`, will not work:

In [7]:
nhaa = data.datasets.fetch_nhaa_equilibrium()

nhaa_u = mda.Universe(nhaa['topology'])

nhaa_u.atoms.fragments

NoDataError: AtomGroup has no fragments; this requires Bonds
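
Fragments are groups of atoms connected to one another through bonds, which is why they cannot be determined without bond information. As a minimal sketch of the idea (not the MDAnalysis implementation), fragments can be found as the connected components of the bond graph. The function name `find_fragments` and the toy bond list are illustrative assumptions.

```python
from collections import defaultdict


def find_fragments(n_atoms, bonds):
    """Group atom indices into bonded fragments (connected components)."""
    neighbours = defaultdict(set)
    for i, j in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)

    seen = set()
    fragments = []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first walk over every atom reachable from `start`.
        stack, members = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            members.append(atom)
            for nxt in neighbours[atom]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        fragments.append(sorted(members))
    return fragments


# Toy system: atoms 0-1-2 form a chain, 3-4 are a pair, 5 is unbonded.
print(find_fragments(6, [(0, 1), (1, 2), (3, 4)]))
# [[0, 1, 2], [3, 4], [5]]
```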

It is possible to guess bonds based on the separations between atoms.
Bonds are guessed by comparing the distance between two atoms ($d_{ij}$) to the sum of their van der Waals (vdW) radii ($r_i + r_j$) multiplied by a fudge factor ($f = 0.72$ by default).

$$ d_{ij} \le f \cdot (r_i + r_j) $$
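
The criterion above can be sketched in plain Python. This is a brute-force illustration under simplifying assumptions (no periodic boundaries, all pairs checked), not the routine MDAnalysis uses; the function name `guess_bonds_sketch` and the radii are made up for the example.

```python
from itertools import combinations
from math import dist


def guess_bonds_sketch(positions, radii, fudge_factor=0.72):
    """Return index pairs (i, j) with d_ij <= fudge_factor * (r_i + r_j).

    A brute-force O(N^2) illustration that ignores periodic boundaries.
    """
    bonds = []
    for i, j in combinations(range(len(positions)), 2):
        cutoff = fudge_factor * (radii[i] + radii[j])
        if dist(positions[i], positions[j]) <= cutoff:
            bonds.append((i, j))
    return bonds


# Illustrative values only: three atoms on a line.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
radii = [1.5, 1.5, 1.5]  # made-up radii, not MDAnalysis defaults
print(guess_bonds_sketch(positions, radii))  # [(0, 1)]
```

Here the cutoff for every pair is $0.72 \cdot 3.0 = 2.16$, so only the first two atoms (separated by 1.0) are considered bonded.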

Some vdW radii are built in to `MDAnalysis`; any missing radii can be supplied via the `vdwradii` keyword:

In [8]:
nhaa_u = mda.Universe(nhaa['topology'], guess_bonds=True, vdwradii={'CL': 2.0, 'NA': 2.0})

In [9]:
nhaa_u.atoms.fragments

(<AtomGroup with 5812 atoms>,
 <AtomGroup with 5812 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup with 127 atoms>,
 <AtomGroup 

### ChainReader

MD trajectories are often created as a series of discrete simulations.
By supplying a list of trajectory filenames to `Universe` creation,
these will be read in sequence by the `ChainReader` class.
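
Conceptually, reading files in sequence means translating a global frame number into a particular file and a frame within that file. The sketch below shows one way to do this with cumulative frame counts; it is an illustration of the idea rather than the `ChainReader` implementation, and `locate_frame` and the file lengths are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_frame(frame, frames_per_file):
    """Map a global frame index to (file index, frame index within file)."""
    total = sum(frames_per_file)
    if not 0 <= frame < total:
        raise IndexError(f"frame {frame} out of range for {total} frames")
    # ends[k] is the number of frames contained in files 0..k.
    ends = list(accumulate(frames_per_file))
    file_index = bisect_right(ends, frame)
    start = ends[file_index - 1] if file_index else 0
    return file_index, frame - start


# Hypothetical lengths for three consecutive trajectory files.
lengths = [100, 98, 102]
print(locate_frame(0, lengths))    # (0, 0)
print(locate_frame(100, lengths))  # (1, 0)
print(locate_frame(299, lengths))  # (2, 101)
```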

In [10]:
adk_dims = data.datasets.fetch_adk_transitions_DIMS()

print(adk_dims['trajectories'][:5])

['/Users/richardgowers/MDAnalysis_data/adk_transitions_DIMS/DIMS/trajectories/dims0138_fit-core.dcd', '/Users/richardgowers/MDAnalysis_data/adk_transitions_DIMS/DIMS/trajectories/dims0192_fit-core.dcd', '/Users/richardgowers/MDAnalysis_data/adk_transitions_DIMS/DIMS/trajectories/dims0048_fit-core.dcd', '/Users/richardgowers/MDAnalysis_data/adk_transitions_DIMS/DIMS/trajectories/dims0195_fit-core.dcd', '/Users/richardgowers/MDAnalysis_data/adk_transitions_DIMS/DIMS/trajectories/dims0180_fit-core.dcd']


In [11]:
chain_u = mda.Universe(adk_dims['topology'], adk_dims['trajectories'])

In [12]:
print(chain_u.trajectory)

<ChainReader containing dims0138_fit-core.dcd and 199 more with 19691 frames of 3341 atoms>


### Universe.empty

Universes can be created even without a file, which is useful for building a system from scratch.
This is done via the `Universe.empty()` construction method.
For more details, see the other notebook in the Advanced tutorials section.

In [13]:
blank_u = mda.Universe.empty(20)

print(blank_u)

<Universe with 20 atoms>


### fetch_mmtf

You can load structures from the Protein Data Bank using the `fetch_mmtf` method.
This will download the `mmtf` data from the PDB, and create a Universe from this:

In [39]:
u = mda.fetch_mmtf('5YVL')

print(u)

<Universe with 6799 atoms>




### Creating new systems with MDAnalysis

Whilst `MDAnalysis` is designed for reading pre-existing simulation files, there are also some features which allow the construction of new systems.

### Universe.empty and adding new attributes

The `Universe` object can also be constructed from the `Universe.empty` method, which is similar to `np.zeros`.

In [1]:
import MDAnalysis as mda

mda.Universe.empty?

Here we create a 21-atom Universe with 7 residues and a trajectory attached.  The positions of all atoms will initially be zero.

In [2]:
u = mda.Universe.empty(n_atoms=21, n_residues=7,
                       trajectory=True)



In [3]:
print(u.atoms)
print(u.residues)

<AtomGroup [<Atom 1:>, <Atom 2:>, <Atom 3:>, ..., <Atom 19:>, <Atom 20:>, <Atom 21:>]>
<ResidueGroup [<Residue>, <Residue>, <Residue>, <Residue>, <Residue>, <Residue>, <Residue>]>


In [4]:
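for i, res in enumerate(u.residues):
    # Move three consecutive atoms into each residue; on an AtomGroup the
    # settable attribute is `residues` (plural), `residue` belongs to a single Atom
    u.atoms[i * 3: (i + 1) * 3].residues = res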
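
The grouping used in the loop above (three consecutive atoms per residue) can also be written as a per-atom list of residue indices. The sketch below builds that list in plain Python; a list of this form can be passed to `Universe.empty` through its `atom_resindex` argument, which assigns atoms to residues at creation time. The variable names are illustrative.

```python
n_atoms, n_residues = 21, 7
atoms_per_residue = n_atoms // n_residues

# atom_resindex[i] is the index of the residue that atom i belongs to.
atom_resindex = [i // atoms_per_residue for i in range(n_atoms)]
print(atom_resindex[:7])  # [0, 0, 0, 1, 1, 1, 2]

# The same grouping expressed as the (start, stop) slices used in the loop.
slices = [(r * atoms_per_residue, (r + 1) * atoms_per_residue)
          for r in range(n_residues)]
print(slices[:3])  # [(0, 3), (3, 6), (6, 9)]
```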

We can then add various topology attributes to these atoms:

In [5]:
u.add_TopologyAttr('masses', values=[10.0] * 21)
u.add_TopologyAttr('names', values=['A'] * 21)
u.add_TopologyAttr('types', values=['Ca'] * 21)
u.add_TopologyAttr('resids', values=range(7))


Finally, we can write this `Universe` out to a file:

In [6]:
u.atoms.write('new.gro')
