%intersphinx https://python-ihm.readthedocs.io/en/latest
# Deposition of integrative models {#mainpage}
[TOC]

In this tutorial we will introduce the procedure used to deposit integrative modeling studies in the [PDB-Dev](https://pdb-dev.wwpdb.org/) database in mmCIF format.

We will demonstrate the procedure using [IMP](https://integrativemodeling.org/) and its PMI module, but the database will accept integrative models from any software, as long as they are compliant mmCIF files (e.g. there are several HADDOCK and Rosetta models already in PDB-Dev). 

## Why PDB-Dev?

PDB-Dev is a database run by the wwPDB. It is specifically for the deposition of *integrative* models, i.e. models generated using more than one source of experimental data. (Models that use only one experiment generally go in PDB; theoretical models, which use no experimental information, go in [ModelArchive](https://www.modelarchive.org/).)

## Why mmCIF?

wwPDB already uses the mmCIF file format for X-ray structures, and has a formal data model ([PDBx](http://mmcif.wwpdb.org/)) to describe these structures. The format is *extensible*; extension dictionaries exist to describe NMR, SAS, EM data, etc. For integrative models, we use PDBx plus an "integrative/hybrid methods" (IHM) [extension dictionary](http://mmcif.wwpdb.org/dictionaries/mmcif_ihm.dic/Index/). This supports coarse-grained structures, multiple input experimental data sources, multiple states, multiple scales, and ensembles related by time or other order.

## Why can't we convert PDB/RMF directly to mmCIF?

This generally isn't possible because PDB, RMF and ``IMP::Model`` are designed to store one or more output models, where each model is a set of coordinates for a single conformation of the system being studied. A deposition, on the other hand, aims to cover a complete **modeling study**, and should capture not just the entire ensemble of output models, but also all of the input data needed to reproduce the modeling, and quality metrics such as the precision of the ensemble and the degree to which it fits the input data. A deposition is also designed to be visualized, and so may contain additional data not used in the modeling itself, such as preset colors or views to match figures in the publication or highlight regions of interest, and more human-descriptive names for parts of the system. Thus, deposition is largely a data-gathering exercise, and benefits from a modeling study being tidy and well organized (for example, by storing it in a [GitHub](https://github.com/) repository) so that data is easy to find and track to its source.

## Generation of mmCIF files

mmCIF is a text format with a well-defined syntax, so in principle files could be generated by hand or with simple scripts. However, it is generally easier to use the existing [python-ihm library](https://github.com/ihmwg/python-ihm). This stores the same data as in an mmCIF file, but represents it as a set of Python classes, so it is easier to manipulate.
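To make the "by hand" option concrete, here is a minimal sketch that writes a tiny mmCIF data block using only the Python standard library. The helper names (`quote`, `write_items`, `write_loop`) are hypothetical and written for this illustration; the quoting rules are deliberately simplified (no multi-line text fields, no handling of reserved words), and the result is not a validated PDBx/IHM file. The entity descriptions are just example strings.

```python
import io


def quote(value):
    """Format one value as a single-line mmCIF token (simplified rules)."""
    text = str(value)
    if text == "":
        return "."  # '.' marks an inapplicable value in mmCIF
    if "\n" in text or ("'" in text and '"' in text):
        # Such values need a semicolon-delimited text field, not handled here
        raise ValueError("value needs a multi-line text field: %r" % text)
    needs_quotes = (any(ch.isspace() for ch in text)
                    or text[0] in "_#$'\";[]")
    if not needs_quotes:
        return text
    delimiter = '"' if "'" in text else "'"
    return delimiter + text + delimiter


def write_items(fh, category, items):
    """Write a category that has a single row as key-value pairs."""
    for key, value in items.items():
        fh.write("_%s.%s %s\n" % (category, key, quote(value)))
    fh.write("#\n")


def write_loop(fh, category, keys, rows):
    """Write a category that has several rows as a loop_ table."""
    fh.write("loop_\n")
    for key in keys:
        fh.write("_%s.%s\n" % (category, key))
    for row in rows:
        fh.write(" ".join(quote(v) for v in row) + "\n")
    fh.write("#\n")


fh = io.StringIO()
fh.write("data_example\n")
write_items(fh, "entry", {"id": "example"})
write_items(fh, "struct", {"title": "Modeling of RNA Pol III"})
write_loop(fh, "entity", ["id", "type", "pdbx_description"],
           [(1, "polymer", "ABC23"), (2, "polymer", "C37")])
text = fh.getvalue()
print(text)
```

Even this small case has to deal with quoting and table layout, and a real deposition involves many interlinked categories, which is why a library that tracks these details is usually the easier route.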

For deposition, we could use the python-ihm library directly, by writing a Python script that reads in output models and input data, adds annotations, and writes out an mmCIF file. However, since in this case we used ``IMP.pmi`` to do the modeling, we can make use of a class in PMI called [ProtocolOutput](@ref IMP.pmi.mmcif.ProtocolOutput) that automatically captures an entire ``IMP.pmi`` modeling protocol.

## Basic usage of ProtocolOutput

``~IMP.pmi.mmcif.ProtocolOutput`` is designed to be attached to a top-level PMI object (usually ``IMP.pmi.topology.System``). Then, as the script is run, it will capture all of the information IMP knows about the modeling study, in an ``ihm.System`` object. Additional information not in the modeling script itself, such as the resulting publication, can then be added using the [python-ihm API](https://python-ihm.readthedocs.io/en/latest/usage.html).

We now proceed by modifying the script from the previous tutorial to attach a ProtocolOutput object and capture modeling protocol information as mmCIF.

The first modification is to import the PMI mmCIF and python-ihm Python modules: 

In [2]:
from __future__ import print_function

# Imports needed to use ProtocolOutput
import IMP.pmi.mmcif
import ihm

The script then proceeds as before until we have set up our top-level ``IMP.pmi.topology.System`` object:

In [3]:
import IMP
import IMP.atom
import IMP.core
import IMP.pmi.io.crosslink
import IMP.pmi.restraints.crosslinking
import IMP.pmi.restraints.stereochemistry
import IMP.pmi.restraints.em
import IMP.pmi.representation
import IMP.pmi.tools

import IMP.pmi.macros
import IMP.pmi.topology

import os
import sys

import warnings
warnings.filterwarnings('ignore')

datadirectory = "../rnapoliii/data/"
topology_file = datadirectory+"topology_poliii.txt" 
target_gmm_file = datadirectory+'emd_1883.map.mrc.gmm.50.txt' # The EM map data
output_directory = "./output"

# Initialize IMP model
m = IMP.Model()

# Read in the topology file.  
# Specify the directory where the PDB files, FASTA files and GMM files are
topology = IMP.pmi.topology.TopologyReader(topology_file, 
                                  pdb_dir=datadirectory, 
                                  fasta_dir=datadirectory, 
                                  gmm_dir=datadirectory)

# Use the BuildSystem macro to build states from the topology file
bs = IMP.pmi.macros.BuildSystem(m)

Now we can attach a ProtocolOutput object (BuildSystem contains a `system` member):

In [7]:
# Record the modeling protocol to an mmCIF file
po = IMP.pmi.mmcif.ProtocolOutput(open('rnapoliii.cif', 'w'))
bs.system.add_protocol_output(po)
po.system.title = "Modeling of RNA Pol III"
# Add publication
po.system.citations.append(ihm.Citation.from_pubmed_id(25161197))

Note that the `ProtocolOutput` object `po` simply wraps an `ihm.System` object as `po.system`. We can then customize the `ihm.System` by setting a human-readable title and adding a citation (here we use ``ihm.Citation.from_pubmed_id``, which looks up a citation by PubMed ID - this particular PubMed ID is actually for the previously-published [modeling of the Nup84 complex](https://salilab.org/nup84/)).

Now the original script proceeds as before, setting up the representation and restraints:

In [5]:
# Each state can be specified by a topology file.
bs.add_state(topology)
bs.add_state(topology)

root_hier, dof = bs.execute_macro(max_rb_trans=4.0, 
                                  max_rb_rot=0.3, 
                                  max_bead_trans=4.0, 
                                  max_srb_trans=4.0,
                                  max_srb_rot=0.3)

# Fix the rigid bodies of the molecules listed below (here only ABC23);
# all other rigid bodies remain movable.
# First select and gather all particles to fix.
fixed_particles=[]
for prot in ["ABC23"]:
    fixed_particles+=IMP.atom.Selection(root_hier,molecule=prot).get_selected_particles()
    

# Fix the Corresponding Rigid movers and Super Rigid Body movers using dof
# The flexible beads will still be flexible (fixed_beads is an empty list)!
fixed_beads,fixed_rbs=dof.disable_movers(fixed_particles,
                                         [IMP.core.RigidBodyMover,IMP.pmi.TransformMover])

# Shuffle the configuration of all rigid bodies and beads, except for the rigid bodies fixed above
IMP.pmi.tools.shuffle_configuration(root_hier,
                                    excluded_rigid_bodies=fixed_rbs,
                                    max_translation=50, 
                                    verbose=False,
                                    cutoff=5.0,
                                    niterations=100)

outputobjects = [] # reporter objects...output is included in the stat file

# Connectivity keeps things connected along the backbone (ignores if inside same rigid body)
mols = IMP.pmi.tools.get_molecules(root_hier)
for mol in mols:
    molname=mol.get_name()        
    IMP.pmi.tools.display_bonds(mol)
    cr = IMP.pmi.restraints.stereochemistry.ConnectivityRestraint(mol,scale=2.0)
    cr.add_to_model()
    cr.set_label(molname)
    outputobjects.append(cr)

ev = IMP.pmi.restraints.stereochemistry.ExcludedVolumeSphere(
                                         included_objects=root_hier,
                                         resolution=10)
ev.add_to_model()         # add to scoring function
outputobjects.append(ev)  # add to output

# We then initialize a CrossLinkDataBase that uses a keywords converter to map column names to cross-link fields.
# The required fields are the protein and residue number for each side of the crosslink.
xldbkwc = IMP.pmi.io.crosslink.CrossLinkDataBaseKeywordsConverter()
xldbkwc.set_protein1_key("Protein1")
xldbkwc.set_protein2_key("Protein2")
xldbkwc.set_residue1_key("AbsPos1")
xldbkwc.set_residue2_key("AbsPos2")
xldbkwc.set_id_score_key("ld-Score")

xl1 = IMP.pmi.io.crosslink.CrossLinkDataBase(xldbkwc)
xl1.create_set_from_file(datadirectory+'FerberKosinski2016_apo.csv')
xl1.set_name("APO")

xl2 = IMP.pmi.io.crosslink.CrossLinkDataBase(xldbkwc)
xl2.create_set_from_file(datadirectory+'FerberKosinski2016_DNA.csv')
xl2.set_name("DNA")

# Append the xl2 dataset to the xl1 dataset to create a larger dataset
xl1.append_database(xl2)

# Rename one protein name
xl1.rename_proteins({"ABC14.5":"ABC14_5"})

# Create 3 confidence classes
# xl1.classify_crosslinks_by_score(3)

# Now, we set up the restraint.
xl1rest = IMP.pmi.restraints.crosslinking.CrossLinkingMassSpectrometryRestraint(
                                   root_hier=root_hier,  # The root hierarchy
                                   CrossLinkDataBase=xl1,# The XLDB defined above
                                   length=21.0,          # Length of the linker in angstroms
                                   slope=0.002,          # A linear term that biases XLed
                                                         # residues together
                                   resolution=1.0,       # Resolution at which to apply the restraint. 
                                                         # Either 1 (residue) or 0 (atomic)
                                   label="XL",           # Used to label output in the stat file
                                   weight=10.)           # Weight applied to all crosslinks 
                                                         # in this dataset
xl1rest.add_to_model()
outputobjects.append(xl1rest)


# First, get the model density objects that will be fitted to the EM density.
em_components = IMP.pmi.tools.get_densities(root_hier)

gemt = IMP.pmi.restraints.em.GaussianEMRestraint(em_components,
                                      target_gmm_file,  # EM map GMM file
                                      scale_target_to_mass=True,  # True if the mass of the map and model are identical.
                                      slope=0.0000001,  # A small funneling force pulling towards the center of the EM density.
                                      weight=80.0)           
gemt.add_to_model()
outputobjects.append(gemt)

Running these cells prints a long log as the system and restraints are set up. Truncated excerpts of that output follow:

BuildSystem.add_state: setting up molecule ABC23 copy number 0
BuildSystem.add_state: molecule ABC23 sequence has 155 residues
BuildSystem.add_state: ---- setting up domain 0 of molecule ABC23
BuildSystem.add_state: -------- domain 0 of molecule ABC23 extends from residue 1 to residue 155 
BuildSystem.add_state: -------- domain 0 of molecule ABC23 represented by pdb file ../rnapoliii/data/Pol3_core_Model_on4c3i.pdb 
BuildSystem.add_state: -------- domain 0 of molecule ABC23 represented by gaussians 
BuildSystem.add_state: setting up molecule ABC10beta copy number 0
BuildSystem.add_state: molecule ABC10beta sequence has 70 residues
BuildSystem.add_state: ---- setting up domain 0 of molecule ABC10beta
BuildSystem.add_state: -------- domain 0 of molecule ABC10beta extends from residue 1 to residue 70 
BuildSystem.add_state: -------- domain 0 of molecule ABC10beta represented by pdb file ../rnapoliii/data/Pol3_core_Model_on4c3i.pdb 
BuildSystem.add_state: -------- domain 0 of molecule ABC1

BuildSystem.add_state: -------- domain 0 of molecule C53 represented by gaussians 
BuildSystem.add_state: setting up molecule C37 copy number 0
BuildSystem.add_state: molecule C37 sequence has 282 residues
BuildSystem.add_state: ---- setting up domain 0 of molecule C37
BuildSystem.add_state: -------- domain 0 of molecule C37 extends from residue 1 to residue 181 
BuildSystem.add_state: -------- domain 0 of molecule C37 represented by pdb file ../rnapoliii/data/C37_C53_dimer.Model_on4c3i_MN.pdb 
BuildSystem.add_state: -------- domain 0 of molecule C37 represented by gaussians 
BuildSystem.add_state: ---- setting up domain 1 of molecule C37
BuildSystem.add_state: -------- domain 1 of molecule C37 extends from residue 182 to residue 282 
BuildSystem.add_state: -------- domain 1 of molecule C37 represented by BEADS 
BuildSystem.add_state: setting up molecule C82 copy number 0
BuildSystem.add_state: molecule C82 sequence has 654 residues
BuildSystem.add_state: ---- setting up domain 0 of mo

BuildSystem.execute_macro: -------- building rigid body ['C31..0']
BuildSystem.execute_macro: -------- adding C31..0
BuildSystem.execute_macro: -------- creating rigid body with max_trans 4.0 max_rot 0.3 non_rigid_max_trans 4.0
BuildSystem.execute_macro: -------- building rigid body ['C34..0']
BuildSystem.execute_macro: -------- adding C34..0
BuildSystem.execute_macro: -------- creating rigid body with max_trans 4.0 max_rot 0.3 non_rigid_max_trans 4.0
BuildSystem.execute_macro: -------- building rigid body ['C34..1']
BuildSystem.execute_macro: -------- adding C34..1
BuildSystem.execute_macro: -------- creating rigid body with max_trans 4.0 max_rot 0.3 non_rigid_max_trans 4.0
BuildSystem.execute_macro: -------- building rigid body ['C34..2']
BuildSystem.execute_macro: -------- adding C34..2
BuildSystem.execute_macro: -------- creating rigid body with max_trans 4.0 max_rot 0.3 non_rigid_max_trans 4.0
BuildSystem.execute_macro: -------- building rigid body ['C34..3']
BuildSystem.execute_m

Adding sequence connectivity restraint between 1-10_bead  and  11-20_bead of distance 7.2
Adding sequence connectivity restraint between 11-20_bead  and  21-30_bead of distance 7.2
Adding sequence connectivity restraint between 21-30_bead  and  31-33_bead of distance 7.2
Adding sequence connectivity restraint between 31-33_bead  and  Residue_34 of distance 7.2
Adding sequence connectivity restraint between Residue_337  and  338-339_bead of distance 7.2
Adding sequence connectivity restraint between 338-339_bead  and  Residue_340 of distance 7.2
Adding sequence connectivity restraint between Residue_410  and  411_bead of distance 7.2
Adding sequence connectivity restraint between 411_bead  and  Residue_412 of distance 7.2
Adding sequence connectivity restraint between Residue_549  and  550_bead of distance 7.2
Adding sequence connectivity restraint between 550_bead  and  Residue_551 of distance 7.2
Adding sequence connectivity restraint between Residue_616  and  617-626_bead of distance

Adding sequence connectivity restraint between 1-2_bead  and  Residue_3 of distance 7.2
Adding sequence connectivity restraint between Residue_42  and  43-44_bead of distance 7.2
Adding sequence connectivity restraint between 43-44_bead  and  Residue_45 of distance 7.2
Adding sequence connectivity restraint between Residue_60  and  61_bead of distance 7.2
Adding sequence connectivity restraint between 61_bead  and  Residue_62 of distance 7.2
Adding sequence connectivity restraint between Residue_167  and  168-173_bead of distance 7.2
Adding sequence connectivity restraint between 168-173_bead  and  Residue_174 of distance 7.2
Adding sequence connectivity restraint between Residue_182  and  183-192_bead of distance 7.2
Adding sequence connectivity restraint between 183-192_bead  and  193-202_bead of distance 7.2
Adding sequence connectivity restraint between 193-202_bead  and  203-212_bead of distance 7.2
Adding sequence connectivity restraint between 203-212_bead  and  213-218_bead of 

Adding sequence connectivity restraint between 1-10_bead  and  11-20_bead of distance 7.2
Adding sequence connectivity restraint between 11-20_bead  and  21-30_bead of distance 7.2
Adding sequence connectivity restraint between 21-30_bead  and  31-39_bead of distance 7.2
Adding sequence connectivity restraint between 31-39_bead  and  Residue_40 of distance 7.2
Adding sequence connectivity restraint between Residue_112  and  113-114_bead of distance 7.2
Adding sequence connectivity restraint between 113-114_bead  and  Residue_115 of distance 7.2
Adding sequence connectivity restraint between Residue_145  and  146-151_bead of distance 7.2
Adding sequence connectivity restraint between 146-151_bead  and  Residue_152 of distance 7.2
Adding sequence connectivity restraint between Residue_207  and  208-217_bead of distance 7.2
Adding sequence connectivity restraint between 208-217_bead  and  218-227_bead of distance 7.2
Adding sequence connectivity restraint between 218-227_bead  and  228-23

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 166 of chain C160 and residue 149 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_166 and Residue_149

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 166 of chain C160 and residue 149 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_166 and Residue_149

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 479 of chain C128 and residue 482 of chain C128
CrossLinkingMassSpectrometryRes

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 216 of chain C53 and residue 840 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 211-220_bead and Residue_840

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 216 of chain C53 and residue 840 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 211-220_bead and Residue_840

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 538 of chain C160 and residue 123 of chain ABC23
CrossLinkingMassSpectrometryRe

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 348 of chain C53 and residue 367 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 340-349_bead and Residue_367

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 348 of chain C53 and residue 367 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 340-349_bead and Residue_367

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 322 of chain C53 and residue 359 of chain C53
CrossLinkingMassSpectrometryRestrai

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 1216 of chain C160 and residue 1242 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_1216 and 1242-1251_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 216 of chain C53 and residue 162 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 211-220_bead and 161-170_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 216 of chain C53 and residue 162 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 230 of chain C128 and residue 482 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_230 and Residue_482

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 891 of chain C160 and residue 371 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_891 and Residue_371

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 891 of chain C160 and residue 371 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGM

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 1028 of chain C128 and residue 790 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_1028 and Residue_790

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 1028 of chain C128 and residue 790 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_1028 and Residue_790

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 243 of chain C128 and residue 482 of chain C128
CrossLinkingMassSpectrometr

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 49 of chain C11 and residue 99 of chain C11
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_49 and Residue_99

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 126 of chain C34 and residue 65 of chain C34
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_126 and Residue_65

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 126 of chain C34 and residue 65 of chain C34
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
Cro

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 216 of chain C53 and residue 194 of chain C37
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 211-220_bead and 192-201_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 216 of chain C53 and residue 194 of chain C37
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 211-220_bead and 192-201_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 100 of chain C37 and residue 216 of chain C53
CrossLinkingMassSpectrometryRestr

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 25 of chain C53 and residue 605 of chain C82
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 21-30_bead and Residue_605

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 100 of chain C37 and residue 162 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_100 and 161-170_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 100 of chain C37 and residue 162 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi P

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 14 of chain C53 and residue 111 of chain C31
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 11-20_bead and 111-120_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 14 of chain C53 and residue 111 of chain C31
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 11-20_bead and 111-120_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 226 of chain C53 and residue 236 of chain C53
CrossLinkingMassSpectrometryRestraint: 

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 144 of chain C31 and residue 145 of chain C31
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 141-150_bead and 141-150_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 144 of chain C31 and residue 145 of chain C31
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 141-150_bead and 141-150_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 991 of chain C160 and residue 201 of chain ABC27
CrossLinkingMassSpectrometryRe

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 629 of chain C128 and residue 623 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 627-636_bead and 617-626_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 266 of chain C53 and residue 840 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 261-270_bead and Residue_840

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 266 of chain C53 and residue 840 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIG

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 123 of chain C34 and residue 65 of chain C34
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_123 and Residue_65

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 91 of chain C31 and residue 50 of chain C82
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 91-100_bead and Residue_50

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 91 of chain C31 and residue 50 of chain C82
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
Cro

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 92 of chain C17 and residue 49 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 84-93_bead and 41-50_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 92 of chain C17 and residue 49 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 84-93_bead and 41-50_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 213 of chain C128 and residue 230 of chain C128
CrossLinkingMassSpectrometryRestraint: with

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 266 of chain C53 and residue 9 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 261-270_bead and 1-10_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 32 of chain C17 and residue 25 of chain C17
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_32 and Residue_25

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 32 of chain C17 and residue 25 of chain C17
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
Cros

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 1127 of chain C160 and residue 1134 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_1127 and Residue_1134

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 145 of chain C25 and residue 49 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_145 and 41-50_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 145 of chain C25 and residue 49 of chain C53
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA p

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 325 of chain C53 and residue 322 of chain C82
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 320-329_bead and Residue_322

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 325 of chain C53 and residue 322 of chain C82
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 320-329_bead and Residue_322

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 91 of chain C31 and residue 159 of chain C25
CrossLinkingMassSpectrometryRestrain

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 1242 of chain C160 and residue 65 of chain C34
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 1242-1251_bead and Residue_65

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 1242 of chain C160 and residue 65 of chain C34
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 1242-1251_bead and Residue_65

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 115 of chain C53 and residue 1227 of chain C160
CrossLinkingMassSpectrometryR

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 180 of chain C53 and residue 100 of chain C37
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 171-180_bead and Residue_100

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 919 of chain C128 and residue 371 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_919 and Residue_371

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 919 of chain C128 and residue 371 of chain C160
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 878 of chain C160 and residue 911 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_878 and Residue_911

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 179 of chain C31 and residue 146 of chain C31
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 171-180_bead and 141-150_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 179 of chain C31 and residue 146 of chain C31
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA 

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 75 of chain C128 and residue 78 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_75 and Residue_78

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 75 of chain C128 and residue 78 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_75 and Residue_78

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 49 of chain C11 and residue 216 of chain C53
CrossLinkingMassSpectrometryRestraint: wit

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 109 of chain C128 and residue 170 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_109 and Residue_170

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 109 of chain C128 and residue 170 of chain C128
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles Residue_109 and Residue_170

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 313 of chain C82 and residue 301 of chain C82
CrossLinkingMassSpectrometryRestr

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 196 of chain C34 and residue 204 of chain C34
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 193-202_bead and 203-212_bead

--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 196 of chain C34 and residue 204 of chain C34
CrossLinkingMassSpectrometryRestraint: with sigma1 SIGMA sigma2 SIGMA psi PSI
CrossLinkingMassSpectrometryRestraint: between particles 193-202_bead and 203-212_bead

generating a new crosslink restraint
--------------
CrossLinkingMassSpectrometryRestraint: generating cross-link restraint between
CrossLinkingMassSpectrometryRestraint: residue 313 of chain C82 and residue 294 of chain C82
CrossLinkingMassSpectrometryRestr

ValueError: ../rnapoliii/data/emd_1883.map.mrc.gmm.50.txt does not exist

We can save time by skipping the actual sampling entirely (and using the previously-generated trajectory instead). To do so, turn on ``~IMP.pmi.macros.ReplicaExchange0``'s `test_mode`:

In [None]:
# total number of saved frames
num_frames = 5000

# This object defines all components to be sampled as well as the sampling protocol
mc1=IMP.pmi.macros.ReplicaExchange0(m,
              root_hier=root_hier,                         # The root hierarchy
              monte_carlo_sample_objects=dof.get_movers()+xl1rest.get_movers(), # All moving particles and parameters
              rmf_output_objects=outputobjects,            # Objects to put into the rmf file
              crosslink_restraints=[xl1rest],      # allows XLs to be drawn in the RMF files
              monte_carlo_temperature=1.0,                 
              simulated_annealing=True,
              simulated_annealing_minimum_temperature=1.0,
              simulated_annealing_maximum_temperature=2.5,
              simulated_annealing_minimum_temperature_nframes=200,
              simulated_annealing_maximum_temperature_nframes=20,
              number_of_best_scoring_models=10,
              monte_carlo_steps=10,
              number_of_frames=num_frames,
              global_output_directory=output_directory,
              test_mode=True)

# Start Sampling
mc1.execute_macro()

## Linking to other data

Integrative modeling draws on data from a variety of sources, so for a complete deposition all of this data needs to be available. The data is not placed directly in the mmCIF file - rather, the file contains links. These links can be:

 - an identifier in a domain-specific database, such as PDB or EMDB.
 - a DOI where the files can be obtained.
 - a path to a file on the local disk.

Database identifiers are preferable because the databases are curated by domain experts and include domain-specific information, and the files are in standard formats. ProtocolOutput will attempt to use these where possible. For example, in this case ProtocolOutput is able to read (using the [ihm.metadata module](@ref ihm.metadata)) the annotations of the input crystal structure used for the modeling (see below) and determine that it is stored in the PDB, so the relevant 1WCM identifier is included in the mmCIF file (see the `_ihm_dataset_related_db_reference` table). (Similarly, the EMDB EMD-1883 identifier is used for the EM density map.)

When a file is used for the modeling which cannot be tracked back to a database, ProtocolOutput will include its path (relative to that of the mmCIF file). For example, in this case the cross-links used are stored in simple CSV files. In addition, the Python script itself is linked from the mmCIF file. Such local paths won't be available to end users, so for deposition we need to replace these paths with database IDs or DOIs (see below).
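
The bookkeeping described above can be illustrated with a minimal sketch in plain Python. This is not the python-ihm API: the `Link` class, its fields, the `needs_replacing` helper and the two local file names are all hypothetical, introduced only to show how the three kinds of link differ and which ones would have to be replaced before deposition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    """One reference to an input file (hypothetical record, not python-ihm)."""
    description: str
    database: Optional[str] = None    # e.g. "PDB" or "EMDB"
    accession: Optional[str] = None   # identifier within that database
    doi: Optional[str] = None         # DOI where the file can be obtained
    local_path: Optional[str] = None  # path on the local disk

    def kind(self):
        """Classify the link, preferring database IDs, then DOIs."""
        if self.database and self.accession:
            return "database"
        if self.doi:
            return "doi"
        return "local"


def needs_replacing(links):
    """Return the links that only point at the local disk."""
    return [link for link in links if link.kind() == "local"]


links = [
    Link("crystal structure", database="PDB", accession="1WCM"),
    Link("EM density map", database="EMDB", accession="EMD-1883"),
    Link("cross-links", local_path="data/xlinks.csv"),   # made-up path
    Link("modeling script", local_path="modeling.py"),   # made-up path
]

# Report everything an end user would not be able to download
for link in needs_replacing(links):
    print(link.description, "->", link.local_path)
```

In this sketch the crystal structure and the EM map already carry database identifiers, so only the cross-link file and the script are reported.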

As a further example of linkage, see the links in the previously-published [modeling of the Nup84 complex](https://salilab.org/nup84/) below. The mmCIF file links to the data directly used in the modeling (cross-links, crystal structures, electron microscopy class averages, comparative models, and Python scripts) via database IDs or DOIs. Furthermore, where available, links are provided from this often-processed data back to the original data, such as templates for comparative models, mass spectrometry spectra for cross-links, or micrographs for class averages:

<img src="images/links.png" width="700px" title="Nup84 file linkage" />

## Annotation of input files

ProtocolOutput, using python-ihm, will look at all input files to try to extract as much metadata as possible. As described above, this is used to look up database identifiers, but it can also detect other inputs, such as the templates used for comparative modeling. Thus, it is important for deposition that all input files are annotated as well as possible:

 - deposit input files in a domain-specific database where possible and use the deposited file (which typically will contain identifying headers) in the modeling repository.
 - for PDB crystal structures, do not remove the original headers, such as the `HEADER` and `TITLE` lines.
 - for MODELLER comparative models, leave in the REMARK records and make sure that any files mentioned in `REMARK 6 ALIGNMENT:` or `REMARK 6 SCRIPT:` records are available (modify the paths if necessary, for example if you moved the PDB file into a different directory from the modeling script and alignment).
 - for manually generated PDB files, such as those extracted from a published work or generated by docking or other means, add suitable `EXPDTA` and `TITLE` records to the files for ProtocolOutput to pick up. See the [python-ihm docs](@ref ihm.metadata.PDBParser.parse_file) for more information.
 - for GMM files used for the EM density restraint, keep the original MRC file around and make sure that the `# data_fn:` header in the GMM file points to it.
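
Some of the checks in the list above are easy to automate before running the modeling script. The following is a rough sketch using only the Python standard library; it does not use or reproduce python-ihm's own parsers. The helper names `missing_pdb_records` and `gmm_source_map` are hypothetical, the demonstration file contents are made up, and the sketch assumes that the path in the `# data_fn:` header is relative to the directory containing the GMM file.

```python
import os
import tempfile


def missing_pdb_records(path, required=("HEADER", "TITLE")):
    """Return the record names in `required` that do not appear in the
    header section of a PDB file (hypothetical helper)."""
    found = set()
    with open(path) as fh:
        for line in fh:
            record = line[:6].strip()
            if record in ("ATOM", "HETATM"):
                break  # coordinates started; the header section is over
            if record in required:
                found.add(record)
    return [r for r in required if r not in found]


def gmm_source_map(gmm_path):
    """Return (mrc_path, exists) for the file named in the '# data_fn:'
    header of a GMM file, or None if there is no such header.
    Assumption: the path is relative to the GMM file's directory."""
    with open(gmm_path) as fh:
        for line in fh:
            if line.startswith("# data_fn:"):
                name = line[len("# data_fn:"):].strip()
                mrc = os.path.join(os.path.dirname(gmm_path), name)
                return mrc, os.path.exists(mrc)
    return None


# Demonstration on throwaway files (contents are made up for illustration)
with tempfile.TemporaryDirectory() as tmpdir:
    pdb = os.path.join(tmpdir, "stripped.pdb")
    with open(pdb, "w") as fh:
        fh.write("TITLE     EXAMPLE STRUCTURE\n"
                 "ATOM      1  CA  ALA A   1       0.000   0.000   0.000\n")
    gmm = os.path.join(tmpdir, "map.gmm.txt")
    with open(gmm, "w") as fh:
        fh.write("# data_fn: map.mrc\n")
    pdb_problems = missing_pdb_records(pdb)
    gmm_result = gmm_source_map(gmm)

print(pdb_problems)   # records stripped from the demonstration PDB file
print(gmm_result[1])  # whether the MRC file named in the GMM header exists
```

For a manually generated PDB file, the same helper could be called with `required=("EXPDTA", "TITLE")` to check for the records mentioned above.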


At the end of the modeling, we then simply write the entire study out to the mmCIF file by using ProtocolOutput's `flush` method:

In [8]:
po.flush()