<!--NOTEBOOK_HEADER-->
*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);
content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*

<!--NAVIGATION-->
< [Visualization with the `PyMOLMover`](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.06-Visualization-and-PyMOL-Mover.ipynb) | [Contents](toc.ipynb) | [Index](index.ipynb) | [Visualization and `pyrosetta.distributed.viewer`](http://nbviewer.jupyter.org/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.08-Visualization-and-pyrosetta.distributed.viewer.ipynb) ><p><a href="https://colab.research.google.com/github/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/02.07-RosettaScripts-in-PyRosetta.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open in Google Colaboratory"></a>

# RosettaScripts in PyRosetta

Keywords: RosettaScripts, script, xml, XMLObjects

## Overview

RosettaScripts is another way to script custom modules in PyRosetta. It is much simpler than writing PyRosetta code, yet it can be extremely powerful, and it is well documented. Many publications also give `RosettaScript` examples, or provide whole protocols as a `RosettaScript` instead of a mover or application. In addition, some early Rosetta code was written with `RosettaScripts` in mind, and some important variables may still only be changeable via `RosettaScripts`.

Recent versions of Rosetta have enabled full RosettaScript protocols to be run in PyRosetta. A new class, `XmlObjects`, also allows specific Rosetta class types to be set up in PyRosetta from XML instead of constructing them in code. This tutorial introduces how to use this integration to get the most out of Rosetta. Note that some tutorials, such as the parametric protein design notebook, use RosettaScripts (RS) almost exclusively, as it is simpler than setting up everything manually in code.


## RosettaScripts

A RosettaScript is made up of different sections where different types of Rosetta classes are constructed. You will see many of these types throughout the notebooks to come. Briefly:

- `ScoreFunctions`: A scorefunction evaluates the energy of a pose through physical and statistical energy terms.
- `ResidueSelectors`: These select a list of residues in a pose according to some criteria.
- `Movers`: These do things to a pose. They all have an `apply()` method that you will see shortly.
- `TaskOperations`: These control side-chain packing and design.
- `SimpleMetrics`: These return some metric value of a pose. This value can be a real number, a string, or a composite of values.

### Skeleton RosettaScript Format

```xml
<ROSETTASCRIPTS>
    <SCOREFXNS>
    </SCOREFXNS>
    <RESIDUE_SELECTORS>
    </RESIDUE_SELECTORS>
    <TASKOPERATIONS>
    </TASKOPERATIONS>
    <SIMPLE_METRICS>
    </SIMPLE_METRICS>
    <FILTERS>
    </FILTERS>
    <MOVERS>
    </MOVERS>
    <PROTOCOLS>
    </PROTOCOLS>
    <OUTPUT />
</ROSETTASCRIPTS>
```
Anything outside of the \< \> notation is ignored and can be used to comment the XML file.
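As a quick sanity check before handing a script to Rosetta, the skeleton above can be parsed with Python's standard library to confirm that it is well-formed and to list its sections. This is only a sketch using `xml.etree.ElementTree`; it does not perform Rosetta's own schema validation, and the `skeleton` variable name is purely illustrative.

```python
import xml.etree.ElementTree as ET

# The skeleton from above, held in a Python string (variable name is illustrative).
skeleton = """
<ROSETTASCRIPTS>
    <SCOREFXNS>
    </SCOREFXNS>
    <RESIDUE_SELECTORS>
    </RESIDUE_SELECTORS>
    <TASKOPERATIONS>
    </TASKOPERATIONS>
    <SIMPLE_METRICS>
    </SIMPLE_METRICS>
    <FILTERS>
    </FILTERS>
    <MOVERS>
    </MOVERS>
    <PROTOCOLS>
    </PROTOCOLS>
    <OUTPUT />
</ROSETTASCRIPTS>
"""

# fromstring raises xml.etree.ElementTree.ParseError if the text is not well-formed.
root = ET.fromstring(skeleton.strip())

# Each direct child of the root element is one section of the script.
section_names = [child.tag for child in root]
print(root.tag)
print(section_names)
```

A malformed script, for example one with a missing closing tag, would fail here with a `ParseError` before any Rosetta code is involved.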


### RosettaScript Example
```xml
<ROSETTASCRIPTS>
	<SCOREFXNS>
	</SCOREFXNS>
	<RESIDUE_SELECTORS>
		<CDR name="L1" cdrs="L1"/>
	</RESIDUE_SELECTORS>
	<MOVE_MAP_FACTORIES>
		<MoveMapFactory name="movemap_L1" bb="0" chi="0">
			<Backbone residue_selector="L1" />
			<Chi residue_selector="L1" />
		</MoveMapFactory>
	</MOVE_MAP_FACTORIES>
	<SIMPLE_METRICS>
		<TimingProfileMetric name="timing" />
		<SelectedResiduesMetric name="rosetta_sele" residue_selector="L1" rosetta_numbering="1"/>
		<SelectedResiduesPyMOLMetric name="pymol_selection" residue_selector="L1" />
		<SequenceMetric name="sequence" residue_selector="L1" />
		<SecondaryStructureMetric name="ss" residue_selector="L1" />
	</SIMPLE_METRICS>
	<MOVERS>
		<MinMover name="min_mover" movemap_factory="movemap_L1" tolerance=".1" /> 
		<RunSimpleMetrics name="run_metrics1" metrics="pymol_selection,sequence,ss,rosetta_sele" prefix="m1_" />
		<RunSimpleMetrics name="run_metrics2" metrics="timing,ss" prefix="m2_" />
	</MOVERS>
	<PROTOCOLS>
		<Add mover_name="run_metrics1"/>
		<Add mover_name="min_mover" />
		<Add mover_name="run_metrics2" />
	</PROTOCOLS>
</ROSETTASCRIPTS>
```

Rosetta will carry out the order of operations specified in PROTOCOLS.  An important point is that SimpleMetrics and Filters never change the sequence or conformation of the structure.

The movers do change the pose, and the output file will be the result of sequentially applying the movers in the protocols section. The standard scores of the output will be carried over from any protocol doing scoring, unless the OUTPUT tag is specified, in which case the corresponding score function from the SCOREFXNS block will be used.  
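To preview the order in which the movers would be applied, the PROTOCOLS section can be read with Python's standard library. The sketch below is not part of PyRosetta: it parses a shortened copy of the example script with `xml.etree.ElementTree`, lists the `mover_name` entries in order, and checks that each one is defined under MOVERS. The helper name `protocol_order` is hypothetical, and the check ignores any movers that Rosetta itself may predefine.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example script: only MOVERS and PROTOCOLS are kept.
script = """
<ROSETTASCRIPTS>
    <MOVERS>
        <MinMover name="min_mover" movemap_factory="movemap_L1" tolerance=".1" />
        <RunSimpleMetrics name="run_metrics1" metrics="pymol_selection,sequence,ss,rosetta_sele" prefix="m1_" />
        <RunSimpleMetrics name="run_metrics2" metrics="timing,ss" prefix="m2_" />
    </MOVERS>
    <PROTOCOLS>
        <Add mover_name="run_metrics1"/>
        <Add mover_name="min_mover" />
        <Add mover_name="run_metrics2" />
    </PROTOCOLS>
</ROSETTASCRIPTS>
"""


def protocol_order(xml_text):
    """Return mover names in PROTOCOLS order; raise ValueError if one is undefined."""
    root = ET.fromstring(xml_text.strip())
    movers = root.find("MOVERS")
    # Names declared in the MOVERS section (empty set if the section is absent).
    defined = {m.get("name") for m in movers} if movers is not None else set()
    # Add tags without a mover_name attribute are skipped.
    order = [
        add.get("mover_name")
        for add in root.findall("./PROTOCOLS/Add")
        if add.get("mover_name")
    ]
    missing = [name for name in order if name not in defined]
    if missing:
        raise ValueError("PROTOCOLS references undefined movers: %s" % missing)
    return order


order = protocol_order(script)
print(order)
```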

## RosettaScripts Documentation

We recommend reading the RosettaScripts documentation at the link below. Note that each type of Rosetta class has a list and documentation of all accessible components, which is extremely useful for getting an idea of what Rosetta can do and how to use it in PyRosetta.

https://www.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/RosettaScripts


In [None]:
!pip install pyrosettacolabsetup
import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()
import pyrosetta; pyrosetta.init()


## Running Whole Protocols via RosettaScriptsParser

Here we will use the parser to generate a whole ParsedProtocol (mover). This mover can then be run with the `apply` method on a pose of interest.

Let's run the protocol above. We will load it directly from the XML file, `inputs/min_L1.xml`.

In [1]:
import os  # used below for os.getenv("DEBUG")
from pyrosetta import *
from rosetta.protocols.rosetta_scripts import *

init('-no_fconfig @inputs/rabd/common')

PyRosetta-4 2019 [Rosetta PyRosetta4.Release.python36.mac 2019.39+release.93456a567a8125cafdf7f8cb44400bc20b570d81 2019-09-26T14:24:44] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
core.init: Rosetta version: PyRosetta4.Release.python36.mac r233 2019.39+release.93456a567a8 93456a567a8125cafdf7f8cb44400bc20b570d81 http://www.pyrosetta.org 2019-09-26T14:24:44
core.init: command: PyRosetta -no_fconfig @inputs/rabd/common -database /Users/jadolfbr/Library/Python/3.6/lib/python/site-packages/pyrosetta-2019.39+release.93456a567a8-py3.6-macosx-10.6-intel.egg/pyrosetta/database
basic.random.init_random_generator: 'RNG device' seed mode, using '/dev/urandom', seed=1563992860 seed_offset=0 real_seed=1563992860
basic.random.init_random_generator: RandomGenerator:init: Normal mode, seed=1563992860 RG_type=mt19937


  


In [2]:
pose = pose_from_pdb("inputs/rabd/my_ab.pdb")
original_pose = pose.clone()

core.chemical.GlobalResidueTypeSet: Finished initializing fa_standard residue type set.  Created 980 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 0.899691 seconds.
core.import_pose.import_pose: File 'inputs/rabd/my_ab.pdb' automatically determined to be of type PDB
core.io.pdb.pdb_reader: Parsing 993 .pdb records with unknown format to search for Rosetta-specific comments.
core.conformation.Conformation: Found disulfide between residues 771 845
core.conformation.Conformation: current variant for 771 CYS
core.conformation.Conformation: current variant for 845 CYS
core.conformation.Conformation: current variant for 771 CYD
core.conformation.Conformation: current variant for 845 CYD
core.conformation.Conformation: Found disulfide between residues 891 956
core.conformation.Conformation: current variant for 891 CYS
core.conformation.Conformation: current variant for 

In [7]:

parser = RosettaScriptsParser()
protocol = parser.generate_mover("inputs/min_L1.xml")

if not os.getenv("DEBUG"):
    protocol.apply(pose)

protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Validating input script...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Parsed script:
<ROSETTASCRIPTS>
	<SCOREFXNS/>
	<RESIDUE_SELECTORS>
		<CDR cdrs="L1" name="L1"/>
	</RESIDUE_SELECTORS>
	<MOVE_MAP_FACTORIES>
		<MoveMapFactory bb="0" chi="0" name="movemap_L1">
			<Backbone residue_selector="L1"/>
			<Chi residue_selector="L1"/>
		</MoveMapFactory>
	</MOVE_MAP_FACTORIES>
	<SIMPLE_METRICS>
		<TimingProfileMetric name="timing"/>
		<SelectedResiduesMetric name="rosetta_sele" residue_selector="L1" rosetta_numbering="1"/>
		<SelectedResiduesPyMOLMe

antibody.AntibodyInfo: Setting up CDR Cluster for H3
protocols.antibody.cluster.CDRClusterMatcher: Length: 13 Omega: TTTTTTTTTTTTT
antibody.AntibodyInfo: Setting up CDR Cluster for L1
protocols.antibody.cluster.CDRClusterMatcher: Length: 11 Omega: TTTTTTTTTTT
antibody.AntibodyInfo: Setting up CDR Cluster for L2
protocols.antibody.cluster.CDRClusterMatcher: Length: 8 Omega: TTTTTTTT
antibody.AntibodyInfo: Setting up CDR Cluster for L3
protocols.antibody.cluster.CDRClusterMatcher: Length: 9 Omega: TTTTTTCTT
protocols.analysis.simple_metrics.RunSimpleMetricsMover: Running: SelectedResiduesMetric - calculating selection
basic.io.database: Database file opened: sampling/antibodies/cluster_center_dihedrals.txt
protocols.antibody.AntibodyNumberingParser: Antibody numbering scheme definitions read successfully
protocols.antibody.AntibodyNumberingParser: Antibody CDR definition read successfully

## Running via XMLObjects and strings

Next, we will use `XmlObjects` to create a protocol from a string. Note that internally, `XmlObjects` uses special functionality of the `RosettaScriptsParser`. `XmlObjects` also has a `create_from_file` method that takes a path to an XML file.
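Because the same script text can come from a string or from a file, it can be handy to keep one copy of the XML in Python and write it to disk only when a path is needed. The sketch below uses only the standard library; `write_script` is a hypothetical helper, and the PyRosetta calls named in the comments are the ones used in this notebook.

```python
import os
import tempfile


def write_script(xml_text, directory=None):
    """Write a RosettaScript string to a new .xml file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".xml", dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write(xml_text)
    return path


# A deliberately tiny script, used here only to show the round trip.
example = "<ROSETTASCRIPTS>\n\t<PROTOCOLS/>\n</ROSETTASCRIPTS>\n"
script_path = write_script(example)

# The path could be given to XmlObjects.create_from_file(script_path) or
# RosettaScriptsParser().generate_mover(script_path), while the string itself
# could go to XmlObjects.create_from_string(example).
with open(script_path) as handle:
    round_trip = handle.read()

# Clean up the temporary file once it is no longer needed.
os.remove(script_path)
```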

In [17]:
pose = original_pose.clone()

min_L1 = """
<ROSETTASCRIPTS>
	<SCOREFXNS>
	</SCOREFXNS>
	<RESIDUE_SELECTORS>
		<CDR name="L1" cdrs="L1"/>
	</RESIDUE_SELECTORS>
	<MOVE_MAP_FACTORIES>
		<MoveMapFactory name="movemap_L1" bb="0" chi="0">
			<Backbone residue_selector="L1" />
			<Chi residue_selector="L1" />
		</MoveMapFactory>
	</MOVE_MAP_FACTORIES>
	<SIMPLE_METRICS>
		<TimingProfileMetric name="timing" />
		<SelectedResiduesMetric name="rosetta_sele" residue_selector="L1" rosetta_numbering="1"/>
		<SelectedResiduesPyMOLMetric name="pymol_selection" residue_selector="L1" />
		<SequenceMetric name="sequence" residue_selector="L1" />
		<SecondaryStructureMetric name="ss" residue_selector="L1" />
	</SIMPLE_METRICS>
	<MOVERS>
		<MinMover name="min_mover" movemap_factory="movemap_L1" tolerance=".1" /> 
		<RunSimpleMetrics name="run_metrics1" metrics="pymol_selection,sequence,ss,rosetta_sele" prefix="m1_" />
		<RunSimpleMetrics name="run_metrics2" metrics="timing,ss" prefix="m2_" />
	</MOVERS>
	<PROTOCOLS>
		<Add mover_name="run_metrics1"/>
		<Add mover_name="min_mover" />
		<Add mover_name="run_metrics2" />
	</PROTOCOLS>
</ROSETTASCRIPTS>
"""


xml = XmlObjects.create_from_string(min_L1)
protocol = xml.get_mover("ParsedProtocol")

if not os.getenv("DEBUG"):
    protocol.apply(pose)

protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Validating input script...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Parsed script:
<ROSETTASCRIPTS>
	<SCOREFXNS/>
	<RESIDUE_SELECTORS>
		<CDR cdrs="L1" name="L1"/>
	</RESIDUE_SELECTORS>
	<MOVE_MAP_FACTORIES>
		<MoveMapFactory bb="0" chi="0" name="movemap_L1">
			<Backbone residue_selector="L1"/>
			<Chi residue_selector="L1"/>
		</MoveMapFactory>
	</MOVE_MAP_FACTORIES>
	<SIMPLE_METRICS>
		<TimingProfileMetric name="timing"/>
		<SelectedResiduesMetric name="rosetta_sele" residue_selector="L1" rosetta_numbering="1"/>
		<SelectedResiduesPyMOLMe

antibody.AntibodyInfo: Setting up CDR Cluster for H3
protocols.antibody.cluster.CDRClusterMatcher: Length: 13 Omega: TTTTTTTTTTTTT
antibody.AntibodyInfo: Setting up CDR Cluster for L1
protocols.antibody.cluster.CDRClusterMatcher: Length: 11 Omega: TTTTTTTTTTT
antibody.AntibodyInfo: Setting up CDR Cluster for L2
protocols.antibody.cluster.CDRClusterMatcher: Length: 8 Omega: TTTTTTTT
antibody.AntibodyInfo: Setting up CDR Cluster for L3
protocols.antibody.cluster.CDRClusterMatcher: Length: 9 Omega: TTTTTTCTT
protocols.analysis.simple_metrics.RunSimpleMetricsMover: Running: SelectedResiduesMetric - calculating selection
basic.io.database: Database file opened: sampling/antibodies/cluster_center_dihedrals.txt
protocols.antibody.AntibodyNumberingParser: Antibody numbering scheme definitions read successfully
protocols.antibody.AntibodyNumberingParser: Antibody CDR definition read successfully

## Constructing Rosetta objects using XMLObjects

### Pulling from whole script

Here we will use the `XmlObjects` instance that we set up from our script to pull out a specific component. Note that while this is very useful for running pre-defined Rosetta objects, we will not have any tab completion for the result, as it is returned as a generic type, which means we will be unable to modify it further.

Let's grab the residue selector and then see which residues are in L1.

In [24]:

L1_sele = xml.get_residue_selector("L1")
L1_res = L1_sele.apply(pose)
for i in range(1, len(L1_res)+1):
    if L1_res[i]:
        print("L1 Residue: ", pose.pdb_info().pose2pdb(i), ":", i )

basic.io.database: Database file opened: sampling/antibodies/cluster_center_dihedrals.txt
protocols.antibody.AntibodyNumberingParser: Antibody numbering scheme definitions read successfully
protocols.antibody.AntibodyNumberingParser: Antibody CDR definition read successfully
antibody.AntibodyInfo: Successfully finished the CDR definition
antibody.AntibodyInfo: AC Detecting Regular CDR H3 Stem Type
antibody.AntibodyInfo: SRWGGDGFYAMDYW
antibody.AntibodyInfo: AC Finished Detecting Regular CDR H3 Stem Type: KINKED
antibody.AntibodyInfo: AC Finished Detecting Regular CDR H3 Stem Type: Kink: 1 Extended: 0
antibody.AntibodyInfo: Setting up CDR Cluster for H1
protocols.antibody.cluster.CDRClusterMatcher: Length: 13 Omega: TTTTTTTTTTTTT
antibody.AntibodyInfo: Setting up CDR Cluster for H2
protocols.antibody.cluster.CDRClusterMatcher: Length: 10 Omega: TTTTTTTTTT
antibody.AntibodyInfo: Settin

### Constructing from single section

Here, instead of parsing a whole script, we'll simply create the same L1 selector from the tag string itself. This can be used for nearly every Rosetta class type in the script. The 'static' part in the name means that we do not have to construct an `XmlObjects` instance first; we can simply call the function on the class.

In [25]:
L1_sele = XmlObjects.static_get_residue_selector('<CDR name="L1" cdrs="L1"/>')
L1_res = L1_sele.apply(pose)
for i in range(1, len(L1_res)+1):
    if L1_res[i]:
        print("L1 Residue: ", pose.pdb_info().pose2pdb(i), ":", i )

protocols.rosetta_scripts.RosettaScriptsParser: Generating XML Schema for rosetta_scripts...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Initializing schema validator...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Validating input script...
protocols.rosetta_scripts.RosettaScriptsParser: ...done
protocols.rosetta_scripts.RosettaScriptsParser: Parsed script:
<ROSETTASCRIPTS>
	<RESIDUE_SELECTORS>
		<CDR cdrs="L1" name="L1"/>
	</RESIDUE_SELECTORS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015
protocols.antibody.residue_selector.CDRResidueSelector: Setting CDRs from settings
protocols.rosetta_scripts.ParsedProtocol: ParsedProtocol mover with the following movers and filters
basic.io.database: Database file opened: sampling/antibodies

Do these residues match what we had before?  Why do both of these seem a bit slower?  The actual residue selection is extremely quick, but validating the XML against a schema (which checks to make sure the string that you passed is valid and works) takes time. 
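To answer the first question programmatically, the two selections can be reduced to lists of selected residue numbers and compared. The sketch below uses plain Python lists of booleans as stand-ins for the values returned by `apply()`, and assumes the real return value can be iterated in residue order; `selected_positions` is a hypothetical helper, not a PyRosetta function.

```python
def selected_positions(mask):
    """Return the 1-based positions whose entry in the boolean mask is true."""
    return [index for index, is_selected in enumerate(mask, start=1) if is_selected]


# Stand-in masks; with PyRosetta these would come from selector.apply(pose).
mask_from_script = [False, True, True, False, True]
mask_from_static = [False, True, True, False, True]

positions_a = selected_positions(mask_from_script)
positions_b = selected_positions(mask_from_static)

print(positions_a)
print(positions_a == positions_b)
```

Comparing position lists rather than printed output keeps the check independent of how each residue is formatted.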

And that's it!  That should be everything you need to know about RosettaScripts in PyRosetta.  Enjoy! 

For `XmlObjects`, each type has a corresponding function, with and without `static`. These are listed below, but tab completion will help you here. As you have seen above, the static functions are called on the class, `XmlObjects`, while the non-static functions are called on an instance of the class after parsing a script; in our example, the instance was called `xml`.

```
.get_score_function   / .static_get_score_function

.get_residue_selector / .static_get_residue_selector

.get_simple_metric    / .static_get_simple_metric

.get_filter           / .static_get_filter

.get_mover            / .static_get_mover

.get_task_operation   / .static_get_task_operation

```
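When building single-tag strings for the static functions, assembling the tag from a name and its attributes avoids quoting mistakes. This is a sketch using the standard library's `xml.etree.ElementTree`; `make_tag` is a hypothetical helper and not part of PyRosetta.

```python
import xml.etree.ElementTree as ET


def make_tag(tag, **attributes):
    """Build a single XML tag string with properly escaped attribute values."""
    element = ET.Element(tag, {key: str(value) for key, value in attributes.items()})
    return ET.tostring(element, encoding="unicode")


cdr_tag = make_tag("CDR", name="L1", cdrs="L1")
print(cdr_tag)

# The string could then be passed on as in the cell above, for example:
# L1_sele = XmlObjects.static_get_residue_selector(cdr_tag)
```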
