### PyRosettaCluster 
## Tutorial 4. Ligand params

Tutorial 4 is an example of using ligand params with PyRosettaCluster. 

If a structure contains a ligand that requires a params file, Rosetta must be initialized with that params file before PyRosettaCluster distributes any jobs. For reproducibility outside PyRosettaCluster, Rosetta should also be initialized with a constant seed.

### 1. Import packages

In [1]:
import bz2
import glob
import json
import logging
logging.basicConfig(level=logging.INFO)
import os
import pyrosetta
import pyrosetta.distributed.io as io
import pyrosetta.distributed.packed_pose as packed_pose
import pyrosetta.distributed.tasks.rosetta_scripts as rosetta_scripts
import pyrosetta.distributed.tasks.score as score
import pyrosetta.distributed.viewer as viewer
import random
import tempfile

from pyrosettacluster import PyRosettaCluster, get_instance_kwargs, reproduce

### 2. Initialize a compute cluster using `dask`

1. Click the "Dask" tab in Jupyter Lab <i>(arrow, left)</i>
2. Click the "+ NEW" button to launch a new compute cluster <i>(arrow, lower)</i>

![title](images/dask_labextension_1.png)

3. Once the cluster has started, click the brackets to "inject client code" for the cluster into your notebook

![title](images/dask_labextension_2.png)

Inject client code here, then run the cell:

In [2]:
from dask.distributed import Client

client = Client("tcp://127.0.0.1:45657")
client

0,1
Client  Scheduler: tcp://127.0.0.1:45657  Dashboard: http://127.0.0.1:8787/status,Cluster  Workers: 4  Cores: 4  Memory: 16.63 GB


### 3. Define the user-provided paths:

In [3]:
my_PyRosettaCluster_git_repo = '/shared/home/aloshbaugh/PyRosettaCluster'

in_dir = os.path.join(my_PyRosettaCluster_git_repo, 'tutorials/input')
work_dir = os.path.join( 
    my_PyRosettaCluster_git_repo, 
    'tutorials/4_Ligand_params' 
)

### 4. Define ligand params file and initialize Rosetta with a constant seed

The `-extra_res_fa` flag loads the ligand params file. The `-run:constant_seed` and `-run:jran` flags set a constant seed and are necessary for reproducibility.

Initialization is necessary prior to distributing jobs that modify the ligand. If you do not properly initialize Rosetta within the Jupyter Notebook, distribution may fail.

In [4]:
params = os.path.join(in_dir, 'FEN.params')
pyrosetta.init(
    extra_options=f"-extra_res_fa {params} -run:constant_seed 1 -run:jran 1111111"
)

INFO:pyrosetta.rosetta:Found rosetta database at: /shared/home/aloshbaugh/.conda/envs/jupyterlab/lib/python3.7/site-packages/pyrosetta/database; using it....
INFO:pyrosetta.rosetta:PyRosetta-4 2020 [Rosetta PyRosetta4.conda.linux.cxx11thread.serialization.CentOS.python37.Release 2020.15+release.3121c734db02d2b62dd1974dcb8daface3f50057 2020-04-10T09:29:24] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
INFO:rosetta:core.init: {0} Checking for fconfig files in pwd and ./rosetta/flags
INFO:rosetta:core.init: {0} Rosetta version: PyRosetta4.conda.linux.cxx11thread.serialization.CentOS.python37.Release r251 2020.15+release.3121c73 3121c734db02d2b62dd1974dcb8daface3f50057 http://www.pyrosetta.org 2020-04-10T09:29:24
INFO:rosetta:core.init: {0} command: PyRosetta -ex1 -ex2aro -extra_res_fa /shared/home/aloshbaugh/PyRosettaCluster/tutorials/input/FEN.params -run:constant_see

PyRosetta-4 2020 [Rosetta PyRosetta4.conda.linux.cxx11thread.serialization.CentOS.python37.Release 2020.15+release.3121c734db02d2b62dd1974dcb8daface3f50057 2020-04-10T09:29:24] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
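
The initialization string used above can also be assembled from a list of flags, which keeps line-continuation whitespace out of the options. The following is a minimal sketch; `build_init_flags` is a hypothetical helper (not part of PyRosetta or PyRosettaCluster), and the relative params path is only a placeholder.

```python
import os


def build_init_flags(params_path, seed=1111111):
    """Return a single-line Rosetta flags string (hypothetical helper)."""
    flags = [
        f"-extra_res_fa {params_path}",  # load the ligand params file
        "-run:constant_seed 1",          # switch on the constant seed
        f"-run:jran {seed}",             # the seed value itself
    ]
    return " ".join(flags)


# Placeholder path; substitute the location of your own params file.
flags = build_init_flags(os.path.join("tutorials", "input", "FEN.params"))
print(flags)
```

The resulting string could then be passed as `extra_options` to `pyrosetta.init()`, as in the cell above.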


### 5. Define the user-provided protocol:

In [5]:
def protocol1(ppose=None, **kwargs):
    """
    Relax ligand residue. 
    
    Ligand is fentanyl, protein is designed fentanyl binder.
    
    Reference: https://elifesciences.org/articles/28909
    
    Args:
        ppose: A `PackedPose` object. Optional.
        **kwargs: PyRosettaCluster keyword arguments.

    Returns:
        A `PackedPose` object.
    
    """
    import pyrosetta
    import pyrosetta.distributed.io as io
    import pyrosetta.distributed.tasks.rosetta_scripts as rosetta_scripts

    input_protocol = """
        <ROSETTASCRIPTS>

          <RESIDUE_SELECTORS>
            <Index name="fen" resnums="1X"/>
            <Not name="notfentanyl" selector="fen"/>
          </RESIDUE_SELECTORS>

          <TASKOPERATIONS>
            <ResfileCommandOperation 
              name="NATAA_fentanyl" command="NATAA" residue_selector="fen"
            />
            <OperateOnResidueSubset 
              name="restrict_others" selector="notfentanyl">
              <PreventRepackingRLT/>
            </OperateOnResidueSubset>
          </TASKOPERATIONS>

          <MOVERS>
            <FastRelax 
              name="relax_mover" 
              task_operations="NATAA_fentanyl,restrict_others" 
            />
          </MOVERS>

          <PROTOCOLS>
            <Add mover="relax_mover"/>
          </PROTOCOLS>
        </ROSETTASCRIPTS>
        """

    design_protocol = rosetta_scripts.SingleoutputRosettaScriptsTask(
        input_protocol
    )
    
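    # Load the input structure named by the task's "s" key.
    in_pose = io.pose_from_file(kwargs['s'])
    
    # Apply the RosettaScripts protocol to a copy of the input pose.
    out_pose = design_protocol(in_pose.pose.clone())
    
    return out_pose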
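
Because the XML is embedded as a string, a mistyped selector name only surfaces when a worker parses the script. As a rough pre-flight check, the sketch below uses the standard library to list selector names that are referenced but never defined under `RESIDUE_SELECTORS`. The function `undefined_selectors` and the example XML are hypothetical, and the check only looks at `selector` and `residue_selector` attributes, so it is a simplification rather than a full RosettaScripts validator.

```python
import xml.etree.ElementTree as ET


def undefined_selectors(xml_text):
    """List selector names that are referenced but not defined (simplified check)."""
    root = ET.fromstring(xml_text)

    # Collect every name declared inside the RESIDUE_SELECTORS section.
    defined = set()
    section = root.find("RESIDUE_SELECTORS")
    if section is not None:
        for element in section.iter():
            name = element.get("name")
            if name:
                defined.add(name)

    # Collect every name referenced through the two attributes we inspect.
    referenced = set()
    for element in root.iter():
        for attribute in ("selector", "residue_selector"):
            value = element.get(attribute)
            if value:
                referenced.update(part.strip() for part in value.split(","))

    return sorted(referenced - defined)


# Hypothetical script: "ligand" is referenced but only "lig" is defined.
example = """
<ROSETTASCRIPTS>
  <RESIDUE_SELECTORS>
    <Index name="lig" resnums="1X"/>
    <Not name="not_lig" selector="ligand"/>
  </RESIDUE_SELECTORS>
</ROSETTASCRIPTS>
"""
missing = undefined_selectors(example)
print(missing)
```

An empty list means every referenced selector name was found among the definitions.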

### 6. Launch the original simulation using `distribute()`

The protocol produces a decoy, which we will reproduce at a later step.

In [6]:
def create_tasks():
    yield {
        "options": "-ex1",
        "extra_options": (
            "-out:level 300 -multithreading:total_threads 1 "
            f"-extra_res_fa {params}"
        ),
        "s": os.path.join(in_dir, '2qz3_fen_renumber.pdb'),
    }

protocols = [protocol1]
    
PyRosettaCluster(
    tasks=create_tasks,
    client=client,
    scratch_dir=work_dir,
    output_path=work_dir,
).distribute(protocols=protocols)
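
The task generator above yields a single dictionary, but the same pattern extends to several input structures. The sketch below is an illustration only: `make_tasks` and the file names are hypothetical, and it assumes each task carries the same flags as the one above.

```python
def make_tasks(pdb_paths, params_path):
    """Yield one task dictionary per input structure (hypothetical helper)."""
    for pdb_path in pdb_paths:
        yield {
            "options": "-ex1",
            "extra_options": (
                "-out:level 300 -multithreading:total_threads 1 "
                f"-extra_res_fa {params_path}"
            ),
            "s": pdb_path,  # read by the protocol as kwargs["s"]
        }


# Placeholder file names, used only to show the shape of the tasks.
tasks = list(make_tasks(["a.pdb", "b.pdb"], "FEN.params"))
for task in tasks:
    print(task["s"], "|", task["extra_options"])
```

Since the tutorial passes a zero-argument function as `tasks`, a generator like this would likely need to be wrapped in a zero-argument function in the style of `create_tasks` before being handed to `PyRosettaCluster`.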



While jobs are running, you may monitor their progress using the dask dashboard diagnostics within Jupyter Lab!

In the "Dask" tab, click the various diagnostic tools _(arrows)_ to open new tabs:

![title](images/dask_labextension_4.png)

Arrange the diagnostic tool tabs within Jupyter Lab as you see fit by clicking and dragging them:

![title](images/dask_labextension_3.png)

### 7. Visualize the resultant decoy

Gather the decoy from disk into memory:

In [7]:
results = glob.glob(os.path.join(work_dir, "decoys/*/*.pdb.bz2"))
packed_poses = []
for bz2file in results:
    with open(bz2file, "rb") as f:
        packed_poses.append(
            io.pose_from_pdbstring(bz2.decompress(f.read()).decode())
        )
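
The decompression step in the loop above can also be written with `bz2.open` in text mode, which decompresses and decodes in one call. The sketch below demonstrates this on a throwaway file so it runs without PyRosetta; `read_bz2_text` is a hypothetical helper, and the single `ATOM` line is placeholder content rather than a real decoy.

```python
import bz2
import os
import tempfile


def read_bz2_text(path):
    """Return the decompressed text of a .bz2 file (hypothetical helper)."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


# Write a small compressed file, then read it back as text.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.pdb.bz2")
    with bz2.open(demo_path, "wt", encoding="utf-8") as handle:
        handle.write("ATOM      1  N   MET A   1\n")
    pdb_text = read_bz2_text(demo_path)

print(pdb_text)
```

The returned string could be passed to `io.pose_from_pdbstring()` in the same way as the decoded bytes in the loop above.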

View the pose in memory. 

The relaxed structure (2qz3) is shown as a rainbow ribbon, with side chains and the fentanyl ligand shown as white sticks.

In [8]:
view = viewer.init(packed_poses, window_size=(800, 600))
view.add(viewer.setStyle())
view.add(viewer.setStyle(colorscheme="whiteCarbon", radius=0.25))
view.add(viewer.setHydrogenBonds())
view.add(viewer.setHydrogens(polar_only=True))
view.add(viewer.setDisulfides(radius=0.25))
view()

### 8. Reproduce the decoy

The protocol produced only one decoy, which is the first element (index 0) of `results`:

In [9]:
decoy = results[0]

Optionally, `PyRosettaCluster` instance keyword arguments to reproduce this decoy can be viewed using `get_instance_kwargs()`.

In [10]:
instance_kwargs = get_instance_kwargs(input_file=decoy)
instance_kwargs

{'ami_id': '',
 'compressed': True,
 'cores': 1,
 'dashboard_address': ':8787',
 'decoy_dir_name': 'decoys',
 'decoy_ids': [0],
 'dry_run': False,
 'ignore_errors': False,
 'instance_id': '',
 'logging_level': 'INFO',
 'logs_dir_name': 'logs',
 'max_workers': 1000,
 'memory': '4g',
 'min_workers': 1,
 'nstruct': 1,
 'output_path': '/shared/home/aloshbaugh/PyRosettaCluster/tutorials/4_Ligand_params',
 'processes': 1,
 'project_name': '2020.05.19.16.00.55.251064',
 'protocols': ['protocol1'],
 'save_all': False,
 'scheduler': None,
 'scorefile_name': 'scores.json',
 'scratch_dir': '/shared/home/aloshbaugh/PyRosettaCluster/tutorials/4_Ligand_params',
 'seeds': ['-1197484353'],
 'sha1': '',
 'simulation_name': '2020.05.19.16.00.55.251064',
 'tasks': {'options': '-ex1',
  'extra_options': '-out:level 300 -multithreading:total_threads 1 -extra_res_fa /shared/home/aloshbaugh/PyRosettaCluster/tutorials/input/FEN.params',
  's': '/shared/home/aloshbaugh/PyRosettaCluster/tutorials/input/2qz3_fen

### Congrats! 
You have successfully run a Rosetta simulation that modifies a ligand using `PyRosettaCluster`! This ends Tutorial 4.