# Calculating Polymer Interactions

This demo shows how to calculate inter-chain and intra-chain polymer-polymer interactions. Polymers here are protein, DNA, or RNA chains.

In [1]:
from pyspark.sql import SparkSession                  
from mmtfPyspark.io import mmtfReader
from mmtfPyspark.interactions import InteractionExtractor, InteractionFilter                                

## Configure Spark

In [2]:
spark = SparkSession.builder.master("local[4]").appName("PolymerInteractionsDemo").getOrCreate()

## Download sample PDB structures

In [3]:
pdb = mmtfReader.download_mmtf_files(['1OHR', '6CUH'])

## Define Interaction Filter
Interaction filters define the criteria for finding interactions. Here, we present two simple examples. More advanced options are available.

#### Define disulfide bond

In [4]:
disulfide_bond = InteractionFilter(distanceCutoff=3.0)
disulfide_bond.set_query_groups(True, 'CYS')
disulfide_bond.set_query_atom_names(True, 'SG')
disulfide_bond.set_target_groups(True, 'CYS')
disulfide_bond.set_target_atom_names(True, 'SG')
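
The distance cutoff is the heart of this filter: a pair of SG atoms only counts if the two atoms lie within 3.0 Å of each other. The plain-Python sketch below illustrates that criterion on made-up coordinates. It is not how mmtfPyspark performs the search; `within_cutoff` is a hypothetical helper, and treating the cutoff as inclusive is an assumption.

```python
import math

def within_cutoff(atom_a, atom_b, cutoff=3.0):
    """Return True if two 3D points are at most `cutoff` Angstrom apart.

    Hypothetical helper for illustration; the inclusive comparison is an assumption.
    """
    return math.dist(atom_a, atom_b) <= cutoff

# Made-up SG coordinates, chosen only to illustrate the check
sg_1 = (10.0, 12.0, 5.0)
sg_2 = (11.2, 13.1, 6.3)   # roughly 2.08 Angstrom from sg_1
sg_3 = (15.0, 12.0, 5.0)   # exactly 5.0 Angstrom from sg_1

print(within_cutoff(sg_1, sg_2))  # pair close enough to qualify
print(within_cutoff(sg_1, sg_3))  # pair too far apart
```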

#### Define salt bridge interactions

In [5]:
salt_bridge = InteractionFilter(distanceCutoff=3.5)
salt_bridge.set_query_groups(True, ['ASP', 'GLU'])
salt_bridge.set_query_atom_names(True, ['OD1', 'OD2', 'OE1', 'OE2'])
salt_bridge.set_target_groups(True, ['ARG', 'LYS', 'HIS'])
salt_bridge.set_target_atom_names(True, ['NH1', 'NH2', 'NZ', 'ND1', 'NE2'])
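
To make the combined criteria concrete, the sketch below restates the salt-bridge filter as a plain-Python predicate. It assumes that the leading `True` in each setter means "include these names"; `is_salt_bridge` and the constants are hypothetical names for illustration, not part of the mmtfPyspark API.

```python
# Hypothetical restatement of the salt-bridge criteria; not the mmtfPyspark API
QUERY_GROUPS = {'ASP', 'GLU'}
QUERY_ATOMS = {'OD1', 'OD2', 'OE1', 'OE2'}
TARGET_GROUPS = {'ARG', 'LYS', 'HIS'}
TARGET_ATOMS = {'NH1', 'NH2', 'NZ', 'ND1', 'NE2'}
CUTOFF = 3.5  # Angstrom, same value as distanceCutoff above

def is_salt_bridge(query, target, distance):
    """query and target are (group_name, atom_name) tuples; distance in Angstrom."""
    q_group, q_atom = query
    t_group, t_atom = target
    return (q_group in QUERY_GROUPS and q_atom in QUERY_ATOMS
            and t_group in TARGET_GROUPS and t_atom in TARGET_ATOMS
            and distance <= CUTOFF)

print(is_salt_bridge(('ASP', 'OD2'), ('ARG', 'NH1'), 2.96))  # all criteria met
print(is_salt_bridge(('ASP', 'CA'), ('ARG', 'NH1'), 2.96))   # CA is not a listed atom
print(is_salt_bridge(('GLU', 'OE1'), ('LYS', 'NZ'), 4.20))   # beyond the cutoff
```

All three conditions (group name, atom name, distance) must hold for a pair to be reported under this reading of the filter.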

## Find Interactions
Using the interaction filters defined above, we can find interactions between chains (inter-chain) and within chains (intra-chain).

By default, inter-chain interactions are calculated.

#### Find inter-chain disulfide bonds
Note that the queryChainId and targetChainId for each interaction are different.

In [6]:
interactions = InteractionExtractor.get_polymer_interactions(pdb, disulfide_bond)
interactions.toPandas()

| | structureChainId | queryGroupId | queryChainId | queryGroupNumber | targetGroupId | targetChainId | targetGroupNumber | sequenceIndex | sequence |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6CUH.A | CYS | B | 172 | CYS | A | 161 | 160 | GNSVTQMEGPVTLSEEAFLTINCTYTATGYPSLFWYVQYPGEGLQL... |
| 1 | 6CUH.B | CYS | A | 161 | CYS | B | 172 | 171 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
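
The inter-chain property of these rows can be checked mechanically. The sketch below copies the relevant columns from the output above into plain tuples and compares the chain IDs; `interaction_type` is a hypothetical helper. It also checks an observation that holds for the rows shown here, namely that the chain in structureChainId matches targetChainId.

```python
# (structureChainId, queryChainId, queryGroupNumber, targetChainId, targetGroupNumber)
# copied from the inter-chain disulfide output above
rows = [
    ('6CUH.A', 'B', 172, 'A', 161),
    ('6CUH.B', 'A', 161, 'B', 172),
]

def interaction_type(query_chain, target_chain):
    """Hypothetical helper: classify an interaction by comparing chain IDs."""
    return 'inter' if query_chain != target_chain else 'intra'

for structure_chain, q_chain, q_num, t_chain, t_num in rows:
    kind = interaction_type(q_chain, t_chain)
    # Chain part of structureChainId, e.g. 'A' from '6CUH.A'
    listed_chain = structure_chain.split('.')[1]
    print(structure_chain, kind, listed_chain == t_chain)
```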


#### Find intra-chain disulfide bonds

By setting the flags ```intra=True``` and/or ```inter=True```, intra-chain interactions, inter-chain interactions, or both can be reported.

Here, we calculate the intra-chain disulfide bonds.

In [7]:
interactions = InteractionExtractor.get_polymer_interactions(pdb, disulfide_bond, inter=False, intra=True)
interactions.toPandas()

| | structureChainId | queryGroupId | queryChainId | queryGroupNumber | targetGroupId | targetChainId | targetGroupNumber | sequenceIndex | sequence |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6CUH.A | CYS | A | 23 | CYS | A | 90 | 89 | GNSVTQMEGPVTLSEEAFLTINCTYTATGYPSLFWYVQYPGEGLQL... |
| 1 | 6CUH.A | CYS | A | 90 | CYS | A | 23 | 22 | GNSVTQMEGPVTLSEEAFLTINCTYTATGYPSLFWYVQYPGEGLQL... |
| 2 | 6CUH.B | CYS | B | 146 | CYS | B | 211 | 210 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 3 | 6CUH.A | CYS | A | 136 | CYS | A | 186 | 185 | GNSVTQMEGPVTLSEEAFLTINCTYTATGYPSLFWYVQYPGEGLQL... |
| 4 | 6CUH.B | CYS | B | 91 | CYS | B | 23 | 22 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 5 | 6CUH.B | CYS | B | 23 | CYS | B | 91 | 90 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 6 | 6CUH.A | CYS | A | 186 | CYS | A | 136 | 135 | GNSVTQMEGPVTLSEEAFLTINCTYTATGYPSLFWYVQYPGEGLQL... |
| 7 | 6CUH.B | CYS | B | 211 | CYS | B | 146 | 145 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
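
Each disulfide bond appears twice in this output, once from each cysteine's point of view (for example 23 → 90 and 90 → 23 in chain A). If only the distinct bonds are needed, the two directions can be collapsed. The sketch below does this in plain Python on values copied from the output above; `bond_key` is a hypothetical helper, and the same idea could be applied to the pandas DataFrame directly.

```python
# (structure, queryChainId, queryGroupNumber, targetChainId, targetGroupNumber)
# copied from the intra-chain disulfide output above
rows = [
    ('6CUH', 'A', 23, 'A', 90),
    ('6CUH', 'A', 90, 'A', 23),
    ('6CUH', 'B', 146, 'B', 211),
    ('6CUH', 'A', 136, 'A', 186),
    ('6CUH', 'B', 91, 'B', 23),
    ('6CUH', 'B', 23, 'B', 91),
    ('6CUH', 'A', 186, 'A', 136),
    ('6CUH', 'B', 211, 'B', 146),
]

def bond_key(structure, q_chain, q_num, t_chain, t_num):
    """Hypothetical helper: order-independent key for a residue pair."""
    # Sorting the two endpoints makes X->Y and Y->X produce the same key
    ends = tuple(sorted([(q_chain, q_num), (t_chain, t_num)]))
    return (structure, ends)

unique_bonds = sorted({bond_key(*row) for row in rows})
for bond in unique_bonds:
    print(bond)
print(len(unique_bonds), "distinct bonds from", len(rows), "rows")
```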


## Report interactions at the group and atom level
By default, interactions are reported at the group (residue) level.

#### Find inter-chain salt bridges at the group level

In [8]:
interactions = InteractionExtractor.get_polymer_interactions(pdb, salt_bridge)
interactions.toPandas()

| | structureChainId | queryGroupId | queryChainId | queryGroupNumber | targetGroupId | targetChainId | targetGroupNumber | sequenceIndex | sequence |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6CUH.B | ASP | A | 140 | ARG | B | 196 | 195 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 1 | 6CUH.B | ASP | A | 119 | HIS | B | 138 | 137 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 2 | 1OHR.A | ASP | B | 29 | ARG | A | 8 | 7 | PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKM... |
| 3 | 1OHR.B | ASP | A | 29 | ARG | B | 8 | 7 | PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKM... |


#### Find inter-chain salt bridges at the atom level

Setting the parameter ```level='atom'``` or ```level='group'``` reports interactions at the atom or group level, respectively.

At the atom level, the **atom names** and **distances** of the interactions are reported.

In [9]:
interactions = InteractionExtractor.get_polymer_interactions(pdb, salt_bridge, level='atom')
interactions.toPandas()

| | structureChainId | queryGroupId | queryChainId | queryGroupNumber | queryAtomName | targetGroupId | targetChainId | targetGroupNumber | targetAtomName | distance | sequenceIndex | sequence |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6CUH.B | ASP | A | 140 | OD2 | ARG | B | 196 | NH1 | 2.963164 | 195 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 1 | 6CUH.B | ASP | A | 140 | OD1 | ARG | B | 196 | NH2 | 2.748617 | 195 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 2 | 6CUH.B | ASP | A | 119 | OD2 | HIS | B | 138 | NE2 | 2.739698 | 137 | NAGVTQTPKFRVLKTGQSMTLLCAQDMNHEYMYWYRQDPGMGLRLI... |
| 3 | 1OHR.A | ASP | B | 29 | OD2 | ARG | A | 8 | NH2 | 2.875426 | 7 | PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKM... |
| 4 | 1OHR.B | ASP | A | 29 | OD2 | ARG | B | 8 | NH2 | 2.774146 | 7 | PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKM... |
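
The atom-level output has five rows while the group-level output has four, because ASP 140 and ARG 196 in 6CUH are in contact through two atom pairs. The sketch below shows one way to relate the two views: it groups the atom-level rows by residue pair and keeps the closest contact for each. The values are copied from the output above; the grouping itself is an illustration in plain Python, not the library's own aggregation.

```python
# (structure, queryChainId, queryGroupNumber, queryAtomName,
#  targetChainId, targetGroupNumber, targetAtomName, distance)
# copied from the atom-level salt bridge output above
rows = [
    ('6CUH', 'A', 140, 'OD2', 'B', 196, 'NH1', 2.963164),
    ('6CUH', 'A', 140, 'OD1', 'B', 196, 'NH2', 2.748617),
    ('6CUH', 'A', 119, 'OD2', 'B', 138, 'NE2', 2.739698),
    ('1OHR', 'B', 29, 'OD2', 'A', 8, 'NH2', 2.875426),
    ('1OHR', 'A', 29, 'OD2', 'B', 8, 'NH2', 2.774146),
]

# Keep the shortest atom-atom contact for each residue pair
closest = {}
for structure, q_chain, q_num, q_atom, t_chain, t_num, t_atom, dist in rows:
    key = (structure, q_chain, q_num, t_chain, t_num)
    if key not in closest or dist < closest[key][2]:
        closest[key] = (q_atom, t_atom, dist)

for key, (q_atom, t_atom, dist) in closest.items():
    print(key, q_atom, t_atom, dist)
print(len(closest), "residue pairs from", len(rows), "atom-level rows")
```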


## Terminate Spark

In [10]:
spark.stop()