# Rosetta Filter API

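@Author: 吴炜坤 @email: weikun.wu@xtalpi.com

Further reference: https://new.rosettacommons.org/docs/latest/scripting_documentation/RosettaScripts/Filters/Filters-RosettaScripts

This chapter describes how to use some of the commonly used filters in PyRosetta and gives an example for each. Readers can look up individual filters here as the need arises.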

In [1]:
from pyrosetta.rosetta.protocols.rosetta_scripts import *
from pyrosetta import *
init()

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.26+release.b308454c455dd04f6824cc8b23e54bbb9be2cdd7 2021-07-02T13:01:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r288 2021.26+release.b308454c455 b308454c455dd04f6824cc8b23e54bbb9be2cdd7 http://www.pyrosetta.org 2021-07-02T13:01:54
[0mcore.init: {0} [0mcommand: PyRosetta -ex1 -ex2aro -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=566875228 seed_offset=0 real_seed=566875228 thread_index=0
[0mbasic.random.init_random_generator: {0} [0mRandomGenerator:init: Normal mode, seed=566875228 RG_t

### 1. SimpleMetricFilter (brief introduction)
A filter that decides whether to keep a conformation based on a value computed by a SimpleMetric.

In [2]:
from pyrosetta.rosetta.protocols.simple_filters import SimpleMetricFilter
from pyrosetta.rosetta.core.select.residue_selector import ResidueIndexSelector
from pyrosetta.rosetta.core.simple_metrics.metrics import SasaMetric
from pyrosetta.rosetta.protocols.simple_filters import comparison_type

# Load the pose
pose = pose_from_pdb('./data/1ubq_clean.pdb')
print(pose.pdb_info())

core.chemical.GlobalResidueTypeSet: {0} Finished initializing fa_standard residue type set.  Created 984 residue types
core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 0.640331 seconds.
core.import_pose.import_pose: {0} File './data/1ubq_clean.pdb' automatically determined to be of type PDB
PDB file name: ./data/1ubq_clean.pdb
 Pose Range  Chain    PDB Range  |   #Residues         #Atoms

0001 -- 0076    A 0001  -- 0076  |   0076 residues;    01234 atoms
                           TOTAL |   0076 residues;    01234 atoms



In [3]:
# Define the SimpleMetric calculator
sasa_sel = ResidueIndexSelector('1-76')  # e.g. compute the total SASA of residues 1-76
sasa_metrics = SasaMetric(sasa_sel)

In [4]:
# Define the SimpleMetricFilter
sasa_filter = SimpleMetricFilter()
sasa_filter.set_simple_metric(sasa_metrics)  # set the SimpleMetric to evaluate
sasa_filter.set_cutoff(500)  # set the cutoff value that the metric is compared against
sasa_filter.set_comparison_type(comparison_type.gt)  # gt means "greater than": the comparison logic of the filter
sasa_filter.apply(pose)

protocols.simple_filters.SimpleMetricFilter: {0} 4738.4 gt 500 ?
protocols.simple_filters.SimpleMetricFilter: {0} Filter passed: 1


True

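**Comment**: When working in Python there is really no need to set up a SimpleMetricFilter. You can test the value returned by the SimpleMetric directly and decide True or False yourself, which is very easy to do in Python.
This section only gives a simple example to show what SimpleMetricFilter basically does.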
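
Following the comment above, the same decision can be made in plain Python by comparing a metric value against a cutoff. This is only a sketch: `COMPARISONS` and `passes_cutoff` are hypothetical helper names, and the SASA value is copied from the filter log above rather than computed. With PyRosetta available, the value would presumably come from the metric itself (for example `sasa_metrics.calculate(pose)`).

```python
import operator

# Hypothetical lookup table; the keys are comparison names chosen for illustration.
COMPARISONS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "gt": operator.gt,
    "lt_or_eq": operator.le,
    "gt_or_eq": operator.ge,
}

def passes_cutoff(value, cutoff, comparison="gt"):
    """Return True when `value <comparison> cutoff` holds."""
    return COMPARISONS[comparison](value, cutoff)

# SASA value taken from the filter log above (4738.4), not recomputed here.
sasa_value = 4738.4
print(passes_cutoff(sasa_value, 500, "gt"))  # True
```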

### 2. Basic Filters
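Following the official Filter documentation, this part covers the use of ResidueCount and NetCharge.

#### 2.1 ResidueCount
A counting/frequency filter based on residue type, residue property, or packable state, with configurable thresholds. When several properties or types are set, they are combined with a logical OR.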
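
The OR logic can be pictured with a small sequence-based sketch. It does not call Rosetta: `POLAR` is an assumed set of one-letter codes chosen for illustration, `count_residues` is a hypothetical helper, and the sequence is the ubiquitin sequence of `1ubq_clean.pdb` as printed elsewhere in this document.

```python
# Assumed one-letter codes treated as polar in this sketch (not read from Rosetta).
POLAR = set("STNQDEKRH")

def count_residues(sequence, properties=(), types=()):
    """Count residues that match ANY of the given property sets or residue types."""
    allowed = set(types)
    for prop in properties:
        allowed |= set(prop)
    return sum(1 for aa in sequence if aa in allowed)

ubq = "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"

print(count_residues(ubq, properties=[POLAR]))               # 41
print(count_residues(ubq, types=["A"]))                      # 2
print(count_residues(ubq, properties=[POLAR], types=["A"]))  # 43 (polar OR alanine)
```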

In [5]:
from pyrosetta.rosetta.protocols.simple_filters import ResidueCountFilter

# Add the residue property to count
res_count_filter = ResidueCountFilter()
res_count_filter.add_residue_property_by_name('POLAR')
res_count_filter.score(pose)

41.0

In [6]:
from pyrosetta.rosetta.core.chemical import ChemicalManager

# Get the ResidueTypeSet
chm = ChemicalManager.get_instance()
residue_type_sets = chm.residue_type_set("fa_standard")

# Add the residue name to count (ALA).
res_count_filter = ResidueCountFilter()
res_count_filter.add_residue_type_by_name(residue_type_sets, 'ALA')
res_count_filter.score(pose)

2.0

#### 2.2 NetCharge
A filter based on the total charge of the protein sequence. NetCharge assigns a charge of +1 to LYS and ARG residues and -1 to the acidic residues ASP and GLU.
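
The charge rule above is simple enough to reproduce from the sequence alone. The sketch below is an illustration rather than the filter's implementation: `net_charge` and `passes_net_charge` are hypothetical helpers, the acceptance window is an assumption, and the sequence is the ubiquitin sequence of `1ubq_clean.pdb`.

```python
# Charge assignment as described above: LYS/ARG = +1, ASP/GLU = -1.
CHARGES = {"K": 1, "R": 1, "D": -1, "E": -1}

def net_charge(sequence):
    """Sum the per-residue charges over a one-letter sequence."""
    return sum(CHARGES.get(aa, 0) for aa in sequence)

def passes_net_charge(sequence, min_charge, max_charge):
    """Hypothetical criterion: the net charge must lie in [min_charge, max_charge]."""
    return min_charge <= net_charge(sequence) <= max_charge

ubq = "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"

print(net_charge(ubq))                 # 0 (11 positive and 11 negative residues)
print(passes_net_charge(ubq, -2, 2))   # True
```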

In [7]:
from pyrosetta.rosetta.protocols.simple_filters import NetChargeFilter
netcharge = NetChargeFilter()
netcharge.apply(pose)

[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  +1  LYS 6
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  +1  LYS 11
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  -1  GLU 16
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  -1  GLU 18
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  -1  ASP 21
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  -1  GLU 24
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  +1  LYS 27
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  +1  LYS 29
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  -1  ASP 32
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  +1  LYS 33
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  -1  GLU 34
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  -1  ASP 39
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  +1  ARG 42
[0mprotocols.simple_filters.NetChargeFilter: {0} [0mAA:  +1  LYS 48
[0mprotocols.simple_

True

### 3. Energy/Score Filters

#### 3.1 ScoreTypeFilter
A filter based on one specific score term. If no score term is specified, the total score is evaluated by default.

In [8]:
from pyrosetta.rosetta.protocols.score_filters import ScoreTypeFilter
from pyrosetta.rosetta.core.scoring import ScoreType
from pyrosetta import create_score_function

# Create the score function
ref2015 = create_score_function('ref2015')

st_filter = ScoreTypeFilter()
st_filter.set_scorefxn(ref2015)
st_filter.set_score_type(ScoreType.fa_atr)  # score the van der Waals attractive term; see the ScoreType enum for other terms
st_filter.set_threshold(-400)
st_filter.apply(pose)

[0mcore.scoring.etable: {0} [0mStarting energy table calculation
[0mcore.scoring.etable: {0} [0msmooth_etable: changing atr/rep split to bottom of energy well
[0mcore.scoring.etable: {0} [0msmooth_etable: spline smoothing lj etables (maxdis = 6)
[0mcore.scoring.etable: {0} [0msmooth_etable: spline smoothing solvation etables (max_dis = 6)
[0mcore.scoring.etable: {0} [0mFinished calculating energy tables.
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/hbonds/ref2015_params/AccStrength.csv
[0mbasic.i

False

#### 3.2 TaskAwareScoreType
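The main difference between the TaskAwareScoreType filter and ScoreTypeFilter is that it evaluates the energy only for the parts that the TaskOperations leave repackable.

mode: one of "total", "average", or "individual"

This filter can be used to screen the residues at an interface specifically. Combined with the individual mode, it can pick out abnormal residues or rotamers.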

In [9]:
from pyrosetta.rosetta.protocols.simple_filters import TaskAwareScoreTypeFilter
from pyrosetta.rosetta.core.scoring import ScoreType
from pyrosetta import create_score_function
from pyrosetta.rosetta.core.pack.task import TaskFactory
from pyrosetta.rosetta.core.pack.task.operation import PreventRepackingRLT
from pyrosetta.rosetta.core.pack.task.operation import OperateOnResidueSubset
from pyrosetta.rosetta.core.select.residue_selector import ResidueIndexSelector

# Select a range of residues
select_pos = ResidueIndexSelector('2,3,4,5,6,7,8,9,10,11,12,13')
# Build a TaskOperation with OperateOnResidueSubset.
# PreventRepackingRLT is applied to the selected residues (third argument False),
# so residues 2-13 are excluded from repacking.
packing_taskop = OperateOnResidueSubset(PreventRepackingRLT(), select_pos, False)

# Create the score function
ref2015 = create_score_function('ref2015')

# Create the TaskFactory
tf = TaskFactory()
tf.push_back(packing_taskop)

tast_filter = TaskAwareScoreTypeFilter()
tast_filter.bb_bb(True)  # include backbone-backbone energies
tast_filter.score_type(ScoreType.fa_atr)
tast_filter.scorefxn(ref2015)
tast_filter.task_factory(tf)
tast_filter.threshold(-1.0)
tast_filter.unbound(False)  # must be set to False explicitly
tast_filter.mode('individual')  # evaluate each residue individually
tast_filter.score(pose)

0.0

#### 3.3 BindingStrain
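A filter on the energetic strain of a monomer in its bound state. This filter detects symmetry automatically.

P.S.: Judging from the source code, this filter simply pulls the two rigid-body components apart and repacks them, then computes the energy in the bound state minus the energy in the unbound state.

The larger the absolute value of this difference, the further the energy of the bound conformation is from that of the repacked unbound state, i.e. the more strain binding imposes on the monomer.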
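
Under the reading given above, the reported number is a plain energy difference. The sketch below only illustrates that sign convention: the function names and the energy values are made up for illustration, and the pass criterion (strain at or below the threshold) is an assumption.

```python
def binding_strain(bound_energy, unbound_repacked_energy):
    """Energy of the bound state minus energy after separation and repacking."""
    return bound_energy - unbound_repacked_energy

def passes_strain(strain, threshold):
    """Assumed criterion: pass when the strain does not exceed the threshold."""
    return strain <= threshold

# Made-up energies in Rosetta energy units, for illustration only.
strain = binding_strain(bound_energy=-250.0, unbound_repacked_energy=-240.0)
print(strain)                      # -10.0
print(passes_strain(strain, 0.0))  # True
```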

In [10]:
from pyrosetta.rosetta.protocols.protein_interface_design.filters import BindingStrainFilter
from pyrosetta.rosetta.core.pack.task import TaskFactory
from pyrosetta.rosetta.core.pack.task.operation import OperateOnResidueSubset
from pyrosetta.rosetta.core.pack.task.operation import PreventRepackingRLT
from pyrosetta.rosetta.core.select.residue_selector import ChainSelector
init('-ex1 -ex2 -corrections::beta_nov16')
complex_pose = pose_from_pdb('./data/denovo_binder.pdb')

# Create the score function
beta_16 = create_score_function('beta_nov16')
receptor_chain = ChainSelector('A')

# Create the TaskFactory: the receptor chain is not repacked
no_repack_receptor_op = OperateOnResidueSubset(PreventRepackingRLT(), receptor_chain)
tf = TaskFactory()
tf.push_back(no_repack_receptor_op)

# Compute
bsf = BindingStrainFilter()
bsf.scorefxn(beta_16)
bsf.threshold(0)
bsf.jump(1)   # the jump that separates binder and receptor
bsf.task_factory(tf)
bsf.compute(complex_pose)

PyRosetta-4 2021 [Rosetta PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release 2021.26+release.b308454c455dd04f6824cc8b23e54bbb9be2cdd7 2021-07-02T13:01:54] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.cxx11thread.serialization.python37.Release r288 2021.26+release.b308454c455 b308454c455dd04f6824cc8b23e54bbb9be2cdd7 http://www.pyrosetta.org 2021-07-02T13:01:54
[0mcore.init: {0} [0mcommand: PyRosetta -ex1 -ex2 -corrections::beta_nov16 -database /opt/miniconda3/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=-1429836209 seed_offset=0 real_seed=-1429836209 thread_index=0
[0mbasic.random.init_random_generator: {0} [0mRandomGenerator:init: Normal

-9.942585665340005

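#### 3.4 ConstraintScore (buggy, has no effect)
A filter on the score terms computed from a set of constraints produced by ConstraintGenerators.

Notes:
1. The constraints produced by the generators must already have been added to the Pose with the AddConstraints mover.
2. The corresponding score term must be switched on.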


In [37]:
# Generate constraints with a ConstraintGenerator
from pyrosetta.rosetta.protocols.simple_moves import VirtualRootMover

# load pose from 1ubq_clean.pdb
pose = pose_from_pdb("./data/1ubq_clean.pdb")

# Score reweight
score = create_score_function('ref2015')
score.set_weight(ScoreType.atom_pair_constraint, 1.0) # reweight score

from pyrosetta.rosetta.protocols.constraint_generator import TerminiConstraintGenerator
termin_cst = TerminiConstraintGenerator()
termin_cst.set_min_distance(8)
termin_cst.set_max_distance(20)
termin_cst.set_sd(1.0)
termin_cst.set_id('test_nc')

# add TerminiConstraintGenerator to pose;
from pyrosetta.rosetta.protocols.constraint_generator import AddConstraints
add_cst = AddConstraints()
add_cst.add_generator(termin_cst)
add_cst.apply(pose)

core.import_pose.import_pose: {0} File './data/1ubq_clean.pdb' automatically determined to be of type PDB
protocols.constraint_generator.TerminiConstraintGenerator: {0} Constraining atoms  atomno= 2 rsd= 1  and  atomno= 2 rsd= 76 , min_distance=8 max_distance=20
protocols.constraint_generator.AddConstraints: {0} Adding 1 constraints generated by ConstraintGenerator named test_nc


In [40]:
from pyrosetta.rosetta.protocols.constraint_filters import ConstraintScoreFilter
from pyrosetta.rosetta.protocols.relax import FastRelax

# Disrupt the N-/C-terminal geometry (turn the chain into an extended peptide):
for i in range(1, pose.total_residue()+1):
    pose.set_phi(i, -150)
    pose.set_psi(i, 150)

# filter
cst_score_filter = ConstraintScoreFilter()
cst_score_filter.set_user_defined_name('test_nc')
cst_score_filter.apply(pose)

protocols.constraint_filters.ConstraintScoreFilter: {0}
------------------------------------------------------------
 Scores                       Weight   Raw Score Wghtd.Score
------------------------------------------------------------
 atom_pair_constraint         1.000       0.000       0.000
 coordinate_constraint        1.000       0.000       0.000
 angle_constraint             1.000       0.000       0.000
 dihedral_constraint          1.000       0.000       0.000
 res_type_constraint          1.000       0.000       0.000
 backbone_stub_constraint     1.000       0.000       0.000
---------------------------------------------------
 Total weighted score:                        0.000


False

#### 3.5 ScorePoseSegmentFromResidueSelectorFilter
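This filter scores, and filters on, the region defined by a user-specified ResidueSelector. For example, it can score a particular region or a single chain.

in_context option: controls whether the selected region is extracted into a separate Pose before scoring.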

In [13]:
from pyrosetta.rosetta.protocols.fold_from_loops.filters import ScorePoseSegmentFromResidueSelectorFilter
from pyrosetta.rosetta.core.select.residue_selector import ChainSelector

chain_A = ChainSelector('A')

score_from_selector_filter = ScorePoseSegmentFromResidueSelectorFilter()
score_from_selector_filter.residue_selector(chain_A)
score_from_selector_filter.in_context(True)
score_from_selector_filter.scorefxn(ref2015)
score_from_selector_filter.compute(pose)

core.scoring.ScoreFunctionFactory: {0} SCOREFUNCTION: beta_nov16.wts


-190.54896910443185

#### 3.6 ReadPoseExtraScoreFilter
Reads a score from the ExtraScore data stored in the Pose and optionally filters on it.

In [14]:
# set ExtraScore to pose:
from pyrosetta.rosetta.core.pose import setPoseExtraScore
setPoseExtraScore(pose, 'test_score', '100')

Next, the score is read back and used for filtering.

In [15]:
from pyrosetta.rosetta.protocols.simple_filters import ReadPoseExtraScoreFilter
extra_score_filter = ReadPoseExtraScoreFilter()
extra_score_filter.set_term_name('test_score') # the score term to filter on
extra_score_filter.set_threshold(300)  # returns false if the score is greater than this threshold
extra_score_filter.apply(pose)

True

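#### 3.7 Delta (entirely unnecessary!)
Computes the difference between the value of a filter on the current pose and its value on the input structure. Put simply, once a filter is specified, it compares the native pose with the current pose.

(Omitted.) In Python it is not difficult to compare the values of the native pose and the current pose directly.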
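
As a minimal sketch of that direct comparison: `delta` and `passes_delta` are hypothetical helpers, the "poses" are plain dictionaries standing in for real Pose objects, and the acceptance rule is an assumption. With PyRosetta, `metric` could be any callable that returns a number for a pose, for example a score function.

```python
def delta(metric, pose, reference_pose):
    """Difference of a metric between the current pose and a reference pose."""
    return metric(pose) - metric(reference_pose)

def passes_delta(metric, pose, reference_pose, upper=0.0):
    """Assumed criterion: the metric may rise by at most `upper` over the reference."""
    return delta(metric, pose, reference_pose) <= upper

# Stand-in "poses": plain dicts carrying a made-up score.
native = {"score": -190.5}
design = {"score": -180.0}

def get_score(pose):
    return pose["score"]

print(delta(get_score, design, native))         # 10.5
print(passes_delta(get_score, design, native))  # False
```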

### 4. Distance Filter

#### 4.1 ResidueDistance
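Computes the distance between two residues, measured between the neighbor atom of each residue (usually the Cβ atom). This filter accepts either PDB numbering or Pose numbering.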
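
The underlying check is a Euclidean distance compared against a threshold. The sketch below uses made-up coordinates and hypothetical helper names. In PyRosetta the neighbor-atom coordinates would presumably come from the residues themselves (for example `pose.residue(i).nbr_atom_xyz()`), and passing when the distance is at or below the threshold is an assumption here.

```python
import math

def distance(xyz1, xyz2):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xyz1, xyz2)))

def within_threshold(xyz1, xyz2, distance_threshold):
    """Assumed criterion: pass when the two points are no further apart than the threshold."""
    return distance(xyz1, xyz2) <= distance_threshold

# Made-up neighbor-atom coordinates in angstroms.
nbr_a = (0.0, 0.0, 0.0)
nbr_b = (3.0, 4.0, 12.0)

print(distance(nbr_a, nbr_b))                # 13.0
print(within_threshold(nbr_a, nbr_b, 10.0))  # False
```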

In [46]:
from pyrosetta.rosetta.protocols.simple_filters import ResidueDistanceFilter
res1 = '5'
res2 = '10'
two_res_dis = ResidueDistanceFilter(res1, res2, distance_threshold=10)
two_res_dis.apply(pose)

protocols.simple_filters.ResidueDistanceFilter: {0} Distance between residues 5 and 10 is 18.0233


False

#### 4.2 AtomicContact
Checks whether any atoms of two residues are in contact with each other within the cutoff distance.
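
Conceptually this is an any-pair test over the atoms of the two residues. The following sketch works on made-up coordinate lists and does not reproduce the filter's handling of side-chain, backbone, or proton atoms; `atomic_contact` is a hypothetical helper.

```python
import math
from itertools import product

def atomic_contact(atoms1, atoms2, cutoff):
    """True if at least one atom pair across the two residues lies within `cutoff`."""
    return any(
        math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) <= cutoff
        for p, q in product(atoms1, atoms2)
    )

# Made-up atom coordinates (angstroms) for two residues.
res1_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
res2_atoms = [(9.0, 0.0, 0.0), (4.5, 4.0, 0.0)]

print(atomic_contact(res1_atoms, res2_atoms, 4.0))  # False (closest pair is 5.0 apart)
print(atomic_contact(res1_atoms, res2_atoms, 5.0))  # True
```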

In [59]:
from pyrosetta.rosetta.protocols.simple_filters import AtomicContactFilter
is_atom_between_res = AtomicContactFilter(res1=1, res2=5, distance=10.0, sidechain=True, backbone=True, protons=False)
is_atom_between_res.apply(pose)

False

#### 4.3 AtomicContactCount(xmlobject)
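Counts the number of contacts between residues. This filter allows task operations to be set, in which case it only counts contacts between side-chain carbon atoms of packable residues.

This filter has three run modes:
1. "All" mode: counts contacts over all side-chain carbon atoms. (Suited to single-chain structures.)
2. "jump" mode: counts atomic contacts across the interface of a complex. (Suited to interaction interfaces.)
3. "chain" mode: counts atomic contacts between chains. (Suited to pairwise calculations between chains.)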

In [87]:
# "All" mode (partition="none")
from pyrosetta.rosetta.protocols import rosetta_scripts

xml = rosetta_scripts.XmlObjects.create_from_string('''
<FILTERS>
    <AtomicContactCount name="all_atomic_contact" partition="none" distance="4.5"/>
</FILTERS>
''')
all_atomic_contact_filter = xml.get_filter('all_atomic_contact')
all_atomic_contact_filter.compute(pose)

protocols.rosetta_scripts.RosettaScriptsParser: {0} Generating XML Schema for rosetta_scripts...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Initializing schema validator...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Validating input script...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Parsed script:
<ROSETTASCRIPTS>
	<FILTERS>
		<AtomicContactCount distance="4.5" name="all_atomic_contact" partition="none"/>
	</FILTERS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
core.scoring.ScoreFunctionFactory: {0} SCOREFUNCTION: beta_nov16.wts
protocols.rosetta_scripts.RosettaScriptsParser: {0} Defined filter named "all_atomic_contact" of type AtomicContactCount
protocols.rosetta_scripts.ParsedProtocol: {0} ParsedProtoco

28.0

In [90]:
# "jump" mode
xml = rosetta_scripts.XmlObjects.create_from_string('''
<FILTERS>
    <AtomicContactCount name="all_atomic_contact" partition="jump" distance="4.5" jump="1"/>
</FILTERS>
''')
all_atomic_contact_filter = xml.get_filter('all_atomic_contact')

# Load the complex structure
complex_pose = pose_from_pdb('./data/denovo_binder.pdb')
# all_atomic_contact_filter.compute(complex_pose)  # the output is verbose; run it yourself

protocols.rosetta_scripts.RosettaScriptsParser: {0} Generating XML Schema for rosetta_scripts...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Initializing schema validator...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Validating input script...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Parsed script:
<ROSETTASCRIPTS>
	<FILTERS>
		<AtomicContactCount distance="4.5" jump="1" name="all_atomic_contact" partition="jump"/>
	</FILTERS>
	<PROTOCOLS/>
</ROSETTASCRIPTS>
core.scoring.ScoreFunctionFactory: {0} SCOREFUNCTION: beta_nov16.wts
protocols.rosetta_scripts.RosettaScriptsParser: {0} Defined filter named "all_atomic_contact" of type AtomicContactCount
protocols.rosetta_scripts.ParsedProtocol: {0} Pars

In [89]:
# "chain" mode
xml = rosetta_scripts.XmlObjects.create_from_string('''
<FILTERS>
    <AtomicContactCount name="all_atomic_contact" partition="chain" distance="4.5"/>
</FILTERS>
''')
all_atomic_contact_filter = xml.get_filter('all_atomic_contact')

# Load the complex structure
complex_pose = pose_from_pdb('./data/denovo_binder.pdb')
# all_atomic_contact_filter.compute(complex_pose)  # the output is verbose; run it yourself

#### 4.4 AtomicDistance
Checks whether the distance between two specified atoms is within the cutoff distance.

In [71]:
from pyrosetta.rosetta.protocols.simple_filters import AtomicDistanceFilter

# Look up basic atom type information:
atom_NZ_index = pose.residue(11).atom_index("NZ")
atom_type = pose.residue(11).atom_type(atom_NZ_index)
print(atom_type)

atom_C_index = pose.residue(34).atom_index("C")
atom_type = pose.residue(34).atom_type(atom_C_index)
print(atom_type)

# atom_desig1, atom_desig2: atom designators, given here as atom types (Nlys, CObb)
# res1, res2: the residue numbers
# distance_filter = AtomicDistanceFilter(res1=11, res2=34, atom_desig1='NZ', atom_desig2='OE1')
distance_filter = AtomicDistanceFilter(11, 34, 'Nlys', 'CObb', True, True, 3.0)
print(distance_filter.score(pose))
print(distance_filter.apply(pose))

Atom Type: Nlys
	element: N
	Lennard Jones: radius=1.80245 wdepth=0.161725
	Lazaridis Karplus: lambda=3.5 volume=16.514 dgfree=-20.8646
	properties: DONOR 
Extra Parameters: 1.75 1.55 0.79 1.55 1.44 1.5 1.55 -20 -10.695 -1.145 -20 -0.62 0 0 0 1.85 8.52379 0.025 0.01 0.005 -289.292 -0.697267 -1933.88 -1.56243 -93.2613 93.2593 0.00202205 715.165 74.6559 -74.6539 0.00268963 -1282.36 0.633 -0.367 0.926 -0.537 0.633 -0.367

Atom Type: CObb
	element: C
	Lennard Jones: radius=1.91666 wdepth=0.141799
	Lazaridis Karplus: lambda=3.5 volume=13.221 dgfree=3.10425
	properties: 
Extra Parameters: 2.14 1.7 0.72 1.7 1.89 1.76 1.65 0 0 0 1 0.51 0 0 0 2 8.81363 0.025 0.01 0.005 147.227 -0.811304 -8117.41 -2.17625 -85.8924 85.8904 0.00196363 900.14 168.481 -168.287 0.00113765 -6725.43 0 0 0 0 0 0

76.1201477108032
False


#### 4.5 TerminusDistance(xmlobject)
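Passes if all residues at the protein-protein interface are far enough, counted in residues along the primary sequence, from the N- and C-termini. The purpose of this filter is to avoid having residues from the flexible N- or C-terminal ends at the interaction interface.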
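
The idea can be sketched on residue indices alone. Everything here is illustrative: `terminus_distance_ok` is a hypothetical helper, the interface residue list is made up, and treating a separation exactly equal to `min_distance` as passing is an assumption about the boundary case.

```python
def terminus_distance_ok(interface_residues, chain_length, min_distance):
    """True if every interface residue (1-based index) is at least
    `min_distance` residues away from both chain termini."""
    for res in interface_residues:
        to_n_term = res - 1
        to_c_term = chain_length - res
        if min(to_n_term, to_c_term) < min_distance:
            return False
    return True

# Made-up interface residues on a 76-residue chain.
print(terminus_distance_ok([10, 20, 45], chain_length=76, min_distance=5))  # True
print(terminus_distance_ok([3, 20, 45], chain_length=76, min_distance=5))   # False
print(terminus_distance_ok([20, 74], chain_length=76, min_distance=5))      # False
```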

In [77]:
from pyrosetta.rosetta.protocols import rosetta_scripts 

# Load the complex structure
complex_pose = pose_from_pdb('./data/denovo_binder.pdb')

xml = rosetta_scripts.XmlObjects.create_from_string('''
<FILTERS>
    <TerminusDistance name="nc_filter" jump_number="1" distance="5"/>
</FILTERS>
''')

terminus_distance_filter = xml.get_filter('nc_filter')
terminus_distance_filter.apply(complex_pose)

core.import_pose.import_pose: {0} File './data/denovo_binder.pdb' automatically determined to be of type PDB
core.conformation.Conformation: {0} Found disulfide between residues 148 186
core.conformation.Conformation: {0} current variant for 148 CYS
core.conformation.Conformation: {0} current variant for 186 CYS
core.conformation.Conformation: {0} current variant for 148 CYD
core.conformation.Conformation: {0} current variant for 186 CYD
protocols.rosetta_scripts.RosettaScriptsParser: {0} Generating XML Schema for rosetta_scripts...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Initializing schema validator...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta_scripts.RosettaScriptsParser: {0} Validating input script...
protocols.rosetta_scripts.RosettaScriptsParser: {0} ...done
protocols.rosetta

True

### 5. Sequence analysis

#### 5.1 LongestContinuousPolarSegment
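A filter that finds the longest continuous stretch of polar residues in the primary sequence of the Pose.

Options:
- exclude_chain_termini: false means that polar stretches extending to the N- or C-terminus are counted; true means they are not (default true, so only internal polar stretches are counted)
- count_gly_as_polar: true means that Gly is treated as a polar residue (default true)
- filter_out_high: true means that poses whose longest polar stretch is above the cutoff are rejected; false means that those below the cutoff are rejected (default true)
- cutoff: threshold on the length of the longest polar stretch, default 5
- residue_selector: a residue selector, which must be defined beforehand (optional)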
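
How `exclude_chain_termini` and `count_gly_as_polar` interact can be seen in a small sequence-only sketch. It is not the filter's implementation: `POLAR` is an assumed set of one-letter codes, `longest_polar_segment` is a hypothetical helper, and "excluding termini" is read here as ignoring any polar run that touches the first or last residue.

```python
# Assumed one-letter codes treated as polar in this sketch (not read from Rosetta).
POLAR = set("STNQDEKRH")

def longest_polar_segment(sequence, exclude_chain_termini=True, count_gly_as_polar=True):
    """Length of the longest run of consecutive polar residues."""
    polar = set(POLAR)
    if count_gly_as_polar:
        polar.add("G")
    best = 0
    start = None  # index where the current polar run began
    # The trailing sentinel closes a run that reaches the C-terminus.
    for i, aa in enumerate(sequence + "#"):
        if aa in polar:
            if start is None:
                start = i
            continue
        if start is not None:
            touches_terminus = start == 0 or i == len(sequence)
            if not (exclude_chain_termini and touches_terminus):
                best = max(best, i - start)
            start = None
    return best

print(longest_polar_segment("KKKAKKA"))                               # 2
print(longest_polar_segment("KKKAKKA", exclude_chain_termini=False))  # 3
print(longest_polar_segment("AAGKA", count_gly_as_polar=False))       # 1
```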

In [99]:
from pyrosetta.rosetta.protocols.simple_filters import LongestContinuousPolarSegmentFilter

# Load the structure
pose = pose_from_pdb('./data/1ubq_clean.pdb')

lps = LongestContinuousPolarSegmentFilter()
lps.set_exclude_chain_termini(True)
# lps.residue_selector() # use when needed
# lps.filter_out_high(False)  # use when needed
lps.set_count_gly_as_polar(False)
lps.set_cutoff(10)
lps.score(pose)

core.import_pose.import_pose: {0} File './data/1ubq_clean.pdb' automatically determined to be of type PDB


5.0

#### 5.2 LongestContinuousApolarSegment
A filter that finds the longest continuous stretch of apolar residues in the primary sequence of the Pose.

In [106]:
from pyrosetta.rosetta.protocols.simple_filters import LongestContinuousApolarSegmentFilter

# Load the structure
pose = pose_from_pdb('./data/1ubq_clean.pdb')

lps = LongestContinuousApolarSegmentFilter()
lps.set_exclude_chain_termini(True)
# lps.residue_selector() # use when needed
# lps.filter_out_high(False)  # use when needed
lps.set_count_gly_as_polar(False)
lps.set_cutoff(10)
lps.score(pose)

core.import_pose.import_pose: {0} File './data/1ubq_clean.pdb' automatically determined to be of type PDB


5.0

#### 5.3 SequenceDistanceFilter
Computes the Hamming distance between two sequences.

https://zh.wikipedia.org/wiki/%E6%B1%89%E6%98%8E%E8%B7%9D%E7%A6%BB
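
For reference, the Hamming distance of two equal-length sequences is the number of positions at which they differ. A minimal pure-Python sketch, independent of the Rosetta filter (`hamming_distance` is a hypothetical helper; the sequence is that of `1ubq_clean.pdb`):

```python
def hamming_distance(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

ubq = "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
# str.replace returns a new string, so the result must be assigned.
mutant = ubq.replace("Q", "C")

print(hamming_distance(ubq, ubq))     # 0
print(hamming_distance(ubq, mutant))  # 6 (one mismatch per Q-to-C substitution)
```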

In [113]:
from pyrosetta.rosetta.protocols.simple_filters import SequenceDistance

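# Mutated sequence
# Note: str.replace returns a new string and does not modify mut_seq in place,
# so mut_seq is still the unmodified sequence here (see the printed output).
# Assign the result (mut_seq = mut_seq.replace('Q', 'C')) to apply the Q-to-C substitution.
mut_seq = pose.sequence()
mut_seq.replace('Q','C')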
print(mut_seq)

seq_dis_filter = SequenceDistance()
seq_dis_filter.target_seq(mut_seq)
seq_dis_filter.threshold(10)
seq_dis_filter.score(pose)

MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG


73.0

### 6. Alignment analysis

#### 6.1 AlignmentAAFinder

#### 6.2 AlignmentGapInserter

### 7. Geometry

#### 7.1 AngleToVectorFilter

#### 7.2 TorsionFilter

#### 7.3 HelixPairing

#### 7.4 SSMotifFinder

#### 7.5 SecondaryStructure

#### 7.6 SecondaryStructureCount

#### 7.7 SecondaryStructureHasResidue

#### 7.8 HelixKink

#### 7.9 Geometry

#### 7.10 HSSTriplet

#### 7.11 PreProlineFilter

#### 7.12 LoopAnalyzerFilter

### 8. Packing/Connectivity

#### 8.1 AverageDegree

#### 8.2 PackStat

#### 8.3 Holes

#### 8.4 InterfaceHoles

#### 8.5 NeighborType

#### 8.6 ResInInterface

#### 8.7 ShapeComplementarity

#### 8.8 SSShapeComplementarity

#### 8.9 SpecificResiduesNearInterface

### 9. Burial

#### 9.1 TotalSasa

#### 9.2 Sasa

#### 9.3 ResidueBurial

#### 9.4 BuriedSurfaceArea

#### 9.5 ExposedHydrophobics

### 10. Comparison

#### 10.1 Rmsd

#### 10.2 SidechainRmsd

#### 10.3 IRmsd

#### 10.4 RmsdFromResidueSelectorFilter

#### 10.5 SequenceRecovery

### 11. Bonding

#### 11.1 ChainBreak

#### 11.2 HbondsToResidue

#### 11.3 SimpleHbondsToAtom

#### 11.4 HbondsToAtom

#### 11.5 PeptideInternalHbondsFilter

#### 11.6 BuriedUnsatHbonds

#### 11.7 BuriedUnsatHbonds2

#### 11.8 OversaturatedHbondAcceptorFilter

#### 11.9 DisulfideFilter

#### 11.10 AveragePathLength

#### 11.11 DisulfideEntropy