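RMSD calculation  
RMSD (root-mean-square deviation) measures, from a structural point of view, how much two protein molecules differ.  
In antibody prediction, a computationally predicted antibody is usually compared with the native antibody; in general, the closer the prediction is to the native structure (the lower the RMSD), the better the result.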
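
For reference, given N matched atoms at positions p_i in one structure and q_i in the other, RMSD = sqrt((1/N) * sum_i |p_i - q_i|^2). The block below is a minimal sketch of that formula in plain Python. It assumes the two coordinate lists are already superimposed and list the atoms in the same order; the name `rmsd_plain` is illustrative and does not come from any library.

```python
import math


def rmsd_plain(P, Q):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and that atom i in P
    corresponds to atom i in Q.
    """
    if len(P) != len(Q) or not P:
        raise ValueError("P and Q must be non-empty and of equal length")
    total = 0.0
    for (px, py, pz), (qx, qy, qz) in zip(P, Q):
        # Squared Euclidean distance between the matched atoms
        total += (px - qx) ** 2 + (py - qy) ** 2 + (pz - qz) ** 2
    return math.sqrt(total / len(P))


# Two atoms, each displaced by exactly 1 unit along z
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
true = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd_plain(pred, true))  # 1.0
```

Without a prior superposition step this value also includes any rigid-body offset between the two structures, so it is only meaningful for pre-aligned coordinates.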

In [None]:
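# main() is the entry point of the calculate_rmsd command-line tool.
# It reads its arguments (the two structure files) from the command line,
# so this is meant to be run as a script, with the two files as arguments.
from rmsd.calculate_rmsd import main

if __name__ == "__main__":
    result = main()
    print(result)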

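An existing library, rmsd, already implements this calculation. Its source runs to more than 2,000 lines, and the result can be obtained directly from the command line by supplying the PDB files of the two proteins.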

In [None]:
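# Shell commands: install the package, then compare two structures
pip install rmsd
calculate_rmsd tests/ethane.pdb tests/ethane_translate.pdb
# Ignore hydrogen atoms and print the aligned structure
calculate_rmsd --no-hydrogen --print tests/ethane.xyz tests/ethane_mini.xyz
# Reorder atoms so that the two structures match before comparing
calculate_rmsd --reorder tests/water_16.xyz tests/water_16_idx.xyz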

A brief description of the required PDB format (the ATOM record): https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM
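
ATOM records use fixed columns: atom name in columns 13-16, chain ID in column 22, residue number in columns 23-26, and x, y, z in columns 31-38, 39-46 and 47-54. The sketch below reads coordinates by slicing those columns. The function names `read_atom_coords` and `make_atom_line` and the sample records are hypothetical and serve only as illustration; a real pipeline would more likely use an established PDB parser.

```python
def read_atom_coords(pdb_text, atom_name="CA", chain=None, res_range=None):
    """Collect (x, y, z) tuples from the ATOM records of a PDB-format string.

    atom_name : atom to keep (default: C-alpha)
    chain     : optional chain ID filter
    res_range : optional inclusive (start, end) residue-number filter
    """
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != atom_name:      # columns 13-16
            continue
        if chain is not None and line[21] != chain:  # column 22
            continue
        res_seq = int(line[22:26])                # columns 23-26
        if res_range is not None and not (res_range[0] <= res_seq <= res_range[1]):
            continue
        # Columns 31-38, 39-46, 47-54 hold x, y, z
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format a minimal ATOM record (columns 1-54 only) for the demo."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


# Hypothetical sample records, not taken from a real structure
pdb_text = "\n".join([
    make_atom_line(1, "N", "GLY", "H", 94, 1.0, 2.0, 3.0),
    make_atom_line(2, "CA", "GLY", "H", 94, 1.5, 2.5, 3.5),
    make_atom_line(3, "CA", "ALA", "H", 95, 4.0, 5.0, 6.0),
    make_atom_line(4, "CA", "SER", "L", 95, 7.0, 8.0, 9.0),
])

print(read_atom_coords(pdb_text, chain="H", res_range=(95, 102)))  # [(4.0, 5.0, 6.0)]
```

The residue-range filter is one way to restrict the comparison to a loop region. Residues that differ only by insertion code (column 27) share a residue number, so they pass or fail the filter together.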

In [None]:
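import shlex
import subprocess

import torch


class SmartRMSD:
    def __init__(self, device="cuda"):
        self.device = device

    def from_coords(self, pred_coords, true_coords):
        """Compute RMSD directly from coordinates (must be pre-aligned)."""
        return self._basic_rmsd(pred_coords, true_coords)

    def from_pdb(self, pred_pdb, true_pdb, options=""):
        """Compute RMSD from PDB files (uses the rmsd library's CLI)."""
        # Pass an argument list instead of a shell string so file names
        # with spaces or shell metacharacters are handled safely.
        cmd = ["calculate_rmsd", *shlex.split(options), pred_pdb, true_pdb]
        result = subprocess.check_output(cmd)
        return float(result.decode().strip())

    def differentiable(self, pred_coords, true_coords):
        """Differentiable RMSD (for training)."""
        aligned = self._kabsch_align(pred_coords, true_coords)
        return self._basic_rmsd(aligned, true_coords)

    def _basic_rmsd(self, P, Q):
        # Squared distance per atom, then the root of the mean over atoms
        sq_diff = torch.sum((P - Q) ** 2, dim=-1)
        return torch.sqrt(torch.mean(sq_diff))

    def _kabsch_align(self, P, Q):
        # Differentiable Kabsch implementation (as shown earlier)
        ...

    def cdr_rmsd(self, pred_pdb, true_pdb, cdr_name="H3"):
        """CDR-specific RMSD."""
        # Residue ranges depend on the antibody numbering scheme in use.
        cdr_ranges = {"H1": (26, 35), "H2": (50, 58), "H3": (95, 102)}
        start, end = cdr_ranges[cdr_name]
        # NOTE: check `calculate_rmsd --help` to confirm that a `--select`
        # option exists in the installed version before relying on it.
        return self.from_pdb(pred_pdb, true_pdb, f"--select='resid {start}-{end}'")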

With this approach, the coordinates from the PDB files can be converted to tensors and the RMSD computed with PyTorch.
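
To illustrate the alignment step that precedes the RMSD, here is a minimal sketch of Kabsch superposition followed by RMSD. It uses NumPy rather than PyTorch so that it runs without a GPU stack, and it is not the author's `_kabsch_align`; the name `kabsch_rmsd` and the test coordinates are assumptions made for this example.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between P and Q (N x 3) after optimal rigid superposition of P onto Q."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both point sets on their centroids
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Cross-covariance matrix and its singular value decomposition
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation (det = +1)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate the centered P (row vectors) and measure the remaining deviation
    P_aligned = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_aligned - Qc) ** 2, axis=1))))


# Q is P rotated 90 degrees about the z axis and shifted, so the
# superposed RMSD should be numerically close to zero.
P = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([5.0, -2.0, 1.0])
print(kabsch_rmsd(P, Q))
```

The same sequence of operations (centering, SVD, determinant correction, rotation) is available in PyTorch, which is what makes a differentiable variant possible, though gradients through the SVD can be unstable when singular values are nearly equal.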