## Customizing PyPLIF-HIPPOS

Note: This is just an experiment and a learning journey for me (hands-on approach).

Problems identified when comparing the interactions recognized by HIPPOS with the output of BIOVIA Discovery Studio Visualizer (DSV) for Ct-AChBP complexes:
    - with Varenicline (PDB: 4AFG)
    - with (-)-Lobeline (PDB: 4AFH)
    - with Psychonicline (SW4; PDB: 4B5D)

Interaction shorthand:
    - van der Waals -> vdw
    - hydrophobic -> Hpb
    - hydrogen bond -> Hb
    - aromatic -> arom


### 4AFG Case
Using:
    ```
    residue_name 	LYS100 PHE102 GLN149 PHE150 GLY151 SER152 TRP153 VAL154 TYR194 CYS196 CYS197 GLU198 TYR201 VAL202 GLU203 ILE64 LYS86 GLN116 ILE118 PHE120 LEU126 ILE128 PHE175
    residue_number  99 101 148 149 150 151 152 153 193 195 196 197 200 201 202 63 85 115 117 119 125 127 174


    proteins    4afg/protein.mol2
    ligands     4afg/ligand_QMR1214_0.mol2

    outfile     ref-results.txt
    ```
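
A quick sanity check on the config above, as a minimal sketch: in this config every `residue_number` is one less than the number embedded in the matching `residue_name`, and both lists must have the same length. The helper name `check_residue_lists` is made up for this note and is not part of HIPPOS.

```python
import re

residue_name = (
    "LYS100 PHE102 GLN149 PHE150 GLY151 SER152 TRP153 VAL154 TYR194 CYS196 "
    "CYS197 GLU198 TYR201 VAL202 GLU203 ILE64 LYS86 GLN116 ILE118 PHE120 "
    "LEU126 ILE128 PHE175"
).split()
residue_number = [
    int(n)
    for n in (
        "99 101 148 149 150 151 152 153 193 195 196 197 200 201 202 "
        "63 85 115 117 119 125 127 174"
    ).split()
]


def check_residue_lists(names, numbers):
    """Return the (name, number) pairs that break the 'number = name index - 1' pattern."""
    if len(names) != len(numbers):
        raise ValueError(f"{len(names)} names but {len(numbers)} numbers")
    mismatches = []
    for name, number in zip(names, numbers):
        # Trailing digits of e.g. 'LYS100' -> 100
        index_in_name = int(re.search(r"(\d+)$", name).group(1))
        if index_in_name - number != 1:
            mismatches.append((name, number))
    return mismatches


mismatches = check_residue_lists(residue_name, residue_number)
print(len(residue_name), "residues,", len(mismatches), "mismatches")
```

An empty `mismatches` list only says the two lines are consistent with each other; it says nothing about which subunit a residue is taken from.
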
HIPPOS correctly identified:
    - (vdw, F102)
    - (Hb, W153) and (arom, W153)
    - (Hb, Y194)
    - (Hb, Y201) and (arom, W153)
Not identified by HIPPOS, although they appear in DSV:
    - (Hb, S152)
    - (Hpb, V154)
    - (Hpb, C196)
    - (Hpb, C197)
    - (vdw, I118)
    - (vdw, L126)
    - (Hpb, I128)
    - (Hpb, F175)
Potential solution:
    - Adjust the distance PARAMETERS according to what is found in DSV

### Adjusting based on DSV
Interaction distances taken from DSV, in Å (water-mediated interactions are ignored for now):

- (Hb, S152, 2.93)
- (Hb, W153, 1.69)
- (Pi-cat, Y201, 4.60)
- (Pi-sig, Y201, 3.69)
- (Pi-cat, W153, 4.72)
- (Pi-cat, W153, 4.84) # there are two of them
- (Pi-Pi, W153, 4.78)
- (Pi-sig, V154, 3.98)
- (Hpb, C196, 4.73)
- (Hpb, C197, 4.86)
- (Hpb, I128, 4.51)
- (Hpb, I128, 4.62)
- (Hpb, Y194, 4.47)
- (Hpb, F175, 4.99)

Comparing with HIPPOS default PARAMS:
HYDROPHOBIC = 4.5 -> changed to 5.0
AROMATIC = 4.0 -> changed to 5.0
HBOND = 3.5
ELECTROSTATIC = 4.0
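
The comparison between the DSV distances and the default cutoffs can be sketched as follows. This is only an illustration: the mapping from DSV labels to HIPPOS parameters (`TYPE_TO_PARAM`) is my assumption, Pi-cat and Pi-sig are skipped because their HIPPOS counterpart is not obvious, and `beyond_cutoff` is a made-up helper, not a HIPPOS function.

```python
# (type, residue, distance in Å) as read from DSV for 4AFG
dsv_contacts = [
    ("Hb", "S152", 2.93),
    ("Hb", "W153", 1.69),
    ("Pi-cat", "Y201", 4.60),
    ("Pi-sig", "Y201", 3.69),
    ("Pi-cat", "W153", 4.72),
    ("Pi-cat", "W153", 4.84),
    ("Pi-Pi", "W153", 4.78),
    ("Pi-sig", "V154", 3.98),
    ("Hpb", "C196", 4.73),
    ("Hpb", "C197", 4.86),
    ("Hpb", "I128", 4.51),
    ("Hpb", "I128", 4.62),
    ("Hpb", "Y194", 4.47),
    ("Hpb", "F175", 4.99),
]

# HIPPOS default cutoffs quoted above
DEFAULT_CUTOFFS = {
    "HYDROPHOBIC": 4.5,
    "AROMATIC": 4.0,
    "HBOND": 3.5,
    "ELECTROSTATIC": 4.0,
}

# Assumed mapping from DSV labels to HIPPOS parameters (not taken from HIPPOS)
TYPE_TO_PARAM = {"Hb": "HBOND", "Hpb": "HYDROPHOBIC", "Pi-Pi": "AROMATIC"}


def beyond_cutoff(contacts, cutoffs):
    """Return the contacts whose DSV distance exceeds the matching cutoff."""
    missed = []
    for kind, residue, distance in contacts:
        param = TYPE_TO_PARAM.get(kind)
        if param is not None and distance > cutoffs[param]:
            missed.append((kind, residue, distance))
    return missed


missed_default = beyond_cutoff(dsv_contacts, DEFAULT_CUTOFFS)
missed_adjusted = beyond_cutoff(dsv_contacts, {**DEFAULT_CUTOFFS, "HYDROPHOBIC": 5.0})
for contact in missed_default:
    print(contact)
```

With the default cutoffs this flags the Pi-Pi contact and five of the six Hpb contacts; only (Hpb, Y194, 4.47) is within 4.5. Raising HYDROPHOBIC to 5.0 brings all listed Hpb distances inside the cutoff.
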

Testing with the new parameters:
- Successfully recognized:
    - (Hpb, C197), but not (Hpb, C196)
- Still failed:
    - especially the Hpb interactions with the (-) side... I wonder why

Possible solution:
    - expand the typing of hydrophobic atoms in `ifp_processing`

```python
def assign_atoms(ligand, charge_assignment):
    hydrophobic = []
    h_donor = []
    h_accept = []
    positive = []
    negative = []
    h_donorh = []

    rings = getRing(ligand)

    for atom in ob.OBMolAtomIter(ligand):
        # Hydrophobic
        # Improved and comprehensive SMARTS pattern for hydrophobic atoms
        if atom.MatchesSMARTS("[C&!$(C=O)&!$(C#N),S&^3,F,Cl,Br,I,c,C@C,C@H]"):
            hydrophobic.append(atom.GetId())
        # H Donor & Hydrogen bonded to H Donor
        if atom.IsHbondDonor():
            h_donor.append(atom.GetId())
            h_donorh_list = []
            for hydrogen in ob.OBMolAtomIter(ligand):
                if atom.IsConnected(hydrogen):
                    h_donorh_list.append(hydrogen.GetId())
            h_donorh.append(h_donorh_list)
        # H Acceptor
        if atom.IsHbondAcceptor():
            h_accept.append(atom.GetId())
        # Electrostatic
        if charge_assignment == "plants":
            if atom.GetPartialCharge() > 0:
                positive.append(atom.GetId())
            if atom.GetPartialCharge() < 0:
                negative.append(atom.GetId())
        if charge_assignment == "vina":
            if (atom.GetAtomicNum() == 7) and (atom.GetPartialCharge() >= -0.235):
                positive.append(atom.GetId())
            if (atom.GetAtomicNum() == 8) and (atom.GetPartialCharge() <= -0.648):
                negative.append(atom.GetId())
        if charge_assignment == "DIRECT":
            # Direct charge assignment logic (if any)
            pass
```

With the above changes, HIPPOS correctly recognized:
    - (Hpb, V154)

Trying to expand further.

```python
def assign_atoms(ligand, charge_assignment):
    hydrophobic = []
    h_donor = []
    h_accept = []
    positive = []
    negative = []
    h_donorh = []

    rings = getRing(ligand)

    for atom in ob.OBMolAtomIter(ligand):
        # Hydrophobic
        if atom.MatchesSMARTS("[$([#6])&!$([#6X4H0]),$([#16])&!$([#16X3])&!$([#16X4]),F,Cl,Br,I;+0]"):
            hydrophobic.append(atom.GetId())
        # H Donor & Hydrogen bonded to H Donor
        if atom.IsHbondDonor():
            h_donor.append(atom.GetId())
            h_donorh_list = []
            for hydrogen in ob.OBMolAtomIter(ligand):
                if atom.IsConnected(hydrogen):
                    h_donorh_list.append(hydrogen.GetId())
            h_donorh.append(h_donorh_list)
        # H Acceptor
        if atom.IsHbondAcceptor():
            h_accept.append(atom.GetId())
        # Electrostatic
        if charge_assignment == "plants":
            if atom.GetPartialCharge() > 0:
                positive.append(atom.GetId())
            if atom.GetPartialCharge() < 0:
                negative.append(atom.GetId())
        if charge_assignment == "vina":
            if (atom.GetAtomicNum() == 7) and (atom.GetPartialCharge() >= -0.235):
                positive.append(atom.GetId())
            if (atom.GetAtomicNum() == 8) and (atom.GetPartialCharge() <= -0.648):
                negative.append(atom.GetId())
```
Nothing changed.

Trying another set of patterns:
```python
def assign_atoms(ligand, charge_assignment):
    hydrophobic = []
    h_donor = []
    h_accept = []
    positive = []
    negative = []
    h_donorh = []

    rings = getRing(ligand)

    for atom in ob.OBMolAtomIter(ligand):
        # Hydrophobic
        if atom.MatchesSMARTS("[$([#6])&!$([#6X4H0]),$([#16])&!$([#16X3])&!$([#16X4]),F,Cl,Br,I;+0]") or \
           atom.MatchesSMARTS("[$([CH2]),$([CH3])]") or \
           atom.MatchesSMARTS("[$([#6&X4])]") or \
           atom.MatchesSMARTS("[$([c]1[c][c][c][c][c]1)]"):
            hydrophobic.append(atom.GetId())
        # H Donor & Hydrogen bonded to H Donor
        if atom.IsHbondDonor():
            h_donor.append(atom.GetId())
            h_donorh_list = []
            for hydrogen in ob.OBMolAtomIter(ligand):
                if atom.IsConnected(hydrogen):
                    h_donorh_list.append(hydrogen.GetId())
            h_donorh.append(h_donorh_list)
        # H Acceptor
        if atom.IsHbondAcceptor():
            h_accept.append(atom.GetId())
        # Electrostatic
        if charge_assignment == "plants":
            if atom.GetPartialCharge() > 0:
                positive.append(atom.GetId())
            if atom.GetPartialCharge() < 0:
                negative.append(atom.GetId())
        if charge_assignment == "vina":
            if (atom.GetAtomicNum() == 7) and (atom.GetPartialCharge() >= -0.235):
                positive.append(atom.GetId())
            if (atom.GetAtomicNum() == 8) and (atom.GetPartialCharge() <= -0.648):
                negative.append(atom.GetId())
```

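Included:
    General hydrophobic atoms:
        Pattern: [$([#6])&!$([#6X4H0]),$([#16])&!$([#16X3])&!$([#16X4]),F,Cl,Br,I;+0]
            This pattern matches carbon atoms except those with four connections and no hydrogens (quaternary carbons), sulfur atoms except those with three or four connections, and the halogens. The trailing ;+0 restricts all of them to formal charge 0.

    Extended hydrophobic patterns:
        Aromatic carbons: [c] (already covered by [#6] in the general pattern)
        Aliphatic hydrocarbons: [$([CH2]),$([CH3])]
        Expanded aliphatic carbons: [$([#6&X4])]
        Aromatic systems: [$([c]1[c][c][c][c][c]1)] (e.g., benzene rings)

but nothing changed. That is not too surprising: the general pattern already matches every neutral carbon except the [#6X4H0] case, so the extra patterns mostly overlap with it.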


### Possible Problems

It seems that even after raising the HB distance cutoff to 10.0 Å (for testing), HIPPOS does not recognize the residues on the other subunit,

namely ILE64 LYS86 GLN116 ILE118 PHE120 LEU126 ILE128 PHE175.

Checking the configs now.

Found the problem: the distances reported for subunit B make no sense. This is a snippet of the output for GLU203, which is in subunit A:

    Initializing Residue: GLU203 202
    Hydrophobic: Ligand Atom 0 - Protein Atom 3213, Distance: 8.560090303261992
    Hydrophobic: Ligand Atom 0 - Protein Atom 3214, Distance: 9.139143340598178
    Hydrophobic: Ligand Atom 2 - Protein Atom 3213, Distance: 9.720135492882802
    Hydrophobic: Ligand Atom 2 - Protein Atom 3214, Distance: 10.238229778628726
    Hydrophobic: Ligand Atom 5 - Protein Atom 3213, Distance: 9.355004917155307

and here is ILE64, which is in subunit B:

    Initializing Residue: ILE64 63
    Hydrophobic: Ligand Atom 0 - Protein Atom 1004, Distance: 24.50904139292274
    Hydrophobic: Ligand Atom 0 - Protein Atom 1005, Distance: 25.833213388968865
    Hydrophobic: Ligand Atom 0 - Protein Atom 1006, Distance: 24.504754375426828
    Hydrophobic: Ligand Atom 0 - Protein Atom 1007, Distance: 26.007543867116713
    Hydrophobic: Ligand Atom 2 - Protein Atom 1004, Distance: 25.76651466535589

Next step: adding debug prints (with coordinates) to find the culprit.

This is for LYS100:

    Initializing Residue: LYS100 99
    Hydrophobic: Ligand Atom 0 (46.401, -4.107, 13.855) - Protein Atom 1610 (37.563, -6.513, 12.79), Distance: 9.221350497622353
    Hydrophobic: Ligand Atom 0 (46.401, -4.107, 13.855) - Protein Atom 1611 (37.233, -5.91, 14.172), Distance: 9.348985078606132
    Hydrophobic: Ligand Atom 0 (46.401, -4.107, 13.855) - Protein Atom 1612 (37.413, -4.388, 14.178), Distance: 8.998190595892051

and this is for ILE64:

    Initializing Residue: ILE64 63
    Hydrophobic: Ligand Atom 0 (46.401, -4.107, 13.855) - Protein Atom 1004 (28.266, -20.189, 10.224), Distance: 24.50904139292274
    Hydrophobic: Ligand Atom 0 (46.401, -4.107, 13.855) - Protein Atom 1005 (26.894, -20.503, 9.612), Distance: 25.833213388968865
    Hydrophobic: Ligand Atom 0 (46.401, -4.107, 13.855) - Protein Atom 1006 (28.188, -20.36, 11.708), Distance: 24.504754375426828
    Hydrophobic: Ligand Atom 0 (46.401, -4.107, 13.855) - Protein Atom 1007 (25.812, -19.511, 9.956), Distance: 26.007543867116713
    Hydrophobic: Ligand Atom 2 (46.896, -2.732, 13.701) - Protein Atom 1004 (28.266, -20.189, 10.224), Distance: 25.76651466535589
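
Recomputing two of the logged distances by hand, as a minimal sketch that uses only the coordinates printed above. If the values agree with the log, the distance arithmetic itself is fine and the problem is more likely in which protein atoms get picked for the subunit B residues (a guess to verify, not a conclusion).

```python
import math

# Coordinates copied from the debug output above
ligand_atom_0 = (46.401, -4.107, 13.855)
protein_atoms = {
    "LYS100 atom 1610": (37.563, -6.513, 12.79),
    "ILE64 atom 1004": (28.266, -20.189, 10.224),
}

# Plain Euclidean distance between the ligand atom and each protein atom
distances = {
    label: math.dist(ligand_atom_0, xyz) for label, xyz in protein_atoms.items()
}
for label, d in distances.items():
    print(f"{label}: {d:.3f}")
```

Both values reproduce the logged distances, so the coordinates being read for `ILE64 63` really are about 24.5 Å away from the ligand.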


Checking `parse_mol.py` and adding debug prints there.
