Expanding ihm.LPeptideAlphabet #55

@gtauriello

When handling references to a large set of sequences, we ran into a couple of non-standard amino acids which should preferably be added to ihm.LPeptideAlphabet.

Here is the class we currently use in ModelArchive:

import ihm


class _LPeptideAlphabetWithXO(ihm.LPeptideAlphabet):
    """Default amino acid alphabet plus 'X' for unknown residues and 'O'
    as an allowed non-default amino acid ('U' is already in the alphabet)."""

    # Extra entries added following the LPeptideAlphabet definition in
    # https://python-ihm.readthedocs.io/en/latest/_modules/ihm.html
    # and https://files.rcsb.org/view/1NTH.cif for the values for 'O'.

    def __init__(self):
        """Create the alphabet."""
        super().__init__()
        # 'X' reuses the parent's existing "UNK" component
        self._comps['X'] = self._comps["UNK"]
        self._comps['O'] = ihm.LPeptideChemComp(
            "PYL", "O", "O", "PYRROLYSINE", "C12 H21 N3 O3"
        )
        # B/ASX, Z/GLX defined in parent class
        # J not defined in CCD? (XLE used for something else)

The undefined 'J' (LEU/ILE AMBIGUOUS) will become an issue as soon as we remediate the ma-jd-viral model set. That set was produced before we added _struct_ref to python-modelcif.

That set contains 5 models which reference NCBI sequences containing 'J' (e.g. YP_009337833.1 for ma-jd-viral-28831) but where the model uses an 'L' instead. So to correctly fill the _struct_ref_seq_dif category for those models, we need a _struct_ref_seq_dif.db_mon_id value for 'J', but I could not find anything in the CCD that defines an ID for it.
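To make the problem concrete, here is a minimal sketch in plain Python (no python-ihm or python-modelcif calls) of how the mismatches would be collected. The sequences are made-up toy strings, `find_seq_difs` and `SeqDif` are hypothetical names, and "LOCAL_J" is only a placeholder for whatever ID ends up being used for 'J'; it is not a CCD identifier.

```python
from typing import Dict, List, NamedTuple

# One-letter to three-letter codes, limited to the residues used below.
# ASSUMPTION: "LOCAL_J" is a made-up placeholder, not a real CCD ID.
ONE_TO_THREE: Dict[str, str] = {
    "A": "ALA", "G": "GLY", "I": "ILE", "L": "LEU", "M": "MET",
    "J": "LOCAL_J",
}


class SeqDif(NamedTuple):
    """One mismatch between reference and model sequence (hypothetical)."""
    seq_num: int    # 1-based position within the aligned range
    db_mon_id: str  # residue in the reference database sequence
    mon_id: str     # residue in the model sequence


def find_seq_difs(db_seq: str, model_seq: str) -> List[SeqDif]:
    """Compare two pre-aligned, equal-length sequences position by position."""
    if len(db_seq) != len(model_seq):
        raise ValueError("sequences must be aligned and of equal length")
    difs = []
    for pos, (db_res, model_res) in enumerate(zip(db_seq, model_seq), start=1):
        if db_res != model_res:
            # A KeyError here means the alphabet lacks the residue,
            # which is exactly what happens for 'J' without a mapping.
            difs.append(
                SeqDif(pos, ONE_TO_THREE[db_res], ONE_TO_THREE[model_res])
            )
    return difs


# Toy example: the reference has 'J' where the model has 'L'.
difs = find_seq_difs("MAJGL", "MALGL")
for dif in difs:
    print(dif.seq_num, dif.db_mon_id, dif.mon_id)
```

Without the 'J' entry in the mapping, the lookup for the reference residue fails, so some ID has to be chosen before these models can be remediated.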

Any suggestions on how to handle the 'J' case? (@brindakv this may need your input)

Within ModelCIF, I can of course just define a local chem. comp. (i.e. by passing ccd="local" to ihm.LPeptideChemComp), but maybe there is something in the CCD which I just could not find.
