translational recoding for UGA for Selenocysteine, UAG for Pyrrolysine, etc #1164

Open
peterjc opened this issue Apr 17, 2017 · 3 comments

@peterjc
Member

peterjc commented Apr 17, 2017

Raised by @idoerg on #1103: how can the existing Biopython translation function/method support translational recoding? For example, UGA for selenocysteine (https://en.wikipedia.org/wiki/Selenocysteine) and UAG for pyrrolysine (https://en.wikipedia.org/wiki/Pyrrolysine).

@peterjc
Member Author

peterjc commented Apr 17, 2017

In GenBank files this can be annotated with the /transl_except=... qualifier.

Quoting ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt

/transl_except	Translational exception: single codon, the translation
		of which does not conform to the reference genetic code

Quoting https://www.ddbj.nig.ac.jp/FT/full_index.html

Qualifier       /transl_except=
Definition      translational exception: single codon the translation of which
                does not conform to genetic code defined by /organism or 
                /transl_table.
Value format    (pos:location,aa:<amino_acid>) where amino_acid is the
                amino acid coded by the codon at the base_range position
Example         /transl_except=(pos:213..215,aa:Trp)
                /transl_except=(pos:1017,aa:TERM)
                /transl_except=(pos:2000..2001,aa:TERM)
                /transl_except=(pos:X22222:15..17,aa:Ala)
Comment         if the amino acid is not on the restricted vocabulary list use
                e.g., '/transl_except=(pos:213..215,aa:OTHER)' with
                '/note="name of unusual amino acid"';
                for modified amino-acid selenocysteine use three letter code
                'Sec'  (one letter code 'U' in amino-acid sequence)
                /transl_except=(pos:1002..1004,aa:Sec);
                for partial termination codons where TAA stop codon is
                completed by the addition of 3' A residues to the mRNA
                either a single base_position or a base_range is used, e.g.
                if partial stop codon is a single base:
                /transl_except=(pos:1017,aa:TERM)
                if partial stop codon consists of two bases:
                /transl_except=(pos:2000..2001,aa:TERM) with
                '/note='stop codon completed by the addition of 3' A residues 
                to the mRNA'.
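
The value format quoted above is regular enough to pull apart with a small parser. The following is only a sketch: `parse_transl_except` and `simple_range` are hypothetical helper names, not Biopython functions, and the range helper assumes a plain `N` or `N..M` location (no accession prefix, `complement(...)` or `join(...)`).

```python
import re

# Hypothetical helpers for illustration; not part of Biopython.
_TRANSL_EXCEPT = re.compile(r"^\(pos:(?P<loc>.+),aa:(?P<aa>[A-Za-z]+)\)$")


def parse_transl_except(value):
    """Split '(pos:213..215,aa:Trp)' into ('213..215', 'Trp')."""
    match = _TRANSL_EXCEPT.match(value.strip())
    if match is None:
        raise ValueError(f"Unrecognised /transl_except value: {value!r}")
    return match.group("loc"), match.group("aa")


def simple_range(loc):
    """Convert 'N' or 'N..M' (1-based, inclusive) to 0-based half-open (start, end).

    Assumption: only plain positions are handled; anything else raises ValueError.
    """
    match = re.fullmatch(r"(\d+)(?:\.\.(\d+))?", loc)
    if match is None:
        raise ValueError(f"Unsupported location: {loc!r}")
    start = int(match.group(1))
    end = int(match.group(2) or match.group(1))
    return start - 1, end


loc, aa = parse_transl_except("(pos:1002..1004,aa:Sec)")
print(loc, aa, simple_range(loc))
```

Note that the single-base and two-base `TERM` examples parse fine here, but they describe partial stop codons, so a translation routine would need to treat them separately from whole-codon exceptions.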

@peterjc
Member Author

peterjc commented May 10, 2017

Slightly related: #1224 raises some trickier translation corner cases, where a codon can be either an amino acid or a stop codon, and this is documented at the genetic code table level.

@Artoria2e5

Artoria2e5 commented Dec 21, 2023

My terrible, terrible idea is that we do both transl_except and table expansion. The former would be used for cases when the recoding is definitively known, the latter for messing about on hypothetical translations.

For transl_except, we need a new parameter on _translate_str encoding a simplified version of the qualifier, something like [{"pos": [213, 215], "aa": "O"}, {"pos": [216, 218], "aa": "*"}].
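
To make the shape of that parameter concrete, here is a rough stand-alone sketch. It does not touch Biopython's actual `_translate_str`; the function name `translate_with_exceptions`, the `exceptions` argument and the dict-based standard table are all assumptions for illustration. Positions are taken as 1-based and inclusive, as in GenBank, and only whole in-frame codons are accepted.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product("TCAG", repeat=3), _AMINO)}


def translate_with_exceptions(seq, exceptions=(), table=STANDARD):
    """Translate seq, overriding individual codons.

    exceptions: iterable of {"pos": [start, end], "aa": letter} with 1-based,
    inclusive coordinates. Hypothetical interface, not a Biopython API.
    """
    seq = seq.upper().replace("U", "T")
    overrides = {}
    for exc in exceptions:
        start, end = exc["pos"]
        # Only whole, in-frame codons; partial stop codons are out of scope here.
        if (start - 1) % 3 != 0 or end - start != 2:
            raise ValueError(f"Exception {exc!r} is not a whole in-frame codon")
        overrides[start - 1] = exc["aa"]

    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        if i in overrides:
            protein.append(overrides[i])
        else:
            # Unknown or ambiguous codons become "X" in this sketch.
            protein.append(table.get(seq[i:i + 3], "X"))
    return "".join(protein)


print(translate_with_exceptions("ATGTGATAGTAA"))
print(translate_with_exceptions(
    "ATGTGATAGTAA",
    [{"pos": [4, 6], "aa": "U"}, {"pos": [7, 9], "aa": "O"}],
))
```

Keying the overrides by codon start keeps the main loop a single dictionary lookup per codon, which should be easy to bolt onto an existing translation loop.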

For table, we pass a brand-new CodonTable object with the modification, which already seems to be supported. Just copying fields from an existing table and changing them seems easy enough, at least for the common case of recoding a stop codon without ambiguity.
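
The copy-and-modify idea might look roughly like the sketch below. To keep it self-contained it uses a tiny stand-in class, `MiniTable`, rather than Biopython's `CodonTable`; the field names `forward_table` and `stop_codons` are chosen to echo Biopython's attributes, but `MiniTable`, `standard_table`, `recode_stop` and `translate` are hypothetical names, and whether the real class can be copied this simply is an assumption.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class MiniTable:
    """Stand-in for a codon table: sense codons plus a set of stop codons."""
    forward_table: dict
    stop_codons: tuple


def standard_table():
    """Build the standard genetic code (NCBI table 1)."""
    aminos = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    forward, stops = {}, []
    for codon, aa in zip(product("TCAG", repeat=3), aminos):
        codon = "".join(codon)
        if aa == "*":
            stops.append(codon)
        else:
            forward[codon] = aa
    return MiniTable(forward, tuple(stops))


def recode_stop(table, codon, aa):
    """Return a new table in which the stop codon `codon` encodes `aa`."""
    if codon not in table.stop_codons:
        raise ValueError(f"{codon} is not a stop codon in this table")
    forward = dict(table.forward_table)  # copy, so the original is untouched
    forward[codon] = aa
    stops = tuple(c for c in table.stop_codons if c != codon)
    return MiniTable(forward, stops)


def translate(seq, table):
    seq = seq.upper().replace("U", "T")
    out = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        out.append("*" if codon in table.stop_codons
                   else table.forward_table.get(codon, "X"))
    return "".join(out)


std = standard_table()
sec = recode_stop(std, "TGA", "U")  # every UGA read as selenocysteine
print(translate("ATGTGATAA", std), translate("ATGTGATAA", sec))
```

Unlike a per-position exception, this recodes every occurrence of the codon, which fits the "hypothetical translation" use but not a gene where only one UGA is read as selenocysteine.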
