processing of repeats that cross exon boundaries #19

Open
maglott opened this issue Dec 9, 2022 · 1 comment
Labels
enhancement New feature or request question Further information is requested

maglott commented Dec 9, 2022

I submitted NM_001042492.3:c.7317AGC[1], and the following chromosome descriptions were returned on GRCh38:
NC_000017.11(NM_001042492.3):c.7320_7322del
NC_000017.11:g.31349250_31350183del
Is the latter representation expected?
It seems more likely that only 3 nucleotides were deleted from the genome, resulting in the loss of one repeat unit. Why is the projection across the intron returned?
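
For scale, the size of the returned genomic interval can be checked with a few lines of Python. This is only arithmetic on the coordinates quoted above (1-based, inclusive); the variable names are made up for illustration.

```python
# Genomic interval from NC_000017.11:g.31349250_31350183del (1-based, inclusive).
g_start, g_end = 31349250, 31350183
repeat_unit = "AGC"

# Number of genomic bases covered by the returned deletion.
deleted_on_genome = g_end - g_start + 1

# Bases in the interval beyond the single repeat unit that was meant to be removed.
extra_bases = deleted_on_genome - len(repeat_unit)

print(deleted_on_genome)  # 934
print(extra_bases)        # 931
```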

@jfjlaros
Member

This is an interesting one.

The RefSeq transcript contains a tri-nucleotide repeat with two repeat units. The last nucleotide of the second repeat unit resides in a different exon than the rest of the repeat.

Because a RefSeq transcript is used, the application of the 3' rule results in NM_001042492.3:c.7320_7322del. When this description is mapped to a genomic build, the deletion spans an intron.
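
The 3' shift on the spliced transcript can be sketched on a toy sequence. This is a minimal illustration, not Mutalyzer's implementation: `shift_3prime` and the 12-nt `transcript` string are hypothetical, and the toy positions (4 to 9) only stand in for c.7317 to c.7322.

```python
def shift_3prime(seq, start, end):
    """Shift the deletion seq[start:end] (0-based, half-open) as far 3' as possible."""
    # The window may slide one base 3' whenever the base entering it
    # equals the base leaving it; the resulting sequence is unchanged.
    while end < len(seq) and seq[start] == seq[end]:
        start += 1
        end += 1
    return start, end


# Toy spliced transcript (hypothetical): two AGC units at 1-based positions 4-9.
transcript = "GGTAGCAGCTTA"

# Delete the first unit (1-based 4_6) and let the 3' rule move it.
start, end = shift_3prime(transcript, 3, 6)
print(f"{start + 1}_{end}del")  # 7_9del, i.e. the second repeat unit
```

On the spliced sequence the deletion moves a full repeat unit to the right, which is the toy counterpart of c.7320_7322del.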

If, on the other hand, a "genomic transcript" had been used (e.g., GRCh38(NM_001042492.3):c.7317AGC[1]), the reference sequence would not contain a repeat (there is now an intron in between), and the description would therefore be normalised to NC_000017.11(NM_001042492.3):c.=. The desired deletion could have been described as GRCh38(NM_001042492.3):c.7317_7319del, which is normalised to NC_000017.11(NM_001042492.3):c.7319_7321del.
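
The same toy setup shows why the genomic reference behaves differently. Everything here is assumed for illustration: the exon and intron strings, `count_units`, and `shift_3prime` are hypothetical and not taken from Mutalyzer or from the real NC_000017.11 sequence.

```python
def count_units(seq, start, unit):
    """Count consecutive copies of `unit` in `seq`, starting at 0-based `start`."""
    n = 0
    while seq.startswith(unit, start + n * len(unit)):
        n += 1
    return n


def shift_3prime(seq, start, end):
    """Shift the deletion seq[start:end] (0-based, half-open) as far 3' as possible."""
    while end < len(seq) and seq[start] == seq[end]:
        start += 1
        end += 1
    return start, end


# Toy gene (hypothetical): the second AGC unit is split by an intron.
exon1, intron, exon2 = "GGTAGCAG", "GTAAAG", "CTTA"
transcript = exon1 + exon2        # spliced reference
genomic = exon1 + intron + exon2  # genomic reference

print(count_units(transcript, 3, "AGC"))  # 2: AGC[1] removes one unit
print(count_units(genomic, 3, "AGC"))     # 1: AGC[1] is no change

# Deleting the first unit on the genomic reference shifts by 2, not 3.
start, end = shift_3prime(genomic, 3, 6)
print(f"{start + 1}_{end}del")            # 6_8del
```

The shift stops at the last exonic base because the first intron base differs, which mirrors c.7317_7319del being normalised to c.7319_7321del.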

So, this is expected behaviour. However, this example shows that a mapping from a RefSeq transcript to a "genomic transcript" can sometimes be done in multiple ways. I think it would be good if Mutalyzer could at least detect these situations and report on them.
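
A rough sketch of what such a detection step might look like, under simplifying assumptions: exons are given as 1-based inclusive genomic intervals on the forward strand, and transcript positions count from the start of the first exon. The function names and the exon model are hypothetical and do not reflect Mutalyzer's internals.

```python
def exon_index(exons, pos):
    """Return the index of the exon containing 1-based transcript position `pos`."""
    offset = 0
    for i, (g_start, g_end) in enumerate(exons):
        length = g_end - g_start + 1
        if pos <= offset + length:
            return i
        offset += length
    raise ValueError(f"position {pos} lies outside the transcript")


def crosses_exon_boundary(exons, start, end):
    """True if the transcript range start..end (1-based, inclusive) spans two or more exons."""
    return exon_index(exons, start) != exon_index(exons, end)


# Toy exon model (hypothetical): an 8-nt exon, a 6-nt intron, a 4-nt exon.
exons = [(1, 8), (15, 18)]

print(crosses_exon_boundary(exons, 4, 6))  # False: first repeat unit, one exon
print(crosses_exon_boundary(exons, 7, 9))  # True: 3'-shifted unit spans the intron
```

A normaliser could use a check like this to attach a warning whenever a variant described on a spliced transcript maps to a genomic range that includes intronic sequence.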

@jfjlaros jfjlaros added question Further information is requested enhancement New feature or request labels Dec 10, 2022