This repository has been archived by the owner on May 3, 2024. It is now read-only.

How to utilize SQANTI2 classification result for genome optimization? #60

Open
leosfan opened this issue Mar 25, 2020 · 1 comment

Comments


leosfan commented Mar 25, 2020

Hello @Magdoll ,

Thanks a lot for keeping up the good work!

I wonder if I can use the diff_to_gene_TSS and diff_to_gene_TTS output from SQANTI2 to find out which isoforms are longer than their corresponding transcripts in the reference genome. That would allow me to extend, and hence "optimize", the reference genome, since the isoforms are full-length.

However, there is no information about the start and end genomic coordinates of the exact transcript the query isoform is compared to.

For example, the definition for diff_to_gene_TSS is: "distance of query isoform 5' start to the closest start end of any transcripts of the matching gene." How can I know the genomic coordinates of that "closest start end"?

Similarly, the definition for diff_to_gene_TTS is: "distance of query isoform 3' end to the closest end of any transcripts of the matching gene." How can I know the genomic coordinates of that "closest end"?

Thanks again for your help!
Best,
Leo

Owner

Magdoll commented Apr 24, 2020

Hi @leosfan ,

That is a fair point. Right now, SQANTI2 does not record which reference transcript of the gene produced the "diff_to_gene_TSS" and "diff_to_gene_TTS" values. You can only assume it is one of the reference transcripts from the gene indicated by "associated_gene".

-Liz
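
Since SQANTI2 does not report the transcript itself, one possible workaround is to recompute the closest ends from the reference GTF. The following is only a minimal sketch, not SQANTI2's own logic: the function names `transcript_ends` and `closest_ends` are hypothetical, it assumes the value in "associated_gene" matches the `gene_id` attribute in the GTF, and it assumes you already have the query isoform's 5' and 3' genomic coordinates (for example from its own GTF entry). It reports absolute distances only and makes no assumption about the sign convention SQANTI2 uses.

```python
import re


def transcript_ends(gtf_lines, gene_id):
    """Collect (tss, tts, strand) per reference transcript of one gene.

    Coordinates are the 1-based positions found in the GTF exon records.
    Assumes gene_id matches the GTF gene_id attribute (an assumption).
    """
    spans = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        if attrs.get("gene_id") != gene_id or "transcript_id" not in attrs:
            continue
        start, end, strand = int(fields[3]), int(fields[4]), fields[6]
        tid = attrs["transcript_id"]
        lo, hi, _ = spans.get(tid, (start, end, strand))
        spans[tid] = (min(lo, start), max(hi, end), strand)
    # On the minus strand the 5' end (TSS) is the larger coordinate.
    return {
        tid: ((lo, hi, s) if s == "+" else (hi, lo, s))
        for tid, (lo, hi, s) in spans.items()
    }


def closest_ends(ends, query_tss, query_tts):
    """Return the reference transcripts whose TSS / TTS lie nearest the query."""
    tss_tid = min(ends, key=lambda t: abs(ends[t][0] - query_tss))
    tts_tid = min(ends, key=lambda t: abs(ends[t][1] - query_tts))
    return {
        "closest_tss": (tss_tid, ends[tss_tid][0]),
        "closest_tts": (tts_tid, ends[tts_tid][1]),
    }
```

To use it, pass the lines of the reference GTF and the gene from "associated_gene" to `transcript_ends`, then pass the result and the query isoform's 5' and 3' coordinates to `closest_ends`. The returned transcript IDs and coordinates are candidates only; they may differ from whatever transcript SQANTI2 used internally.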
