Adding transcriptome in a TALON-independent way? #26

Open
lscdanson opened this issue Feb 1, 2024 · 3 comments

Comments

@lscdanson

Hi, your tutorial page mentions that users can use other tools that yield transcriptomes as input to Swan. However, in the step "Adding transcript models from a GTF", the GTF file ("all_talon_observedOnly.gtf") appears to have been generated by TALON. Could you suggest a way to do the same without using TALON? Many thanks!

@fairliereese
Member

fairliereese commented Feb 1, 2024

Hi! GTF is the common format that you can use as input. The file does not have to be generated by TALON!

@lscdanson
Author

lscdanson commented Feb 1, 2024

Thanks for your prompt reply. Sorry if I've misunderstood, as I'm new to this. To my knowledge, a GTF file usually refers to the annotation GTF, which has already been supplied in the previous step via sg.add_annotation(annot_gtf). I downloaded mine from Ensembl (https://ftp.ensembl.org/pub/release-111/gtf/homo_sapiens/). May I ask what this transcriptome GTF is and where I can obtain it? I don't see any other GTF-formatted file among my pipeline outputs. Thanks so much!

@fairliereese
Member

Ah, I see. If you are not doing transcript discovery in your data processing, you don't need to add another transcriptome GTF file, as all of the transcripts that you're detecting are already in the Ensembl GTF you've mentioned. This is what I'd expect from a short-read RNA-seq data processing pipeline; am I correct in assuming that is what you have? If you are using long-read RNA-seq data, what are you using to process it?
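
For readers unsure whether their pipeline did any transcript discovery, one rough way to check is to compare the transcript IDs in a pipeline-produced GTF against those in the annotation GTF. The sketch below is not part of Swan or TALON. It uses only the Python standard library, the helper names (transcript_ids, novel_transcripts) are hypothetical, and the GTF lines and IDs are made-up toy data. It assumes the standard GTF layout, where the ninth tab-separated column holds attributes such as transcript_id "...".

```python
import re

# Matches the transcript_id attribute in the 9th (attributes) column of a GTF line.
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(gtf_lines):
    """Collect the set of transcript IDs from an iterable of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header / comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = _TRANSCRIPT_ID.search(fields[8])
        if match:  # gene-level records carry no transcript_id
            ids.add(match.group(1))
    return ids


def novel_transcripts(annotation_lines, transcriptome_lines):
    """Return transcript IDs present in the transcriptome but not in the annotation."""
    return transcript_ids(transcriptome_lines) - transcript_ids(annotation_lines)


# Toy data (hypothetical IDs); real input would be lines read from GTF files.
annotation = [
    "#!genome-build GRCh38\n",
    'chr1\tensembl\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n',
    'chr1\tensembl\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tensembl\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]
transcriptome = [
    'chr1\ttool\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\ttool\ttranscript\t100\t700\t.\t+\t.\tgene_id "G1"; transcript_id "NOVEL1";\n',
]

novel = novel_transcripts(annotation, transcriptome)
print(sorted(novel))
```

With real files, the same helpers can take open file handles, for example transcript_ids(open(path)) for an uncompressed GTF. An empty result would suggest that every detected transcript is already in the annotation, which is the case described above where no additional transcriptome GTF is needed.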

@fairliereese fairliereese reopened this Feb 1, 2024