Output fasta for TPM counting #242

Closed
ericmalekos opened this issue Jun 12, 2024 · 4 comments

Comments

@ericmalekos

Thank you for the great tool.

I am wondering if there is any way to generate a FASTA file from the gene fusion output. I would like to quantify TPMs with Salmon/Kallisto, which require a FASTA version of the transcriptome.
Context: my idea is to prepend the gene fusion FASTA to a GENCODE transcript FASTA and quantify expression against the combined file. One challenge I anticipate is that, depending on the fusion genes, there may be many isoform combinations to consider.

@suhrig
Owner

suhrig commented Jun 12, 2024

If you run Arriba with the parameter -I, it gives you the full fusion transcript in the fusion_transcript column (with some exceptions where this is not possible). After removing special characters (i.e., anything other than A C T G a c t g) you can convert this to a FastA file, which should be suitable for quantification.

Note that Arriba only reports the sequence for one transcript per gene. It can't give you all the possible combinations of transcripts. It picks the one which best matches the splice pattern of the supporting reads.

@ericmalekos
Author

That's great, thank you!

@ericmalekos
Author

How would you deal with the ellipses? Would you replace them with something, or just remove them?

@ericmalekos ericmalekos reopened this Jul 5, 2024
@suhrig
Owner

suhrig commented Jul 6, 2024

An ellipsis means that there is not enough information to replace it with something else. You'll have to remove them.
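
For readers who want to script the conversion discussed above, here is a minimal sketch, not an official Arriba utility. It assumes the fusions file is tab-separated with a header line starting with `#gene1` and containing `gene2` and `fusion_transcript` columns, and that a missing transcript is written as `.`. The function names and the FASTA record naming scheme are hypothetical. Everything other than A, C, G, T (upper or lower case) is dropped, which also removes ellipses.

```python
import csv
import io
import re

# Anything that is not a nucleotide letter (breakpoint markers, ellipses,
# brackets, underscores, ...) is removed, as suggested in the thread.
NON_BASES = re.compile(r"[^ACGTacgt]")


def clean_sequence(raw):
    """Strip every character other than A, C, G, T (either case)."""
    return NON_BASES.sub("", raw)


def fusions_to_fasta(tsv_text):
    """Turn the text of a fusions table into FASTA-formatted text.

    Assumes a tab-separated table whose header starts with '#gene1' and
    includes 'gene2' and 'fusion_transcript' columns (an assumption
    about the file layout).
    """
    # Drop the leading '#' so the first column is named 'gene1'.
    reader = csv.DictReader(io.StringIO(tsv_text.lstrip("#")), delimiter="\t")
    records = []
    for i, row in enumerate(reader, start=1):
        seq = clean_sequence(row.get("fusion_transcript") or "")
        if not seq:
            # No usable sequence was reported for this fusion.
            continue
        # Hypothetical naming scheme; the row index keeps names unique.
        name = f"fusion{i}_{row['gene1']}--{row['gene2']}"
        records.append(f">{name}\n{seq}")
    return "\n".join(records) + ("\n" if records else "")
```

The returned text can be written to a file and concatenated with a transcriptome FASTA before building a Salmon or Kallisto index. Because only one transcript per gene is reported, the result covers one isoform combination per fusion.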

@suhrig suhrig closed this as completed Aug 23, 2024