Example output files? #5

Closed
stianlagstad opened this issue Mar 12, 2020 · 4 comments

Comments

@stianlagstad

Hi,

I would like to know if the output data that Aeron produces can be used with https://github.com/stianlagstad/chimeraviz. Do you have any example output files that you can share?

Thank you!

@SchulzLab
Owner

Hi,
Currently, Aeron's output files are not prepared in a way that lets them be used directly with chimeraviz. As it appears to be a useful tool, we are looking into adding chimeraviz-compatible files in the future.

Thanks for the suggestion,
Marcel

@stianlagstad
Author

@SchulzLab : Thank you very much for the response. Do you have an example output file nonetheless, so that I can think about how to possibly implement support for it?

@maickrau
Collaborator

Hi,

Here's an example of the fusion output. The predicted fusion transcripts are in the file "fusion_transcript_..._.fa". Information about the fusion is included in the name:

>fusion_1_ENSG00000092010.14_1206bp_ENSG00000100908.13_1052bp_1reads

The format is fusion_{id}_{gene1}_{gene1 size}bp_{gene2}_{gene2 size}bp_{constructing reads}reads. id is a unique identifier per fusion, gene1 and gene2 are the Ensembl IDs of the two genes involved in the fusion, gene1/2 size is the approximate size in bp of the transcript on either side of the fusion breakpoint, and constructing reads is the number of reads used for building the predicted fusion transcript. The fusion sequence itself has an N at the fusion breakpoint location. The file does not contain the genomic positions of the two genes; these have to be retrieved from elsewhere using the Ensembl IDs.
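
For anyone writing an importer, here is a minimal Python sketch of how a header in this format could be parsed and how the sequence could be split at the breakpoint. It is not code from Aeron or chimeraviz. The names `FusionHeader`, `parse_fusion_header` and `split_at_breakpoint` are hypothetical, and the sketch assumes that the gene IDs contain no underscores and that the breakpoint N is the only N in the sequence.

```python
import re
from typing import NamedTuple, Tuple


class FusionHeader(NamedTuple):
    """Fields encoded in a fusion transcript name (hypothetical container)."""
    fusion_id: str
    gene1: str
    gene1_size_bp: int
    gene2: str
    gene2_size_bp: int
    constructing_reads: int


# Assumption: gene IDs contain no underscores (true for IDs like ENSG00000092010.14).
HEADER_RE = re.compile(
    r"^>?fusion_(?P<id>[^_]+)"
    r"_(?P<gene1>[^_]+)_(?P<size1>\d+)bp"
    r"_(?P<gene2>[^_]+)_(?P<size2>\d+)bp"
    r"_(?P<reads>\d+)reads$"
)


def parse_fusion_header(line: str) -> FusionHeader:
    """Parse a FASTA header line, with or without the leading '>'."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a fusion header: {line!r}")
    return FusionHeader(
        fusion_id=match.group("id"),
        gene1=match.group("gene1"),
        gene1_size_bp=int(match.group("size1")),
        gene2=match.group("gene2"),
        gene2_size_bp=int(match.group("size2")),
        constructing_reads=int(match.group("reads")),
    )


def split_at_breakpoint(sequence: str) -> Tuple[str, str]:
    """Split a fusion sequence at its breakpoint N.

    Assumption: exactly one N is present; anything else is rejected
    instead of guessing which N marks the breakpoint.
    """
    if sequence.count("N") != 1:
        raise ValueError("expected exactly one N marking the breakpoint")
    left, _, right = sequence.partition("N")
    return left, right
```

Applied to the example header above, `parse_fusion_header` would give gene1 `ENSG00000092010.14` with size 1206 and gene2 `ENSG00000100908.13` with size 1052. Rejecting sequences with more than one N is a deliberately conservative choice for a sketch.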

In addition to this, there is the file "fusion_support_..._.txt", which contains a table of fusion names and the number of reads that support each fusion. A read supports a fusion if its primary alignment spans the fusion breakpoint and 150 bp on both sides. This number can be, and usually is, different from the number of reads used for building the predicted fusion transcript.
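
One way to read the support criterion, sketched in Python. This is an interpretation, not Aeron's actual implementation. The function `supports_fusion` and its parameters are hypothetical, and the coordinate convention (0-based, half-open alignment interval on the fusion transcript, with `breakpoint` being the index of the N) is an assumption.

```python
def supports_fusion(
    aln_start: int,
    aln_end: int,
    breakpoint: int,
    is_primary: bool = True,
    flank: int = 150,
) -> bool:
    """Return True if an alignment covers the breakpoint plus `flank` bp on each side.

    Assumed convention: [aln_start, aln_end) is 0-based and half-open on the
    fusion transcript, and `breakpoint` is the index of the N.
    """
    if not is_primary:
        # Only primary alignments count towards support.
        return False
    covers_left = aln_start <= breakpoint - flank       # flank bases before the N
    covers_right = aln_end >= breakpoint + flank + 1    # flank bases after the N
    return covers_left and covers_right
```

Under these assumptions, an alignment over [1056, 1357) supports a fusion whose breakpoint is at index 1206, while one ending at 1356 falls one base short on the right side.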

Finally, there are bam files (not included in the example) of the read alignments to the reference transcripts + predicted fusion transcripts, one for all alignments and one filtered for alignments which support a fusion.

fusion_example.zip

@stianlagstad
Author

Thank you very much! Support for Aeron has been added to chimeraviz in this PR: stianlagstad/chimeraviz#81
