Draw_fusions.R content extending beyond page #56
Comments
This may look like a display error at first glance, but it's really not. It's an error in the GENCODE annotation. Let me explain:
In this case, GENCODE annotates exactly one transcript with a breakpoint at position 9:133729454, namely In the future, the issue you are seeing may be alleviated by two improvements that I am currently thinking about implementing in the transcript selection of
Hi Sebastian,
Based on your suggestion, I have created a Python and Bash script that cleans up the GENCODE data. It removes any transcript that does not carry the standard set of features a protein-coding transcript should have. I wanted to share it in case anyone else faces this problem.
Archive.zip
I think it would also be helpful to let users give Arriba a preferred-transcript input file. The idea is that for a clinical use case one may want to pick a different known transcript for a given gene than in a discovery setting. Since you plan to make transcript selection seamless between Arriba and draw_fusions.R, the preferred transcript would be selected by Arriba and then passed on to draw_fusions.R as well. Let me know what you think.
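For readers who cannot open the attachment, here is a minimal sketch of what such a cleanup could look like. It is not the attached script: the set of required feature types (`REQUIRED_FEATURES`) and the helper names are assumptions chosen for illustration, and the actual criteria in Archive.zip may differ. The sketch keeps a transcript only if its GTF records include every assumed feature type.

```python
import re
from collections import defaultdict

# ASSUMPTION: feature types a complete protein-coding transcript should carry.
# The criteria used by the attached script are not shown in this thread.
REQUIRED_FEATURES = {"transcript", "exon", "CDS", "start_codon", "stop_codon"}

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_and_feature(line):
    """Return (transcript_id, feature_type), or (None, None) if the line has neither."""
    fields = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(fields) < 9:
        return None, None
    match = _TRANSCRIPT_ID.search(fields[8])
    if match is None:
        return None, None  # e.g. gene records, which carry no transcript_id
    return match.group(1), fields[2]


def filter_gtf(lines):
    """Drop all records of transcripts that lack any of REQUIRED_FEATURES."""
    lines = list(lines)

    # First pass: collect the feature types seen for each transcript.
    features = defaultdict(set)
    for line in lines:
        tid, feature = transcript_and_feature(line)
        if tid is not None:
            features[tid].add(feature)
    complete = {tid for tid, seen in features.items() if REQUIRED_FEATURES <= seen}

    # Second pass: keep header/gene lines and records of complete transcripts.
    # Genes whose transcripts were all removed are left in place by this sketch.
    kept = []
    for line in lines:
        tid, _ = transcript_and_feature(line)
        if tid is None or tid in complete:
            kept.append(line)
    return kept
```

A call such as `filter_gtf(open("annotation.gtf"))` (file name hypothetical) returns the retained lines, which can then be written to a new GTF file.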
Thanks for the script! I will have a look at it. Currently, I have implemented two options:
I am thinking about a third option (pick the canonical transcript). It would be helpful for me to see what criteria you use to select the transcript. Thanks!
I can only speak for myself here, but I am really looking forward to #2. It really brings out the WYSIWYG behavior. I wonder whether you have thought about a canonical transcript option for Arriba itself. That way the behavior stays consistent, and one does not see a disconnect between the tool output and the plot. Just a thought.
Closing this one, too, as explained in issue #51.
Hi Sebastian,
I am attaching a screenshot of an example plot that extends beyond the page. ABL1 is known to have 12 exons, but I guess the plot needs to resize if the first gene (here BCR, with 23 exons) covers everything on the page. Do you know how this can be adjusted?
Here are 2 example rows for this fusion:
BCR ABL1 +/+ +/+ 22:23632600 9:133729451 splice-site splice-site translocation downstream upstream 27 27 7 423 1256 high . . duplicates(21) CATCCAGAGAGAGAAG___AGGGCGAACAAGGGCAGCAAGGCTACGGAGAGGCTGAAGAAGAAGCTGTCGGAGCAGGAGTCACTGCTGCTGCTTATGTCTCCCAGCATGGCCTTCAGGGTGCACAGCCGCAACGGCAAG___AGTTACACGTTCCTGATCTCCTCTGACTATGAGCGTGCAGAGTGGAGGGAGAACATCCGGGAGCAGCAGAAGAAGT___GTTTCAGAAGCTTCTCCCTGACATCCGTGGAGCTGCAGATGCTGACCAACTCGTGTGTGAAACTCCAGACTGTCCACAGCATTCCGCTGACCATCAATAAGGAAG___ATGATGAGTCTCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTCAA|AAGCCCTTCAGCGGCCAGTAGCATCTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGAACTCCAAGGAAAACCTTCTCGCTGGACCCAGTGAAAATGACCCCAACCTTTTCGTTGCACTGTATGATTTTGTGGCCAGTGGAGATAACACTCTAAGCATAACTAAAG___GTGAAAAGCTCCGGGTCTTAGGCTATAATCACAATGGGGAATGGTGTGAAGCCCAAACCAAAAATGGCCAAGGCTGGGTCCCAAGCAACTACATCACGCCAGTCAACAGTCTGGAGAAACACTCCTG in-frame IQREKRANKGSKATERLKKKLSEQESLLLLMSPSMAFRVHSRNGKSYTFLISSDYERAEWRENIREQQKKCFRSFSLTSVELQMLTNSCVKLQTVHSIPLTINKEDDESPGLYGFLNVIVHSATGFKQSS|kALQRPVASDFEPQGLSEAARWNSKENLLAGPSENDPNLFVALYDFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHS .
BCR ABL1 +/+ +/+ 22:23632601 9:133729454 splice-site splice-site translocation downstream upstream 1 0 7 368 1256 medium . . duplicates(4) CATCCGGGAGCAGCAGAAGAAGT___GTTTCAGAAGCTTCTCCCTGACATCCGTGGAGCTGCAGATGCTGACCAACTCGTGTGTGAAACTCCAGACTGTCCACAGCATTCCGCTGACCATCAATAAGGAAG___ATGATGAGTCTCCGGGGCTCTATGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGCAGAGTTCAAG|CagccactgGatttaagcagagTGACTTTGAGCCTCAGGGTCTGAGTGAAGCCGCTCGTTGGAACTCCAAGGAAAACCTTCTCGCTGGACCCAGTGAAAATGACCCCAACCTTTTCGTTGCACTGTATGATTTTGTGGCCAGTGGAGATAACACTCTAAGCATAACTAAAG___GTGAAAAGCTCCGGGTCTTAGGCTATAATCACAATGGGGAATGGTGTGAAGCCCAAACCAAAAATGGCCAAGGCTGGGTCCCAAGCAACTACATCACGCCAGTCAACAGTCTGGA out-of-frame IREQQKKCFRSFSLTSVELQMLTNSCVKLQTVHSIPLTINKEDDESPGLYGFLNVIVHSATGFKQSSs|shwi* .
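To make the difference between the two rows easier to see, the short sketch below compares their breakpoints. It assumes, based only on the rows shown above, that the fifth and sixth whitespace-separated fields hold the two breakpoints as `contig:position`; the function names are hypothetical, and the rows are truncated to their first fields for brevity.

```python
def parse_breakpoint(value):
    """Split a 'contig:position' string into (contig, int position)."""
    contig, position = value.split(":")
    return contig, int(position)


def breakpoint_offsets(row_a, row_b):
    """Return the position difference (row_b - row_a) for each of the two breakpoints.

    ASSUMPTION: fields 5 and 6 (indices 4 and 5) are the breakpoints,
    as in the two rows quoted in this issue.
    """
    fields_a, fields_b = row_a.split(), row_b.split()
    offsets = []
    for column in (4, 5):
        contig_a, pos_a = parse_breakpoint(fields_a[column])
        contig_b, pos_b = parse_breakpoint(fields_b[column])
        # Offsets are only meaningful on the same contig.
        offsets.append(pos_b - pos_a if contig_a == contig_b else None)
    return tuple(offsets)


row1 = "BCR ABL1 +/+ +/+ 22:23632600 9:133729451 splice-site splice-site translocation"
row2 = "BCR ABL1 +/+ +/+ 22:23632601 9:133729454 splice-site splice-site translocation"
print(breakpoint_offsets(row1, row2))  # prints (1, 3)
```

The ABL1 breakpoint of the second row, 9:133729454, lies 3 bp downstream of the one in the first row.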