Add transcript fastas to wiki #34
Comments
For P.evermanni, I'm pretty sure I have a functioning script to extract all the CDS lines from the gff, get fastas for each, and concatenate and label by parent! The only downside is it's quite slow (I don't know enough bash tricks to make it any more efficient) -- it's processed ~15% of the gff in the last hour, so it'll be running for the rest of the day. I'll let y'all know when it's done.
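For readers following along, here is a minimal Python sketch of the general idea described above: collect CDS features from a GFF, group them by their `Parent` attribute, and concatenate the corresponding genome slices into one sequence per parent. It is not the script used in this thread (that one is bash-based); the function names are hypothetical, and it assumes GFF3-style `key=value` attributes, a single `Parent` per CDS line, and a genome already loaded into a dict of scaffold name to sequence.

```python
from collections import defaultdict

# Translation table for complementing DNA (N is left as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_cds_transcripts(genome, gff_lines):
    """Concatenate CDS segments per parent.

    genome:    dict mapping scaffold name -> sequence string
    gff_lines: iterable of GFF3 text lines
    Returns a dict mapping parent ID -> concatenated CDS sequence.
    """
    segments = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "CDS":
            continue  # keep only well-formed CDS features
        seqid, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent is None:
            continue  # assumption: CDS lines without a Parent are ignored
        segments[parent].append((seqid, start, end, strand))

    transcripts = {}
    for parent, segs in segments.items():
        segs.sort(key=lambda s: s[1])  # order segments by genomic start
        # GFF is 1-based and inclusive, so [start - 1:end] is the matching Python slice.
        seq = "".join(genome[seqid][start - 1:end] for seqid, start, end, _ in segs)
        if segs[0][3] == "-":
            # Minus-strand features: reverse-complement the joined sequence.
            seq = reverse_complement(seq)
        transcripts[parent] = seq
    return transcripts
```

Because this makes a single pass over the GFF and slices an in-memory genome, it avoids launching an external tool once per feature, which is a common reason line-by-line shell loops are slow.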
Okie dokie, the rendered code for generating a transcriptome fasta (and running kallisto) for P.evermanni, and the transcriptome fasta itself are both pushed to the deep-dive repo!
Nice work! Impressive. I think your [...] The HTML version got rendered, though: [...] I only glanced through the code, but figured I should ask this. Did you take into account that GFF files are 1-based (i.e. [...])? Admittedly, for alignment-free gene expression analysis, this is likely not an issue?
Hmm, I was just assuming that since the bedtools getfasta doc lists gff files as one of the accepted inputs, it would distinguish bed and gff files and process them appropriately -- it sounds like that's not the case?
Looks like you're not following the "golden rule" of bioinformatics... 😉 I'd definitely like to make the same assumption, but...
Haha, good point -- it should be straightforward to add a gff -> bed conversion before generating the transcriptome, it'll just take a while to rerun.
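As a rough illustration of the gff -> bed conversion discussed above: GFF coordinates are 1-based and inclusive, while BED coordinates are 0-based and half-open, so the start is decremented by one and the end is left unchanged. The sketch below is a hypothetical helper, not the conversion used in this thread (bedops is mentioned later for that). It assumes tab-separated, 9-column GFF3 lines and uses the `Parent` attribute (falling back to `ID`, then the feature type) as the BED name.

```python
def gff_line_to_bed(line):
    """Convert one GFF3 feature line to a BED6 line.

    Returns None for blank lines and comment/directive lines.
    """
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqid, _source, ftype, start, end, score, strand, _phase, attr_text = fields

    # Parse 'key=value;key=value' attributes into a dict.
    attrs = dict(kv.split("=", 1) for kv in attr_text.split(";") if "=" in kv)
    name = attrs.get("Parent", attrs.get("ID", ftype))

    bed_start = int(start) - 1  # 1-based inclusive start -> 0-based start
    bed_end = int(end)          # inclusive 1-based end equals exclusive 0-based end
    return "\t".join([seqid, str(bed_start), str(bed_end), name, score, strand])


# Example: a CDS spanning bases 11..20 (10 bases) becomes BED interval [10, 20).
example = "chr1\tsrc\tCDS\t11\t20\t.\t+\t0\tID=cds1;Parent=tx1\n"
print(gff_line_to_bed(example))
```

The interval length is preserved: 20 - 11 + 1 = 10 bases in GFF terms, and 20 - 10 = 10 bases in BED terms.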
I've gone ahead and installed bedops on Raven:
@zbengt @shedurkin status on this? And if/when done, please provide details on how the file was selected/derived.
Transcripts fastas for Pocillopora and Acropora are added to the wiki. Acropora is the transcripts fasta from NCBI (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_013753865.1/). Pocillopora is from the CDS on the Rutgers server (http://cyanophora.rutgers.edu/Pocillopora_meandrina/). These links are included in the wiki as well.
@shedurkin can you finish up by adding evermanni?
Added P.evermanni transcripts fasta, as well as links and download dates for the original CDS gff and scaffolds fasta files, and a link to the code used to generate the transcripts fasta: https://github.com/urol-e5/deep-dive/wiki/Species-Characteristics-and-Genomic-Resources#transcripts-1
https://github.com/urol-e5/deep-dive/wiki/Species-Characteristics-and-Genomic-Resources