Add transcript fastas to wiki #34

sr320 · 2024-02-02T17:17:41Z

https://github.com/urol-e5/deep-dive/wiki/Species-Characteristics-and-Genomic-Resources

shedurkin · 2024-02-02T23:55:07Z

For P. evermanni, I'm pretty sure I have a functioning script to extract all the CDS lines from the gff, get fastas for each, and concatenate and label them by parent! The only downside is that it's quite slow (I don't know enough bash tricks to make it any more efficient). It's processed ~15% of the gff in the last hour, so it'll be running for the rest of the day. I'll let y'all know when it's done.
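For readers following along, here is a minimal Python sketch of the approach described above (pull CDS features from a GFF, group them by parent, and concatenate the sequences). It is not the script used in this issue. It assumes GFF3-style `Parent=` attributes and a scaffolds dictionary already loaded in memory; the names `cds_by_parent` and `reverse_complement` are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cds_by_parent(gff_lines, scaffolds):
    """Return {parent_id: concatenated CDS sequence}.

    gff_lines: iterable of GFF text lines (assumed to carry Parent=... attributes).
    scaffolds: dict mapping scaffold name -> sequence string.
    """
    pieces = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "CDS":
            continue
        seqid, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        attrs = dict(item.split("=", 1) for item in fields[8].split(";") if "=" in item)
        parent = attrs.get("Parent")
        if parent is not None:
            pieces[parent].append((start, end, seqid, strand))

    transcripts = {}
    for parent, segments in pieces.items():
        segments.sort()  # ascending genomic start
        # GFF is 1-based and inclusive, so the Python slice is [start - 1:end].
        seq = "".join(scaffolds[seqid][start - 1:end] for start, end, seqid, _ in segments)
        if segments[0][3] == "-":
            seq = reverse_complement(seq)
        transcripts[parent] = seq
    return transcripts


scaffolds = {"s1": "AACCGGTTAC"}
gff = [
    "s1\tsrc\tCDS\t1\t2\t.\t+\t0\tID=c1;Parent=t1",
    "s1\tsrc\tCDS\t5\t6\t.\t+\t0\tID=c2;Parent=t1",
    "s1\tsrc\tCDS\t9\t10\t.\t-\t0\tID=c3;Parent=t2",
    "s1\tsrc\tCDS\t3\t4\t.\t-\t0\tID=c4;Parent=t2",
]
transcripts = cds_by_parent(gff, scaffolds)
```

Doing the grouping in a single pass over the GFF, with the scaffolds held in memory, avoids launching one external process per CDS line, which is a common reason shell loops over a large GFF run slowly.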

shedurkin · 2024-02-07T21:37:54Z

Okie dokie, the rendered code for generating a transcriptome fasta (and running kallisto) for P.evermanni, and the transcriptome fasta itself are both pushed to the deep-dive repo!

kubu4 · 2024-02-07T22:16:15Z

Nice work! Impressive.

I think your .md file didn't actually get re-rendered (commit is still from last week).

The HTML version got rendered, though:

https://htmlpreview.github.io/?https://github.com/urol-e5/deep-dive/blob/main/E-Peve/code/12-Peve-RNAseq-kallisto.html

I only glanced through the code, but figured I should ask this. Did you take into account that GFF files are 1-based (i.e. start is 1) and BED files are 0-based (i.e. start is 0)? Meaning, if using the GFF as input to bedtools getfasta, you should subtract 1 from the GFF coordinates so bedtools pulls out the proper sequence.

Admittedly, for alignment-free gene expression analysis, this is likely not an issue?
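To make the coordinate difference concrete, here is a small Python sketch of the conversion itself. It only illustrates the two conventions (GFF: 1-based, inclusive end; BED: 0-based, exclusive end) and makes no claim about what any particular tool does internally. The function name `gff_to_bed_interval` is hypothetical.

```python
def gff_to_bed_interval(gff_start, gff_end):
    """Convert a 1-based inclusive GFF interval to a 0-based half-open BED interval."""
    if gff_start < 1 or gff_end < gff_start:
        raise ValueError("expected 1 <= start <= end")
    # Only the start shifts; the end value is numerically unchanged.
    return gff_start - 1, gff_end


seq = "ACGTACGT"
# GFF 3..5 refers to the 3rd through 5th bases: G, T, A.
bed_start, bed_end = gff_to_bed_interval(3, 5)
extracted = seq[bed_start:bed_end]
```

If the start is left unshifted and treated as a BED coordinate, every extracted feature loses its first base, which is the off-by-one being asked about here.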

shedurkin · 2024-02-07T22:24:07Z

Hmm, i was just assuming that since the bedtools getfasta docs list GFF as one of the accepted input formats, it would distinguish BED from GFF files and handle the coordinates appropriately. It sounds like that's not the case?

kubu4 · 2024-02-07T22:48:01Z

> i was just assuming

Looks like you're not following the "golden rule" of bioinformatics... 😉

I'd definitely like to make the same assumption, but...

shedurkin · 2024-02-07T22:55:43Z

haha good point. It should be straightforward to add a gff -> bed conversion before generating the transcriptome; it'll just take a while to rerun.

shedurkin · 2024-02-07T23:27:36Z

I see bedops has a feature to convert gff to bed, and I saw in the handbook that it's been used before in the lab, but I don't see bedops in the /home/shared directory of tools on Raven -- is it stored somewhere else or would it need to be installed?

kubu4 · 2024-02-07T23:52:12Z

I've gone ahead and installed bedops on Raven:

/home/shared/bedops_linux_x86_64-v2.4.41/bin

sr320 · 2024-02-23T16:34:29Z

@zbengt @shedurkin status on this?

and if / when done, provide details on how file was selected / derived

zbengt · 2024-02-23T20:59:27Z

Transcripts fastas for Pocillopora and Acropora are added to the wiki. Acropora is the transcripts fasta from NCBI (https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_013753865.1/). Pocillopora is from the CDS on the Rutgers server (http://cyanophora.rutgers.edu/Pocillopora_meandrina/). These links are included in the wiki as well.

sr320 · 2024-02-25T17:57:29Z

@shedurkin can you finish up by adding evermanni?

shedurkin · 2024-02-26T19:45:48Z

Added P.evermanni transcripts fasta, as well as links and download dates for original CDS gff and scaffolds fasta files and a link to the code used to generate the transcripts fasta: https://github.com/urol-e5/deep-dive/wiki/Species-Characteristics-and-Genomic-Resources#transcripts-1

sr320 assigned shedurkin and zbengt Feb 2, 2024

shedurkin closed this as completed Feb 26, 2024
