Megan-project

Class project for FISH 546 winter 2021

Project results and methods; project presentation

Software information

OS: macOS Big Sur Version 11.2 (20D64)
RStudio Version 1.3.959
kallisto 0.46.2
FastQC v0.11.9 (Win/Linux)
GitHub Desktop Version 2.5.4
JupyterLab 3.0.6

RNA-seq data: gonadal transcriptional alterations in coho salmon treated with a steroid

Data courtesy of Chris Monson (UW) and Giles Goetz (NOAA). A more thorough description can be found in the data subdirectory's readme.

Data Location

All RNA seq raw data files can be found here

Only a subset of the files was used for this project because of storage limitations, but all code is written so that it can be executed on the full dataset if you have enough space on your computer or an external hard drive. The subset of files is:

- 17104-02RT-01-10_S18_L002_R1_001.fastq.gz
- 17104-02RT-01-10_S18_L002_R2_001.fastq.gz
- 17104-02RT-01-11_S19_L002_R1_001.fastq.gz
- 17104-02RT-01-11_S19_L002_R2_001.fastq.gz
- 17104-02RT-01-13_S21_L002_R2_001.fastq.gz
- 17104-02RT-01-13_S21_L003_R1_001.fastq.gz
- 17104-02RT-01-13_S21_L003_R2_001.fastq.gz
- 17104-02RT-01-14_S22_L002_R1_001.fastq.gz
- 17104-02RT-01-7_S15_L002_R2_001.fastq.gz
- 17104-02RT-01-7_S15_L003_R1_001.fastq.gz
- 17104-02RT-01-7_S15_L003_R2_001.fastq.gz
- 17104-02RT-01-8_S16_L002_R1_001.fastq.gz
- 17104-02RT-01-8_S16_L002_R2_001.fastq.gz
- 17104-02RT-01-8_S16_L003_R1_001.fastq.gz
- 17104-02RT-01-8_S16_L003_R2_001.fastq.gz

About these files:

- The R1 or R2 in a file name indicates the read end.
- The adapter sequences used in library prep are R1: AGATCGGAAGAGCACACGTCTGAACTCCAGTCA and R2: AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT.
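
Since the read end is encoded in each file name, the names can be parsed to match R1 and R2 files before any paired-end step. The following is a minimal sketch, not code from this project: the regular expression assumes the usual Illumina layout (sample, lane, read end), and `pair_reads` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Pattern for names such as 17104-02RT-01-10_S18_L002_R1_001.fastq.gz.
# Reading the fields as sample, lane, and read end is an assumption based
# on the usual Illumina naming convention.
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+_S\d+)_(?P<lane>L\d{3})_(?P<read>R[12])_001\.fastq\.gz$"
)


def pair_reads(file_names):
    """Group files by (sample, lane); return complete pairs and leftovers."""
    groups = defaultdict(dict)
    for name in file_names:
        match = FASTQ_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        groups[(match["sample"], match["lane"])][match["read"]] = name
    # A group is complete only when both R1 and R2 are present.
    paired = {k: (v["R1"], v["R2"]) for k, v in groups.items() if len(v) == 2}
    unpaired = {k: v for k, v in groups.items() if len(v) != 2}
    return paired, unpaired


example = [
    "17104-02RT-01-10_S18_L002_R1_001.fastq.gz",
    "17104-02RT-01-10_S18_L002_R2_001.fastq.gz",
    "17104-02RT-01-14_S22_L002_R1_001.fastq.gz",
]
paired, unpaired = pair_reads(example)
print("paired:", sorted(paired))
print("unpaired:", sorted(unpaired))
```

A check like this flags any file whose mate is missing from the list. In the subset as written above, for example, 17104-02RT-01-14_S22_L002_R1_001.fastq.gz appears without an R2 file.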

General Workflow

As of March 3rd, 2021 (Week 8).

Below, the code files for this project are listed and described in order. File names for the final project are formatted step#-description. Old file names / draft code names are formatted MMDD-description, where MMDD is the date they were created.

Origin of Data:

Giles transferred the files from a NOAA server to Steven's ostrich server. Steven then transferred them to gannet, which is linked above. The md5sum text file generated by Giles during the initial transfer, 0128-Giles-md5sums.txt, is located in the data/raw/ subdirectory.

step1-gettingDataFromGannet.ipynb

Retrieves the data from gannet using wget. Recall that not all of the files were used in this project, but all files are available on gannet. Saves data to the data/raw/ subdirectory.

step2-md5sums.ipynb

Compares the md5sums that Giles provided in 0128-Giles-md5sums.txt to the md5sums of the downloaded files.
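
The comparison can be sketched in Python as follows. This is an illustration rather than the notebook's actual code: it assumes 0128-Giles-md5sums.txt uses the standard `md5sum` output layout (checksum, whitespace, file name), and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text):
    """Parse lines of the form '<checksum>  <file name>' into a dict."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        checksum, name = line.strip().split(maxsplit=1)
        # md5sum marks binary mode with a leading '*'; keep only the base name.
        expected[Path(name.lstrip("*")).name] = checksum.lower()
    return expected


def verify(expected, directory):
    """Return OK, MISMATCH, or MISSING for each expected file."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.exists():
            results[name] = "MISSING"
        elif md5_of_file(path) == checksum:
            results[name] = "OK"
        else:
            results[name] = "MISMATCH"
    return results
```

Usage would look like `verify(parse_md5sums(Path("data/raw/0128-Giles-md5sums.txt").read_text()), "data/raw")`. Because only a subset of the files was downloaded, the remaining files would be reported as MISSING rather than as failures.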

step3-fastqcForMultipleFiles.ipynb

Runs fastqc for all of our raw data files. Output directed to analyses/step3-fastqc/

step4-multiqcOnFastqc.ipynb

Runs multiqc on all of the fastqc outputs in analyses/step3-fastqc/ in order to visualize the quality of all our sequences. Output directed to analyses/step4-multiqc/; the html output is multiqc_report.html within this subdirectory. Multiqc showed that the first ~15 bp of all sequences needed to be trimmed.
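
Steps 3 and 4 amount to two command-line calls. The sketch below only builds and prints them; it is not the notebooks' code. It assumes `fastqc` and `multiqc` are on the PATH and that the output directories already exist. The helper names and the thread count are illustrative.

```python
import shlex


def fastqc_command(fastq_files, out_dir="analyses/step3-fastqc/", threads=1):
    """Build a FastQC call that writes reports for all given files to out_dir."""
    return ["fastqc", "--outdir", out_dir, "--threads", str(threads),
            *map(str, fastq_files)]


def multiqc_command(fastqc_dir="analyses/step3-fastqc/",
                    out_dir="analyses/step4-multiqc/"):
    """Build a MultiQC call that summarizes every FastQC report in fastqc_dir."""
    return ["multiqc", fastqc_dir, "--outdir", out_dir]


raw_files = [
    "data/raw/17104-02RT-01-10_S18_L002_R1_001.fastq.gz",
    "data/raw/17104-02RT-01-10_S18_L002_R2_001.fastq.gz",
]
print(shlex.join(fastqc_command(raw_files, threads=2)))
print(shlex.join(multiqc_command()))

# To run the commands instead of printing them:
#   import subprocess
#   subprocess.run(fastqc_command(raw_files), check=True)
#   subprocess.run(multiqc_command(), check=True)
```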

step5-trimming.ipynb

Skipped in this project for the sake of time
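
No trimming code exists in this project. Purely as an illustration of what removing the first ~15 bp flagged in step 4 would mean for a single read, here is a hypothetical sketch that drops the leading bases and their quality scores from one FASTQ record. In practice a dedicated trimming tool would be used; `head_trim` is not part of this repository.

```python
def head_trim(record, n=15):
    """Remove the first n bases and quality scores from a 4-line FASTQ record.

    record is a tuple of (header, sequence, separator, quality).
    """
    header, sequence, separator, quality = record
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return (header, sequence[n:], separator, quality[n:])


# A made-up record: 15 leading bases followed by 3 bases that are kept.
record = ("@read1", "A" * 15 + "CGT", "+", "#" * 15 + "FFF")
print(head_trim(record))
```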

step6-kallisto.ipynb

Gene expression quantified using kallisto and put into a trinity matrix. The kallisto index was built using the Ensembl reference transcriptome for Oncorhynchus kisutch, located in data/Oncorhynchus_kisutch.Okis_V2.ncrna.fa. Outputs directed to analyses/step6-kallisto.idx and analyses/step6-output/
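
The two kallisto calls behind this step can be sketched as below. This is an assumption-laden illustration, not the notebook's code: the per-sample output subdirectory, the thread count, and the helper names are hypothetical, and the step that builds the matrix is not shown.

```python
import shlex


def kallisto_index_command(transcriptome, index_path):
    """Build the call that creates a kallisto index from a transcriptome FASTA."""
    return ["kallisto", "index", "-i", index_path, transcriptome]


def kallisto_quant_command(index_path, out_dir, read1, read2, threads=1):
    """Build a paired-end quantification call for one R1/R2 pair."""
    return ["kallisto", "quant", "-i", index_path, "-o", out_dir,
            "-t", str(threads), read1, read2]


index_cmd = kallisto_index_command(
    "data/Oncorhynchus_kisutch.Okis_V2.ncrna.fa",
    "analyses/step6-kallisto.idx",
)
quant_cmd = kallisto_quant_command(
    "analyses/step6-kallisto.idx",
    "analyses/step6-output/17104-02RT-01-10_S18",  # hypothetical sample folder
    "data/raw/17104-02RT-01-10_S18_L002_R1_001.fastq.gz",
    "data/raw/17104-02RT-01-10_S18_L002_R2_001.fastq.gz",
)
print(shlex.join(index_cmd))
print(shlex.join(quant_cmd))
```

Each quantification call needs both read ends of a sample, so files without a mate would have to be handled separately.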

step7-deseq2visualization.Rmd

Used DESeq2 to identify differentially expressed genes (DEGs) and visualized them with a volcano plot and heatmap. Images of the volcano plot and heatmap are in images/
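
The project's analysis is done in R with DESeq2. To show the logic behind a volcano plot only, here is a small Python sketch that classifies genes from a log2 fold change and an adjusted p-value. The cutoffs (padj < 0.05, |log2FC| >= 1) are common defaults chosen for illustration and are not taken from step7-deseq2visualization.Rmd.

```python
import math


def classify_gene(log2_fold_change, padj, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Label a gene 'up', 'down', or 'not significant'."""
    # DESeq2 can report NA for padj; treat a missing value as not significant.
    if padj is None or math.isnan(padj) or padj >= padj_cutoff:
        return "not significant"
    if log2_fold_change >= lfc_cutoff:
        return "up"
    if log2_fold_change <= -lfc_cutoff:
        return "down"
    return "not significant"


def volcano_point(log2_fold_change, padj):
    """Return the (x, y) position of a gene on a volcano plot."""
    return (log2_fold_change, -math.log10(padj))


# Made-up values for three hypothetical genes.
genes = {"geneA": (2.3, 0.001), "geneB": (-1.8, 0.01), "geneC": (0.4, 0.6)}
for name, (lfc, padj) in genes.items():
    print(name, classify_gene(lfc, padj), volcano_point(lfc, padj))
```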

step8-blast.ipynb

Ran blastx on the reference transcriptome to identify the functions of the DEGs.

Note: only one of 12 DEGs had a match to the reference transcriptome. The remaining 11 were identified using the web version of blastn using default settings and the fasta file data/Oncorhynchus_kisutch.Okis_V2.ncrna_11-DEGs.fa

step9-joining.ipynb

Joins the DEG statistics generated using DESeq2 with the expression levels of the 12 DEGs. Output is analyses/step9-DEGandBlastTable.tab
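
A join of this kind can be sketched with the standard library alone. The column names (`gene_id`, `padj`, `sample_a`) and the values are hypothetical and do not come from the project's tables; the notebook may well use a different tool.

```python
import csv
import io


def join_on_gene(deg_rows, expression_rows, key="gene_id"):
    """Inner-join two lists of dict rows on a shared gene identifier."""
    expression_by_gene = {row[key]: row for row in expression_rows}
    joined = []
    for row in deg_rows:
        match = expression_by_gene.get(row[key])
        if match is not None:
            joined.append({**row, **match})
    return joined


def to_tab_delimited(rows):
    """Serialize joined rows as tab-delimited text with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows standing in for DESeq2 statistics and expression levels.
deg_stats = [{"gene_id": "g1", "padj": "0.01"}, {"gene_id": "g2", "padj": "0.2"}]
expression = [{"gene_id": "g1", "sample_a": "12"}]
joined = join_on_gene(deg_stats, expression)
print(to_tab_delimited(joined), end="")
```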
