A Google Colab notebook for downloading sequence reads and viewing the aligned reads in IGV, all in the browser.

Open In Colab


Motivation

I am currently an undergraduate research volunteer in a lab that studies East African cichlid brains. When I began in October, I learned that the lab had data, but I had to wait to be granted access to our gene expression files. To my surprise, several projects on NCBI GEO already had publicly available datasets on cichlid brains. I knew that whatever code I used to analyze that public data would extend to our lab's private data as well.

I searched all over the web for a notebook that simply runs IGV, with no luck. Through sheer persistence (and lots of trial and error), I found the packages and resources I needed and worked out a pipeline that consolidates my steps into a single notebook, rather than leaving several confusing files and folders on my hard drive. This also saves me some disk space, as some reference genomes can be up to 8 GB. I'm hoping this will save other researchers and enthusiasts some headaches.

Features

  • Download SRA Data: Access raw genetic sequencing data from NCBI's Sequence Read Archive (SRA).

  • Genome Alignment: Align the downloaded reads to a reference genome to see where they map relative to genes and where they differ from the reference.

  • IGV Visualization: Utilize the Integrative Genomics Viewer (IGV) to visualize sequencing data directly in your browser.
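
To make the IGV step more concrete, here is a minimal sketch of the kind of browser configuration that igv.js (the JavaScript engine behind browser-based IGV) expects. It is not taken from the notebook itself: the function name `build_igv_config`, the file names, and the contig name `chr1` are all hypothetical, and it assumes the reference FASTA and the sorted BAM each have an index file sitting next to them.

```python
def build_igv_config(genome_fasta_url, bam_url, locus, track_name="Aligned reads"):
    """Return an igv.js-style browser configuration as a plain dict.

    Assumes (hypothetically) that the index files sit next to the data:
    <fasta>.fai for the reference and <bam>.bai for the alignments.
    """
    return {
        "reference": {
            "id": "custom",  # arbitrary label for a user-supplied genome
            "fastaURL": genome_fasta_url,
            "indexURL": genome_fasta_url + ".fai",
        },
        "locus": locus,  # region shown when the browser opens
        "tracks": [
            {
                "name": track_name,
                "url": bam_url,
                "indexURL": bam_url + ".bai",
                "format": "bam",
                "type": "alignment",
            }
        ],
    }


# Hypothetical file names and region, for illustration only.
config = build_igv_config("reference.fa", "sample.sorted.bam", "chr1:1-10000")
print(config["tracks"][0]["name"])
```

A notebook wrapper around igv.js, such as the igv-notebook package, takes a similar dictionary when creating a browser, though the exact keys it wants for local files in Colab may differ from the ones shown here.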

Screenshot

image

Data pipeline
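
As a rough sketch of the download, align, and index steps listed under Features, the snippet below only builds the command lines and prints them, so it runs without any tools installed. Several things here are assumptions rather than facts about this notebook: the aligner (HISAT2 is used as one common choice for RNA-seq reads), paired-end data, the placeholder accession `SRR0000000`, the index prefix `ref_index`, and the helper name `build_pipeline_commands`.

```python
import shlex


def build_pipeline_commands(accession, index_prefix, outdir="data", threads=2):
    """Build (but do not run) the commands for one SRA run.

    Assumes paired-end reads, so fasterq-dump --split-files is expected
    to write <accession>_1.fastq and <accession>_2.fastq into outdir.
    """
    r1 = f"{outdir}/{accession}_1.fastq"
    r2 = f"{outdir}/{accession}_2.fastq"
    sam = f"{outdir}/{accession}.sam"
    bam = f"{outdir}/{accession}.sorted.bam"
    return [
        # 1. Fetch the run from the Sequence Read Archive.
        ["prefetch", accession],
        # 2. Convert it to FASTQ files.
        ["fasterq-dump", "--split-files", "--threads", str(threads),
         "-O", outdir, accession],
        # 3. Align to a prebuilt index (aligner choice is an assumption).
        ["hisat2", "-p", str(threads), "-x", index_prefix,
         "-1", r1, "-2", r2, "-S", sam],
        # 4. Sort and index the alignments so IGV can load them.
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        ["samtools", "index", bam],
    ]


# Hypothetical accession and index prefix, for illustration only.
commands = build_pipeline_commands("SRR0000000", "ref_index")
for cmd in commands:
    print(shlex.join(cmd))
```

In a Colab cell, each list could be passed to `subprocess.run(cmd, check=True)` once the SRA Toolkit, the aligner, and samtools are installed. Keeping the commands as lists avoids shell-quoting problems with file names.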

Research Significance

  • Analyze gene expression patterns.

  • Investigate non-coding genome regions.

  • Identify genetic variations (mutations, SNPs).

Ideas for further exploration

  • Compare genomes across species or strains.