Data package for Drosophila melanogaster RNA-seq

Sources

  • Original data source: GSE60314
  • Original citation:
    • Lin Y, Golovnina K, Chen ZX, Lee HN et al. Comparison of normalization and differential expression analyses using RNA-Seq data from 726 individual Drosophila melanogaster. BMC Genomics 2016 Jan 5;17:28. PMID: 26732976

Usage

Install the package, attach the library, and load the default data set:

devtools::install_github('ttdtrang/data-rnaseq-Dmel')
library(data.rnaseq.Dmel)
data(dmel.rnaseq)
dim(dmel.rnaseq@assayData$exprs)
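The data sets are assembled as ExpressionSet objects (see the curation steps in this README), so the Biobase accessor functions can be used instead of reaching into slots directly. The following is a minimal sketch, assuming Biobase is installed; `describe_eset` is a hypothetical helper name introduced here for illustration and is not part of the package.

```r
library(Biobase)  # provides exprs() and pData() for ExpressionSet objects

# Hypothetical helper: summarise the dimensions and sample annotation
# fields of any ExpressionSet.
describe_eset <- function(eset) {
  expr <- exprs(eset)  # matrix with genes in rows and samples in columns
  list(
    n_genes       = nrow(expr),
    n_samples     = ncol(expr),
    sample_fields = colnames(pData(eset))  # per-sample metadata columns
  )
}

# Usage, assuming the data package has been installed as shown above.
if (requireNamespace("data.rnaseq.Dmel", quietly = TRUE)) {
  data("dmel.rnaseq", package = "data.rnaseq.Dmel")
  str(describe_eset(dmel.rnaseq))
}
```

Using `exprs()` rather than `@assayData$exprs` keeps the code independent of how the object stores its assay data internally.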

The package includes two groups of data sets, resulting from alignment to two different versions of the D. melanogaster genome: 5.57 and 6.01.

For genome version 6.01

data(dmel.rnaseq.full)

For genome version 5.57

data(dmel.rnaseq.full.5.57)

To list all data sets available with the package

data(package = 'data.rnaseq.Dmel')

The data sets included in the package are as follows:

|-- v 5.57
  |-- dmel.rnaseq.full.5.57 (17238 genes x 851 samples)
  |-- dmel.rnaseq.78A.5.57 (ERCC pool 78A: 356 samples)
  |-- dmel.rnaseq.78B.5.57 (ERCC pool 78B: 247 samples)
|-- v 6.01
  |-- dmel.rnaseq.full (17119 genes x 851 samples)
  |-- dmel.rnaseq.78A (ERCC pool 78A)
  |-- dmel.rnaseq.78B (ERCC pool 78B)
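
Because the two genome versions yield different gene counts (17238 vs. 17119), it can be useful to check which gene identifiers are shared before comparing results across versions. The sketch below is one possible way to do that; `compare_features` is a hypothetical helper, and it assumes that the feature names of both data sets use the same identifier scheme, which this README does not state.

```r
# Hypothetical helper: compare two vectors of gene identifiers.
compare_features <- function(ids_a, ids_b) {
  list(
    shared = intersect(ids_a, ids_b),  # present in both versions
    only_a = setdiff(ids_a, ids_b),    # present only in the first
    only_b = setdiff(ids_b, ids_a)     # present only in the second
  )
}

# Usage, assuming both the data package and Biobase are installed.
if (requireNamespace("data.rnaseq.Dmel", quietly = TRUE) &&
    requireNamespace("Biobase", quietly = TRUE)) {
  data(list = c("dmel.rnaseq.full", "dmel.rnaseq.full.5.57"),
       package = "data.rnaseq.Dmel")
  cmp <- compare_features(
    Biobase::featureNames(dmel.rnaseq.full),
    Biobase::featureNames(dmel.rnaseq.full.5.57)
  )
  print(lengths(cmp))  # number of identifiers in each group
}
```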

Steps to reproduce data curation

  1. cd data-raw
  2. Download the sample metadata in SOFT format from GEO entry GSE60314
  3. Download the run metadata in Excel format, and convert it to a tab-separated text file.
  4. Set the environment variable DBDIR to the path of the directory containing these files.
  5. Run the R notebook parse_metadata.Rmd to generate metadata files.
  6. Run the R notebook make-data-package.Rmd to assemble parts into ExpressionSet objects.

Some code chunks in the notebooks are disabled with eval=FALSE to avoid overwriting existing data files in the folder. To run every chunk, change those settings to eval=TRUE.
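
Steps 3 to 6 above can be sketched in R as follows. This is only an illustration under stated assumptions: the helper names `excel_to_tsv` and `run_curation` and the file names in the usage comments are hypothetical, the readxl and rmarkdown packages are assumed to be installed, and the notebooks may expect specific file names that are not documented here.

```r
# Step 3 (hypothetical helper): convert the Excel run metadata to a
# tab-separated text file. Requires the readxl package.
excel_to_tsv <- function(xlsx, tsv) {
  runs <- readxl::read_excel(xlsx)
  write.table(runs, tsv, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(tsv)
}

# Steps 4 to 6 (hypothetical helper): point DBDIR at the metadata
# directory, then render the notebooks in order. Requires rmarkdown
# when at least one notebook is given.
run_curation <- function(db_dir,
                         notebooks = c("parse_metadata.Rmd",
                                       "make-data-package.Rmd")) {
  stopifnot(dir.exists(db_dir))
  Sys.setenv(DBDIR = normalizePath(db_dir))
  for (nb in notebooks) {
    rmarkdown::render(nb)
  }
  invisible(Sys.getenv("DBDIR"))
}

# Usage, run from inside data-raw (step 1); paths are placeholders:
# excel_to_tsv("path/to/run_metadata.xlsx", "path/to/run_metadata.tsv")
# run_curation("path/to")
```

Chunks marked eval=FALSE are still skipped when rendering this way, so they need to be enabled in the notebooks themselves.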