Chapter 1: Raw Read Archives and Cloud Services

Saranga Wijeratne edited this page Sep 12, 2018 · 4 revisions

Raw Read Archives and Cloud Services – SRA (Sequence Read Archive) and Illumina BaseSpace


Introduction to SRA: https://www.ncbi.nlm.nih.gov/sra

The Sequence Read Archive (SRA) makes biological sequence data available to the research community

  • Enhance reproducibility

  • Allow for new discoveries by comparing data sets

  • The SRA stores raw sequencing data and alignment information from high-throughput sequencing platforms, including

    • Roche 454 GS System
    • Illumina Genome Analyzer
    • Applied Biosystems SOLiD System
    • Helicos Heliscope
    • Complete Genomics
    • Pacific Biosciences SMRT
    • Oxford Nanopore MinION
  • Submitting data to SRA

  • SRA Toolkit

    • fastq-dump: converts SRA data into FASTQ format
    • prefetch: downloads SRA data by accession from the command line
  • Data growth
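
The two SRA Toolkit commands listed above can be chained from a script. The following is a minimal Python sketch, not an official workflow. The function names (`prefetch_command`, `fastq_dump_command`, `fetch_run`), the default output directory `fastq`, and the choice of the `--split-files` and `--gzip` options are illustrative assumptions. It also assumes the SRA Toolkit is installed and on the PATH.

```python
import shutil
import subprocess


def prefetch_command(accession):
    """Build the argument list for downloading one SRA run with prefetch."""
    return ["prefetch", accession]


def fastq_dump_command(accession, outdir="fastq", split_paired=True, gzip=True):
    """Build the argument list for converting a run to FASTQ with fastq-dump."""
    cmd = ["fastq-dump", "--outdir", outdir]
    if split_paired:
        # Write each read of a spot (e.g. paired-end mates) to its own file.
        cmd.append("--split-files")
    if gzip:
        # Compress the FASTQ output.
        cmd.append("--gzip")
    cmd.append(accession)
    return cmd


def fetch_run(accession, outdir="fastq"):
    """Download a run, then convert it to FASTQ (hypothetical helper)."""
    for tool in ("prefetch", "fastq-dump"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH; install the SRA Toolkit")
    # check=True raises CalledProcessError if either tool exits with an error.
    subprocess.run(prefetch_command(accession), check=True)
    subprocess.run(fastq_dump_command(accession, outdir), check=True)
```

Calling `fetch_run("SRR000001")` would download that run and write compressed FASTQ files under `fastq/`. The accession is only a placeholder for whichever run is of interest. Building the argument lists in separate functions keeps the commands easy to inspect before anything is downloaded.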


Introduction to Illumina BaseSpace

BaseSpace is a cloud-based genomic analysis and storage platform

  • Set up runs for Illumina sequencers
  • Monitor sequencing runs remotely
  • Stream data to cloud storage directly from the sequencer
  • Analyze data with preconfigured bioinformatics pipelines
  • Share and download data easily