Course material for useR 2015 workshop -- Bioconductor for High Throughput Sequence Analysis


DNA sequence analysis generates large volumes of data presenting challenging bioinformatic and statistical problems. This tutorial introduces established and new Bioconductor packages and work flows for the analysis of sequence data. We learn about approaches for efficiently manipulating sequences and alignments, and introduce common work flows and the unique statistical challenges associated with 'RNAseq', variant annotation, and other experiments. The emphasis is on exploratory analysis, and the analysis of designed experiments. The workshop emphasizes orientation within the Bioconductor milieu; we will touch on the AnnotationHub, Biostrings, GenomicRanges, GenomicAlignments, DESeq2, and other packages, with short exercises to illustrate the functionality of each package.


  • Gain overall familiarity with Bioconductor packages for high-throughput sequence analysis, including Bioconductor vignettes and classes.

  • Obtain experience running bioinformatic work flows for data quality assessment, RNA-seq differential expression, and manipulating variant call format files

  • Appreciate the importance of ranges and range-based manipulation for modern genomic analysis

  • Learn 'best practices' for working with large data


  • Introduction to Bioconductor -- packages and classes

  • Short work flows

    • Exploring sequences and alignments
    • RNA-seq: a high-level tour
    • Annotating variants


The workshop assumes an intermediate level of familiarity with R, and a basic understanding of the biological and technological aspects of high-throughput sequence analysis. Participants should come prepared with a modern wireless-enabled laptop that has a web browser installed.

We will use pre-configured Amazon machine instances during the course, so no package installation is necessary. For use after the course, install the necessary software with the following commands:

# biocLite() is defined by sourcing the Bioconductor installation script
source("http://bioconductor.org/biocLite.R")
biocLite(c("shiny", "airway", "AnnotationHub", "Biostrings",
    "DESeq2", "GenomicAlignments", "GenomicFiles", "GenomicRanges",
    "Rsamtools", "TxDb.Hsapiens.UCSC.hg19.knownGene",
    "Homo.sapiens", "RNAseqData.HNRNPC.bam.chr14"))
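After installing, it can be useful to confirm that every package is actually available. The following is a minimal sketch using only base R; the helper name `missing_pkgs` is hypothetical (not part of Bioconductor), and the package names are copied from the installation command above.

```r
# Packages named in the installation command above
workshop_pkgs <- c("shiny", "airway", "AnnotationHub", "Biostrings",
    "DESeq2", "GenomicAlignments", "GenomicFiles", "GenomicRanges",
    "Rsamtools", "TxDb.Hsapiens.UCSC.hg19.knownGene",
    "Homo.sapiens", "RNAseqData.HNRNPC.bam.chr14")

# Hypothetical helper: return the subset of 'pkgs' whose namespace
# cannot be loaded in the current R library
missing_pkgs <- function(pkgs) {
    available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!available]
}

not_installed <- missing_pkgs(workshop_pkgs)
if (length(not_installed) > 0) {
    message("Still to install: ", paste(not_installed, collapse = ", "))
} else {
    message("All workshop packages are available.")
}
```

Any names reported as missing can be passed to the installation command again.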

Intended Audience

This workshop is for professional bioinformaticians and statisticians intending to use R / Bioconductor for analysis and comprehension of high-throughput sequence data.