A biological sequence file (fasta, fastq, qseq) parser for Ruby
tag: v0.1.2

DNA


A biological sequence file parser for Ruby

Austin G. Davis-Richardson

Features

  • Supported formats: FASTA, FASTQ and QSEQ (submit a format request)
  • Automatic format detection, so your scripts can be format-agnostic
  • Automatic Gzip support
  • Records are read from disk one at a time (the file is not loaded into memory)
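
To illustrate what format autodetection can look like, here is a minimal sketch that guesses the format from the first line of a file. It is not the gem's own detection code: `guess_format` and `detect_format` are hypothetical names, and the rules (FASTA headers start with `>`, FASTQ headers start with `@`, QSEQ lines have 11 tab-separated fields) are assumptions based on the usual layout of those formats.

```ruby
require 'stringio'

# Hypothetical helper (not part of the DNA gem): guess a sequence
# format from the first line of a file.
def guess_format(first_line)
  case first_line
  when /\A>/ then :fasta # FASTA headers start with '>'
  when /\A@/ then :fastq # FASTQ headers start with '@'
  else
    # Assumption: a QSEQ line has 11 tab-separated fields.
    first_line.chomp.split("\t", -1).length == 11 ? :qseq : :unknown
  end
end

# Peek at the first line, then rewind so a parser can start from the top.
def detect_format(io)
  first_line = io.gets.to_s
  io.rewind
  guess_format(first_line)
end

puts detect_format(StringIO.new(">1\nGAGAGATCTCATGACACAGCCGAAG\n"))
```

Peeking and rewinding only works on seekable handles such as regular files; a pipe would need the first line to be buffered instead.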

Installation

With Ruby 1.8.7 or later:

$ (sudo) gem install dna

Usage

require 'dna'

# Automatic Format Detection 

File.open('sequences.fasta') do |handle|
  records = Dna.new handle

  records.each do |record|
    puts record.length
  end
end

File.open('sequences.fastq') do |handle|
  records = Dna.new handle

  records.each do |record|
    puts record.quality
  end
end

File.open('sequences.qseq') do |handle|
  records = Dna.new handle
  puts records.first.inspect
end

# Even supports Gzip 
File.open('sequences.fasta.gz') do |handle|
  records = Dna.new handle

  records.each do |record|
    puts record.length
  end
end
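
The Gzip example above passes a plain file handle, so the compression has to be recognised from the data itself. One common way to do that, sketched below, is to check for the two Gzip magic bytes (0x1F 0x8B) at the start of the stream. This is an assumption about how such detection could work, not a description of the gem's internals, and `open_maybe_gzipped` is a hypothetical name.

```ruby
require 'stringio'
require 'zlib'

# Hypothetical helper (not part of the DNA gem): wrap the handle in a
# Zlib::GzipReader when the stream starts with the Gzip magic bytes,
# otherwise return the handle unchanged.
def open_maybe_gzipped(io)
  magic = io.read(2).to_s.bytes.to_a # read(2) returns nil on an empty stream
  io.rewind
  magic == [0x1F, 0x8B] ? Zlib::GzipReader.new(io) : io
end

# Usage with an in-memory Gzip stream standing in for sequences.fasta.gz
buffer = StringIO.new
writer = Zlib::GzipWriter.new(buffer)
writer.write(">1\nGAGAGATCTCATGACACAGCCGAAG\n")
writer.close

reader = open_maybe_gzipped(StringIO.new(buffer.string))
reader.each_line { |line| puts line }
```

Both the plain handle and the `Zlib::GzipReader` respond to `each_line`, so the code that parses records does not need to know whether the input was compressed.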

Bonus Feature

The DNA gem also installs a command-line tool, dna, with grep-like capabilities. It prints every record whose header matches a (Ruby) regular expression.

$ dna spec/data/input.fastq "[1-2]"

@1
TGAAACTTATTGATCACCCCGCTTGGCGTTGGGGAGAAATTCAGAAAAGAGTGCTTGATGGGGCGCCACATGCCGTGCAACCCACTCTCTTTCACGCAGCGCGCCCCA
+1
5888.6778888650/-//&,(,./*-11'//0&,-0.(.,,,,/2/&-,,,,,.(.,(,..&---&-,,,((*-----*+.&,,,,,(//&,,,-(,,+(,,,--&(
@2
GTCGCGGCTTACCACCCAACGATTTTTTTTAGAGGTGCTGGTTTCA
+2
2550//*-1./4.--/'+.2.,,,,,,,,&(/00.11426554+13

$ dna spec/data/test.fasta "\d"

>1
GAGAGATCTCATGACACAGCCGAAG
>2
GAGACAUAUCCNNNAA
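
The header filtering shown above can be sketched in plain Ruby. This is a simplified illustration for FASTA input only, written without the gem; `each_fasta_record` and `grep_fasta` are hypothetical names, and the command-line tool itself may work differently.

```ruby
require 'stringio'

# Hypothetical helper: yield [header, sequence] pairs from FASTA text,
# one record at a time.
def each_fasta_record(io)
  return enum_for(:each_fasta_record, io) unless block_given?

  header = nil
  sequence = String.new
  io.each_line do |line|
    line = line.chomp
    if line.start_with?('>')
      yield header, sequence if header # emit the previous record
      header = line[1..-1]
      sequence = String.new
    elsif header
      sequence << line # sequences may span several lines
    end
  end
  yield header, sequence if header # emit the last record
end

# Keep only the records whose header matches the given pattern.
def grep_fasta(io, pattern)
  regexp = Regexp.new(pattern)
  each_fasta_record(io).select { |header, _sequence| header =~ regexp }
end

fasta = ">1\nGAGAGATCTCATGACACAGCCGAAG\n>2\nGAGACAUAUCCNNNAA\n>abc\nACGT\n"
grep_fasta(StringIO.new(fasta), '\d').each do |header, sequence|
  puts ">#{header}", sequence
end
```

With the sample input, the `\d` pattern keeps records 1 and 2 and drops the record named abc, whose header contains no digit.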
