initial commit

0 parents, commit 0b10067d0c44904af5e2e53f873ebf03b7c69ee9
@lomereiter committed Jun 21, 2012
11 Gemfile
@@ -0,0 +1,11 @@
+source "http://rubygems.org"
+
+gem "bio", "~> 1.4.2"
+gem "oj", "~> 1.2.9"
+
+group :development do
+ gem "bundler", "~> 1.1.4"
+ gem "jeweler", "~> 1.8.3"
+ gem "rspec", "~> 2.7.0"
+ gem "cucumber", "~> 1.2.0"
+end
@@ -0,0 +1,20 @@
+Copyright (c) 2012 Artem Tarasov
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,47 @@
+# bio-sambamba
+
+[![Build Status](https://secure.travis-ci.org/lomereiter/bioruby-sambamba.png)](http://travis-ci.org/lomereiter/bioruby-sambamba)
+
+Full description goes here
+
+Note: this software is under active development!
+
+## Installation
+
+```sh
+ gem install bio-sambamba
+```
+
+## Usage
+
+```ruby
+ require 'bio-sambamba'
+```
+
+The API doc is online. For more code examples see the test files in
+the source tree.
+
+## Project home page
+
+For information on the source tree, documentation, examples, issues and
+how to contribute, see
+
+ http://github.com/lomereiter/bioruby-sambamba
+
+The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.
+
+## Cite
+
+If you use this software, please cite one of
+
+* [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
+* [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)
+
+## Biogems.info
+
+This Biogem is published at [#bio-sambamba](http://biogems.info/index.html)
+
+## Copyright
+
+Copyright (c) 2012 Artem Tarasov. See LICENSE.txt for further details.
+
@@ -0,0 +1,47 @@
+# encoding: utf-8
+
+require 'rubygems'
+require 'bundler'
+begin
+ Bundler.setup(:default, :development)
+rescue Bundler::BundlerError => e
+ $stderr.puts e.message
+ $stderr.puts "Run `bundle install` to install missing gems"
+ exit e.status_code
+end
+require 'rake'
+
+require 'jeweler'
+Jeweler::Tasks.new do |gem|
+ # gem is a Gem::Specification... see http://docs.rubygems.org/read/chapter/20 for more options
+ gem.name = "bio-sambamba"
+ gem.homepage = "http://github.com/lomereiter/bioruby-sambamba"
+ gem.license = "MIT"
+ gem.summary = %Q{Ruby wrapper for Sambamba tool}
+ gem.description = %Q{The new Sambamba library comes with a command-line tool for working with SAM/BAM files. This gem brings some of its functionality to Ruby.}
+ gem.email = "lomereiter@gmail.com"
+ gem.authors = ["Artem Tarasov"]
+ # dependencies defined in Gemfile
+
+ gem.files.include "lib/bio-sambamba/*.rb"
+ gem.files.include "lib/bio-sambamba.rb"
+end
+Jeweler::RubygemsDotOrgTasks.new
+
+require 'cucumber/rake/task'
+Cucumber::Rake::Task.new do |features|
+end
+
+task :test => :cucumber
+
+task :default => :test
+
+require 'rdoc/task'
+Rake::RDocTask.new do |rdoc|
+ version = File.exist?('VERSION') ? File.read('VERSION') : ""
+
+ rdoc.rdoc_dir = 'rdoc'
+ rdoc.title = "bio-sambamba #{version}"
+ rdoc.rdoc_files.include('README*')
+ rdoc.rdoc_files.include('lib/**/*.rb')
+end
@@ -0,0 +1 @@
+0.0.0
@@ -0,0 +1,40 @@
+Feature: iterating alignment records
+
+ In order to have access to all information contained in a BAM file,
+ As a bioinformatician,
+ I want to be able to iterate alignment records from Ruby
+ And have access to all their fields and tags.
+
+ Scenario: accessing alignment records
+ Given I opened a valid BAM file
+ When I use its 'alignments' method
+ Then I should be able to iterate the returned object with 'each'
+ And the objects which I iterate over should represent the alignments
+ And I should be able to access all fields mentioned in SAM/BAM format specification
+
+ Scenario: access existing alignment tag
+ Given I have an alignment
+ And it contains some tags
+ When I access it like a hash
+ And I use 2-character string as a key
+ And the alignment has such tag
+ Then I should be able to see corresponding value
+ And it should be a simple Ruby object (Array, Numeric, or String)
+
+ Scenario: invalid tag key (not of length 2)
+ Given I have an alignment
+ When I access it like a hash
+ But I use string of length different than two, as a key,
+ Then exception should be thrown.
+
+ Scenario: accessing non-existing alignment tag
+ Given I have an alignment
+ And it contains some tags
+ When I access it like a hash
+ But it doesn't contain the requested tag
+ Then nil should be returned.
+
+ Scenario: fetching all tags as a hash
+ Given I have an alignment
+ When I use its 'tags' method
+ Then I should be able to work with the returned object just like with Hash
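The tag-access scenarios above describe a small contract: a 2-character key returns the tag value, a missing tag returns nil, and a key of any other length raises. A minimal sketch of that contract follows. `SketchAlignment` is a hypothetical stand-in written for illustration, not the gem's `Bio::Bam::Alignment`, and the error message is an assumption.

```ruby
# Hypothetical stand-in for an alignment record. This is NOT the gem's
# implementation; it only mirrors the behaviour the scenarios describe.
class SketchAlignment
  def initialize(tags)
    @tags = tags
  end

  # Hash-like tag access. Keys must be 2-character strings.
  def [](key)
    unless key.is_a?(String) && key.length == 2
      # `raise` with a String raises RuntimeError
      raise "tag name must have length 2, got #{key.inspect}"
    end
    @tags[key] # nil when the tag is absent
  end

  # All tags as a plain Hash (a copy, so callers cannot mutate the record).
  def tags
    @tags.dup
  end
end

read = SketchAlignment.new({ 'MF' => 192 })
p read['MF'] # existing tag
p read['hq'] # missing tag
begin
  read['key']
rescue RuntimeError => e
  puts "rejected: #{e.message}"
end
```

Returning nil for a missing tag matches how Ruby's own Hash behaves, which is presumably why the scenarios treat it differently from a malformed key.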
@@ -0,0 +1,10 @@
+Feature: random access to BAM file
+ In order to retrieve information about specific regions,
+ I want to be able to quickly fetch alignments overlapping a region.
+
+ Scenario: fetching alignments
+ Given I have a BAM file
+ And it's sorted by coordinate
+ And I have its index as well
+ When I specify reference sequence and region (0-based beginning and end positions)
+ Then I should be able to immediately have access to alignments overlapping it
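What "overlapping a region" means for 0-based coordinates can be sketched as below. The helper `overlaps?` is hypothetical and is not part of the gem. It assumes that an alignment covers the half-open interval from its start up to, but not including, its end, and that the region is given as a Ruby Range.

```ruby
# Hypothetical helper, not part of bio-sambamba.
# Assumption: the alignment covers [aln_start, aln_end), 0-based,
# and the region is a Range of 0-based positions.
def overlaps?(aln_start, aln_end, region)
  # Normalise the region to an exclusive end so both intervals are half-open.
  region_end = region.exclude_end? ? region.end : region.end + 1
  aln_start < region_end && region.begin < aln_end
end

region = (1400...1500)
p overlaps?(1390, 1425, region) # true: covers positions 1400..1424
p overlaps?(1500, 1535, region) # false: starts right after the region
p overlaps?(1365, 1400, region) # false: ends right before the region
```

An index lets a reader skip directly to the parts of the file that may contain such alignments, but a filter of this kind is still needed on the candidates it returns.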
@@ -0,0 +1,23 @@
+Feature: access to information from SAM header
+
+ In order to work with BAM file,
+ I want to see what its header contains.
+
+ Background:
+ Given I opened a valid BAM file
+ And it contains SAM header
+
+ Scenario: getting raw text
+ When I call 'header' method
+ Then I should see text of SAM header
+
+ Scenario: accessing version and sorting order
+ When SAM header contains @HD line
+ Then I should be able to see format version
+ And I should be able to see sorting order
+
+ Scenario: getting information about reference sequences
+ When SAM header contains @SQ lines
+ Then I should be able to iterate them
+ And I should be able to see sequence names
+ And I should be able to see their lengths
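The header scenarios above map onto the SAM text format as follows: the `@HD` line carries the format version (`VN`) and sorting order (`SO`), and each `@SQ` line carries a sequence name (`SN`) and length (`LN`). A minimal sketch of extracting those fields from raw header text is shown here. `parse_header` and `SqLine` are hypothetical names, the sample header is made up for illustration, and none of this is the gem's own code.

```ruby
# Hypothetical sketch, not the gem's implementation.
SqLine = Struct.new(:sequence_name, :sequence_length)

def parse_header(text)
  info = { version: nil, sorting_order: nil, sq_lines: [] }
  text.each_line do |line|
    type, *fields = line.chomp.split("\t")
    # Only @HD and @SQ are handled; other record types are skipped.
    next unless type == '@HD' || type == '@SQ'

    # Fields look like "VN:1.3"; split on the first colon only.
    tags = fields.map { |f| f.split(':', 2) }.to_h
    if type == '@HD'
      info[:version] = tags['VN']
      info[:sorting_order] = tags['SO']
    else
      info[:sq_lines] << SqLine.new(tags['SN'], tags['LN'].to_i)
    end
  end
  info
end

# Illustrative header text, not taken from the test data.
text = "@HD\tVN:1.3\tSO:coordinate\n" \
       "@SQ\tSN:chr1\tLN:1575\n" \
       "@SQ\tSN:chr2\tLN:1584\n"

header = parse_header(text)
puts header[:version]
puts header[:sorting_order]
header[:sq_lines].each { |sq| puts "#{sq.sequence_name}: #{sq.sequence_length}" }
```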
@@ -0,0 +1,83 @@
+Before do
+ @bam = Bio::Bam::File.new 'test/data/ex1_header.bam'
+end
+
+When /^I use its 'alignments' method$/ do
+ @bam.should respond_to(:alignments)
+end
+
+Then /^I should be able to iterate the returned object with 'each'$/ do
+ @bam.alignments.should respond_to(:each)
+end
+
+Then /^the objects which I iterate over should represent the alignments$/ do
+ @bam.alignments.take(100).each do |read|
+ read.should be_instance_of(Bio::Bam::Alignment)
+ end
+end
+
+Then /^I should be able to access all fields mentioned in SAM\/BAM format specification$/ do
+ @read = @bam.alignments.first
+ @read.read_name.should == 'EAS56_57:6:190:289:82'
+ @read.sequence.should == 'CTCAAGGTTGTTGCAAGGGGGTCTATGTGAACAAA'
+ @read.position.should == 100
+ @read.flag.should == 69
+ @read.mapping_quality.should == 0
+ @read.cigar_string.should == '*'
+ @read.reference.should == 'chr1'
+ @read.quality.should == [27, 27, 27, 22, 27, 27, 27, 26, 27, 27, 27, 27, 27, 27, 27, 27, 23, 26, 26, 27, 22, 26, 19, 27, 26, 27, 26, 26, 26, 26, 26, 24, 19, 27, 26]
+end
+
+Given /^I have an alignment$/ do
+ @read = @bam.alignments.first
+end
+
+Given /^it contains some tags$/ do
+ # Intentionally empty: later steps check the 'MF' tag on this alignment.
+end
+
+When /^I access it like a hash$/ do
+ @read.should respond_to(:[])
+end
+
+When /^I use 2-character string as a key$/ do
+ @key = 'MF'
+end
+
+When /^the alignment has such tag$/ do
+ @read[@key].should_not be_nil
+end
+
+Then /^I should be able to see corresponding value$/ do
+ @read[@key].should be == 192
+end
+
+Then /^it should be a simple Ruby object \(Array, Numeric, or String\)$/ do
+ @read[@key].should be_kind_of Numeric
+end
+
+When /^I use string of length different than two, as a key,$/ do
+ @key = 'key'
+end
+
+Then /^exception should be thrown\.$/ do
+ expect{@read[@key]}.to raise_error(RuntimeError)
+end
+
+When /^it doesn't contain the requested tag$/ do
+ @key = 'hq'
+end
+
+Then /^nil should be returned\.$/ do
+ @read[@key].should be_nil
+end
+
+When /^I use its 'tags' method$/ do
+ @tags = @read.tags
+end
+
+Then /^I should be able to work with the returned object just like with Hash$/ do
+ @tags.should be_kind_of Hash
+ @tags['MF'].should be == 192
+ @tags.keys.should be == ['MF']
+ @tags.values.should be == [192]
+end
@@ -0,0 +1,22 @@
+Before do
+ @bam = Bio::Bam::File.new './test/data/ex1_header.bam'
+end
+
+Given /^it's sorted by coordinate$/ do
+ @bam.header.sorting_order.should == 'coordinate'
+end
+
+Given /^I have its index as well$/ do
+ @bam.should have_index
+end
+
+When /^I specify reference sequence and region \(0-based beginning and end positions\)$/ do
+ @region = (1400 ... 1500)
+ @chr = "chr2"
+end
+
+Then /^I should be able to immediately have access to alignments overlapping it$/ do
+ @alignments = @bam.fetch @chr, @region
+ @alignments.should respond_to(:each).with(0).arguments
+ @alignments.to_a.length.should == 75
+end
@@ -0,0 +1,56 @@
+Given /^I opened a valid BAM file$/ do
+ filename = './test/data/ex1_header.bam'
+ File.exist?(filename).should be_true
+ @bamfile = Bio::Bam::File.new filename
+end
+
+Given /^it contains SAM header$/ do
+ @bamfile.header.raw_contents.length.should be > 0
+end
+
+When /^I call 'header' method$/ do
+ @header = @bamfile.header
+end
+
+Then /^I should see text of SAM header$/ do
+ @header.raw_contents.should be_kind_of String
+end
+
+Given /^SAM header contains @HD line$/ do
+ @header = @bamfile.header
+ @header.raw_contents.should =~ /^@HD/
+end
+
+Then /^I should be able to see format version$/ do
+ @version = @header.version
+ @version.should be_kind_of String
+ @version.length.should be > 0
+end
+
+Then /^I should be able to see sorting order$/ do
+ @sorting_order = @header.sorting_order
+ @sorting_order.should be_kind_of String
+ @sorting_order.length.should be > 0
+end
+
+Given /^SAM header contains @SQ lines$/ do
+ @header = @bamfile.header
+ @header.sq_lines.length.should be > 0
+end
+
+Then /^I should be able to iterate them$/ do
+ @sq_lines = @header.sq_lines
+ @sq_lines.should be_kind_of Array
+end
+
+Then /^I should be able to see sequence names$/ do
+ @line = @sq_lines.first
+ @line.should respond_to(:sequence_name).with(0).arguments
+ @line.sequence_name.should be_kind_of String
+ @line.sequence_name.length.should be > 0
+end
+
+Then /^I should be able to see their lengths$/ do
+ @line.should respond_to(:sequence_length).with(0).arguments
+ @line.sequence_length.should be_kind_of Numeric
+end