A BioRuby plugin: handling genomic intervals and overlaps

bio-genomic-interval

Author

MISHIMA, Hiroyuki (hmishima AT nagasaki-u.ac.jp, missy AT be.to)

Version

0.1.2

Copyright

Copyright © MISHIMA, Hiroyuki, 2011

License

the MIT/X11 license

A BioRuby plugin for handling genomic intervals, such as “chr1:123-456”, and overlaps between two intervals.

Install

$ gem install bio-genomic-interval
(sudo or switching to root may be required)

Usage

Generation of interval objects

An interval object can be generated as follows:

a = Bio::GenomicInterval.new("chr1", 123, 456)
a.to_s # => "chr1:123-456"
b = Bio::GenomicInterval.parse("chr1:123-456")
b.to_s # => "chr1:123-456"
b2 = Bio::GenomicInterval.parse("chr1:1,234,567-2,345,678")
b2.to_s # => "chr1:1234567-2345678" # ignoring ","
c = Bio::GenomicInterval.zero_based("chr1", 122, 456)
c.to_s # => "chr1:123-456"
c.zero_start # => 122

The last one is generated from a “zero-based half-open [start, end)” interval, which is used in the UCSC Genome Browser's BED format, instead of the usual “one-based fully closed [start, end]” interval.

d = Bio::GenomicInterval.zero_based("chr1", 100, 100)
d.to_s # => "chr1:101-101"
d.length # => 1

In the BED format, an insertion position is written like “chr1:100-100”, an interval whose size is 0. This interval is converted into “chr1:101-101” in the one-based format. Note that its size becomes 1.
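
The coordinate conversion described above can be sketched in plain Ruby without the gem. The helper name bed_to_one_based is hypothetical and is not part of bio-genomic-interval; it only mirrors the behaviour shown in the examples, under the assumption that a zero-length insertion point is widened to a single base.

```ruby
# Hypothetical helper (not part of bio-genomic-interval): converts a
# zero-based half-open interval [z_start, z_end) into a one-based
# closed interval [start, stop].
def bed_to_one_based(z_start, z_end)
  start = z_start + 1                # shift the start to one-based numbering
  stop  = z_end                      # exclusive zero-based end == inclusive one-based end
  stop  = start if z_start == z_end  # zero-length insertion point becomes one base
  [start, stop]
end

p bed_to_one_based(122, 456)  # => [123, 456]
p bed_to_one_based(100, 100)  # => [101, 101]
```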

Comparison

ref = Bio::GenomicInterval.parse("chr1:123-456")
cmp = Bio::GenomicInterval.parse("chr1:234-567")
ref.compare(cmp) # => :right_overlapped 

ref.adjacent # => 20
near = Bio::GenomicInterval.parse("chr1:458-567")
ref.compare(near) # => :right_adjacent

ref.adjacent = 1
ref.compare(near) # => :right_off

Bio::GenomicInterval#compare returns one of the following:

:different_chrom, :left_adjacent, :right_adjacent
:left_off, :right_off, :equal
:contained_by, :contains, :left_overlapped, :right_overlapped
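
How these symbols relate to interval positions can be illustrated with a small stand-alone classifier. This is a sketch, not the gem's implementation: the Interval struct and the classify method are hypothetical names, and it assumes that “left” and “right” describe where the other interval lies relative to the receiver, that :contained_by means the receiver lies inside the other interval, and that intervals within the adjacent distance count as adjacent rather than off.

```ruby
# Illustrative only: a plain-Ruby classifier, not the gem's source.
# Coordinates are one-based and closed.
Interval = Struct.new(:chrom, :start, :stop)

def classify(ref, other, adjacent = 20)
  return :different_chrom if ref.chrom != other.chrom
  return :equal if ref.start == other.start && ref.stop == other.stop

  if other.stop < ref.start    # other lies entirely to the left of ref
    return other.stop >= ref.start - adjacent ? :left_adjacent : :left_off
  end
  if other.start > ref.stop    # other lies entirely to the right of ref
    return other.start <= ref.stop + adjacent ? :right_adjacent : :right_off
  end

  # From here on the two intervals share at least one base.
  return :contained_by if other.start <= ref.start && other.stop >= ref.stop
  return :contains     if other.start >= ref.start && other.stop <= ref.stop
  other.start < ref.start ? :left_overlapped : :right_overlapped
end

ref  = Interval.new("chr1", 123, 456)
near = Interval.new("chr1", 458, 567)
p classify(ref, Interval.new("chr1", 234, 567))  # => :right_overlapped
p classify(ref, near)                            # => :right_adjacent
p classify(ref, near, 1)                         # => :right_off
```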

Overlap metrics

  • When an overlap exists, a positive integer (>= 1) is returned for the overlap length.

  • When no overlap exists, zero or a negative integer (<= 0) is returned, whose absolute value is the size of the space between the intervals.

ref = Bio::GenomicInterval.parse("chr1:10-20")
cmp = Bio::GenomicInterval.parse("chr1:15-25")
ref.overlap(cmp) # => 6
cmp2 = Bio::GenomicInterval.parse("chr1:25-35")
ref.overlap(cmp2) # => -4
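
The two values above are consistent with a single signed formula for one-based closed intervals on the same chromosome: the smaller end minus the larger start, plus one. The following is a minimal sketch of that arithmetic, assuming this formula; the method name signed_overlap is hypothetical and the gem's own code may differ.

```ruby
# Illustrative arithmetic, not the gem's source: signed overlap of two
# one-based closed intervals that lie on the same chromosome.
def signed_overlap(start_a, end_a, start_b, end_b)
  [end_a, end_b].min - [start_a, start_b].max + 1
end

p signed_overlap(10, 20, 15, 25)  # => 6   (bases 15..20 are shared)
p signed_overlap(10, 20, 25, 35)  # => -4  (bases 21..24 lie between)
p signed_overlap(10, 20, 21, 30)  # => 0   (directly abutting intervals)
```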

Expansion (or integration)

ref =   Bio::GenomicInterval.parse("chr1:400-600")
other = Bio::GenomicInterval.parse("chr1:650-800")
ref.expand(other).to_s # => "chr1:400-800"

Center

obj1 = Bio::GenomicInterval.parse("chr1:1-3")
obj1.center # => "chr1:2-2"
obj2 = Bio::GenomicInterval.parse("chr2:1-4")
obj2.center # => "chr2:2-2"
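
Both results above agree with taking the integer midpoint of the start and end, rounded down. This is a sketch of that arithmetic only, assuming the library rounds down for even-length intervals as the second example suggests; the method name midpoint is hypothetical.

```ruby
# Illustrative only: integer midpoint of a one-based closed interval.
# Integer division rounds down, so an even-length interval yields the
# left of its two middle positions.
def midpoint(chr_start, chr_end)
  (chr_start + chr_end) / 2
end

p midpoint(1, 3)  # => 2
p midpoint(1, 4)  # => 2
```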

And others

ref =   Bio::GenomicInterval.parse("chr1:400-600")
other = Bio::GenomicInterval.parse("chr1:605-800")
ref.overlapped?(other) # => false
ref.nearly_overlapped?(other) # => true
ref.size # => 201
ref.chr_start -= 100
ref.chr_end += 100
ref.chrom = "chrX"
ref.to_s # => "chrX:300-700"

See also the RSpec file.

Contributing to bio-genomic-interval

Please do not hesitate to contact the author by email.

Copyright

Copyright © 2011 Hiroyuki Mishima. See LICENSE.txt for further details.
