misshie/bioruby-genomic-interval

bio-genomic-interval

Author

MISHIMA, Hiroyuki (hmishima AT nagasaki-u.ac.jp, missy AT be.to)

Version

0.1.2

Copyright

Copyright © MISHIMA, Hiroyuki, 2011

License

the MIT/X11 license

A BioRuby plugin for handling genomic intervals, such as “chr1:123-456”, and overlaps between two intervals.

Install

$ gem install bio-genomic-interval
(sudo or switching to root may be required)

Usage

Generation of interval objects

An interval object can be generated as follows:

a = Bio::GenomicInterval.new("chr1", 123, 456)
a.to_s # => "chr1:123-456"
b = Bio::GenomicInterval.parse("chr1:123-456")
b.to_s # => "chr1:123-456"
b2 = Bio::GenomicInterval.parse("chr1:1,234,567-2,345,678")
b2.to_s # => "chr1:1234567-2345678" # ignoring ","
c = Bio::GenomicInterval.zero_based("chr1", 122, 456)
c.to_s # => "chr1:123-456"
c.zero_start # => 122

The last one is generated from a “zero-based half-open [start, end)” interval, which is used in the UCSC Genome Browser’s BED format, instead of the usual “one-based fully closed [start, end]” interval.

d = Bio::GenomicInterval.zero_based("chr1", 100, 100)
d.to_s # => "chr1:101-101"
d.length # => 1

In the BED format, an insertion position is written like “chr1:100-100”, whose size is 0. This interval is converted into “chr1:101-101” in the one-based format. Note that the size changes to 1.
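The parsing and coordinate conversion described above can be sketched in plain Ruby. This is an illustration only, not the gem's source: `SketchInterval`, `parse_region`, and `from_zero_based` are hypothetical names, and the rule for zero-length BED intervals is inferred from the examples above.

```ruby
# Illustrative sketch; names are hypothetical, not part of bio-genomic-interval.
SketchInterval = Struct.new(:chrom, :chr_start, :chr_end) do
  def to_s
    "#{chrom}:#{chr_start}-#{chr_end}"
  end
end

# Parse "chrom:start-end", ignoring "," used as a thousands separator.
def parse_region(text)
  m = text.delete(",").match(/\A(\S+):(\d+)-(\d+)\z/)
  raise ArgumentError, "cannot parse #{text.inspect}" unless m
  SketchInterval.new(m[1], m[2].to_i, m[3].to_i)
end

# Convert a zero-based half-open [start, end) interval to one-based closed.
# Assumption: a zero-length interval (start == end) becomes a single base,
# as in the "chr1:101-101" example.
def from_zero_based(chrom, zero_start, zero_end)
  one_start = zero_start + 1
  one_end = [zero_end, one_start].max
  SketchInterval.new(chrom, one_start, one_end)
end

puts parse_region("chr1:1,234,567-2,345,678")
puts from_zero_based("chr1", 122, 456)
puts from_zero_based("chr1", 100, 100)
```

The three printed lines should match the strings shown in the examples above.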

Comparison

ref = Bio::GenomicInterval.parse("chr1:123-456")
cmp = Bio::GenomicInterval.parse("chr1:234-567")
ref.compare(cmp) # => :right_overlapped

ref.adjacent # => 20
near = Bio::GenomicInterval.parse("chr1:458-567")
ref.compare(near) # => :right_adjacent

ref.adjacent = 1
ref.compare(near) # => :right_off

Bio::GenomicInterval#compare returns one of the following symbols:

:different_chrom, :left_adjacent, :right_adjacent
:left_off, :right_off, :equal
:contained_by, :contains, :left_overlapped, :right_overlapped
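One way to arrive at these symbols is sketched below in plain Ruby. It is not the gem's implementation: `Region` and `classify` are hypothetical names, the exact boundary of the adjacency margin is inferred from the examples above, and the direction of :contains versus :contained_by is a guess that should be checked against the RSpec file.

```ruby
# Illustrative sketch; names are hypothetical, not part of bio-genomic-interval.
Region = Struct.new(:chrom, :chr_start, :chr_end)

# Classify `other` relative to `ref`; `adjacent` is the margin (in bases)
# within which a non-overlapping interval still counts as adjacent.
def classify(ref, other, adjacent = 20)
  return :different_chrom if ref.chrom != other.chrom

  if other.chr_end < ref.chr_start # other lies entirely to the left
    gap = ref.chr_start - other.chr_end
    return gap <= adjacent ? :left_adjacent : :left_off
  end
  if other.chr_start > ref.chr_end # other lies entirely to the right
    gap = other.chr_start - ref.chr_end
    return gap <= adjacent ? :right_adjacent : :right_off
  end

  return :equal if ref.chr_start == other.chr_start && ref.chr_end == other.chr_end
  # Assumption: :contains means ref covers other; the gem may use the opposite sense.
  return :contains if ref.chr_start <= other.chr_start && other.chr_end <= ref.chr_end
  return :contained_by if other.chr_start <= ref.chr_start && ref.chr_end <= other.chr_end

  other.chr_start < ref.chr_start ? :left_overlapped : :right_overlapped
end

ref  = Region.new("chr1", 123, 456)
cmp  = Region.new("chr1", 234, 567)
near = Region.new("chr1", 458, 567)
p classify(ref, cmp)     # => :right_overlapped
p classify(ref, near)    # => :right_adjacent
p classify(ref, near, 1) # => :right_off
```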

Overlap metrics

  • When an overlap exists, returns a positive integer (>= 1): the overlap length.

  • When no overlap exists, returns zero or a negative integer (<= 0): the size of the space between the intervals, negated.

ref = Bio::GenomicInterval.parse("chr1:10-20")
cmp = Bio::GenomicInterval.parse("chr1:15-25")
ref.overlap(cmp) # => 6
cmp2 = Bio::GenomicInterval.parse("chr1:25-35")
ref.overlap(cmp2) # => -4
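A single formula reproduces both results above. The sketch below is an assumption about how such a metric can be computed, not the gem's code; `overlap_length` is a hypothetical helper, and both intervals are assumed to be on the same chromosome.

```ruby
# Illustrative sketch; overlap_length is a hypothetical helper.
# For one-based closed intervals: positive = shared bases,
# zero or negative = number of bases between the intervals, negated.
def overlap_length(start_a, end_a, start_b, end_b)
  [end_a, end_b].min - [start_a, start_b].max + 1
end

p overlap_length(10, 20, 15, 25) # => 6  (bases 15..20 are shared)
p overlap_length(10, 20, 25, 35) # => -4 (bases 21..24 lie between)
p overlap_length(10, 20, 21, 30) # => 0  (directly abutting)
```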

Expansion (or integration)

ref =   Bio::GenomicInterval.parse("chr1:400-600")
other = Bio::GenomicInterval.parse("chr1:650-800")
ref.expand(other).to_s # => "chr1:400-800"
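Expansion amounts to taking the smallest start and the largest end of the two intervals. A minimal sketch follows, with hypothetical names (`Span`, `merge_spans`); raising an error for different chromosomes is an assumption, since the behaviour of the gem in that case is not described here.

```ruby
# Illustrative sketch; names are hypothetical, not part of bio-genomic-interval.
Span = Struct.new(:chrom, :chr_start, :chr_end) do
  def to_s
    "#{chrom}:#{chr_start}-#{chr_end}"
  end
end

# Return the smallest interval covering both a and b.
def merge_spans(a, b)
  # Assumption: merging across chromosomes is treated as an error.
  raise ArgumentError, "different chromosomes" unless a.chrom == b.chrom
  Span.new(a.chrom,
           [a.chr_start, b.chr_start].min,
           [a.chr_end, b.chr_end].max)
end

puts merge_spans(Span.new("chr1", 400, 600), Span.new("chr1", 650, 800))
# prints chr1:400-800
```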

Center

obj1 = Bio::GenomicInterval.parse("chr1:1-3")
obj1.center # => "chr1:2-2"
obj2 = Bio::GenomicInterval.parse("chr2:1-4")
obj2.center # => "chr2:2-2"
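Both examples are consistent with taking the midpoint by integer division, so an even-length interval yields the lower of its two middle positions. The helper below is a hypothetical sketch of that rule, inferred from the examples rather than taken from the gem.

```ruby
# Illustrative sketch; center_position is a hypothetical helper.
# Integer division rounds down, so 1..4 gives 2 rather than 3.
def center_position(chr_start, chr_end)
  (chr_start + chr_end) / 2
end

p center_position(1, 3) # => 2
p center_position(1, 4) # => 2
```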

And others

ref =   Bio::GenomicInterval.parse("chr1:400-600")
other = Bio::GenomicInterval.parse("chr1:605-800")
ref.overlapped?(other) # => false
ref.nearly_overlapped?(other) # => true
ref.size # => 201
ref.chr_start -= 100
ref.chr_end += 100
ref.chrom = "chrX"
ref.to_s # => "chrX:300-700"

See also the RSpec file.

Contributing to bio-genomic-interval

Please do not hesitate to contact the author by email.

Copyright © 2011 Hiroyuki Mishima. See LICENSE.txt for further details.
