jandot / locustree

Ruby library to search genomic loci


Features and nodes now loaded faster, using SQL import file. Fixed bug in query.
jandot (author)
Mon Aug 03 09:06:39 -0700 2009
commit  03471ab1a599e0658480a773b1153984c65e8d9c
tree    c235062dc7294e98bb76e2982b9c469465a6166d
parent  e93a144179d61c71a849067b476a6cffab72c9be
locustree /
name age message
file .gitignore
file README.textile Wed Jul 29 06:14:20 -0700 2009 Changed README file [jandot]
directory doc/
directory lib/
file locustree.gemspec
directory samples/
directory test/
README.textile

LocusTree – Ruby library to search genomic loci

The LocusTree library helps in working with large numbers of genomic features. It
has two clear uses.

Fast searching

The naive way of fetching all features in a given genomic locus is to go through
all of them and check whether each one is in the target locus or not. This works
fine for smaller datasets, but every query has to touch every feature, so it
becomes very slow for very large ones.
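
As a point of reference, the naive scan could look roughly like the sketch below. The Feature struct and the naive_overlaps method are hypothetical names used for illustration only; they are not part of the LocusTree API.

```ruby
# Hypothetical feature record (illustration only, not the LocusTree API).
Feature = Struct.new(:name, :chr, :start, :stop)

# Linear scan: test every single feature for overlap with the target locus.
def naive_overlaps(features, chr, start, stop)
  features.select do |f|
    f.chr == chr && f.start <= stop && f.stop >= start
  end
end

features = [
  Feature.new('geneA', '2', 100, 200),
  Feature.new('geneB', '2', 500, 900),
  Feature.new('geneC', '3', 150, 250)
]

hits = naive_overlaps(features, '2', 150, 600)
puts hits.map(&:name).join(', ')
```

Every call walks the full feature list, which is what the tree described in the next paragraph avoids.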

There are several ways to speed this up. One is to use a binary search on the
sorted features; another is to use a tree. LocusTree uses this second approach.
A whole chromosome is split into, say, 100 chunks. Each of these chunks is split
again into 100 parts, and so on. To get all features in a given region, we can
do a top-down search. We check which of the 100 nodes at the top level overlap
with the target locus. For those that do, we check their 100 subnodes, and so on.
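
The top-down search can be sketched in a few lines of Ruby. This is a minimal in-memory illustration of the idea, not LocusTree's actual implementation: the Node class, the build, insert and query methods, and the small fanout of 10 (instead of 100) are all assumptions made for the example.

```ruby
# A node covers the positions start..stop (inclusive). Inner nodes hold
# children; leaves hold the features that overlap them.
class Node
  attr_reader :start, :stop, :children, :features

  def initialize(start, stop)
    @start = start
    @stop = stop
    @children = []
    @features = []
  end

  def overlaps?(from, to)
    @start <= to && @stop >= from
  end
end

# Split start..stop into `fanout` chunks, recursively, `depth` levels deep.
def build(start, stop, fanout, depth)
  node = Node.new(start, stop)
  return node if depth.zero?
  size = ((stop - start + 1) / fanout.to_f).ceil
  fanout.times do |i|
    s = start + i * size
    break if s > stop
    node.children << build(s, [s + size - 1, stop].min, fanout, depth - 1)
  end
  node
end

# Store a feature (a hash with :name, :start, :stop) in every leaf it overlaps.
def insert(node, feature)
  return unless node.overlaps?(feature[:start], feature[:stop])
  if node.children.empty?
    node.features << feature
  else
    node.children.each { |child| insert(child, feature) }
  end
end

# Top-down search: only descend into nodes that overlap the target locus.
def query(node, from, to, hits = [])
  return hits unless node.overlaps?(from, to)
  if node.children.empty?
    node.features.each do |f|
      hits << f if f[:start] <= to && f[:stop] >= from && !hits.include?(f)
    end
  else
    node.children.each { |child| query(child, from, to, hits) }
  end
  hits
end

root = build(1, 1000, 10, 2)
insert(root, { name: 'geneA', start: 95, stop: 120 })
insert(root, { name: 'geneB', start: 500, stop: 505 })
insert(root, { name: 'geneC', start: 900, stop: 950 })
puts query(root, 110, 502).map { |f| f[:name] }.join(', ')
```

Subtrees that do not overlap the target locus are skipped entirely, so a query for a small region only visits a handful of nodes per level.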

Aggregation

For some purposes we don’t want to get the results at the highest resolution.
This is especially true for genome browsers. Suppose you want to display 15
million SNPs on a genome and your display is 800 pixels wide. The simplest way
of doing this is to take each of the 15 million SNPs, calculate its pixel
position and draw it. But this means doing 15 million fetches/calculations to
draw only 800 datapoints. Using LocusTree, we can select the level of the tree
that has about 800 nodes. If, while building the tree, we added a value to each
node that states how many SNPs are within that node and its subnodes, we only
have to fetch 800 datapoints and show the SNP density for each of them.
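
A minimal sketch of this aggregation, assuming equally sized bins and a simple array of counters per level; the CountTree class and its methods are hypothetical and only illustrate the idea, they are not LocusTree's own data structures.

```ruby
# Keeps one counter per node for every level of a tree with the given fanout.
class CountTree
  attr_reader :levels

  def initialize(length, fanout, depth)
    @length = length
    # levels[d] holds fanout**d counters, one per node at depth d.
    @levels = (0..depth).map { |d| Array.new(fanout**d, 0) }
  end

  # Register one SNP at a 1-based position, updating the counter of the
  # node that contains it on every level.
  def add(position)
    @levels.each do |counts|
      bin_size = (@length / counts.size.to_f).ceil
      counts[(position - 1) / bin_size] += 1
    end
  end

  # Return the counters of the level whose node count is closest to `width`.
  def density_for(width)
    @levels.min_by { |counts| (counts.size - width).abs }
  end
end

tree = CountTree.new(1000, 10, 3)
[5, 7, 15, 995].each { |pos| tree.add(pos) }

density = tree.density_for(80)
puts "#{density.size} bins, first bin holds #{density[0]} SNPs"
```

Because the counts are maintained while the tree is built, drawing the display only requires reading one counter per node at the chosen level, regardless of how many SNPs were loaded.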

More information

For more information, see the project wiki page.

Usage

require 'locus_tree'
container = LocusTree::Container.new(100, 'genes.txt.index100', 'genes.txt')
positive_nodes = container.query('2', 143570750, 143570790, 10)
positive_nodes.each do |node|
  puts node.to_s + "\t" + node.value.to_s
end
puts container.query_single_bin('24', 49_050, 49_500).to_s