A bioruby wrapper for parsing and reading CD-HIT cluster reports

# bio-cd-hit-report

Clustering sequences with CD-HIT produces a cluster file (.clstr) that lists sequence names and the clusters they belong to. This plugin provides methods for parsing that file.
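
For readers who have not seen a .clstr file, the sketch below shows the general layout and a minimal standalone reader for it. It is not this plugin's implementation: the `Cluster` struct, the `parse_clstr` helper, and the sample sequence names and lengths are all hypothetical and exist only for illustration. It assumes the usual CD-HIT layout, where a `>Cluster N` header is followed by member lines and the representative sequence is marked with `*`.

```ruby
# Illustrative only: a standalone reader for the .clstr layout.
# This is NOT the gem's code; every name below is hypothetical.
Cluster = Struct.new(:name, :members, :rep_seq)

def parse_clstr(text)
  clusters = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?

    if line.start_with?(">")
      # A header such as ">Cluster 0" opens a new cluster.
      clusters << Cluster.new(line[1..-1], [], nil)
    elsif clusters.any? && (m = line.match(/>(.+?)\.\.\.\s*(\*|at\s.+)\z/))
      # Member line: index, length, ">name...", then "*" or "at NN.NN%".
      clusters.last.members << m[1]
      # The "*" marker flags the cluster's representative sequence.
      clusters.last.rep_seq = m[1] if m[2] == "*"
    end
  end
  clusters
end

# Made-up sample data in the assumed layout.
sample = <<~CLSTR
  >Cluster 0
  0\t120aa, >seq_a... *
  1\t118aa, >seq_b... at 97.46%
  >Cluster 1
  0\t95aa, >seq_c... *
CLSTR

parse_clstr(sample).each do |c|
  puts "#{c.name}: #{c.members.size} member(s), representative #{c.rep_seq}"
end
```

The plugin exposes this kind of information through its own methods (see Usage), so a hand-written reader like this should not be needed in practice.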

Note: this plugin is under active development!

Installation

    gem install bio-cd-hit-report

Usage

    require 'bio-cd-hit-report'

    cluster_file = "cluster95.clstr"
    report = Bio::CdHitReport.new(cluster_file)

    # print the total number of clusters in the report
    puts report.total_clusters

    # print the cluster members for the cluster with id 1
    puts report.get_cluster(1)

    # print information for each cluster
    report.each_cluster do |c|
      puts c.name        # the full cluster name
      puts c.members     # the sequence names in the cluster
      puts c.cluster_id  # the cluster id only
      puts c.size        # the total number of entries in the cluster
      puts c.rep_seq     # the name of the representative sequence in this cluster
    end

Project home page

For information on the source tree, documentation, examples, issues, and how to contribute, see

http://github.com/georgeG/bioruby-cd-hit-report

The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.

Cite

If you use this software, please cite one of

Biogems.info

This Biogem is published at #bio-cd-hit-report

Copyright

Copyright (c) 2013 George Githinji. See LICENSE.txt for further details.