Skip to content
This plugin extends Bioruby's AminoAcid sequence class by adding methods for analysing PfEMP1 DBL-alpha sequence tags
Find file
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


Build Status


DBL-alpha tags are small regions of the PfEMP1 protein that can be PCR amplified and are classified into six expression groups depending on the number of cysteines and presence of certain motifs within the tag region (Bull et al 2007). This plugin extends bioruby's amino acid class, Bio::Sequence::AA by adding methods to analyze DBL-alpha sequences tags. If you use this plugin please quote, Bull et al “An approach to classifying sequence tags sampled from Plasmodium falciparum var genes..” Molecular and Biochemical Parasitology 154 (1) (July): 98–102. doi:10.1016/j.molbiopara.2007.03.011.


Ruby must be installed on your system. See for information on Ruby and how to install it on your system. Once Ruby 1.9.2 is installed type the following command in the terminal to install the gem. This will install the bioruby gem if it is not already installed on your system. The plugin has been tested on Ruby 1.9.2-p290.

gem install bio-dbla-classifier


gem uninstall bio-dbla-classifier


#create an instance of Bioruby's Bio::Sequence::AA class with methods to classify and describe DBL-alpha tags.

require 'bio-dbla-classifier'

dbl_seq =

#get the positions of limited variability
puts dbl_seq.polv1 #=> MFKS
puts dbl_seq.polv2 #=> LRED
puts dbl_seq.polv3 #=> KAIT
puts dbl_seq.polv4 #=> PTNL

#get the number of cysteines in the tag
puts dbl_seq.cys_count #=> 2

#get the distinct sequence identifier
puts dbl_seq.dsid #=>MFKS-LRED-KAIT-2-PTNL-115

#get the cyspolv group for this tag
puts dbl_seq.cyspolv_group #=> 1

#get the block sharing group for this tag
puts dbl_seq.bs_group #=> 1

#get the length of the tag
puts dbl_seq.size #=> 115

#determine whether the tag is a var1
dbl_seq.is_var1? #=> false

#is this tag a groupA like sequence tag?

Finding the Position Specific Polymorphic Blocks(PSPB)

The pspb methods take 2 arguments, an anchor position and a window length that defines the length of the pspb.The default anchor position is 0 and the default window length is 14

#get pspb1
puts seq.pspb1 #=> NPEVEKGLKAVFRK

#get pspb2
puts seq.pspb2 #=> THYADEDGSGNYVK

#get pspb3
puts seq.pspb3 #=> CKAPQSVHYFIKTS

#get pspb4
puts seq.pspb4 #=> FTSHGKCGRNETNV

#get a pspb as an amino acid
puts seq.pspb4_as_dna

Finding polv2 to polv3 regions

Only the Amino acid regions are supported for now 

#get polv1 to polv2 region
puts seq.polv1_to_polv2

#get polv3 to polv4 region
puts seq.polv3_to_polv4

Processing a flatfile for example fasta, genbank or embl

seq_file = "sequences.fasta"

#for each entry in the file do |entry|
 tag =
 puts "#{entry.definition},#{tag.cyspolv_group},#{tag.dsid},#{tag.bs_group},"


See LICENSE.txt for further details

Something went wrong with that request. Please try again.