
Bio::Cluster::SequenceFamily - bug fixes and performance

* fix bug when accessing the Bio::Species object. If a Bio::Seq object
  had no Bio::Species information, can('species') still returned true,
  but species() then returned undef and the next method call on it
  raised an error.
* when using multiple criteria, objects matching more than one criterion
  appeared repeatedly in the results. They now appear only once.
* if for some reason there are two equal objects in the family (same
  reference), they also appear only once in the results.
* performance should improve for large families when using multiple
  criteria, since members that already matched are not tested again.
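The behaviour described in the list above can be sketched without BioPerl. This is a minimal illustration, not the code from this commit: `My::Species`, `My::Seq` and `filter_members` are hypothetical stand-ins, and unlike the committed version it tracks seen members by `refaddr` so that the original member order is preserved.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical stand-ins for Bio::Species and Bio::Seq, for illustration only.
{
    package My::Species;
    sub new         { my ($class, %args) = @_; return bless {%args}, $class }
    sub binomial    { return $_[0]{binomial} }
    sub common_name { return $_[0]{common_name} }
}
{
    package My::Seq;
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub species { return $_[0]{species} }    # undef when no species was set
    sub id      { return $_[0]{id} }
}

# Return members whose species matches ANY criterion (OR logic), each once.
sub filter_members {
    my ($members, %filter) = @_;
    return @$members unless %filter;

    my (%seen, @matches);
    for my $member (@$members) {
        # can('species') is true for every My::Seq, so test the returned value.
        my $species = $member->species or next;
        for my $key (keys %filter) {
            (my $method = $key) =~ s/^-//;
            $species->can($method)
                or die "$method is an invalid criterion\n";
            my $value = $species->$method();
            next unless defined $value && $value eq $filter{$key};
            push @matches, $member unless $seen{ refaddr($member) }++;
            last;    # one matching criterion is enough
        }
    }
    return @matches;
}

my $human   = My::Species->new(binomial => 'Homo sapiens', common_name => 'human');
my $with    = My::Seq->new(id => 'a', species => $human);
my $without = My::Seq->new(id => 'b');    # no species attached

# $with is listed twice and matches both criteria, yet is returned once.
my @hits = filter_members(
    [ $with, $without, $with ],
    -binomial    => 'Homo sapiens',
    -common_name => 'human',
);
print join(',', map { $_->id } @hits), "\n";
```

The member without a species is skipped instead of triggering a method call on undef, which is the failure mode the first bullet describes.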
1 parent db792bd commit 9f42b6973813074d2442fc7c0849f92e80d26a79 @carandraug carandraug committed Mar 26, 2013
Showing with 21 additions and 22 deletions.
  1. +16 −22 Bio/Cluster/SequenceFamily.pm
  2. +5 −0 Changes
38 Bio/Cluster/SequenceFamily.pm
@@ -104,10 +104,9 @@ methods. Internal methods are usually preceded with a "_".
package Bio::Cluster::SequenceFamily;
use strict;
-
+use warnings;
use base qw(Bio::Root::Root Bio::Cluster::FamilyI);
-
=head2 new
Title : new
@@ -323,27 +322,22 @@ sub description{
sub get_members {
my $self = shift;
- my @ret;
-
- if(@_) {
- my %hash = @_;
- foreach my $mem ( @{$self->{'_members'}} ) {
- foreach my $key ( keys %hash){
- my $method = $key;
- $method=~s/-//g;
- if($mem->can('species')){
- my $species = $mem->species;
- $species->can($method) ||
- $self->throw("$method is an invalid criteria");
- if($species->$method() eq $hash{$key} ){
- push @ret, $mem;
- }
- }
- }
- }
- return @ret;
+ return @{$self->{'_members'}} unless @_;
+
+ ## since the logic behind the checks is OR, we keep the ids in a hash for
+ ## performance (skip the test if it's already there) and to avoid repeats
+ my %match;
+ my %filter = @_;
+ foreach my $key (keys %filter) {
+ (my $method = $key) =~ s/^-//;
+ %match = (%match, map { $_ => $_ } grep {
+ ! $match{$_} && $_->species &&
+ ($_->species->can($method) ||
+ $self->throw("$method is an invalid criteria")) &&
+ $_->species->$method() eq $filter{$key}
+ } @{$self->{'_members'}});
}
- return @{$self->{'_members'}};
+ return map {$match{$_}} keys (%match);
}
=head2 size
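A note on the `map { $_ => $_ }` idiom in the hunk above: Perl hash keys are always plain strings, so a reference used as a key is stringified and can no longer be dereferenced. Storing the reference again as the value is what lets the final `map {$match{$_}} keys (%match)` hand back real objects. The following standalone sketch (the variable names are illustrative and not taken from the module) shows the effect:

```perl
use strict;
use warnings;

my $obj = { name => 'seq1' };

# The same reference listed twice collapses into a single hash entry.
my %match = map { $_ => $_ } ($obj, $obj);

my ($key) = keys %match;

# The key is only the stringified address; it is no longer a reference.
print "key is a reference: ", (ref($key) ? "yes" : "no"), "\n";

# The value is still the original reference and can be dereferenced.
print "value name: ", $match{$key}{name}, "\n";
```

One consequence worth keeping in mind: `keys` returns entries in no particular order, so results built this way are not guaranteed to follow the order of `_members`.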
5 Changes
@@ -66,6 +66,9 @@ CPAN releases are branched from 'master'.
- moved from bioperl-live into the separate distribution Bio-FeatureIO
* Bio::SeqFeature::Annotated
- moved from bioperl-live into the separate distribution Bio-FeatureIO
+ * Bio::Cluster::SequenceFamily
+ - improved performance when using get_members with overlapping multiple
+ criteria
[Bug fixes]
@@ -117,6 +120,8 @@ CPAN releases are branched from 'master'.
[fangly]
* Fixed bp_genbank2gff3.pl crash when missing source feature date [fangly]
* Bio::PrimarySeq constructor -direct works for -seq or -ref_to_seq [fangly]
+ * Bio::Cluster::SequenceFamily - checks if the sequence has a Bio::Species
+ object before trying to access, and no longer returns repeated sequences.
1.6.901 May 18, 2011
