Adding molecular cloning methods to Bio::SeqUtils #31

Merged
merged 10 commits into from

2 participants

Frank Schwach Chris Fields
Frank Schwach

As discussed on the mailing list:

I needed to manipulate Bio::Seq objects with annotations and sequence
features to simulate molecular cloning techniques, e.g. to cut a vector
and insert a fragment into it while preserving all the annotations and
moving the features accordingly.
My main aim was to split features that span deletion/insertion sites in
a meaningful way, which cannot be done with the currently available
methods.
I have modified Bio::SeqUtils so that I have the following new methods:

delete

removes a segment from a sequence object and adjusts positions and types
of locations of sequence features:

  • locations of features that span the deletion sites are turned into Splits.
  • locations that extend into the deleted region are turned to Fuzzy to indicate that their true start/end was lost.
  • locations contained inside the deleted regions are lost.
  • other features are shifted according to the length of the deletion.
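
The four cases above can be sketched as plain coordinate arithmetic. This is a minimal illustration in Python, not the BioPerl code from this pull request: the function name `adjust_for_deletion`, the returned labels, and the use of 1-based inclusive coordinates are all assumptions made for the example.

```python
def adjust_for_deletion(feat_start, feat_end, del_start, del_end):
    """Classify one feature location against a deleted segment.

    All coordinates are 1-based and inclusive (an assumption of this sketch).
    Returns (kind, locations), where locations is a list of (start, end)
    pairs expressed in the coordinates of the shortened sequence.
    """
    del_len = del_end - del_start + 1
    if feat_end < del_start:
        # Entirely upstream of the deletion: nothing changes.
        return "unchanged", [(feat_start, feat_end)]
    if feat_start > del_end:
        # Entirely downstream: shift left by the deletion length.
        return "shifted", [(feat_start - del_len, feat_end - del_len)]
    if feat_start >= del_start and feat_end <= del_end:
        # Contained in the deleted region: the location is lost.
        return "lost", []
    if feat_start < del_start and feat_end > del_end:
        # Spans both deletion sites: split into two sub-locations.
        return "split", [(feat_start, del_start - 1),
                         (del_start, feat_end - del_len)]
    if feat_start < del_start:
        # Extends into the deletion from the left: the true end is lost.
        return "truncated_end", [(feat_start, del_start - 1)]
    # Extends into the deletion from the right: the true start is lost.
    return "truncated_start", [(del_start, feat_end - del_len)]
```

For example, deleting positions 41 to 60 turns a feature at 30..70 into the two sub-locations 30..40 and 41..50, while a feature at 70..80 simply moves to 50..60.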

insert

adds a Bio::Seq object into another one between specified insertion
sites. This also affects the features on the recipient sequence:

  • locations of features that span the insertion site are split but position types are not turned to Fuzzy because no part of the original feature is lost.
  • other features are shifted according to the length of the insertion.
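
The insertion rules can be sketched the same way. Again this is an illustrative Python model rather than the patch itself; `adjust_for_insertion`, its labels, and the convention that the fragment is inserted after a 1-based position `insert_after` are assumptions of the example.

```python
def adjust_for_insertion(feat_start, feat_end, insert_after, frag_len):
    """Adjust one feature location for a fragment inserted into the sequence.

    The fragment of length frag_len is placed between positions
    insert_after and insert_after + 1 (1-based, inclusive coordinates).
    Returns (kind, locations) in the coordinates of the new sequence.
    """
    if feat_end <= insert_after:
        # Entirely upstream of the insertion site: nothing changes.
        return "unchanged", [(feat_start, feat_end)]
    if feat_start > insert_after:
        # Entirely downstream: shift right by the fragment length.
        return "shifted", [(feat_start + frag_len, feat_end + frag_len)]
    # The feature spans the insertion site: split it around the fragment.
    # Nothing is lost, so both parts keep their exact boundaries.
    left_part = (feat_start, insert_after)
    right_part = (insert_after + frag_len + 1, feat_end + frag_len)
    return "split", [left_part, right_part]
```

Inserting a 10 bp fragment after position 50 splits a feature at 40..60 into 40..50 and 61..70; the downstream part starts one base after the end of the fragment.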

ligate

A convenience method. Supply a recipient, a fragment and one or two
sites at which to cut the recipient. It can also flip the fragment if
required. It simply calls delete, optionally
reverse_complement_with_features, and insert in turn.
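
As a rough picture of that call sequence, here is a sketch on plain strings, ignoring features and annotations entirely. The helper names and the meaning given to `left` and `right` (the last base kept upstream and the first base kept downstream of the fragment, 1-based) are assumptions for illustration, not the signature of the BioPerl method.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (unambiguous bases only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def delete_segment(seq, start, end):
    """Remove positions start..end (1-based, inclusive)."""
    return seq[:start - 1] + seq[end:]


def insert_fragment(seq, fragment, after):
    """Insert fragment between positions after and after + 1."""
    return seq[:after] + fragment + seq[after:]


def ligate_sketch(recipient, fragment, left, right=None, flip=False):
    """Cut the recipient at one or two sites and insert the fragment."""
    if right is None:
        right = left + 1  # single cut site: nothing is removed
    if right <= left:
        raise ValueError("right must be greater than left")
    if right - left > 1:
        # Drop the bases strictly between the two cut sites.
        recipient = delete_segment(recipient, left + 1, right - 1)
    if flip:
        fragment = reverse_complement(fragment)
    return insert_fragment(recipient, fragment, left)
```

With recipient `AAAACCCCGGGG`, fragment `TTA`, `left=4` and `right=9`, the four C bases are removed and the result is `AAAATTAGGGG`; with `flip=True` the inserted fragment becomes `TAA`.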

One situation I haven't handled yet is a deletion that spans the origin
of a circular molecule, but that should be a rare thing to do anyway. The
code currently throws an error if this is attempted.

Frank Schwach added some commits
Frank Schwach Added methods for in-silico molecular cloning to Bio::SeqUtils.
delete: remove a segment from a sequence object, preserving annotations
and features.
insert: insert a fragment sequence object into a recipient sequence object,
preserving features and annotations
ligate: combine delete and insert to simulate digestion of a recipient
and ligation of a fragment into the recipient.
f489238
Frank Schwach Added to Contributors 6c1db90
Frank Schwach Added POD for new features d31ea3f
Frank Schwach Changed recursive acquisition of sub features for deletion/insertion from using remove_SeqFeatures to get_SeqFeatures. The former has the side-effect of modifying the original sequence object, which should be avoided
54322ef
Frank Schwach added tests for delete/insert/ligate methods 7c9e48d
Frank Schwach modified existing methods _coord_revcom and _coord_adjust to make use of the new methods _single_loc_object_from_collection and _location_objects_from_coordinate_list to reduce code-duplication
e0e063f
Frank Schwach

One thing I was considering while writing the code was to use Clone::Fast to generate the new objects for the recipient sequence. Currently, the code asks if the sequence object is allowed to call "new" on its class and if not, creates a PrimarySeq object instead. If we could simply clone the object, we would not have to do this.
I'm just wondering if there is any reason (which I have not come across) why Clone::Fast (or any other Clone module) should not be used here - could there be problems in a threaded environment?

Frank Schwach added some commits
Frank Schwach corrected call to revcom_with_features in "ligate" and corrected POD for method "ligate"
8b26e80
Frank Schwach corrected call to 'ligate' method with named parameters 1190cca
Frank Schwach changed behaviour of feature ends in deletions: a deletion no longer
turns truncated feature ends Fuzzy. Instead, as suggested by Roy
Chaudhuri and Chris Fields, they don't change type but a note is added
to the feature, informing about the length and position of the deletion.
Notes are now also added to features that have received an insertion.
The notes refer to the affected feature end as 3'/5' if the feature has
a strand, or start/end if it doesn't.
Also corrected an error in calculating the start position of subfeatures
that are created by insertions (was off by 1).
Added tests for the notes and removed tests for changed location
types
72ac9a8
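
The 3'/5' versus start/end naming in that commit follows from strand orientation, which can be sketched as below. This is a Python illustration only: `end_label` and `deletion_note` are hypothetical helpers, and the note wording is invented for the example rather than taken from the patch.

```python
def end_label(strand, side):
    """Name a feature end for use in a note.

    strand: 1 or -1 for stranded features, 0 or None otherwise.
    side:   "start" or "end" in sequence coordinates.
    """
    if side not in ("start", "end"):
        raise ValueError("side must be 'start' or 'end'")
    if strand not in (1, -1):
        # Unstranded feature: fall back to coordinate terms.
        return side
    # On the forward strand the start is the 5' end; on the reverse
    # strand the start is the 3' end.
    is_five_prime = (side == "start") == (strand == 1)
    return "5' end" if is_five_prime else "3' end"


def deletion_note(strand, side, del_len, position):
    """Build a note recording the length and position of a deletion.

    The wording is hypothetical; only the information content
    (length, position, affected end) comes from the commit message.
    """
    label = end_label(strand, side)
    return f"{del_len}bp deleted from feature {label} at position {position}"
```

A reverse-strand feature that loses sequence at its coordinate start therefore gets a note about its 3' end, while an unstranded feature gets a note about its start.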
Frank Schwach Added feature for deletion sites and bugfixes
'delete' method now adds a misc_feature with a note about
the length of the deletion site. The location type of this feature
is IN-BETWEEN.

Features of type IN-BETWEEN must have adjacent start/
end pos, so they are now deleted in the 'insert' method if they
co-localise with the insertion site. This happens when 'delete' is
followed by 'insert' or when using the 'ligate' shortcut method.
'insert' now also handles deleted features like 'delete', which
only applies to features with IN-BETWEEN locations.

Other fixes:

 - 'ligate' now skips the 'delete' step if 'left' and 'right' are
   adjacent because no deletion actually occurs.

 - '_coord_adjust_deletion': fixed test for splitting a feature.
   Cannot use 'contains' because that returns true if one or both
   coordinates of the feature and deletion co-localise, but a split
   only makes sense when both ends overlap.

 - added and modified tests accordingly
014eda4
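
The split test fixed in that commit comes down to inclusive versus strict comparison. The sketch below models it in Python with (start, end) tuples; `contains` here is a stand-in for an inclusive containment check as the commit message describes it, and `needs_split` is a hypothetical name for the stricter test.

```python
def contains(outer, inner):
    """Inclusive containment: shared boundaries still count."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def needs_split(feature, deletion):
    """True only if the feature extends past BOTH ends of the deletion."""
    return feature[0] < deletion[0] and deletion[1] < feature[1]


feature = (10, 50)

# The deletion starts exactly at the feature start: it is contained,
# but there is no upstream part left, so a split would be meaningless.
assert contains(feature, (10, 20))
assert not needs_split(feature, (10, 20))

# The deletion lies strictly inside the feature: both flanks survive.
assert contains(feature, (20, 30))
assert needs_split(feature, (20, 30))
```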
Chris Fields
Owner

BioPerl objects can be cloned (there is a Bio::Root::Root::clone() method). This will use either Clone or Storable (Clone preferentially, Storable as the core fallback). I haven't used Clone::Fast, but it appears to use Clone as well.

Chris Fields
Owner

Merging this in, btw.

Chris Fields cjfields merged commit e2b0616 into from
Commits on Jan 10, 2012
  1. Added methods for in-silico molecular cloning to Bio::SeqUtils.

    Frank Schwach authored
    delete: remove a segment from a sequence object, preserving annotations
    and features.
    insert: insert a fragment sequence object into a recipient sequence object,
    preserving features and annotations
    ligate: combine delete and insert to simulate digestion of a recipient
    and ligation of a fragment into the recipient.
  2. Added to Contributors

    Frank Schwach authored
  3. Added POD for new features

    Frank Schwach authored
  4. Changed recursive acquisition of sub features for deletion/insertion from using remove_SeqFeatures to get_SeqFeatures.

    Frank Schwach authored
    The former has the side-effect of modifying the original sequence object, which should be avoided
  5. added tests for delete/insert/ligate methods

    Frank Schwach authored
  6. modified existing methods _coord_revcom and _coord_adjust to make use of the new methods _single_loc_object_from_collection and _location_objects_from_coordinate_list to reduce code-duplication

    Frank Schwach authored
  7. corrected call to revcom_with_features in "ligate" and corrected POD for method "ligate"

    Frank Schwach authored
  8. corrected call to 'ligate' method with named parameters

    Frank Schwach authored
Commits on Jan 11, 2012
  1. changed behaviour of feature ends in deletions: a deletion no longer

    Frank Schwach authored
    turns truncated feature ends Fuzzy. Instead, as suggested by Roy
    Chaudhuri and Chris Fields, they don't change type but a note is added
    to the feature, informing about the length and position of the deletion.
    Notes are now also added to features that have received an insertion.
    The notes refer to the affected feature end as 3'/5' if the feature has
    a strand, or start/end if it doesn't.
    Also corrected an error in calculating the start position of subfeatures
    that are created by insertions (was off by 1).
    Added tests for the notes and removed tests for changed location
    types
  2. Added feature for deletion sites and bugfixes

    Frank Schwach authored
    'delete' method now adds a misc_feature with a note about
    the length of the deletion site. The location type of this feature
    is IN-BETWEEN.
    
    Features of type IN-BETWEEN must have adjacent start/
    end pos, so they are now deleted in the 'insert' method if they
    co-localise with the insertion site. This happens when 'delete' is
    followed by 'insert' or when using the 'ligate' shortcut method.
    'insert' now also handles deleted features like 'delete', which
    only applies to features with IN-BETWEEN locations.
    
    Other fixes:
    
     - 'ligate' now skips the 'delete' step if 'left' and 'right' are
       adjacent because no deletion actually occurs.
    
     - '_coord_adjust_deletion': fixed test for splitting a feature.
       Cannot use 'contains' because that returns true if one or both
       coordinates of the feature and deletion co-localise, but a split
       only makes sense when both ends overlap.
    
     - added and modified tests accordingly